KMeans clustering #403

Open
ZhenNan2016 opened this issue Nov 18, 2023 · 0 comments

ZhenNan2016 commented Nov 18, 2023

I have a few questions about SPANN:

  1. For KMeans clustering, what is the size limit for each cluster? If a cluster exceeds this limit, is it split again into one or more layers?
  2. After clustering is completed, when do the centroids in memory need to be updated?
  3. After clustering is completed, is new vector data written directly into the posting lists on disk, or stored as centroids in memory?
  4. When is KMeans clustering run again?
  5. If there are too many clusters, are they clustered again with the KMeans algorithm?
  6. What is the difference between SPTAG and SPTAG++?
  7. A question about hierarchical data partition and partial search:
    Does each query require two steps: 1) distributed dispatch and 2) local search?
    What are the transactions for these two steps?
    [image]

Looking forward to your reply.
Thanks very much.
