Hybrid search slows down upsert operation #601

Open
akset2X opened this issue Nov 18, 2023 · 1 comment

akset2X commented Nov 18, 2023

I found that deletes do not fully apply to a hybrid index. Deleting embeddings by "id" removes only the entries in the embeddings list; the "scoring.terms" and "scoring" files created alongside it remain untouched. I suspect this is what makes /add and /upsert so slow.

path: ./index
writable: True

embeddings:
  path: sentence-transformers/all-MiniLM-L6-v2
  content: True
  hybrid: True

The /count API reports 30k+ embeddings. With the hybrid index, search is fast, but /upsert becomes very slow as the data grows. I thought deleting some data would speed up upsert, but deleting embeddings doesn't reduce the size of the "scoring" and "scoring.terms" files in the index directory.
Is there any way to access files such as documents, scoring.terms and scoring and delete entries from them safely?
How can I speed up upsert as the document count increases?
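
One way to check the file-size observation is to snapshot the index directory before and after a delete and compare. The following is only a minimal sketch using the Python standard library, assuming the index lives at ./index as in the config above; `index_file_sizes` and `size_changes` are hypothetical helper names for illustration, not part of txtai.

```python
from pathlib import Path


def index_file_sizes(index_dir):
    """Return {relative file name: size in bytes} for every file under index_dir."""
    root = Path(index_dir)
    return {
        p.relative_to(root).as_posix(): p.stat().st_size
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def size_changes(before, after):
    """Return {file name: after - before} for files whose size changed."""
    changes = {}
    for name in sorted(set(before) | set(after)):
        # A file missing from one snapshot counts as size 0 in that snapshot
        delta = after.get(name, 0) - before.get(name, 0)
        if delta != 0:
            changes[name] = delta
    return changes
```

To use it, call `before = index_file_sizes("./index")`, run the delete, call `after = index_file_sizes("./index")`, then print `size_changes(before, after)`. If "scoring" and "scoring.terms" are absent from the result, their sizes did not change.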

@davidmezzetti
Member

Thank you for the write-up. I'll take a look and report back.