Internal Server Error while searching by text #311

Open
vishaal27 opened this issue Sep 11, 2023 · 2 comments
Labels: enhancement (New feature or request)

Comments

@vishaal27

Hi,

I am trying to search the hosted clip-retrieval backend index by text and end up getting an internal server error.
Here is the code I used:

from clip_retrieval.clip_client import ClipClient, Modality
client = ClipClient(url="https://knn.laion.ai/knn-service", indice_name="laion5B-L-14", modality=Modality.TEXT)
results = client.query(text="an image of a cat")
print(results)

Output:

{'message': 'Internal Server Error'}

However, when I search by image it seems to be working fine:

from clip_retrieval.clip_client import ClipClient, Modality
client = ClipClient(url="https://knn.laion.ai/knn-service", indice_name="laion5B-L-14", modality=Modality.IMAGE)
results = client.query(text="an image of a cat")
print(results[0])

Output:

{'caption': 'orange cat with supicious look stock photo', 'url': 'https://media.istockphoto.com/photos/orange-cat-with-supicious-look-picture-id907595140?k=6&m=907595140&s=612x612&w=0&h=4CTvSxNvv4sxSCPxViryha4kAjuxDbrXM5vy4VPOuzk=', 'id': 518836491, 'similarity': 0.5591729879379272}

Please let me know if I am doing something incorrectly, or whether direct text-to-text similarity search is simply not enabled by default in clip-retrieval's client. Thanks :)
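The two outputs above have different shapes: the failing call returned a dict with a `message` key, while the successful call returned something indexable as `results[0]`. A minimal sketch of a check that tells the two apart before indexing, assuming those are the only two shapes the client returns (the helper name `is_error_response` is hypothetical and not part of clip-retrieval):

```python
def is_error_response(results):
    """Return True if `results` looks like the error payload shown above.

    Assumption: an error comes back as a dict carrying a 'message' key,
    and a successful query comes back as a list of result dicts.
    """
    return isinstance(results, dict) and "message" in results


# Payloads copied from the outputs in this issue (the success case is trimmed).
failed = {"message": "Internal Server Error"}
succeeded = [{"caption": "orange cat with supicious look stock photo", "id": 518836491}]

print(is_error_response(failed))     # True
print(is_error_response(succeeded))  # False
```

Checking the shape first avoids a confusing `KeyError` on `results[0]` when the backend returns an error instead of results.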

@rom1504
Owner

rom1504 commented Sep 11, 2023 via email

@vishaal27
Author

My main use case: I want to search all the captions of the LAION-5B dataset for certain keywords. What would you recommend as the fastest way to do this (without downloading the parquet files locally)? I realise the CLIP index does not do direct text matching but rather ranks by cosine similarity. I would still be interested in pure text-to-text matching on the LAION-5B index.
For example, I want to search for "gaussian noise" as a keyword and check for exact matches or the most similar captions in the LAION-5B dataset.
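One partial workaround, sketched under assumptions: since the index ranks by cosine similarity, exact keyword matches can only be checked among the results a query returns, not across the whole dataset. The snippet below filters a list of result dicts on their `caption` field, which is the shape shown in the successful output above. The helper name `filter_captions` and the sample data are hypothetical.

```python
def filter_captions(results, keyword):
    """Keep only results whose caption contains `keyword`, ignoring case.

    Assumption: `results` is a list of dicts with a 'caption' key, as in the
    output shown earlier in this issue. A missing or None caption is skipped.
    """
    needle = keyword.casefold()
    return [r for r in results if needle in (r.get("caption") or "").casefold()]


# Hypothetical sample results, for illustration only (not real LAION-5B rows).
sample = [
    {"caption": "Image with Gaussian noise added", "id": 1},
    {"caption": "orange cat with supicious look stock photo", "id": 2},
    {"caption": None, "id": 3},
]

matches = filter_captions(sample, "gaussian noise")
print([m["id"] for m in matches])  # [1]
```

This only post-filters retrieved neighbours, so captions that contain the keyword but rank low in similarity would still be missed.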

rom1504 added the enhancement (New feature or request) label on Jan 6, 2024