Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(#111): support user-defined embedding dimensions #119

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

bernard-ng
Copy link
Contributor

support for dimensions for OpenAI Embeddings Generator

see: https://platform.openai.com/docs/api-reference/embeddings/create#embeddings-create-dimensions

dimensions integer Optional
The number of dimensions the resulting output embeddings should have. Only supported in text-embedding-3 and later models.

@MaximeThoonsen
Copy link
Contributor

MaximeThoonsen commented May 12, 2024

Hey @bernard-ng, thanks for your contribution on this one.

There is a bug when using with ada-002 : " This model does not support specifying dimensions.". It tries to call with dimension, you should handle the case ada-002.

Can you also change the interface signature everywhere for embedText? I don't understand why we don't have a type error but I guess we should 😅.

If you can add an Integration test (and use your openai key in local to run it), that would be better to prevent other bugs.
You can run it with ./vendor/bin/pest ./tests/Integration/Embeddings/EmbeddingGenerator/OpenAI/OpenAIEmbeddingGeneratorTest.php

Otherwise it's a great improvement! It deserves few lines in the doc. If you don't have time, I'll do it.

Thanks a lot,

Maxime

@bernard-ng
Copy link
Contributor Author

Hey @MaximeThoonsen noted, let's fix that 😅

@bernard-ng
Copy link
Contributor Author

@MaximeThoonsen some design considerations

while working on this issue, I realized that adding this functionality to the interface would mean that all embedding models would have to offer this feature, otherwise it would be like adding this “dimensions” option to all EmbeddingGenerator classes without it being used.

what direction should I take? given that this feature is specific to open ai, it doesn't really make sense for it to be available for mistral or ollama.

@MaximeThoonsen
Copy link
Contributor

Hey @bernard-ng ,

I suggest you throw an "not implemented" exception when it is not used with OpenAI.
Sound good to you?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants