Sending additional parameters #128

Open
llermaly opened this issue Jan 31, 2024 · 6 comments

Comments

@llermaly

llermaly commented Jan 31, 2024

Which connector is affected?

All sources that support filtering.

What would you like to see improved?

Is it possible to send additional parameters for metadata filtering? For example:

response = co.chat(
    message="What is the chemical formula for glucose?",
    connectors=[{"id": "my-connector", "params": {"some_field": "some_value"}}],
)

The only way I can think of right now is passing the parameters at creation time:

created_connector = co.create_connector(
    name="Example connector",
    url="https://connector-example.com/search?some_field=some_value",
)

But that's not very flexible.

Do you think calling the connector API directly with the filters, and then sending the results to the Cohere documents endpoint would do the trick?

curl --request POST \
    --url 'https://connector-example.com/search' \
    --header 'Content-Type: application/json' \
    --data '{
    "query": "How do I expense a meal?",
    "some_field": "some_value"
  }'

And then

response = co.chat(
    message=message,
    documents=documents,
    conversation_id=self.conversation_id,
    stream=True,
)

Is there a simpler way to achieve this filtering?

Thanks!

Additional information

No response

@tianjing-li
Collaborator

tianjing-li commented Jan 31, 2024

This is a great question and something I've been considering for a while as well. The difficulty in building a generic solution is that many of these third-party APIs either don't support metadata filtering or filter data in ways that vary from one API to another.

For example, some APIs require filters as query parameters, others expect them in the request body, and others go through a Python SDK, where we would have to call TheirSDKClient.search(query, myfilter1=value1, myfilter2=value2).
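To make the variation concrete, here is a minimal sketch of how the same filters would have to be encoded differently depending on the target API. The function name `build_search_request` and the `style` values are hypothetical and are not part of any connector in this repo.

```python
import json
from urllib.parse import urlencode


def build_search_request(base_url, query, filters, style):
    """Return (url, body) for a hypothetical third-party search API.

    style="query_params": filters are appended to the URL query string.
    style="json_body":    filters are merged into the JSON request body.
    """
    if style == "query_params":
        params = {"query": query, **filters}
        return f"{base_url}?{urlencode(params)}", None
    if style == "json_body":
        return base_url, json.dumps({"query": query, **filters})
    raise ValueError(f"unsupported filter style: {style}")
```

An SDK-based API would need a third branch that forwards the filters as keyword arguments, which is why each connector ends up handling this on its own.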

Ideally, from a user perspective, you could just pass in the filters as you've outlined:

response = co.chat(
    message="What is the chemical formula for glucose?",
    connectors=[{"id": "my-connector", "params": {"some_field": "some_value"}}],
)

and get different search results per chat.

Solution 1 (Long-term - difficult)

We modify the connectors, update the documentation to state which metadata fields can be used at query time, and decide on a format for receiving these values when you pass them to the /chat endpoint in the connectors parameter.

The filtering logic would happen at the connector level. This would of course require a lot of work across all existing connectors. I would also have to talk to the internal Coral API team so that the field values you send with the connectors get forwarded to the /search request performed by the connector. If we decide to go with a long-term solution, this is probably the best route: the user shouldn't need to worry about how the search is filtered, only that they are able to filter.

Solution 2 (To unblock - easy)

As you've outlined, probably the easiest way is to retrieve the documents yourself and do all the metadata filtering prior to calling /chat.
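A minimal sketch of that workaround, assuming the documents have already been retrieved as a list of flat dicts. The helper `filter_documents` and the sample documents are hypothetical and only illustrate filtering on a metadata field before the chat call.

```python
def filter_documents(documents, filters):
    """Keep only documents whose fields match every key/value pair in filters."""
    return [
        doc for doc in documents
        if all(doc.get(key) == value for key, value in filters.items())
    ]


# Hypothetical search results, shaped as flat dicts of strings.
documents = [
    {"title": "Expense policy", "text": "How to expense a meal.", "some_field": "some_value"},
    {"title": "Travel policy", "text": "How to book a flight.", "some_field": "other_value"},
]

filtered = filter_documents(documents, {"some_field": "some_value"})

# The filtered list would then be passed to chat in place of a connector:
# response = co.chat(message=message, documents=filtered)
```

If the underlying API supports server-side filtering, sending the filters with the search request (as in the curl example earlier in this thread) avoids fetching documents that would be discarded anyway.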

@llermaly
Author

@tianjing-li Thanks for your quick answer. I really love how many connectors you have available; it's amazing.

I think you should discuss with the internal team the ability to send arbitrary parameters via the API, so that every dev can configure the connector accordingly.

For example, having to pre-select a single folder in the Google Drive connector limits the possibilities a lot. If you let us send an array of document names, we can easily implement that in the connector.

I will go with Solution 2. Is the result the same in terms of chunking/ranking when using the connector as when using the docs API? If it's better to use the connectors, we can consider copying the documents we need to a folder before making the call, until we can filter within the connector.

Thanks

@tianjing-li
Collaborator

@llermaly the chunking/ranking portion is done by the .chat() call itself, so no worries about not going through the connector. I'll raise the parameter suggestion internally. I agree that there's a lot of value we can add.

@llermaly
Author

llermaly commented Feb 1, 2024

Thanks @tianjing-li, we'll be waiting. We can close this issue. Please let me know if you'd like our feedback in the future.

@tianjing-li
Collaborator

@llermaly I raised the suggestion internally; perhaps we can keep this open for now. Tagging @walterbm-cohere and @daniel-cohere, who work directly on the internal API.

@tianjing-li
Collaborator

tianjing-li commented Feb 6, 2024

@llermaly this is now in progress. From what I've gathered, the connectors API will allow passing in an options JSON body that will need to be parsed by the connector. On your side, you would be able to make any changes to connectors by referring to their docs and adding the filters appropriately.

Any PRs for these would be greatly appreciated!
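As a starting point for such a PR, here is a minimal sketch of how a connector's search handler could parse an options body. The `options` key name and its contents are assumptions based on the comment above, since the final request shape is not confirmed in this thread. `search_backend` is a hypothetical stand-in for whatever call the connector makes to its data source.

```python
def parse_search_request(body):
    """Split a search request body into the query and optional filters.

    Assumes a body like {"query": "...", "options": {"some_field": "some_value"}}.
    """
    query = body["query"]
    options = body.get("options") or {}
    if not isinstance(options, dict):
        raise ValueError("options must be a JSON object")
    return query, options


def search(body, search_backend):
    """Run the search, forwarding any options to the backend as filters."""
    query, options = parse_search_request(body)
    return {"results": search_backend(query, **options)}
```

Treating a missing `options` key as an empty dict keeps the connector compatible with requests that send only a query.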
