client/server tool invocations with useChat and streamText #1514
Conversation
Thanks so much for working on this. When you say "parallel client/server tool calls", would it also support either: (1) allowing a server-side tool call to call a client-side tool, or (2) passing the result of a server-side tool to the client? I've been trying to find a workable implementation for either of the two scenarios above, but the closest thing I've seen is serializing the result of server-side tool calls via JSON and sending it back as an assistant message, which comes with its own set of caveats, like the lack of a follow-up message from the assistant without an additional user message.
It will support both client- and server-side tool calls, and they can be mixed and parallel.

Example: You define 2 tools, one that is executed client-side (i.e. has no execute method), and one that is executed server-side (i.e. with an execute method). The client-side code then receives 3 related events: the tool call for the client-side tool, the tool call for the server-side tool, and the tool result for the server-side tool.

On the client, you'd execute the client-side tool, do whatever you want with the server-side tool result, and then use onFinish to send both tool results back to the server, so that the AI can continue the conversation (this is important for this to work at all, because OpenAI expects tool results for all tool calls it invokes). On the next roundtrip, the LLM will know the tool results and can come up with text responses or more tool calls. Please note that this is a WIP and the design might change.
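A rough sketch of that setup on the server, based on the design described here (tool names, model choice, and exact method names are illustrative and may differ from the final API):

```ts
import { openai } from '@ai-sdk/openai';
import { streamText } from 'ai';
import { z } from 'zod';

export async function POST(req: Request) {
  const { messages } = await req.json();

  const result = await streamText({
    model: openai('gpt-4-turbo'),
    messages,
    tools: {
      // client-side tool: no execute method, so the tool call is
      // streamed to the client, which must supply the result
      askForConfirmation: {
        description: 'Ask the user for confirmation.',
        parameters: z.object({ message: z.string() }),
      },
      // server-side tool: execute runs here, and both the tool call
      // and its result are streamed to the client
      getWeatherInformation: {
        description: 'Show the weather in a given city.',
        parameters: z.object({ city: z.string() }),
        execute: async ({ city }) => `It is sunny in ${city}.`,
      },
    },
  });

  return result.toAIStreamResponse();
}
```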
Nice
@lgrammel this is awesome...
I've been playing around with this branch since it got merged into the main package. Is it intentional that the model doesn't interpret the results of a pure server tool call (i.e. no client / mixed calls, just server-side tools)? The scenario I'm exploring is one where the model queries a vector DB to find similar documents (for Q&A). Before, the results of the server tool call would be "interpreted" by the model to formulate a coherent response, but right now the model sends back an empty content string in its message (with a populated `toolInvocations` array).
@rajdtta not sure what you mean. Here is the example of a pure server-side call: https://github.com/vercel/ai/blob/main/examples/next-openai/app/use-chat-tool-result/page.tsx That being said, feeding pure server-side calls back to the LLM automatically to generate the 2nd assistant response is not implemented yet.
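For reference, a condensed sketch of what the linked example does on the client (assuming the `toolInvocations` field this PR adds to assistant messages):

```tsx
'use client';

import { useChat } from 'ai/react';

export default function Chat() {
  const { messages, input, handleInputChange, handleSubmit } = useChat();

  return (
    <div>
      {messages.map((m) => (
        <div key={m.id}>
          {m.content}
          {m.toolInvocations?.map((invocation) =>
            // an invocation with a result is a completed server-side call;
            // without a result, the call is still in progress
            'result' in invocation ? (
              <pre key={invocation.toolCallId}>
                {JSON.stringify(invocation.result)}
              </pre>
            ) : (
              <div key={invocation.toolCallId}>
                Calling {invocation.toolName}...
              </div>
            ),
          )}
        </div>
      ))}
      <form onSubmit={handleSubmit}>
        <input value={input} onChange={handleInputChange} />
      </form>
    </div>
  );
}
```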
I'm assuming you could generate the 2nd assistant response with a callback or an automated message workflow. We had a similar system to @rajdtta's where we got a textual response on tool calls, but I think that behavior has since changed. However, I'm sure most users will intuitively expect a coherent text response on server tool calls, so it may be good to implement it anyway. It shows a very "magic-like" response for anyone who gets tools/function calls working.
Ah, this was exactly what I was trying to describe haha.
I agree that it might be good to have it built into the library (either as opt-in or opt-out?). I'm not against making my own implementation, but I'd want to make sure that I'm following best practices / not calling the API unnecessarily. On a side note, has anyone figured out how to persistently save / store tool calls when using `useChat`?
@lgrammel @aychang95 @rajdtta checking to make sure I understand correctly - so `experimental_addToolResult` is for client-side tools? How do we send back the tool results to the LLM in the server-side tool case, without direct access to that helper?
From my understanding, `experimental_addToolResult` is intended for client-side tools. As for having the LLM interpret server-side tool calls, that's something I'm still trying to figure out in my sandbox. Quickly subbing in the snippet you sent above just causes the model to repeatedly call the server tool in a loop, and the workarounds I've tried so far haven't panned out.
@lgrammel have you looked at the
It seems that all tool invocations are now exposed to the client. But what if I don't want this? In my case a tool call is something that happens internally and should not be exposed to the client, because it contains private data.
@webholics maybe some sort of flag could be implemented in the tool definition that controls whether a tool invocation / response gets sent to the client (or which aspects are sent to the client, e.g. only the tool name, only the tool response, both, etc.).
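Purely as an illustration of that idea (this flag does not exist in the SDK; the name and values are hypothetical):

```ts
import { z } from 'zod';

// stand-in for an actual database call
declare function runQuery(query: string): Promise<unknown>;

const tools = {
  queryCustomerDb: {
    description: 'Query the internal customer database.',
    parameters: z.object({ query: z.string() }),
    execute: async ({ query }: { query: string }) => runQuery(query),
    // hypothetical flag: control what gets streamed to the client
    // ('none' | 'name-only' | 'result-only' | 'full')
    clientExposure: 'name-only',
  },
};
```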
Or some kind of callback on the AI stream to be able to filter out or change messages programmatically.
@webholics @rajdtta filtering tool calls needs to be handled very carefully, because it will impact the messages sent to the provider in the next roundtrip. If they are left out, it would impact the LLM result, potentially leading to weird answers or, in the case of an inconsistent history, errors.
I see. That's why this whole topic is so complex. For pure server-side tools we would need some kind of server-side persistence for those calls and responses, which is of course a bit out of scope for the AI SDK. It's just very likely now to run into security issues, because the backend exposes a lot of information to the client by default that the developer may not expect.
@webholics if you want to filter manually, you could use
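One way such manual filtering could look (a sketch under our own assumptions, not necessarily the approach suggested above): strip tool invocations from the messages you persist or display, while still sending the unfiltered history to the provider, per the roundtrip caveat noted earlier.

```ts
import type { Message } from 'ai';

// Remove tool invocations before persisting or displaying messages.
// The unfiltered array should still go to the provider so the
// conversation history stays consistent.
function stripToolInvocations(messages: Message[]): Message[] {
  return messages.map(({ toolInvocations, ...rest }) => rest);
}
```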
@lgrammel do you happen to know what the best practices are for the following:
- persistently storing messages that include tool invocations / results
- having the model interpret the results of server-side tool calls
Right now, I'm upserting the latest 2 messages sent to the API route whenever it's called (which, most of the time, includes the tool invocations / results). I'm not sure if this will work in a scenario where a tool error occurs and the model decides to re-call a tool after spitting out some filler message (i.e. "Sorry, I was having a hard time doing this, let me try again...").

```ts
// Persist the two most recent messages (typically the latest user
// message and the assistant message with its tool invocations).
const LATEST_2_MESSAGES = messages.slice(-2);

for (const message of LATEST_2_MESSAGES) {
  await ChatService.postMessage({
    conversationID,
    message: {
      id: message.id,
      role: message.role,
      message: message.content,
      toolInvocations: message.toolInvocations ?? undefined,
    },
  });
}
```
My current implementation of this involves looking at the tool invocations of the latest message (if they exist) and checking whether a specific flag is present in the tool results that determines whether a tool call should be interpreted by the model (and another flag to check whether it has already been processed or not).

```ts
// last message in the conversation, if any
const LATEST_MESSAGE = messages.at(-1);

if (LATEST_MESSAGE && LATEST_MESSAGE.toolInvocations) {
  LATEST_MESSAGE.toolInvocations.forEach((invocation) => {
    // only resubmit tool results that are flagged as interpretable
    // and have not been interpreted yet
    if ("result" in invocation && invocation.result.interpretable) {
      if (!invocation.result.interpreted) {
        experimental_addToolResult({
          toolCallId: invocation.toolCallId,
          result: {
            ...invocation.result,
            interpreted: true,
          },
        });
      }
    }
  });
}
```
Summary
Adds support for client/server tool calls with `useChat` and `streamText`. Special focus is on enabling client-side user interactions as tools.

- `tool-call` and `tool-result` stream parts
- `streamText` sends tool calls and tool results in the AI stream
- `useChat` adds tool invocations to assistant messages
- `useChat` (React) adds `experimental_addToolResult` helper to send tool results from e.g. user interactions
- `convertToCoreMessages` converts UI messages with tool invocations to assistant and tool messages for AI/Core
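As a rough sketch of how `convertToCoreMessages` slots into a route handler (based on this PR's design; the model name is illustrative and method names may have changed since):

```ts
import { openai } from '@ai-sdk/openai';
import { convertToCoreMessages, streamText } from 'ai';

export async function POST(req: Request) {
  // UI messages from useChat; assistant messages may carry toolInvocations
  const { messages } = await req.json();

  const result = await streamText({
    model: openai('gpt-4-turbo'),
    // convert UI messages (incl. tool invocations) into core
    // assistant and tool messages
    messages: convertToCoreMessages(messages),
  });

  return result.toAIStreamResponse();
}
```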
Examples
- `examples/next-openai/app/use-chat-client-tool`: confirmation dialog and tool that requires confirmation
- `examples/next-openai/app/use-chat-tool-result`: display in-progress call & result for a server-side tool

Demo recording: Screen.Recording.2024-05-13.at.15.22.47.mov
Subtasks
- `streamText`
- `experimental_addToolResult` helper to `useChat`
- `convertToCoreMessages` helper