
client/server tool invocations with useChat and streamText #1514

Merged: 16 commits merged into main from lg/ai-stream-tool-calls on May 13, 2024

Conversation

@lgrammel (Collaborator) commented May 7, 2024

Summary

Adds support for client/server tool calls with useChat and streamText. The special focus is on enabling client-side user interactions as tools; a rough sketch of the overall flow follows the feature list below.

  • Add tool-call and tool-result stream parts
  • streamText sends tool calls and tool results in the AI stream
  • assistant messages (UI) can contain tool invocations (= tool calls + tool results, where a tool result supersedes its tool call)
  • useChat adds tool invocations to assistant messages
  • useChat (React) adds experimental_addToolResult helper to send tool results from e.g. user interactions
  • convertToCoreMessages converts UI messages with tool invocations to assistant and tool messages for AI/Core
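
As a rough sketch of how these pieces fit together on the server, a minimal route handler is shown below. The model, tool names, and response helper are illustrative (loosely following the examples in this PR), not the exact example code, and the API may have shifted in later SDK versions.

```ts
// app/api/chat/route.ts: minimal sketch, not the exact example code from this PR
import { openai } from '@ai-sdk/openai';
import { convertToCoreMessages, streamText, tool } from 'ai';
import { z } from 'zod';

export async function POST(req: Request) {
  const { messages } = await req.json();

  const result = await streamText({
    model: openai('gpt-4-turbo'),
    // convert UI messages (which may contain tool invocations) into core messages
    messages: convertToCoreMessages(messages),
    tools: {
      // server-side tool: has an execute function, so both its tool call and its
      // tool result are sent to the client as stream parts
      getWeatherInformation: tool({
        description: 'show the weather in a given city to the user',
        parameters: z.object({ city: z.string() }),
        execute: async ({ city }) => `It is sunny in ${city}.`,
      }),
      // client-side tool: no execute function, so only the tool call is streamed;
      // the client supplies the result (e.g. via experimental_addToolResult)
      askForConfirmation: tool({
        description: 'ask the user for confirmation',
        parameters: z.object({ message: z.string() }),
      }),
    },
  });

  return result.toAIStreamResponse();
}
```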

Examples

  • examples/next-openai/app/use-chat-client-tool: confirmation dialog and tool that requires confirmation
  • examples/next-openai/app/use-chat-tool-result: display in-progress call & result for server side tool
Screen.Recording.2024-05-13.at.15.22.47.mov

Subtasks

  • add tool call stream part
  • add tool result stream part
  • send tool call & tool result stream parts from streamText
    • re-implement callback support
  • add tool calls and tool results to assistant ui message
  • add experimental_addToolResult helper to useChat
    • React
  • add convertToCoreMessages helper
  • update examples & test end-to-end
    • add confirmation example
    • update weather example
    • remove old tool call example
  • changeset

@rajdtta commented May 7, 2024

Thanks so much for working on this.

When you say "parallel client/server tool calls", would it also support either: (1) allowing a server-side tool call to call a client-side tool, or (2) passing the result of a server-side tool to the client?

I've been trying to find a workable implementation for either of the two scenarios above, but the closest thing I've seen is serializing the result of server-side tool calls via JSON and sending it back as an assistant message, which comes with its own set of caveats, like the lack of a follow-up message from the assistant without an additional user message.

@lgrammel (Collaborator, Author) commented May 7, 2024

> When you say "parallel client/server tool calls", would it also support either: (1) allowing a server-side tool call to call a client-side tool, or (2) passing the result of a server-side tool to the client?

It will support both client- and server-side tool calls, and they can be mixed and run in parallel.

Example

You define two tools: one that is executed client-side (i.e., it has no execute method) and one that is executed server-side (i.e., it has an execute method). The client-side code then receives 3 related events:

  • tool call for the client-side tool
  • tool call for the server-side tool
  • tool result for the server-side tool

On the client, you'd execute the client-side tool, do whatever you want with the server-side tool result, and then use onFinish to send both tool results back to the server so that the AI can continue the conversation (this is essential for it to work at all, because OpenAI expects tool results for all tool calls it invokes).

On the next roundtrip, the LLM will know the tool results and can come up with text responses or more tool calls.

Please note that this is a WIP and the design might change.
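
For illustration, here is a rough client-side sketch of that flow. The toolInvocations field and the experimental_addToolResult helper follow this PR; since this is a WIP, the names and shapes may still change.

```tsx
// app/use-chat-client-tool/page.tsx: rough sketch of handling a client-side tool
'use client';

import { useChat } from 'ai/react';

export default function Chat() {
  const { messages, input, handleInputChange, handleSubmit, experimental_addToolResult } =
    useChat();

  return (
    <div>
      {messages.map(message => (
        <div key={message.id}>
          <div>{message.content}</div>
          {message.toolInvocations?.map(toolInvocation =>
            'result' in toolInvocation ? (
              // tool result: from a server-side tool or an already-answered client tool
              <pre key={toolInvocation.toolCallId}>
                {JSON.stringify(toolInvocation.result)}
              </pre>
            ) : (
              // pending client-side tool call: answer it from a user interaction
              <button
                key={toolInvocation.toolCallId}
                onClick={() =>
                  experimental_addToolResult({
                    toolCallId: toolInvocation.toolCallId,
                    result: 'Yes, confirmed.',
                  })
                }
              >
                Confirm
              </button>
            ),
          )}
        </div>
      ))}
      <form onSubmit={handleSubmit}>
        <input value={input} onChange={handleInputChange} />
      </form>
    </div>
  );
}
```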

@Archimagus (Contributor) commented:

Nice

@lgrammel changed the title from "[WIP] mixed parallel client/server tool call/result support for ai/ui & ai/core" to "[WIP] parallel client/server tool invocations for useChat / ai-core" on May 13, 2024
@lgrammel changed the title from "[WIP] parallel client/server tool invocations for useChat / ai-core" to "client/server tool invocations with useChat and ai/core" on May 13, 2024
@lgrammel changed the title from "client/server tool invocations with useChat and ai/core" to "client/server tool invocations with useChat and streamText" on May 13, 2024
@lgrammel requested a review from shuding on May 13, 2024 14:00
@lgrammel marked this pull request as ready for review on May 13, 2024 14:00
@lgrammel merged commit f617b97 into main on May 13, 2024
6 checks passed
@lgrammel deleted the lg/ai-stream-tool-calls branch on May 13, 2024 17:22
@aychang95 commented:

@lgrammel this is awesome...
We've been extensively using experimental_onToolCall for a while now, but I think this PR with the v3.1.7 release might be enough to migrate to streamText for the tool invocation objects.

@rajdtta commented May 13, 2024

> > When you say "parallel client/server tool calls", would it also support either: (1) allowing a server-side tool call to call a client-side tool, or (2) passing the result of a server-side tool to the client?
>
> It will support both client- and server-side tool calls, and they can be mixed and run in parallel.
>
> Example
>
> You define two tools: one that is executed client-side (i.e., it has no execute method) and one that is executed server-side (i.e., it has an execute method). The client-side code then receives 3 related events:
>
>   • tool call for the client-side tool
>   • tool call for the server-side tool
>   • tool result for the server-side tool
>
> On the client, you'd execute the client-side tool, do whatever you want with the server-side tool result, and then use onFinish to send both tool results back to the server so that the AI can continue the conversation (this is essential for it to work at all, because OpenAI expects tool results for all tool calls it invokes).
>
> On the next roundtrip, the LLM will know the tool results and can come up with text responses or more tool calls.
>
> Please note that this is a WIP and the design might change.

I've been playing around with this branch since it got merged into the main package. Is it intentional that the model doesn't interpret the results of a pure server-side tool call? (i.e. no client or mixed calls, just server-side tools)

The scenario I'm exploring is one where a model queries a vector DB to find similar documents (for Q&A). Before, the results of the server tool call would be "interpreted" by the model to formulate a coherent response, but right now the model sends back an empty content string in its message (but with a populated toolInvocations array containing the call and the raw response).

@lgrammel (Collaborator, Author) commented:

@rajdtta Not sure what you mean. Here is the example of a pure server-side call: https://github.com/vercel/ai/blob/main/examples/next-openai/app/use-chat-tool-result/page.tsx

That being said, feeding pure server-side calls back to the LLM automatically to generate the 2nd assistant response is not implemented yet.

@aychang95 commented:

I'm assuming you could generate the 2nd assistant response with a callback or an automated message workflow. We had a similar system to @rajdtta's, where we got a textual response on tool calls, but I think the toolInvocations array provides better control even if it's an unintentional tradeoff.

However, I'm sure most users will intuitively expect a coherent text response on server tool calls, so it may be good to implement anyway. It makes for a very "magic-like" response for anyone who gets tools/function calls working.

@rajdtta commented May 13, 2024

> That being said, feeding pure server-side calls back to the LLM automatically to generate the 2nd assistant response is not implemented yet.

Ah, this was exactly what I was trying to describe haha.

> However, I'm sure most users will intuitively expect a coherent text response on server tool calls, so it may be good to implement anyway. It makes for a very "magic-like" response for anyone who gets tools/function calls working.

I agree that it might be good to have it built into the library (either as opt-in or opt-out?). I'm not against making my own implementation, but I'd want to make sure that I'm following best practices and not calling the API unnecessarily.


On a side note, has anyone figured out how to persistently save/store tool calls when using streamText? The onFinal callback recommended by the docs returns an empty string whenever a tool call occurs (and of course it will only trigger on the final tool call and not for any intermediaries, which would presumably be caught by onCompletion).

@connorblack commented May 14, 2024

@lgrammel @aychang95 @rajdtta Checking to make sure I understand correctly: experimental_addToolResult is meant to handle both the server-side and client-side implementations for sending results back to the LLM for the 2nd response, effectively replacing the experimental_onToolCall -> return ChatRequest flow? And the way we now handle the 2nd invocation flow is that when the Message passed to onFinish contains toolInvocation results, we call experimental_addToolResult for both client and server tools?

How do we send the tool results back to the LLM in the server-side tool case without direct access to triggerRequest? Or does experimental_addToolResult handle de-duplicating the toolInvocations array, so we can do something like this:

  // handle sending response of server-side tool back to LLM
  
  const { experimental_addToolResult } = useChat({
    ....
    onFinish: async (message) => {
      if (
        message.role === 'assistant' &&
        message.toolInvocations &&
        message.toolInvocations.every(
          (toolInvocation) => 'result' in toolInvocation,
        )
      ) {
        return message.toolInvocations.forEach((toolInvocation) => {
          if ('result' in toolInvocation) {
            experimental_addToolResult({
              result: toolInvocation.result,
              toolCallId: toolInvocation.toolCallId,
            })
          }
        })
      }

      return onFinish(message)
    },
  ...
  })

@rajdtta commented May 14, 2024

> @lgrammel @aychang95 @rajdtta Checking to make sure I understand correctly: experimental_addToolResult is meant to handle both the server-side and client-side implementations for sending results back to the LLM for the 2nd response, effectively replacing the experimental_onToolCall -> return ChatRequest flow? And the way we now handle the 2nd invocation flow is that when the Message passed to onFinish contains toolInvocation results, we call experimental_addToolResult for both client and server tools?
>
> How do we send the tool results back to the LLM in the server-side tool case without direct access to triggerRequest? Or does experimental_addToolResult handle de-duplicating the toolInvocations array, so we can do something like this:
>
>   // handle sending response of server-side tool back to LLM
>
>   const { experimental_addToolResult } = useChat({
>     ....
>     onFinish: async (message) => {
>       if (
>         message.role === 'assistant' &&
>         message.toolInvocations &&
>         message.toolInvocations.every(
>           (toolInvocation) => 'result' in toolInvocation,
>         )
>       ) {
>         return message.toolInvocations.forEach((toolInvocation) => {
>           if ('result' in toolInvocation) {
>             experimental_addToolResult({
>               result: toolInvocation.result,
>               toolCallId: toolInvocation.toolCallId,
>             })
>           }
>         })
>       }
>
>       return onFinish(message)
>     },
>   ...
>   })

From my understanding, experimental_addToolResult is used to pipe results from client tool invocations back to the LLM (so that it's able to interpret the results and use them to decide its next step).

As for having the LLM interpret server-side tool calls, that's something I'm still trying to figure out in my sandbox. Quickly subbing in the snippet you sent above just causes the model to repeatedly call the server tool in a loop, and trying to use append to add a user message makes the model trigger a duplicate tool call (but interestingly, the result of the duplicate/second tool call doesn't get appended to the toolInvocations array despite it executing on the server side).

@connorblack commented:

@lgrammel have you looked at the runTools helper from the openai-node package? I feel like it could be adapted for the server tool functionality:

@webholics commented:

It seems that all tool invocations are now exposed to the client. But what if I don't want this? In my case a tool call is something which happens internally and should not be exposed to the client because it contains private data.

@rajdtta commented May 16, 2024

> It seems that all tool invocations are now exposed to the client. But what if I don't want this? In my case a tool call is something which happens internally and should not be exposed to the client because it contains private data.

@webholics maybe some sort of flag could be implemented in the tool definition that controls whether a tool response/invocation gets sent to the client (or which aspects are sent to the client, e.g. only the tool name, only the tool response, both, etc.).

@webholics commented May 17, 2024

Or some kind of callback on the AI stream to be able to filter out or change messages programmatically.

@lgrammel (Collaborator, Author) commented:

@webholics @rajdtta Filtering tool calls needs to be handled very carefully, because it impacts the messages sent to the provider in the next roundtrip. If tool calls are left out, the LLM result will be affected, potentially leading to weird answers or, in the case of an inconsistent history, errors.

@webholics commented:

> @webholics @rajdtta Filtering tool calls needs to be handled very carefully, because it impacts the messages sent to the provider in the next roundtrip. If tool calls are left out, the LLM result will be affected, potentially leading to weird answers or, in the case of an inconsistent history, errors.

I see. That's why this whole topic is so complex. For pure server-side tools we would need some kind of server-side persistence for those calls and responses, which is of course a bit out of scope for the AI SDK.

It's just very likely now to run into security issues, because the backend exposes a lot of information to the client by default that the developer may not expect.

@lgrammel (Collaborator, Author) commented:

@webholics If you want to filter manually, you could use .fullStream from the streamText result and formatStreamPart to append to any stream you want. It's more complicated, but it gives you full control.
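
For illustration, a rough sketch of such manual filtering is shown below. It assumes the fullStream part type 'text-delta' and the 'text' name for formatStreamPart; check both against the SDK version you are on.

```ts
// app/api/chat/route.ts: sketch that streams only model text to the client,
// keeping tool calls and tool results server-only
import { openai } from '@ai-sdk/openai';
import { convertToCoreMessages, formatStreamPart, streamText, StreamingTextResponse } from 'ai';

export async function POST(req: Request) {
  const { messages } = await req.json();

  const result = await streamText({
    model: openai('gpt-4-turbo'),
    messages: convertToCoreMessages(messages),
    // tools: { ... } (define the private, server-only tools here)
  });

  // Instead of result.toAIStreamResponse(), build the client-facing stream manually
  // from fullStream so that tool calls and tool results never leave the server.
  const filtered = new ReadableStream<string>({
    async start(controller) {
      for await (const part of result.fullStream) {
        if (part.type === 'text-delta') {
          controller.enqueue(formatStreamPart('text', part.textDelta));
        }
        // 'tool-call' and 'tool-result' parts are intentionally dropped here;
        // forward selected ones with formatStreamPart if they should reach the client.
      }
      controller.close();
    },
  });

  return new StreamingTextResponse(filtered.pipeThrough(new TextEncoderStream()));
}
```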

@rajdtta commented May 21, 2024

@lgrammel do you happen to know what the best practices are for the following:

  1. Storing server-side tool calls & results persistently as part of the conversation/message history.

Right now, I'm upserting the latest 2 messages sent to the API route whenever it's called (which, most of the time, includes the tool invocations/results). I'm not sure if this will work for a scenario where a tool error occurs & the model decides to re-call a tool after spitting out some filler message (i.e. "Sorry I was having a hard time doing this, let me try again...")

const LATEST_2_MESSAGES = messages.slice(-2);

for (const message of LATEST_2_MESSAGES) {
  await ChatService.postMessage({
    conversationID,
    message: {
      id: message.id,
      role: message.role,
      message: message.content,
      toolInvocations: message.toolInvocations ?? undefined,
    },
  });
}
  2. Piping the result of a server-side tool call back to the LLM so that it can be interpreted by the model.

My current implementation of this involves looking at the tool invocations for the latest message (if it exists), and checking if a specific flag is present in the tool results that determines whether a tool call should be interpreted by the model (and another flag to check whether it has already been processed or not).

// take the most recent message object (slice(-1) would return an array, not a message)
const LATEST_MESSAGE = messages[messages.length - 1];

if (LATEST_MESSAGE && LATEST_MESSAGE.toolInvocations) {
  LATEST_MESSAGE.toolInvocations.forEach((invocation) => {
    if ("result" in invocation && invocation.result.interpretable) {
      if (!invocation.result.interpreted) {
        experimental_addToolResult({
          toolCallId: invocation.toolCallId,
          result: {
            ...invocation.result,
            interpreted: true,
          },
        });
      }
    }
  });
}
