
Can we stream the response? #14

Open
Justus-M opened this issue Mar 5, 2024 · 2 comments

Comments


Justus-M commented Mar 5, 2024

For most chatbot applications, it's important to stream the response, especially in cases where a tool isn't actually being used.

Not just that, but it's also important to know whether a tool is being used, or whether the bot is just being slow to respond.

Either way, the user will be waiting (in complex cases this might take several minutes) without knowing what's going on.

The bot might be generating a function call, or it might be generating a message. As far as I can tell, when we call `use_tools()` we don't know which of the two is happening until it's done, so we can't show the user what's going on.

More importantly, if Claude is generating a normal message, we can't stream the response to the user so they can read it as it's being created; they have to wait and then read it all at once, which costs them a lot of time overall.

If I'm wrong please let me know; otherwise I'd say this is quite an important feature request, and I'll use the raw XML in the meantime.

@nmarwell-anthropic
Collaborator

Hi @Justus-M, thanks for the feedback. A few notes for you below!

  1. Most of what you are hoping to do can be accomplished by setting execution_mode='manual' when invoking tool_user.use_tools(), which will stop execution as soon as a single set of function calls or a message is returned (see the sketch below this list). You can see more about it here:

    You can then make use of your ToolUser by calling its `use_tools()` method and passing in your desired prompt. Setting execution mode to "automatic" makes it execute the function; in the default "manual" mode it returns the function arguments back to the client to be executed there.

  2. Re streaming, we know this is a blind spot and hope to address it when this package moves out of beta, likely into our API. If you would like to submit a PR to this repo to support it before then, we would welcome it!
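
For concreteness, here is a minimal sketch of that manual flow, assuming the ToolUser / `use_tools()` interface described above. The import path, the tool list, and the shape of the returned value are assumptions, so check them against the repo before relying on this:

```python
from tool_use_package.tool_user import ToolUser  # assumed import path

# Fill with tool definitions built per the repo's examples; the exact
# construction is out of scope for this sketch.
my_tools = []

tool_user = ToolUser(my_tools)

# In manual mode, use_tools() stops as soon as Claude produces either a set
# of function calls or a plain message, handing control back to the client.
result = tool_user.use_tools(
    "What's the weather in San Francisco?",
    execution_mode="manual",
)

# Illustrative dispatch: the "role"/"tool_inputs" field names are assumptions
# about the return shape, used only to show how a client might branch.
if isinstance(result, dict) and result.get("role") == "tool_inputs":
    print("Claude requested a tool call:", result)
else:
    print("Claude replied with a message:", result)
```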

Let me know if you have any follow-up questions.
Nick


Justus-M commented Mar 6, 2024

Hi Nick, thanks for your quick response.

When I said "generating a function call" I meant Claude generating the function arguments, not my system actually executing the function. I am already using the manual option, and the execution tends to be much faster than the function-input/text generation anyway.

Unfortunately, manual mode doesn't make a difference in this case.

Regardless, I would argue that it doesn't get people "most" of the way there: the fact that non-function-call messages aren't streamed is arguably the bigger issue, since a streamed message is something users can actually read and get value from while they wait. With this library, users sit there frustrated (especially for long messages) and then have to read a long message all at once. It's hard to overstate how important streaming is for any application that displays these messages to users, especially given that this is what people are already used to with ChatGPT and claude.ai.

Glad to know this is on your radar. I've already implemented an initial version using the raw XML and will consider using this repo when streaming is supported.
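
For reference, here is a minimal sketch of what that raw-XML streaming approach can look like with the Anthropic Python SDK's `client.messages.stream()` helper. The `<function_calls>` tag, the model name, and the prompt are assumptions rather than anything this repo ships; the point is only that buffering the first few streamed characters is enough to tell a tool call apart from a normal message:

```python
import anthropic

# Assumed tag name from the XML tool-use prompt format; adjust to your prompt.
TOOL_TAG = "<function_calls>"

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

def stream_reply(user_message: str) -> str:
    """Stream Claude's reply, telling a tool call apart from a normal message.

    Claude either opens with the tool-call tag or with plain text, so we only
    need to buffer until the prefix matches, or diverges from, the tag.
    """
    prefix = ""
    decided = False
    is_tool_call = False
    with client.messages.stream(
        model="claude-3-opus-20240229",  # example model name
        max_tokens=1024,
        messages=[{"role": "user", "content": user_message}],
    ) as stream:
        for text in stream.text_stream:
            prefix += text
            if not decided:
                stripped = prefix.lstrip()
                if stripped.startswith(TOOL_TAG):
                    decided = True
                    is_tool_call = True
                    print("[Claude is calling a tool...]")
                elif not TOOL_TAG.startswith(stripped):
                    # The prefix has diverged from the tag: it's a message,
                    # so flush what we buffered and stream from here on.
                    decided = True
                    print(prefix, end="", flush=True)
            elif not is_tool_call:
                print(text, end="", flush=True)
            # If it is a tool call, the XML keeps accumulating in `prefix`.
    return prefix  # full text, or the <function_calls> XML to parse and run
```

Buffering only until the prefix diverges from the tag means users start seeing text almost immediately, while a tool call surfaces a status indicator instead of raw XML; in manual mode the returned XML could then be parsed into function arguments and executed client-side.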
