
Can we stream the response? #14

Open
Justus-M opened this issue Mar 5, 2024 · 2 comments

Comments


Justus-M commented Mar 5, 2024

For most chatbot applications, it's important to stream the response, especially in cases where a tool isn't actually being used.

Not just that, but it's also important to know whether a tool is being used, or whether the bot is just being slow to respond.

Either way, the user will be waiting (in complex cases this might take several minutes) without knowing what's going on.

The bot might be generating a function call, or it might be generating a message. As far as I can tell, when we call `use_tools()` we don't know which of the two is happening until it's done, so we can't show the user what's going on.

More importantly, if Claude is generating a normal message, we can't stream the response to the user so they can read it as it's being created; they have to wait and then read it all at once, which costs them a lot of time overall.

If I'm wrong please let me know; otherwise I'd say this is quite an important feature request, and I'll use the raw XML in the meantime.

@nmarwell-anthropic
Collaborator

Hi @Justus-M, thanks for the feedback. A few notes for you below!

  1. Most of what you are hoping to do can be accomplished by setting execution_mode='manual' when invoking tool_user.use_tools(), which will stop execution as soon as a single set of function calls or a message is returned (see the sketch below this list). You can see more about it here:

    You can then make use of your ToolUser by calling its `use_tools()` method and passing in your desired prompt. Setting execution mode to "automatic" makes it execute the function; in the default "manual" mode it returns the function arguments back to the client to be executed there.

  2. Re streaming, we know this is a blind spot and hope to address it when this package moves out of beta, likely into our API. If you would like to submit a PR to this repo to support it before then, we would welcome it!
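
For concreteness, here is a minimal sketch of that manual flow, assuming the ToolUser / `use_tools()` interface described above. The import path, the tool list, and the shape of the returned value are assumptions, so check them against the repo before relying on this:

```python
from tool_use_package.tool_user import ToolUser  # assumed import path

# Fill with tool definitions built per the repo's examples; the exact
# construction is out of scope for this sketch.
my_tools = []

tool_user = ToolUser(my_tools)

# In manual mode, use_tools() stops as soon as Claude produces either a set
# of function calls or a plain message, handing control back to the client.
result = tool_user.use_tools(
    "What's the weather in San Francisco?",
    execution_mode="manual",
)

# Illustrative dispatch: the "role"/"tool_inputs" field names are assumptions
# about the return shape, used only to show how a client might branch.
if isinstance(result, dict) and result.get("role") == "tool_inputs":
    print("Claude requested a tool call:", result)
else:
    print("Claude replied with a message:", result)
```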

Let me know if you have any follow-up questions.
Nick


Justus-M commented Mar 6, 2024

Hi Nick, thanks for your quick response.

When I said "generating a function call" I meant Claude generating the function arguments, not my system actually executing the function. I am already using the manual option, and the execution tends to be much faster than the function-input/text generation anyway.

Unfortunately, manual mode doesn't make a difference in this case.

Regardless, I would argue that it doesn't get people "most" of the way there: the fact that non-function-call messages aren't streamed is arguably the bigger issue, since a streamed message is something users can actually read and get value from while they wait. With this library, users sit there frustrated (especially for long messages) and then have to read a long message all at once. It's hard to overstate how important streaming is for any application that displays these messages to users, especially given that this is what people are already used to with ChatGPT and claude.ai.

Glad to know this is on your radar. I've already implemented an initial version using the raw XML and will consider using this repo when streaming is supported.
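
For reference, here is a minimal sketch of what that raw-XML streaming approach can look like with the Anthropic Python SDK's `client.messages.stream()` helper. The `<function_calls>` tag, the model name, and the prompt are assumptions rather than anything this repo ships; the point is only that buffering the first few streamed characters is enough to tell a tool call apart from a normal message:

```python
import anthropic

# Assumed tag name from the XML tool-use prompt format; adjust to your prompt.
TOOL_TAG = "<function_calls>"

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

def stream_reply(user_message: str) -> str:
    """Stream Claude's reply, telling a tool call apart from a normal message.

    Claude either opens with the tool-call tag or with plain text, so we only
    need to buffer until the prefix matches, or diverges from, the tag.
    """
    prefix = ""
    decided = False
    is_tool_call = False
    with client.messages.stream(
        model="claude-3-opus-20240229",  # example model name
        max_tokens=1024,
        messages=[{"role": "user", "content": user_message}],
    ) as stream:
        for text in stream.text_stream:
            prefix += text
            if not decided:
                stripped = prefix.lstrip()
                if stripped.startswith(TOOL_TAG):
                    decided = True
                    is_tool_call = True
                    print("[Claude is calling a tool...]")
                elif not TOOL_TAG.startswith(stripped):
                    # The prefix has diverged from the tag: it's a message,
                    # so flush what we buffered and stream from here on.
                    decided = True
                    print(prefix, end="", flush=True)
            elif not is_tool_call:
                print(text, end="", flush=True)
            # If it is a tool call, the XML keeps accumulating in `prefix`.
    return prefix  # full text, or the <function_calls> XML to parse and run
```

Buffering only until the prefix diverges from the tag means users start seeing text almost immediately, while a tool call surfaces a status indicator instead of raw XML; in manual mode the returned XML could then be parsed into function arguments and executed client-side.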
