Allow for message history to be omitted when making request to llm #1468
Feature: Add support for appending messages to existing conversations while providing the option to include only the latest message in the conversation history.
Change: Introduces an optional boolean flag, exclude_history. When set to true, only the most recent message in the conversation is sent to the LLM instead of the complete history.
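As a rough illustration of the flag's behavior (the Conversation class, method names, and request shape below are hypothetical, not the PR's actual API):

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Conversation:
    """Minimal sketch of a conversation that can omit history per request."""
    messages: List[Dict[str, str]] = field(default_factory=list)

    def add(self, role: str, content: str) -> None:
        self.messages.append({"role": role, "content": content})

    def build_request(self, exclude_history: bool = False) -> Dict[str, List[Dict[str, str]]]:
        # exclude_history=True: send only the latest message, keeping the
        # payload small; otherwise send the full message history.
        msgs = self.messages[-1:] if exclude_history else list(self.messages)
        return {"messages": msgs}


conv = Conversation()
conv.add("user", "What is a monad?")
conv.add("assistant", "A monad is a structure for sequencing computations.")
conv.add("user", "Give me an example in Haskell.")

full = conv.build_request()                      # all three messages
latest_only = conv.build_request(exclude_history=True)  # only the last user message
```

The stored conversation is untouched either way; the flag only controls what is included in the outgoing request.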
Benefits:
Use Case
In scenarios where back-end services impose payload size limits, this flag lets developers exclude potentially large conversation histories from the request. A typical use case is sending an inference request containing only the latest user prompt while managing the full conversation history on the backend.
Important Note: Developers must still implement their own backend persistence if the LLM needs access to the conversation history beyond the latest user message.