Updated Ollama cost models to include LLaMa3 and Mistral/Mixtral Instruct series #3543
Conversation
Added Mistral and Mixtral Chat entries for Ollama.
@ishaan-jaff If you wouldn't mind reviewing this sometime soon, it would be helpful so that I don't have to keep working off a local fork of LiteLLM to use Ollama. Thanks!
model_prices_and_context_window.json
Outdated
"max_input_tokens": 8192, | ||
"max_output_tokens": 8192, | ||
"input_cost_per_token": 0.00000010, | ||
"output_cost_per_token": 0.00000010, |
does ollama have a hosted endpoint? why is there a cost here
Good catch, an oversight during copy&paste - I'll fix that.
fixed typo with ollama/llama3 token cost (now set to 0)
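For reference, a corrected entry for a locally served model would look roughly like this (illustrative sketch shown as a Python dict mirroring the JSON file; exact token limits may differ):

# Sketch of a zero-cost entry for a locally served Ollama model, mirroring
# the structure of model_prices_and_context_window.json.
ollama_llama3_entry = {
    "max_tokens": 8192,
    "max_input_tokens": 8192,
    "max_output_tokens": 8192,
    "input_cost_per_token": 0.0,   # no hosted endpoint, so no per-token cost
    "output_cost_per_token": 0.0,
    "litellm_provider": "ollama",
    "mode": "chat",
}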
litellm/utils.py
Outdated
@@ -6620,7 +6620,7 @@ def _get_max_position_embeddings(model_name):
             raise Exception()
     except:
         raise Exception(
-            "This model isn't mapped yet. Add it here - https://github.com/BerriAI/litellm/blob/main/model_prices_and_context_window.json"
+            f"Model {model} from provider {custom_llm_provider} isn't mapped yet. Add it here - https://github.com/BerriAI/litellm/blob/main/model_prices_and_context_window.json"
@kmheckel this would cause another error - if custom_llm_provider isn't found from 'get_llm_provider'
I'll remove the custom_llm_provider reference in that case.
changed error message
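The revised message presumably drops the provider reference, along these lines (a sketch of the direction discussed above, not necessarily the exact final wording):

# Sketch: error message including only the model name.
raise Exception(
    f"Model {model} isn't mapped yet. Add it here - "
    "https://github.com/BerriAI/litellm/blob/main/model_prices_and_context_window.json"
)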
Title
Updated Ollama cost models to include LLaMa3 and Mistral/Mixtral Instruct series
Relevant issues
None currently open.
Type
🆕 New Feature
Changes
Updated model_prices_and_context_window.json to include chat endpoints for LLaMa3 and Mistral/Mixtral Instruct.
This enables users to run any LLaMa3/Mistral/Mixtral finetune in chat mode, for example by doing
ollama cp <finetune name> <llama3/mistral-7B-Instruct-v0.1>
Also includes a small error message modification to include the model and provider name in the error message for custom LLM providers. This makes debugging easier for users, as they can see whether it's just a typo or whether they need to update the cost model file.
Testing
To test, instructions from this AutoGen tutorial mostly suffice: https://microsoft.github.io/autogen/docs/topics/non-openai-models/local-litellm-ollama/
ollama serve
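Beyond the proxy-based tutorial above, a minimal direct check might look like this (a sketch; assumes ollama serve is running on the default port 11434 and the llama3 model has been pulled):

import litellm

# Sketch: direct completion call against a local Ollama server.
response = litellm.completion(
    model="ollama/llama3",
    messages=[{"role": "user", "content": "Say hello."}],
    api_base="http://localhost:11434",
    stream=False,
)
print(response.choices[0].message.content)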
Notes
From my understanding, this pull request removes the need to specify custom pricing for these models via a config.yaml on the command line: https://docs.litellm.ai/docs/proxy/custom_pricing
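For example (a sketch, assuming the new mapping is picked up), per-token cost lookup should now return zero without any extra configuration:

import litellm

# Sketch: cost lookup uses the bundled mapping, no config.yaml override needed.
prompt_cost, completion_cost = litellm.cost_per_token(
    model="ollama/llama3",
    prompt_tokens=100,
    completion_tokens=50,
)
print(prompt_cost, completion_cost)  # expected: 0.0 0.0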
Tested locally to confirm functionality, with streaming set to False as part of the LiteLLM interface to DataDreamer.
https://github.com/datadreamer-dev/DataDreamer/blob/0.35.0/src/llms/_litellm.py
No concerns or substantial modifications; update simply adjusts metadata for newer Ollama models.
Pre-Submission Checklist (optional but appreciated):
OS Tests (optional but appreciated):
Not tested on other operating systems, but this change neither breaks nor fixes other open issues with the Ollama-->LiteLLM integration.