Releases: BerriAI/litellm
v1.38.1
What's Changed
- [Fix] raise exception when creating a team with an existing team_id on `/team/new` by @ishaan-jaff in #3791 (see the sketch below)
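A minimal sketch of the new behavior, using Python's requests library against a locally running proxy. The base URL, master key, and printed status codes are illustrative assumptions; only the `/team/new` route and `team_id` field come from the entry above.

```python
import requests

BASE_URL = "http://localhost:4000"             # assumed local proxy address
HEADERS = {"Authorization": "Bearer sk-1234"}  # hypothetical master key

# First call creates the team.
resp = requests.post(f"{BASE_URL}/team/new", headers=HEADERS,
                     json={"team_id": "team-alpha"})
print(resp.status_code)  # expected: success

# Second call reuses the same team_id; as of v1.38.1 the proxy should
# raise an exception instead of silently creating a duplicate.
resp = requests.post(f"{BASE_URL}/team/new", headers=HEADERS,
                     json={"team_id": "team-alpha"})
print(resp.status_code, resp.text)  # expected: an error response
```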
Full Changelog: v1.38.0...v1.38.1
Docker Run LiteLLM Proxy
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.38.1
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms)
---|---|---|---|---|---|---|---|---|---
/chat/completions | Failed ❌ | 8 | 9.75 | 1.55 | 1.55 | 463 | 463 | 6.58 | 165.80
/health/liveliness | Failed ❌ | 8 | 10.12 | 15.69 | 15.69 | 4698 | 4698 | 6.30 | 755.46
/health/readiness | Failed ❌ | 8 | 11.07 | 15.60 | 15.60 | 4671 | 4671 | 6.34 | 1489.85
Aggregated | Failed ❌ | 8 | 10.55 | 32.84 | 32.84 | 9832 | 9832 | 6.30 | 1489.85
v1.38.0-stable
What's Changed
- [Fix] raise exception when creating a team with an existing team_id on `/team/new` by @ishaan-jaff in #3791
Full Changelog: v1.38.0...v1.38.0-stable
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.38.0-stable
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
v1.38.0
🚨🚨 Minor updates were made to the DB schema in this release
🚨 Change to LiteLLM Helm: `run_gunicorn` has been removed from the default Helm chart, to comply with our best practices: https://docs.litellm.ai/docs/proxy/prod
What's Changed
- [Feat] add failure callbacks from DB to proxy by @ishaan-jaff in #3775
- [Fix] don't use `gunicorn` on the LiteLLM Helm chart by @ishaan-jaff in #3783
- [Feat] LiteLLM Proxy: Enforce End-User TPM, RPM Limits by @ishaan-jaff in #3785 (see the sketch after this list)
- feat(schema.prisma): store model id + model group as part of spend logs allows precise model metrics by @krrishdholakia in #3789
- feat(proxy_server.py): enable admin to set tpm/rpm limits for end-users via UI by @krrishdholakia in #3787
- [Feat] Set Budgets for Users within a Team by @ishaan-jaff in #3790
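A minimal sketch of how the end-user limit entries above (#3785, #3787) look from the caller's side, under the assumption that the proxy attributes requests to an end-user via the OpenAI-compatible `user` field; the base URL, key, and end-user id are placeholders. Once an admin sets TPM/RPM limits or a budget for that end-user, requests beyond the limit should be rejected.

```python
from openai import OpenAI

# Hypothetical proxy address and virtual key; adjust for your deployment.
client = OpenAI(base_url="http://localhost:4000", api_key="sk-1234")

# `user` identifies the end-user; the proxy enforces any TPM/RPM limits
# or budget an admin has configured for "end-user-123" (assumption).
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Hello!"}],
    user="end-user-123",
)
print(response.choices[0].message.content)
```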
Full Changelog: v1.37.20...v1.38.0
Docker Run LiteLLM Proxy
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.38.0
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
v1.37.20
What's Changed
- Upgrade Traceloop to version 0.18.2 by @elisalimli in #3727
- usage-based-routing-ttl-on-cache by @sumanth13131 in #3412
- Revert "Revert "Logfire Integration"" by @elisalimli in #3756
- docs - add bedrock meta llama3 by @ishaan-jaff in #3763
- [Cohere] Add request source to request by @BeatrixCohere in #3759
- [Fix] Bump OpenAI version on Litellm PIP package [OpenAI>=1.27.0] by @ishaan-jaff in #3765
- Support anthropic 'tool_choice' param by @krrishdholakia in #3771
- [Feat] Proxy - Create Keys that can only access `/spend` routes on Admin UI by @ishaan-jaff in #3772
- feat(lowest_latency.py): route by time to first token, for streaming requests (if available) by @krrishdholakia in #3768 (see the Router sketch after this list)
- feat(router.py): filter out deployments which don't support request params w/ 'pre_call_checks=True' by @krrishdholakia in #3770
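A minimal Router sketch combining #3768 and #3770, assuming the `routing_strategy="latency-based-routing"` and `enable_pre_call_checks` options (the entry above spells the flag 'pre_call_checks=True', so check your version for the exact name); the deployments, keys, and base URL are placeholders.

```python
from litellm import Router

# Two hypothetical deployments behind one model group.
model_list = [
    {
        "model_name": "gpt-3.5-turbo",
        "litellm_params": {
            "model": "azure/my-gpt35-deployment",  # placeholder deployment
            "api_key": "...",
            "api_base": "https://example.openai.azure.com",
        },
    },
    {
        "model_name": "gpt-3.5-turbo",
        "litellm_params": {"model": "gpt-3.5-turbo", "api_key": "..."},
    },
]

router = Router(
    model_list=model_list,
    # Prefers the fastest deployment; for streaming requests,
    # time-to-first-token is used when available (#3768).
    routing_strategy="latency-based-routing",
    # Filter out deployments that don't support the request's params (#3770).
    enable_pre_call_checks=True,
)

response = router.completion(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Hi"}],
)
```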
New Contributors
- @BeatrixCohere made their first contribution in #3759
Full Changelog: v1.37.19...v1.37.20
Docker Run LiteLLM Proxy
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.37.20
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
v1.37.19-stable
🚨 Starting with this release, SSO on the LiteLLM Proxy is enforced behind a license
- If you use SSO on the LiteLLM Admin UI + Proxy and want a license, meet with us here: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
What's Changed
- [Fix] only run `check_request_disconnection` logic for a maximum of 10 mins by @ishaan-jaff in #3741
- Adding decoding of base64 image data for gemini pro 1.5 by @hmcp22 in #3711
- [Feat] Enforce user has a valid license when using SSO on LiteLLM Proxy by @ishaan-jaff in #3742
- [FEAT] Async VertexAI Image Generation by @ishaan-jaff in #3739
- [Feat] Router/ Proxy - set cooldown_time based on Azure exception headers by @ishaan-jaff in #3716
- fix divide by 0 bug on slack alerting by @ishaan-jaff in #3745
- Standardize slack exception msg format by @ishaan-jaff in #3747
- Another dictionary changed size during iteration error by @phact in #3657
- feat(proxy_server.py): allow admin to return rejected response as string to user by @krrishdholakia in #3740
- [Fix] raise 404 from `/team/info` when team does not exist by @ishaan-jaff in #3749
- webhook support for budget alerts by @krrishdholakia in #3748
- [Fix] - raise Exception when trying to update/delete a non-existent team by @ishaan-jaff in #3750
- [FEAT] add `litellm.Router` - `abatch_completion_one_model_multiple_requests` by @ishaan-jaff in #3751 (see the sketch after this list)
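A minimal sketch of the new Router helper, assuming it takes one model name plus a list of independent message lists and returns one response per request; deployment details are placeholders.

```python
import asyncio
from litellm import Router

router = Router(
    model_list=[
        {
            "model_name": "gpt-3.5-turbo",
            "litellm_params": {"model": "gpt-3.5-turbo", "api_key": "sk-..."},
        }
    ]
)

async def main():
    # Assumed signature: one model, several message lists, fanned out
    # concurrently; returns a list of responses in order.
    responses = await router.abatch_completion_one_model_multiple_requests(
        model="gpt-3.5-turbo",
        messages=[
            [{"role": "user", "content": "Tell me a joke"}],
            [{"role": "user", "content": "What is LiteLLM?"}],
        ],
    )
    for r in responses:
        print(r.choices[0].message.content)

asyncio.run(main())
```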
Full Changelog: v1.37.17...v1.37.19-stable
Docker Run LiteLLM Proxy
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.37.19-stable
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
v1.37.19
🚨 Starting with this release, SSO on the LiteLLM Proxy is enforced behind a license
- If you use SSO on the LiteLLM Admin UI + Proxy and want a license, meet with us here: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
What's Changed
- [Fix] only run `check_request_disconnection` logic for a maximum of 10 mins by @ishaan-jaff in #3741
- Adding decoding of base64 image data for gemini pro 1.5 by @hmcp22 in #3711
- [Feat] Enforce user has a valid license when using SSO on LiteLLM Proxy by @ishaan-jaff in #3742
- [FEAT] Async VertexAI Image Generation by @ishaan-jaff in #3739
- [Feat] Router/ Proxy - set cooldown_time based on Azure exception headers by @ishaan-jaff in #3716
- fix divide by 0 bug on slack alerting by @ishaan-jaff in #3745
- Standardize slack exception msg format by @ishaan-jaff in #3747
- Another dictionary changed size during iteration error by @phact in #3657
- feat(proxy_server.py): allow admin to return rejected response as string to user by @krrishdholakia in #3740
- [Fix] raise 404 from `/team/info` when team does not exist by @ishaan-jaff in #3749
- webhook support for budget alerts by @krrishdholakia in #3748
- [Fix] - raise Exception when trying to update/delete a non-existent team by @ishaan-jaff in #3750
- [FEAT] add `litellm.Router` - `abatch_completion_one_model_multiple_requests` by @ishaan-jaff in #3751
Full Changelog: v1.37.17...v1.37.19
Docker Run LiteLLM Proxy
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.37.19
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
v1.37.17
What's Changed
- fix(utils.py): drop response_format if 'drop_params=True' for gpt-4 by @krrishdholakia in #3724 (see the sketch after this list)
- fix(vertex_ai.py): support passing in result of tool call to vertex by @krrishdholakia in #3729
- feat(proxy_cli.py): support json logs on proxy by @krrishdholakia in #3737
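A minimal sketch of the `drop_params` fix above (#3724), assuming the per-call `drop_params` kwarg; the model and prompt are placeholders. With `drop_params=True`, an unsupported `response_format` should be dropped for the older gpt-4 instead of raising an error.

```python
import litellm

# With drop_params=True, params the target model doesn't support, like
# response_format on the older gpt-4, are dropped rather than raising (#3724).
response = litellm.completion(
    model="gpt-4",
    messages=[{"role": "user", "content": "Reply with a JSON object"}],
    response_format={"type": "json_object"},
    drop_params=True,
)
print(response.choices[0].message.content)
```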
Full Changelog: v1.37.16...v1.37.17
Docker Run LiteLLM Proxy
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.37.17
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms)
---|---|---|---|---|---|---|---|---|---
/chat/completions | Failed ❌ | 19 | 21.89 | 1.61 | 1.61 | 481 | 481 | 17.35 | 105.27
/health/liveliness | Failed ❌ | 18 | 22.19 | 15.68 | 15.68 | 4694 | 4694 | 16.80 | 1270.15
/health/readiness | Failed ❌ | 18 | 22.52 | 15.69 | 15.69 | 4696 | 4696 | 16.94 | 1206.00
Aggregated | Failed ❌ | 18 | 22.33 | 32.98 | 32.98 | 9871 | 9871 | 16.80 | 1270.15
v1.37.16
What's Changed
- fix - allow non master key to access llm_utils_routes by @ishaan-jaff in #3710
- fix(bedrock_httpx.py): move anthropic bedrock calls to httpx by @krrishdholakia in #3708
- [Feat] Admin UI - use `base_model` for Slack Alerts by @ishaan-jaff in #3713 (see the sketch after this list)
- [Admin UI] show max input tokens on UI by @ishaan-jaff in #3714
- fix(proxy_server.py): fix setting model id for db models by @krrishdholakia in #3715
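A minimal sketch of where `base_model` is set, shown as a Router `model_list` entry in Python; the deployment name, key, and base URL are placeholders. `base_model` maps an Azure deployment alias to the real underlying model, so Slack alerts (#3713) and cost tracking can report the actual model.

```python
from litellm import Router

router = Router(
    model_list=[
        {
            "model_name": "my-azure-gpt4",  # alias callers use
            "litellm_params": {
                "model": "azure/my-gpt4-deployment",  # hypothetical deployment
                "api_key": "sk-...",
                "api_base": "https://example.openai.azure.com",
            },
            # base_model tells LiteLLM which real model backs this
            # deployment, so alerts name "azure/gpt-4", not the alias.
            "model_info": {"base_model": "azure/gpt-4"},
        }
    ]
)
```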
Full Changelog: v1.37.14...v1.37.16
Docker Run LiteLLM Proxy
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.37.16
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms)
---|---|---|---|---|---|---|---|---|---
/chat/completions | Failed ❌ | 9 | 10.29 | 1.56 | 1.56 | 468 | 468 | 7.44 | 83.99
/health/liveliness | Failed ❌ | 8 | 10.80 | 15.63 | 15.63 | 4681 | 4681 | 6.30 | 1272.48
/health/readiness | Failed ❌ | 8 | 10.78 | 15.71 | 15.71 | 4705 | 4705 | 6.29 | 650.46
Aggregated | Failed ❌ | 8 | 10.77 | 32.91 | 32.91 | 9854 | 9854 | 6.29 | 1272.48
v1.37.14
What's Changed
Full Changelog: v1.37.13...v1.37.14
Docker Run LiteLLM Proxy
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.37.14
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms)
---|---|---|---|---|---|---|---|---|---
/chat/completions | Failed ❌ | 9 | 11.89 | 1.63 | 1.63 | 488 | 488 | 7.55 | 178.09
/health/liveliness | Failed ❌ | 8 | 10.90 | 15.53 | 15.53 | 4650 | 4650 | 6.33 | 907.12
/health/readiness | Failed ❌ | 8 | 11.16 | 15.69 | 15.69 | 4697 | 4697 | 6.46 | 1189.81
Aggregated | Failed ❌ | 8 | 11.07 | 32.84 | 32.84 | 9835 | 9835 | 6.33 | 1189.81
v1.37.13-stable
Full Changelog: v1.37.13...v1.37.13-stable
Docker Run LiteLLM Proxy
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.37.13-stable
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat