Open WebUI as inference API offering OpenAI API conformity and load balancing #1984
arthurGrigo
started this conversation in Ideas
Describe the solution you'd like
Open WebUI's load-balancing feature, which can distribute requests across multiple Ollama instances (or other OpenAI-compatible APIs), would be very useful for spreading API calls in an LLM application.
It would be helpful if Open WebUI itself could expose an OpenAI-compatible inference API that takes advantage of this load balancing, so an application could send all its requests to Open WebUI and have them distributed across the configured backends.
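To make the idea concrete, here is a rough sketch of how an application might call such an endpoint with the official `openai` Python client. The base URL, route, API-key handling, and model name are assumptions for illustration only; this endpoint is exactly what is being requested and may not exist yet.

```python
from openai import OpenAI

# Hypothetical client-side usage of the requested feature.
# Base URL, path, and API key are placeholders; Open WebUI would internally
# spread these calls across its configured Ollama / OpenAI-compatible backends.
client = OpenAI(
    base_url="http://localhost:3000/api/v1",  # assumed Open WebUI endpoint
    api_key="owui-api-key",                   # assumed Open WebUI API key
)

response = client.chat.completions.create(
    model="llama3",  # placeholder model name served by the backends
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```

The application would only ever talk to this single endpoint; which backend actually serves each request would be decided by Open WebUI's existing load balancing.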
Describe alternatives you've considered
An alternative would be to place LiteLLM between the application and Ollama, but this adds an extra component and complicates the setup for most people.
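For comparison, the LiteLLM alternative boils down to something like the following sketch, here using LiteLLM's Python `Router` instead of the standalone proxy so the example stays self-contained. Hostnames, ports, and the model name are placeholders.

```python
from litellm import Router

# Sketch of the LiteLLM-in-the-middle alternative: two Ollama backends that
# serve the same model, load-balanced by LiteLLM. In a real deployment this
# would usually run as the LiteLLM proxy rather than inside application code.
router = Router(
    model_list=[
        {
            "model_name": "llama3",
            "litellm_params": {"model": "ollama/llama3", "api_base": "http://ollama-1:11434"},
        },
        {
            "model_name": "llama3",
            "litellm_params": {"model": "ollama/llama3", "api_base": "http://ollama-2:11434"},
        },
    ]
)

response = router.completion(
    model="llama3",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```

This works, but it means running and maintaining one more service purely for load balancing, which Open WebUI already does internally.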