-
The Helm chart in the kubernetes directory creates a StatefulSet and a ClusterIP service for Ollama. Behind a single URL like http://ollama-service.open-webui.svc.cluster.local:11434, open-webui loses track of which instance has which model, so I think this is a mistake. Using a headless service with a StatefulSet is common practice in k8s, and open-webui could then fetch all Ollama instance IPs through DNS resolution. Did I miss something?
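To make the DNS idea concrete, here is a rough sketch of how a client could enumerate the Ollama pods behind a headless service. This is not how open-webui works today; the function name `resolve_backends` and the service name `ollama-headless` are made up for illustration. It relies only on the fact that a headless service returns one A record per ready pod.

```python
import socket


def resolve_backends(hostname, port, resolver=socket.getaddrinfo):
    """Return one base URL per IPv4 address that `hostname` resolves to.

    With a headless service, DNS returns the IP of every ready pod
    instead of a single ClusterIP, so each entry is one Ollama instance.
    `resolver` is injectable so the function can be tested without DNS.
    """
    infos = resolver(hostname, port, family=socket.AF_INET, type=socket.SOCK_STREAM)
    # Each result is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address. Deduplicate and sort for stable output.
    ips = sorted({info[4][0] for info in infos})
    return [f"http://{ip}:{port}" for ip in ips]


# Hypothetical usage inside the cluster (service name is an assumption):
# urls = resolve_backends("ollama-headless.open-webui.svc.cluster.local", 11434)
```

Each returned URL could then be queried separately to learn which models that instance holds.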
-
Works fine as-is. You can share a volume across multiple Ollama instances, so that a model uploaded to one is visible to all of them.
-
Does anyone have an idea of how to limit this properly for multiple users?
-
Maybe this should be regarded as a question.
I deployed open-webui and Ollama with the Helm chart in the source code, and I want to make sure that if I run more than one replica of the Ollama server, the replicas are load-balanced well and don't cause any inconsistency.
Clearly, these Ollama servers will be load-balanced through the cluster service. Does this mean that if I upload a model, it will be uploaded to just one of them, chosen arbitrarily? And if a chat request is then routed to a server that doesn't have the model, will I see an error reply? I know there is an "Update all models" button, but I'm not sure it still works through the single ClusterIP.