I deployed open-webui and ollama with the Helm chart from the source code, and I want to make sure that if I run more than one replica of the ollama server, the replicas are load-balanced properly and don't cause any inconsistency.
Clearly, these ollama servers will be load-balanced through the cluster service. Does this mean that if I upload a model, it will be uploaded to just one of them, whichever instance the request happens to reach? And if a chat request is then routed to a server that does not have the model, will I see an error reply? I know there is an "Update all models" button, but I'm not sure it still works through a single ClusterIP.
The Helm chart in the kubernetes directory creates a StatefulSet and a ClusterIP service for ollama. Through a single URL like http://ollama-service.open-webui.svc.cluster.local:11434, open-webui loses track of which instance has which model, so I think this is a mistake. Using a headless service with a StatefulSet is common practice in k8s, and open-webui could fetch all ollama instance IPs through DNS resolution. Did I miss something?
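To illustrate the DNS idea, here is a minimal sketch of how a client could list every ollama pod behind a headless service. It assumes a headless service exists (the name `ollama-headless.open-webui.svc.cluster.local` below is hypothetical, the current chart does not create it) and that each pod listens on port 11434. The function `ollama_endpoints` is made up for this example and is not part of open-webui.

```python
import socket


def ollama_endpoints(hostname, port=11434, resolver=socket.getaddrinfo):
    """Return one base URL per distinct IPv4 address behind `hostname`.

    For a headless service, cluster DNS returns one A record per ready pod,
    so each URL points at a single ollama instance instead of a shared
    ClusterIP. `resolver` is injectable only so the function can be tried
    outside a cluster.
    """
    infos = resolver(hostname, port, family=socket.AF_INET, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, (ip, port)).
    ips = sorted({info[4][0] for info in infos})
    return [f"http://{ip}:{port}" for ip in ips]
```

Inside the cluster, `ollama_endpoints("ollama-headless.open-webui.svc.cluster.local")` would return one URL per pod, and each could then be queried separately to see which models it holds. With the existing ClusterIP service the same call returns only the single virtual IP, which is the limitation described above.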
Maybe this should be regarded as a question.