.Net: How does semantic kernel support load balancing by using multiple Azure OpenAI endpoints? #5774
-
How does semantic kernel support load balancing by using multiple Azure OpenAI endpoints? I would like to route my chat completion requests to different registered endpoints depending on the load. Pl. assist. |
Beta Was this translation helpful? Give feedback.
Answered by
markwallace-microsoft
Apr 4, 2024
Replies: 1 comment
-
This article explains how to use Azure API Management to load balance requests to multiple instances of the Azure OpenAI Service. You will need to customise the OpenAICient that the Semantic Kernel uses to call the Azure OpenAI Service. This sample shows how to do this. |
Beta Was this translation helpful? Give feedback.
0 replies
Answer selected by
sophialagerkranspandey
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
This article explains how to use Azure API Management to load balance requests to multiple instances of the Azure OpenAI Service.
You will need to customise the OpenAICient that the Semantic Kernel uses to call the Azure OpenAI Service. This sample shows how to do this.