Hi @Jeffwan! KServe's position is still LLM inference orchestration and serverless autoscaling, so we are focusing on supporting the OpenAI protocol by integrating optimized LLM serving runtimes like vLLM/TensorRT-LLM/TGI, improving LLM container cold-start time, and enabling autoscaling based on custom metrics such as the number of input/output tokens. We will update our roadmap with a list of the features we have planned.
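To illustrate what "supporting the OpenAI protocol" means in practice, here is a minimal sketch of the request payload an OpenAI-compatible completions endpoint accepts. The host and path are assumptions for illustration, not a confirmed KServe URL scheme:

```python
import json

# Hypothetical endpoint for an InferenceService exposing an
# OpenAI-compatible completions API (host and path are assumptions).
BASE_URL = "http://llm-demo.default.example.com/openai/v1/completions"

def build_completion_request(model: str, prompt: str, max_tokens: int = 64) -> dict:
    """Build a request body following the OpenAI completions protocol."""
    return {
        "model": model,        # served model name
        "prompt": prompt,      # input text
        "max_tokens": max_tokens,  # cap on generated tokens
    }

payload = build_completion_request("llm-demo", "KServe is", max_tokens=32)
# The body would be POSTed to BASE_URL as JSON, e.g. with requests.post.
print(json.dumps(payload))
```

Because the runtime speaks the same protocol as the OpenAI API, existing OpenAI client libraries can be pointed at the served endpoint without code changes.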
Hi community,
I am wondering whether any specific optimizations have been made in KServe to support LLM applications. Is there a feature list?