Load-balancing / auto-scaling for LLM serving on AWS #363

Open
VictorOdede opened this issue Oct 24, 2023 · 3 comments

@VictorOdede
Collaborator

No description provided.

@horahoradev

horahoradev commented Oct 28, 2023

What did you have in mind here? When you refer to autoscaling, do you mean horizontal scalability or parallelism on a single host?

@VictorOdede changed the title from "Load-balancing / auto-scaling for REST API" to "Load-balancing / auto-scaling for LLM serving" on Oct 31, 2023
@VictorOdede
Collaborator Author

@horahoradev I was actually referring to horizontal scaling of LLM instances.

@VictorOdede changed the title from "Load-balancing / auto-scaling for LLM serving" to "Load-balancing / auto-scaling for LLM serving on AWS" on Oct 31, 2023
@lucylililiwang

Hi, can I please work on this issue? Thank you!

@mmirman removed the $100 label on Mar 12, 2024
4 participants