ops: fractional GPU usage for embedding server within kubernetes #1472

Closed
cdxker opened this issue May 16, 2024 · 3 comments
Assignees: cdxker
Labels: high priority, ops

Comments

cdxker (Member) commented May 16, 2024

Description

Currently the embedding server is not run in Kubernetes. This is a problem because we can't easily scale the embedding servers.

Community channels

Matrix is preferred. Reach out on Discord or Matrix for further assistance.

cdxker (Member, Author) commented May 16, 2024

Found this example of fractional GPU usage on AWS EKS: https://github.com/DanielKneipp/aws-eks-share-gpu

@cdxker changed the title from "ops: Fractional GPU usage for embedding server within kubernetes" to "ops: fractional GPU usage for embedding server within kubernetes" on May 16, 2024
@cdxker added the "high priority" and "ops" labels on May 16, 2024
@cdxker cdxker self-assigned this May 16, 2024

cdxker (Member, Author) commented May 20, 2024

The example linked above is out of date and would need updating. Looking into the NVIDIA device plugin's shared GPU access to solve this instead:
https://github.com/NVIDIA/k8s-device-plugin?tab=readme-ov-file#shared-access-to-gpus
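
For context, a rough sketch of what GPU sharing through the NVIDIA device plugin can look like, assuming the time-slicing option from that README section. The thread does not say which sharing mode was used in the end, so treat this as illustrative only. The helper names `render_time_slicing_config` and `advertised_gpus` are hypothetical and not part of any library; only the YAML shape (`version`, `sharing.timeSlicing.resources`, `name`, `replicas`) comes from the plugin's documentation.

```python
def render_time_slicing_config(replicas: int, resource: str = "nvidia.com/gpu") -> str:
    """Render a device-plugin config (as YAML text) that time-slices each GPU.

    Hypothetical helper: it only formats the config shape documented in the
    NVIDIA k8s-device-plugin README and does not talk to a cluster.
    """
    if replicas < 1:
        raise ValueError("replicas must be a positive integer")
    lines = [
        "version: v1",
        "sharing:",
        "  timeSlicing:",
        "    resources:",
        f"    - name: {resource}",
        f"      replicas: {replicas}",
    ]
    return "\n".join(lines) + "\n"


def advertised_gpus(physical_gpus: int, replicas: int) -> int:
    """Each physical GPU is advertised `replicas` times to the scheduler."""
    return physical_gpus * replicas


if __name__ == "__main__":
    # Example: let four embedding-server pods share each physical GPU.
    print(render_time_slicing_config(replicas=4))
    print(advertised_gpus(physical_gpus=1, replicas=4))
```

With a config like this, each embedding-server pod would still request `nvidia.com/gpu: 1`, but several such pods can be scheduled onto the same physical GPU. Note that time-slicing gives no memory or fault isolation between the pods sharing a GPU, so the models need to fit in GPU memory together.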

@cdxker cdxker closed this as completed May 23, 2024

cdxker (Member, Author) commented May 24, 2024

Gave up on attempting this in AWS; it worked right away on Google Cloud Platform.
