
feat: support volta architecture GPUs for the vLLM backend #930

Open
K-Mistele opened this issue Mar 12, 2024 · 0 comments
Feature request

It would be great if OpenLLM supported pre-Ampere-architecture CUDA devices. In my case, I'm looking at the Volta architecture.

The README currently indicates that an Ampere-architecture or newer GPU is required to use the vLLM backend; otherwise you're stuck with the torch backend.

As far as I can tell, this is not a vLLM-specific constraint: vLLM does not require an Ampere-architecture device.
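For reference, the architecture generation can be read from a device's CUDA compute capability (Volta is 7.0, Turing 7.5, Ampere 8.0 and up). Below is a minimal sketch, assuming PyTorch is installed, of how such a check could be performed; this is only an illustration, not how OpenLLM actually enforces the constraint.

```python
import torch

# Query the compute capability of the first visible CUDA device.
# Volta (e.g. Tesla V100) reports (7, 0); Ampere (e.g. A100) reports (8, 0).
major, minor = torch.cuda.get_device_capability(0)
name = torch.cuda.get_device_name(0)

if (major, minor) >= (8, 0):
    print(f"{name}: Ampere or newer (compute capability {major}.{minor})")
elif (major, minor) >= (7, 0):
    print(f"{name}: Volta/Turing (compute capability {major}.{minor})")
else:
    print(f"{name}: compute capability {major}.{minor} predates Volta")
```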

Motivation

I am trying to run OpenLLM on my NVIDIA Tesla V100 (32GB) devices, but I cannot use the vLLM backend, as OpenLLM's vLLM backend does not support the Volta architecture.

Other

I would love to help as best I can, but I can't find any documentation for where this constraint comes from, other than the README. I've gone through vLLM's docs, and they do not indicate that this is a vLLM constraint.
