
GPU not utilized with standard llama-cpp-python #71

Closed
drphero opened this issue Apr 13, 2024 · 1 comment
drphero commented Apr 13, 2024

When loading the llava models, you can see BLAS = 0 in the information printed to the console. This is because llama-cpp-python requires a special install if you want GPU capabilities. I'm not sure whether this affects Linux users, but it does affect Windows users.

To properly install llama-cpp-python for NVIDIA GPUs:

pip install --no-cache-dir llama-cpp-python -C cmake.args="-DLLAMA_CUDA=ON" -vv

This of course requires Visual Studio to be installed with the "Desktop development with C++" workload selected, or the standalone VS C++ Build Tools.

I found --no-cache-dir necessary to get pip to actually rebuild the package instead of reusing a cached wheel, so I'm not sure how this could be done automatically via requirements.txt.

Now when using the llava models, you should see BLAS = 1 in the console.
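As a quick sanity check that the rebuilt wheel is actually using the GPU, you can load any local GGUF model with all layers offloaded and watch the startup log. A minimal sketch, assuming a hypothetical model path (n_gpu_layers and verbose are parameters of the llama-cpp-python Llama constructor):

# Sanity check: confirm llama-cpp-python was built with CUDA support.
from llama_cpp import Llama

llm = Llama(
    model_path="models/llava-v1.5-7b.Q4_K_M.gguf",  # hypothetical path; use any local GGUF file
    n_gpu_layers=-1,  # offload all layers to the GPU
    verbose=True,     # print llama.cpp system info on startup
)
# With a CUDA-enabled build, the startup output should report BLAS = 1
# and show layers being offloaded to the GPU.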

gokayfem (Owner) commented:

We install it with pre-built wheels, which automatically support CUDA on Linux and Windows. Pre-built wheel support was recently added for macOS as well, but I haven't added it to the repo yet.
