Replies: 6 comments 3 replies
-
Small correction: when ollama on bare metal is outputting long answers, the CPU jumps from 3% to 12%. But again, the whole reply is blazing fast: around 500 words takes a second. By comparison, with docker-compose the CPU jumps to 75% and the same reply takes about 20 seconds...
-
OK, you can close this discussion. I was using different models; I thought I was using the same one, but that was not the case. I tried with the same model and got very similar behaviour.
-
Hi, any ideas please?
-
Can you run nvidia-smi?
Are you sure CUDA is supported by your kernel?
Windows or Linux?
Mar 16, 2024 07:48:38, Jynra wrote:
> Hi,
> I tried these lines to detect the GPU, but I get this error:
>
> failed to deploy a stack: ollama Pulling ollama Pulled Container ollama Recreate Container ollama Recreated Container open-webui Recreate Container open-webui Recreated Container ollama Starting Error response from daemon: could not select device driver "nvidia" with capabilities: [[gpu]]
>
> Any ideas please?
-
Running CUDA in Docker requires additional configuration.
I didn't use this guide myself but followed directions from the NVIDIA site, which I can't find again; this seems close, though: https://linuxconfig.org/setting-up-nvidia-cuda-toolkit-in-a-docker-container-on-debian-ubuntu
Give it a try; the usual setup looks roughly like the sketch below. If it's a no-go, try to find the docs on the NVIDIA site.
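(Commands as documented in NVIDIA's container-toolkit install guide — I'm assuming Debian/Ubuntu with NVIDIA's apt repository already configured; adjust for your distro:)

```shell
# Install the container toolkit
sudo apt-get install -y nvidia-container-toolkit

# Register the nvidia runtime with Docker and restart the daemon
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker

# Sanity check: this should print the same table as nvidia-smi on the host
# (any recent CUDA base image tag will do)
docker run --rm --gpus all nvidia/cuda:12.3.1-base-ubuntu22.04 nvidia-smi
```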
Good luck.
Mar 16, 2024 08:52:18, Jynra wrote:
> Yes, I have an RTX 3080 installed and nvidia-smi works. Linux.
> I've just installed nvidia-container-toolkit, but I get the same error.
-
If you can run it from the CLI and nvidia-smi shows that stuff is loaded into the GPU, that means CUDA works for your distribution and that the kernel is supported.
Maybe try to run ollama from your host (I think this site or the ollama one tells you how to do that); if that works, it means the problem is getting CUDA to run with Docker.
In any case, before trying to run CUDA in Docker you have to make sure you can run it on the host first (which would seem to be the case if you can run nvidia-smi).
Maybe the Docker version?
Find out your Docker version and look on the NVIDIA site to see if it's supported.
Not sure what else to tell you.
You can look here: https://linuxconfig.org/setting-up-nvidia-cuda-toolkit-in-a-docker-container-on-debian-ubuntu
And here: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html
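A quick check worth doing (my guess at the most common failure mode, not verified on your setup): see whether Docker actually registered the nvidia runtime after the toolkit install.

```shell
# "nvidia" should appear in this list once the toolkit is registered
docker info | grep -i runtimes

# Toolkit version, to compare against the support matrix on NVIDIA's site
nvidia-ctk --version
```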
Again good luck!
Mar 16, 2024 09:18:19, Jynra wrote:
> Didn't work.
> Do you think it's because I'm on Ubuntu 23.10?
-
I've installed ollama on bare metal, and in the terminal it runs blazingly fast; watching the CPU curve, I notice it's not used at all while ollama is busy answering queries.
I didn't succeed in getting open-webui to communicate with ollama on the host, so I tried running both ollama and open-webui in Docker.
Here is my docker-compose file:
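(The file below is a representative sketch rather than a verbatim copy — image tags, ports, and volume names are placeholders; the part that matters for GPU access is the deploy.resources.reservations block on the ollama service:)

```yaml
services:
  ollama:
    image: ollama/ollama
    volumes:
      - ollama:/root/.ollama
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]

  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    ports:
      - "3000:8080"
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434
    depends_on:
      - ollama

volumes:
  ollama:
```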
The Docker logs state that the GPU is being detected, and if I run nvidia-smi I see that the ollama code is loaded into VRAM.
But when ollama starts answering queries, it's much slower than on bare metal and I see CPU usage shoot to 75% (from an initial 3%). In fact there's almost no difference (if any at all) between running docker-compose with the GPU section and without it.
Running all this on Ubuntu 22.04 LTS.
Docker version 24.0.5, build ced0996
docker-compose version 1.29.2, build unknown
Again, the logs do say (if the GPU section is included) that the GPU is detected, and I verified that it is loaded into the GPU, but the CPU usage and the sluggishness of the output tell a different story.
Maybe I'm doing something wrong?
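(One way to double-check, for anyone hitting the same thing: watch GPU utilization, not just memory, while a query runs — weights sitting in VRAM with GPU-Util near 0% would mean the compute is still happening on the CPU:)

```shell
# Refresh nvidia-smi every second while a query is running.
# Memory-Usage confirms the weights are loaded; GPU-Util shows whether
# the GPU is actually doing the compute (~0% here + high CPU = CPU inference).
watch -n 1 nvidia-smi
```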