Replies: 6 comments 3 replies
-
Small correction: when ollama on bare metal is outputting long answers, the CPU jumps from 3% to 12%. But again, the whole reply is blazing fast: around 500 words takes a second. By comparison, with docker-compose the CPU jumps to 75% and the same reply takes about 20 seconds...
-
OK, you can close this discussion. I was using different models; I thought I was using the same one, but that was not the case. I tried with the same model and got very similar behaviour.
-
Hi, any ideas please?
-
Can you run nvidia-smi?
Are you sure CUDA is supported by your kernel?
Windows or Linux?
Mar 16, 2024 07:48:38, Jynra wrote:
> Hi,
> I tried these lines to detect the GPU, but I get this error:
>
> failed to deploy a stack: ollama Pulling ollama Pulled Container ollama Recreate Container ollama Recreated Container open-webui Recreate Container open-webui Recreated Container ollama Starting Error response from daemon: could not select device driver "nvidia" with capabilities: [[gpu]]
>
> Any ideas please?
-
Running CUDA in Docker requires additional configuration.
I didn't use this guide myself but followed directions from the NVIDIA site, which I can't find again; this seems close, though: https://linuxconfig.org/setting-up-nvidia-cuda-toolkit-in-a-docker-container-on-debian-ubuntu
Give it a try; the usual setup looks roughly like the sketch below. If it's a no-go, try to find the docs on the NVIDIA site.
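(Commands as documented in NVIDIA's container-toolkit install guide — I'm assuming Debian/Ubuntu with NVIDIA's apt repository already configured; adjust for your distro:)

```shell
# Install the container toolkit
sudo apt-get install -y nvidia-container-toolkit

# Register the nvidia runtime with Docker and restart the daemon
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker

# Sanity check: this should print the same table as nvidia-smi on the host
# (any recent CUDA base image tag will do)
docker run --rm --gpus all nvidia/cuda:12.3.1-base-ubuntu22.04 nvidia-smi
```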
Good luck.
Mar 16, 2024 08:52:18, Jynra wrote:
> Yes, I have an RTX 3080 installed and nvidia-smi works. Linux.
> I've just installed nvidia-container-toolkit, but I get the same error.
-
If you can run it from the CLI and nvidia-smi shows that stuff is loaded into the GPU, that means CUDA works for your distribution and that the kernel is supported.
Maybe try to run ollama from your host (I think this site or the ollama one tells you how to do that); if that works, it means the problem is getting CUDA to run with Docker.
In any case, before trying to run CUDA in Docker you have to make sure you can run it on the host first (which would seem to be the case if you can run nvidia-smi).
Maybe the Docker version?
Find out your Docker version and look on the NVIDIA site to see if it's supported.
Not sure what else to tell you.
You can look here: https://linuxconfig.org/setting-up-nvidia-cuda-toolkit-in-a-docker-container-on-debian-ubuntu
And here: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html
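A quick check worth doing (my guess at the most common failure mode, not verified on your setup): see whether Docker actually registered the nvidia runtime after the toolkit install.

```shell
# "nvidia" should appear in this list once the toolkit is registered
docker info | grep -i runtimes

# Toolkit version, to compare against the support matrix on NVIDIA's site
nvidia-ctk --version
```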
Again good luck!
Mar 16, 2024 09:18:19, Jynra wrote:
> Didn't work.
> Do you think it's because I'm on Ubuntu 23.10?
-
I've installed ollama on bare metal, and in the terminal it runs blazingly fast; watching the CPU curve, I notice it's not used at all while ollama is busy answering queries.
I didn't succeed in getting open-webui to communicate with ollama on the host, so I tried running both ollama and open-webui in Docker.
Here is my docker-compose file:
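(The file below is a representative sketch rather than a verbatim copy — image tags, ports, and volume names are placeholders; the part that matters for GPU access is the deploy.resources.reservations block on the ollama service:)

```yaml
services:
  ollama:
    image: ollama/ollama
    volumes:
      - ollama:/root/.ollama
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]

  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    ports:
      - "3000:8080"
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434
    depends_on:
      - ollama

volumes:
  ollama:
```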
The Docker logs state that the GPU is being detected, and if I run nvidia-smi I see that the ollama code is loaded into VRAM.
But when ollama starts answering queries, it's much slower than on bare metal and I see CPU usage shoot to 75% (from an initial 3%). In fact there's almost no difference (if any at all) between running docker-compose with the GPU section and without it.
Running all this on Ubuntu 22.04 LTS.
Docker version 24.0.5, build ced0996
docker-compose version 1.29.2, build unknown
Again, the logs do say (if the GPU section is included) that the GPU is detected, and I verified that it is loaded into the GPU, but the CPU usage and the sluggishness of the output tell a different story.
Maybe I'm doing something wrong?
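(One way to double-check, for anyone hitting the same thing: watch GPU utilization, not just memory, while a query runs — weights sitting in VRAM with GPU-Util near 0% would mean the compute is still happening on the CPU:)

```shell
# Refresh nvidia-smi every second while a query is running.
# Memory-Usage confirms the weights are loaded; GPU-Util shows whether
# the GPU is actually doing the compute (~0% here + high CPU = CPU inference).
watch -n 1 nvidia-smi
```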