Support for Fedora 40 with rocm #3877

Closed
oatmealm opened this issue Apr 24, 2024 · 7 comments · Fixed by #4090
Labels: amd (Issues relating to AMD GPUs and ROCm), feature request (New feature or request)

Comments

@oatmealm

Since Fedora 40 ships ROCm 6 now, it would be useful if it could be picked up. I have these symlinks in place to block the installer from downloading the libraries:

lrwxrwxrwx. 1 root root   28 Apr 24 16:08 libamd_comgr.so.2 -> /usr/lib64/libamd_comgr.so.2
lrwxrwxrwx. 1 root root   27 Apr 24 16:08 libamdhip64.so.6 -> /usr/lib64/libamdhip64.so.6
lrwxrwxrwx. 1 root root   29 Apr 24 16:08 libdrm_amdgpu.so.1 -> /usr/lib64/libdrm_amdgpu.so.1
lrwxrwxrwx. 1 root root   26 Apr 24 16:09 libhipblas.so.2 -> /usr/lib64/libhipblas.so.2
lrwxrwxrwx. 1 root root   32 Apr 24 16:09 libhsa-runtime64.so.1 -> /usr/lib64/libhsa-runtime64.so.1
lrwxrwxrwx. 1 root root   26 Apr 24 16:09 librocblas.so.4 -> /usr/lib64/librocblas.so.4
lrwxrwxrwx. 1 root root   28 Apr 24 16:09 librocsolver.so.0 -> /usr/lib64/librocsolver.so.0
lrwxrwxrwx. 1 root root   28 Apr 24 16:10 librocsparse.so.1 -> /usr/lib64/librocsparse.so.1
lrwxrwxrwx. 1 root root   24 Apr 24 16:10 libtinfo.so.6 -> /usr/lib64/libtinfo.so.6
lrwxrwxrwx. 1 root root   19 Apr 24 16:13 rocblas -> /usr/lib64/rocblas/

Though I do see problems with support for gfx900 which I can't avoid with HSA_OVERRIDE_GFX_VERSION=9.0.0...
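
For readers who want to reproduce a layout like the listing above, here is a minimal shell sketch that symlinks a set of system libraries into another directory. The helper name `link_system_libs` is hypothetical, and the thread does not say which directory the listing was taken from, so the destination in the usage comment is only a placeholder.

```shell
# Hypothetical helper (not part of the Ollama installer): symlink each named
# library from src_dir into dst_dir. Names missing from src_dir are reported
# on stderr and skipped.
link_system_libs() {
    src_dir=$1
    dst_dir=$2
    shift 2
    mkdir -p "$dst_dir" || return 1
    for name in "$@"; do
        if [ -e "$src_dir/$name" ]; then
            # -n avoids descending into an existing symlinked directory
            ln -sfn "$src_dir/$name" "$dst_dir/$name"
        else
            printf 'missing: %s\n' "$src_dir/$name" >&2
        fi
    done
}

# Usage (destination path is a placeholder, adjust to your setup):
#   link_system_libs /usr/lib64 /path/to/ollama/rocm/dir \
#       libamdhip64.so.6 libhipblas.so.2 librocblas.so.4 rocblas
```

The library names in the usage comment are taken from the listing above; whether this layout is enough for GPU detection is not confirmed in this thread.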

oatmealm added the feature request label Apr 24, 2024
dhiltgen self-assigned this Apr 24, 2024
@dhiltgen
Collaborator

We do have code to try to detect a pre-existing ROCm install. From your description, I take it this logic isn't working on your system, correct?
https://github.com/ollama/ollama/blob/main/scripts/install.sh#L169-L175

Where is ROCm actually installed?

dhiltgen added the amd label Apr 24, 2024
@oatmealm
Author

Yes, I've used the installer to figure out where it's looking for things. Fedora 40 includes ROCm now in standard locations as far as I can tell, so the paths the installer checks are not useful. I'm not sure, but I guess it would never look inside /usr/lib64/rocm/.

I'm seeing this:

$ sudo rpm -ql rocm-hip
/usr/lib/.build-id
/usr/lib/.build-id/1e
/usr/lib/.build-id/1e/9f86f22a6c0383de4237a6b043d062c643f195
/usr/lib/.build-id/d3
/usr/lib/.build-id/d3/fc97c90d4450e3a2ae67cf1261b26270b66d2c
/usr/lib/.build-id/f3
/usr/lib/.build-id/f3/db88eecffb15fec1e97d086d3630fb8cf8768f
/usr/lib64/libamdhip64.so.6
/usr/lib64/libamdhip64.so.6.0.32831
/usr/lib64/libhiprtc-builtins.so.6
/usr/lib64/libhiprtc-builtins.so.6.0.32831
/usr/lib64/libhiprtc.so.6
/usr/lib64/libhiprtc.so.6.0.32831
/usr/share/doc/rocm-hip
/usr/share/doc/rocm-hip/README.md
/usr/share/hip
/usr/share/hip/version
/usr/share/licenses/rocm-hip
/usr/share/licenses/rocm-hip/LICENSE.txt
$ sudo rpm -ql rocm-rpm-macros-modules
/usr/lib64/rocm
/usr/lib64/rocm/gfx10
/usr/lib64/rocm/gfx10/bin
/usr/lib64/rocm/gfx10/lib
/usr/lib64/rocm/gfx10/lib/cmake
/usr/lib64/rocm/gfx11
/usr/lib64/rocm/gfx11/bin
/usr/lib64/rocm/gfx11/lib
/usr/lib64/rocm/gfx11/lib/cmake
/usr/lib64/rocm/gfx1100
/usr/lib64/rocm/gfx1100/bin
/usr/lib64/rocm/gfx1100/lib
/usr/lib64/rocm/gfx1100/lib/cmake
/usr/lib64/rocm/gfx1101
/usr/lib64/rocm/gfx1101/bin
/usr/lib64/rocm/gfx1101/lib
/usr/lib64/rocm/gfx1101/lib/cmake
/usr/lib64/rocm/gfx1102
/usr/lib64/rocm/gfx1102/bin
/usr/lib64/rocm/gfx1102/lib
/usr/lib64/rocm/gfx1102/lib/cmake
/usr/lib64/rocm/gfx1103
/usr/lib64/rocm/gfx1103/bin
/usr/lib64/rocm/gfx1103/lib
/usr/lib64/rocm/gfx1103/lib/cmake
/usr/lib64/rocm/gfx8
/usr/lib64/rocm/gfx8/bin
/usr/lib64/rocm/gfx8/lib
/usr/lib64/rocm/gfx8/lib/cmake
/usr/lib64/rocm/gfx9
/usr/lib64/rocm/gfx9/bin
/usr/lib64/rocm/gfx9/lib
/usr/lib64/rocm/gfx9/lib/cmake
/usr/lib64/rocm/gfx906
/usr/lib64/rocm/gfx906/bin
/usr/lib64/rocm/gfx906/lib
/usr/lib64/rocm/gfx906/lib/cmake
/usr/lib64/rocm/gfx908
/usr/lib64/rocm/gfx908/bin
/usr/lib64/rocm/gfx908/lib
/usr/lib64/rocm/gfx908/lib/cmake
/usr/lib64/rocm/gfx90a
/usr/lib64/rocm/gfx90a/bin
/usr/lib64/rocm/gfx90a/lib
/usr/lib64/rocm/gfx90a/lib/cmake
/usr/share/licenses/rocm-rpm-macros-modules
/usr/share/licenses/rocm-rpm-macros-modules/GPL
/usr/share/modulefiles/rocm
/usr/share/modulefiles/rocm/default
/usr/share/modulefiles/rocm/gfx10
/usr/share/modulefiles/rocm/gfx11
/usr/share/modulefiles/rocm/gfx1100
/usr/share/modulefiles/rocm/gfx1101
/usr/share/modulefiles/rocm/gfx1102
/usr/share/modulefiles/rocm/gfx1103
/usr/share/modulefiles/rocm/gfx8
/usr/share/modulefiles/rocm/gfx9
/usr/share/modulefiles/rocm/gfx906
/usr/share/modulefiles/rocm/gfx908
/usr/share/modulefiles/rocm/gfx90a
$ sudo rpm -ql rocm-runtime
/usr/lib/.build-id
/usr/lib/.build-id/62
/usr/lib/.build-id/62/15c729b690f61e15498acb9fa87145530bacdb
/usr/lib64/libhsa-runtime64.so.1
/usr/lib64/libhsa-runtime64.so.1.12.0
/usr/share/doc/rocm-runtime
/usr/share/doc/rocm-runtime/README.md
/usr/share/licenses/rocm-runtime
/usr/share/licenses/rocm-runtime/LICENSE.txt

@rbjorklin

rbjorklin commented Apr 27, 2024

It looks like the install script is looking for libhipblas.so.2, which can be found here:

❯ sudo rpm -ql hipblas
/usr/lib/.build-id
/usr/lib/.build-id/5a
/usr/lib/.build-id/5a/9cf985f0be40040b9036e11a4097d8a34024f0
/usr/lib/.build-id/5a/9cf985f0be40040b9036e11a4097d8a34024f0.1
/usr/lib/.build-id/bd
/usr/lib/.build-id/bd/807e0490bbc890eba4223ba83ae485a4f2a4c7
/usr/lib/.build-id/bd/807e0490bbc890eba4223ba83ae485a4f2a4c7.1
/usr/lib/.build-id/d6
/usr/lib/.build-id/d6/f667173d86eb5bf8cac21667d6a66b657e720a
/usr/lib64/cmake/hipblas
/usr/lib64/libhipblas.so.2
/usr/lib64/libhipblas.so.2.0
/usr/lib64/rocm/gfx10/lib/libhipblas.so.2
/usr/lib64/rocm/gfx10/lib/libhipblas.so.2.0
/usr/lib64/rocm/gfx11/lib/libhipblas.so.2
/usr/lib64/rocm/gfx11/lib/libhipblas.so.2.0
/usr/lib64/rocm/gfx8/lib/libhipblas.so.2
/usr/lib64/rocm/gfx8/lib/libhipblas.so.2.0
/usr/lib64/rocm/gfx9/lib/libhipblas.so.2
/usr/lib64/rocm/gfx9/lib/libhipblas.so.2.0
/usr/share/licenses/hipblas
/usr/share/licenses/hipblas/LICENSE.md

EDIT: For what it's worth this seems to work for me:

docker run -d --device /dev/kfd --device /dev/dri -v /usr/lib64:/opt/lib64:ro -e HIP_PATH=/opt/lib64/rocm -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama:0.1.33-rc5-rocm

@jontyms

jontyms commented Apr 28, 2024

I have the new Fedora 40 ROCm packages installed.
I am trying to set HIP_PATH in a systemd service, and Ollama is still not detecting ROCm.

$ sudo ls /usr/lib64/rocm/
gfx10  gfx11  gfx8  gfx9

Systemd service

[Unit]
Description=Ollama Service
After=network-online.target

[Service]
ExecStart=/usr/bin/ollama serve
User=ollama
Group=ollama
Environment="HIP_PATH=/usr/lib64/rocm"
Environment="OLLAMA_DEBUG=1"
Restart=always
RestartSec=3
[Install]
WantedBy=default.target

Logs

Apr 27 23:43:06 fw13-fedora ollama[64253]: time=2024-04-27T23:43:06.287-04:00 level=WARN source=amd_linux.go:53 msg="ollama recommends running the https://www.amd.com/en/support/linux-drivers: amdgpu version file missing: /sys/module/amdgpu/version stat /sys/module/amdgpu/version: no such file or directory"
Apr 27 23:43:06 fw13-fedora ollama[64253]: time=2024-04-27T23:43:06.287-04:00 level=INFO source=amd_linux.go:88 msg="detected amdgpu versions [gfx1103]"
Apr 27 23:43:06 fw13-fedora ollama[64253]: time=2024-04-27T23:43:06.287-04:00 level=DEBUG source=amd_common.go:16 msg="evaluating potential rocm lib dir /tmp/ollama1572317086/rocm"
Apr 27 23:43:06 fw13-fedora ollama[64253]: time=2024-04-27T23:43:06.287-04:00 level=DEBUG source=amd_common.go:16 msg="evaluating potential rocm lib dir /usr/bin"
Apr 27 23:43:06 fw13-fedora ollama[64253]: time=2024-04-27T23:43:06.291-04:00 level=DEBUG source=amd_common.go:16 msg="evaluating potential rocm lib dir /usr/bin/rocm"
Apr 27 23:43:06 fw13-fedora ollama[64253]: time=2024-04-27T23:43:06.291-04:00 level=DEBUG source=amd_common.go:16 msg="evaluating potential rocm lib dir /usr/share/ollama/lib/rocm"
Apr 27 23:43:06 fw13-fedora ollama[64253]: time=2024-04-27T23:43:06.291-04:00 level=DEBUG source=amd_common.go:16 msg="evaluating potential rocm lib dir /usr/lib64/rocm/lib"
Apr 27 23:43:06 fw13-fedora ollama[64253]: time=2024-04-27T23:43:06.291-04:00 level=DEBUG source=amd_common.go:16 msg="evaluating potential rocm lib dir /"
Apr 27 23:43:06 fw13-fedora ollama[64253]: time=2024-04-27T23:43:06.292-04:00 level=DEBUG source=amd_common.go:16 msg="evaluating potential rocm lib dir /opt/rocm/lib"
Apr 27 23:43:06 fw13-fedora ollama[64253]: time=2024-04-27T23:43:06.292-04:00 level=WARN source=amd_linux.go:367 msg="amdgpu detected, but no compatible rocm library found.  Either install rocm v6, or follow manual install instructions at https://github.com/ollama/ollama/blob/main/docs/linux.md#manual-install"
Apr 27 23:43:06 fw13-fedora ollama[64253]: time=2024-04-27T23:43:06.292-04:00 level=WARN source=amd_linux.go:99 msg="unable to verify rocm library, will use cpu: no suitable rocm found, falling back to CPU"
Apr 27 23:43:06 fw13-fedora ollama[64253]: time=2024-04-27T23:43:06.292-04:00 level=INFO source=routes.go:1164 msg="no GPU detected"

Thank you for your help.
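
When reading debug output like the above, it can help to reduce it to just the directories that were evaluated. The following is a small sketch, assuming the `msg="evaluating potential rocm lib dir ..."` format shown in these logs; `extract_rocm_search_dirs` is a hypothetical helper name, and the unit name in the usage comment is an assumption.

```shell
# Hypothetical helper: read log lines on stdin and print only the directories
# from 'evaluating potential rocm lib dir <dir>' messages, one per line.
extract_rocm_search_dirs() {
    sed -n 's/.*msg="evaluating potential rocm lib dir \([^"]*\)".*/\1/p'
}

# Usage (assumes the service unit is named "ollama"):
#   journalctl -u ollama | extract_rocm_search_dirs
```

Applied to the log excerpt above, this would list the candidates in search order, which makes it easier to see that /usr/lib64/rocm/lib was tried rather than /usr/lib64/rocm itself.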

@dhiltgen
Collaborator

@jontyms try setting LD_LIBRARY_PATH to include the directory. HIP_PATH typically points to the root of the ROCm install, with ./bin/ and ./lib/ subdirectories. We can probably refine our search algorithm to try both with and without appending lib to support this better, but you should be able to get it working with LD_LIBRARY_PATH.

@oatmealm thanks for those paths. Where are the rocblas files installed? (e.g., rocblas/library/TensileLibrary_lazy_gfx*.dat)
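
The "with and without lib" search described above could look roughly like the sketch below. This is not the code in scripts/install.sh or amd_common.go; `find_lib_dir` is a hypothetical helper that only illustrates the idea of checking each candidate root both as given and with `/lib` appended.

```shell
# Hypothetical helper: print the first directory containing lib_name.
# Each candidate is tried as-is and with "/lib" appended. Empty candidates
# (for example an unset HIP_PATH) are skipped.
find_lib_dir() {
    lib_name=$1
    shift
    for dir in "$@"; do
        [ -n "$dir" ] || continue
        for candidate in "$dir" "$dir/lib"; do
            if [ -e "$candidate/$lib_name" ]; then
                printf '%s\n' "$candidate"
                return 0
            fi
        done
    done
    return 1
}

# Usage, with candidate paths taken from this thread:
#   find_lib_dir libhipblas.so.2 "$HIP_PATH" /opt/rocm /usr/lib64
```

The directory it prints is the kind of path one might then add to LD_LIBRARY_PATH as suggested above; whether that is sufficient on a given Fedora 40 system is not established here.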

@rbjorklin

rbjorklin commented Apr 28, 2024

❯ rpm -ql rocblas | grep -E ".*_lazy_.*\.dat"
/usr/lib64/rocblas/library/TensileLibrary_lazy_gfx1030.dat
/usr/lib64/rocblas/library/TensileLibrary_lazy_gfx1100.dat
/usr/lib64/rocblas/library/TensileLibrary_lazy_gfx1101.dat
/usr/lib64/rocblas/library/TensileLibrary_lazy_gfx1102.dat
/usr/lib64/rocm/gfx10/lib/rocblas/library/TensileLibrary_lazy_gfx1030.dat
/usr/lib64/rocm/gfx11/lib/rocblas/library/TensileLibrary_lazy_gfx1100.dat
/usr/lib64/rocm/gfx11/lib/rocblas/library/TensileLibrary_lazy_gfx1101.dat
/usr/lib64/rocm/gfx11/lib/rocblas/library/TensileLibrary_lazy_gfx1102.dat
/usr/lib64/rocm/gfx8/lib/rocblas/library/TensileLibrary_lazy_gfx803.dat
/usr/lib64/rocm/gfx9/lib/rocblas/library/TensileLibrary_lazy_gfx900.dat
/usr/lib64/rocm/gfx9/lib/rocblas/library/TensileLibrary_lazy_gfx906.dat
/usr/lib64/rocm/gfx9/lib/rocblas/library/TensileLibrary_lazy_gfx908.dat
/usr/lib64/rocm/gfx9/lib/rocblas/library/TensileLibrary_lazy_gfx90a.dat
/usr/lib64/rocm/gfx9/lib/rocblas/library/TensileLibrary_lazy_gfx940.dat
/usr/lib64/rocm/gfx9/lib/rocblas/library/TensileLibrary_lazy_gfx941.dat
/usr/lib64/rocm/gfx9/lib/rocblas/library/TensileLibrary_lazy_gfx942.dat

❯ sha256sum /usr/lib64/rocblas/library/TensileLibrary_lazy_gfx1030.dat
ba0a3d8a1000d4d6b24b515d2f5ccc16646dc45df72872213167ab5aeb23e775  /usr/lib64/rocblas/library/TensileLibrary_lazy_gfx1030.dat

❯ sha256sum /usr/lib64/rocm/gfx10/lib/rocblas/library/TensileLibrary_lazy_gfx1030.dat
ba0a3d8a1000d4d6b24b515d2f5ccc16646dc45df72872213167ab5aeb23e775  /usr/lib64/rocm/gfx10/lib/rocblas/library/TensileLibrary_lazy_gfx1030.dat
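
To see at a glance which gfx targets a given rocblas library directory covers, the file names in the listing can be reduced to target names. A small sketch, assuming the `TensileLibrary_lazy_<target>.dat` naming shown above; `list_tensile_targets` is a hypothetical helper.

```shell
# Hypothetical helper: print the gfx target of every
# TensileLibrary_lazy_<target>.dat file in the given directory.
list_tensile_targets() {
    for f in "$1"/TensileLibrary_lazy_gfx*.dat; do
        # An unmatched glob stays literal, so skip paths that do not exist
        [ -e "$f" ] || continue
        name=${f##*/TensileLibrary_lazy_}
        printf '%s\n' "${name%.dat}"
    done
}

# Usage, with a path from the listing above:
#   list_tensile_targets /usr/lib64/rocblas/library
```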

@vorburger

> Since F40 has rocm6 now
> Fedora 40 includes ROCm now in standard locations as far as I can tell.
> It looks like the install script is looking for libhipblas.so.2 which can be found here:

It can be found there AFTER one learns to do sudo dnf install hipblas "rocm-*"...

See https://github.com/vorburger/vorburger.ch-Notes/blob/develop/ml/ollama1.md for context.

#4527 was raised to suggest adding this to the documentation.

vorburger added a commit to vorburger/vorburger.ch-Notes that referenced this issue May 19, 2024