
Crash in hipDriverGetVersion on windows #4094

Closed
ggjk616 opened this issue May 2, 2024 · 7 comments
Assignees
Labels
amd (Issues relating to AMD GPUs and ROCm), bug (Something isn't working), windows

Comments

@ggjk616

ggjk616 commented May 2, 2024

What is the issue?

Can you help me? In the documentation, I noticed the following statement: "You can set OLLAMA_LLM_LIBRARY to any of the available LLM libraries to bypass autodetection, so for example, if you have a CUDA card, but want to force the CPU LLM library with AVX2 vector support, use:
OLLAMA_LLM_LIBRARY="cpu_avx2" ollama serve"
But after setting OLLAMA_LLM_LIBRARY="cpu_avx2", the program still detects my GPU when loading the model, resulting in an error: Error: Post "https://127.0.0.1:11434/api/chat": read tcp 127.0.0.1:56915->127.0.0.1:11434: wsarecv: An existing connection was forcibly closed by the remote host.

OS

Windows

GPU

AMD

CPU

Intel

Ollama version

No response

@ggjk616 ggjk616 added the bug Something isn't working label May 2, 2024
@dhiltgen
Collaborator

dhiltgen commented May 2, 2024

Ollama uses a client-server architecture. My suspicion is that you're setting this flag in the client, not the server.

On Windows, you typically set this as a system environment variable. See https://github.com/ollama/ollama/blob/main/docs/faq.md#setting-environment-variables-on-windows

That said, it shouldn't crash when running on the GPU. Can you share the server log for your crash scenario? https://github.com/ollama/ollama/blob/main/docs/troubleshooting.md#how-to-troubleshoot-issues
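To illustrate the client/server point above, here's a minimal sketch (not Ollama's actual startup code) of why the variable has to be visible to the server process: the server reads its own environment at startup, so a value exported only in the client's shell never reaches it.

```go
package main

import (
	"fmt"
	"os"
)

func main() {
	// The server-side process checks its own environment at startup.
	// Setting OLLAMA_LLM_LIBRARY in a different shell (the client) has no effect here.
	if lib := os.Getenv("OLLAMA_LLM_LIBRARY"); lib != "" {
		fmt.Println("forcing LLM library:", lib) // e.g. cpu_avx2
	} else {
		fmt.Println("no override set; autodetecting GPU libraries")
	}
}
```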

@dhiltgen dhiltgen self-assigned this May 2, 2024
@ggjk616
Author

ggjk616 commented May 3, 2024


In fact, the server crash was caused by my old GPU (Radeon 520). I noticed that when I did not disable it, as soon as I used the command ollama run models_name, the server would crash with the error: Error: Post "https://127.0.0.1:11434/api/chat": read tcp 127.0.0.1:56915->127.0.0.1:11434: wsarecv: An existing connection was forcibly closed by the remote host.
This is the log information when the problem occurs:

time=2024-05-01T10:06:51.721+02:00 level=INFO source=gpu.go:268 msg="Searching for GPU management library nvml.dll"
time=2024-05-01T10:06:51.734+02:00 level=INFO source=gpu.go:314 msg="Discovered GPU libraries:[]"
time=2024-05-01T10:06:51.734+02:00 level=INFO source=cpu_common.go:11 msg="CPU has AVX2"
Exception 0xc0000005 0x8 0x256f1f30780 0x256f1f30780
PC= 0x256f1f30780
signal arrived during external code execution

When I manually disable it (Radeon 520), Ollama can successfully load the model and run it on the CPU. The relevant log information is as follows:

time=2024-05-01T10:07:59.188+02:00 level=INFO source=gpu.go:268 msg="Searching for GPU management library nvml.dll"
time=2024-05-01T10:07:59.198+02:00 level=INFO source=gpu.go:314 msg="Discovered GPU libraries:[]"
time=2024-05-01T10:07:59.198+02:00 level=INFO source=cpu_common.go:11 msg="CPU has AVX2"
time=2024-05-01T10:07:59.235+02:00 level=INFO source=amd_windows.go:40 msg="AMD Driver:324007"
time=2024-05-01T10:07:59.235+02:00 level=INFO source=amd_hip_windows.go:97 msg="AMD ROCm reports no devices found"

So after reading the documentation, I thought that by setting the environment variable OLLAMA_LLM_LIBRARY I could force the CPU AVX2 library and skip GPU detection, and I would then be able to load the model smoothly, but it did not seem to work. Could you please tell me where I went wrong?

@dhiltgen
Collaborator

dhiltgen commented May 4, 2024

You didn't mention what version you were running. On versions before 0.1.33 we didn't handle spaces and quotes on the OLLAMA_LLM_LIBRARY variable properly, so it's possible you included quotes and that could explain why it isn't working. Please give 0.1.33 a try, and if it still isn't respecting your OLLAMA_LLM_LIBRARY setting, share your server.log so we can see more details. It may also be helpful to set OLLAMA_DEBUG=1.
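To illustrate the kind of problem stray quotes can cause, here's a rough, hypothetical sketch (not the actual Ollama change, and cleanedEnv is an invented helper) of reading the variable defensively, trimming whitespace and quote characters so a value copied as "cpu_avx2" with the quotes included still matches:

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// cleanedEnv reads an environment variable and strips surrounding whitespace
// and quote characters that are sometimes copied along with a documented
// value such as "cpu_avx2".
func cleanedEnv(key string) string {
	v := strings.TrimSpace(os.Getenv(key))
	return strings.Trim(v, `"'`)
}

func main() {
	fmt.Println(cleanedEnv("OLLAMA_LLM_LIBRARY"))
}
```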

@ggjk616
Author

ggjk616 commented May 5, 2024

I have updated to the latest version; here is the relevant log information:
serves.log.txt

@dhiltgen
Collaborator

dhiltgen commented May 5, 2024

Thanks for the server log, @ggjk616.

It looks like we're crashing while trying to call an AMD driver API to check the version via hipDriverGetVersion:

Exception 0xc0000005 0x8 0x142a70baf60 0x142a70baf60
PC=0x142a70baf60
signal arrived during external code execution

runtime.cgocall(0x923c20, 0x20d66c0)
	runtime/cgocall.go:157 +0x3e fp=0xc00051d140 sp=0xc00051d108 pc=0x8b92fe
syscall.SyscallN(0x7fffe5b8dd00?, {0xc00019f710?, 0x1?, 0x7fffe5880000?})
	runtime/syscall_windows.go:544 +0x107 fp=0xc00051d1b8 sp=0xc00051d140 pc=0x91f147
github.com/ollama/ollama/gpu.(*HipLib).AMDDriverVersion(0xc0000feab0)
	github.com/ollama/ollama/gpu/amd_hip_windows.go:82 +0x69 fp=0xc00051d228 sp=0xc00051d1b8 pc=0xd4f489
github.com/ollama/ollama/gpu.AMDGetGPUInfo()

Can you share some more information about your system? Which version of Windows? Home/Pro? Is your AMD Driver up to date? Do other GPU apps work correctly on your GPU?
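For context on where this faults, here is a rough sketch (the DLL name is an assumption; this is not the actual gpu/amd_hip_windows.go code) of calling hipDriverGetVersion from Go through a dynamically loaded HIP runtime. The crash happens inside that external call, which is why a bad driver can take the whole server down:

```go
//go:build windows

package main

import (
	"fmt"
	"syscall"
	"unsafe"
)

func main() {
	// Load the HIP runtime shipped with the AMD driver (DLL name assumed here).
	hip := syscall.NewLazyDLL("amdhip64.dll")
	getVersion := hip.NewProc("hipDriverGetVersion")

	var version int32
	// This call jumps into driver code. With a broken or outdated driver it can
	// fault with 0xc0000005, which the Go runtime reports as "signal arrived
	// during external code execution", as in the log above.
	ret, _, _ := getVersion.Call(uintptr(unsafe.Pointer(&version)))
	if ret != 0 { // hipSuccess == 0
		fmt.Println("hipDriverGetVersion failed with code", ret)
		return
	}
	fmt.Println("HIP driver version:", version)
}
```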

@dhiltgen dhiltgen added the amd Issues relating to AMD GPUs and ROCm label May 5, 2024
@dhiltgen dhiltgen changed the title from "Is there a problem with the document?" to "Crash in hipDriverGetVersion on windows" May 5, 2024
@ggjk616
Author

ggjk616 commented May 6, 2024

After reading your reply, I checked my drivers and indeed the issue was caused by the drivers not being the latest version. After updating the drivers, I was able to load the model without any issues. Thank you very much for your help! However, I'm still a bit curious as to why setting the OLLAMA_LLM_LIBRARY environment variable didn't work. You can now close this issue, and once again, thank you for your assistance!

@dhiltgen
Collaborator

dhiltgen commented May 6, 2024

The next release should have better parsing of quotes and spaces around our env vars.

@dhiltgen dhiltgen closed this as completed May 6, 2024