"Cannot dlopen some GPU libraries." does not List what Libraries Failed to Load #66987
Comments
Hi @stellarpower, you need to install the GPU driver manually. After that, you need to set LD_LIBRARY_PATH to the path where the NVIDIA libraries are installed. You may refer to this comment. Please refer to #63362 for more details. Thanks
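One way to sanity-check the LD_LIBRARY_PATH advice above is to list each directory on the path and see which shared libraries it actually contains. The following is only a minimal sketch: the helper name `inspect_library_path` and the default glob pattern `libcu*.so*` are assumptions made for illustration, not anything TensorFlow itself uses.

```python
import glob
import os


def inspect_library_path(pattern="libcu*.so*", env=None):
    """Return (directory, exists, matching_files) for each LD_LIBRARY_PATH entry.

    `pattern` is an illustrative guess at how CUDA-related libraries are named;
    adjust it to whatever libraries you expect to find.
    """
    env = os.environ if env is None else env
    # Empty entries (e.g. from a trailing ':') are skipped.
    entries = [p for p in env.get("LD_LIBRARY_PATH", "").split(":") if p]
    report = []
    for directory in entries:
        exists = os.path.isdir(directory)
        matches = sorted(glob.glob(os.path.join(directory, pattern))) if exists else []
        report.append((directory, exists, [os.path.basename(m) for m in matches]))
    return report


if __name__ == "__main__":
    for directory, exists, matches in inspect_library_path():
        status = "ok" if exists else "MISSING"
        print(f"{directory} [{status}]: {', '.join(matches) or 'no matching libraries'}")
```

Note that this only inspects LD_LIBRARY_PATH; the dynamic loader also consults other locations (for example the ldconfig cache), so an empty result here does not by itself prove that a library cannot be found.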
Thanks; I had done all this previously. But I have opened this as an issue irrespective of my own setup, because I believe it should be possible to get more information from the error message. Without knowing which libraries failed to open, just re-installing and following the instructions again isn't a particularly efficient way to debug what happened.
Yes, I agree. I am running into the same issue now. This is particularly frustrating because of the arcane versioning of CUDA-related toolsets (i.e. the Python packages vs. CUDA vs. the dependency matrix in the documentation). For example:
and the relevant links in the docs only seem to link out to Docker-related material, like https://www.tensorflow.org/install/source, while the vast majority of information on the internet is out of date. Is there any clearer guidance on how to get TensorFlow working on GPUs when your CUDA install is non-standard, i.e., not installed from the Ubuntu package repo (which is infeasible in many academic settings)? Thanks very much in advance. EDIT: I was able to resolve this by using the
@wjn0 thanks - I resolved the underlying problem in the end, and from memory I thought I had increased the log verbosity as high as it would go, but maybe I had not. If I encounter library problems again I'll give it a go. Cheers!
Hi @wjn0 , AFAIK the setting Thanks for the info. |
This issue is stale because it has been open for 7 days with no activity. It will be closed if no further activity occurs. Thank you. |
This issue was closed because it has been inactive for 7 days since being marked as stale. Please reopen if you'd like to work on this further. |
Issue type
Feature Request
Have you reproduced the bug with TensorFlow Nightly?
Yes
Source
binary
TensorFlow version
tf-nightly 2.17.0.dev20240504
Custom code
No
OS platform and distribution
Ubuntu Jammy
Mobile device
No response
Python version
3.12
Bazel version
No response
GCC/compiler version
No response
CUDA/cuDNN version
No response
GPU model and memory
No response
Current behavior?
I have installed tf-nightly from the official PyPI package, like so:
When I load TensorFlow, it isn't seeing my GPU, and I get the message
I normally use the conda-forge packages, in part precisely because they should handle some of these things for me so I don't have to worry. But I saw pip installing a large number of CUDA libraries during the process, so I'd expect most of what I need to be there.
The function MaybeTryDlopenGPULibraries() is responsible for attempting to load the required libraries at runtime. However, it doesn't tell me which libraries it tried to find, what search path it was using, etc. As I've followed the steps in the guide at that URL, it's not the most helpful diagnostic message without further information.
Whilst the message is short, and therefore doesn't clutter the screen (which may be good in many situations), it doesn't help much in working out what the problem is. On modern complex systems, library search paths can be pretty finicky to work out. So, if not as the default behaviour, I'd at least like a flag or environment variable I can set to see which library loads were attempted, which succeeded (and from what path), and which were missing, in addition to other debugging output. If the short form of the message is kept as the default, it would be good for it to say how to set this option, so that I can go round again and get more verbose output.
Thanks
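To illustrate the kind of per-library report requested above, here is a minimal sketch that tries to dlopen a list of libraries from Python via ctypes and records which ones load and which fail, along with the loader's error text. It does not reproduce what MaybeTryDlopenGPULibraries() does internally; the helper name `try_dlopen` is hypothetical, and the library names in the usage section are illustrative assumptions rather than the list TensorFlow actually checks.

```python
import ctypes


def try_dlopen(names):
    """Try to load each shared library by name.

    Returns a list of (name, loaded, error_message) tuples, where
    error_message is the dynamic loader's message on failure, else "".
    """
    results = []
    for name in names:
        try:
            ctypes.CDLL(name)  # raises OSError if the library cannot be loaded
        except OSError as exc:
            results.append((name, False, str(exc)))
        else:
            results.append((name, True, ""))
    return results


if __name__ == "__main__":
    # Assumed example names only; substitute the libraries and versions
    # that your TensorFlow build is documented to require.
    candidates = ["libcudart.so.12", "libcudnn.so.9"]
    for name, loaded, error in try_dlopen(candidates):
        print(f"{name}: {'loaded' if loaded else 'FAILED: ' + error}")
```

This sketch reports success or failure per library but not the resolved path of a library that did load; getting that would need an additional step that is not shown here.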
Standalone code to reproduce the issue
Relevant log output
No response