Internal: Blas GEMM launch failed when running classifier for URLs #1406

Open
loginName1 opened this issue Apr 22, 2024 · 0 comments

System information:

  • OS: Windows 11
  • GPU: NVIDIA GeForce RTX 3080 Ti (12 GB)
  • TensorFlow: tensorflow-gpu v1.14.0
  • CUDA: v10.0 (I also have versions 12.4 and 9.0 installed, which don't have the required .dll file; all three are in my PATH)
  • Python: 3.7 (for the purposes of protobuf)
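
Since several CUDA versions are on PATH at once, it may help to see which directories actually provide a cuBLAS DLL and in what order they would be searched. The following is a minimal sketch using only the Python standard library. The helper name `find_dlls_on_path` and the pattern `cublas64_*.dll` are assumptions for illustration (for CUDA 10.0 the file is assumed to be `cublas64_100.dll`); adjust the pattern to the file TensorFlow reports as missing.

```python
import fnmatch
import os


def find_dlls_on_path(pattern="cublas64_*.dll", path=None):
    """Return (directory, filename) pairs for files matching `pattern`,
    listed in the order the directories appear on PATH."""
    if path is None:
        path = os.environ.get("PATH", "")
    matches = []
    for directory in path.split(os.pathsep):
        # Skip empty entries and entries that are not existing directories.
        if not directory or not os.path.isdir(directory):
            continue
        try:
            names = sorted(os.listdir(directory))
        except OSError:
            # Unreadable directory: ignore it and keep scanning.
            continue
        for name in names:
            # Compare case-insensitively, since Windows file names are.
            if fnmatch.fnmatch(name.lower(), pattern.lower()):
                matches.append((directory, name))
    return matches


if __name__ == "__main__":
    for directory, name in find_dlls_on_path():
        print(name, "->", directory)
```

The first directory printed for a given file name is the one a PATH-based lookup would normally pick up, so this shows whether the CUDA 10.0 directory comes before the 9.0 and 12.4 ones.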

The error:

2024-04-22 15:35:14.641625: E tensorflow/stream_executor/cuda/cuda_blas.cc:428] failed to run cuBLAS routine: CUBLAS_STATUS_EXECUTION_FAILED
ERROR:tensorflow:Error recorded from training_loop: 2 root error(s) found.
  (0) Internal: Blas GEMM launch failed : a.shape=(4096, 2), b.shape=(2, 768), m=4096, n=768, k=2
         [[node bert/embeddings/MatMul (defined at D:\Faks\UM-Mag 23-25\Drugi semester\JT\google-bert\modeling.py:487) ]]
         [[loss/Mean/_4861]]
  (1) Internal: Blas GEMM launch failed : a.shape=(4096, 2), b.shape=(2, 768), m=4096, n=768, k=2
         [[node bert/embeddings/MatMul (defined at D:\Faks\UM-Mag 23-25\Drugi semester\JT\google-bert\modeling.py:487) ]]
0 successful operations.
0 derived errors ignored.

What I've tried:
I checked nvidia-smi.exe to see whether anything else was running on the GPU during training, and got the following result:

+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 552.22                 Driver Version: 552.22         CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                     TCC/WDDM  | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3080 Ti   WDDM  |   00000000:0A:00.0 Off |                  N/A |
|  0%   36C    P8             24W /  350W |    1598MiB /  12288MiB |      3%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A      9072    C+G   C:\Windows\explorer.exe                     N/A      |
|    0   N/A  N/A     10652    C+G   ...al\Discord\app-1.0.9042\Discord.exe      N/A      |
|    0   N/A  N/A     10688    C+G   ...ekyb3d8bbwe\PhoneExperienceHost.exe      N/A      |
|    0   N/A  N/A     10820    C+G   ...nt.CBS_cw5n1h2txyewy\SearchHost.exe      N/A      |
|    0   N/A  N/A     10844    C+G   ...2txyewy\StartMenuExperienceHost.exe      N/A      |
|    0   N/A  N/A     14572    C+G   ...\cef\cef.win7x64\steamwebhelper.exe      N/A      |
|    0   N/A  N/A     14744    C+G   ...GeForce Experience\NVIDIA Share.exe      N/A      |
|    0   N/A  N/A     14824    C+G   ...1.0_x64__8wekyb3d8bbwe\Video.UI.exe      N/A      |
|    0   N/A  N/A     15072    C+G   ...t.LockApp_cw5n1h2txyewy\LockApp.exe      N/A      |
|    0   N/A  N/A     17164    C+G   ...CBS_cw5n1h2txyewy\TextInputHost.exe      N/A      |
|    0   N/A  N/A     18100    C+G   ...les\Microsoft OneDrive\OneDrive.exe      N/A      |
|    0   N/A  N/A     18936    C+G   ...5n1h2txyewy\ShellExperienceHost.exe      N/A      |
|    0   N/A  N/A     19308    C+G   ...air\Corsair iCUE5 Software\iCUE.exe      N/A      |
|    0   N/A  N/A     19848    C+G   ...crosoft\Edge\Application\msedge.exe      N/A      |
|    0   N/A  N/A     23840    C+G   ..._x64__kzf8qxf38zg5c\Skype\Skype.exe      N/A      |
|    0   N/A  N/A     24040    C+G   ...lf\0.248.120.19\OverwolfBrowser.exe      N/A      |
|    0   N/A  N/A     25276    C+G   ...on\123.0.2420.97\msedgewebview2.exe      N/A      |
|    0   N/A  N/A     25604    C+G   ...ejd91yc\AdobeNotificationClient.exe      N/A      |
|    0   N/A  N/A     25636    C+G   ...509_x64__8wekyb3d8bbwe\ms-teams.exe      N/A      |
|    0   N/A  N/A     25920    C+G   ...ktop\EA Desktop\EACefSubProcess.exe      N/A      |
|    0   N/A  N/A     25972    C+G   ...\GOG Galaxy\GalaxyClient Helper.exe      N/A      |
|    0   N/A  N/A     26180    C+G   ...EA Desktop\EA Desktop\EADesktop.exe      N/A      |
|    0   N/A  N/A     28536    C+G   ...cks-services\BlueStacksServices.exe      N/A      |
|    0   N/A  N/A     29332    C+G   ...aam7r\AcrobatNotificationClient.exe      N/A      |
|    0   N/A  N/A     31120    C+G   ...on\123.0.2420.97\msedgewebview2.exe      N/A      |
|    0   N/A  N/A     31468    C+G   ..._x64__kzf8qxf38zg5c\Skype\Skype.exe      N/A      |
|    0   N/A  N/A     32340    C+G   ...m Files\Mozilla Firefox\firefox.exe      N/A      |
|    0   N/A  N/A     33104    C+G   ...on\HEX\Creative Cloud UI Helper.exe      N/A      |
|    0   N/A  N/A     33208    C+G   ...on\123.0.2420.97\msedgewebview2.exe      N/A      |
|    0   N/A  N/A     35396    C+G   ...wekyb3d8bbwe\XboxGameBarWidgets.exe      N/A      |
|    0   N/A  N/A     39264    C+G   ...m Files\Mozilla Firefox\firefox.exe      N/A      |
+-----------------------------------------------------------------------------------------+

I then searched for similar issues and found this one. Following that answer's instructions, I added the memory growth lines to modeling.py after the imports, but I received the same error.

I didn't find any other possible solutions, and I'm unsure what I'm doing wrong. Did I add the memory growth lines in the wrong file, or did I go about solving the issue completely wrong? Any help is appreciated.
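
For reference, a hedged sketch of one place where memory growth options are commonly passed when training goes through an Estimator: the `session_config` argument of the run configuration, rather than a standalone session created at import time. The helper name `make_run_config` is hypothetical, and it is an assumption that the training script builds a `RunConfig` that can accept this argument. The TensorFlow import is kept inside the function so the snippet can be loaded without TensorFlow installed.

```python
def make_run_config(model_dir):
    """Build an Estimator RunConfig whose sessions allocate GPU memory on demand.

    Hypothetical helper for illustration; assumes TensorFlow 1.13 or later
    (the issue uses tensorflow-gpu 1.14.0).
    """
    import tensorflow as tf

    # Ask TensorFlow to grow GPU memory usage as needed instead of
    # reserving almost all of it up front.
    session_config = tf.compat.v1.ConfigProto()
    session_config.gpu_options.allow_growth = True

    # The Estimator creates its sessions from this config.
    return tf.estimator.RunConfig(
        model_dir=model_dir,
        session_config=session_config,
    )
```

Whether this resolves the GEMM failure is not established here; it only shows where the option would normally be attached.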

I am running the Python 3.7 kernel in a virtual environment, and the data I am feeding the model is properly formatted. I am using the BERT base uncased model downloaded from this repository.
