Segmentation fault (core dumped) when training LibriMix #2460

wenyuc55 · 2024-03-15T08:23:10Z

Describe the bug

I ran the separation recipe in the LibriMix folder.
However, training crashes with a segmentation fault (core dumped).

I have already set the path in the YAML file.
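
One way to double-check that no required path is still unset is to scan the hparams file for `!PLACEHOLDER` markers, which is the tag named in the `ValueError` in the log below. This is a minimal sketch, assuming each placeholder sits on its own `key: !PLACEHOLDER` line; `find_placeholders` is a hypothetical helper, not part of SpeechBrain or HyperPyYAML.

```python
import re

# Matches "key: !PLACEHOLDER" lines (optionally indented, optionally followed
# by a comment). Lines that start with "#" are not matched.
_PLACEHOLDER_RE = re.compile(r"^\s*([A-Za-z_][\w-]*)\s*:\s*!PLACEHOLDER\b")


def find_placeholders(yaml_text):
    """Return the keys whose value is still the !PLACEHOLDER tag."""
    keys = []
    for line in yaml_text.splitlines():
        match = _PLACEHOLDER_RE.match(line)
        if match:
            keys.append(match.group(1))
    return keys


def find_placeholders_in_file(path):
    """Convenience wrapper: read a YAML file and report unset keys."""
    with open(path, encoding="utf-8") as fin:
        return find_placeholders(fin.read())
```

Calling `find_placeholders_in_file("hparams/sepformer-libri2mix.yaml")` on the exact file passed to `train.py` should return an empty list once every path is filled in. SpeechBrain recipes also generally accept overrides on the command line, for example `python train.py hparams/sepformer-libri2mix.yaml --data_folder <path>`, which avoids editing the file at all.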

Expected behaviour

I expect training to run through all the configured epochs and finish.

To Reproduce

python train.py hparams/sepformer-libri2mix.yaml

Environment Details

I am running on Ubuntu 20.04.
CUDA 11.8 is installed.

Relevant Log Output

(003-env) wenyu@wenyu:~/桌面/sb_mix/speechbrain-develop/recipes/LibriMix/separation$ gdb python
GNU gdb (Ubuntu 9.2-0ubuntu1~20.04.1) 9.2
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from python...
(No debugging symbols found in python)
(gdb) run train.py /home/wenyu/桌面/speechbrain/speechbrain-develop/recipes/LibriMix/separation/hparams/sepformer-libri2mix.yaml
Starting program: /home/wenyu/桌面/sb_mix/003-env/bin/python train.py /home/wenyu/桌面/speechbrain/speechbrain-develop/recipes/LibriMix/separation/hparams/sepformer-libri2mix.yaml
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[Detaching after fork from child process 245899]
[New Thread 0x7fff40bac700 (LWP 245918)]
[New Thread 0x7fff3e3ab700 (LWP 245919)]
[New Thread 0x7fff3dbaa700 (LWP 245920)]
[New Thread 0x7fff393a9700 (LWP 245921)]
[New Thread 0x7fff36ba8700 (LWP 245922)]
[New Thread 0x7fff343a7700 (LWP 245923)]
[New Thread 0x7fff33ba6700 (LWP 245924)]
[New Thread 0x7fff2f3a5700 (LWP 245925)]
[New Thread 0x7fff2eba4700 (LWP 245926)]
[New Thread 0x7fff2c3a3700 (LWP 245927)]
[New Thread 0x7fff27ba2700 (LWP 245928)]
[New Thread 0x7fff253a1700 (LWP 245929)]
[New Thread 0x7fff22ba0700 (LWP 245930)]
[New Thread 0x7fff2039f700 (LWP 245931)]
[New Thread 0x7fff1db9e700 (LWP 245932)]
[New Thread 0x7fff1b39d700 (LWP 245933)]
[New Thread 0x7fff1ab9c700 (LWP 245934)]
[New Thread 0x7fff1639b700 (LWP 245935)]
[New Thread 0x7fff13b9a700 (LWP 245936)]
[Thread 0x7fff13b9a700 (LWP 245936) exited]
[Thread 0x7fff1639b700 (LWP 245935) exited]
[Thread 0x7fff1ab9c700 (LWP 245934) exited]
[Thread 0x7fff1b39d700 (LWP 245933) exited]
[Thread 0x7fff1db9e700 (LWP 245932) exited]
[Thread 0x7fff2039f700 (LWP 245931) exited]
[Thread 0x7fff22ba0700 (LWP 245930) exited]
[Thread 0x7fff253a1700 (LWP 245929) exited]
[Thread 0x7fff2c3a3700 (LWP 245927) exited]
[Thread 0x7fff33ba6700 (LWP 245924) exited]
[Thread 0x7fff36ba8700 (LWP 245922) exited]
[Thread 0x7fff393a9700 (LWP 245921) exited]
[Thread 0x7fff3dbaa700 (LWP 245920) exited]
[Thread 0x7fff3e3ab700 (LWP 245919) exited]
[Thread 0x7fff40bac700 (LWP 245918) exited]
[Thread 0x7fff27ba2700 (LWP 245928) exited]
[Thread 0x7fff2eba4700 (LWP 245926) exited]
[Thread 0x7fff2f3a5700 (LWP 245925) exited]
[Thread 0x7fff343a7700 (LWP 245923) exited]
[Detaching after fork from child process 245937]
/home/wenyu/桌面/sb_mix/003-env/lib/python3.8/site-packages/speechbrain/dataio/dataio.py:26: UserWarning: torchaudio._backend.set_audio_backend has been deprecated. With dispatcher enabled, this function is no-op. You can remove the function call.
  torchaudio.set_audio_backend(torchaudio_backend)
/home/wenyu/桌面/sb_mix/003-env/lib/python3.8/site-packages/speechbrain/nnet/loss/stoi_loss.py:13: UserWarning: torchaudio._backend.set_audio_backend has been deprecated. With dispatcher enabled, this function is no-op. You can remove the function call.
  torchaudio.set_audio_backend(torchaudio_backend)
Traceback (most recent call last):
  File "train.py", line 557, in <module>
    hparams = load_hyperpyyaml(fin, overrides)
  File "/home/wenyu/桌面/sb_mix/003-env/lib/python3.8/site-packages/hyperpyyaml/core.py", line 157, in load_hyperpyyaml
    yaml_stream = resolve_references(yaml_stream, overrides, overrides_must_match)
  File "/home/wenyu/桌面/sb_mix/003-env/lib/python3.8/site-packages/hyperpyyaml/core.py", line 325, in resolve_references
    _walk_tree_and_resolve("root", preview, preview, file_path)
  File "/home/wenyu/桌面/sb_mix/003-env/lib/python3.8/site-packages/hyperpyyaml/core.py", line 372, in _walk_tree_and_resolve
    current_node[k] = _walk_tree_and_resolve(sub_key, sub_node, tree, file_path)
  File "/home/wenyu/桌面/sb_mix/003-env/lib/python3.8/site-packages/hyperpyyaml/core.py", line 380, in _walk_tree_and_resolve
    raise ValueError(f"'{key}' is a !PLACEHOLDER and must be replaced.")
ValueError: 'data_folder' is a !PLACEHOLDER and must be replaced.
--Type <RET> for more, q to quit, c to continue without paging--c
[Inferior 1 (process 245895) exited with code 01]
(gdb) 
[1]+  Stopped                 gdb python
(003-env) wenyu@wenyu:~/桌面/sb_mix/speechbrain-develop/recipes/LibriMix/separation$ pip list
Package                  Version     
------------------------ ------------
certifi                  2024.2.2    
cffi                     1.16.0      
charset-normalizer       3.3.2       
ffmpeg                   1.4         
filelock                 3.9.0       
fsspec                   2024.2.0    
future                   1.0.0       
huggingface-hub          0.21.4      
HyperPyYAML              1.2.2       
idna                     3.6         
Jinja2                   3.1.2       
joblib                   1.3.2       
MarkupSafe               2.1.3       
mir-eval                 0.6         
mpmath                   1.3.0       
networkx                 3.1         
numpy                    1.24.4      
nvidia-cublas-cu11       11.11.3.6   
nvidia-cuda-cupti-cu11   11.8.87     
nvidia-cuda-nvrtc-cu11   11.8.89     
nvidia-cuda-runtime-cu11 11.8.89     
nvidia-cudnn-cu11        8.7.0.84    
nvidia-cufft-cu11        10.9.0.58   
nvidia-curand-cu11       10.3.0.86   
nvidia-cusolver-cu11     11.4.1.48   
nvidia-cusparse-cu11     11.7.5.86   
nvidia-nccl-cu11         2.19.3      
nvidia-nvtx-cu11         11.8.86     
packaging                23.2        
pillow                   10.2.0      
pip                      20.0.2      
pkg-resources            0.0.0       
pycparser                2.21        
pyloudnorm               0.1.1       
PySoundFile              0.9.0.post1 
PyYAML                   6.0.1       
requests                 2.31.0      
ruamel.yaml              0.18.6      
ruamel.yaml.clib         0.2.8       
scipy                    1.10.1      
sentencepiece            0.2.0       
setuptools               44.0.0      
six                      1.16.0      
sox                      1.4.1       
speechbrain              0.5.16      
sympy                    1.12        
torch                    2.2.1+cu118 
torchaudio               2.2.1+cu118 
torchvision              0.17.1+cu118
tqdm                     4.66.2      
triton                   2.2.0       
typing-extensions        4.8.0       
urllib3                  2.2.1       
wheel                    0.42.0      
(003-env) wenyu@wenyu:~/桌面/sb_mix/speechbrain-develop/recipes/LibriMix/separation$ python train.py hparams/sepformer-libri2mix.yaml
speechbrain.core - Beginning experiment!
speechbrain.core - Experiment folder: results/sepformer-libri2mix/1234
speechbrain.pretrained.fetching - Fetch encoder.ckpt: Using existing file/symlink in results/sepformer-libri2mix/1234/save/encoder.ckpt.
speechbrain.pretrained.fetching - Fetch decoder.ckpt: Using existing file/symlink in results/sepformer-libri2mix/1234/save/decoder.ckpt.
speechbrain.pretrained.fetching - Fetch masknet.ckpt: Using existing file/symlink in results/sepformer-libri2mix/1234/save/masknet.ckpt.
speechbrain.utils.parameter_transfer - Loading pretrained files for: encoder, decoder, masknet
speechbrain.core - Info: auto_mix_prec arg from hparam file is used
speechbrain.core - Info: noprogressbar arg from hparam file is used
speechbrain.core - Info: ckpt_interval_minutes arg from hparam file is used
speechbrain.core - 25.7M trainable parameters in Separation
speechbrain.utils.checkpoints - Would load a checkpoint here, but none found yet.
speechbrain.utils.epoch_loop - Going into epoch 1
  0%|                                                                             | 0/6 [00:00<?, ?it/s]Segmentation fault (core dumped)
(003-env) wenyu@wenyu:~/桌面/sb_mix/speechbrain-develop/recipes/LibriMix/separation$
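
The gdb session above exited at the `!PLACEHOLDER` error (it was pointed at a YAML file under a different directory), so it never reached the crash and does not show where the segmentation fault occurs. A lower-effort way to get a Python-level traceback at the moment of the crash is the standard-library `faulthandler` module. The sketch below is only an illustration: `run_with_faulthandler` is a hypothetical helper that launches a script in a child process with `-X faulthandler` and captures its output.

```python
import subprocess
import sys


def run_with_faulthandler(script_args, timeout=None):
    """Run `python -X faulthandler <script_args...>` in a child process.

    Returns (returncode, stdout, stderr). If the child dies from a
    segmentation fault, faulthandler writes the Python traceback to stderr
    first; on Linux the return code is then typically -11 (SIGSEGV).
    """
    cmd = [sys.executable, "-X", "faulthandler", *script_args]
    proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    return proc.returncode, proc.stdout, proc.stderr
```

For this report the call would be `run_with_faulthandler(["train.py", "hparams/sepformer-libri2mix.yaml"])`, which is equivalent to running `python -X faulthandler train.py hparams/sepformer-libri2mix.yaml` directly in the shell. The last frames printed to stderr indicate which Python call was active when the native code crashed.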

Additional Context

Any help someone could provide would be much appreciated!

wenyuc55 added the bug label on Mar 15, 2024
