Describe the bug
I am trying to run tools/nemo_forced_aligner/align.py on a very small toy dataset but get the following error:
return torch._C._nn.pad(input, pad, mode, value)
RuntimeError: [enforce fail at alloc_cpu.cpp:114] data. DefaultCPUAllocator: not enough memory: you tried to allocate 109086779069136 bytes.
The same code works on a relatively large dataset made up of short audio clips, whereas the audio files in the current dataset have lengths of
[14, 2, 20, , 2, 16, 19, 2, 14, 2, 7] minutes (batch_size=2).
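To put the number in the error message into perspective, here is a minimal sketch that converts the requested allocation into binary units. The helper name human_bytes is hypothetical, and the element count assumes 4-byte float32 values, which I have not confirmed for this tensor.

```python
def human_bytes(num_bytes: int) -> str:
    """Format a byte count using binary (1024-based) units."""
    units = ["B", "KiB", "MiB", "GiB", "TiB", "PiB"]
    value = float(num_bytes)
    for unit in units:
        # Stop at the first unit where the value fits, or at the largest unit.
        if value < 1024 or unit == units[-1]:
            return f"{value:.2f} {unit}"
        value /= 1024


# Size taken verbatim from the RuntimeError in this report.
requested = 109086779069136

print(human_bytes(requested))
# Assumption: float32 elements, 4 bytes each.
print(requested // 4, "elements if the tensor is float32")
```

The request is on the order of tens of tebibytes, far beyond what the padding of a few minutes of audio should need, so the size passed to the pad call looks wrong rather than merely large.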
Environment overview
Environment location: Local
Method of NeMo install: described in the NeMo docs
Environment details
If NVIDIA docker image is used you don't need to specify these.
Otherwise, please provide:
OS version: Windows 10 Pro (22H2) | OS build - 19045.4291
PyTorch version: torch 2.2.1+cu121
Python version: 3.10.12
Additional context
GPU model: GeForce RTX 3090
CUDA: 12.2
I have modified some torch and pytorch_lightning source code to make NeMo work on Windows. Those changes mostly concern the multiprocessing strategy and other parallelization details, which should have nothing to do with this error: I managed to run the aligner over a big dataset of many short audio clips (up to 15 seconds each).
Steps/Code to reproduce bug
python tools/nemo_forced_aligner/align.py
Params:
model_path=speech_to_text_ctc_bpe__checkpoint.nemo
manifest_filepath=metadata_small.json
output_dir=save_dir
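Since the failure only shows up with long recordings, it may help to list which manifest entries exceed a given length before running the aligner. The sketch below assumes the manifest is in the usual NeMo JSON-lines layout with an audio_filepath key and an optional duration key in seconds; the function name long_entries and the 60-second threshold are hypothetical choices for illustration.

```python
import json


def long_entries(manifest_path, max_seconds=60.0):
    """Return (audio_filepath, duration) pairs whose duration exceeds max_seconds.

    Assumes one JSON object per line. Entries without a "duration"
    field are skipped, because their length cannot be read from the manifest.
    """
    flagged = []
    with open(manifest_path, encoding="utf-8") as manifest:
        for line in manifest:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            entry = json.loads(line)
            duration = entry.get("duration")
            if duration is not None and float(duration) > max_seconds:
                flagged.append((entry.get("audio_filepath"), float(duration)))
    return flagged
```

Calling long_entries("metadata_small.json") on the manifest above would list the multi-minute files, which could then be run one at a time to see whether a single long file is enough to trigger the error.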
Expected behavior
The alignments are stored in save_dir as follows:
ass - directory with corresponding tokens & words folders containing the .ass files
ctm - directory with corresponding tokens, segments & words folders containing the .ctm files