GPU Limit Error when using stt_multilingual_fastconformer_hybrid_large_pc model for long audio file (55 minutes) #9071

tempops · 2024-04-30T17:53:27Z

Describe the bug

I used the Long-audio-transcription-Citrinet.ipynb notebook to transcribe a long audio file. The default Citrinet model runs fine, but because of its high WER I wanted to try one of the more recent models. I swapped in the FastConformer model and got this error during computation:

Error:
OutOfMemoryError: CUDA out of memory. Tried to allocate 30.85 GiB. GPU 0 has a total capacity of 14.75 GiB of which 9.75 GiB is free. Process 167515 has 4.99 GiB memory in use. Of the allocated memory 3.18 GiB is allocated by PyTorch, and 1.68 GiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management
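
As a quick sanity check on the numbers in that message, here is a minimal sketch that parses the sizes out of the error text and compares them. The helper name `parse_gib` is made up for illustration and is not part of PyTorch or NeMo. It suggests that the single requested allocation is larger than the whole GPU, so the `expandable_segments` hint (which targets fragmentation) is unlikely to help on its own.

```python
import re

# Relevant part of the error message quoted above.
ERROR = (
    "CUDA out of memory. Tried to allocate 30.85 GiB. "
    "GPU 0 has a total capacity of 14.75 GiB of which 9.75 GiB is free."
)


def parse_gib(message):
    """Hypothetical helper: return (requested, total, free) in GiB."""
    requested = float(re.search(r"Tried to allocate ([\d.]+) GiB", message).group(1))
    total = float(re.search(r"total capacity of ([\d.]+) GiB", message).group(1))
    free = float(re.search(r"of which ([\d.]+) GiB is free", message).group(1))
    return requested, total, free


requested, total, free = parse_gib(ERROR)

# Would this one allocation fit even if nothing else were on the GPU?
fits_on_empty_gpu = requested <= total

print(f"requested={requested} GiB, total={total} GiB, free={free} GiB")
print("fits on an empty GPU:", fits_on_empty_gpu)
```

Since the request does not fit even on an empty card, the options are a shorter input, a different model, or a GPU with more memory.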

Environment overview (please complete the following information)

Followed the instructions in the Google Colab notebook: Long-audio-transcription-Citrinet.ipynb

Environment details

Google Colab notebook: Long-audio-transcription-Citrinet.ipynb

I'd like to know whether the model I am using is the issue and, if so, which of the newer models I can use to transcribe longer audio files (1 hour or more).

nithinraok · 2024-05-08T17:56:29Z

55 minutes is a lot of audio for FastConformer models with full attention.
Could you try with this model: https://huggingface.co/nvidia/parakeet-tdt_ctc-1.1b
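
To illustrate why length matters with full attention, here is a rough back-of-the-envelope sketch. It only counts the self-attention score matrix for one layer, which grows with the square of the number of encoder frames. The 80 ms frame step, the 8 attention heads, and fp32 scores are assumptions for illustration, and the function name is hypothetical; the result is not meant to reproduce the exact figure in the error above.

```python
def attention_scores_gib(duration_s, frame_s=0.08, heads=8, bytes_per_value=4):
    """Rough size (GiB) of one layer's attention score tensor for one utterance.

    Assumptions (not taken from the thread): one encoder frame every
    `frame_s` seconds, `heads` attention heads, fp32 scores.
    """
    frames = round(duration_s / frame_s)
    # Full attention keeps a frames x frames score matrix per head.
    return heads * frames * frames * bytes_per_value / 2**30


for minutes in (1, 10, 55):
    gib = attention_scores_gib(minutes * 60)
    print(f"{minutes:>2} min -> {gib:.2f} GiB per layer (rough estimate)")
```

Under these assumptions, doubling the audio length quadruples this tensor, which is why a recording that is fine at a few minutes can exceed GPU memory at 55 minutes.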

tempops · 2024-05-15T14:33:05Z

Thank you, will try this out.
By the way, according to the documentation on Hugging Face, which license do this and the other Parakeet models (RNNT) come under? It says CC BY 4.0; does this permit commercial use?

nithinraok · 2024-05-15T15:24:21Z

Yes, cc-by-4.0 grants commercial usage.

tempops added the "bug" label (Something isn't working) on Apr 30, 2024
