GPU Limit Error when using stt_multilingual_fastconformer_hybrid_large_pc model for long audio file (55 minutes) #9071

Open
tempops opened this issue Apr 30, 2024 · 3 comments
Labels
bug Something isn't working

Comments


tempops commented Apr 30, 2024

Describe the bug

I used the Long-audio-transcription-Citrinet.ipynb notebook to transcribe a long audio file. The default Citrinet model runs without problems, but because of its high WER I wanted to try one of the more recent models, so I swapped it for the FastConformer model (stt_multilingual_fastconformer_hybrid_large_pc) and got this error during computation:

Error:
OutOfMemoryError: CUDA out of memory. Tried to allocate 30.85 GiB. GPU 0 has a total capacity of 14.75 GiB of which 9.75 GiB is free. Process 167515 has 4.99 GiB memory in use. Of the allocated memory 3.18 GiB is allocated by PyTorch, and 1.68 GiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management
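
The figures in the message suggest that the fragmentation hint (PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True) is unlikely to help in this case, because the single requested allocation is larger than the whole GPU. A minimal sketch of that check, using only the numbers quoted in the error; the helper name `fits_on_gpu` is hypothetical and exists only for illustration:

```python
# Figures copied from the error message above, all in GiB.
requested_gib = 30.85   # "Tried to allocate 30.85 GiB"
total_gib = 14.75       # "total capacity of 14.75 GiB"
free_gib = 9.75         # "of which 9.75 GiB is free"


def fits_on_gpu(requested: float, capacity: float) -> bool:
    """Hypothetical helper: could one allocation of this size fit at all?"""
    return requested <= capacity


# The request does not fit in the free memory, nor in the card as a whole,
# so no allocator setting can satisfy it.
print(fits_on_gpu(requested_gib, free_gib))    # False
print(fits_on_gpu(requested_gib, total_gib))   # False
print(round(requested_gib / total_gib, 2))     # 2.09
```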

Environment overview (please complete the following information)

Followed the instructions in the Google Colab notebook: Long-audio-transcription-Citrinet.ipynb

Environment details

Google Colab notebook: Long-audio-transcription-Citrinet.ipynb

I would like to know whether the model I am using is the problem and, if so, which of the newer models I can use to transcribe longer audio files (1 hour or more).

tempops added the bug label Apr 30, 2024
@nithinraok
Collaborator

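55 minutes of audio is too long for FastConformer models with full attention, because the memory needed for full self-attention grows quadratically with the length of the audio.
Could you try this model instead: https://huggingface.co/nvidia/parakeet-tdt_ctc-1.1b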
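
To illustrate why full attention becomes a problem at this length, here is a rough back-of-the-envelope sketch. It only counts the attention score matrix of one layer (one value per pair of encoder frames, per head). The encoder frame rate of 12.5 frames per second (80 ms per frame), the 8 attention heads, and the 4-byte float32 values are assumptions made for illustration, not figures confirmed in this thread, and the function name `attention_scores_gib` is hypothetical. The estimate is not meant to reproduce the exact 30.85 GiB from the error, only the order of magnitude and the quadratic growth.

```python
def attention_scores_gib(
    audio_minutes: float,
    encoder_frames_per_second: float = 12.5,  # assumption: 80 ms per encoder frame
    num_heads: int = 8,                       # assumption
    bytes_per_value: int = 4,                 # assumption: float32
) -> float:
    """Rough size in GiB of one layer's full attention score matrix."""
    frames = round(audio_minutes * 60 * encoder_frames_per_second)
    # Full attention stores one score for every (query frame, key frame) pair.
    total_bytes = frames * frames * num_heads * bytes_per_value
    return total_bytes / 2**30


for minutes in (5, 15, 55):
    print(f"{minutes:>2} min -> {attention_scores_gib(minutes):.2f} GiB per layer")
```

Under these assumptions, a 55-minute file needs about 100 times the attention memory of a 5.5-minute file, since a tenfold increase in length is squared. This is why a model that handles short clips comfortably can exceed a 14.75 GiB GPU on an hour-long recording.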

@tempops
Author

tempops commented May 15, 2024

Thank you, will try this out.
By the way, which license do this model and the other Parakeet models (RNNT) come under? The Hugging Face documentation says CC BY 4.0. Does that allow commercial use?

@nithinraok
Collaborator

Yes, CC-BY-4.0 permits commercial use, provided attribution is given.
