Documenting audio length limitations for OpenAI Whisper API. #680
Comments
gpt polite, short version:
I'm not sure what's wrong with Buzz's transcription code, so I'll just paste the relevant part of mine here. I'm not a programmer; I got this from ChatGPT Plus.
For an educational video, a 3:30 clip needs about 1.7 MB as .m4a (smaller than .mp3), so in theory my script could process a video of about 49 minutes, which is not too bad. I did see that someone has a script that cuts the audio into parts and submits them individually. I'll borrow code from that script later if I have time.
Thanks for the report here. I fixed this issue months ago (#652), but I haven't had time to release a new version recently. I'll try to do so this coming week and update this thread. Thanks!
Hey, someone on a website showed how to split the file. I'm a naive programmer using ChatGPT, so I could copy that approach; indeed, I was still working on it yesterday. Do you think you could incorporate it? I don't want to reinvent the wheel. That author's YouTube video on this: the lab on Colab: page on GitHub:
Essentially:
#@markdown ### Length of the segment to split (in milliseconds):
# Load the MP3 audio file
sound = AudioSegment.from_file(f'{filename}.mp3', format='mp3')
sound_track = []
# Split the audio file into multiple files
for i, chunk in enumerate(sound[::segment_length]):
He uses Jupyter/Colab, so the code is in blocks. Code: Then join the SRT files (and the txt files, which is easy for txt). Code: This would make Buzz much more useful!
I badly need this feature right now. If you are interested, please let me know and tell me which part of the code deals with the processing, and I could try to implement it there. PS: I am not a programmer. Thanks.
PS: I tried to do the splitting with .m4a using pydub and ffmpeg, but I failed, so I'll try again soon with .mp3.
Hi, OK, I'll wait for your reply at the weekend.
Hi, for those who want a temporary solution, it helps to split and merge the video, one by one, which would be useful while we wait for the update. Buzz is good in that it can handle many files at once (though not recursively, and there is no need for that, since people are unlikely to do it: recursively processing media files would be very CPU-demanding). Thanks.
When using the OpenAI Whisper API option to transcribe a long audio file (1 hour), Buzz will report
According to the OpenAI community, the Whisper API has a file size limit of 25 MB, while Buzz converts input files to PCM (16000 Hz sample rate), which means that audio longer than around 800 seconds will result in an error (I tried 850 seconds and the error occurred).
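The "around 800 seconds" figure can be checked with a quick back-of-the-envelope calculation. The sketch below assumes 16-bit mono samples and counts 25 MB as 25 × 1024 × 1024 bytes; neither detail is stated above, and the small WAV header is ignored, so treat the result as an estimate rather than the exact cutoff.

```python
# Assumptions (not confirmed in this thread): 16-bit samples, mono audio,
# and a 25 MB limit counted as 25 * 1024 * 1024 bytes.
SAMPLE_RATE_HZ = 16_000
BYTES_PER_SAMPLE = 2
CHANNELS = 1
API_LIMIT_BYTES = 25 * 1024 * 1024


def pcm_bytes_per_second():
    """Raw PCM data rate under the assumptions above."""
    return SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS


def pcm_size_bytes(duration_seconds):
    """Approximate size of the converted audio, ignoring the file header."""
    return duration_seconds * pcm_bytes_per_second()


def max_pcm_seconds(limit_bytes=API_LIMIT_BYTES):
    """Longest audio that stays under the size limit once converted to PCM."""
    return limit_bytes / pcm_bytes_per_second()


if __name__ == "__main__":
    print(f"data rate: {pcm_bytes_per_second()} bytes/s")
    print(f"max duration: {max_pcm_seconds():.1f} s")
    print(f"850 s fits: {pcm_size_bytes(850) <= API_LIMIT_BYTES}")
```

Under these assumptions the limit works out to roughly 819 seconds, which is consistent with an 850-second file failing.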
I recommend:
or,