Real-time transcription on Raspberry Pi 4 #166
-
Fantastic work!! On my RPi 4 with Raspberry Pi OS Lite (bullseye) installed, I had to run
or the compiler would complain about a missing "SDL.h" header file. Now it works like a charm.
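The exact command did not survive in the post above. A later reply names the libsdl2-dev package, so the fix was presumably an install of that package. The sketch below assumes a Debian-based system where the header lives under /usr/include/SDL2; `has_sdl_header` is a hypothetical helper, not part of whisper.cpp.

```shell
# Hypothetical helper: succeed if SDL2/SDL.h exists under the given include
# root (default: /usr/include, where Debian's libsdl2-dev places it).
has_sdl_header() {
    [ -f "${1:-/usr/include}/SDL2/SDL.h" ]
}

if has_sdl_header; then
    echo "SDL2 headers found"
else
    # Assumed fix, based on the package named in the replies.
    echo "SDL2 headers missing; try: sudo apt install libsdl2-dev"
fi
```

Running this before the build shows whether the missing-header error is likely to appear.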
-
Thanks, this worked pretty well on the Pi for me! (twitter clip) I also had to install libsdl2-dev, as @eternitybt mentioned. For the mic, I used a ReSpeaker Mic Array v2.
-
Made a video of the install here: https://youtu.be/caaKhWcfcCY
-
Hi, which version of the Pi 4 are you using? Is there a minimum memory requirement? I'm getting an 'Illegal instruction (core dumped)' error when I try this on a 1GB Pi 4.
-
Can this be done on a Raspberry Pi 3B+? I want to use it for speech recognition.
-
Hello, I followed the build instructions on a Pi 4 Model B and am receiving this error when running make: "fatal error: immintrin.h: No such file or directory".
-
I typed the following in the terminal:
pi@raspberrypi:~ $ uname -a
Linux raspberrypi 6.1.29-v8+ #1652 SMP PREEMPT Wed May 24 14:46:55 BST 2023 aarch64
I also typed:
pi@raspberrypi:~ $ cat /etc/os-release
PRETTY_NAME="Debian GNU/Linux 11 (bullseye)"
I used the Raspberry Pi Imager v1.7.4 with Pi OS 64-bit, Debian Bullseye Desktop.
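For context on the error above: immintrin.h is the x86 intrinsics header, so a request for it on an aarch64 system suggests the build took an x86 code path. The sketch below maps `uname -m` output to the SIMD header a C/C++ build would normally expect. `simd_header_for` is a hypothetical helper for illustration and is not part of whisper.cpp.

```shell
# Hypothetical helper: map a `uname -m` machine string to the SIMD
# intrinsics header a C/C++ build would normally include on that CPU.
simd_header_for() {
    case "$1" in
        x86_64|amd64|i?86)           echo "immintrin.h" ;;  # x86 (SSE/AVX)
        aarch64|arm64|armv7l|armv8l) echo "arm_neon.h" ;;   # ARM NEON
        *)                           echo "unknown" ;;
    esac
}

# Report the expected header for the machine this runs on.
simd_header_for "$(uname -m)"
```

On the system shown above this would report arm_neon.h, which is one way to confirm that the immintrin.h include is coming from a misdetected architecture rather than from a missing package.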
-
I am running on a Raspberry Pi 4B and I can record through ffmpeg, but stream produces no output:
ffmpeg -f pulse -i alsa_input.usb-C-Media_Electronics_Inc._USB_PnP_Sound_Device-00.analog-mono -ar 16000 -ac 1 recording.wav
root@a0f34bc2c254:/whisper-cpp/whisper.cpp# ./stream -m ./models/ggml-tiny.bin -t 6 --step 0 --length 30000 -vth 0.6
whisper_model_load: adding 1608 extra tokens
whisper_model_load: model size = 73.54 MB
whisper_init_state: kv cross size = 8.79 MB
main: processing 0 samples (step = 0.0 sec / len = 30.0 sec / keep = 0.0 sec), 6 threads, lang = en, task = transcribe, timestamps = 1 ...
[Start speaking]
-
Did you try the default example from above?
./stream -m models/ggml-tiny.en.bin --step 4000 --length 8000 -c 0 -t 4 -ac 512
I wasn't able to get your command to work either:
./stream -m ./models/ggml-tiny.bin -t 6 --step 0 --length 30000 -vth 0.6
Try taking the default example and adding -vth 0.6 to the end for the voice activity detector (VAD), as below. It worked well for me.
./stream -m models/ggml-tiny.en.bin --step 4000 --length 8000 -c 0 -t 4 -ac 512 -vth 0.6
The line below also works with 6 threads, which surprised me, because I thought the Raspberry Pi 4 could only go up to 4 threads since it has 4 cores. In the task manager, CPU usage would sometimes climb to near 100% with either -t 4 or -t 6.
./stream -m models/ggml-tiny.en.bin --step 4000 --length 8000 -c 0 -t 6 -ac 512 -vth 0.6
I turned --step down to 0, as below, and it worked once and then stopped working.
./stream -m models/ggml-tiny.en.bin --step 0 --length 8000 -c 0 -t 6 -ac 512 -vth 0.6
From what I've seen, raising --step to 2000 works better, and 4000 better still.
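On the thread-count question: asking for more threads than there are cores is allowed, but the extra threads then have to share cores. A minimal sketch that caps -t at the core count reported by `nproc` follows. It assumes oversubscription brings no benefit here, which was not measured in this thread, and `cap_threads` is a hypothetical helper.

```shell
# Hypothetical helper: print the smaller of the requested thread count
# and the number of available cores.
cap_threads() {
    requested=$1
    cores=$2
    if [ "$requested" -gt "$cores" ]; then
        echo "$cores"
    else
        echo "$requested"
    fi
}

threads=$(cap_threads 6 "$(nproc)")
# Print the command instead of running it, so the snippet is safe to paste.
echo "./stream -m models/ggml-tiny.en.bin --step 4000 --length 8000 -c 0 -t $threads -ac 512 -vth 0.6"
```

On a four-core Pi 4 this prints the command with -t 4.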
-
The Raspberry Pi 4 is a bit slow, but some development boards equipped with the RK3588 chip have a 6 TOPS NPU. We should consider supporting these chips, as they could potentially enable "real" real-time transcription. @ggerganov
-
Well, it takes several tens of seconds to process a 3-second WAV file...
-
Whisper is working on the Raspberry Pi 5, up to the small model. Video here: https://youtu.be/W39teHesXus
-
I managed to get Output from
Output from
For example, if I use I tried
But then no transcription, even using the sample jfk.wav.
-
Hi! Great job, really! One question: is there any example, tutorial, or guide on how to implement the same thing using whisper.cpp inside a Python script? Thank you.
-
I'm getting an error when compiling using make -j stream on a Raspberry Pi 5 running Pi OS Bookworm 12.2.0 (uname -a
Following the build instructions, I get an error from make, which results in no ./stream being built.
Any ideas? Thanks for any help!
-
Can you show how to fix this build error? Thanks!
-
Hi @ggerganov, I was trying to run the
-
To some extent, it is possible to run Whisper in real-time mode on an embedded device such as the Raspberry Pi.
Below are a few examples and build instructions.
Real-time with 4 seconds step
whisper-raspberry-2.mp4
Real-time with 7.5 seconds step
whisper-raspberry-3.mp4
Build instructions
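The original command block under this heading did not survive. The sketch below is pieced together only from commands quoted in the replies (`make -j stream` and the default `./stream` invocation), so treat it as a reconstruction and not as the author's exact instructions. It assumes it is run from the root of a whisper.cpp checkout with the tiny.en model already present in models/. The `run` wrapper and `DRY_RUN` variable are hypothetical conveniences.

```shell
# Dry-run by default: each step is printed, not executed.
# Set DRY_RUN=0 to actually run the steps.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical wrapper: print the command in dry-run mode, else execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

build_and_run() {
    # Build the stream example (command quoted in the replies).
    run make -j stream
    # Default invocation quoted in the replies.
    run ./stream -m models/ggml-tiny.en.bin --step 4000 --length 8000 -c 0 -t 4 -ac 512
}

build_and_run
```

The dry-run default is there so the steps can be reviewed before anything is built or the microphone is opened.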
More information
To speed up processing, the encoder's context is reduced from the original 1500 down to 512 (using the -ac 512 flag). This makes it possible to run the above examples on a Raspberry Pi 4 Model B (2018) on 3 CPU threads using the tiny.en Whisper model. The attached microphone is from a USB camera, so the audio quality is not great.
A more detailed discussion can be found in this issue: #7
Explanation of what the -ac flag does: #137
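To get a feel for what the reduced context means in terms of audio, the sketch below converts an -ac value into milliseconds. It assumes the full context of 1500 corresponds to Whisper's 30-second input window, i.e. 20 ms per position. `ctx_to_ms` is a hypothetical helper, not a whisper.cpp function.

```shell
# Assumption: the full encoder context of 1500 positions covers
# Whisper's 30-second input window.
FULL_CTX=1500
FULL_MS=30000

# Hypothetical helper: milliseconds of audio covered by a given -ac value.
ctx_to_ms() {
    echo $(( $1 * FULL_MS / FULL_CTX ))
}

ctx_to_ms 512   # prints 10240
```

Under that assumption, -ac 512 covers roughly 10 seconds of audio, which is comfortably more than the 8000 ms window (--length 8000) used in the example command quoted in the replies.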