Silero VAD #51

Trevor-Z · 2023-10-03T00:55:30Z

First of all, thanks for this project, it's very easy to set up and run locally.

When I transcribe with this webui, the large-v2 model skips the first three sentences in a file I tested, just like what happens over here with the Silero VAD turned off: https://huggingface.co/spaces/aadnk/faster-whisper-webui

I guess the VAD is included here (silero_vad.onnx). Is it on by default? Are there any settings I could tweak?

jhj0517 · 2023-10-03T06:53:13Z

Hi @Trevor-Z!
According to faster-whisper, the VAD filter (Silero VAD) is turned off by default,
so it was off when you transcribed in this webui.
I may have to add the VAD filter options to the Advanced Parameters.
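
As a rough sketch of what turning the filter on could look like when calling faster-whisper directly (outside this webui): `vad_filter` and `vad_parameters` are arguments of faster-whisper's `WhisperModel.transcribe()`, while the helper names `build_transcribe_kwargs` and `transcribe_file` and the 500 ms silence value are only illustrative assumptions, not code from this project.

```python
def build_transcribe_kwargs(use_vad: bool = True, min_silence_ms: int = 500) -> dict:
    """Collect keyword arguments for faster-whisper's WhisperModel.transcribe().

    Hypothetical helper for illustration; it is not part of the webui.
    """
    kwargs = {"vad_filter": use_vad}
    if use_vad:
        # Silero VAD option: minimum silence length (in ms) used to split speech chunks.
        kwargs["vad_parameters"] = {"min_silence_duration_ms": min_silence_ms}
    return kwargs


def transcribe_file(audio_path: str, model_size: str = "large-v2", use_vad: bool = True):
    """Transcribe one file and return (start, end, text) tuples (hypothetical helper)."""
    # Imported lazily so the kwargs helper above works without faster-whisper installed.
    from faster_whisper import WhisperModel

    model = WhisperModel(model_size)
    segments, _info = model.transcribe(audio_path, **build_transcribe_kwargs(use_vad))
    # `segments` is a generator; iterating over it is what actually runs the transcription.
    return [(seg.start, seg.end, seg.text) for seg in segments]
```

Calling `transcribe_file("audio.mp3")` would run the model with the VAD filter on, and `use_vad=False` reproduces the current default behavior.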

For now, if Whisper doesn't transcribe the first few sentences, it may mean that Whisper classified them as a silent part of the audio.

You can adjust the log_prob_threshold and no_speech_threshold values in the Advanced Parameters tab to change how Whisper handles silent parts.
The wiki explains how to use these parameters.
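
To make the interaction between the two thresholds concrete, here is a minimal sketch of the skip rule used by the reference Whisper decoding loop, assuming faster-whisper applies the same rule. The function name `is_treated_as_silence` is hypothetical; the defaults shown (0.6 and -1.0) are the library defaults. Since `no_speech_threshold` is compared against a probability, useful values lie between 0 and 1, and since `log_prob_threshold` is compared against an average log probability, useful values are 0 or negative.

```python
from typing import Optional


def is_treated_as_silence(
    no_speech_prob: float,
    avg_logprob: float,
    no_speech_threshold: Optional[float] = 0.6,
    log_prob_threshold: Optional[float] = -1.0,
) -> bool:
    """Return True if a segment would be skipped as silence (illustrative sketch)."""
    if no_speech_threshold is None:
        # With the no-speech check disabled, nothing is skipped as silence.
        return False
    skip = no_speech_prob > no_speech_threshold
    # A sufficiently confident decoding overrides the no-speech decision.
    if log_prob_threshold is not None and avg_logprob > log_prob_threshold:
        skip = False
    return skip


# High no-speech probability and low confidence: skipped.
print(is_treated_as_silence(no_speech_prob=0.8, avg_logprob=-1.5))
# Same segment with a stricter no_speech_threshold: kept.
print(is_treated_as_silence(no_speech_prob=0.8, avg_logprob=-1.5, no_speech_threshold=0.9))
```

Under this rule, raising no_speech_threshold or lowering log_prob_threshold makes Whisper less likely to drop a quiet opening passage.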

Trevor-Z · 2023-10-03T11:59:28Z

What's the valid range of values for log_prob_threshold and no_speech_threshold?

Also, is there some way to turn the VAD on now? Like changing a parameter in some .py file?

jhj0517 added the enhancement New feature or request label Oct 3, 2023

jhj0517 mentioned this issue May 21, 2024

add vad_filter #153

Merged
