A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
Updated Jun 9, 2024 - Python
Runtime Audio Importer plugin for Unreal Engine. Imports audio in various formats at runtime.
ICASSP 2023-2024 Papers: A complete collection of influential and exciting research papers from the ICASSP 2023-24 conferences. Explore the latest advancements in acoustics, speech and signal processing. Code included. Star the repository to support the advancement of audio and signal processing!
silero-vad + whisper.cpp (speech-to-text) for ROS 2
OpenVoiceOS Voice Satellite
Tr-VAD: An Efficient Transformer based Voice Activity Detection Model
faster_whisper GUI with PySide6
LastFM recommendation with sentiment analysis (Bachelor Thesis Project)
Transcribe Like a Pro, Without Paying a Penny!
Uses the excellent Silero VAD with the onnxruntime C API for fast detection of audio segments containing speech
Voice Activity Detection (VAD) AudioWorklet
On-device voice activity detection (VAD) powered by deep learning
An Optimized Speech-to-Text Pipeline for the Whisper Model Supporting Multiple Inference Engines
A local and uncensored AI entity.
Automagically synchronize subtitles with video.
Android Voice Activity Detection (VAD) library. Supports WebRTC VAD GMM, Silero VAD DNN, Yamnet VAD DNN models.
Experimental code: sound file preprocessing to optimize Whisper transcriptions without hallucinated texts