End-to-End Speech Processing Toolkit
Updated May 22, 2024 - Python
A PyTorch-based Speech Toolkit
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing, etc.
This is the library for the Unbounded Interleaved-State Recurrent Neural Network (UIS-RNN) algorithm, corresponding to the paper Fully Supervised Speaker Diarization.
A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.
Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
An Android ChatBot powered by Watson Services - Assistant, Speech-to-Text and Text-to-Speech on IBM Cloud.
Multilingual Automatic Speech Recognition with word-level timestamps and confidence
Speaker diarization with UIS-RNN and speaker embedding with vgg-speaker-recognition
Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit
Python re-implementation of the (constrained) spectral clustering algorithms used in Google's speaker diarization papers.
A Python package to build AI-powered real-time audio applications
Deep speaker embeddings in PyTorch, including x-vectors. Code used in this work: https://arxiv.org/abs/2007.16196
A Repository for Single- and Multi-modal Speaker Verification, Speaker Recognition and Speaker Diarization
End-to-End Neural Diarization
An Android ChatBot powered by IBM Watson Services (Assistant V1, Text-to-Speech, and Speech-to-Text with Speaker Recognition) on IBM Cloud.
Time delay neural network (TDNN) implementation in PyTorch using the unfold method
This repository contains audio samples and supplementary materials accompanying publications by the "Speaker, Voice and Language" team at Google.