speech-translation

Here are 47 public repositories matching this topic...

NVIDIA / NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

machine-translation tts speech-synthesis neural-networks deeplearning speaker-recognition asr multimodal speech-translation large-language-models speaker-diariazation generative-ai

Updated May 23, 2024
Python

espnet / espnet

End-to-End Speech Processing Toolkit

deep-learning chainer end-to-end machine-translation pytorch speech-synthesis speech-recognition kaldi voice-conversion speaker-diarization speech-separation speech-enhancement spoken-language-understanding speech-translation singing-voice-synthesis

Updated May 22, 2024
Python

KevKibe / African-Whisper

🚀 Framework for seamless fine-tuning of the Whisper model on a multilingual dataset and deployment to production.

speech speech-recognition speech-to-text whisper asr speech-translation speech-transcription

Updated May 21, 2024
Python

echogarden-project / echogarden

Easy-to-use speech toolset. Written in TypeScript. Includes tools for synthesis, recognition, alignment, speech translation, language detection, source separation and more.

text-to-speech speech language-detection speech-synthesis speech-recognition speech-to-text source-separation language-identification forced-alignment speech-translation speech-alignment

Updated May 15, 2024
TypeScript

microsoft / SpeechT5

Unified-Modal Speech-Text Pre-Training for Spoken Language Processing

speech-synthesis speech-recognition speech-translation speech-pretraining speecht5 speech2c speechlm speechut speech-text-pretraining vatlm vallex

Updated Apr 24, 2024
Python

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.

Updated May 23, 2024
Python

mllpresearch / ESO-dataset

ESO speech dataset: an English-language speech corpus in the oncology domain for ASR training and benchmarking, and for MT benchmarking.

machine-translation automatic-speech-recognition oncology domain-adaptation speech-corpus speech-translation large-language-models llm

Updated Apr 15, 2024

csikasote / bigc

This repository contains the data resources for the LacunaFund-supported project, Multimodal datasets for the Bemba Language of Zambia.

machine-translation speech-recognition zambia multimodal-learning speech-translation bemba-language image-grounded-conversations africa-language

Updated Apr 1, 2024

mt-upc / ZeroSwot

Pushing the Limits of Zero-shot End-to-End Speech Translation

translation speech-translation

Updated Mar 30, 2024
Python

hlt-mt / FBK-fairseq

Repository containing the open source code of works published at the FBK MT unit.

deep-learning pytorch speech-to-text subtitling gender-bias speech-translation simultaneous-translation

Updated Feb 22, 2024
Python

George0828Zhang / torch_cif

A fast parallel PyTorch implementation of the "CIF: Continuous Integrate-and-Fire for End-to-End Speech Recognition" https://arxiv.org/abs/1905.11235.

speech torch pytorch speech-recognition alignment automatic-speech-recognition speech-to-text cif asr monotonic speech-translation continuous-integrate-and-fire

Updated Feb 10, 2024
Python

Dadangdut33 / Speech-Translate

A real-time speech transcription and translation application using OpenAI Whisper and a free translation API. The interface is built with Tkinter, and the code is written entirely in Python.

python translate whisper tkinter-python speech-translation speech-transcription