
Simultaneous Speech Translation

Code base for streaming speech recognition, speech translation, and simultaneous translation experiments. It is based on fairseq.

Implemented

- Encoder
- Streaming Models

Setup

1. Install fairseq

   git clone https://github.com/pytorch/fairseq.git
   cd fairseq
   git checkout 4a7835b
   python setup.py build_ext --inplace
   pip install .

2. (Optional) Install apex for faster mixed precision (fp16) training.

3. Install dependencies

   pip install -r requirements.txt

4. Update submodules

   git submodule update --init --recursive

Pre-trained model

ASR model with an Emformer encoder and a Transformer decoder, pre-trained with a joint CTC and cross-entropy loss.

| MuST-C (WER) | en-de (V2) | en-es    |
|--------------|------------|----------|
| dev          | 9.65       | 14.44    |
| tst-COMMON   | 12.85      | 14.02    |
| model        | download   | download |
| vocab        | download   | download |
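
For reference, below is a minimal PyTorch sketch of a joint CTC and cross-entropy objective of the kind described above. The tensor shapes, the `ctc_weight` interpolation factor, and the index arguments are illustrative assumptions, not this repository's actual criterion.

```python
import torch.nn.functional as F

def joint_ctc_ce_loss(encoder_logits, encoder_lengths,
                      decoder_logits, targets, target_lengths,
                      blank_idx=0, pad_idx=1, ctc_weight=0.3):
    """Sketch: combine a CTC loss on the encoder with cross-entropy on the decoder."""
    # CTC branch: log-probs over encoder frames, shaped (T, B, V) as F.ctc_loss expects.
    ctc_logprobs = F.log_softmax(encoder_logits, dim=-1).transpose(0, 1)
    ctc = F.ctc_loss(ctc_logprobs, targets, encoder_lengths, target_lengths,
                     blank=blank_idx, zero_infinity=True)

    # Cross-entropy branch: decoder logits (B, T, V) against padded targets (B, T).
    ce = F.cross_entropy(decoder_logits.transpose(1, 2), targets,
                         ignore_index=pad_idx)

    # Weighted interpolation of the two objectives (ctc_weight is a hypothetical name).
    return ctc_weight * ctc + (1.0 - ctc_weight) * ce
```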

Sequence-level Knowledge Distillation

| MuST-C (BLEU) | en-de (V2) |
|---------------|------------|
| valid         | 31.76      |
| distillation  | download   |
| vocab         | download   |
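
Sequence-level knowledge distillation typically replaces the reference targets with beam-search outputs of a teacher translation model. The sketch below only illustrates that idea: the WMT19 hub model, file names, and beam size are stand-ins, not the teacher or recipe behind the numbers above.

```python
import torch

# Illustrative stand-in teacher: a public fairseq en-de model from torch.hub.
teacher = torch.hub.load("pytorch/fairseq", "transformer.wmt19.en-de.single_model",
                         tokenizer="moses", bpe="fastbpe")
teacher.eval()

# Decode each training source sentence with the teacher; its hypotheses become
# the distilled targets used to train the student model.
with open("train.en") as src, open("train.distilled.de", "w") as out:
    for line in src:
        out.write(teacher.translate(line.strip(), beam=5) + "\n")
```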

Citation

Please consider citing our paper:

@inproceedings{chang22f_interspeech,
  author={Chih-Chiang Chang and Hung-yi Lee},
  title={{Exploring Continuous Integrate-and-Fire for Adaptive Simultaneous Speech Translation}},
  year=2022,
  booktitle={Proc. Interspeech 2022},
  pages={5175--5179},
  doi={10.21437/Interspeech.2022-10627}
}

