wav2letter-ctc-pytorch

A PyTorch implementation of Wav2Letter (paper) that takes raw waveform input and is trained with CTC.

The model was trained on LibriSpeech-960 with BatchNorm and Dropout. For inference, the BatchNorm layers can be fused into the convolution weights (Dropout is inactive at inference), which makes the weights compatible with Wav2Letter from torchaudio.models.
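The fusion mentioned above is the usual folding of an inference-mode BatchNorm into the convolution before it. The following is a sketch of the arithmetic only, assuming each BatchNorm1d directly follows a Conv1d (the exact layer order in model.py is not shown here). The symbols are introduced for illustration: W_c and b_c are the convolution weight and bias of output channel c, and μ_c, σ_c², γ_c, β_c, ε are the BatchNorm running mean, running variance, scale, shift and epsilon.

```latex
% Conv1d followed by BatchNorm1d in inference mode, per output channel c
\begin{aligned}
y_c &= W_c * x + b_c, \\
z_c &= \gamma_c \, \frac{y_c - \mu_c}{\sqrt{\sigma_c^2 + \epsilon}} + \beta_c .
\end{aligned}

% Equivalent single convolution: z_c = W'_c * x + b'_c
\begin{aligned}
W'_c &= \frac{\gamma_c}{\sqrt{\sigma_c^2 + \epsilon}} \, W_c, \\
b'_c &= \frac{\gamma_c}{\sqrt{\sigma_c^2 + \epsilon}} \, (b_c - \mu_c) + \beta_c .
\end{aligned}
```

Because the fused layer is a plain convolution with a bias, a model built only from convolutions and activations can load the resulting weights.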

Pretrained weights

for model.Wav2Letter (link)

for torchaudio.models.Wav2Letter (link)

Greedy decoding

dataset	CER	WER
dev-clean	0.111	0.331
test-clean	0.105	0.318
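For reference, CER and WER are commonly computed as the total edit distance divided by the total reference length, over characters and words respectively. The sketch below shows that common corpus-level definition in plain Python. Whether eval.py aggregates in exactly this way is an assumption, and the function names `edit_distance`, `error_rate`, `cer` and `wer` are illustrative, not taken from the repository.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (r != h),  # substitution (cost 0 on a match)
            ))
        prev = curr
    return prev[-1]


def error_rate(refs, hyps, tokenize):
    """Total edit distance divided by total reference length."""
    errors = sum(edit_distance(tokenize(r), tokenize(h)) for r, h in zip(refs, hyps))
    total = sum(len(tokenize(r)) for r in refs)
    return errors / total


def cer(refs, hyps):
    # Character error rate: tokens are single characters (spaces included).
    return error_rate(refs, hyps, list)


def wer(refs, hyps):
    # Word error rate: tokens are whitespace-separated words.
    return error_rate(refs, hyps, str.split)


# Toy transcripts, made up for illustration.
refs = ["the cat sat"]
hyps = ["the bat sat"]
print(f"CER={cer(refs, hyps):.3f} WER={wer(refs, hyps):.3f}")
```

Under this definition, the values in the table are fractions, so 0.105 corresponds to a 10.5 % character error rate.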

Example

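import torch
from torchaudio.models import Wav2Letter
# `labels` is the output vocabulary, assumed to be defined beforehand (see labels.py)
model = Wav2Letter(num_classes=len(labels)).cuda()
model.load_state_dict(torch.load('./pretrained/states_fused.pth'))
model.eval()  # switch to inference mode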
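Once the model produces per-frame class scores, greedy CTC decoding takes the most likely class in each frame, collapses consecutive repeats, and drops the blank symbol. The sketch below shows that collapse step in plain Python on a list of per-frame indices. The vocabulary `toy_labels` and the blank index 0 are hypothetical and may differ from what labels.py defines.

```python
from itertools import groupby


def ctc_greedy_decode(frame_ids, labels, blank=0):
    """Collapse repeated frame predictions, then remove CTC blanks."""
    collapsed = [idx for idx, _ in groupby(frame_ids)]
    return "".join(labels[idx] for idx in collapsed if idx != blank)


# Hypothetical vocabulary; index 0 is assumed to be the CTC blank.
toy_labels = ["_", " ", "a", "c", "t"]

# Per-frame argmax indices for one utterance, e.g. the argmax of the
# model output over its class dimension.
frame_ids = [3, 3, 0, 2, 2, 2, 0, 0, 4, 4]

decoded = ctc_greedy_decode(frame_ids, toy_labels)
print(decoded)  # cat
```

A blank between two identical indices keeps both characters, which is how CTC represents doubled letters: `[2, 0, 2]` decodes to "aa", while `[2, 2]` decodes to "a".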

Some filter kernels from the first Conv1d layer
