tts-german-pytorch

FastPitch (arXiv) trained on Thorsten Müller's Thorsten-2022.10 and Thorsten-21.06-emotional datasets.

Audio Samples

You can listen to some audio samples here.

Quick Setup

Required packages: torch torchaudio pyyaml phonemizer

See the phonemizer installation instructions (here) to install phonemizer and the espeak-ng backend.

~ for training: librosa matplotlib tensorboard

~ for the demo app: fastapi "uvicorn[standard]"
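To confirm that the packages listed above are importable before downloading any weights, a small check along these lines may help. This is a sketch, not part of the repository: the helper name `missing_modules` is hypothetical, and `yaml` is used because it is the import name of the pyyaml distribution.

```python
from importlib.util import find_spec

# Import names for the packages listed above. "yaml" is the import name
# of the pyyaml distribution; the others match their pip names.
REQUIRED = ["torch", "torchaudio", "yaml", "phonemizer"]
TRAINING = ["librosa", "matplotlib", "tensorboard"]
DEMO_APP = ["fastapi", "uvicorn"]


def missing_modules(names):
    """Return the top-level module names that cannot be found (hypothetical helper)."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    groups = [("required", REQUIRED), ("training", TRAINING), ("demo app", DEMO_APP)]
    for label, names in groups:
        missing = missing_modules(names)
        status = "ok" if not missing else "missing " + ", ".join(missing)
        print(f"{label}: {status}")
```

The check only verifies that the Python modules can be located; it does not verify that the espeak-ng backend used by phonemizer is installed.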

Download the pretrained weights for the FastPitch model (link).

Download the HiFi-GAN vocoder weights (link). Either put them into pretrained/hifigan-thor-v1 or edit the following lines in configs/basic.yaml.

# vocoder
vocoder_state_path: pretrained/hifigan-thor-v1/hifigan-thor.pth
vocoder_config_path: pretrained/hifigan-thor-v1/config.json
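If the vocoder fails to load, it can be useful to verify that the two paths in the config actually point at files. The following is a minimal sketch under stated assumptions: the helper names `load_config` and `missing_vocoder_files` are hypothetical, and it assumes the two keys shown above sit at the top level of configs/basic.yaml.

```python
from pathlib import Path

# Keys taken from the config excerpt above.
VOCODER_KEYS = ("vocoder_state_path", "vocoder_config_path")


def load_config(path="configs/basic.yaml"):
    """Parse the YAML config into a dict (requires pyyaml)."""
    import yaml  # imported here so the path check below works without pyyaml

    with open(path, encoding="utf-8") as f:
        return yaml.safe_load(f)


def missing_vocoder_files(config, root="."):
    """Return the vocoder keys whose path is unset or is not a file under root."""
    missing = []
    for key in VOCODER_KEYS:
        value = config.get(key)
        if value is None or not (Path(root) / value).is_file():
            missing.append(key)
    return missing


# Example, run from the repository root:
#   problems = missing_vocoder_files(load_config())
#   print("vocoder files found" if not problems else f"check these keys: {problems}")
```

An empty result means both files were found; otherwise the returned keys are the ones to fix in the config or to download again.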

Using the models

The FastPitch class from models.fastpitch is a wrapper that simplifies text-to-mel inference. The FastPitch2Wave class additionally includes the HiFi-GAN vocoder for direct text-to-speech inference.

Inferring the Mel spectrogram

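from models.fastpitch import FastPitch

# Load the pretrained FastPitch checkpoint
model = FastPitch('pretrained/fastpitch_de.pth')
model = model.cuda()  # requires a CUDA-capable GPU
# Convert text to a mel spectrogram
mel_spec = model.ttmel("Hallo Welt!")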
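The number of frames in a mel spectrogram maps to an audio duration through the hop length and the sampling rate. The sketch below shows that arithmetic. The helper name is hypothetical, and the defaults (hop length 256, sampling rate 22050 Hz) are assumptions that are common in FastPitch and HiFi-GAN setups but are not confirmed by this README; the vocoder config is the place to check the real values.

```python
def mel_frames_to_seconds(n_frames, hop_length=256, sample_rate=22050):
    """Approximate audio duration covered by n_frames mel frames.

    hop_length and sample_rate are ASSUMED defaults, not values confirmed
    by this repository; check the vocoder config for the real ones.
    """
    if n_frames < 0:
        raise ValueError("n_frames must be non-negative")
    # Each frame advances the audio by hop_length samples.
    return n_frames * hop_length / sample_rate
```

Which tensor axis holds the frames is not stated here, so the function takes the frame count as a plain integer rather than the spectrogram itself.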

End-to-end Text-to-Speech

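from models.fastpitch import FastPitch2Wave

# Load FastPitch together with the HiFi-GAN vocoder
model = FastPitch2Wave('pretrained/fastpitch_de.pth')
model = model.cuda()  # requires a CUDA-capable GPU

# Synthesize a single text
wave = model.tts("Hallo Welt!")

# Synthesize a list of texts
wave_list = model.tts(["null", "eins", "zwei", "drei", "vier", "fünf"])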
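To listen to the result, the samples can be written to a WAV file. The following sketch uses only the Python standard library. It assumes the output is a one-dimensional sequence of float samples in [-1, 1] and that the sampling rate is 22050 Hz; neither is confirmed by this README, and `save_wav` is a hypothetical helper, not part of the repository.

```python
import struct
import wave


def save_wav(samples, path, sample_rate=22050):
    """Write float samples in [-1, 1] to a mono 16-bit PCM WAV file.

    sample_rate=22050 is an ASSUMPTION; use the rate from the vocoder config.
    """
    # Clip to [-1, 1] and scale to the signed 16-bit range.
    pcm = [int(max(-1.0, min(1.0, float(s))) * 32767) for s in samples]
    with wave.open(path, "wb") as f:
        f.setnchannels(1)            # mono
        f.setsampwidth(2)            # 2 bytes = 16 bit
        f.setframerate(sample_rate)
        f.writeframes(struct.pack(f"<{len(pcm)}h", *pcm))
```

If model.tts returns a one-dimensional torch tensor (an assumption), a call such as save_wav(audio.cpu().tolist(), "hallo.wav") should work, where audio holds the result of model.tts. Storing the result under a name other than wave avoids shadowing the standard-library module used here.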

Web app

The web app uses the FastAPI library. To run the app you need the following packages:

fastapi: for the backend API
uvicorn: for serving the app

Install with: pip install fastapi "uvicorn[standard]"

Run with: python app.py

Acknowledgements

Thanks to Thorsten Müller for the high-quality datasets.

The FastPitch files stem from NVIDIA's DeepLearningExamples.

Repository: nipponjo/tts-german-pytorch