DC-TTS

A PyTorch implementation of the paper Efficiently Trainable Text-to-Speech System Based on Deep Convolutional Networks with Guided Attention.

Thanks to Kyubyong/dc_tts, which helped me a lot in overcoming some difficulties.

Dataset

The LJ Speech Dataset, a public domain speech dataset consisting of 13,100 short audio clips of a single female speaker.
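
Before preprocessing, it can help to confirm that the extracted dataset looks as expected. The sketch below assumes the usual LJ Speech layout, where metadata.csv holds one pipe-delimited row per clip (id, transcription, normalized transcription). The function name `read_metadata` and the sample row are made up for illustration and are not part of this repository.

```python
import csv
import io


def read_metadata(lines):
    """Parse LJ Speech style rows: id|transcription|normalized transcription."""
    # The file is pipe-delimited and unquoted, so quoting is disabled.
    reader = csv.reader(lines, delimiter="|", quoting=csv.QUOTE_NONE)
    rows = []
    for fields in reader:
        if len(fields) < 2:
            continue  # skip blank or malformed rows
        clip_id, text = fields[0], fields[1]
        # Fall back to the raw text when the normalized column is missing or empty.
        normalized = fields[2] if len(fields) > 2 and fields[2] else text
        rows.append((clip_id, text, normalized))
    return rows


# Hypothetical sample row, not copied from the real dataset.
sample = io.StringIO("LJ000-0001|Chapter 1 begins.|Chapter one begins.\n")
rows = read_metadata(sample)
print(rows[0])
```

To use it on the real file, pass an open file handle for metadata.csv (opened with `encoding="utf-8"`) instead of the in-memory sample.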

Train

I have tuned the hyperparameters and trained a model on the LJ Speech Dataset. The hyperparameters may not be optimal and differ slightly from those used in the original paper.

To train a model yourself on the LJ Speech Dataset:

Download the dataset, extract it into a directory, and set that directory in pkg/hyper.py.
Run preprocessing:
```
python3 main.py --action preprocess
```
Train the Text2Mel network. You can change the device used for training Text2Mel in pkg/hyper.py.
```
python3 main.py --action train --module Text2Mel
```
Train the SSRN (SuperRes) network. The training device for this network can also be changed.
```
python3 main.py --action train --module SuperRes
```
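
The three commands above can also be collected in a small driver script. This is only a sketch: the file and function names (`build_commands`, `run_all`) are hypothetical, and it assumes each command exits on its own. If training runs until you interrupt it, launch the steps separately instead.

```python
import subprocess
import sys

# Arguments for the three steps from this README, in the order listed above.
STEPS = [
    ["--action", "preprocess"],
    ["--action", "train", "--module", "Text2Mel"],
    ["--action", "train", "--module", "SuperRes"],
]


def build_commands(python=sys.executable, script="main.py"):
    """Return one full command line (as a list) per step."""
    return [[python, script, *args] for args in STEPS]


def run_all(dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in build_commands():
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the sequence if a step fails.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_all(dry_run=True)
```

With `dry_run=True` the script only prints the commands, which is a cheap way to check them before starting a long training run.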

Samples

Some synthesized samples are in the synthesis directory. The corresponding sentences are listed in sentences.txt. The pre-trained models for Text2Mel and SuperRes (auto-saved during training at logdir/text2mel/pkg/trained.pkg and logdir/superres/pkg/trained.pkg) are loaded when synthesizing.

You can synthesize the sentences listed in sentences.txt with:

python3 main.py --action synthesis

Attention Matrix for the sentence: "Which came first... the chicken or the egg? Did the universe have a beginning... and if so, what happened before then? Where did the universe come from... and where is it going?"
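
For reading an attention matrix like this one: the guided attention loss described in the paper penalizes attention mass that lies far from the diagonal, so a well-trained model should show a roughly diagonal alignment. The sketch below follows the paper's formula, W[n][t] = 1 - exp(-(n/N - t/T)^2 / (2 g^2)), in plain Python. It is an illustration only and is not taken from this repository's code; the function names are hypothetical, and the default g = 0.2 follows the paper.

```python
import math


def guided_attention_weights(n_text, n_frames, g=0.2):
    """Penalty matrix: near 0 on the diagonal, approaching 1 far from it."""
    return [
        [
            1.0 - math.exp(-((n / n_text - t / n_frames) ** 2) / (2.0 * g * g))
            for t in range(n_frames)
        ]
        for n in range(n_text)
    ]


def guided_attention_loss(attention, g=0.2):
    """Mean of the element-wise product of the attention and penalty matrices."""
    n_text, n_frames = len(attention), len(attention[0])
    weights = guided_attention_weights(n_text, n_frames, g)
    total = sum(
        a * w
        for a_row, w_row in zip(attention, weights)
        for a, w in zip(a_row, w_row)
    )
    return total / (n_text * n_frames)


# Toy 4x4 alignments: a perfectly diagonal one and a reversed one.
diagonal = [[1.0 if n == t else 0.0 for t in range(4)] for n in range(4)]
anti_diagonal = [[1.0 if n == 3 - t else 0.0 for t in range(4)] for n in range(4)]

print(guided_attention_loss(diagonal))
print(guided_attention_loss(anti_diagonal))
```

The diagonal alignment incurs no penalty, while the reversed one is penalized. Skipped vowels typically show up as jumps or gaps along the diagonal of the plotted matrix.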

Pre-trained model

The samples in the synthesis directory were generated with a Text2Mel model trained for 410k batches and a SuperRes model trained for 190k batches.

The current result is not very satisfying; specifically, some vowels are skipped. I hope someone can find better hyperparameters and train better models. Please let me know if you manage to get a great model.

You can download the current pre-trained model from my Dropbox.

Dependencies

scipy, librosa, num2words
pytorch >= 0.4.0
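
A quick way to see which of these packages are installed before running main.py is sketched below. The import names are assumptions based on the list above (pytorch is imported as `torch`), and the helper `missing_modules` is hypothetical. The sketch does not check the pytorch version.

```python
import importlib.util

# Import names assumed for the packages listed above; pytorch is imported as "torch".
REQUIRED = ["scipy", "librosa", "num2words", "torch"]


def missing_modules(names):
    """Return the names that cannot be found on the current Python path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print("Missing packages: " + ", ".join(missing))
    else:
        print("All listed dependencies were found.")
```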

Related

TensorFlow implementation: Kyubyong/dc_tts

Please email me or open an issue if you have any questions or suggestions.

chaiyujin/dctts-pytorch
