
Tortoise read.py - test large text files #314

Open
chlowden opened this issue May 12, 2024 · 5 comments
@chlowden

Hello
I am trying to test how tortoise handles reasonably large text files. I noticed that Tortoise has a command-line tool, read.py, for breaking large texts into small chunks. I can see the file in the tortoise install, but not in your webgui version. Is there a way to have this function working in the webgui, please?
Many thanks for such a great interface.

@rsxdalv
Owner

rsxdalv commented May 12, 2024

Hi, thanks for checking in. I'm guessing what you want in the webui is this:
https://github.com/neonbjb/tortoise-tts/blob/572bdf3d2475f1a330bb074c6addf433f887b480/tortoise/utils/text.py#L4

Are you using the React UI? It's far easier for me to add this function and make it work seamlessly there than in the Gradio UI.
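For reference, the linked upstream function packs sentences into length-bounded chunks before synthesis. Here is a simplified, hypothetical sketch of that idea (the real split_and_recombine_text also normalizes whitespace and quotes, and the names here are illustrative):

```python
import re

def split_into_chunks(text, max_len=200):
    """Greedily pack sentences into chunks of at most max_len characters.

    Simplified stand-in for tortoise's split_and_recombine_text;
    names and limits here are illustrative, not the upstream API.
    """
    # Split on sentence-ending punctuation followed by whitespace.
    sentences = re.split(r'(?<=[.!?])\s+', text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if current and len(current) + 1 + len(sentence) > max_len:
            chunks.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}".strip()
    if current:
        chunks.append(current)
    return chunks

print(split_into_chunks("One. Two. Three.", max_len=10))
# → ['One. Two.', 'Three.']
```

Each chunk is then synthesized separately and the audio concatenated, which is why chunk boundaries matter for intonation.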

@chlowden
Author

chlowden commented May 13, 2024

Hello,
I worked it out with the new React UI. I activated the "Split prompt by lines" button and removed any strange pagination from the text. I found that more than one consecutive line break stopped the process, while a single line break helped slightly with intonation. It took me 8 hours on an RTX 3090 running at 100% (very hot and noisy) to produce 8 minutes of narration. The result is 100 times better than anything else I have found and compares favorably to a similar production by Eleven Labs. The voice does go a little strange at some points, but that is easy to correct: because the system produces a separate file for each line split, regenerating a line is easier than correcting files from Eleven Labs.
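The cleanup described above (collapsing runs of blank lines before splitting by line) could be scripted ahead of time. A minimal sketch, assuming the text is pasted as a plain string (the function name is hypothetical, not part of the webui):

```python
import re

def prepare_lines(text):
    """Pre-clean text for "Split prompt by lines".

    Runs of two or more line breaks reportedly stall generation, so
    collapse them to a single newline and drop empty/whitespace lines.
    """
    normalized = re.sub(r'\n{2,}', '\n', text)
    return [line.strip() for line in normalized.split('\n') if line.strip()]
```

Running the narration text through this first should avoid the multi-line-break stalls mentioned above.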
Thank you so much for putting this UI together. It's fantastic.

@rsxdalv
Owner

rsxdalv commented May 13, 2024 via email

@chlowden
Author

Below is the setup I used
```json
{
  "voice": "train_grace",
  "preset": "standard",
  "seed": "1715536858",
  "cvvp_amount": 0.0,
  "split_prompt": true,
  "num_autoregressive_samples": 256,
  "diffusion_iterations": 200,
  "temperature": 0.8,
  "length_penalty": 1.0,
  "repetition_penalty": 2.0,
  "top_p": 0.8,
  "max_mel_tokens": 500,
  "cond_free": true,
  "cond_free_k": 2,
  "diffusion_temperature": 1.0,
  "model": "Default",
  "name": ""
}
```

@rsxdalv
Owner

rsxdalv commented May 16, 2024

Fixed the presets: changing the preset now actually updates the values (#315).
This won't speed up your previous attempt, but selecting a lower preset will now work properly, increasing speed at the cost of some quality.
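To illustrate what a preset change updates: the main levers for generation time are num_autoregressive_samples and diffusion_iterations. A rough sketch of a preset selector, with sample/iteration counts taken from upstream tortoise-tts defaults (they may differ from this webui's actual mapping):

```python
# Assumed values from upstream tortoise-tts defaults, not this webui's code.
PRESETS = {
    "ultra_fast":   {"num_autoregressive_samples": 16,  "diffusion_iterations": 30},
    "fast":         {"num_autoregressive_samples": 96,  "diffusion_iterations": 80},
    "standard":     {"num_autoregressive_samples": 256, "diffusion_iterations": 200},
    "high_quality": {"num_autoregressive_samples": 256, "diffusion_iterations": 400},
}

def apply_preset(settings, preset):
    """Return a copy of settings with the preset's values applied."""
    updated = dict(settings)
    updated.update(PRESETS[preset])
    updated["preset"] = preset
    return updated
```

With the "standard" preset used above (256 samples / 200 iterations), dropping to "fast" cuts both counts substantially, which is where the speedup comes from.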
