[Bug] Wrong value for perceiver_cond_length_compression (256 instead of 1024) #3631

Open
talipturkmen opened this issue Mar 14, 2024 · 1 comment
Labels
bug (Something isn't working), wontfix (This will not be worked on but feel free to help.)

Comments

talipturkmen commented Mar 14, 2024

Describe the bug

[Bug]

Hello,

In XTTSv2, the dataloader produces the conditioning indices (start index and end index) in units of audio samples. The audio is later compressed by a factor of 256 during mel spectrogram extraction. These start and end indices are then used to mask the ground-truth audio codes. Since the audio codes are compressed by a factor of 1024, the start and end indices should also be divided by 1024. Instead, they are divided by perceiver_cond_length_compression, which is set to 256 by the GPT default arguments. As a result, the conditioning indices in the DVAE domain are 4 times larger than they should be, so the wrong part of the target audio tokens gets masked. I couldn't find any place where the value is correctly set to 1024, and I don't understand how the model is able to train and fine-tune with this bug.

I'd appreciate it if anyone could shed light on this.

Error lines:

perceiver_cond_length_compression=256,

cond_idxs[idx] = cond_idxs[idx] // self.perceiver_cond_length_compression
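
To illustrate the mismatch, here is a minimal sketch of the index arithmetic. It assumes the two compression factors stated above (256 samples per mel frame, 1024 samples per audio code). The sample indices and the helper name `to_code_index` are hypothetical and are not taken from the codebase.

```python
# Illustrative only: the factors come from the description above,
# and the sample indices are made-up values, not real training data.
MEL_COMPRESSION = 256    # samples per mel frame (current divisor)
CODE_COMPRESSION = 1024  # samples per audio code (divisor the report expects)


def to_code_index(sample_idx: int, compression: int) -> int:
    """Convert an index in audio samples to an index in compressed frames."""
    return sample_idx // compression


# Hypothetical conditioning window, expressed in audio samples.
cond_start, cond_end = 22050, 88200

# What dividing by perceiver_cond_length_compression=256 yields.
current = (
    to_code_index(cond_start, MEL_COMPRESSION),
    to_code_index(cond_end, MEL_COMPRESSION),
)

# What dividing by 1024 would yield.
expected = (
    to_code_index(cond_start, CODE_COMPRESSION),
    to_code_index(cond_end, CODE_COMPRESSION),
)

print(current)   # (86, 344)
print(expected)  # (21, 86)
```

With these numbers, the window computed with 256 starts roughly where the window computed with 1024 ends, so the mask would cover a different region of the audio codes.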

To Reproduce

  • XTTSv2 fine-tuning

Expected behavior

...

Logs

...

Environment

- All XTTSv2 versions

Additional context

No response

@talipturkmen talipturkmen added the bug Something isn't working label Mar 14, 2024

stale bot commented Apr 22, 2024

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. You might also look at our discussion channels.

@stale stale bot added the wontfix This will not be worked on but feel free to help. label Apr 22, 2024