Hi! LibriSpeech char training!! #2385

Open
scj0709 opened this issue Feb 2, 2024 · 1 comment

scj0709 commented Feb 2, 2024

Describe the bug

Hello!
I'm really impressed with your code. It's well-structured and a great GitHub repository.
I'd like to train the LibriSpeech Transformer model following your procedure. However, when I tried to train it with character tokens by setting the token type to 'char', I encountered the following error, which seems to be related to padding. [image]

Expected behaviour

Could you provide any solutions for this issue?

To Reproduce

No response

Environment Details

No response

Relevant Log Output

No response

Additional Context

No response

@scj0709 scj0709 added the bug Something isn't working label Feb 2, 2024
@Adel-Moumen Adel-Moumen self-assigned this Apr 7, 2024
@Adel-Moumen Adel-Moumen added this to the v1.0.1 milestone Apr 8, 2024
Adel-Moumen (Collaborator) commented

Hello @scj0709,

Thanks for opening this issue.

Could you please share which YAML file you were using when you ran into this error?

The issue is that the Transformer YAMLs in the LibriSpeech folder use "transformerlm", a language model that was trained with a SentencePiece BPE tokenizer. The recipe uses that exact same tokenizer, so you cannot change the granularity of your tokenizer.

This is why I'm surprised that you ran into this issue. Do you mind sharing the YAML with me, please?

Thanks and have a great day.
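
For readers following along, here is a toy sketch of why a model and a language model generally need to share one tokenizer. Everything in it is hypothetical (the two small vocabularies and the helpers `encode_chars` and `decode_with`); it does not use SpeechBrain or SentencePiece, and it is not meant to reproduce the padding error reported above. It only shows that token IDs produced with a character vocabulary mean something different, or nothing at all, when looked up in a subword vocabulary.

```python
# Toy illustration only: hypothetical vocabularies, not the real tokenizer or LM.

# Hypothetical subword (BPE-like) vocabulary that a language model was trained with.
lm_vocab = ["<blank>", "<bos>", "<eos>", "<unk>", "▁the", "▁cat", "▁s", "at"]

# Hypothetical character vocabulary, as one might get with token type 'char'.
char_vocab = ["<blank>", "<bos>", "<eos>", "<unk>", "▁"] + list("acehst")
char_token_to_id = {tok: i for i, tok in enumerate(char_vocab)}


def encode_chars(text):
    """Encode text one character at a time, writing spaces as the '▁' marker."""
    unk = char_token_to_id["<unk>"]
    return [char_token_to_id.get(ch, unk) for ch in text.replace(" ", "▁")]


def decode_with(vocab, ids):
    """Look up each id in `vocab`; ids beyond its size are flagged as out of range."""
    return [vocab[i] if i < len(vocab) else "<out of range>" for i in ids]


char_ids = encode_chars("the cat")

# Read back with the vocabulary that produced them, the ids round-trip correctly.
as_chars = decode_with(char_vocab, char_ids)

# Read with the LM's vocabulary, the same ids map to unrelated tokens,
# and some ids exceed the LM's vocabulary size entirely.
as_lm_tokens = decode_with(lm_vocab, char_ids)

print(char_ids)
print(as_chars)
print(as_lm_tokens)
```

In a real setup the out-of-range case would typically surface as an indexing or shape error somewhere in the model rather than as a readable placeholder, which is one reason such mismatches can be hard to trace.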

@asumagic asumagic removed this from the v1.0.1 milestone Apr 11, 2024