
training config used for training stt_en_quartznet15x5 #9140

Open
ROZBEH opened this issue May 8, 2024 · 2 comments


ROZBEH commented May 8, 2024

This is not a feature request per se. It's more about seeking help :-)

Right now I am using stt_en_quartznet15x5 in NeMo, and I can load it and run inference without any issues. I can also view the model config, which includes the layers and the pre- and post-processing sections.

Now I am trying to train such a model from scratch. I used the default config that ships with the NeMo repo, like the ones located here. However, this does not produce a model with performance on par with the pretrained model that ships with NeMo (stt_en_quartznet15x5).

While I can find the model configs located here, I was wondering whether there is a way to see the actual config NVIDIA used to train the aforementioned model: things like the training setup, batch size, number of epochs, learning rate, etc., which may have been changed relative to the published config.

Thank you.
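For context, the default QuartzNet config in the repo exposes the hyperparameters in question roughly as follows. This is a sketch based on the shape of examples/asr/conf/quartznet/quartznet_15x5.yaml; the values shown are illustrative defaults, not NVIDIA's actual training settings:

```yaml
# Illustrative excerpt; the actual field values in the repo config may differ
model:
  train_ds:
    batch_size: 32          # per-GPU batch size
  optim:
    name: novograd          # QuartzNet was trained with NovoGrad
    lr: 0.01
    weight_decay: 0.001
    sched:
      name: CosineAnnealing
trainer:
  devices: 1                # tuned for a small local setup
  max_epochs: 5             # far fewer than a from-scratch run typically needs
```

Fields like devices, max_epochs, batch_size, and lr are exactly the ones that would differ between the shipped default and a large multi-GPU training run.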

@nithinraok
Collaborator

@ROZBEH generally we try to write the configs to suit training on a small number of GPUs, but these models may have been trained on a large number of GPUs, so the parameters might not be the same.

@sam1373 do you know the difference between current config and model trained?
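One practical implication of the GPU-count difference: the effective batch size is the per-GPU batch size times the number of devices times the gradient-accumulation factor, so approximating a large multi-GPU run on fewer GPUs usually means raising accumulation and retuning the learning rate and epoch count. A sketch of such overrides, with hypothetical numbers since the actual multi-GPU settings are not published in this thread:

```yaml
# Hypothetical: approximate an 8-GPU run (batch 32 per GPU) on a single GPU
# effective batch = 32 (per-GPU) x 1 (device) x 8 (accumulation) = 256
trainer:
  devices: 1
  accumulate_grad_batches: 8
model:
  train_ds:
    batch_size: 32
  optim:
    lr: 0.01                # retune whenever the effective batch size changes
```

Even with a matched effective batch size, results can still differ due to training duration, data, and augmentation settings.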


ROZBEH commented May 8, 2024

I see, thanks @nithinraok.
Yes, knowing the actual config would be really helpful.
