Feature/ksponspeech #2510
base: develop
I will fix the CI error as soon as possible.
"""

import csv
import logging
import os
import re
from multiprocessing import Pool, cpu_count
Note: we already have a tool for multiprocessing in speechbrain: https://github.com/speechbrain/speechbrain/blob/develop/speechbrain/utils/parallel.py#L202
You can find examples here: https://github.com/speechbrain/speechbrain/blob/develop/recipes/CommonVoice/common_voice_prepare.py#L328-L346 and https://github.com/speechbrain/speechbrain/blob/develop/recipes/LibriSpeech/librispeech_prepare.py#L329-L333
I have just updated the script accordingly.
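For readers following the thread, the pattern suggested in the review can be sketched roughly as below. This is a minimal illustration, not the code from this PR: the `" :: "` line format, the `process_line` and `build_rows` names, and the output column names are assumptions made for the example. The mapping function is passed in as a parameter so the sketch runs with the built-in `map`; in a recipe, one would presumably pass `parallel_map` from `speechbrain.utils.parallel` instead, as the linked CommonVoice script appears to do.

```python
import re
from functools import partial

# Hypothetical transcript line format, assumed for illustration only:
#   "<relative/path.pcm> :: <transcript>"
SEPARATOR = " :: "


def process_line(line, data_folder):
    """Turn one transcript line into a metadata row.

    Kept as a pure, module-level function so it can be pickled and
    dispatched to worker processes.
    """
    rel_path, transcript = line.rstrip("\n").split(SEPARATOR, maxsplit=1)
    # Utterance ID: file name without directory or extension.
    utt_id = rel_path.rsplit("/", 1)[-1].rsplit(".", 1)[0]
    # Collapse runs of whitespace in the transcript.
    wrd = re.sub(r"\s+", " ", transcript).strip()
    return {"ID": utt_id, "wav": f"{data_folder}/{rel_path}", "wrd": wrd}


def build_rows(lines, data_folder, map_fn=map):
    """Apply process_line to every line using the given mapping function.

    map_fn defaults to the built-in map (sequential). Any callable with the
    same (function, iterable) calling convention can be swapped in.
    """
    line_processor = partial(process_line, data_folder=data_folder)
    return list(map_fn(line_processor, lines))
```

Binding the fixed arguments with `functools.partial` keeps the per-line function free of shared state, which is what makes it straightforward to swap the sequential `map` for a parallel one.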
| 01-23-23 | conformer_medium.yaml | 20.47% | 25.18% | 7.33% | 7.99% | [HuggingFace](https://huggingface.co/speechbrain/asr-conformer-transformerlm-ksponspeech) | [DropBox](https://www.dropbox.com/sh/uibokbz83o8ybv3/AACtO5U7mUbu_XhtcoOphAjza?dl=0) | 6xA100 80GB | 2 days 13 hours | |
| 04-16-24 | conformer_medium.yaml | 20.15% | 24.75% | 7.40% | 7.96% | [HuggingFace](https://huggingface.co/ddwkim/asr-conformer-transformerlm-ksponspeech) | [DropBox](https://www.dropbox.com/sh/uibokbz83o8ybv3/AACtO5U7mUbu_XhtcoOphAjza?dl=0) | 2xA100 40GB | 14 hours 7 mins | |
I am a bit curious: do you know why training went from 2 days and 13 hours on 6x A100 80GB to "only" 14 hours and 7 minutes on 2x A100 40GB?
I assume it comes from optimizations in the SpeechBrain training pipeline and newer torch versions, neither of which was available in mid-2022, when I last trained the model. I will share the training logs soon, along with anything I find when inspecting the code changes between then and now.
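To put the two training times from the table rows on a common scale, here is a small back-of-the-envelope calculation. It only uses the durations and GPU counts quoted in the rows; it deliberately ignores the difference in GPU memory (80GB vs 40GB), batch size, and any other configuration changes, so the ratios are rough indicators rather than a like-for-like benchmark.

```python
def gpu_hours(days, hours, minutes, n_gpus):
    """Return (wall-clock hours, GPU-hours) for one training run."""
    wall_clock_h = days * 24 + hours + minutes / 60
    return wall_clock_h, wall_clock_h * n_gpus


# 01-23-23 row: 2 days 13 hours on 6 GPUs.
old_wall, old_gpu = gpu_hours(2, 13, 0, n_gpus=6)
# 04-16-24 row: 14 hours 7 mins on 2 GPUs.
new_wall, new_gpu = gpu_hours(0, 14, 7, n_gpus=2)

wall_speedup = old_wall / new_wall    # ratio of wall-clock times
gpu_hour_ratio = old_gpu / new_gpu    # ratio of total GPU-hours

print(f"old: {old_wall:.1f} h wall-clock, {old_gpu:.1f} GPU-hours")
print(f"new: {new_wall:.1f} h wall-clock, {new_gpu:.1f} GPU-hours")
print(f"wall-clock ratio: {wall_speedup:.2f}x, GPU-hour ratio: {gpu_hour_ratio:.2f}x")
```

The wall-clock time drops by a bit more than 4x while using a third of the GPUs, which is why the GPU-hour ratio comes out close to 13x.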
What does this PR do?
This PR updates the KsponSpeech recipe to match recent changes in SpeechBrain.
In addition to the updated conformer-medium model, it adds new models, such as conformer-small and branchformer-medium, which can serve as new benchmarks for the dataset.
The tokenizers and the Transformer language model are reused from the previously trained ones.
Other updates include a metadata build script that uses multiprocessing and an unzip script that follows the recent changes to the KsponSpeech directory structure.
All of these are CTC-Attention models; training of a conformer-transducer model for streaming use cases is underway.
We consistently see demand for KsponSpeech models from the community, as shown by issues and model downloads from the Hugging Face Hub.
I believe continued updates to the KsponSpeech models will benefit the research community and industry as a whole.
Several other things remain to be done before this PR is complete.