Feature/ksponspeech #2510

ddwkim · 2024-04-16T14:23:37Z

What does this PR do?

This PR updates the KsponSpeech recipe to match recent changes in SpeechBrain.

Besides updating the conformer-medium model, this PR adds new models, conformer-small and branchformer-medium, which can serve as new benchmarks for the dataset.

The tokenizers and the Transformer language model are the previously trained ones, reused unchanged.

Other updates include a metadata build script that uses multiprocessing and an unzip script that follows the recent changes to the KsponSpeech directory structure.

These are all CTC-Attention models. Training of a conformer-transducer model for streaming use cases is underway.

We consistently see demand for KsponSpeech models from the community, as shown by the issues and the model downloads on the Hugging Face Hub.

I believe continued updates to the KsponSpeech models will benefit both the research community and industry.

Several things remain to be done before this PR is complete:

- Update the SpeechBrain models on Hugging Face. The models in my repository are already updated, so copying them to the SpeechBrain hub should be enough.
- Upload the training logs to the SpeechBrain Dropbox.
- Update the Colab and the documentation to reflect the current changes.

Before submitting

Did you read the contributor guideline?
Did you make sure your PR does only one thing, instead of bundling different changes together?
Did you make sure to update the documentation with your changes? (if necessary)
Did you write any new necessary tests? (not for typos and docs)
Did you verify new and existing tests pass locally with your changes?
Did you list all the breaking changes introduced by this pull request?
Does your code adhere to project-specific code style and conventions?

PR review

Reviewer checklist

Is this pull request ready for review? (if not, please submit in draft mode)
Check that all items from Before submitting are resolved
Make sure the title is self-explanatory and the description concisely explains the PR
Add labels and milestones (and optionally projects) to the PR so it can be classified
Confirm that the changes adhere to compatibility requirements (e.g., Python version, platform)
Review the self-review checklist to ensure the code is ready for review

ddwkim · 2024-04-18T00:47:46Z

I will fix the CI error as soon as possible.

Adel-Moumen · 2024-04-19T09:05:49Z

recipes/KsponSpeech/ksponspeech_prepare.py

 """

 import csv
 import logging
 import os
 import re
+from multiprocessing import Pool, cpu_count


Note: SpeechBrain already has a multiprocessing utility: https://github.com/speechbrain/speechbrain/blob/develop/speechbrain/utils/parallel.py#L202

You can find examples here: https://github.com/speechbrain/speechbrain/blob/develop/recipes/CommonVoice/common_voice_prepare.py#L328-L346 and https://github.com/speechbrain/speechbrain/blob/develop/recipes/LibriSpeech/librispeech_prepare.py#L329-L333

ddwkim · 2024-04-19

I have just updated the script accordingly.
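For readers following the thread, the pattern under discussion can be sketched roughly as below. This is a minimal illustration and not the code in `ksponspeech_prepare.py`: the names `parse_line` and `build_rows`, the `path :: transcript` line format, and the row keys are all assumptions made for the example. It uses the standard-library `multiprocessing.Pool` imported in the diff above, not the SpeechBrain helper the reviewer links to.

```python
from multiprocessing import Pool, cpu_count


def parse_line(line):
    """Split one 'audio_path :: transcript' line into a metadata row.

    The line format and the row keys are hypothetical, chosen for illustration.
    """
    path, _, text = line.partition(" :: ")
    # Utterance ID = file name without directory or extension.
    utt_id = path.rsplit("/", 1)[-1].rsplit(".", 1)[0]
    return {"ID": utt_id, "wav": path.strip(), "wrd": text.strip()}


def build_rows(lines, processes=None, chunksize=1024):
    """Parse all lines, optionally spreading the work over worker processes."""
    if processes == 1:
        # Serial path: handy for debugging and for small inputs.
        return [parse_line(line) for line in lines]
    with Pool(processes or cpu_count()) as pool:
        # pool.map preserves input order, so rows line up with the input lines.
        return pool.map(parse_line, lines, chunksize=chunksize)


if __name__ == "__main__":
    sample = [
        "KsponSpeech_01/KsponSpeech_0001/KsponSpeech_000001.pcm :: first utterance",
        "KsponSpeech_01/KsponSpeech_0001/KsponSpeech_000002.pcm :: second utterance",
    ]
    for row in build_rows(sample, processes=2):
        print(row["ID"], row["wrd"])
```

The per-line function is kept at module level so that worker processes can pickle it. Moving to a shared utility such as the one linked above would presumably replace only the `Pool` block and leave the per-line parsing as it is.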

Adel-Moumen · 2024-04-19T09:07:30Z

recipes/KsponSpeech/ASR/transformer/README.md

-| 01-23-23 | conformer_medium.yaml |     20.47%     |     25.18%     |     7.33%      |     7.99%      | [HuggingFace](https://huggingface.co/speechbrain/asr-conformer-transformerlm-ksponspeech) | [DropBox](https://www.dropbox.com/sh/uibokbz83o8ybv3/AACtO5U7mUbu_XhtcoOphAjza?dl=0) | 6xA100 80GB | 2 days 13 hours |
+| 04-16-24 | conformer_medium.yaml |     20.15%     |     24.75%     |     7.40%      |     7.96%      | [HuggingFace](https://huggingface.co/ddwkim/asr-conformer-transformerlm-ksponspeech) | [DropBox](https://www.dropbox.com/sh/uibokbz83o8ybv3/AACtO5U7mUbu_XhtcoOphAjza?dl=0) | 2xA100 40GB | 14 hours 7 mins |


I am a bit curious: do you know why training went from 2 days 13 hours on 6x A100 80GB to "only" 14 hours 7 mins on 2x A100 40GB?

ddwkim · 2024-04-19

My assumption is that the SpeechBrain training pipeline has been optimized and that newer torch versions help as well; neither was available to me in mid-2022, when I last trained the model. I will share the training logs soon, along with anything I find when inspecting the code changes between then and now.
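To put the two README rows side by side, the reported durations can be converted to GPU-hours. This is back-of-the-envelope arithmetic on the table figures only. It assumes every GPU was busy for the whole run, and it ignores the difference between the 80GB and 40GB cards and any change in batch size or number of epochs, none of which is stated in this thread. The helper name `gpu_hours` is made up for the example.

```python
def gpu_hours(num_gpus, days=0, hours=0, minutes=0):
    """Total GPU-hours for a run: wall-clock hours times number of GPUs."""
    wall_clock = days * 24 + hours + minutes / 60
    return num_gpus * wall_clock


old = gpu_hours(6, days=2, hours=13)      # 01-23-23 row: 6xA100, 2 days 13 hours
new = gpu_hours(2, hours=14, minutes=7)   # 04-16-24 row: 2xA100, 14 hours 7 mins

print(f"old run: {old:.1f} GPU-hours")
print(f"new run: {new:.1f} GPU-hours")
print(f"ratio:   {old / new:.1f}x")
```

On these figures the older run comes to 366 GPU-hours and the newer one to about 28.2, a reduction of roughly 13x in GPU-hours and about 4.3x in wall-clock time.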

Adel-Moumen self-requested a review April 17, 2024 07:56

ddwkim force-pushed the feature/ksponspeech branch 2 times, most recently from 5ea0b88 to 74eba89, April 18, 2024 00:44

ddwkim force-pushed the feature/ksponspeech branch 2 times, most recently from bf01589 to 0ef03ee, April 18, 2024 01:17

Adel-Moumen added the enhancement (New feature or request) and recipes (Changes to recipes only (add/edit)) labels Apr 18, 2024

Adel-Moumen reviewed Apr 19, 2024

ddwkim added 2 commits April 19, 2024 22:52

Update KsponSpeech prep script

05a4381

Update KsponSpeech recipe

57b6704

ddwkim force-pushed the feature/ksponspeech branch from 0ef03ee to 57b6704, April 19, 2024 13:57

Add transducer recipe

f2a4254

Adel-Moumen added this to the v1.0.2 milestone May 2, 2024
