Unable to reproduce results from the paper #131
Hey @MLMonkATGY! Could you share the arguments you're passing to `run_eval.py`? (See `distil-whisper/training/run_eval.py`, lines 545 to 548 at commit 3490d8e.)

Whereas in the original Flax scripts, we always used the `EnglishNormalizer` (see `distil-whisper/training/flax/run_eval.py`, line 728 at commit 3490d8e).

You should be able to reproduce the results one-to-one if you use the Flax script. I'll also update the PyTorch script to use the `EnglishNormalizer` if the language used is English!
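For intuition on why the normalizer choice moves WER by a few percent, here is a toy illustration. These are not the actual normalizers from the repo (those live in the Whisper/`transformers` normalizer modules); `basic_normalize` and `english_normalize` below are simplified stand-ins written for this example:

```python
import re

def basic_normalize(text):
    # Toy stand-in for a basic normalizer: lowercase, strip punctuation,
    # collapse whitespace. Spelling variants are left untouched.
    text = re.sub(r"[^\w\s']", " ", text.lower())
    return re.sub(r"\s+", " ", text).strip()

def english_normalize(text):
    # Toy stand-in for an English normalizer: additionally expand a few
    # common abbreviations so spelling variants of the same word match.
    text = basic_normalize(text)
    replacements = {"mr": "mister", "mrs": "missus", "dr": "doctor"}
    return " ".join(replacements.get(w, w) for w in text.split())

ref = "Mrs. Smith arrived."
hyp = "missus smith arrived"
print(basic_normalize(ref) == hyp)    # False: "mrs" vs "missus" counts as a word error
print(english_normalize(ref) == hyp)  # True: normalization removes the spurious mismatch
```

With only basic normalization, "mrs" vs "missus" is scored as a substitution error even though the transcription is correct, which inflates WER relative to results computed with the English normalizer.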
I used the following arguments for
Hey @MLMonkATGY, after merging #132, I evaluated the model with the following:

```bash
#!/bin/bash
python run_eval.py \
  --model_name_or_path "distil-whisper/distil-large-v2" \
  --dataset_name "distil-whisper/common_voice_13_0" \
  --dataset_config_name "en" \
  --dataset_split_name "test" \
  --text_column_name "text" \
  --batch_size 128 \
  --dtype "bfloat16" \
  --generation_max_length 256 \
  --language "en" \
  --streaming True
```

And got a WER of 13.0%: https://wandb.ai/sanchit-gandhi/distil-whisper-speed-benchmark/runs/7qihyqbx?nw=nwusersanchitgandhi

This is within 0.1% of the 12.9% WER reported in the paper. This 0.1% difference is expected, since the paper WER results are in Flax on TPU, whereas the
All in all, the PR #132 should now mean that evaluating models in English with the PyTorch script
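For reference, the WER being compared above is the word-level Levenshtein (edit) distance divided by the number of reference words. The evaluation script presumably computes it via a metrics library (an assumption on my part); a minimal self-contained sketch of the metric itself:

```python
def wer(reference, hypothesis):
    # Word error rate: minimum number of word-level insertions, deletions,
    # and substitutions needed to turn the hypothesis into the reference,
    # divided by the number of reference words.
    r, h = reference.split(), hypothesis.split()
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i  # delete all remaining reference words
    for j in range(len(h) + 1):
        d[0][j] = j  # insert all remaining hypothesis words
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            cost = 0 if r[i - 1] == h[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution / match
    return d[len(r)][len(h)] / len(r)

print(wer("the cat sat", "the cat sat"))  # → 0.0
print(wer("the cat sat", "the bat sat"))  # one substitution in three words, ≈ 0.33
```

Because WER counts every mismatched word, text normalization (as discussed above) is applied to both reference and hypothesis before this distance is computed; a mismatch in normalizers alone can account for a few percent of WER.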
Thanks!
Hi. Can the exact code from `run_eval.py` be used to reproduce the results from Table 16? I tried to benchmark distil-whisper-v2 on the `distil-whisper/common_voice_13_0` dataset and found that the WER is a few percent higher than what was reported in the paper.