[WIP] S2ST recipe for SpeechMatrix #5735

Draft: wants to merge 56 commits into base: master


@juice500ml (Member) commented Apr 5, 2024

What?

Features

  • Enable use of a pretrained k-means model
  • Enable utilizing HiFiGAN pretrained vocoder
  • Add shallow decoder for source unit estimation
  • Add Hugging Face ASR models for ASR BLEU calculation
  • Refactor: Add an option to choose between maintaining the vocabulary and filtering OOV cases
  • Refactor: Connect run.sh's src_lang and tgt_lang parameters with local/data.sh
  • Refactor: Stage 4 currently has to be skipped manually. Add --skip_stages or other flags to s2st.sh
  • Refactor: The fairseq version currently installed by ESPnet does not support HiFiGAN
  • Refactor: Handle different vocoder types more elegantly
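
The "maintaining vocabulary vs. filtering OOV cases" option can be read in more than one way. The sketch below shows one possible interpretation, not the code in this PR: either keep every utterance and map out-of-vocabulary tokens to an unknown symbol, or drop any utterance that contains an OOV token. The names `apply_vocab`, `filter_oov`, and `UNK` are hypothetical.

```python
from typing import Iterable, List, Set

UNK = "<unk>"  # hypothetical unknown-token symbol, not taken from the PR


def apply_vocab(
    utterances: Iterable[List[str]], vocab: Set[str], filter_oov: bool
) -> List[List[str]]:
    """Apply a fixed vocabulary to tokenized utterances.

    filter_oov=True  -> drop every utterance containing an OOV token.
    filter_oov=False -> keep all utterances and replace OOV tokens with UNK.
    """
    result = []
    for tokens in utterances:
        has_oov = any(tok not in vocab for tok in tokens)
        if has_oov and filter_oov:
            continue  # skip the whole utterance
        result.append([tok if tok in vocab else UNK for tok in tokens])
    return result
```

With a fixed vocabulary, the first mode shrinks the data set while the second keeps its size but introduces unknown tokens, which is the trade-off such an option would expose.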

Bug fixes

  • Fix bugs related to the case where we only have speech data for training (i.e., use_src_lang=false, use_tgt_lang=false)
  • Fix the fix_data_dir filtering bug (caused by wav.scp.${src_lang} and wav.scp.${tgt_lang} being custom files)
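
To illustrate why custom per-language files matter for data-directory filtering, here is a minimal sketch that assumes Kaldi-style files where each line starts with an utterance ID. It keeps only the utterances present in every file, including files such as wav.scp.${src_lang}. The function name `filter_to_common_utts` and the language codes in the usage example are hypothetical, and this is not the fix applied in this PR.

```python
from typing import Dict, List


def _utt_id(line: str) -> str:
    """Return the first whitespace-separated field (the utterance ID)."""
    return line.split(maxsplit=1)[0]


def filter_to_common_utts(files: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Keep only lines whose utterance ID appears in every given file.

    `files` maps a file name (e.g. "wav.scp.es") to its lines.
    Blank lines are ignored; the original line order is preserved.
    """
    id_sets = [
        {_utt_id(line) for line in lines if line.strip()}
        for lines in files.values()
    ]
    common = set.intersection(*id_sets) if id_sets else set()
    return {
        name: [ln for ln in lines if ln.strip() and _utt_id(ln) in common]
        for name, lines in files.items()
    }


# Usage: utt2 is missing from the target-language file, so it is dropped everywhere.
example = filter_to_common_utts(
    {
        "wav.scp.es": ["utt1 a.wav", "utt2 b.wav"],
        "wav.scp.en": ["utt1 c.wav"],
    }
)
```

If a filtering step only knows about the standard file names, custom files like these are left unfiltered and can fall out of sync with the rest of the directory.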

Data preparation

  • FLEURS data preparation
  • EPST data preparation
  • SpeechMatrix data preparation
  • Remove the fairseq dependency from data preparation by copying and modifying the needed code
  • Add necessary exception handling for additional Python packages
  • SpeechMatrix valid/test splits
  • Refactor data_prep.py (the file is too large and has too much repetition)
  • Refactor run.sh to handle different test_sets per language pair
  • Refactor this commit: 3af3f41

Modeling

  • Conducted hyperparameter tuning to find the optimal architecture
  • Conducted hyperparameter tuning to find the optimal learning rate

Why?

See also

@mergify mergify bot added the ESPnet2 label Apr 5, 2024

2 participants