Computes the MWER (minimum WER) Loss with beam search and negative sampling strategy.

TeaPoly/CE-OptimizedLoss

CE-OptimizedLoss

Q&A

Why is the loss value negative?

The loss is just a scalar that training tries to minimize; nothing requires it to be positive. A detailed discussion can be found here.
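To see how a negative value can arise, here is a minimal sketch. It assumes the N-best formulation in which hypothesis probabilities are renormalized over the candidate list and the average word error count is subtracted from each hypothesis's error count. The function name `mwer_loss` and the toy numbers are illustrative and are not taken from this repository.

```python
import math

def mwer_loss(hyp_log_probs, word_errors):
    """Toy N-best MWER loss for a single utterance (illustrative only)."""
    # Renormalize hypothesis probabilities over the N-best list
    # (a numerically stable softmax over the log-probabilities).
    max_lp = max(hyp_log_probs)
    weights = [math.exp(lp - max_lp) for lp in hyp_log_probs]
    total = sum(weights)
    probs = [w / total for w in weights]
    # Subtract the average word error count over the candidates.
    # This baseline is what allows the loss to go below zero.
    mean_err = sum(word_errors) / len(word_errors)
    return sum(p * (e - mean_err) for p, e in zip(probs, word_errors))

# Three hypotheses with 0, 2 and 4 word errors; the model favors the best one.
loss = mwer_loss([math.log(0.7), math.log(0.2), math.log(0.1)], [0, 2, 4])
print(round(loss, 4))  # -1.2
```

With uniform probabilities over the candidates the same function returns zero, so under these assumptions a negative value simply means the model puts more weight on hypotheses with fewer errors than average.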

It is normal for the loss value to keep decreasing and to become negative, because the average word error over the candidates is subtracted when normalizing. For details, refer to the following formula from the paper:
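Assuming the formula referred to here is the N-best approximation of the expected number of word errors from Prabhavalkar et al. (2018), it can be written as:

```latex
\mathcal{L}_{\text{werr}}^{N\text{-best}}(x, y^{*})
  = \sum_{y_i \in \text{Beam}(x, N)} \hat{P}(y_i \mid x)\,
    \bigl[\, W(y_i, y^{*}) - \hat{W} \,\bigr],
\qquad
\hat{P}(y_i \mid x)
  = \frac{P(y_i \mid x)}{\sum_{y_j \in \text{Beam}(x, N)} P(y_j \mid x)}
```

Here $W(y_i, y^{*})$ is the number of word errors of hypothesis $y_i$ against the reference $y^{*}$, and $\hat{W}$ is the average number of word errors over the N-best candidates. Because $\hat{W}$ is subtracted, the bracketed term is negative for every hypothesis that is better than average, so the sum goes below zero as the model shifts probability toward those hypotheses.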

What is the difference between the beam search and negative sampling strategies for MWER loss?

Beam search and negative sampling are both methods for generating multiple candidate paths.

  • The negative sampling strategy generates multiple candidate paths by randomly masking the top-1 scoring token during MWER training, as described in the Paraformer paper (Gao et al., 2022).
  • The beam search strategy is a heuristic search algorithm that explores a graph by expanding the most promising nodes in a limited set.
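The negative sampling idea can be sketched roughly as follows. This is not the repository's implementation: the function name, the `mask_prob` parameter, and the choice to fall back to the second-best token when the top-1 token is masked are all assumptions made for illustration.

```python
import random

def negative_sampling_candidates(logits, nbest=4, mask_prob=0.3, seed=0):
    """Return `nbest` token-id paths: the greedy path plus perturbed copies.

    `logits` is a list of per-position score lists (vocabulary size >= 2).
    """
    rng = random.Random(seed)
    # Rank token ids at every position from highest to lowest score.
    ranked = [
        sorted(range(len(row)), key=lambda t: row[t], reverse=True)
        for row in logits
    ]
    greedy = [r[0] for r in ranked]
    candidates = [greedy]
    for _ in range(nbest - 1):
        path = []
        for r in ranked:
            # With probability `mask_prob`, mask the top-1 token at this
            # position and fall back to the runner-up (assumed behavior).
            masked = rng.random() < mask_prob
            path.append(r[1] if masked else r[0])
        candidates.append(path)
    return candidates

logits = [[0.1, 2.0, 0.3], [1.5, 0.2, 0.9], [0.0, 0.1, 3.0]]
cands = negative_sampling_candidates(logits, nbest=4, mask_prob=0.5, seed=1)
```

Each extra candidate costs only one random draw per position, with no search over partial paths. Candidates produced this way may repeat, and a real implementation would likely operate on batched tensors.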

As a result, the negative sampling strategy trains faster than the beam search strategy, while MWER loss with the beam search strategy is closer to how the model is actually decoded at inference time.
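For comparison, a minimal beam search sketch is shown below. It makes the simplifying assumption that the log-probabilities at each step do not depend on the tokens chosen so far; the function name and input format are hypothetical and are not taken from this repository.

```python
import math

def beam_search(step_log_probs, beam_size=2):
    """Keep the `beam_size` best partial paths at every step.

    `step_log_probs` is a list over time steps of {token: log_prob} dicts.
    Returns a list of (tokens, score) pairs, best first.
    """
    beams = [((), 0.0)]
    for log_probs in step_log_probs:
        # Expand every surviving path with every token at this step.
        expanded = [
            (tokens + (tok,), score + lp)
            for tokens, score in beams
            for tok, lp in log_probs.items()
        ]
        # Prune to the most promising paths.
        expanded.sort(key=lambda item: item[1], reverse=True)
        beams = expanded[:beam_size]
    return beams

steps = [
    {"a": math.log(0.6), "b": math.log(0.4)},
    {"c": math.log(0.9), "d": math.log(0.1)},
]
result = beam_search(steps, beam_size=2)
```

Every step expands and sorts all surviving paths, which is the extra work that negative sampling avoids.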

Citations

@article{gao2022paraformer,
  title={Paraformer: Fast and accurate parallel transformer for non-autoregressive end-to-end speech recognition},
  author={Gao, Zhifu and Zhang, Shiliang and McLoughlin, Ian and Yan, Zhijie},
  journal={arXiv preprint arXiv:2206.08317},
  year={2022}
}

@inproceedings{prabhavalkar2018minimum,
  title={Minimum word error rate training for attention-based sequence-to-sequence models},
  author={Prabhavalkar, Rohit and Sainath, Tara N and Wu, Yonghui and Nguyen, Patrick and Chen, Zhifeng and Chiu, Chung-Cheng and Kannan, Anjuli},
  booktitle={2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  pages={4839--4843},
  year={2018},
  organization={IEEE}
}
