Is Chinese supported? #70

Open
xxm1668 opened this issue Dec 29, 2023 · 4 comments

Comments

@xxm1668

xxm1668 commented Dec 29, 2023

No description provided.

@sanchit-gandhi
Collaborator

It does not, but you have two options for training a Chinese-compatible model:

  1. Follow the distillation instructions in the training folder and train on the Mandarin split of the Common Voice dataset https://github.com/huggingface/distil-whisper/tree/main/training
  2. Fine-tune the pre-trained checkpoint on Mandarin (instructions for this can also be found under the training folder)

Option 1 will give you the best results, but it is a little more involved than option 2.

The original reply also included a Chinese version of the message above, generated using Google Translate.
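As a rough illustration of the data side of option 1, the sketch below streams a Mandarin split of Common Voice with the Hugging Face `datasets` library. This is not taken from the training folder: the dataset version (`common_voice_13_0`), the config name (`zh-CN`), and the helper names `load_mandarin_split` and `preview` are all assumptions made for illustration.

```python
from itertools import islice

# Assumptions (not from the thread): the dataset version and the config name.
DATASET_ID = "mozilla-foundation/common_voice_13_0"  # assumed Common Voice release
MANDARIN_CONFIG = "zh-CN"  # assumed config name for Mandarin


def load_mandarin_split(split="train", streaming=True):
    """Load the Mandarin portion of Common Voice from the Hugging Face Hub.

    Hypothetical helper. `datasets` is imported lazily so that the module
    can be imported without the library installed.
    """
    from datasets import load_dataset

    return load_dataset(DATASET_ID, MANDARIN_CONFIG, split=split, streaming=streaming)


def preview(dataset, n=3):
    """Return the transcripts of the first n examples.

    Common Voice stores the transcript in the "sentence" column.
    """
    return [example["sentence"] for example in islice(dataset, n)]
```

Calling `preview(load_mandarin_split())` would need network access, and Common Voice is a gated dataset, so you would likely have to accept its terms on the Hub and authenticate first. Whether this exact dataset ID loads may also depend on the installed `datasets` version.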

@xxm1668
Author

xxm1668 commented Jan 18, 2024

Thanks for your recommendation.

@shuaijiang

You can refer to https://huggingface.co/BELLE-2/Belle-distilwhisper-large-v2-zh, which supports Chinese and is based on distil-whisper-large-v2.
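A minimal sketch of trying the checkpoint mentioned above through the `transformers` automatic-speech-recognition pipeline. The helper names `build_transcriber` and `transcribe` are hypothetical, and the model's own card may recommend additional settings (for example, forcing the language and task), so treat this as a starting point rather than the recommended usage.

```python
MODEL_ID = "BELLE-2/Belle-distilwhisper-large-v2-zh"  # checkpoint linked in the comment above


def build_transcriber(model_id=MODEL_ID):
    """Create a speech-recognition pipeline for the given checkpoint.

    Hypothetical helper. `transformers` is imported lazily; calling this
    downloads the model weights from the Hugging Face Hub.
    """
    from transformers import pipeline

    return pipeline("automatic-speech-recognition", model=model_id)


def transcribe(transcriber, audio_path):
    """Run the pipeline on one audio file and return the recognized text."""
    result = transcriber(audio_path)  # the ASR pipeline returns a dict with a "text" key
    return result["text"].strip()
```

For example, `transcribe(build_transcriber(), "sample.wav")` would return the transcript of a local file, where `sample.wav` is a placeholder path.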

@xxm1668
Author

xxm1668 commented Feb 22, 2024

thanks
