
Reproducing TinyBERT requires the Wikipedia pre-training corpus; also, will a tinybert-cased model be open-sourced? #237

Open
hppy139 opened this issue May 11, 2023 · 0 comments

Comments


hppy139 commented May 11, 2023

Hello,

The paper states: "For the general distillation, we set the maximum sequence length to 128 and use English Wikipedia (2,500M words) as the text corpus and perform the intermediate layer distillation for 3 epochs with the supervision from a pre-trained BERT BASE and keep other hyper-parameters the same as BERT pre-training (Devlin et al., 2019)." Could you provide a download link for this pre-training corpus?

In addition, during the pre-training (general distillation) stage, is it acceptable to leave out the --do_lower_case flag of general_distill.py? The vocab.txt of the released models is a lowercase vocabulary, so I would like to ask whether a trained, case-sensitive TinyBERT model (i.e., a "tinybert-cased") is currently available.
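For reference, here is a minimal sketch of the cased/uncased difference that motivates the question. It uses the Hugging Face transformers BertTokenizer rather than the repo's own tokenization module, so it is only an illustration of what --do_lower_case changes, not the TinyBERT pipeline itself:

```python
# Illustrative sketch only (not TinyBERT code): shows how a cased vs. uncased
# BERT tokenizer treats the same input, which is what --do_lower_case controls.
from transformers import BertTokenizer

# Cased checkpoint: vocab.txt preserves case, do_lower_case defaults to False.
cased = BertTokenizer.from_pretrained("bert-base-cased")
# Uncased checkpoint: vocab.txt is all lowercase, do_lower_case defaults to True.
uncased = BertTokenizer.from_pretrained("bert-base-uncased")

text = "TinyBERT distills BERT on English Wikipedia"
print(cased.tokenize(text))    # keeps the original casing when building WordPiece tokens
print(uncased.tokenize(text))  # lower-cases the input first, matching the lowercase vocab
```

If the released vocab.txt is lowercase, training without --do_lower_case would presumably mismatch the vocabulary, which is why a dedicated cased checkpoint (and matching cased vocab) would be needed.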

Thanks!
