Efficient-KD

You should download the dataset CoNLL-03 first and save it into './datasets/conll-03' first.

Train the teacher model or baseline (bert_layers=12 as the teacher in the script):

CUDA_VISIBLE_DEVICES=0 python train_bert_crf.py

Train the Efficient KD (You might set the corresponding teacher model path):

CUDA_VISIBLE_DEVICES=0 python train_distil_bert_crf.py

Cite

If you find that efficient sub-structured KD is helpful for your paper , please cite our paper:


@misc{lin2022efficient,
    title={Efficient Sub-structured Knowledge Distillation},
    author={Wenye Lin and Yangming Li and Lemao Liu and Shuming Shi and Hai-tao Zheng},
    year={2022},
    eprint={2203.04825},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

Name		Name	Last commit message	Last commit date
Latest commit History 7 Commits
models		models
utils		utils
.gitignore		.gitignore
README.md		README.md
eval.py		eval.py
requirements.txt		requirements.txt
train_bert_crf.py		train_bert_crf.py
train_distil_bert_crf.py		train_distil_bert_crf.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

models

models

utils

utils

.gitignore

.gitignore

README.md

README.md

eval.py

eval.py

requirements.txt

requirements.txt

train_bert_crf.py

train_bert_crf.py

train_distil_bert_crf.py

train_distil_bert_crf.py

Repository files navigation

Efficient-KD

Cite

About

Releases

Packages

Languages

Linwenye/Efficient-KD

Folders and files

Latest commit

History

Repository files navigation

Efficient-KD

Cite

About

Resources

Stars

Watchers

Forks

Languages