This repository contains implementations of transformer models for natural language processing and computer vision tasks.
Paper: BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
- Fill-mask language model pretraining for downstream tasks ✅
- Sequence classification ✅
- Token classification 💠
- Next sentence prediction 💠
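The fill-mask pretraining objective above can be illustrated with a minimal, framework-free sketch of BERT-style input masking (this is an illustration, not this repository's code; `mask_tokens`, the toy vocabulary, and the `[MASK]` string are assumptions). Following the BERT paper, 15% of positions are selected for prediction; of those, 80% are replaced with `[MASK]`, 10% with a random token, and 10% are left unchanged:

```python
import random

MASK = "[MASK]"
VOCAB = ["the", "cat", "sat", "on", "mat", "dog", "ran"]  # toy vocabulary (assumed)

def mask_tokens(tokens, mask_prob=0.15, seed=0):
    """BERT-style masking: select positions with prob. mask_prob; of those,
    80% become [MASK], 10% a random token, 10% stay unchanged.
    Labels hold the original token at selected positions, None elsewhere."""
    rng = random.Random(seed)
    inputs, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            labels.append(tok)                    # model must predict the original
            r = rng.random()
            if r < 0.8:
                inputs.append(MASK)               # 80%: replace with [MASK]
            elif r < 0.9:
                inputs.append(rng.choice(VOCAB))  # 10%: random token
            else:
                inputs.append(tok)                # 10%: keep as-is
        else:
            labels.append(None)                   # position excluded from the loss
            inputs.append(tok)
    return inputs, labels
```

Keeping 10% of selected tokens unchanged forces the model to produce useful representations for every input position, since it cannot tell which tokens were corrupted.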
Paper: Language Models are Unsupervised Multitask Learners
- Semi-supervised training for sequence generation 💠
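The sequence-generation task follows GPT-2's autoregressive setup: during training each position predicts the next token, and at inference the model is sampled one token at a time. A minimal sketch of both pieces, independent of this repository's code (the `lm_targets` and `greedy_generate` helpers and the bigram stand-in "model" are hypothetical):

```python
def lm_targets(token_ids):
    """Autoregressive LM training pair: the model reads token_ids[:-1]
    and is trained to predict token_ids[1:] (each position -> next token)."""
    return token_ids[:-1], token_ids[1:]

def greedy_generate(next_token, prompt, max_new=5):
    """Greedy decoding loop: repeatedly feed the sequence so far to the
    model and append its most likely next token."""
    seq = list(prompt)
    for _ in range(max_new):
        seq.append(next_token(seq))
    return seq

# Toy deterministic stand-in for a trained model (hypothetical):
bigram = {"the": "cat", "cat": "sat", "sat": "down", "down": "."}
out = greedy_generate(lambda s: bigram.get(s[-1], "."), ["the"], max_new=3)
# out == ["the", "cat", "sat", "down"]
```

A real implementation would replace the bigram lookup with a transformer forward pass and typically sample with temperature or top-k instead of greedy argmax.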
Paper: An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
- Image inpainting 💠
- Image classification 💠
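The core idea of the ViT paper, splitting an image into 16x16 patches that are then treated as a token sequence, can be sketched without any framework (a minimal illustration, not this repository's code; `image_to_patches` and the toy image are assumptions, and a real implementation would project each flattened patch through a learned linear embedding):

```python
def image_to_patches(image, patch=16):
    """Split an H x W image (a list of rows) into non-overlapping
    patch x patch blocks, each flattened into a 1-D 'token'."""
    h, w = len(image), len(image[0])
    assert h % patch == 0 and w % patch == 0, "image must tile evenly"
    patches = []
    for top in range(0, h, patch):          # row-major patch order
        for left in range(0, w, patch):
            block = [image[top + r][left + c]
                     for r in range(patch) for c in range(patch)]
            patches.append(block)
    return patches

# Toy 32x32 single-channel "image": pixel value = row * 32 + col
img = [[r * 32 + c for c in range(32)] for r in range(32)]
tokens = image_to_patches(img, patch=16)
# (32/16) * (32/16) = 4 patch tokens, each of length 16 * 16 = 256
```

Each flattened patch plays the role of a word embedding input, which is exactly the "an image is worth 16x16 words" framing.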