MaxViT (PyTorch version)

This repo contains an unofficial PyTorch version of the MaxViT model, together with training and validation code. Its purpose is to share PyTorch training hyper-parameters for MaxViT: we copy the hyper-parameters from Table 12 of the original paper, modified only for the number of GPUs (we use 4). Since most of the code, including the model, training, and validation scripts, is copied from the timm GitHub repository, credit belongs to @rwightman and the original authors. See also their repositories.
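
Since the model definition comes from timm, it can also be instantiated directly. A minimal sketch (plain timm usage, independent of this repo's scripts):

    import timm
    import torch

    # Create the MaxViT-T model registered in timm (random init here;
    # pass pretrained=True to load timm's released weights instead).
    model = timm.create_model('maxvit_tiny_tf_224', num_classes=1000)

    # Sanity check: one 224x224 image maps to 1000 class logits.
    x = torch.randn(1, 3, 224, 224)
    with torch.no_grad():
        logits = model(x)
    print(logits.shape)  # torch.Size([1, 1000])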

Tutorial

Tested environment: torch==1.11.0, timm==0.9.2

  1. Clone this repo

    git clone https://github.com/hankyul2/maxvit-pytorch
    cd maxvit-pytorch
  2. Run the following command to train MaxViT-T on the ImageNet-1k dataset. For other model variants, just change --drop-path to 0.3 (Small) or 0.4 (Base). To train with 4 GPUs, we use a gradient accumulation factor of 16 = 4096 (paper's total batch size) / 256 (our total batch size); see the gradient-accumulation sketch after this list.

    Training time: about 5 days for the maxvit_tiny_tf_224 model with 4 GPUs (RTX 3090, 24GB).

    torchrun --nproc_per_node=4 --master_port=12345 train.py /path/to/imagenet --model maxvit_tiny_tf_224 --aa rand-m15-mstd0.5-inc1 --mixup .8 --cutmix 1.0 --remode pixel --reprob 0.25 --drop-path .2 --opt adamw --weight-decay .05 --sched cosine --epochs 300 --lr 3e-3 --warmup-lr 1e-6 --warmup-epoch 30 --min-lr 1e-5 -b 64 -tb 4096 --smoothing 0.1 --clip-grad 1.0 -j 8 --amp --pin-mem --channels-last 
  3. Run the following command to reproduce the validation results of MaxViT-T on the ImageNet-1k dataset; a standalone validation sketch using timm utilities also follows this list.

    Results: Acc@1 83.820 (16.180) Acc@5 96.528 (3.472)

    python3 valid.py /path/to/imagenet --img-size 224 --crop-pct 0.95 --cuda 0 --model maxvit_tiny_tf_224 --pretrained
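
The training command in step 2 uses gradient accumulation to emulate the paper's total batch size of 4096 with a per-step batch of 256 (4 GPUs x 64). The sketch below illustrates the idea in plain PyTorch; the model, loader, criterion, and optimizer names are placeholders, and this is not the actual loop in train.py.

    import torch

    # Illustrative gradient-accumulation loop (not this repo's train.py).
    # With a per-step batch of 256, accumulating 16 steps emulates the
    # paper's total batch size of 4096 before each optimizer update.
    accum_steps = 16  # 4096 / 256

    def train_one_epoch(model, loader, criterion, optimizer):
        model.train()
        optimizer.zero_grad()
        for step, (images, targets) in enumerate(loader):
            loss = criterion(model(images), targets)
            # Scale so the accumulated gradient matches the mean over the
            # full effective batch rather than its sum.
            (loss / accum_steps).backward()
            if (step + 1) % accum_steps == 0:
                optimizer.step()
                optimizer.zero_grad()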
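
For step 3, if you only want to sanity-check the reported accuracy without valid.py, the measurement can be approximated with timm's data utilities. This is a rough sketch that assumes the ImageNet validation set lives at /path/to/imagenet and uses timm's released maxvit_tiny_tf_224 weights; valid.py handles the details properly.

    import timm
    import torch
    from timm.data import create_dataset, create_loader, resolve_data_config

    # Rough top-1 evaluation sketch using timm's data utilities.
    model = timm.create_model('maxvit_tiny_tf_224', pretrained=True).cuda().eval()
    config = resolve_data_config({}, model=model)

    dataset = create_dataset('', root='/path/to/imagenet', split='validation')
    loader = create_loader(
        dataset,
        input_size=config['input_size'],
        batch_size=64,
        interpolation=config['interpolation'],
        mean=config['mean'],
        std=config['std'],
        crop_pct=0.95,  # matches --crop-pct 0.95 above
    )

    correct = total = 0
    with torch.no_grad():
        for images, targets in loader:
            images, targets = images.cuda(), targets.cuda()
            preds = model(images).argmax(dim=1)
            correct += (preds == targets).sum().item()
            total += targets.numel()
    print(f'Top-1: {100 * correct / total:.3f}%')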

Experiment results

Model             Image size  #Param  FLOPs  Top-1  Artifacts
MaxViT-T (paper)  224         31M     5.6G   83.62  -
MaxViT-T (ours)   224         31M     5.6G   83.82  [yaml], [ckpt], [log], [csv]

References

@inproceedings{tu2022maxvit,
  title={{MaxViT}: Multi-axis Vision Transformer},
  author={Tu, Zhengzhong and Talebi, Hossein and Zhang, Han and Yang, Feng and Milanfar, Peyman and Bovik, Alan and Li, Yinxiao},
  booktitle={European Conference on Computer Vision},
  pages={459--479},
  year={2022},
  organization={Springer}
}
