[Efficient Modulation for Vision Networks] (ICLR 2024)

News & TODO & Updates:

Improve performance with a better training recipe.
Simplify the model by removing unnecessary settings and renaming classes to ease understanding.
Upload a benchmark script to simplify latency benchmarking.

Image Classification

1. Requirements

torch>=1.7.0; torchvision>=0.8.0; pyyaml; timm==0.6.13;

Data preparation: arrange ImageNet in the following folder structure. You can extract ImageNet with this script.

│imagenet/
├──train/
│  ├── n01440764
│  │   ├── n01440764_10026.JPEG
│  │   ├── n01440764_10027.JPEG
│  │   ├── ......
│  ├── ......
├──val/
│  ├── n01440764
│  │   ├── ILSVRC2012_val_00000293.JPEG
│  │   ├── ILSVRC2012_val_00002138.JPEG
│  │   ├── ......
│  ├── ......
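
Before launching a long run, it can help to confirm that the dataset folder matches the layout above. The following is a minimal sketch using only the Python standard library; `summarize_imagenet` is a hypothetical helper and is not part of this repository.

```python
from pathlib import Path


def summarize_imagenet(root):
    """Count class folders and JPEG files under train/ and val/.

    Assumes the layout shown above: <root>/<split>/<wnid>/<image>.JPEG.
    """
    root = Path(root)
    summary = {}
    for split in ("train", "val"):
        split_dir = root / split
        if not split_dir.is_dir():
            raise FileNotFoundError(f"missing split folder: {split_dir}")
        # Each class is one sub-folder named by its WordNet ID (e.g. n01440764).
        class_dirs = sorted(p for p in split_dir.iterdir() if p.is_dir())
        n_images = sum(
            1
            for class_dir in class_dirs
            for f in class_dir.iterdir()
            if f.suffix.lower() in {".jpeg", ".jpg"}
        )
        summary[split] = {"classes": len(class_dirs), "images": n_images}
    return summary


if __name__ == "__main__":
    import sys

    if len(sys.argv) > 1:
        print(summarize_imagenet(sys.argv[1]))
```

For a complete ImageNet-1k copy, both splits should report 1000 classes, with 1,281,167 training images and 50,000 validation images.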

2. Pre-trained EfficientMod Models

We have uploaded the checkpoints (with distillation) and training logs to Google Drive; feel free to download them.

Model	#Params	Image resolution	Top-1 Acc. (%)	Download
EfficientMod-xxs	4.7M	224	77.1	[checkpoint & logs]
EfficientMod-xs	6.6M	224	79.4	[checkpoint & logs]
EfficientMod-s	12.9M	224	81.9	[checkpoint & logs]
EfficientMod-s-Conv (No Distill.)	12.9M	224	80.5	[checkpoint & logs]

3. Validation

To evaluate our EfficientMod models, run:

python3 validate.py /path/to/imagenet  --model {model} -b 256 --checkpoint {/path/to/checkpoint}
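
The Top-1 numbers in the table above are the percentage of validation images whose highest-scoring class matches the label. As a rough illustration of that metric (not the code in `validate.py`, which may differ in detail), here is a minimal sketch in plain Python; `top1_accuracy` is a hypothetical name.

```python
def top1_accuracy(logits, targets):
    """Return top-1 accuracy in percent.

    logits:  list of per-class score lists, one list per image.
    targets: list of ground-truth class indices.
    """
    if len(logits) != len(targets):
        raise ValueError("logits and targets must have the same length")
    if not targets:
        raise ValueError("need at least one sample")
    correct = 0
    for scores, target in zip(logits, targets):
        # The prediction is the index of the largest score (argmax).
        predicted = max(range(len(scores)), key=scores.__getitem__)
        if predicted == target:
            correct += 1
    return 100.0 * correct / len(targets)
```

For example, with four images of which three are classified correctly, the function returns 75.0.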

4. Train

We show how to train EfficientMod on 8 GPUs.

python3 -m torch.distributed.launch --nproc_per_node=8 train.py --data {path-to-imagenet} --model {model} -b 256 --lr 4e-3 --amp --model-ema --distillation-type soft --distillation-tau 1 --auto-resume --exp_tag {experiment_tag}
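
The `--distillation-type soft` and `--distillation-tau 1` flags suggest soft-label knowledge distillation, where the student matches a teacher's temperature-softened class distribution. The sketch below shows the commonly used form of that loss for a single sample, in plain Python. It is an assumption that `train.py` computes this exact quantity; the function names are hypothetical, and the real script may differ in how it averages over the batch or weights the term against the classification loss.

```python
import math


def log_softmax(logits, tau=1.0):
    """Numerically stable log-softmax of logits divided by temperature tau."""
    scaled = [x / tau for x in logits]
    m = max(scaled)
    log_z = m + math.log(sum(math.exp(x - m) for x in scaled))
    return [x - log_z for x in scaled]


def soft_distillation_loss(student_logits, teacher_logits, tau=1.0):
    """KL(teacher || student) on softened distributions, scaled by tau^2."""
    log_p_student = log_softmax(student_logits, tau)
    log_p_teacher = log_softmax(teacher_logits, tau)
    kl = sum(
        math.exp(lt) * (lt - ls)
        for lt, ls in zip(log_p_teacher, log_p_student)
    )
    # The tau^2 factor keeps gradient magnitudes comparable across temperatures.
    return kl * tau * tau
```

With `tau = 1`, as in the command above, the softening step is a no-op and the loss reduces to the plain KL divergence between the teacher and student predictions.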

See the detection folder for object detection and instance segmentation on COCO.

See the segmentation folder for semantic segmentation on ADE20K.

BibTeX

@inproceedings{
    ma2024efficient,
    title={Efficient Modulation for Vision Networks},
    author={Xu Ma and Xiyang Dai and Jianwei Yang and Bin Xiao and Yinpeng Chen and Yun Fu and Lu Yuan},
    booktitle={The Twelfth International Conference on Learning Representations},
    year={2024},
    url={https://openreview.net/forum?id=ip5LHJs6QX}
}
