# Semantic Segmentation

## Datasets

Cityscapes: https://www.cityscapes-dataset.com/

Our code expects the Cityscapes dataset directory to be organized as follows:

```
cityscapes
├── gtFine
|   ├── train
|   ├── val
├── leftImg8bit
|   ├── train
|   ├── val
```

ADE20K: https://groups.csail.mit.edu/vision/datasets/ADE20K/

Our code expects the ADE20K dataset directory to be organized as follows:

```
ade20k
├── annotations
|   ├── training
|   ├── validation
├── images
|   ├── training
|   ├── validation
```
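
Only the layouts above matter; if the raw downloads were extracted elsewhere on disk, symlinking them into place is enough. A minimal sketch with placeholder source paths (everything under `/path/to/` is hypothetical):

```bash
# Placeholder paths: adjust the sources to wherever the official archives were
# extracted, then point the training/evaluation config at these two roots.
mkdir -p cityscapes ade20k
ln -s /path/to/cityscapes_download/gtFine cityscapes/gtFine
ln -s /path/to/cityscapes_download/leftImg8bit cityscapes/leftImg8bit
ln -s /path/to/ade20k_download/annotations ade20k/annotations
ln -s /path/to/ade20k_download/images ade20k/images
```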

## Pretrained Models

Latency and throughput are measured on NVIDIA Jetson Nano, NVIDIA Jetson AGX Orin, and NVIDIA A100 GPU with TensorRT in fp16 precision; data transfer time is included.
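
For reference, a TensorRT fp16 timing of an exported ONNX model can be approximated with `trtexec`. This is a generic sketch, not the authors' exact benchmarking setup:

```bash
# Generic TensorRT fp16 timing sketch (not the authors' exact benchmark).
# model.onnx is whatever onnx_export.py produced for the model/resolution of interest.
trtexec --onnx=model.onnx --fp16
```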

### Cityscapes

| Model | Resolution | Cityscapes mIoU | Params | MACs | Jetson Orin Latency (bs1) | A100 Throughput (bs1) | Checkpoint |
|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|
| EfficientViT-L1 | 1024x2048 | 82.716 | 40M | 282G | 45.9ms | 122 image/s | link |
| EfficientViT-L2 | 1024x2048 | 83.228 | 53M | 396G | 60.0ms | 102 image/s | link |

**EfficientViT B series**

| Model | Resolution | Cityscapes mIoU | Params | MACs | Jetson Nano Latency (bs1) | Jetson Orin Latency (bs1) | Checkpoint |
|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|
| EfficientViT-B0 | 1024x2048 | 75.653 | 0.7M | 4.4G | 275ms | 9.9ms | link |
| EfficientViT-B1 | 1024x2048 | 80.547 | 4.8M | 25G | 819ms | 24.3ms | link |
| EfficientViT-B2 | 1024x2048 | 82.073 | 15M | 74G | 1676ms | 46.5ms | link |
| EfficientViT-B3 | 1024x2048 | 83.016 | 40M | 179G | 3192ms | 81.8ms | link |

### ADE20K

| Model | Resolution | ADE20K mIoU | Params | MACs | Jetson Orin Latency (bs1) | A100 Throughput (bs16) | Checkpoint |
|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|
| EfficientViT-L1 | 512x512 | 49.191 | 40M | 36G | 7.2ms | 947 image/s | link |
| EfficientViT-L2 | 512x512 | 50.702 | 51M | 45G | 9.0ms | 758 image/s | link |

**EfficientViT B series**

| Model | Resolution | ADE20K mIoU | Params | MACs | Jetson Nano Latency (bs1) | Jetson Orin Latency (bs1) | Checkpoint |
|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|
| EfficientViT-B1 | 512x512 | 42.840 | 4.8M | 3.1G | 110ms | 4.0ms | link |
| EfficientViT-B2 | 512x512 | 45.941 | 15M | 9.1G | 212ms | 7.3ms | link |
| EfficientViT-B3 | 512x512 | 49.013 | 39M | 22G | 411ms | 12.5ms | link |

## Usage

```python
# semantic segmentation
from efficientvit.seg_model_zoo import create_seg_model

model = create_seg_model(
    name="l2", dataset="cityscapes", weight_url="assets/checkpoints/seg/cityscapes/l2.pt"
)

model = create_seg_model(
    name="l2", dataset="ade20k", weight_url="assets/checkpoints/seg/ade20k/l2.pt"
)
```
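
The returned object behaves like a regular PyTorch module. Below is a minimal inference sketch, not the repo's official API: the ImageNet normalization constants and the upsampling of the logits to the input resolution are assumptions, so check the repo's data pipeline before relying on them.

```python
# Minimal inference sketch. Assumptions (verify against the repo):
# - the model maps a normalized NCHW float tensor to per-class logits
# - ImageNet mean/std normalization matches the training pipeline
import torch
import torch.nn.functional as F
import torchvision.transforms.functional as TF
from PIL import Image

from efficientvit.seg_model_zoo import create_seg_model

model = create_seg_model(
    name="l2", dataset="cityscapes", weight_url="assets/checkpoints/seg/cityscapes/l2.pt"
).eval()

image = Image.open("assets/fig/city.png").convert("RGB")
x = TF.to_tensor(image)  # [3, H, W], float in [0, 1]
x = TF.normalize(x, mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
x = x.unsqueeze(0)  # [1, 3, H, W]

with torch.inference_mode():
    logits = model(x)  # [1, num_classes, h, w]
    # Upsample in case the head predicts at a lower resolution than the input.
    logits = F.interpolate(logits, size=x.shape[-2:], mode="bilinear", align_corners=False)
    pred = logits.argmax(dim=1)  # [1, H, W] per-pixel class indices
```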

## Evaluation

Please run `eval_seg_model.py` to evaluate our models.

Examples: segmentation
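
For instance, the command below evaluates EfficientViT-B3 on Cityscapes; the flags are the same as in the visualization example further down, minus `--save_path`:

```bash
python eval_seg_model.py --dataset cityscapes --crop_size 1024 --model b3
```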

## Visualization

Please run `eval_seg_model.py` to visualize the outputs of our semantic segmentation models.

Example:

```bash
python eval_seg_model.py --dataset cityscapes --crop_size 1024 --model b3 --save_path demo/cityscapes/b3/
```

You can also use `demo_seg_model.py` to visualize model predictions on a single image.

Example:

```bash
python demo_seg_model.py --image_path assets/fig/indoor.jpg --dataset ade20k --crop_size 512 --model l2

python demo_seg_model.py --image_path assets/fig/city.png --dataset cityscapes --crop_size 1024 --model l2
```

## Export

### ONNX

To generate ONNX files, please refer to `onnx_export.py`.
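
A hypothetical invocation, assuming `onnx_export.py` accepts the same arguments as `tflite_export.py` below; check the script itself for the actual flags:

```bash
# Assumed to mirror tflite_export.py's interface; verify against onnx_export.py.
python onnx_export.py --export_path model.onnx --task seg --dataset ade20k --model b3 --resolution 512 512
```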

### TFLite

To generate TFLite files, please refer to `tflite_export.py`. It requires the TinyNN package:

```bash
pip install git+https://github.com/alibaba/TinyNeuralNetwork.git
```

Example:

```bash
python tflite_export.py --export_path model.tflite --task seg --dataset ade20k --model b3 --resolution 512 512
```

## Citation

If EfficientViT is useful or relevant to your research, please kindly recognize our contributions by citing our paper:

```bibtex
@article{cai2022efficientvit,
  title={Efficientvit: Enhanced linear attention for high-resolution low-computation visual recognition},
  author={Cai, Han and Gan, Chuang and Han, Song},
  journal={arXiv preprint arXiv:2205.14756},
  year={2022}
}
```