# Semantic Segmentation

## Datasets

Cityscapes: https://www.cityscapes-dataset.com/

Our code expects the Cityscapes dataset directory to be organized as follows:

```
cityscapes
├── gtFine
|   ├── train
|   ├── val
├── leftImg8bit
|   ├── train
|   ├── val
```

ADE20K: https://groups.csail.mit.edu/vision/datasets/ADE20K/

Our code expects the ADE20K dataset directory to be organized as follows:

```
ade20k
├── annotations
|   ├── training
|   ├── validation
├── images
|   ├── training
|   ├── validation
```
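
Only the layouts above matter; if the raw downloads were extracted elsewhere on disk, symlinking them into place is enough. A minimal sketch with placeholder source paths (everything under `/path/to/` is hypothetical):

```bash
# Placeholder paths: adjust the sources to wherever the official archives were
# extracted, then point the training/evaluation config at these two roots.
mkdir -p cityscapes ade20k
ln -s /path/to/cityscapes_download/gtFine cityscapes/gtFine
ln -s /path/to/cityscapes_download/leftImg8bit cityscapes/leftImg8bit
ln -s /path/to/ade20k_download/annotations ade20k/annotations
ln -s /path/to/ade20k_download/images ade20k/images
```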

## Pretrained Models

Latency and throughput are measured on NVIDIA Jetson Nano, NVIDIA Jetson AGX Orin, and NVIDIA A100 GPU with TensorRT in fp16 precision; data transfer time is included.
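
For reference, a TensorRT fp16 timing of an exported ONNX model can be approximated with `trtexec`. This is a generic sketch, not the authors' exact benchmarking setup:

```bash
# Generic TensorRT fp16 timing sketch (not the authors' exact benchmark).
# model.onnx is whatever onnx_export.py produced for the model/resolution of interest.
trtexec --onnx=model.onnx --fp16
```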

### Cityscapes

| Model | Resolution | Cityscapes mIoU | Params | MACs | Jetson Orin Latency (bs1) | A100 Throughput (bs1) | Checkpoint |
|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|
| EfficientViT-L1 | 1024x2048 | 82.716 | 40M | 282G | 45.9ms | 122 image/s | link |
| EfficientViT-L2 | 1024x2048 | 83.228 | 53M | 396G | 60.0ms | 102 image/s | link |

**EfficientViT B series**

| Model | Resolution | Cityscapes mIoU | Params | MACs | Jetson Nano Latency (bs1) | Jetson Orin Latency (bs1) | Checkpoint |
|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|
| EfficientViT-B0 | 1024x2048 | 75.653 | 0.7M | 4.4G | 275ms | 9.9ms | link |
| EfficientViT-B1 | 1024x2048 | 80.547 | 4.8M | 25G | 819ms | 24.3ms | link |
| EfficientViT-B2 | 1024x2048 | 82.073 | 15M | 74G | 1676ms | 46.5ms | link |
| EfficientViT-B3 | 1024x2048 | 83.016 | 40M | 179G | 3192ms | 81.8ms | link |

### ADE20K

| Model | Resolution | ADE20K mIoU | Params | MACs | Jetson Orin Latency (bs1) | A100 Throughput (bs16) | Checkpoint |
|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|
| EfficientViT-L1 | 512x512 | 49.191 | 40M | 36G | 7.2ms | 947 image/s | link |
| EfficientViT-L2 | 512x512 | 50.702 | 51M | 45G | 9.0ms | 758 image/s | link |

**EfficientViT B series**

| Model | Resolution | ADE20K mIoU | Params | MACs | Jetson Nano Latency (bs1) | Jetson Orin Latency (bs1) | Checkpoint |
|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|
| EfficientViT-B1 | 512x512 | 42.840 | 4.8M | 3.1G | 110ms | 4.0ms | link |
| EfficientViT-B2 | 512x512 | 45.941 | 15M | 9.1G | 212ms | 7.3ms | link |
| EfficientViT-B3 | 512x512 | 49.013 | 39M | 22G | 411ms | 12.5ms | link |

## Usage

```python
# semantic segmentation
from efficientvit.seg_model_zoo import create_seg_model

model = create_seg_model(
    name="l2", dataset="cityscapes", weight_url="assets/checkpoints/seg/cityscapes/l2.pt"
)

model = create_seg_model(
    name="l2", dataset="ade20k", weight_url="assets/checkpoints/seg/ade20k/l2.pt"
)
```
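
The returned object behaves like a regular PyTorch module. Below is a minimal inference sketch, not the repo's official API: the ImageNet normalization constants and the upsampling of the logits to the input resolution are assumptions, so check the repo's data pipeline before relying on them.

```python
# Minimal inference sketch. Assumptions (verify against the repo):
# - the model maps a normalized NCHW float tensor to per-class logits
# - ImageNet mean/std normalization matches the training pipeline
import torch
import torch.nn.functional as F
import torchvision.transforms.functional as TF
from PIL import Image

from efficientvit.seg_model_zoo import create_seg_model

model = create_seg_model(
    name="l2", dataset="cityscapes", weight_url="assets/checkpoints/seg/cityscapes/l2.pt"
).eval()

image = Image.open("assets/fig/city.png").convert("RGB")
x = TF.to_tensor(image)  # [3, H, W], float in [0, 1]
x = TF.normalize(x, mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
x = x.unsqueeze(0)  # [1, 3, H, W]

with torch.inference_mode():
    logits = model(x)  # [1, num_classes, h, w]
    # Upsample in case the head predicts at a lower resolution than the input.
    logits = F.interpolate(logits, size=x.shape[-2:], mode="bilinear", align_corners=False)
    pred = logits.argmax(dim=1)  # [1, H, W] per-pixel class indices
```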

## Evaluation

Please run `eval_seg_model.py` to evaluate our models.

Examples: segmentation
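
For instance, the command below evaluates EfficientViT-B3 on Cityscapes; the flags are the same as in the visualization example further down, minus `--save_path`:

```bash
python eval_seg_model.py --dataset cityscapes --crop_size 1024 --model b3
```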

## Visualization

Please run `eval_seg_model.py` to visualize the outputs of our semantic segmentation models.

Example:

```bash
python eval_seg_model.py --dataset cityscapes --crop_size 1024 --model b3 --save_path demo/cityscapes/b3/
```

You can also use `demo_seg_model.py` to visualize model predictions on a single image.

Example:

```bash
python demo_seg_model.py --image_path assets/fig/indoor.jpg --dataset ade20k --crop_size 512 --model l2

python demo_seg_model.py --image_path assets/fig/city.png --dataset cityscapes --crop_size 1024 --model l2
```

## Export

### ONNX

To generate ONNX files, please refer to `onnx_export.py`.
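
A hypothetical invocation, assuming `onnx_export.py` accepts the same arguments as `tflite_export.py` below; check the script itself for the actual flags:

```bash
# Assumed to mirror tflite_export.py's interface; verify against onnx_export.py.
python onnx_export.py --export_path model.onnx --task seg --dataset ade20k --model b3 --resolution 512 512
```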

### TFLite

To generate TFLite files, please refer to `tflite_export.py`. It requires the TinyNN package:

```bash
pip install git+https://github.com/alibaba/TinyNeuralNetwork.git
```

Example:

```bash
python tflite_export.py --export_path model.tflite --task seg --dataset ade20k --model b3 --resolution 512 512
```

## Citation

If EfficientViT is useful or relevant to your research, please kindly recognize our contributions by citing our paper:

```bibtex
@article{cai2022efficientvit,
  title={Efficientvit: Enhanced linear attention for high-resolution low-computation visual recognition},
  author={Cai, Han and Gan, Chuang and Han, Song},
  journal={arXiv preprint arXiv:2205.14756},
  year={2022}
}
```