
Pytorch_Quantization_EX

0. Introduction

  • Goal : model quantization with PTQ & QAT
  • Process :
    1. Train a PyTorch model on a custom dataset
    2. Calibrate the pytorch-quantization model for PTQ (see the sketch after this list)
    3. Fine-tune the pytorch-quantization model for QAT
    4. Generate a TensorRT int8 model from the pytorch-quantization model
    5. Generate a TensorRT int8 model using a TensorRT calibration class
  • Sample Model : ResNet18
  • Dataset : imagenet100
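
  • Below is a minimal sketch of the PTQ calibration step (process step 2) using the NVIDIA pytorch-quantization toolkit. The data loader, batch count, and percentile value are illustrative assumptions; ptq.py may wire this up differently.

    import torch
    import torchvision
    from pytorch_quantization import calib, quant_modules
    from pytorch_quantization import nn as quant_nn

    # Patch torch.nn layers with quantized versions *before* building the model,
    # so every Conv2d/Linear gets input and weight TensorQuantizers attached.
    quant_modules.initialize()
    model = torchvision.models.resnet18(num_classes=100).cuda().eval()  # 100 classes assumed for imagenet100

    def collect_stats(model, data_loader, num_batches=32):
        # Calibration mode: disable fake quantization, record activation statistics.
        for module in model.modules():
            if isinstance(module, quant_nn.TensorQuantizer):
                if module._calibrator is not None:
                    module.disable_quant()
                    module.enable_calib()
                else:
                    module.disable()
        with torch.no_grad():
            for i, (images, _) in enumerate(data_loader):
                model(images.cuda())
                if i + 1 >= num_batches:
                    break
        # Back to fake-quant mode for evaluation and export.
        for module in model.modules():
            if isinstance(module, quant_nn.TensorQuantizer):
                if module._calibrator is not None:
                    module.enable_quant()
                    module.disable_calib()
                else:
                    module.enable()

    def compute_amax(model, **kwargs):
        # Turn the collected statistics into per-quantizer amax (range) values.
        for module in model.modules():
            if isinstance(module, quant_nn.TensorQuantizer) and module._calibrator is not None:
                if isinstance(module._calibrator, calib.MaxCalibrator):
                    module.load_calib_amax()
                else:
                    module.load_calib_amax(**kwargs)

    # Usage (calib_loader is an assumed imagenet100 DataLoader):
    # collect_stats(model, calib_loader)
    # compute_amax(model, method="percentile", percentile=99.99)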

1. Development Environment

  • Device
    • Windows 10 laptop
    • CPU i7-11375H
    • GPU RTX-3060
  • Dependency
    • CUDA 12.1
    • cuDNN 8.9.2
    • TensorRT 8.6.1
    • PyTorch 2.1.0+cu121

2. Code Scheme

    Quantization_EX/
    ├── calibrator.py       # calibration class for TensorRT PTQ
    ├── common.py           # utils for TensorRT
    ├── infer.py            # base model inference
    ├── onnx_export.py      # ONNX export
    ├── ptq.py              # Post-Training Quantization
    ├── qat.py              # Quantization-Aware Training
    ├── quant_utils.py      # utils for quantization
    ├── train.py            # base model training
    ├── trt_infer.py        # TensorRT model inference
    ├── utils.py            # utils
    ├── LICENSE
    └── README.md
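
  • calibrator.py holds the calibration class for the TensorRT-side PTQ (process step 5). A minimal sketch of such a class is shown below, assuming pycuda for the device buffer; the class name, cache file name, and batch source are illustrative, not the repository's exact implementation.

    import numpy as np
    import pycuda.autoinit  # noqa: F401  (creates a CUDA context)
    import pycuda.driver as cuda
    import tensorrt as trt

    class EntropyCalibrator(trt.IInt8EntropyCalibrator2):
        """Feeds preprocessed batches to the TensorRT builder and caches the resulting scales."""

        def __init__(self, batches, cache_file="calibration.cache"):
            super().__init__()
            self.batches = iter(batches)        # iterable of float32 arrays shaped [N, 3, 224, 224]
            self.cache_file = cache_file
            self.current = next(self.batches)
            self.batch_size = self.current.shape[0]
            self.device_input = cuda.mem_alloc(self.current.nbytes)

        def get_batch_size(self):
            return self.batch_size

        def get_batch(self, names):
            if self.current is None:
                return None                     # no more data: calibration is finished
            cuda.memcpy_htod(self.device_input, np.ascontiguousarray(self.current))
            self.current = next(self.batches, None)
            return [int(self.device_input)]

        def read_calibration_cache(self):
            # Reuse the scales from a previous run if a cache file exists.
            try:
                with open(self.cache_file, "rb") as f:
                    return f.read()
            except FileNotFoundError:
                return None

        def write_calibration_cache(self, cache):
            with open(self.cache_file, "wb") as f:
                f.write(cache)

  • When building the int8 engine, such a calibrator is typically attached via config.set_flag(trt.BuilderFlag.INT8) and config.int8_calibrator = EntropyCalibrator(...).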

3. Performance Evaluation

  • Measured over 10,000 iterations with a single input of shape [1, 3, 224, 224]
|                     | TRT    | TRT     | TRT PTQ | PT-Q PTQ | PT-Q PTQ w/ bnf | PT-Q QAT | PT-Q QAT w/ bnf |
|---------------------|--------|---------|---------|----------|-----------------|----------|-----------------|
| Precision           | FP32   | FP16    | Int8    | Int8     | Int8            | Int8     | Int8            |
| Acc Top-1 [%]       | 83.08  | 83.04   | 83.12   | 83.18    | 82.64           | 83.42    | 82.80           |
| Avg Latency [ms]    | 1.188  | 0.527   | 0.418   | 0.566    | 0.545           | 0.577    | 0.534           |
| Avg FPS [frame/sec] | 841.74 | 1896.01 | 2388.33 | 1764.55  | 1834.69         | 1730.89  | 1870.99         |
| GPU Memory [MB]     | 179    | 135     | 123     | 129      | 129             | 129      | 129             |
  • PT-Q : pytorch-quantization (PyTorch Quantization toolkit)
  • TRT : TensorRT
  • bnf : batch normalization folding (conv + bn -> conv'); see the folding sketch below
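
  • Batch normalization folding merges a BatchNorm2d that directly follows a Conv2d into the convolution's weights and bias, removing one op from the quantized graph. A minimal sketch of the arithmetic (quant_utils.py may implement it differently):

    import torch
    import torch.nn as nn

    def fold_bn_into_conv(conv: nn.Conv2d, bn: nn.BatchNorm2d) -> nn.Conv2d:
        # BN(conv(x)) = gamma * (W*x + b - mean) / sqrt(var + eps) + beta
        #             = (gamma / sqrt(var + eps)) * W * x
        #               + (gamma / sqrt(var + eps)) * (b - mean) + beta
        fused = nn.Conv2d(conv.in_channels, conv.out_channels, conv.kernel_size,
                          stride=conv.stride, padding=conv.padding,
                          dilation=conv.dilation, groups=conv.groups, bias=True)
        scale = bn.weight / torch.sqrt(bn.running_var + bn.eps)
        bias = conv.bias if conv.bias is not None else torch.zeros_like(bn.running_mean)
        with torch.no_grad():
            fused.weight.copy_(conv.weight * scale.reshape(-1, 1, 1, 1))
            fused.bias.copy_((bias - bn.running_mean) * scale + bn.bias)
        return fused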

4. Guide

  • Run the scripts in this order: infer -> train -> ptq -> qat -> onnx_export -> trt_infer -> trt_infer_acc (an ONNX export sketch follows below)
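
  • For the onnx_export step, a calibrated or QAT-fine-tuned pytorch-quantization model can emit explicit Q/DQ (QuantizeLinear/DequantizeLinear) nodes that TensorRT builds into an int8 engine. A minimal sketch (the output file name, opset, and input shape are assumptions; onnx_export.py may differ):

    import torch
    from pytorch_quantization import nn as quant_nn

    # Export TensorQuantizer modules as ONNX Q/DQ nodes instead of custom ops.
    quant_nn.TensorQuantizer.use_fb_fake_quant = True

    model.eval()  # 'model' is the calibrated / QAT-fine-tuned ResNet18 from the steps above
    dummy = torch.randn(1, 3, 224, 224, device="cuda")
    torch.onnx.export(
        model, dummy, "resnet18_qat.onnx",
        opset_version=13,             # opset >= 13 supports per-channel Q/DQ
        input_names=["input"], output_names=["output"],
    )

  • The resulting ONNX file can then be built into an int8 TensorRT engine, e.g. with trtexec --onnx=resnet18_qat.onnx --int8 or through TensorRT's Python builder API.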

5. Reference