yester31/TensorRT_ONNX

TensorRT_ONNX

0. Introduction

  • Goal : Convert a PyTorch model to a TensorRT INT8 model via ONNX for use in C++ code.
  • Process : PyTorch model (Python) -> ONNX -> TensorRT model (C++) -> TensorRT PTQ INT8 model (C++)
  • Process2 : PyTorch model (Python) -> ONNX -> TensorRT model (Python)
  • Sample Model : Resnet18
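
The first step of both processes (PyTorch -> ONNX) can be sketched roughly as follows. This is a minimal illustration, not the contents of 3_resnet18_onnx.py: the function names, the output path `resnet18.onnx`, the tensor names `input`/`output`, and opset 13 are all assumptions chosen for the example, and the model is built with random weights rather than a trained checkpoint.

```python
def nchw_shape(height=224, width=224, channels=3, batch=1):
    """Turn an HWC image size such as [224, 224, 3] into an NCHW tensor shape."""
    return (batch, channels, height, width)


def export_resnet18_to_onnx(onnx_path="resnet18.onnx", opset_version=13):
    """Export a torchvision ResNet18 to ONNX (hypothetical helper for illustration)."""
    # Imported lazily so nchw_shape() stays usable without PyTorch installed.
    import torch
    import torchvision

    # Assumption: random weights; in practice load the trained .pth checkpoint.
    model = torchvision.models.resnet18(weights=None)
    model.eval()  # export in inference mode (fixes batch-norm statistics)

    # The exporter traces the model with a dummy input of the target shape.
    dummy_input = torch.randn(*nchw_shape())

    torch.onnx.export(
        model,
        dummy_input,
        onnx_path,
        opset_version=opset_version,   # assumed value, adjust to your TensorRT version
        input_names=["input"],         # assumed tensor names
        output_names=["output"],
        do_constant_folding=True,
    )
    return onnx_path
```

Calling `export_resnet18_to_onnx()` writes the ONNX file that the TensorRT stage then parses. A fixed input shape is used here because the benchmark below runs on a single [224,224,3] image; dynamic batch sizes would need additional export options.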

1. Development Environment

  • Device
    • Windows 10 laptop
    • CPU i7-11375H
    • GPU RTX-3060
  • Dependency
    • cuda 11.4.1
    • cudnn 8.4.1
    • tensorrt 8.4.3
    • pytorch 1.13.1+cu116
    • onnx 1.13.0
    • onnxruntime-gpu 1.14.0

2. Code Scheme

    TensorRT_ONNX/
    ├── calib_data/                   # 100 images for ptq
    ├── data/                         # input image
    ├── Pytorch/
    │   ├─ model/                     # onnx, pth, wts files
    │   ├─ 1_resnet18_torch.py        # base pytorch model
    │   ├─ 2_resnet18_onnx_runtime.py # make onnx & onnxruntime model
    │   ├─ 3_resnet18_onnx.py         # make onnx for TRT
    │   ├─ 4_resnet18_gen_wts.py      # make weight(.wts) for api TRT model 
    │   ├─ 5_resnet18_trt.py          # make TRT model using python tensorrt api
    │   ├─ common.py                  # for 5_resnet18_trt.py
    │   └─ utils.py  
    ├── TensorRT_ONNX/ 
    │   ├─ Engine/                    # engine file & calibration cache table
    │   ├─ TensorRT_ONNX/
    │   │   ├─ calibrator.cpp         # for ptq
    │   │   ├─ calibrator.hpp
    │   │   ├─ logging.hpp
    │   │   ├─ main.cpp               # main code
    │   │   ├─ utils.cpp              # custom util functions
    │   │   └─ utils.hpp
    │   └─ TensorRT_ONNX.sln
    ├── LICENSE
    └── README.md

3. Performance Evaluation

  • Comparison of average execution time over 100 iterations, FPS, and GPU memory usage for a single input image [224,224,3]
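
|                        | Pytorch    | ONNX-RT    | TensorRT   | TensorRT   | TensorRT   |
|------------------------|------------|------------|------------|------------|------------|
| Precision              | FP32       | FP32       | FP32       | FP16       | Int8(PTQ)  |
| Avg Duration time [ms] | 3.68 ms    | 2.52 ms    | 1.32 ms    | 0.56 ms    | 0.41 ms    |
| FPS [frame/sec]        | 271.14 fps | 396.47 fps | 757.00 fps | 1797.6 fps | 2444.9 fps |
| Memory [GB]            | 1.58 GB    | 1.18 GB    | 0.31 GB    | 0.27 GB    | 0.25 GB    |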
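
A measurement like this (average duration over 100 iterations, then FPS) could be taken with a small timing helper along these lines. It is a sketch under assumptions, not the repository's benchmarking code: the `benchmark` function, its `warmup` parameter, and the warm-up count are hypothetical. For GPU inference, the timed callable should also wait for the device to finish (for example by synchronizing the CUDA stream), otherwise only the launch time is measured.

```python
import time
from statistics import mean


def benchmark(fn, iterations=100, warmup=10):
    """Return (average duration in ms, frames per second) for calling fn()."""
    # Warm-up runs are an assumption here: they keep one-time setup
    # costs (allocations, lazy initialization) out of the average.
    for _ in range(warmup):
        fn()

    durations_ms = []
    for _ in range(iterations):
        start = time.perf_counter()
        fn()  # for GPU work, fn must block until the result is ready
        durations_ms.append((time.perf_counter() - start) * 1000.0)

    avg_ms = mean(durations_ms)
    # One frame per call, so FPS is the inverse of the average duration.
    fps = 1000.0 / avg_ms if avg_ms > 0 else float("inf")
    return avg_ms, fps


def fake_inference():
    """Stand-in workload so the sketch runs without a model."""
    time.sleep(0.001)


avg_ms, fps = benchmark(fake_inference, iterations=20, warmup=2)
print(f"avg {avg_ms:.2f} ms, {fps:.1f} fps")
```

Replacing `fake_inference` with a function that runs one forward pass on a [224,224,3] image gives numbers comparable in kind to the table, although absolute values depend on the hardware and software versions listed in section 1.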