The accuracy of the auto-compressed RT-DETR model provided in the documentation is very low #1862

Closed

bittergourd1224 opened this issue Mar 22, 2024 · 5 comments

@bittergourd1224 commented Mar 22, 2024

Environment:
paddledet 2.6.0
paddlepaddle-gpu 2.4.2
paddleslim 2.6.0

Steps to reproduce:
I used the pre-compressed (auto-compression) RT-DETR-R50 provided in the documentation, downloaded and extracted it:

| RT-DETR-R50 | 53.1 | 52.9 | 53.0 | 32.05ms | 9.12ms | **6.96ms** | [config](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/example/auto_compression/detection/configs/rtdetr_r50vd_qat_dis.yaml) | [Model](https://bj.bcebos.com/v1/paddle-slim-models/act/rtdetr_r50vd_6x_coco_quant.tar) |

Then I ran inference on a single image with GPU in fp32 mode, with this exact command:
python3 paddle_inference_eval.py --model_path=output/rtdetr_r50vd_6x_coco_quant --reader_config=configs/rtdetr_reader.yml --image_file=000000144941.jpg --device=GPU --precision=fp32
The inference output is:

W0322 15:17:07.880046 23290 analysis_predictor.cc:1395] The one-time configuration of analysis predictor failed, which may be due to native predictor called first and its configurations taken effect.
--- Running analysis [ir_graph_build_pass]
--- Running analysis [ir_analysis_pass]
--- Running IR pass [is_test_pass]
--- Running IR pass [simplify_with_basic_ops_pass]
--- Running IR pass [conv_bn_fuse_pass]
--- Running IR pass [conv_eltwiseadd_bn_fuse_pass]
--- Running IR pass [embedding_eltwise_layernorm_fuse_pass]
--- Running IR pass [multihead_matmul_fuse_pass_v2]
--- Running IR pass [fused_multi_transformer_encoder_pass]
--- Running IR pass [fused_multi_transformer_decoder_pass]
--- Running IR pass [fused_multi_transformer_encoder_fuse_qkv_pass]
--- Running IR pass [fused_multi_transformer_decoder_fuse_qkv_pass]
--- Running IR pass [multi_devices_fused_multi_transformer_encoder_fuse_qkv_pass]
--- Running IR pass [multi_devices_fused_multi_transformer_decoder_fuse_qkv_pass]
--- Running IR pass [fuse_multi_transformer_layer_pass]
--- Running IR pass [gpu_cpu_squeeze2_matmul_fuse_pass]
--- Running IR pass [gpu_cpu_reshape2_matmul_fuse_pass]
--- Running IR pass [gpu_cpu_flatten2_matmul_fuse_pass]
--- Running IR pass [gpu_cpu_map_matmul_v2_to_mul_pass]
--- Running IR pass [gpu_cpu_map_matmul_v2_to_matmul_pass]
W0322 15:17:12.930387 23290 gpu_cpu_map_matmul_to_mul_pass.cc:425] matmul op not support broadcast, please check inputs'shape. 
[... the same warning is repeated roughly 100 more times with different timestamps ...]
I0322 15:17:12.961534 23290 fuse_pass_base.cc:59] ---  detected 14 subgraphs
--- Running IR pass [matmul_scale_fuse_pass]
--- Running IR pass [multihead_matmul_fuse_pass_v3]
--- Running IR pass [gpu_cpu_map_matmul_to_mul_pass]
--- Running IR pass [fc_fuse_pass]
--- Running IR pass [fc_elementwise_layernorm_fuse_pass]
--- Running IR pass [conv_elementwise_add_act_fuse_pass]
--- Running IR pass [conv_elementwise_add2_act_fuse_pass]
--- Running IR pass [conv_elementwise_add_fuse_pass]
--- Running IR pass [transpose_flatten_concat_fuse_pass]
--- Running IR pass [constant_folding_pass]
--- Running IR pass [auto_mixed_precision_pass]
--- Running IR pass [runtime_context_cache_pass]
--- Running analysis [ir_params_sync_among_devices_pass]
I0322 15:17:36.090184 23290 ir_params_sync_among_devices_pass.cc:89] Sync params from CPU to GPU
--- Running analysis [adjust_cudnn_workspace_size_pass]
--- Running analysis [inference_op_replace_pass]
--- Running analysis [memory_optimize_pass]
I0322 15:17:36.764639 23290 memory_optimize_pass.cc:219] Cluster name : fill_constant_213.tmp_0  size: 4
I0322 15:17:36.764690 23290 memory_optimize_pass.cc:219] Cluster name : tmp_143  size: 4
I0322 15:17:36.764705 23290 memory_optimize_pass.cc:219] Cluster name : conv2d_118.tmp_0.quantized  size: 26214400
I0322 15:17:36.764717 23290 memory_optimize_pass.cc:219] Cluster name : batch_norm_6.tmp_2  size: 26214400
I0322 15:17:36.764729 23290 memory_optimize_pass.cc:219] Cluster name : fill_constant_325.tmp_0  size: 4
I0322 15:17:36.764741 23290 memory_optimize_pass.cc:219] Cluster name : fill_constant_49.tmp_0  size: 4
I0322 15:17:36.764753 23290 memory_optimize_pass.cc:219] Cluster name : elementwise_add_0  size: 26214400
I0322 15:17:36.764765 23290 memory_optimize_pass.cc:219] Cluster name : flatten_33.tmp_0.quantized.dequantized  size: 9600
I0322 15:17:36.764777 23290 memory_optimize_pass.cc:219] Cluster name : conv2d_115.tmp_0.quantized.dequantized  size: 26214400
I0322 15:17:36.764789 23290 memory_optimize_pass.cc:219] Cluster name : relu_5.tmp_0.quantized.dequantized  size: 26214400
I0322 15:17:36.764801 23290 memory_optimize_pass.cc:219] Cluster name : concat_4.tmp_0  size: 8601600
I0322 15:17:36.764812 23290 memory_optimize_pass.cc:219] Cluster name : image  size: 4915200
I0322 15:17:36.764824 23290 memory_optimize_pass.cc:219] Cluster name : transpose_10.tmp_0  size: 307200
I0322 15:17:36.764837 23290 memory_optimize_pass.cc:219] Cluster name : scale_factor  size: 8
I0322 15:17:36.764847 23290 memory_optimize_pass.cc:219] Cluster name : elementwise_add_18  size: 8601600
I0322 15:17:36.764858 23290 memory_optimize_pass.cc:219] Cluster name : cast_2.tmp_0  size: 2150400
I0322 15:17:36.764876 23290 memory_optimize_pass.cc:219] Cluster name : sigmoid_28.tmp_0.quantized.dequantized  size: 4800
I0322 15:17:36.764886 23290 memory_optimize_pass.cc:219] Cluster name : conv2d_118.tmp_0.quantized.dequantized  size: 26214400
I0322 15:17:36.764902 23290 memory_optimize_pass.cc:219] Cluster name : im_shape  size: 8
--- Running analysis [ir_graph_to_program_pass]
I0322 15:17:37.901614 23290 analysis_predictor.cc:1318] ======= optimize end =======
I0322 15:17:37.951170 23290 naive_executor.cc:110] ---  skip [feed], feed -> scale_factor
I0322 15:17:37.951238 23290 naive_executor.cc:110] ---  skip [feed], feed -> image
I0322 15:17:37.951278 23290 naive_executor.cc:110] ---  skip [feed], feed -> im_shape
I0322 15:17:37.998129 23290 naive_executor.cc:110] ---  skip [save_infer_model/scale_0.tmp_0], fetch -> fetch
I0322 15:17:37.998183 23290 naive_executor.cc:110] ---  skip [save_infer_model/scale_1.tmp_0], fetch -> fetch
W0322 15:17:38.045212 23290 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 7.5, Driver API Version: 11.2, Runtime API Version: 10.2
W0322 15:17:38.052327 23290 gpu_resources.cc:91] device: 0, cuDNN Version: 7.6.
[Benchmark]Inference time(ms): min=90.2, max=90.2, avg=90.2
bicycle: 0.732
bicycle: 0.709

The test image is 000000144941.jpg, and the detection result clearly does not match the image.

I also tried batch evaluation on a small COCO subset, tiny_coco_dataset. The command:
python3 paddle_inference_eval.py --model_path=output/rtdetr_r50vd_6x_coco_quant --reader_config=configs/rtdetr_reader.yml --device=GPU --precision=fp32 --benchmark=True
The result:

[... same predictor-initialization and graph-optimization log as in the single-image run above, including the repeated "matmul op not support broadcast" warnings and the identical memory_optimize_pass cluster list ...]
loading annotations into memory...
Done (t=0.01s)
creating index...
index created!
[03/22 15:44:10] ppdet.data.source.coco INFO: Load [48 samples valid, 2 samples invalid] in file /home/foia_xlc/dataset/tiny_coco_dataset/tiny_coco/annotations/instances_val2017.json.
Evaluating:   0%|          | 0/48 [00:00<?, ?it/s]
W0322 15:44:10.055075 27379 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 7.5, Driver API Version: 11.2, Runtime API Version: 10.2
W0322 15:44:10.060184 27379 gpu_resources.cc:91] device: 0, cuDNN Version: 7.6.
Evaluating: 100%|██████████| 48/48 [00:06<00:00,  7.55it/s]
[03/22 15:44:16] ppdet.metrics.metrics INFO: The bbox result is saved to bbox.json.
loading annotations into memory...
Done (t=0.01s)
creating index...
index created!
[03/22 15:44:16] ppdet.metrics.coco_utils INFO: Start evaluate...
Loading and preparing results...
DONE (t=0.21s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=0.27s).
Accumulating evaluation results...
DONE (t=0.37s).
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.000
[Benchmark]Inference time(ms): min=71.54, max=2036.7, avg=121.4
[Benchmark] COCO mAP: 0.0

The mAP is 0.

What could be the cause? Thanks!

@wanghaoshuang
Collaborator

Thanks for your interest; we will ask the relevant colleagues to take a look.

@xiaoluomi
Collaborator

Hi, since this model is a quantized model, running it in fp32 mode gives wrong accuracy, because quantization ops have already been inserted into the graph. The command should be changed to:

python3 paddle_inference_eval.py --model_path=output/rtdetr_r50vd_6x_coco_quant --reader_config=configs/rtdetr_reader.yml --device=GPU --use_trt=True --precision=int8 --benchmark=True

If you need to run the float-precision model, you can download it from:
https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.7/configs/rtdetr
That page provides the RT-DETR float model downloads and a tutorial for exporting them to static-graph inference models. The float models can be run with --precision=fp16 or --precision=fp32.
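For reference, here is a minimal sketch of what --use_trt=True --precision=int8 corresponds to in the Paddle Inference Python API (the model paths and buffer sizes below are assumptions for illustration, not values taken from paddle_inference_eval.py):

```python
import paddle.inference as paddle_infer

# Paths are assumptions; point them at the extracted quantized model files.
config = paddle_infer.Config(
    "output/rtdetr_r50vd_6x_coco_quant/model.pdmodel",
    "output/rtdetr_r50vd_6x_coco_quant/model.pdiparams")
config.enable_use_gpu(256, 0)  # 256 MB initial GPU memory pool, device id 0

# The quantized graph already contains quant/dequant ops, so it must be
# executed as int8 through Paddle-TRT; running it natively in fp32 gives
# wrong results, as observed above.
config.enable_tensorrt_engine(
    workspace_size=1 << 30,
    max_batch_size=1,
    min_subgraph_size=3,
    precision_mode=paddle_infer.PrecisionType.Int8,
    use_static=False,
    use_calib_mode=False)  # scales come from the QAT model, not calibration

predictor = paddle_infer.create_predictor(config)
```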

@xiaoluomi
Collaborator

Also, the RT-DETR model may require a newer version of paddlepaddle-gpu.
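A quick way to confirm which Paddle version is installed and that the GPU build itself works (a small sketch using the public paddle.utils.run_check helper):

```python
import paddle

print(paddle.__version__)  # the issue above was originally reproduced with 2.4.2
paddle.utils.run_check()   # reports whether PaddlePaddle can actually use the GPU
```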

@bittergourd1224
Author

@xiaoluomi I set up a fresh environment on AI Studio, with paddlepaddle-gpu 2.6.0.
Using the command you gave, but with --use_trt=True removed, it fails with an error:

Warning: Unable to use numba in PP-Tracking, please install numba, for example(python3.7): `pip install numba==0.56.4`
--- Running analysis [ir_graph_build_pass]
I0403 16:07:40.582815 52182 executor.cc:187] Old Executor is Running.
--- Running analysis [ir_analysis_pass]
--- Running IR pass [map_op_to_another_pass]
I0403 16:07:40.819634 52182 fuse_pass_base.cc:59] ---  detected 47 subgraphs
--- Running IR pass [is_test_pass]
--- Running IR pass [simplify_with_basic_ops_pass]
--- Running IR pass [delete_quant_dequant_linear_op_pass]
I0403 16:08:01.944480 52182 fuse_pass_base.cc:59] ---  detected 1106 subgraphs
--- Running IR pass [delete_weight_dequant_linear_op_pass]
--- Running IR pass [constant_folding_pass]
I0403 16:08:04.266005 52182 fuse_pass_base.cc:59] ---  detected 229 subgraphs
--- Running IR pass [silu_fuse_pass]
--- Running IR pass [conv_bn_fuse_pass]
I0403 16:08:04.597553 52182 fuse_pass_base.cc:59] ---  detected 73 subgraphs
--- Running IR pass [conv_eltwiseadd_bn_fuse_pass]
--- Running IR pass [embedding_eltwise_layernorm_fuse_pass]
--- Running IR pass [multihead_matmul_fuse_pass_v2]
--- Running IR pass [vit_attention_fuse_pass]
--- Running IR pass [fused_multi_transformer_encoder_pass]
--- Running IR pass [fused_multi_transformer_decoder_pass]
--- Running IR pass [fused_multi_transformer_encoder_fuse_qkv_pass]
--- Running IR pass [fused_multi_transformer_decoder_fuse_qkv_pass]
--- Running IR pass [multi_devices_fused_multi_transformer_encoder_pass]
--- Running IR pass [multi_devices_fused_multi_transformer_encoder_fuse_qkv_pass]
--- Running IR pass [multi_devices_fused_multi_transformer_decoder_fuse_qkv_pass]
--- Running IR pass [fuse_multi_transformer_layer_pass]
--- Running IR pass [gpu_cpu_squeeze2_matmul_fuse_pass]
--- Running IR pass [gpu_cpu_reshape2_matmul_fuse_pass]
--- Running IR pass [gpu_cpu_flatten2_matmul_fuse_pass]
--- Running IR pass [gpu_cpu_map_matmul_v2_to_mul_pass]
I0403 16:08:08.044272 52182 fuse_pass_base.cc:59] ---  detected 92 subgraphs
--- Running IR pass [gpu_cpu_map_matmul_v2_to_matmul_pass]
I0403 16:08:08.068631 52182 fuse_pass_base.cc:59] ---  detected 14 subgraphs
--- Running IR pass [matmul_scale_fuse_pass]
I0403 16:08:08.107571 52182 fuse_pass_base.cc:59] ---  detected 7 subgraphs
--- Running IR pass [multihead_matmul_fuse_pass_v3]
--- Running IR pass [gpu_cpu_map_matmul_to_mul_pass]
--- Running IR pass [fc_fuse_pass]
I0403 16:08:08.607133 52182 fuse_pass_base.cc:59] ---  detected 92 subgraphs
--- Running IR pass [fc_elementwise_layernorm_fuse_pass]
I0403 16:08:08.677088 52182 fuse_pass_base.cc:59] ---  detected 20 subgraphs
--- Running IR pass [conv_elementwise_add_act_fuse_pass]
I0403 16:08:08.764535 52182 fuse_pass_base.cc:59] ---  detected 35 subgraphs
--- Running IR pass [conv_elementwise_add2_act_fuse_pass]
I0403 16:08:08.801440 52182 fuse_pass_base.cc:59] ---  detected 4 subgraphs
--- Running IR pass [conv_elementwise_add_fuse_pass]
I0403 16:08:08.854338 52182 fuse_pass_base.cc:59] ---  detected 46 subgraphs
--- Running IR pass [transpose_flatten_concat_fuse_pass]
--- Running IR pass [fused_conv2d_add_act_layout_transfer_pass]
--- Running IR pass [transfer_layout_elim_pass]
I0403 16:08:08.963197 52182 transfer_layout_elim_pass.cc:346] move down 0 transfer_layout
I0403 16:08:08.963239 52182 transfer_layout_elim_pass.cc:347] eliminate 0 pair of transfer_layout
--- Running IR pass [auto_mixed_precision_pass]
--- Running IR pass [identity_op_clean_pass]
I0403 16:08:09.042109 52182 fuse_pass_base.cc:59] ---  detected 2 subgraphs
--- Running IR pass [inplace_op_var_pass]
I0403 16:08:09.078701 52182 fuse_pass_base.cc:59] ---  detected 146 subgraphs
--- Running analysis [save_optimized_model_pass]
--- Running analysis [ir_params_sync_among_devices_pass]
I0403 16:08:09.089999 52182 ir_params_sync_among_devices_pass.cc:53] Sync params from CPU to GPU
--- Running analysis [adjust_cudnn_workspace_size_pass]
--- Running analysis [inference_op_replace_pass]
--- Running analysis [memory_optimize_pass]
I0403 16:08:11.687259 52182 memory_optimize_pass.cc:118] The persistable params in main graph are : 160.793MB
I0403 16:08:11.737028 52182 memory_optimize_pass.cc:246] Cluster name : relu_11.tmp_0  size: 26214400
I0403 16:08:11.737099 52182 memory_optimize_pass.cc:246] Cluster name : tmp_4  size: 409600
I0403 16:08:11.737114 52182 memory_optimize_pass.cc:246] Cluster name : scale_factor  size: 8
I0403 16:08:11.737118 52182 memory_optimize_pass.cc:246] Cluster name : relu_5.tmp_0  size: 26214400
I0403 16:08:11.737123 52182 memory_optimize_pass.cc:246] Cluster name : elementwise_add_18  size: 8601600
I0403 16:08:11.737144 52182 memory_optimize_pass.cc:246] Cluster name : elementwise_add_17  size: 8601600
I0403 16:08:11.737152 52182 memory_optimize_pass.cc:246] Cluster name : image  size: 4915200
I0403 16:08:11.737159 52182 memory_optimize_pass.cc:246] Cluster name : shape_21.tmp_0_slice_0  size: 4
I0403 16:08:11.737167 52182 memory_optimize_pass.cc:246] Cluster name : tmp_11  size: 1638400
I0403 16:08:11.737174 52182 memory_optimize_pass.cc:246] Cluster name : im_shape  size: 8
I0403 16:08:11.737181 52182 memory_optimize_pass.cc:246] Cluster name : layer_norm_23.tmp_2  size: 307200
I0403 16:08:11.737188 52182 memory_optimize_pass.cc:246] Cluster name : transpose_14.tmp_0  size: 76800
I0403 16:08:11.737195 52182 memory_optimize_pass.cc:246] Cluster name : softmax_10.tmp_0  size: 115200
I0403 16:08:11.737202 52182 memory_optimize_pass.cc:246] Cluster name : elementwise_add_2  size: 26214400
I0403 16:08:11.737210 52182 memory_optimize_pass.cc:246] Cluster name : sigmoid_28.tmp_0  size: 4800
--- Running analysis [ir_graph_to_program_pass]
I0403 16:08:12.254211 52182 analysis_predictor.cc:1838] ======= optimize end =======
I0403 16:08:12.310688 52182 naive_executor.cc:200] ---  skip [feed], feed -> scale_factor
I0403 16:08:12.310745 52182 naive_executor.cc:200] ---  skip [feed], feed -> image
I0403 16:08:12.310756 52182 naive_executor.cc:200] ---  skip [feed], feed -> im_shape
I0403 16:08:12.319670 52182 naive_executor.cc:200] ---  skip [save_infer_model/scale_0.tmp_0], fetch -> fetch
I0403 16:08:12.319722 52182 naive_executor.cc:200] ---  skip [save_infer_model/scale_1.tmp_0], fetch -> fetch
loading annotations into memory...
Done (t=0.01s)
creating index...
index created!
[04/03 16:08:12] ppdet.data.source.coco INFO: Load [48 samples valid, 2 samples invalid] in file /home/aistudio/tiny_coco_dataset/tiny_coco/annotations/instances_val2017.json.
W0403 16:08:12.394639 52182 gpu_resources.cc:119] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 12.0, Runtime API Version: 11.8
W0403 16:08:12.395746 52182 gpu_resources.cc:164] device: 0, cuDNN Version: 8.9.
Traceback (most recent call last):
  File "/home/aistudio/PaddleSlim/example/auto_compression/detection/paddle_inference_eval.py", line 452, in <module>
    main()
  File "/home/aistudio/PaddleSlim/example/auto_compression/detection/paddle_inference_eval.py", line 436, in main
    eval(predictor, val_loader, metric, rerun_flag=rerun_flag)
  File "/home/aistudio/PaddleSlim/example/auto_compression/detection/paddle_inference_eval.py", line 367, in eval
    predictor.run()
ValueError: (InvalidArgument) The type of data we are trying to retrieve (float32) does not match the type of data (int8) currently contained in the container.
  [Hint: Expected dtype() == phi::CppTypeToDataType<T>::Type(), but received dtype():3 != phi::CppTypeToDataType<T>::Type():10.] (at /paddle/paddle/phi/core/dense_tensor.cc:171)
  [operator < fused_fc_elementwise_layernorm > error]

Does this mean the GPU cannot run in int8 mode?
Is there any way to run the quantized model successfully without TensorRT?

@xiaoluomi
Collaborator

xiaoluomi commented Apr 3, 2024

Running a quantized model on the native GPU backend, without enabling Paddle-TRT, is not supported by current versions of Paddle; the error here is an operator data-type mismatch. Paddle's native GPU inference does not yet support quantized models, so you need to pass --use_trt=True to run int8 inference of the quantized model through Paddle-TRT.
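To verify that a deployed model really is the quantized one (and therefore needs Paddle-TRT int8), you can scan the saved inference program for quant/dequant ops. A small sketch, assuming the model path prefix below matches the extracted files:

```python
import paddle

paddle.enable_static()
exe = paddle.static.Executor(paddle.CPUPlace())

# Path prefix is an assumption; adjust it to where the .tar was extracted.
program, _, _ = paddle.static.load_inference_model(
    "output/rtdetr_r50vd_6x_coco_quant/model", exe)

quant_ops = [op.type for block in program.blocks for op in block.ops
             if "quantize" in op.type]
print(f"found {len(quant_ops)} quant/dequant ops")
# A non-empty result means the graph is quantized, and running it through
# the native GPU fp32 path will produce wrong outputs, as seen above.
```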
