trt accelerator #7238

Open
riyajatar37003 opened this issue May 17, 2024 · 0 comments

Description
I have converted a PyTorch model to ONNX with FP16 precision.
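As background for the warnings below: the "subnormal FP16 values" and "finite FP32 values which would overflow in FP16" messages are inherent to casting FP32 weights down to FP16, because FP16 has a much narrower range (max ≈ 65504, smallest normal ≈ 2⁻¹⁴). A minimal NumPy sketch with made-up weight values illustrates both cases (note that NumPy's cast of an overflowing value yields `inf`, whereas TensorRT reports clamping it to the closest finite FP16 value):

```python
import numpy as np

# Illustrative only: the kinds of FP32 weight values TensorRT warns
# about when converting to FP16 (the values here are hypothetical).
w_fp32 = np.array(
    [1.0e5,    # above 65504, the FP16 max -> overflows on cast
     1.0e-6,   # below 2**-14, the smallest normal FP16 -> subnormal
     3.0],     # comfortably representable
    dtype=np.float32,
)

w_fp16 = w_fp32.astype(np.float16)

print(np.isinf(w_fp16[0]))           # NumPy casts the overflow to inf
print(0 < w_fp16[1] < 2.0 ** -14)    # stored as a subnormal (precision loss)
print(w_fp16[2])                     # unaffected
```

Subnormals lose precision gradually rather than flushing to zero, which is why TensorRT flags them as a potential accuracy issue rather than an error.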
Triton Information
24.03

Are you using the Triton container or did you build it yourself?
container
To Reproduce
I am using model-analyzer to generate reports for different configs, but it prints the warnings below and then hangs forever.
```
I0517 17:00:33.419397 1 onnxruntime.cc:2965] TRITONBACKEND_ModelInstanceInitialize: bge_reranker_v2_onnx_0_0 (GPU device 0)
I0517 17:00:33.419416 1 onnxruntime.cc:2965] TRITONBACKEND_ModelInstanceInitialize: bge_reranker_v2_onnx_0_1 (GPU device 0)
I0517 17:00:33.419458 1 onnxruntime.cc:2965] TRITONBACKEND_ModelInstanceInitialize: bge_reranker_v2_onnx_0_2 (GPU device 0)
I0517 17:00:33.419473 1 onnxruntime.cc:2965] TRITONBACKEND_ModelInstanceInitialize: bge_reranker_v2_onnx_0_3 (GPU device 0)
I0517 17:00:33.419514 1 onnxruntime.cc:2965] TRITONBACKEND_ModelInstanceInitialize: bge_reranker_v2_onnx_0_4 (GPU device 0)
I0517 17:00:33.419526 1 onnxruntime.cc:2965] TRITONBACKEND_ModelInstanceInitialize: bge_reranker_v2_onnx_0_5 (GPU device 0)
I0517 17:00:33.419580 1 onnxruntime.cc:2965] TRITONBACKEND_ModelInstanceInitialize: bge_reranker_v2_onnx_0_6 (GPU device 0)
I0517 17:00:33.419601 1 onnxruntime.cc:2965] TRITONBACKEND_ModelInstanceInitialize: bge_reranker_v2_onnx_0_7 (GPU device 0)
2024-05-17 17:00:41.008249791 [W:onnxruntime:log, tensorrt_execution_provider.h:83 log] [2024-05-17 17:00:41 WARNING] TensorRT encountered issues when converting weights between types and that could affect accuracy.
2024-05-17 17:00:41.008285569 [W:onnxruntime:log, tensorrt_execution_provider.h:83 log] [2024-05-17 17:00:41 WARNING] If this is not the desired behavior, please modify the weights or retrain with regularization to adjust the magnitude of the weights.
2024-05-17 17:00:41.008290949 [W:onnxruntime:log, tensorrt_execution_provider.h:83 log] [2024-05-17 17:00:41 WARNING] Check verbose logs for the list of affected weights.
2024-05-17 17:00:41.008297602 [W:onnxruntime:log, tensorrt_execution_provider.h:83 log] [2024-05-17 17:00:41 WARNING] - 256 weights are affected by this issue: Detected subnormal FP16 values.
2024-05-17 17:00:41.008343438 [W:onnxruntime:log, tensorrt_execution_provider.h:83 log] [2024-05-17 17:00:41 WARNING] - 1 weights are affected by this issue: Detected finite FP32 values which would overflow in FP16 and converted them to the closest finite FP16 value.
2024-05-17 17:00:42.421015291 [W:onnxruntime:log, tensorrt_execution_provider.h:83 log] [2024-05-17 17:00:42 WARNING] onnx2trt_utils.cpp:374: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
2024-05-17 17:00:43.797671162 [W:onnxruntime:log, tensorrt_execution_provider.h:83 log] [2024-05-17 17:00:43 WARNING] onnx2trt_utils.cpp:374: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
```

The same warning cycle (weight conversion, 256 subnormal FP16 weights, 1 FP32 overflow weight, two INT64-to-INT32 casts) repeats at 17:03:19, 17:06:00, 17:08:41, and 17:11:28, roughly once every 2–3 minutes, and initialization never completes.

The model is just an embedding model from Hugging Face.
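For reference, this is the shape of the `config.pbtxt` section that enables the TensorRT accelerator for an ONNX model in Triton. The sketch below is an assumption about the reporter's setup (the cache path and workspace size are placeholders); it also enables the ONNX Runtime TensorRT engine-cache options, which may matter here because model-analyzer reloads the model for each configuration it sweeps, and without a cache the TensorRT engine is rebuilt from scratch every time, one build per model instance:

```
optimization {
  execution_accelerators {
    gpu_execution_accelerator : [
      {
        name : "tensorrt"
        parameters { key: "precision_mode" value: "FP16" }
        parameters { key: "max_workspace_size_bytes" value: "1073741824" }
        parameters { key: "trt_engine_cache_enable" value: "true" }
        parameters { key: "trt_engine_cache_path" value: "/tmp/trt_cache" }
      }
    ]
  }
}
```

With 8 instances on one GPU, eight engine builds per swept configuration could plausibly look like a hang; caching the engine, or reducing the instance count while profiling, would help distinguish a slow build from a true deadlock.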
