Description
I have converted a PyTorch model to ONNX with FP16 precision.
Triton Information
24.03
Are you using the Triton container or did you build it yourself?
Container
To Reproduce
I am using Model Analyzer to generate reports for different configs, but it emits the warnings below and then hangs indefinitely.
```
I0517 17:00:33.419397 1 onnxruntime.cc:2965] TRITONBACKEND_ModelInstanceInitialize: bge_reranker_v2_onnx_0_0 (GPU device 0)
I0517 17:00:33.419416 1 onnxruntime.cc:2965] TRITONBACKEND_ModelInstanceInitialize: bge_reranker_v2_onnx_0_1 (GPU device 0)
I0517 17:00:33.419458 1 onnxruntime.cc:2965] TRITONBACKEND_ModelInstanceInitialize: bge_reranker_v2_onnx_0_2 (GPU device 0)
I0517 17:00:33.419473 1 onnxruntime.cc:2965] TRITONBACKEND_ModelInstanceInitialize: bge_reranker_v2_onnx_0_3 (GPU device 0)
I0517 17:00:33.419514 1 onnxruntime.cc:2965] TRITONBACKEND_ModelInstanceInitialize: bge_reranker_v2_onnx_0_4 (GPU device 0)
I0517 17:00:33.419526 1 onnxruntime.cc:2965] TRITONBACKEND_ModelInstanceInitialize: bge_reranker_v2_onnx_0_5 (GPU device 0)
I0517 17:00:33.419580 1 onnxruntime.cc:2965] TRITONBACKEND_ModelInstanceInitialize: bge_reranker_v2_onnx_0_6 (GPU device 0)
I0517 17:00:33.419601 1 onnxruntime.cc:2965] TRITONBACKEND_ModelInstanceInitialize: bge_reranker_v2_onnx_0_7 (GPU device 0)
2024-05-17 17:00:41.008249791 [W:onnxruntime:log, tensorrt_execution_provider.h:83 log] [2024-05-17 17:00:41 WARNING] TensorRT encountered issues when converting weights between types and that could affect accuracy.
2024-05-17 17:00:41.008285569 [W:onnxruntime:log, tensorrt_execution_provider.h:83 log] [2024-05-17 17:00:41 WARNING] If this is not the desired behavior, please modify the weights or retrain with regularization to adjust the magnitude of the weights.
2024-05-17 17:00:41.008290949 [W:onnxruntime:log, tensorrt_execution_provider.h:83 log] [2024-05-17 17:00:41 WARNING] Check verbose logs for the list of affected weights.
2024-05-17 17:00:41.008297602 [W:onnxruntime:log, tensorrt_execution_provider.h:83 log] [2024-05-17 17:00:41 WARNING] - 256 weights are affected by this issue: Detected subnormal FP16 values.
2024-05-17 17:00:41.008343438 [W:onnxruntime:log, tensorrt_execution_provider.h:83 log] [2024-05-17 17:00:41 WARNING] - 1 weights are affected by this issue: Detected finite FP32 values which would overflow in FP16 and converted them to the closest finite FP16 value.
2024-05-17 17:00:42.421015291 [W:onnxruntime:log, tensorrt_execution_provider.h:83 log] [2024-05-17 17:00:42 WARNING] onnx2trt_utils.cpp:374: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
2024-05-17 17:00:43.797671162 [W:onnxruntime:log, tensorrt_execution_provider.h:83 log] [2024-05-17 17:00:43 WARNING] onnx2trt_utils.cpp:374: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
2024-05-17 17:03:19.817791037 [W:onnxruntime:log, tensorrt_execution_provider.h:83 log] [2024-05-17 17:03:19 WARNING] TensorRT encountered issues when converting weights between types and that could affect accuracy.
2024-05-17 17:03:19.817834209 [W:onnxruntime:log, tensorrt_execution_provider.h:83 log] [2024-05-17 17:03:19 WARNING] If this is not the desired behavior, please modify the weights or retrain with regularization to adjust the magnitude of the weights.
2024-05-17 17:03:19.817839830 [W:onnxruntime:log, tensorrt_execution_provider.h:83 log] [2024-05-17 17:03:19 WARNING] Check verbose logs for the list of affected weights.
2024-05-17 17:03:19.817845841 [W:onnxruntime:log, tensorrt_execution_provider.h:83 log] [2024-05-17 17:03:19 WARNING] - 256 weights are affected by this issue: Detected subnormal FP16 values.
2024-05-17 17:03:19.817889874 [W:onnxruntime:log, tensorrt_execution_provider.h:83 log] [2024-05-17 17:03:19 WARNING] - 1 weights are affected by this issue: Detected finite FP32 values which would overflow in FP16 and converted them to the closest finite FP16 value.
2024-05-17 17:03:21.216239987 [W:onnxruntime:log, tensorrt_execution_provider.h:83 log] [2024-05-17 17:03:21 WARNING] onnx2trt_utils.cpp:374: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
2024-05-17 17:03:22.564165174 [W:onnxruntime:log, tensorrt_execution_provider.h:83 log] [2024-05-17 17:03:22 WARNING] onnx2trt_utils.cpp:374: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
2024-05-17 17:06:00.961948435 [W:onnxruntime:log, tensorrt_execution_provider.h:83 log] [2024-05-17 17:06:00 WARNING] TensorRT encountered issues when converting weights between types and that could affect accuracy.
2024-05-17 17:06:00.961992879 [W:onnxruntime:log, tensorrt_execution_provider.h:83 log] [2024-05-17 17:06:00 WARNING] If this is not the desired behavior, please modify the weights or retrain with regularization to adjust the magnitude of the weights.
2024-05-17 17:06:00.961998901 [W:onnxruntime:log, tensorrt_execution_provider.h:83 log] [2024-05-17 17:06:00 WARNING] Check verbose logs for the list of affected weights.
2024-05-17 17:06:00.962005373 [W:onnxruntime:log, tensorrt_execution_provider.h:83 log] [2024-05-17 17:06:00 WARNING] - 256 weights are affected by this issue: Detected subnormal FP16 values.
2024-05-17 17:06:00.962053173 [W:onnxruntime:log, tensorrt_execution_provider.h:83 log] [2024-05-17 17:06:00 WARNING] - 1 weights are affected by this issue: Detected finite FP32 values which would overflow in FP16 and converted them to the closest finite FP16 value.
2024-05-17 17:06:02.351065690 [W:onnxruntime:log, tensorrt_execution_provider.h:83 log] [2024-05-17 17:06:02 WARNING] onnx2trt_utils.cpp:374: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
2024-05-17 17:06:03.729500906 [W:onnxruntime:log, tensorrt_execution_provider.h:83 log] [2024-05-17 17:06:03 WARNING] onnx2trt_utils.cpp:374: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
2024-05-17 17:08:41.788991283 [W:onnxruntime:log, tensorrt_execution_provider.h:83 log] [2024-05-17 17:08:41 WARNING] TensorRT encountered issues when converting weights between types and that could affect accuracy.
2024-05-17 17:08:41.789050785 [W:onnxruntime:log, tensorrt_execution_provider.h:83 log] [2024-05-17 17:08:41 WARNING] If this is not the desired behavior, please modify the weights or retrain with regularization to adjust the magnitude of the weights.
2024-05-17 17:08:41.789056105 [W:onnxruntime:log, tensorrt_execution_provider.h:83 log] [2024-05-17 17:08:41 WARNING] Check verbose logs for the list of affected weights.
2024-05-17 17:08:41.789062357 [W:onnxruntime:log, tensorrt_execution_provider.h:83 log] [2024-05-17 17:08:41 WARNING] - 256 weights are affected by this issue: Detected subnormal FP16 values.
2024-05-17 17:08:41.789107623 [W:onnxruntime:log, tensorrt_execution_provider.h:83 log] [2024-05-17 17:08:41 WARNING] - 1 weights are affected by this issue: Detected finite FP32 values which would overflow in FP16 and converted them to the closest finite FP16 value.
2024-05-17 17:08:43.198704153 [W:onnxruntime:log, tensorrt_execution_provider.h:83 log] [2024-05-17 17:08:43 WARNING] onnx2trt_utils.cpp:374: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
2024-05-17 17:08:44.570473931 [W:onnxruntime:log, tensorrt_execution_provider.h:83 log] [2024-05-17 17:08:44 WARNING] onnx2trt_utils.cpp:374: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
2024-05-17 17:11:28.251322741 [W:onnxruntime:log, tensorrt_execution_provider.h:83 log] [2024-05-17 17:11:28 WARNING] TensorRT encountered issues when converting weights between types and that could affect accuracy.
2024-05-17 17:11:28.251357717 [W:onnxruntime:log, tensorrt_execution_provider.h:83 log] [2024-05-17 17:11:28 WARNING] If this is not the desired behavior, please modify the weights or retrain with regularization to adjust the magnitude of the weights.
2024-05-17 17:11:28.251363007 [W:onnxruntime:log, tensorrt_execution_provider.h:83 log] [2024-05-17 17:11:28 WARNING] Check verbose logs for the list of affected weights.
2024-05-17 17:11:28.251369129 [W:onnxruntime:log, tensorrt_execution_provider.h:83 log] [2024-05-17 17:11:28 WARNING] - 256 weights are affected by this issue: Detected subnormal FP16 values.
2024-05-17 17:11:28.251412792 [W:onnxruntime:log, tensorrt_execution_provider.h:83 log] [2024-05-17 17:11:28 WARNING] - 1 weights are affected by this issue: Detected finite FP32 values which would overflow in FP16 and converted them to the closest finite FP16 value.
2024-05-17 17:11:29.643603875 [W:onnxruntime:log, tensorrt_execution_provider.h:83 log] [2024-05-17 17:11:29 WARNING] onnx2trt_utils.cpp:374: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
2024-05-17 17:11:31.028186788 [W:onnxruntime:log, tensorrt_execution_provider.h:83 log] [2024-05-17 17:11:31 WARNING] onnx2trt_utils.cpp:374: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
```
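For context, the first two classes of warnings ("subnormal FP16 values" and "finite FP32 values which would overflow in FP16") describe ordinary properties of FP32-to-FP16 conversion. The sketch below uses made-up values, not weights from this model, to show both effects with NumPy:

```python
# Illustration (with made-up values) of the two FP16 conversion issues the
# TensorRT warnings describe: FP32 values below the smallest normal FP16
# value (~6.1e-5) become subnormal in FP16, and finite FP32 values above the
# largest finite FP16 value (~65504) overflow unless explicitly clamped.
import numpy as np

fp16 = np.finfo(np.float16)

# Case 1: a tiny FP32 weight becomes subnormal when cast to FP16.
tiny = np.float32(1e-6)                      # below fp16.tiny (~6.1e-5)
as_fp16 = np.float16(tiny)
is_subnormal = 0.0 < float(as_fp16) < float(fp16.tiny)

# Case 2: a finite FP32 weight overflows FP16; TensorRT clamps it to the
# closest finite FP16 value instead of letting it become infinity.
big = np.float32(1e5)                        # finite in FP32, > fp16.max
overflowed = np.float16(big)                 # becomes inf without clamping
clamped = np.float16(np.clip(big, -fp16.max, fp16.max))
```

Here `is_subnormal` is True, `overflowed` is infinite, and `clamped` equals the largest finite FP16 value (65504), which is what the "converted them to the closest finite FP16 value" message refers to. These warnings affect accuracy, not liveness, so they are likely unrelated to the hang itself.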
The model is just an embedding model from Hugging Face.