TRT8 -> TRT10, severe performance degradation. #3853

Open

Mr-Nineteen opened this issue May 10, 2024 · 3 comments

@Mr-Nineteen commented May 10, 2024

  • Verified model: Search and recommendation model.

  • Inference time for the TRT8 model:

    Percentile   Latency (ms)
    50%          12.860
    60%          13.192
    70%          13.513
    80%          14.229
    90%          15.736
    95%          16.618
    99%          19.204

  • Inference time for the TRT10 model:

    Percentile   Latency (ms)
    50%          37.349
    60%          39.082
    70%          41.100
    80%          43.191
    90%          46.483
    95%          49.919
    99%          56.780
  • Analysis of the main time-consuming part:

```cpp
// Prepare inputs: bind each input tensor's address and set its runtime
// shape on the TensorRT execution context.
for (size_t i = 0; i < inputs.size(); ++i) {
  const TFTensor& input = inputs[i];
  const std::string& input_name = input_names_[i];
  if (!context_->setTensorAddress(input_name.c_str(), input.data())) {
    return tf::errors::Internal(
        carbon::Printf("Failed to `setTensorAddress` for input name [%s]",
                       input_name.c_str()));
  }
  nvinfer1::Dims dims;
  TensorShapeToDims(input.shape(), &dims);
  if (!context_->setInputShape(input_name.c_str(), dims)) {
    return tf::errors::Internal(carbon::Printf(
        "Failed to `setInputShape` for name [%s]", input_name.c_str()));
  }
}
```

It is the `setInputShape` API that accounts for the increased time: with nearly a thousand inputs, it adds tens of milliseconds per inference.
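For reference, a minimal timing sketch of how the per-call cost could be attributed, assuming the same framework-specific names as the snippet above (`inputs`, `input_names_`, `context_`, `TFTensor`, `TensorShapeToDims`) and that `<chrono>` and `<cstdio>` are included; the accumulator variables are hypothetical:

```cpp
// Timing sketch (not the framework's actual code): accumulate the time
// spent in setTensorAddress vs. setInputShape across all inputs, so the
// regression can be pinned on one of the two calls.
double set_address_us = 0.0;
double set_shape_us = 0.0;

for (size_t i = 0; i < inputs.size(); ++i) {
  const TFTensor& input = inputs[i];
  const std::string& input_name = input_names_[i];

  auto t0 = std::chrono::steady_clock::now();
  context_->setTensorAddress(input_name.c_str(), input.data());
  auto t1 = std::chrono::steady_clock::now();

  nvinfer1::Dims dims;
  TensorShapeToDims(input.shape(), &dims);

  auto t2 = std::chrono::steady_clock::now();
  context_->setInputShape(input_name.c_str(), dims);
  auto t3 = std::chrono::steady_clock::now();

  set_address_us += std::chrono::duration<double, std::micro>(t1 - t0).count();
  set_shape_us   += std::chrono::duration<double, std::micro>(t3 - t2).count();
}

std::printf("setTensorAddress: %.1f us total, setInputShape: %.1f us total\n",
            set_address_us, set_shape_us);
```

Splitting the two calls out this way makes it clear whether the regression is concentrated in `setInputShape` itself or shared with the address binding.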

@lix19937

Why not use trtexec for benchmarking?
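For context, a typical trtexec run against a prebuilt engine looks roughly like the following; the engine path and input shape are placeholders:

```
trtexec --loadEngine=model.plan \
        --shapes=input_0:8x128 \
        --warmUp=500 --duration=30 \
        --percentile=99
```

This would give an apples-to-apples latency number independent of the custom framework's binding code.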

@Mr-Nineteen (Author) commented

@lix19937

TRT is integrated into a self-developed framework, which has its own benchmark tests and verification processes.

The issue is caused solely by the upgrade to TRT10.

@zerollzeng (Collaborator) commented

Could you please provide a reproducer? @Mr-Nineteen

zerollzeng self-assigned this on May 12, 2024
zerollzeng added the "triaged" label (Issue has been triaged by maintainers) on May 12, 2024