
Triton Server OpenVINO backend not working with Tensorflow saved models #7200

Open · atobiszei opened this issue May 9, 2024 · 5 comments
Labels: bug (Something isn't working)

Description
Triton is unable to load models in the TensorFlow SavedModel format with the OpenVINO backend.

Triton Information
What version of Triton are you using?
23.10, 23.11, 23.12, 24.03, and 24.04 don't work.

Are you using the Triton container or did you build it yourself?
Triton container

To Reproduce
Basically follow https://github.com/triton-inference-server/tutorials/tree/main/Quick_Deploy/TensorFlow, but change the backend to OpenVINO.
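For completeness, the SavedModel can be exported with something like the following one-liner (a sketch assuming the tutorial's Keras ResNet50 export and TF 2.x save semantics; the tutorial's actual script may differ):

python3 -c "import tensorflow as tf; tf.keras.applications.ResNet50(weights='imagenet').save('model_repository/resnet50/1/model.saved_model')"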

Model config:

name: "resnet50"
backend: "openvino"
#platform: "tensorflow_savedmodel"
default_model_filename: "model.saved_model"
max_batch_size : 0
input [
  {
    name: "input_1"
    data_type: TYPE_FP32
    dims: [-1, 224, 224, 3 ]
  }
]
output [
  {
    name: "predictions"
    data_type: TYPE_FP32
    dims: [-1, 1000]
  }
]
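For reference, the model repository layout this config expects would look roughly like this (assumed from the tutorial and the default_model_filename above, not stated explicitly in the report):

model_repository/
└── resnet50/
    ├── config.pbtxt
    └── 1/
        └── model.saved_model/
            ├── saved_model.pb
            └── variables/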


Command and logs:

docker run --rm -v ${PWD}/model_repository:/models nvcr.io/nvidia/tritonserver:24.03-py3 tritonserver --model-repository=/models
# docker run --rm -p 8000:8000 -p 8001:8001 -p 8002:8002 -v ${PWD}/model_repository:/models nvcr.io/nvidia/tritonserver:24.03-py3 tritonserver --model-repository=/models


=============================
== Triton Inference Server ==
=============================

NVIDIA Release 24.03 (build 86102629)
Triton Server Version 2.44.0

Copyright (c) 2018-2023, NVIDIA CORPORATION & AFFILIATES.  All rights reserved.

Various files include modifications (c) NVIDIA CORPORATION & AFFILIATES.  All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

WARNING: The NVIDIA Driver was not detected.  GPU functionality will not be available.
   Use the NVIDIA Container Toolkit to start this container with GPU support; see
   https://docs.nvidia.com/datacenter/cloud-native/ .

W0509 14:21:08.011250 1 pinned_memory_manager.cc:271] Unable to allocate pinned system memory, pinned memory pool will not be available: CUDA driver version is insufficient for CUDA runtime version
I0509 14:21:08.011288 1 cuda_memory_manager.cc:117] CUDA memory pool disabled
E0509 14:21:08.011338 1 server.cc:243] CudaDriverHelper has not been initialized.
I0509 14:21:08.013345 1 model_lifecycle.cc:469] loading: resnet50:1
I0509 14:21:08.022993 1 openvino.cc:1373] TRITONBACKEND_Initialize: openvino
I0509 14:21:08.023009 1 openvino.cc:1383] Triton TRITONBACKEND API version: 1.19
I0509 14:21:08.023012 1 openvino.cc:1389] 'openvino' TRITONBACKEND API version: 1.19
I0509 14:21:08.023059 1 openvino.cc:1473] TRITONBACKEND_ModelInitialize: resnet50 (version 1)
terminate called after throwing an instance of 'triton::backend::BackendModelInstanceException'

Expected behavior
Load the model.

@atobiszei (Author)

I found out that there are two copies of OpenVINO in the Triton image, and one of them is missing some of the OpenVINO libraries:

root@8bc8eab2d6ce:/# find -name "*openvino*" | grep -v 2330 | grep -v 23\.3\.0 | grep -v LICENSE | grep -v "libopenvino_c\|libopenvino.so"

./opt/tritonserver/backends/openvino
./opt/tritonserver/backends/openvino/libopenvino_intel_gna_plugin.so
./opt/tritonserver/backends/openvino/libopenvino_tensorflow_lite_frontend.so
./opt/tritonserver/backends/openvino/libtriton_openvino.so
./opt/tritonserver/backends/openvino/libopenvino_onnx_frontend.so
./opt/tritonserver/backends/openvino/libopenvino_auto_batch_plugin.so
./opt/tritonserver/backends/openvino/libopenvino_pytorch_frontend.so
./opt/tritonserver/backends/openvino/libopenvino_paddle_frontend.so
./opt/tritonserver/backends/openvino/libopenvino_intel_gpu_plugin.so
./opt/tritonserver/backends/openvino/libopenvino_tensorflow_frontend.so
./opt/tritonserver/backends/openvino/libopenvino_gapi_preproc.so
./opt/tritonserver/backends/openvino/libopenvino_auto_plugin.so
./opt/tritonserver/backends/openvino/libopenvino_hetero_plugin.so
./opt/tritonserver/backends/openvino/libopenvino_intel_cpu_plugin.so
./opt/tritonserver/backends/onnxruntime/libopenvino_onnx_frontend.so
./opt/tritonserver/backends/onnxruntime/libonnxruntime_providers_openvino.so
./opt/tritonserver/backends/onnxruntime/libopenvino_ir_frontend.so
./opt/tritonserver/backends/onnxruntime/libopenvino_intel_cpu_plugin.so
./opt/tritonserver/backends/onnxruntime/libopenvino_tensorflow_frontend.so

So this problem most likely also affects the TF Lite, PaddlePaddle, and PyTorch model formats.

The culprit is most likely here:
https://github.com/triton-inference-server/onnxruntime_backend/blob/48cc4f132a451a8dfebe501583d88acb5243dc38/tools/gen_ort_dockerfile.py#L311
since not all of the libraries are copied.
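One way to check which copy of the OpenVINO runtime the backend actually resolves at load time (a diagnostic sketch added here for illustration; not part of the original report):

ldd /opt/tritonserver/backends/openvino/libtriton_openvino.so | grep -i openvino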

@krishung5 (Contributor)

@tanmayv25 for vis.

@tanmayv25 (Contributor)

@atobiszei The OpenVINO backend in Triton does not support models saved in the SavedModel format. Read about Triton's OpenVINO backend here: https://github.com/triton-inference-server/openvino_backend?tab=readme-ov-file#openvino-backend

You'd have to convert the SavedModel into an OpenVINO IR model (.xml and .bin files) using the Model Optimizer tool, then place these files into the model directory instead of the TF SavedModel directory. An example conversion is sketched below.
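For example, with the OpenVINO developer tools installed, the conversion could look roughly like this (a sketch; mo is the legacy Model Optimizer CLI, and the exact flags vary across OpenVINO releases):

pip install openvino-dev
mo --saved_model_dir ./model.saved_model --output_dir ./model_repository/resnet50/1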

@atobiszei (Author) commented May 10, 2024

@tanmayv25
This paragraph states otherwise:
https://github.com/triton-inference-server/openvino_backend#loading-non-default-model-format

When I removed the ONNX Runtime backend from the Triton image and tuned the shape parameters in the config, it worked fine (a sketch of that workaround is shown below).
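A minimal sketch of that workaround, assuming the stock 24.03 image (deleting the ONNX Runtime backend directory removes its partial OpenVINO copy, at the cost of losing that backend; this is a reproduction of the reporter's described workaround, not a verified fix):

docker run --rm -v ${PWD}/model_repository:/models nvcr.io/nvidia/tritonserver:24.03-py3 \
  bash -c 'rm -rf /opt/tritonserver/backends/onnxruntime && tritonserver --model-repository=/models'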

@tanmayv25 (Contributor) commented May 10, 2024

Thanks for the correction. It seems the feature to load SavedModel files was added recently.
We need to revisit the Triton image to make sure that there are no conflicting dependencies. The OpenVINO backend should use its own installation of the OpenVINO library instead of the one shipped with the ONNX Runtime backend.

This would also let us install different OpenVINO versions for the OpenVINO and ONNX Runtime backends.

tanmayv25 added the bug (Something isn't working) label on May 10, 2024