
Triton Server OpenVINO backend not working with Tensorflow saved models #7200

Open · atobiszei opened this issue May 9, 2024 · 5 comments
Labels: bug (Something isn't working)

Description
Triton is unable to load models in the TensorFlow SavedModel format with the OpenVINO backend.

Triton Information
What version of Triton are you using?
23.10, 23.11, 23.12, 24.03, and 24.04 don't work.

Are you using the Triton container or did you build it yourself?
Triton container

To Reproduce
Basically follow https://github.com/triton-inference-server/tutorials/tree/main/Quick_Deploy/TensorFlow, but change the backend to OpenVINO.
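For completeness, the SavedModel can be exported with something like the following one-liner (a sketch assuming the tutorial's Keras ResNet50 export and TF 2.x save semantics; the tutorial's actual script may differ):

python3 -c "import tensorflow as tf; tf.keras.applications.ResNet50(weights='imagenet').save('model_repository/resnet50/1/model.saved_model')"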

Model config:

name: "resnet50"
backend: "openvino"
#platform: "tensorflow_savedmodel"
default_model_filename: "model.saved_model"
max_batch_size : 0
input [
  {
    name: "input_1"
    data_type: TYPE_FP32
    dims: [-1, 224, 224, 3 ]
  }
]
output [
  {
    name: "predictions"
    data_type: TYPE_FP32
    dims: [-1, 1000]
  }
]
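For reference, the model repository layout this config expects would look roughly like this (assumed from the tutorial and the default_model_filename above, not stated explicitly in the report):

model_repository/
└── resnet50/
    ├── config.pbtxt
    └── 1/
        └── model.saved_model/
            ├── saved_model.pb
            └── variables/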


Command and logs:

docker run --rm -v ${PWD}/model_repository:/models nvcr.io/nvidia/tritonserver:24.03-py3 tritonserver --model-repository=/models
# docker run --rm -p 8000:8000 -p 8001:8001 -p 8002:8002 -v ${PWD}/model_repository:/models nvcr.io/nvidia/tritonserver:24.03-py3 tritonserver --model-repository=/models


=============================
== Triton Inference Server ==
=============================

NVIDIA Release 24.03 (build 86102629)
Triton Server Version 2.44.0

Copyright (c) 2018-2023, NVIDIA CORPORATION & AFFILIATES.  All rights reserved.

Various files include modifications (c) NVIDIA CORPORATION & AFFILIATES.  All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

WARNING: The NVIDIA Driver was not detected.  GPU functionality will not be available.
   Use the NVIDIA Container Toolkit to start this container with GPU support; see
   https://docs.nvidia.com/datacenter/cloud-native/ .

W0509 14:21:08.011250 1 pinned_memory_manager.cc:271] Unable to allocate pinned system memory, pinned memory pool will not be available: CUDA driver version is insufficient for CUDA runtime version
I0509 14:21:08.011288 1 cuda_memory_manager.cc:117] CUDA memory pool disabled
E0509 14:21:08.011338 1 server.cc:243] CudaDriverHelper has not been initialized.
I0509 14:21:08.013345 1 model_lifecycle.cc:469] loading: resnet50:1
I0509 14:21:08.022993 1 openvino.cc:1373] TRITONBACKEND_Initialize: openvino
I0509 14:21:08.023009 1 openvino.cc:1383] Triton TRITONBACKEND API version: 1.19
I0509 14:21:08.023012 1 openvino.cc:1389] 'openvino' TRITONBACKEND API version: 1.19
I0509 14:21:08.023059 1 openvino.cc:1473] TRITONBACKEND_ModelInitialize: resnet50 (version 1)
terminate called after throwing an instance of 'triton::backend::BackendModelInstanceException'

Expected behavior
Load the model.

@atobiszei (Author)

I found out that there are two copies of OpenVINO in the Triton image, and one of them is missing some of the OpenVINO libraries:

root@8bc8eab2d6ce:/# find -name "*openvino*" | grep -v 2330 | grep -v 23\.3\.0 | grep -v LICENSE | grep -v "libopenvino_c\|libopenvino.so"

./opt/tritonserver/backends/openvino
./opt/tritonserver/backends/openvino/libopenvino_intel_gna_plugin.so
./opt/tritonserver/backends/openvino/libopenvino_tensorflow_lite_frontend.so
./opt/tritonserver/backends/openvino/libtriton_openvino.so
./opt/tritonserver/backends/openvino/libopenvino_onnx_frontend.so
./opt/tritonserver/backends/openvino/libopenvino_auto_batch_plugin.so
./opt/tritonserver/backends/openvino/libopenvino_pytorch_frontend.so
./opt/tritonserver/backends/openvino/libopenvino_paddle_frontend.so
./opt/tritonserver/backends/openvino/libopenvino_intel_gpu_plugin.so
./opt/tritonserver/backends/openvino/libopenvino_tensorflow_frontend.so
./opt/tritonserver/backends/openvino/libopenvino_gapi_preproc.so
./opt/tritonserver/backends/openvino/libopenvino_auto_plugin.so
./opt/tritonserver/backends/openvino/libopenvino_hetero_plugin.so
./opt/tritonserver/backends/openvino/libopenvino_intel_cpu_plugin.so
./opt/tritonserver/backends/onnxruntime/libopenvino_onnx_frontend.so
./opt/tritonserver/backends/onnxruntime/libonnxruntime_providers_openvino.so
./opt/tritonserver/backends/onnxruntime/libopenvino_ir_frontend.so
./opt/tritonserver/backends/onnxruntime/libopenvino_intel_cpu_plugin.so
./opt/tritonserver/backends/onnxruntime/libopenvino_tensorflow_frontend.so

So this problem most likely also affects the TF Lite, PaddlePaddle, and PyTorch model formats.

The culprit is most likely here:
https://github.com/triton-inference-server/onnxruntime_backend/blob/48cc4f132a451a8dfebe501583d88acb5243dc38/tools/gen_ort_dockerfile.py#L311
since not all of the libraries are copied.
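One way to check which copy of the OpenVINO runtime the backend actually resolves at load time (a diagnostic sketch added here for illustration; not part of the original report):

ldd /opt/tritonserver/backends/openvino/libtriton_openvino.so | grep -i openvino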

@krishung5 (Contributor)

@tanmayv25 for vis.

@tanmayv25 (Contributor)

@atobiszei The OpenVINO backend in Triton does not support models saved in the SavedModel format. Read about Triton's OpenVINO backend here: https://github.com/triton-inference-server/openvino_backend?tab=readme-ov-file#openvino-backend

You'd have to convert the SavedModel into an OpenVINO IR model (.xml and .bin files) using the Model Optimizer tool, then place these files into the model directory instead of the TF SavedModel directory. An example conversion is sketched below.
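For example, with the OpenVINO developer tools installed, the conversion could look roughly like this (a sketch; mo is the legacy Model Optimizer CLI, and the exact flags vary across OpenVINO releases):

pip install openvino-dev
mo --saved_model_dir ./model.saved_model --output_dir ./model_repository/resnet50/1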

@atobiszei (Author) commented May 10, 2024

@tanmayv25
This paragraph states otherwise:
https://github.com/triton-inference-server/openvino_backend#loading-non-default-model-format

When I removed the ONNX Runtime backend from the Triton image and tuned the shape parameters in the config, it worked fine (a sketch of that workaround is shown below).
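A minimal sketch of that workaround, assuming the stock 24.03 image (deleting the ONNX Runtime backend directory removes its partial OpenVINO copy, at the cost of losing that backend; this is a reproduction of the reporter's described workaround, not a verified fix):

docker run --rm -v ${PWD}/model_repository:/models nvcr.io/nvidia/tritonserver:24.03-py3 \
  bash -c 'rm -rf /opt/tritonserver/backends/onnxruntime && tritonserver --model-repository=/models'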

@tanmayv25 (Contributor) commented May 10, 2024

Thanks for the correction. It seems the feature to load SavedModel files was added recently.
We need to revisit the Triton image to make sure that there are no conflicting dependencies. The OpenVINO backend should use its own installation of the OpenVINO library instead of the one shipped with the ONNX Runtime backend.

This would also let us install different OpenVINO versions for the OpenVINO and ONNX Runtime backends.

tanmayv25 added the bug (Something isn't working) label on May 10, 2024