This issue documents additional ideas for integrating OpenVINO and FuseML that were explored during the research for the OVMS FuseML extensions but were not implemented.
Direct Integration
DL Workbench
Since the OpenVINO DL Workbench is essentially a web front-end built on top of most of the other OpenVINO components, such as the Model Optimizer and the Model Zoo, it could play the role of an ML experiment tracker. However, consuming the various services provided by the DL Workbench in FuseML workflows is hindered by the fact that the DL Workbench doesn't expose a programmable API. With such an API present, OpenVINO-enabled FuseML workflow steps like the OVMS converter could perform various ML operations through the DL Workbench and have all results immediately visible and accessible through the DL Workbench web UI.
Another reason for including the DL Workbench in the FuseML orchestrated tool stack is being able to train and prepare ML models through the DL Workbench and then export and consume those models as FuseML workflow inputs (e.g. to serve them with an OVMS predictor workflow). A FuseML installer extension should also be provided to simplify the installation of DL Workbench instances in a Kubernetes cluster.
Open Model Zoo
The Open Model Zoo and Model Downloader components can be leveraged to bring pre-trained OpenVINO models into FuseML workflows, for use in further training and/or serving operations.
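For illustration, a FuseML workflow step could simply wrap the Model Downloader CLI. The snippet below is a minimal sketch, assuming the downloader.py tool from the open_model_zoo repository is available in the step's container image; the model name and output directory are placeholder assumptions.

```python
# Sketch of a workflow step that pulls a pre-trained model from the
# Open Model Zoo with the Model Downloader tool; the model name and
# paths are illustrative placeholders.
import subprocess

MODEL_NAME = "face-detection-retail-0004"  # example Open Model Zoo model
OUTPUT_DIR = "/workspace/models"           # hypothetical workflow workspace

subprocess.run(
    ["python3", "downloader.py", "--name", MODEL_NAME, "--output_dir", OUTPUT_DIR],
    check=True,  # fail the workflow step if the download fails
)
```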
Training Extensions
Custom FuseML workflow training steps could be implemented on top of the OpenVINO Training Extensions.
IR Libraries
ML applications can be implemented directly against the OpenVINO IR libraries (i.e. the Inference Engine runtime that executes IR models). FuseML could facilitate the development of such applications through builder workflow steps targeted specifically at simplifying the building and packaging of IR-powered code.
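As a rough sketch of what such IR-powered code looks like, the snippet below runs inference against an IR model pair using the Inference Engine Python API from the OpenVINO 2021.x releases; the model file names and input data are placeholders.

```python
# Minimal inference against an OpenVINO IR model with the Inference
# Engine Python API (OpenVINO 2021.x); model files and input data are
# placeholders.
import numpy as np
from openvino.inference_engine import IECore

ie = IECore()
net = ie.read_network(model="model.xml", weights="model.bin")
exec_net = ie.load_network(network=net, device_name="CPU")

input_name = next(iter(net.input_info))          # first input layer name
input_shape = net.input_info[input_name].input_data.shape
dummy = np.zeros(input_shape, dtype=np.float32)  # placeholder input tensor

results = exec_net.infer(inputs={input_name: dummy})
print({name: out.shape for name, out in results.items()})
```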
Integration Through 3rd Parties
There are a couple of 3rd party inference platforms that already include support for OpenVINO ML models.
Seldon Core
Since OVMS implements the same API as TensorFlow Serving, Seldon Core can run an OVMS container alongside the tfserving-proxy container normally used to serve models with TFServing. Seldon Core also runs a storage initializer init container (from KFServing) that prepares the model store in the format required by OVMS.
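Because the served API is TensorFlow Serving compatible, a deployed model can be queried with a plain TFS-style REST request. The sketch below assumes a reachable OVMS endpoint; the host, port, model name, and input row are placeholder assumptions.

```python
# Query an OVMS-served model through the TensorFlow Serving REST API;
# the URL and payload shape are illustrative placeholders.
import requests

url = "http://localhost:8080/v1/models/mymodel:predict"  # TFS-style endpoint
payload = {"instances": [[0.1, 0.2, 0.3, 0.4]]}          # dummy input row

response = requests.post(url, json=payload)
response.raise_for_status()
print(response.json()["predictions"])
```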
Seldon Core is already included in the list of extensions available for FuseML. In theory, it could easily be extended to offer the OpenVINO back-end as a supported option.
NVidia Triton
Triton allows custom backends written in C++ and Python to be integrated easily. As part of the 21.03 release, a beta version of the OpenVINO backend for Triton is available for high-performance CPU inferencing on Intel platforms.
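For reference, a model served through the OpenVINO backend is queried like any other Triton model. The sketch below uses the tritonclient Python package; the server URL, model name, and tensor names are placeholder assumptions (the model's config.pbtxt would select the openvino backend).

```python
# Call a model served by Triton (e.g. via the OpenVINO backend) using
# the tritonclient HTTP API; names and shapes are placeholders.
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

infer_input = httpclient.InferInput("input", [1, 4], "FP32")
infer_input.set_data_from_numpy(np.zeros((1, 4), dtype=np.float32))

result = client.infer(model_name="mymodel", inputs=[infer_input])
print(result.as_numpy("output"))
```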
Triton is also featured as one of the predictors supported by KFServing, for which FuseML already provides a workflow predictor step. That step could be extended to support the Triton+OpenVINO combination.