This issue documents additional ideas for integrating OpenVINO and FuseML that were explored during the research for the OVMS FuseML extensions but were not implemented.
Direct Integration
DL Workbench
Since the OpenVINO DL Workbench is essentially a web front-end built on top of most of the other OpenVINO components, such as the Model Optimizer and the Model Zoo, it could play the role of an ML experiment tracker. However, consuming the various services provided by the DL Workbench in FuseML workflows is hindered by the fact that the DL Workbench doesn't expose a programmable API. With such an API present, OpenVINO-enabled FuseML workflow steps like the OVMS converter could perform various ML operations through the DL Workbench and have all results immediately visible and accessible through the DL Workbench web UI.
Another reason for including the DL Workbench in the FuseML orchestrated tool stack is being able to train and prepare ML models through the DL Workbench and then export and consume those models as FuseML workflow inputs (e.g. to serve them with an OVMS predictor workflow). A FuseML installer extension should also be provided to simplify the installation of DL Workbench instances in a Kubernetes cluster.
Open Model Zoo
The Open Model Zoo and Model Downloader components can be leveraged to bring pre-trained OpenVINO models into FuseML workflows, for use in further training and/or serving operations.
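For illustration, a FuseML workflow step could simply wrap the Model Downloader CLI. The snippet below is a minimal sketch, assuming the downloader.py tool from the open_model_zoo repository is available in the step's container image; the model name and output directory are placeholder assumptions.

```python
# Sketch of a workflow step that pulls a pre-trained model from the
# Open Model Zoo with the Model Downloader tool; the model name and
# paths are illustrative placeholders.
import subprocess

MODEL_NAME = "face-detection-retail-0004"  # example Open Model Zoo model
OUTPUT_DIR = "/workspace/models"           # hypothetical workflow workspace

subprocess.run(
    ["python3", "downloader.py", "--name", MODEL_NAME, "--output_dir", OUTPUT_DIR],
    check=True,  # fail the workflow step if the download fails
)
```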
Training Extensions
Custom FuseML workflow training steps could be implemented on top of the OpenVINO Training Extensions.
IR Libraries
ML applications can be implemented directly against the OpenVINO IR libraries (i.e. the Inference Engine runtime that executes IR models). FuseML could facilitate the development of such applications through builder workflow steps targeted specifically at simplifying the building and packaging of IR-powered code.
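As a rough sketch of what such IR-powered code looks like, the snippet below runs inference against an IR model pair using the Inference Engine Python API from the OpenVINO 2021.x releases; the model file names and input data are placeholders.

```python
# Minimal inference against an OpenVINO IR model with the Inference
# Engine Python API (OpenVINO 2021.x); model files and input data are
# placeholders.
import numpy as np
from openvino.inference_engine import IECore

ie = IECore()
net = ie.read_network(model="model.xml", weights="model.bin")
exec_net = ie.load_network(network=net, device_name="CPU")

input_name = next(iter(net.input_info))          # first input layer name
input_shape = net.input_info[input_name].input_data.shape
dummy = np.zeros(input_shape, dtype=np.float32)  # placeholder input tensor

results = exec_net.infer(inputs={input_name: dummy})
print({name: out.shape for name, out in results.items()})
```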
Integration Through 3rd Parties
There are a couple of 3rd party inference platforms that already include support for OpenVINO ML models.
Seldon Core
Since OVMS implements the same API as TensorFlow Serving, Seldon Core can run an OVMS container alongside the tfserving-proxy container normally used to serve models with TFServing. Seldon Core also runs a storage initializer init container (from KFServing) that prepares the model store in the format required by OVMS.
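Because the served API is TensorFlow Serving compatible, a deployed model can be queried with a plain TFS-style REST request. The sketch below assumes a reachable OVMS endpoint; the host, port, model name, and input row are placeholder assumptions.

```python
# Query an OVMS-served model through the TensorFlow Serving REST API;
# the URL and payload shape are illustrative placeholders.
import requests

url = "http://localhost:8080/v1/models/mymodel:predict"  # TFS-style endpoint
payload = {"instances": [[0.1, 0.2, 0.3, 0.4]]}          # dummy input row

response = requests.post(url, json=payload)
response.raise_for_status()
print(response.json()["predictions"])
```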
Seldon Core is already included in the list of extensions available for FuseML. In theory, it could easily be extended to offer the OpenVINO back-end as a supported option.
NVidia Triton
Triton allows custom backends written in C++ and Python to be integrated easily. As part of the 21.03 release, a beta version of the OpenVINO backend for Triton is available for high-performance CPU inferencing on Intel platforms.
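For reference, a model served through the OpenVINO backend is queried like any other Triton model. The sketch below uses the tritonclient Python package; the server URL, model name, and tensor names are placeholder assumptions (the model's config.pbtxt would select the openvino backend).

```python
# Call a model served by Triton (e.g. via the OpenVINO backend) using
# the tritonclient HTTP API; names and shapes are placeholders.
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

infer_input = httpclient.InferInput("input", [1, 4], "FP32")
infer_input.set_data_from_numpy(np.zeros((1, 4), dtype=np.float32))

result = client.infer(model_name="mymodel", inputs=[infer_input])
print(result.as_numpy("output"))
```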
Triton is also featured as one of the predictors supported by KFServing, for which FuseML already provides a workflow predictor step. That step could be extended to support the Triton+OpenVINO combination.