Add support for providing a model configuration file for inference services #249
Labels: area/core, area/extension, enhancement
Some inference services (e.g. Triton) require a configuration file to be able to serve a model. In the case of Triton, a minimal model configuration must specify the platform and/or backend properties, the max_batch_size property, and the input and output tensors of the model (see: https://github.com/triton-inference-server/server/blob/main/docs/model_configuration.md).
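For reference, a minimal sketch of what such a Triton `config.pbtxt` could look like for a PyTorch model (the tensor names, data types, and shapes below are purely illustrative and depend on the actual model being served):

```
# Hypothetical minimal Triton model configuration (config.pbtxt).
# Backend/platform, tensor names, types, and shapes must match the model.
platform: "pytorch_libtorch"
max_batch_size: 8
input [
  {
    name: "INPUT__0"
    data_type: TYPE_FP32
    dims: [ 3, 224, 224 ]
  }
]
output [
  {
    name: "OUTPUT__0"
    data_type: TYPE_FP32
    dims: [ 1000 ]
  }
]
```

Triton expects this file at `<model-repository>/<model-name>/config.pbtxt`, alongside the versioned model directories.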
There is also the case where the configuration file is optional, for example the sklearn predictor in KFServing (see: https://github.com/kserve/kserve/tree/master/docs/samples/v1beta1/sklearn/v2#model-settings), where it is used to specify some metadata about the model (name, version, ...).
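For comparison, a sketch of such an optional settings file, based on the `model-settings.json` format used by the linked sample (the field values and the implementation class here are assumptions; the exact schema depends on the MLServer version):

```json
{
  "name": "sklearn-iris",
  "implementation": "mlserver_sklearn.SKLearnModel",
  "parameters": {
    "version": "v1.0.0"
  }
}
```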
Currently FuseML does not have a mechanism for providing such a configuration file, which makes it unable to support some inference service solutions, for example using Triton to serve a PyTorch model.