
📏 MetricRule

Easy open source monitoring for ML models.

A sidecar agent that creates metrics for monitoring deployed machine learning models.



Motivation

MetricRule agents are designed to be deployed alongside a serving model endpoint, generating input feature and output distribution metrics based on the endpoint's usage. Integrations with TensorFlow Serving and KFServing are supported.

The motivation of this project is to make it easier to monitor feature distributions in production, to better catch real-world ML issues like training-serving skew, feature drift, and poor model performance on specific input slices.

TFServing

The recommended usage with TFServing is to deploy the agent as a sidecar with the model.

The executable used is cmd/proxy. The latest release is available on Docker Hub.
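
A minimal Kubernetes sketch of this pattern is shown below. The image names, tags, and model are illustrative, not the published ones; check Docker Hub for the actual agent image:

```yaml
# Sketch: TFServing pod with the MetricRule proxy agent as a sidecar.
# Image names/tags below are illustrative, not the published ones.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: petfinder-model
spec:
  replicas: 1
  selector:
    matchLabels:
      app: petfinder-model
  template:
    metadata:
      labels:
        app: petfinder-model
    spec:
      containers:
        - name: tfserving
          image: tensorflow/serving:latest
          args:
            - "--rest_api_port=8501"
            - "--model_name=petfinder"
            - "--model_base_path=/models/petfinder"
        - name: metricrule-agent
          image: metricrule/proxy:latest  # hypothetical image name
          ports:
            - containerPort: 9551  # clients send inference requests here
          env:
            - name: REVERSE_PROXY_PORT
              value: "9551"
            - name: APPLICATION_PORT
              value: "8501"  # the agent forwards requests to TFServing
            - name: SIDECAR_CONFIG_PATH
              value: /config/sidecar_config.textproto
```

Clients send inference requests to the agent's port; the agent records metrics and transparently proxies each request to TFServing in the same pod.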

KFServing

The recommended usage with KFServing is to use the agent as a logger sink.

The executable used is cmd/eventlistener. The latest release is available on Docker Hub.
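
A minimal InferenceService sketch of this setup follows, assuming the event listener is exposed as an in-cluster Service; the sink URL and storage URI below are hypothetical:

```yaml
# Sketch: InferenceService whose logger mirrors request/response
# payloads to the MetricRule event listener.
apiVersion: serving.kubeflow.org/v1beta1
kind: InferenceService
metadata:
  name: petfinder
spec:
  predictor:
    logger:
      mode: all  # forward both requests and responses
      url: http://metricrule-eventlistener.default.svc.cluster.local:9551
    tensorflow:
      storageUri: gs://my-bucket/petfinder  # hypothetical model location
```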

Screenshots and Demo

Screenshots taken from a demo Grafana instance, with this agent running against a toy PetFinder model receiving simulated traffic.

The model is built using this TensorFlow tutorial.

Input Feature Distributions

(Screenshot: Count of Input Pets By Gender)

Output Distributions

(Screenshot: Distribution of Output Predictions)

Output Distribution by Input Slice

(Screenshot: Distribution of Outputs by Input Slice)

Get Started

Images of the agent are maintained on Docker Hub.

The agent takes as input a configuration that defines which metrics to create from the request and response JSONs. Based on this configuration, the agent exposes an HTTP endpoint (by default /metrics) serving metric aggregates.

The expected setup is for Prometheus to periodically scrape these endpoints. The scraped metrics can then be used as a data source for a Grafana dashboard for visualizations and alerts.
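
For example, a Prometheus scrape job for an agent listening on port 9551 might look like this (the job name and target are illustrative):

```yaml
# prometheus.yml fragment: scrape the agent's metrics endpoint.
scrape_configs:
  - job_name: metricrule-agent
    metrics_path: /metrics  # the agent's default METRICS_PATH
    scrape_interval: 15s
    static_configs:
      - targets: ["metricrule-agent:9551"]  # hypothetical host:port
```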

End-to-end examples with Kubernetes and Docker Compose are provided in the example/ subdirectory.

Configuration

Metric Definition

The configuration allows specifying metrics based on the JSON input features and the output predictions.

Additionally, input features can be parsed to create labels that are attached to input and output metric instances.

The config format is defined at api/proto/metricrule_metric_configuration.proto.

See an example configuration used for the demo at configs/example_sidecar_config.textproto.

The configuration is supplied as a config file whose location is set by the SIDECAR_CONFIG_PATH environment variable.
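
As a rough sketch of the shape such a configuration takes, the fragment below counts inputs labeled by a parsed feature value. The field names and paths here are illustrative; the .proto file above is the authoritative schema:

```textproto
# Illustrative only; consult the .proto schema and the example config
# for the exact message and field names.
input_metrics {
  name: "input_counts_by_gender"
  simple_counter {}
  labels {
    label_key { string_value: "Gender" }
    label_value {
      parsed_value {
        field_path: ".instances[0].Gender[0]"  # JSON path into the request
        parsed_type: STRING
      }
    }
  }
}
```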

Other Config

Configuration is supplied through environment variables. Some options of interest are listed below, with a run sketch after the list:

  • Set REVERSE_PROXY_PORT to the port the agent listens on, e.g. "9551"
  • Set APPLICATION_HOST to the host the serving endpoint runs on; defaults to "127.0.0.1"
  • Set APPLICATION_PORT to the port the serving endpoint runs on, e.g. "8501"
  • Set METRICS_PATH to the path where the HTTP endpoint for metrics scraping is exposed; defaults to "/metrics"
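
A minimal sketch of wiring these together when running the agent with plain Docker, using host networking so 127.0.0.1 reaches a local TFServing instance; the image name is hypothetical:

```sh
# Run the proxy agent against a model server on localhost:8501.
# The image name/tag are illustrative, not the published ones.
docker run --network host \
  -e REVERSE_PROXY_PORT=9551 \
  -e APPLICATION_HOST=127.0.0.1 \
  -e APPLICATION_PORT=8501 \
  -e METRICS_PATH=/metrics \
  -e SIDECAR_CONFIG_PATH=/config/sidecar_config.textproto \
  -v "$PWD/configs:/config" \
  metricrule/proxy:latest

# Check the aggregates (assuming metrics are served on the proxy port):
curl http://127.0.0.1:9551/metrics
```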

Contribute

We ❤️ contributions. Please see CONTRIBUTING.md.

Please feel free to use the Issue Tracker and GitHub Discussions to reach out to the maintainers.

For more information

Please refer to metricrule.com for more information.