A high-throughput and memory-efficient inference and serving engine for LLMs
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
🔥 A tool for visualizing and tracking your machine learning experiments. This repo contains the CLI and Python API.
Valor is a centralized evaluation store which makes it easy to measure, explore, and rank model performance.
Run any open-source LLM, such as Llama 2 or Mistral, as an OpenAI-compatible API endpoint in the cloud.
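Because such an endpoint is OpenAI-compatible, a standard chat-completions request body works against it unchanged. A minimal standard-library sketch of building that payload; the endpoint URL, port, and model name are placeholder assumptions, not values from this listing:

```python
import json

# Hypothetical local server exposing the OpenAI chat-completions route.
ENDPOINT = "http://localhost:3000/v1/chat/completions"  # assumption

def build_chat_request(model: str, prompt: str, max_tokens: int = 64) -> str:
    """Serialize an OpenAI-compatible chat-completion request body."""
    return json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    })

body = build_chat_request("mistral-7b", "Say hello")
```

The same body can then be POSTed to `ENDPOINT` with any HTTP client, since the server mimics the OpenAI wire format.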
Build applications that make decisions (chatbots, agents, simulations, etc.). Monitor, persist, and execute on your own infrastructure.
Weaviate is an open-source vector database that stores both objects and vectors, allowing vector search to be combined with structured filtering, with the fault tolerance and scalability of a cloud-native database.
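To illustrate what "vector search combined with structured filtering" means, here is a toy brute-force sketch in plain Python (not Weaviate's client API): apply the structured filter first, then rank the surviving candidates by cosine similarity.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Each record carries a vector plus structured metadata.
docs = [
    {"vector": [1.0, 0.0], "category": "news"},
    {"vector": [0.9, 0.1], "category": "blog"},
    {"vector": [0.0, 1.0], "category": "news"},
]

def search(query, category, docs):
    # Structured filter first, then rank by vector similarity.
    candidates = [d for d in docs if d["category"] == category]
    return sorted(candidates, key=lambda d: cosine(query, d["vector"]), reverse=True)

top = search([1.0, 0.0], "news", docs)
```

A real vector database replaces the linear scan with an approximate nearest-neighbor index, but the filter-then-rank semantics are the same.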
Example 📓 Jupyter notebooks that demonstrate how to build, train, and deploy machine learning models using 🧠 Amazon SageMaker.
JetStream is a throughput- and memory-optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in the future -- PRs welcome).
Modern columnar data format for ML and LLMs implemented in Rust. Convert from Parquet in 2 lines of code for 100x faster random access, vector indexing, and data versioning. Compatible with Pandas, DuckDB, Polars, and PyArrow, with more integrations coming.
Multiwoven Documentation
Determined is an open-source machine learning platform that simplifies distributed training, hyperparameter tuning, experiment tracking, and resource management. Works with PyTorch and TensorFlow.
Scalable and flexible workflow orchestration platform that seamlessly unifies data, ML and analytics stacks.
Distributed ML Training and Fine-Tuning on Kubernetes
Qdrant - High-performance, massive-scale vector database for the next generation of AI. Also available in the cloud: https://cloud.qdrant.io/
An orchestration platform for the development, production, and observation of data assets.
Python client for Kolena's machine learning testing platform
Machine Learning Pipelines for Kubeflow
Serve, optimize and scale PyTorch models in production