PyTorch/XLA integration with JetStream (https://github.com/google/JetStream) for LLM inference
Common Framework for Inference
High-efficiency floating-point neural network inference operators for mobile, server, and Web
A high-throughput and memory-efficient inference and serving engine for LLMs
MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.
A universal scalable machine learning model deployment solution
An open-source NLP-as-a-service project focused on providing state-of-the-art systems with ease. Training and inference via simple Docker commands.
Utilities to use the Hugging Face Hub API (see the download sketch after this list).
🔮 SuperDuperDB: Bring AI to your database! Build, deploy and manage any AI application directly with your existing data infrastructure, without moving your data. Including streaming inference, scalable model training and vector search.
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference (see the inference sketch after this list).
Study materials for taking the Harvard Biostatistics PhD Qualifying Exam, Summer 2024
Large Language Model Text Generation Inference (see the HTTP client sketch after this list).
A high-performance inference system for large language models, designed for production environments.
The Triton Inference Server provides an optimized cloud and edge inferencing solution (see the client sketch after this list).
A fast, easy-to-use, production-ready inference server for computer vision supporting deployment of many popular model architectures and fine-tuned models.
TypeDB: the polymorphic database powered by types
Port of OpenAI's Whisper model in C/C++
Cross-platform, customizable ML solutions for live and streaming media.
AICI: Prompts as (Wasm) Programs
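
For the Hugging Face Hub utilities entry above, a minimal sketch assuming the `huggingface_hub` Python package; the repo ID and filename are placeholders, not prescribed by the listing.

```python
from huggingface_hub import hf_hub_download

# Download a single file from a public model repo; the result is cached locally
# and the call returns the local filesystem path.
config_path = hf_hub_download(repo_id="bert-base-uncased", filename="config.json")
print(config_path)
```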
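For the OpenVINO toolkit entry, a minimal inference sketch assuming the OpenVINO 2022+ Python API and a model already converted to IR format; the `model.xml` path, device name, and input shape are placeholders.

```python
import numpy as np
from openvino.runtime import Core

core = Core()
model = core.read_model("model.xml")          # IR file; the .bin weights sit alongside it
compiled = core.compile_model(model, "CPU")   # target device: CPU, GPU, ...

# Compiled models are callable; feed a dummy input matching the model's expected shape.
dummy = np.zeros((1, 3, 224, 224), dtype=np.float32)
result = compiled([dummy])
print(list(result.values())[0].shape)
```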
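For the text generation inference server entry, a sketch of calling an already-running server over HTTP with the `requests` package; the host, port, and sampling parameters are assumptions, and a model must already be loaded by the server.

```python
import requests

# Assumed local endpoint; the server exposes a /generate route that accepts
# a prompt plus generation parameters and returns the generated text.
resp = requests.post(
    "http://localhost:8080/generate",
    json={
        "inputs": "Explain KV-cache reuse in one sentence.",
        "parameters": {"max_new_tokens": 64, "temperature": 0.7},
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["generated_text"])
```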
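For the Triton Inference Server entry, a client-side sketch assuming the `tritonclient` Python package and a hypothetical deployed model named `my_model` with one FP32 input `INPUT0` and one output `OUTPUT0`; adjust names and shapes to the model's actual configuration.

```python
import numpy as np
import tritonclient.http as httpclient

# Connect to a Triton server on its default HTTP port.
client = httpclient.InferenceServerClient(url="localhost:8000")

# Hypothetical tensor names and shape; these must match the model's config.pbtxt.
infer_input = httpclient.InferInput("INPUT0", [1, 3, 224, 224], "FP32")
infer_input.set_data_from_numpy(np.zeros((1, 3, 224, 224), dtype=np.float32))
requested = httpclient.InferRequestedOutput("OUTPUT0")

result = client.infer(model_name="my_model", inputs=[infer_input], outputs=[requested])
print(result.as_numpy("OUTPUT0").shape)
```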