OneDiff: An out-of-the-box acceleration library for diffusion models.
🍺 Obrew Server: A local & private AI inference engine.
A robust and efficient TinyML inference engine.
PyTorch library for cost-effective, fast and easy serving of MoE models.
Docs for search systems and AI infrastructure.
MIVisionX toolkit is a set of comprehensive computer vision and machine intelligence libraries, utilities, and applications bundled into a single toolkit. AMD MIVisionX also delivers a highly optimized open-source implementation of the Khronos OpenVX™ and OpenVX™ Extensions.
Implement a high-performance deep learning inference library from scratch, step by step; supports inference for models such as Llama 2, UNet, YOLOv5, and ResNet.
Python Computer Vision & Video Analytics Framework With Batteries Included
Drop-in, local AI alternative to the OpenAI stack. Multi-engine (llama.cpp, TensorRT-LLM). Powers 👋 Jan
What Happens Next? Live Inference
FEDML - The unified and scalable ML library for large-scale distributed training, model serving, and federated learning. FEDML Launch, a cross-cloud scheduler, further enables running any AI jobs on any GPU cloud or on-premise cluster. Built on this library, TensorOpera AI (https://TensorOpera.ai) is your generative AI platform at scale.
PygmalionAI's large-scale inference engine
The Qualcomm® AI Hub Models are a collection of state-of-the-art machine learning models optimized for performance (latency, memory, etc.) and ready to deploy on Qualcomm® devices.
Wingman is the fastest and easiest way to run Llama models on your PC or Mac.
An optimized neural network operator library for chips based on the Xuantie CPU.
A common base representation of python source code for pylint and other projects
Repository for OpenVINO's extra modules
Simple first-order logic implementation for .NET.
Friendli: the fastest serving engine for generative AI
A collection of tools for testing and dumping LLMs.