AIMET is a library that provides advanced quantization and compression techniques for trained neural network models.
Fast inference engine for Transformer models
Unify Efficient Fine-Tuning of 100+ LLMs
Self-created tools to convert ONNX files (NCHW) to TensorFlow/TFLite/Keras format (NHWC). The purpose of this tool is to solve the massive transpose-extrapolation problem in onnx-tensorflow (onnx-tf). I don't need a star; give me a pull request.
Neural Network Compression Framework for enhanced OpenVINO™ inference
SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime
Official implementation of Half-Quadratic Quantization (HQQ)
Faster Whisper transcription with CTranslate2
Train, Evaluate, Optimize, Deploy Computer Vision Models via OpenVINO™
This is the official implementation of "LLM-QBench: A Benchmark Towards the Best Practice for Post-training Quantization of Large Language Models". It is also an efficient LLM compression tool offering various advanced compression methods and supporting multiple inference backends.
[CVPR 2024 Highlight] TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Models
A Python package that extends the official PyTorch to easily obtain performance gains on Intel platforms
KGy SOFT Drawing is a library for advanced image, icon and graphics handling.
Open source subtitling platform 💻 for transcribing and translating videos/audios in Indic languages.
Model Compression Toolkit (MCT) is an open source project for neural network model optimization under efficient, constrained hardware. This project provides researchers, developers, and engineers with advanced quantization and compression tools for deploying state-of-the-art neural networks.
a friendly neighborhood repository with diverse experiments and adventures in the world of LLMs
Learn to compress models through methods such as quantization, making them more efficient, faster, and more accessible
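The core idea shared by the quantization tools above can be sketched in a few lines of plain Python: map floats to low-bit integer codes with a scale and zero point, then dequantize to recover an approximation. This is an illustrative sketch only; the function names are hypothetical, and real toolkits (AIMET, MCT, neural-compressor, and the others listed here) add calibration, per-channel scales, and fused low-bit kernels on top of this idea.

```python
def quantize_int8(values):
    """Affine (asymmetric) int8 quantization: floats -> int8 codes."""
    lo, hi = min(values), max(values)
    if lo == hi:  # degenerate range: everything maps to a single code
        return [0] * len(values), 1.0, 0
    scale = (hi - lo) / 255.0  # int8 has 256 levels: [-128, 127]
    zero_point = round(-128 - lo / scale)
    q = [max(-128, min(127, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point

def dequantize_int8(q, scale, zero_point):
    """Recover approximate floats from int8 codes."""
    return [(code - zero_point) * scale for code in q]

weights = [-0.51, 0.0, 0.23, 0.97]
q, s, z = quantize_int8(weights)
recon = dequantize_int8(q, s, z)
# Per-value reconstruction error is bounded by half a quantization step.
assert all(abs(a - b) <= s / 2 + 1e-9 for a, b in zip(weights, recon))
```

Storing the codes as int8 instead of float32 cuts memory 4x; lower-bit schemes (INT4/NF4) trade more error for further savings.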
Documentation of my notes, learnings, presentations on Computer vision and some other cool stuff
GRAG is a simple Python package that provides an easy end-to-end solution for implementing Retrieval-Augmented Generation (RAG). The package offers an easy way to run various LLMs locally, thanks to LlamaCpp, and it also supports vector stores such as Chroma and DeepLake.