A curated list for Efficient Large Language Models
-
Updated
May 17, 2024 - Python
A curated list for Efficient Large Language Models
模型压缩的小白入门教程
This project is the official implementation of our accepted NeurIPS 2023 (spotlight) paper QuantSR: Accurate Low-bit Quantization for Efficient Image Super-Resolution.
A list of papers, docs, codes about model quantization. This repo is aimed to provide the info for model quantization research, we are continuously improving the project. Welcome to PR the works (papers, repositories) that are missed by the repo.
A list of papers, docs, codes about efficient AIGC. This repo is aimed to provide the info for efficient AIGC research, including language and vision, we are continuously improving the project. Welcome to PR the works (papers, repositories) that are missed by the repo.
Inferflow is an efficient and highly configurable inference engine for large language models (LLMs).
This project is the official implementation of our accepted ICML 2023 paper BiBench: Benchmarking and Analyzing Network Binarization.
Chat to LLaMa 2 that also provides responses with reference documents over vector database. Locally available model using GPTQ 4bit quantization.
The official implementation of the ICML 2023 paper OFQ-ViT
Quantization is a technique to reduce the computational and memory costs of running inference by representing the weights and activations with low-precision data types like 8-bit integer (int8) instead of the usual 32-bit floating point (float32).
[WINNER! 🏆] Psychopathology FER Assistant. Because mental health matters. My project submission for #TFWorld TF 2.0 Challenge at Devpost.
This repository contains notebooks that show the usage of TensorFlow Lite for quantizing deep neural networks.
Unofficial implementation of NCNet using flax and jax
A tutorial of model quantization using TensorFlow
Add a description, image, and links to the model-quantization topic page so that developers can more easily learn about it.
To associate your repository with the model-quantization topic, visit your repo's landing page and select "manage topics."