Architecture for analyzing pruning methods using the PyTorch prune module
Updated May 26, 2024 - Python
AIMET is a library that provides advanced quantization and compression techniques for trained neural network models.
Libraries for applying sparsification recipes to neural networks with a few lines of code, enabling faster and smaller models
🤗 Optimum Intel: Accelerate inference with Intel optimization tools
SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime
Neural Network Compression Framework for enhanced OpenVINO™ inference
Chess engine
This is the official implementation of "LLM-QBench: A Benchmark Towards the Best Practice for Post-training Quantization of Large Language Models", and it is also an efficient LLM compression tool with various advanced compression methods, supporting multiple inference backends.
Code for CPAL-2024 paper "Continual Learning with Dynamic Sparse Training: Exploring Algorithms for Effective Model Updates"
Characterization study repository for model compression method: pruning
TinyNeuralNetwork is an efficient and easy-to-use deep learning model compression framework.
Code for the paper "FOCIL: Finetune-and-Freeze for Online Class-Incremental Learning by Training Randomly Pruned Sparse Experts"
Sparsity-aware deep learning inference runtime for CPUs
[CVPR 2023] Towards Any Structural Pruning; LLMs / SAM / Diffusion / Transformers / YOLOv8 / CNNs
"Hung-yi Lee Deep Learning Tutorial" (recommended by Professor Hung-yi Lee 👍); PDF download: https://github.com/datawhalechina/leedl-tutorial/releases
Config-driven, easy backup CLI for restic.
AIMET GitHub pages documentation
PaddleSlim is an open-source library for deep model compression and architecture search.