int8
Here are 28 public repositories matching this topic...
TensorRT INT8 sample in Python.
Updated Jan 28, 2019 - Python
A reimplementation of RetinaFace using C++ and TensorRT.
Updated Dec 4, 2019 - C++
A simple pipeline for INT8 quantization based on TensorRT.
Updated Oct 14, 2022 - Python
LLM-Lora-PEFT_accumulate explores optimizations for Large Language Models (LLMs) using PEFT, LoRA, and QLoRA. Contribute experiments and implementations to enhance LLM efficiency. Join discussions and push the boundaries of LLM optimization. Let's make LLMs more efficient together!
Updated Jun 16, 2023 - Jupyter Notebook
Quantization examples for PTQ (post-training quantization) and QAT (quantization-aware training).
Updated Jul 24, 2023 - Python
A quantization framework under development.
Updated Sep 9, 2023 - Python
A LLaMA2-7b chatbot with memory that runs on CPU, optimized using smooth quantization, 4-bit quantization, or Intel® Extension for PyTorch with bfloat16.
Updated Feb 27, 2024 - Python