# int8

Here are 28 public repositories matching this topic...

intel / neural-compressor

SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime

sparsity pruning quantization knowledge-distillation auto-tuning int8 low-precision quantization-aware-training post-training-quantization awq int4 large-language-models gptq smoothquant sparsegpt fp4 mxformat

Updated May 23, 2024
Python

clancylian / retinaface

Reimplementation of RetinaFace using C++ and TensorRT

caffe tensorrt int8 retinaface mxnet2caffe

Updated Dec 4, 2019
C++

Wulingtian / yolov5_tensorrt_int8_tools

TensorRT INT8 quantization of YOLOv5 ONNX models

tensorrt int8 onnx yolov5

Updated Apr 23, 2021
Python

intel / neural-speed

An innovative library for efficient LLM inference via low-bit quantization

Updated May 23, 2024
C++

Wulingtian / yolov5_tensorrt_int8

TensorRT INT8 quantized deployment of the YOLOv5s model; measured at 3.3 ms per frame!

tensorrt int8 yolov5

Updated Apr 23, 2021
C++

Wulingtian / RepVGG_TensorRT_int8

RepVGG TensorRT INT8 quantization; measured inference takes under 1 ms per frame!

tensorrt int8 repvgg

Updated Apr 23, 2021
Python

Wulingtian / nanodet_tensorrt_int8

NanoDet INT8 quantization; measured inference takes 2 ms per frame!

tensorrt int8 nanodet

Updated Apr 23, 2021
C++

ppogg / ncnn-yolov4-int8

NCNN + INT8 + YOLOv4: model quantization and real-time inference

real-time int8 ncnn yolov4

Updated Aug 24, 2021
C++

xuanandsix / Tensorrt-int8-quantization-pipline

A simple pipeline for INT8 quantization based on TensorRT.

quantization tensorrt int8 yolox classifaction

Updated Oct 14, 2022
Python

yester31 / Quantization_EX

Quantization examples for PTQ & QAT

quantization tensorrt int8 qat model-optimization quantization-aware-training post-training-quantization pytorch-quantization ptq

Updated Jul 24, 2023
Python

whitelok / tensorrt-int8-python-sample

TensorRT INT8 Python sample.

python machine-learning ai deep-learning inference nvidia tensorrt int8 int8-inference tensorrt-int8-python

Updated Jan 28, 2019
Python

RyannnG / gie_int8_sample

Updated Mar 27, 2017
C++

dasdristanta13 / LLM-Lora-PEFT_accumulate

LLM-Lora-PEFT_accumulate explores optimizations for Large Language Models (LLMs) using PEFT, LORA, and QLORA. Contribute experiments and implementations to enhance LLM efficiency. Join discussions and push the boundaries of LLM optimization. Let's make LLMs more efficient together!

falcon llama lora alpaca int8 peft llm qlora bitsandbytes

Updated Jun 16, 2023
Jupyter Notebook

cbalint13 / rvv-kernels

RISC-V Vector kernel C/LLVM-IR generator

kernel math vector llvm riscv int8 tvm rvv

Updated Mar 17, 2024
Python

aahouzi / llama2-chatbot-cpu

A LLaMA2-7b chatbot with memory running on CPU, and optimized using smooth quantization, 4-bit quantization or Intel® Extension For PyTorch with bfloat16.

Updated Feb 27, 2024
Python

stdlib-js / constants-int8-num-bytes

Size (in bytes) of an 8-bit signed integer.

Updated May 1, 2024
JavaScript

stdlib-js / napi-argv-int8array

Convert a Node-API value to a signed 8-bit integer array.

nodejs javascript node utilities native addon utils array stdlib macros node-js napi int8 int

Updated May 1, 2024
C

douzsh / mxnet-quantized

MXNet GluonCV quantization: binary and ternary models

mxnet binary quantization ternary int8 gluoncv

Updated May 22, 2019
Python

lbin / gie_int8_sample

Updated Mar 27, 2017
C++

MrFMach / Practice-C-types

Practicing C data types using the sizeof operator

c int64 float double char int8 int16 int32 sizeof int128

Updated Sep 21, 2020
C
