
Releases: Xilinx/brevitas

Release v0.10.2

19 Feb 16:37

What's Changed

Full Changelog: v0.10.1...v0.10.2

Release v0.10.1

15 Feb 11:50

Highlights

  • Support for A2Q+ (paper)
  • A2Q+ examples with CIFAR10 and Super Resolution
  • Support for concatenation equalization for weights and activations
  • Support for GPFQ with the A2Q L1-norm bound
  • Option to explicitly export the Q node for weights in QCDQ export (see the sketch after this list)
  • Support for float16 and bfloat16 in QCDQ export
  • Support for dynamic activation quantization in ONNX QDQ export
  • Support for channel splitting (paper)
  • (Beta) Better compatibility with Hugging Face accelerate and optimum
  • (Beta) Improved support and testing for minifloat quantization
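
The QCDQ export items above can be illustrated with a short sketch. This is not verbatim from the release: export_onnx_qcdq is the Brevitas QCDQ export entry point, while the export_weight_q_node keyword is an assumed name for the new explicit weight Q node option.

# Hedged sketch: export a small quantized layer to ONNX QCDQ.
import torch
import brevitas.nn as qnn
from brevitas.export import export_onnx_qcdq

model = qnn.QuantLinear(128, 256, bias=True, weight_bit_width=4)
export_onnx_qcdq(
    model,
    torch.randn(1, 128),
    export_path='quant_linear_qcdq.onnx',
    export_weight_q_node=True)  # assumption: flag name for the explicit weight Q node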

What's Changed

Full Changelog: v0.10.0...v0.10.1

A2Q+ CIFAR10 model release

12 Feb 18:17
c78f974
Pre-release

This release contains training code and pre-trained weights to demonstrate accumulator-aware quantization (A2Q) on an image classification task. Code is also provided to demonstrate Euclidean projection-based weight initialization (EP-init) as proposed in our paper "A2Q+: Improving Accumulator-Aware Weight Quantization".

Find the associated docs at https://github.com/Xilinx/brevitas/tree/a2q_cifar10_r1/src/brevitas_examples/imagenet_classification/a2q.
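
As a non-authoritative sketch of what accumulator-aware quantization looks like at the layer level, the snippet below attaches an A2Q-style weight quantizer to a convolution and caps the accumulator width. The quantizer name and the weight_accumulator_bit_width keyword follow the Brevitas A2Q examples and should be treated as assumptions, not as the exact API of this release.

# Sketch under assumptions: cap the accumulator at 16 bits with A2Q.
import brevitas.nn as qnn
from brevitas.quant import Int8AccumulatorAwareWeightQuant

conv = qnn.QuantConv2d(
    in_channels=3,
    out_channels=64,
    kernel_size=3,
    weight_quant=Int8AccumulatorAwareWeightQuant,  # an A2Q+ zero-center variant also exists (assumed)
    weight_accumulator_bit_width=16)  # guarantees overflow avoidance in a 16-bit accumulator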

A2Q+ model release

30 Jan 19:00
17fb49e
Pre-release

A2Q+ Super Resolution Experiments with Brevitas

This release contains training code and pre-trained weights to demonstrate accumulator-aware quantization (A2Q+) as proposed in our paper "A2Q+: Improving Accumulator-Aware Weight Quantization" on a super resolution task.

Find the associated docs at https://github.com/Xilinx/brevitas/tree/super_res_r2/src/brevitas_examples/super_resolution.

Release v0.10.0

08 Dec 16:36
Compare
Choose a tag to compare

Highlights

  • Support for PyTorch up to version 2.1.
  • Support for the GPTQ PTQ algorithm (see the sketch after this list).
  • Support for the GPFQ PTQ algorithm.
  • Support for the SmoothQuant / activation equalization PTQ algorithm.
  • Support for MSE-based scale and zero-point for weights and activations.
  • Support for row-wise scaling at the input of QuantLinear.
  • Support for quantization of a slice of a weight tensor.
  • End-to-end support for learned rounding in ImageNet PTQ.
  • End-to-end example training scripts for A2Q (low-precision accumulation) on super resolution.
  • Experimental support for minifloats (eXmY quantization).
  • Experimental LLM PTQ flow with support for weight-only and weight+activation quantization, together with GPTQ, AWQ and SmoothQuant.
  • Experimental Stable Diffusion PTQ flow with support for weight-only quantization.
  • Deprecated the FINN ONNX export flow.
  • Updated the custom value_trace FX tracer to the latest FX.
  • New custom variant of the make_fx tracer with support for custom torch.library ops through the @Wrap annotation.
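
To make the PTQ additions concrete, here is a hedged sketch of the GPTQ flow, modeled on the gptq_mode context manager used in the Brevitas PTQ examples; the toy model and random calibration batches are stand-ins.

# Hedged sketch: GPTQ adjusts weights layer by layer from calibration data.
import torch
import brevitas.nn as qnn
from brevitas.graph.gptq import gptq_mode

model = torch.nn.Sequential(
    qnn.QuantIdentity(return_quant_tensor=True),
    qnn.QuantLinear(64, 32, bias=True, weight_bit_width=4))
calib_data = [torch.randn(8, 64) for _ in range(4)]

with torch.no_grad():
    with gptq_mode(model) as gptq:
        for _ in range(gptq.num_layers):  # one pass per quantized layer
            for batch in calib_data:
                gptq.model(batch)
            gptq.update()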

What's Changed


A2Q model release

20 Sep 16:07
acf1f5d
Pre-release

Integer-Quantized Super Resolution Experiments with Brevitas

This release contains scripts demonstrating how to train integer-quantized super resolution models using Brevitas.
Code is also provided to demonstrate accumulator-aware quantization (A2Q) as proposed in our ICCV 2023 paper "A2Q: Accumulator-Aware Quantization with Guaranteed Overflow Avoidance".

Find the associated docs at https://github.com/Xilinx/brevitas/tree/super_res_r1/src/brevitas_examples/super_resolution.

Release v0.9.1

28 Apr 16:57

What's Changed

Full Changelog: v0.9.0...v0.9.1

Release v0.9.0

21 Apr 17:50

Highlights

Overview of changes

Graph quantization

Quantized layers

  • Initial support for QuantMultiheadAttention #568 (see the sketch below)
  • Breaking change: rename Quant(Adaptive)AvgPool to Trunc(Adaptive)AvgPool by @volcacius in #562
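
For illustration only, a minimal use of the new attention layer might look like the sketch below; the constructor and return signature are assumed to mirror torch.nn.MultiheadAttention.

# Sketch: quantized multi-head attention, assuming a torch-like interface.
import torch
import brevitas.nn as qnn

mha = qnn.QuantMultiheadAttention(embed_dim=64, num_heads=4, batch_first=True)
x = torch.randn(2, 10, 64)  # (batch, sequence, embedding)
attn_out, attn_weights = mha(x, x, x)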

Quantizers

QuantTensor

PTQ

Export

CI, linting

FX

Examples

Full Changelog: v0.8.0...v0.9.0

FINN-friendly 4W4A ResNet18 on CIFAR10

20 Apr 12:17
Pre-release

Model definition and pretrained weights for a 4-bit weight, 4-bit activation (4W4A) variant of ResNet18 for FINN deployment, available under the bnn_pynq examples:

from brevitas_examples.bnn_pynq.models import resnet18_4w4a

# Load the 4W4A ResNet18 with pretrained weights
quant_model = resnet18_4w4a(pretrained=True)
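
A natural next step for the FINN flow is exporting the model to QONNX; the sketch below assumes the export_qonnx entry point available in recent Brevitas versions.

# Hedged sketch: export the pretrained model to QONNX for FINN ingestion.
import torch
from brevitas.export import export_qonnx

export_qonnx(
    quant_model.eval(),
    torch.randn(1, 3, 32, 32),  # CIFAR10-shaped dummy input
    export_path='resnet18_4w4a.onnx')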

Release version 0.8.0

10 Jan 09:09

What's Changed

  • Add support for PyTorch 1.11-1.13.1. Brevitas 0.8 supports PyTorch 1.5.1 to 1.13.1, with 1.10+ suggested.
  • Deprecate support for Python 3.6; 3.7+ is now required.
  • Add support for export to ONNX QCDQ for <= int8 quantization, for out-of-the-box execution with onnxruntime or similar backends.
  • Extend support for export to ONNX QOps to <= int8 quantization, for out-of-the-box execution with onnxruntime or similar backends.
  • Add experimental support for export to torch QCDQ for <= int32 quantization, as an entry point for future MLIR integration with torch-mlir.
  • Add support for QuantRNN and QuantLSTM, with support for CIFG, bidirectional layers, shared input-hidden gates, shared quantizers, training-time JIT compilation, and partial export support to ONNX (QONNX and QCDQ); see the sketch after this list.
  • Improve support for zero-point for both weights and activations quantization.
  • New default asymmetric activation quantizer based on percentile rather than min/max.
  • Add more built-in quantizers (symmetric per-channel, asymmetric per-channel, symmetric decoupled per-channel).
  • Simplify interface for activation calibration.
  • Simplify interface for bias correction.
  • Initial support for QuantEmbedding.
  • Deprecate support for XIR and PyXIR export flows.
  • Many bug fixes and minor improvements.
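
As a hedged sketch of the new recurrent layers, the snippet below builds a bidirectional QuantLSTM; the constructor arguments are assumed to mirror torch.nn.LSTM.

# Sketch: quantized LSTM, assuming torch.nn.LSTM-like arguments.
import torch
import brevitas.nn as qnn

qlstm = qnn.QuantLSTM(input_size=32, hidden_size=64, bidirectional=True)
outputs = qlstm(torch.randn(5, 1, 32))  # (sequence, batch, features)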

New Contributors

Full Changelog: v0.7.1...v0.8.0