Is pruning of quantized models supported? #1422

Open
thomasave opened this issue Nov 27, 2023 · 3 comments

@thomasave

Hello,

I'm attempting to train a model for a microcontroller that only supports 8-bit precision or lower.
This works perfectly when training with your QuantizationAwareTrainingConfig.
In addition, we also want to prune the network to reduce the number of parameters in our model.
Luckily, the prepare_compression method accepts multiple configurations, so I attempted to also pass a WeightPruningConfig.
However, this fails with the following traceback:

Traceback (most recent call last):
  File "test.py", line 8, in <module>
    compression_manager.callbacks.on_train_end()
  File "/lib/python3.11/site-packages/neural_compressor/training.py", line 420, in on_train_end
    callbacks.on_train_end()
  File "/lib/python3.11/site-packages/neural_compressor/compression/callbacks.py", line 226, in on_train_end
    get_sparsity_ratio(self.pruners, self.model)
  File "/lib/python3.11/site-packages/neural_compressor/compression/pruner/utils.py", line 145, in get_sparsity_ratio
    linear_conv_cnt += module.weight.numel()
                       ^^^^^^^^^^^^^^^^^^^
AttributeError: 'function' object has no attribute 'numel'
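
For what it's worth, this looks consistent with get_sparsity_ratio assuming module.weight is a tensor, while PyTorch's converted quantized modules expose weight as a callable accessor instead. A quick illustration of that PyTorch behavior (not the actual INC code path):

import torch

# On a converted quantized module, `weight` is an accessor method, not a Parameter.
m = torch.ao.nn.quantized.Linear(4, 4)
print(callable(m.weight))   # True
print(m.weight().numel())   # 16; calling the accessor is required to get the tensor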

I was wondering whether this is supposed to be a supported use case and I'm doing something wrong, or whether combining multiple compression methods is not yet supported.

The following code can be used to minimally reproduce the error:

from neural_compressor import QuantizationAwareTrainingConfig
from neural_compressor.training import prepare_compression, WeightPruningConfig
from timm.models import create_model

# Combine a QAT config and a pruning config in a single compression manager.
quant_config = QuantizationAwareTrainingConfig()
prune_config = WeightPruningConfig([{"start_step": 1, "end_step": 10000}])
compression_manager = prepare_compression(create_model("resnet50"), [quant_config, prune_config])

# The error is raised from on_train_end(), even with no training steps in between.
compression_manager.callbacks.on_train_begin()
compression_manager.callbacks.on_train_end()
@YIYANGCAI (Collaborator)

Hi,

Thanks for your interest in our project.

I will take a look into the situation you propose, combining quantization and pruning. If it is a bug, I will try to fix it. Furthermore, I would like to note that most of our pruning algorithms require an additional training process, which may cause weight shift and invalidate the aforementioned quantization. If you have any further questions, please do not hesitate to contact us.

Best,
Frank

@thomasave (Author)

Hi,

Thank you for looking into this!
I am indeed aware that the pruning process would require additional training, but would it not be possible to make this training quantization-aware?
It would not be a problem that the low-precision model weights shift during the pruning process; that was actually my intention.
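
Concretely, I had something like the standard callbacks-driven training loop in mind, just with both configs active at once. A rough sketch, where model, train_loader, criterion, optimizer and num_epochs are placeholders rather than code from the snippet above:

# Both the QAT and pruning hooks are driven by the same compression manager.
compression_manager.callbacks.on_train_begin()
for epoch in range(num_epochs):
    compression_manager.callbacks.on_epoch_begin(epoch)
    for step, (inputs, targets) in enumerate(train_loader):
        compression_manager.callbacks.on_step_begin(step)
        outputs = model(inputs)
        loss = criterion(outputs, targets)
        loss = compression_manager.callbacks.on_after_compute_loss(inputs, outputs, loss)
        optimizer.zero_grad()
        loss.backward()
        compression_manager.callbacks.on_before_optimizer_step()
        optimizer.step()
        compression_manager.callbacks.on_after_optimizer_step()
        compression_manager.callbacks.on_step_end()
    compression_manager.callbacks.on_epoch_end()
compression_manager.callbacks.on_train_end()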

Kind regards,
Thomas

@YIYANGCAI (Collaborator)

Hello Thomas,

Thanks for the additional information. Combining quantization and pruning in one training process seems to make sense. I will investigate and find out whether we currently support this configuration.

Best,
Frank
