Hello community,
I've tried the SmoothQuant flow on an OPT-125m model with the default settings. Unsurprisingly, the activations are quantized per tensor and the weights per channel. According to the following table from the SmoothQuant paper, weights can also be quantized per tensor (SmoothQuant-O3). Is it possible to apply SmoothQuant this way by setting the QuantizeConfig or some other option? I need per_tensor quantization for both activations and weights due to a limitation of my hardware. Thanks!
Hi Chen, thanks for your response.
Currently, weights can only be quantized per-channel in INC SmoothQuant. Please refer to SmoothQuant_doc for more details of our implementation.
Thanks!
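For anyone weighing the two granularities discussed in this thread, here is a minimal NumPy sketch of symmetric int8 weight quantization with per-tensor versus per-channel scales. It is only an illustration and not INC's implementation: the helper names `quantize_symmetric_int8` and `dequantize`, the random weight matrix, and the artificially enlarged first channel are all assumptions made up for the example.

```python
import numpy as np


def quantize_symmetric_int8(w, per_channel):
    """Symmetric int8 quantization of a 2-D weight of shape (out_channels, in_features).

    Hypothetical helper for illustration only; not part of INC.
    """
    if per_channel:
        # One scale per output channel (row).
        max_abs = np.abs(w).max(axis=1, keepdims=True)
    else:
        # A single scale shared by the whole tensor.
        max_abs = np.abs(w).max(keepdims=True)
    # Guard against an all-zero row or tensor.
    scale = np.maximum(max_abs, 1e-12) / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale


def dequantize(q, scale):
    """Map int8 values back to float using the stored scale(s)."""
    return q.astype(np.float32) * scale


rng = np.random.default_rng(0)
w = rng.normal(size=(4, 8)).astype(np.float32)
w[0] *= 50.0  # assumed outlier channel with a much wider range

q_t, s_t = quantize_symmetric_int8(w, per_channel=False)
q_c, s_c = quantize_symmetric_int8(w, per_channel=True)

err_tensor = float(np.abs(dequantize(q_t, s_t) - w).mean())
err_channel = float(np.abs(dequantize(q_c, s_c) - w).mean())

print("per-tensor scale shape:", s_t.shape, "mean abs error:", err_tensor)
print("per-channel scale shape:", s_c.shape, "mean abs error:", err_channel)
```

With a single shared scale, the channel with the widest range dictates the step size for every other channel, which is why per-tensor weight quantization tends to lose more accuracy when channel ranges differ a lot.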