
The quantization parameters of the encodings are inconsistent with the quantization parameters in embedding.onnx #2680

Open · hcqylymzc opened this issue Jan 29, 2024 · 3 comments

@hcqylymzc

Hi, I am trying to align accuracy between the x86 platform and the HTP platform. I found that the quantization parameters in the encodings exported by sim.export are inconsistent with the quantization parameters in x_embeed.onnx (the model with QDQ nodes). Why is this, and how are the quantization parameters calculated?
I also tried manually extracting the quantization parameters from x_embeed.onnx, but the accuracy of the extracted encodings on the HTP platform differed from the accuracy of the encodings exported by AIMET. Why is this? Is there any way to align the precision of x_embeed.onnx with that of onnx + encodings?
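For context, a minimal sketch of one way to compare the two sets of parameters, assuming the QDQ scale/zero_point are stored as initializers feeding QuantizeLinear nodes; the encodings file name and JSON layout ("activation_encodings" mapping tensor names to lists of scale/offset dicts) may vary by AIMET version:

```python
# Sketch: diff QDQ scale/zero_point in the exported ONNX against the
# scale/offset in the AIMET encodings JSON. File names follow the thread
# where possible; "model.encodings" is an illustrative name.
import json

import onnx
from onnx import numpy_helper

model = onnx.load("x_embeed.onnx")
inits = {t.name: numpy_helper.to_array(t) for t in model.graph.initializer}

# Collect (scale, zero_point) per quantized tensor from the QDQ graph.
qdq_params = {}
for node in model.graph.node:
    if node.op_type == "QuantizeLinear":
        scale = inits.get(node.input[1])
        zero_point = inits.get(node.input[2]) if len(node.input) > 2 else 0
        qdq_params[node.input[0]] = (scale, zero_point)

with open("model.encodings") as f:  # illustrative name for sim.export output
    encodings = json.load(f)

# Print side-by-side values for tensors present in both files.
for name, (scale, zero_point) in qdq_params.items():
    enc = encodings.get("activation_encodings", {}).get(name)
    if enc:
        print(f"{name}: qdq scale={scale} zp={zero_point} "
              f"aimet scale={enc[0]['scale']} offset={enc[0]['offset']}")
```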

@quic-mangal
Contributor

Are you passing a config file to the quantization sim API that aligns with the HTP hardware?
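For reference, a minimal sketch of what that looks like with the AIMET PyTorch API; the toy model, input shape, and config path below are illustrative, so point config_file at the HTP config JSON that ships with your AIMET/QNN release:

```python
# Sketch: create a QuantizationSimModel with a target-aligned config file.
import torch
from aimet_common.defs import QuantScheme
from aimet_torch.quantsim import QuantizationSimModel

model = torch.nn.Sequential(torch.nn.Conv2d(3, 8, 3), torch.nn.ReLU())
dummy_input = torch.randn(1, 3, 224, 224)

sim = QuantizationSimModel(
    model,
    dummy_input=dummy_input,
    quant_scheme=QuantScheme.post_training_tf_enhanced,
    default_param_bw=8,
    default_output_bw=8,
    config_file="/path/to/htp_quantsim_config.json",  # illustrative path
)
# ...then sim.compute_encodings(...) with calibration data and sim.export(...)
```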

@hcqylymzc
Author

> Are you passing a config file to the quantization sim API that aligns with the HTP hardware?

Yes. I found that although the quantization parameters in x_embeed.onnx and in the encodings are inconsistent, the quantization parameters in the cpp file produced after running qnn_convert_onnx are the same as those in the encodings. However, the results on HTP are still inconsistent, and the maximum difference in integer values exceeds 10 (int8).
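For reference, the asymmetric min/max-to-scale/offset conversion one would typically assume when comparing these values; this is a sketch, so verify the rounding details against your AIMET version's documentation:

```python
# Sketch of the usual 8-bit asymmetric quantization formula: how a
# min/max encoding maps to an AIMET-style scale/offset and to an ONNX
# QDQ uint8 zero_point.
def encoding_to_qparams(enc_min: float, enc_max: float, bitwidth: int = 8):
    num_steps = 2 ** bitwidth - 1
    scale = (enc_max - enc_min) / num_steps
    offset = round(enc_min / scale)   # AIMET-style offset, usually negative
    zero_point = -offset              # ONNX QDQ uint8 zero_point
    return scale, offset, zero_point

# Example: an activation calibrated to [-0.5, 1.5]
print(encoding_to_qparams(-0.5, 1.5))  # ~(0.00784, -64, 64)
```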

@quic-shathwar
Contributor

@hcqylymzc This can happen when the nodes in the resulting QNN model (defined in the cpp file) are not aligned with the encodings exported from AIMET. One of the main reasons is the set of optimizations performed by the converter, which can produce a different set of nodes than the ones the encodings were generated for in AIMET, resulting in a drop in accuracy when taken to target.

To see what's going on with your case, can you please provide the following information?

  1. Minimal Reproducible Example: Could you provide a minimal code snippet and the model that reproduces the issue?
  2. Environment Details: Could you share the details of your environment? This includes the AIMET version, QNN version, and versions of Python, ONNX and torch (a quick snippet for gathering these is sketched after this list).
  3. Steps to Reproduce: Could you list the exact steps you took that led to the issue? That is, can you also provide the exact list of QNN commands you are running with the exported onnx/encodings? This will help us follow along and see the problem for ourselves.
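For item 2, a quick way to collect most of the versions; the torch/onnx attributes below are standard, while for AIMET and QNN versions `pip list` and your QNN SDK release notes are the more reliable sources, since the module-level version attribute varies across AIMET releases:

```python
# Print the interpreter and core library versions for the bug report.
import sys

import onnx
import torch

print("Python:", sys.version.split()[0])
print("torch :", torch.__version__)
print("onnx  :", onnx.__version__)
```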
