
The quantization parameters of the encodings are inconsistent with the quantization parameters in embedding.onnx #2680

Open · hcqylymzc opened this issue Jan 29, 2024 · 3 comments

@hcqylymzc

Hi, I am trying to align accuracy between the x86 platform and the HTP platform. I found that the quantization parameters in the encodings exported by sim.export are inconsistent with the quantization parameters in x_embeed.onnx (the model with QDQ nodes). Why is this, and how are the quantization parameters calculated?
I also tried manually extracting the quantization parameters from x_embeed.onnx, but the accuracy of the extracted encodings on the HTP platform differed from the accuracy of the encodings exported by AIMET. Why is this? Is there any way to align the precision of x_embeed.onnx with that of onnx + encodings?
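For context, a minimal sketch of one way to compare the two sets of parameters, assuming the QDQ scale/zero_point are stored as initializers feeding QuantizeLinear nodes; the encodings file name and JSON layout ("activation_encodings" mapping tensor names to lists of scale/offset dicts) may vary by AIMET version:

```python
# Sketch: diff QDQ scale/zero_point in the exported ONNX against the
# scale/offset in the AIMET encodings JSON. File names follow the thread
# where possible; "model.encodings" is an illustrative name.
import json

import onnx
from onnx import numpy_helper

model = onnx.load("x_embeed.onnx")
inits = {t.name: numpy_helper.to_array(t) for t in model.graph.initializer}

# Collect (scale, zero_point) per quantized tensor from the QDQ graph.
qdq_params = {}
for node in model.graph.node:
    if node.op_type == "QuantizeLinear":
        scale = inits.get(node.input[1])
        zero_point = inits.get(node.input[2]) if len(node.input) > 2 else 0
        qdq_params[node.input[0]] = (scale, zero_point)

with open("model.encodings") as f:  # illustrative name for sim.export output
    encodings = json.load(f)

# Print side-by-side values for tensors present in both files.
for name, (scale, zero_point) in qdq_params.items():
    enc = encodings.get("activation_encodings", {}).get(name)
    if enc:
        print(f"{name}: qdq scale={scale} zp={zero_point} "
              f"aimet scale={enc[0]['scale']} offset={enc[0]['offset']}")
```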

@quic-mangal
Contributor

Are you passing a config file to the quantization sim API that aligns with the HTP hardware?
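For reference, a minimal sketch of what that looks like with the AIMET PyTorch API; the toy model, input shape, and config path below are illustrative, so point config_file at the HTP config JSON that ships with your AIMET/QNN release:

```python
# Sketch: create a QuantizationSimModel with a target-aligned config file.
import torch
from aimet_common.defs import QuantScheme
from aimet_torch.quantsim import QuantizationSimModel

model = torch.nn.Sequential(torch.nn.Conv2d(3, 8, 3), torch.nn.ReLU())
dummy_input = torch.randn(1, 3, 224, 224)

sim = QuantizationSimModel(
    model,
    dummy_input=dummy_input,
    quant_scheme=QuantScheme.post_training_tf_enhanced,
    default_param_bw=8,
    default_output_bw=8,
    config_file="/path/to/htp_quantsim_config.json",  # illustrative path
)
# ...then sim.compute_encodings(...) with calibration data and sim.export(...)
```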

@hcqylymzc
Author

> Are you passing a config file to the quantization sim API that aligns with the HTP hardware?

Yes. I found that although the quantization parameters in x_embeed.onnx and in the encodings are inconsistent, the quantization parameters in the cpp file produced after running qnn_convert_onnx are the same as those in the encodings. However, the results on HTP are still inconsistent, and the maximum difference in integer values exceeds 10 (int8).
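For reference, the asymmetric min/max-to-scale/offset conversion one would typically assume when comparing these values; this is a sketch, so verify the rounding details against your AIMET version's documentation:

```python
# Sketch of the usual 8-bit asymmetric quantization formula: how a
# min/max encoding maps to an AIMET-style scale/offset and to an ONNX
# QDQ uint8 zero_point.
def encoding_to_qparams(enc_min: float, enc_max: float, bitwidth: int = 8):
    num_steps = 2 ** bitwidth - 1
    scale = (enc_max - enc_min) / num_steps
    offset = round(enc_min / scale)   # AIMET-style offset, usually negative
    zero_point = -offset              # ONNX QDQ uint8 zero_point
    return scale, offset, zero_point

# Example: an activation calibrated to [-0.5, 1.5]
print(encoding_to_qparams(-0.5, 1.5))  # ~(0.00784, -64, 64)
```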

@quic-shathwar
Contributor

@hcqylymzc This can happen when the nodes in the resulting QNN model (defined in the cpp file) are not aligned with the encodings exported from AIMET. One of the main reasons is the set of optimizations performed by the converter, which can produce a different set of nodes than the ones the encodings were generated for in AIMET, resulting in a drop in accuracy when taken to target.

To see what's going on with your case, can you please provide the following information?

  1. Minimal Reproducible Example: Could you provide a minimal code snippet and the model that reproduces the issue?
  2. Environment Details: Could you share the details of your environment? This includes the AIMET version, QNN version, and versions of Python, ONNX and torch (a quick snippet for gathering these is sketched after this list).
  3. Steps to Reproduce: Could you list the exact steps you took that led to the issue? That is, can you also provide the exact list of QNN commands you are running with the exported onnx/encodings? This will help us follow along and see the problem for ourselves.
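For item 2, a quick way to collect most of the versions; the torch/onnx attributes below are standard, while for AIMET and QNN versions `pip list` and your QNN SDK release notes are the more reliable sources, since the module-level version attribute varies across AIMET releases:

```python
# Print the interpreter and core library versions for the bug report.
import sys

import onnx
import torch

print("Python:", sys.version.split()[0])
print("torch :", torch.__version__)
print("onnx  :", onnx.__version__)
```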
