
How to deploy a PaddleSlim-quantized model on Ascend hardware #1815

Open

chairman-lu opened this issue Dec 12, 2023 · 1 comment
chairman-lu commented Dec 12, 2023

After compressing YOLOv7 with PaddleSlim's ACT auto-compression tool, the resulting model has parameter values in the int8 range, but they are still stored as float, so the ONNX file is the same size before and after compression. I now need to deploy this ONNX model on an Ascend Atlas 500 development board. My steps were as follows:
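(For reference, this "int8 range but float dtype" behavior can be confirmed by inspecting the ONNX initializers directly. A minimal sketch, assuming the file path from the commands below:

```python
import onnx
from onnx import numpy_helper

model = onnx.load("./model/yolov7.onnx")
for init in model.graph.initializer:
    arr = numpy_helper.to_array(init)
    # ACT fake quantization keeps float32 storage even though the
    # values themselves fall in the int8 range [-128, 127]
    print(init.name, arr.dtype, arr.min(), arr.max())
```
)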

First, I converted the ONNX model to the om format supported by Ascend. The conversion command was:
atc --framework=5 --model=./model/yolov7.onnx --input_shape="images:1,3,640,640" \
    --output=./model/yolov7 --compression_optimize=./scripts/compression_opt.config --soc_version=Ascend310 \
    --log=info --insert_op_conf=./scripts/aipp.cfg
The --compression_optimize option here effectively adds post-training quantization (PTQ), converting the float parameters to int8. I then followed the official Ascend sample program at the link below:
https://gitee.com/ascend/samples/tree/master/inference/modelInference/sampleYOLOV7
Running inference as described there, the output image contains no predicted boxes at all.
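(One quick way to tell whether PTQ destroyed the network output itself, rather than the postprocessing, would be to dump the raw output tensor of both om models before postprocessing and compare them. A sketch, assuming the tensors were saved to the hypothetical files output_fp.npy and output_int8.npy:

```python
import numpy as np

# Hypothetical dumps of the raw model outputs, saved before postprocessing
fp = np.load("output_fp.npy")    # output of the float om model
q = np.load("output_int8.npy")   # output of the PTQ om model

print("max abs diff:", np.abs(fp - q).max())
cos = (fp * q).sum() / (np.linalg.norm(fp) * np.linalg.norm(q) + 1e-12)
print("cosine sim:  ", cos)
```

A cosine similarity far below 1.0 would point to the quantized graph itself, not the detection-box decoding.)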

Deploying the unquantized ONNX model on the Atlas 500 by the same procedure works and produces normal output. The ONNX-to-om command in that case was:
atc --framework=5 --model=./model/yolov7.onnx --input_shape="images:1,3,640,640" \
    --output=./model/yolov7 --soc_version=Ascend310 \
    --log=info --insert_op_conf=./scripts/aipp.cfg

If I instead take the unquantized ONNX model (the same file size as the quantized one) and convert it with PTQ enabled, using the same command as the first one:
atc --framework=5 --model=./model/yolov7.onnx --input_shape="images:1,3,640,640" \
    --output=./model/yolov7 --compression_optimize=./scripts/compression_opt.config --soc_version=Ascend310 \
    --log=info --insert_op_conf=./scripts/aipp.cfg
inference likewise fails in the same way.

So at this point it looks like adding PTQ is what breaks inference. Is there a way to directly export an ONNX model whose parameters are actually stored in int8 format, so that it can be converted and deployed on Ascend hardware as-is? Would INT8 MKL-DNN be feasible? That is an acceleration method for x86 CPUs; can it be used on Ascend hardware?
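(For what it's worth, one generic way to get an ONNX file whose weights are physically stored as int8, independent of PaddleSlim and of MKL-DNN, is ONNX Runtime's quantization tooling. A minimal sketch; note this is a different technique from ACT, and I make no claim that ATC accepts the resulting quantized graph:

```python
from onnxruntime.quantization import quantize_dynamic, QuantType

# Dynamic weight-only quantization: weights are serialized as int8,
# so the output file is roughly 4x smaller than the float32 model.
quantize_dynamic(
    model_input="./model/yolov7.onnx",
    model_output="./model/yolov7_int8.onnx",
    weight_type=QuantType.QInt8,
)
```
)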


ceci3 (Contributor) commented Feb 6, 2024

@qili93
