integrate with OnnxSlim #12683
Conversation
Codecov Report
Attention: Patch coverage is

Additional details and impacted files:

```
@@            Coverage Diff             @@
##             main   #12683      +/-   ##
==========================================
+ Coverage   70.15%   70.57%   +0.41%
==========================================
  Files         122      122
  Lines       15636    15645       +9
==========================================
+ Hits        10970    11041      +71
+ Misses       4666     4604      -62
```
@Kayzwer Hello, thanks for your effort! Which do you think is better to use: onnx-simplifier or onnx-slim?
Try both.
Also, thanks to @inisis for developing the tool.
Thanks for your PR! onnxslim can achieve better performance.
@inisis @Kayzwer thanks guys! Do you have benchmarks or info on, e.g., YOLOv8n models exported with both onnx-simplifier and onnx-slim? Do the number of layers, number of operations, or file size differ?
@glenn-jocher I will do some experiments with yolov8n.
Thanks a lot, I have tested it myself.
We have tested all the ONNX models generated by tests/test_exports.py; each model was processed by both onnxsim and onnxslim, and the results were logged to a file, which you can find here. Here is the summary ChatGPT generated:
But it is possible to apply the two at the same time, right? (onnxsim first, onnxslim second)
The slimmed model is slightly larger because it contains more detailed tensor value info for the intermediate layers.
I have conducted an experiment using yolov8n_False_1.onnx, which has a non-dynamic shape: I processed it with both sim and slim, converted each result to TensorRT, and here are the results. The slimmed model reaches 1647.92 qps versus 1635.28 qps for the simmed one, and the engine file sizes are 18345652 vs 20483532 bytes.
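For readers who want to reproduce this kind of comparison, a minimal sketch using trtexec, the benchmarking CLI that ships with TensorRT, is below. The filenames are illustrative placeholders, not the exact artifacts from the experiment above:

```python
import subprocess

# Build a TensorRT engine from each optimized ONNX file; trtexec prints
# throughput (qps) at the end of the run. Filenames are illustrative.
for onnx_file in ("yolov8n_sim.onnx", "yolov8n_slim.onnx"):
    engine_file = onnx_file.replace(".onnx", ".engine")
    subprocess.run(
        ["trtexec", f"--onnx={onnx_file}", f"--saveEngine={engine_file}"],
        check=True,
    )
```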
@initialencounter hey, thanks for adding to the discussion! Regarding applying onnx-simplifier and onnx-slim sequentially: you can definitely try using both tools on a model to maximize optimization. Each tool removes different redundancies, so they could potentially complement each other, and applying onnx-simplifier first and then onnx-slim might yield an even more optimized model. However, it's important to thoroughly test the final model to ensure that these optimizations don't degrade its accuracy or expected behavior!
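A minimal sketch of chaining the two tools, assuming onnxsim's `simplify` and onnxslim's `slim` Python entry points (verify the exact signatures against each library's documentation for your installed version):

```python
import onnx
from onnxsim import simplify
from onnxslim import slim

model = onnx.load("yolov8n.onnx")

# Pass 1: onnx-simplifier folds constants and removes redundant nodes.
model_sim, check = simplify(model)
assert check, "onnx-simplifier failed to validate the simplified model"

# Pass 2: onnxslim applies its own graph optimizations on top.
model_slim = slim(model_sim)

onnx.save(model_slim, "yolov8n_sim_slim.onnx")
```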
@glenn-jocher, all my experiments were done separately, with onnxslim and onnxsim applied alone, and I have tested the accuracy of the exported engines; here are the results, using the script from tests/test_cuda.py. From the results we can see that the slimmed engine produces the right results.
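A quick accuracy sanity check in the spirit of that test might look like the sketch below; the exact test script differs, and this simply compares PyTorch and engine predictions on one image:

```python
from ultralytics import YOLO

# Compare predictions from the PyTorch weights and the exported engine.
img = "https://ultralytics.com/images/bus.jpg"
res_pt = YOLO("yolov8n.pt")(img)[0]
res_trt = YOLO("yolov8n.engine")(img)[0]  # assumes the engine was built beforehand

# Detected classes (and box coordinates) should match closely.
print(res_pt.boxes.cls.tolist())
print(res_trt.boxes.cls.tolist())
```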
So using only onnxslim can achieve optimal performance?
@Kayzwer thanks for sharing your findings! Yes, based on your experiments, it seems using only onnxslim has provided satisfactory performance and accuracy. If the slimmed model meets your requirements, you might not need additional optimization with onnx-simplifier. However, different models and applications can benefit variably, so a combined approach could be explored if further optimization is necessary. Keep up the great experimentation!
@glenn-jocher Hi, what else do we need to provide in order to merge onnxslim into Ultralytics?
@inisis the best way to validate both speed and accuracy of the exported models is to export both and then validate on COCO (for both speed and mAP):
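A sketch of such an export-and-validate comparison with the Ultralytics Python API (model and dataset names are illustrative; run it once for the baseline export and once for the slimmed one):

```python
from ultralytics import YOLO

# Export a baseline ONNX model; repeat with the slimmed variant.
YOLO("yolov8n.pt").export(format="onnx")  # -> yolov8n.onnx

# Validate the exported model on COCO for both mAP and speed.
metrics = YOLO("yolov8n.onnx").val(data="coco.yaml")
print(metrics.box.map)             # mAP50-95
print(metrics.speed["inference"])  # per-image inference time in ms
```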
@glenn-jocher Hi, here are the results: the average inference speed for the slim model is 6.4 ms while accuracy stays the same.
Hey @inisis, thanks for sharing the results! It looks like the slim model maintains accuracy while offering a slight improvement in inference speed. This could be beneficial for applications requiring faster processing without a compromise on performance. If there's anything specific you need for integrating onnxslim into Ultralytics, such as further testing or documentation updates, just let me know!
@glenn-jocher Thanks for your quick response. I'm wondering whether there are performance tests for Ultralytics on CI; we could make a PR that uses onnxslim for model optimization and see the speed and accuracy results. Onnxslim is a very useful tool, easy to maintain and cross-platform, and users don't need to make much effort to get a speedup. It would be an honor to have it merged into Ultralytics.
Hey @inisis, absolutely, we do have CI performance tests set up for Ultralytics. You're welcome to submit a PR with onnxslim integration! We can then review the changes together, especially focusing on the speed and accuracy benchmarks. It sounds like a great addition to enhance model performance with minimal user effort. Looking forward to your contribution!
Hi @Kayzwer, there is a conflict here, can you help fix it? I also want to make onnxslim the default option, because it's faster.
Just set slim to true in the default.yaml file.
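If the PR wires that flag through the standard export path, toggling it per call might look like the sketch below. The `slim` argument name is taken from this thread's proposed default.yaml entry, not from a released Ultralytics version:

```python
from ultralytics import YOLO

# Explicit per-call override of the proposed flag; with `slim: True` in
# default.yaml this would become the default export behavior.
YOLO("yolov8n.pt").export(format="onnx", slim=True)
```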
@Kayzwer Can you help fix the merge conflict, or should I create another pull request? The current workflow is a little complicated for me.
Alright, I will fix it later.
Hi @Kayzwer, thanks for offering to help! If you could resolve the merge conflict, that would be great. Setting slim to true in default.yaml to make onnxslim the default also sounds reasonable given the speed results.
🛠️ PR Summary
Made with ❤️ by Ultralytics Actions
🌟 Summary
This PR is a placeholder without actual code changes. 🚫
📊 Key Changes
🎯 Purpose & Impact