Runtime performance for debug compilation with debugger attached #8

Open
breakin opened this issue Dec 20, 2022 · 2 comments

Comments

@breakin

breakin commented Dec 20, 2022

I'm curious if any measurement has been made with a debug build of the library.

Background: my number one use case as a developer is running programs in a debug build with a debugger attached, when trying to find bugs or understand a program I am new to. Hence this performance metric is very valuable to me.

@spnda
Owner

spnda commented Dec 24, 2022

I never really cared about debug performance, as it doesn't show how quick something will be when it ships. However, I've just run the benchmarks in debug mode and I'm surprised that fastgltf is actually slower than both tinygltf and cgltf (except for 2CylinderEngine, where tinygltf is incredibly slow for some reason, taking 0.2 seconds to parse the glTF). Though yes, caring about it does make sense when you are debugging other parts of your application.

My first guess would be that simdjson causes this, as it essentially consists of a lot of small functions (1-4 lines) that would usually be inlined but aren't in a debug compilation, and function calls are exactly the kind of overhead compilers try to optimize away when it is worthwhile.

So, to counter that, I just tried /Ob1 with MSVC which "allows expansion only of functions marked inline, __inline, or __forceinline, [...]" (debug mode usually sets /Ob0, which disables all inlining), which made fastgltf perform more or less just as fast as tinygltf and cgltf. /Ob2 and /Ob3, however, seem to have nearly no effect on the runtime performance in debug mode.
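To make the inlining point concrete, here is a minimal sketch of the kind of code that suffers under /Ob0. `ByteCursor`, `sumBytes`, and `timeSumBytesMs` are hypothetical names made up for illustration; they are not part of simdjson or fastgltf. The accessors are defined inside the class, which makes them implicitly inline and therefore candidates for expansion under /Ob1.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a parser cursor; not part of simdjson or fastgltf.
class ByteCursor {
public:
    explicit ByteCursor(const std::vector<std::uint8_t>& data) : data_(data) {}

    // Tiny accessors, defined in-class and therefore implicitly inline.
    bool atEnd() const { return pos_ == data_.size(); }
    std::uint8_t peek() const { return data_[pos_]; }
    void advance() { ++pos_; }

private:
    const std::vector<std::uint8_t>& data_;
    std::size_t pos_ = 0;
};

// Sums all bytes through the accessors: three calls per byte
// whenever the compiler does not inline them.
std::uint64_t sumBytes(const std::vector<std::uint8_t>& data) {
    ByteCursor cursor(data);
    std::uint64_t sum = 0;
    while (!cursor.atEnd()) {
        sum += cursor.peek();
        cursor.advance();
    }
    return sum;
}

// Returns the wall-clock milliseconds of one sumBytes pass. The sum is
// written to `out` so the work is observable to the caller.
double timeSumBytesMs(const std::vector<std::uint8_t>& data, std::uint64_t& out) {
    const auto start = std::chrono::steady_clock::now();
    out = sumBytes(data);
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Calling `timeSumBytesMs` on a buffer of a few megabytes from a small `main`, once in a build compiled with `/Od /Ob0` and once with `/Od /Ob1`, should give a rough idea of how much of the debug slowdown comes from call overhead alone. The numbers will depend on the machine and the MSVC version, so treat it as a quick experiment rather than a benchmark.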

Below you can find some charts I made for the results I got in debug mode. In both cases, cgltf is a good bit quicker than both tinygltf and fastgltf. Only in the first one is fastgltf also faster than tinygltf, though, likely because of the base64 buffer it has to decode. @breakin, do you perhaps have any idea how one could improve these builds even more?

[Chart: Mean vs 2CylinderEngine (2MB base64 encoded buffer) load from memory (DEBUG)]
[Chart: Mean vs NewSponza load from memory (without images and buffer load) (DEBUG)]

@breakin
Author

breakin commented Dec 24, 2022

First, it doesn't look that bad; maybe it is good enough. I am not at home at my regular machine (due to Christmas), so I can't check now, but I had a hunch that cgltf, being a C library, would show less of a difference between release and debug. Using C++ features (mostly the STL) used to drag down debug performance. I am not sure that is the case with your library, however. I have to check it more!

If I want to replicate your test myself, what do I need to do? I've heard that later versions of Visual Studio have better debug performance, so I would test there and maybe profile with and without debug and see what looks different. If something else becomes a hotspot, that would tell us something!
