Enable LTO in supported compilers #5874

Open
wants to merge 2 commits into
base: main


@In-line In-line commented May 4, 2024

Describe your PR, what does it fix/add?

Enabled LTO, because why not?

Is there anything you want to mention? (unchecked code, possible bugs, found problems, breaking compatibility, etc.)

No

Is it ready for merging, or does it need work?

It's ready

@Agent00Ming
Contributor

Benefits of LTO

LTO can give double digit performance boosts for many programs.
Can lower RAM usage per program making it very useful for limited memory systems.

Downsides of LTO

Can increase compile time by 2 to 3 times.
Uses more RAM during compiling.
Not all programs become faster or smaller.
There is an increased chance of finding build-time or runtime bugs while using it.
Always be prepared to try without it if something is acting odd.

Source: Gentoo wiki

@In-line
Author

In-line commented May 4, 2024

Some stats on my machine (test before the patch)
cmake -G Ninja -B build/ -DCMAKE_BUILD_TYPE=Release -DCMAKE_INTERPROCEDURAL_OPTIMIZATION=ON/OFF

GCC (-flto=auto):
LTO ON: cmake --build build/ --clean-first 298.82s user 32.47s system 1880% cpu 17.613 total
LTO OFF: cmake --build build --clean-first 507.19s user 31.09s system 2270% cpu 23.704 total

Clang (-flto=thin):
LTO ON: cmake --build build/ --clean-first 276.76s user 10.66s system 1997% cpu 14.391 total
LTO OFF: cmake --build build --clean-first 308.75s user 10.49s system 2278% cpu 14.012 total
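
Taking the `total` (wall-clock) column of the timings above at face value, the relative change from LTO OFF to LTO ON can be worked out as follows. This is only a sketch of the arithmetic: `pct_change` is a hypothetical helper, not part of the PR, and it assumes an `awk` is available.

```shell
# pct_change OLD NEW: print the relative change from OLD to NEW
# as a signed percentage with one decimal place.
pct_change() {
    awk -v old="$1" -v new="$2" \
        'BEGIN { printf "%+.1f%%\n", (new - old) / old * 100 }'
}

pct_change 23.704 17.613   # GCC wall time, LTO OFF -> ON: prints -25.7%
pct_change 14.012 14.391   # Clang wall time, LTO OFF -> ON: prints +2.7%
```

These are single runs on one machine, so the Clang difference in particular may be within run-to-run noise.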

❯ clang --version
clang version 17.0.6
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
❯ gcc --version                     
gcc (GCC) 13.2.1 20240417
Copyright (C) 2023 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE
❯ neofetch                                
                   -`                    codemonkey@workstation-01 
                  .o+`                   ------------------------- 
                 `ooo/                   OS: Arch Linux x86_64 
                `+oooo:                  Kernel: 6.8.9-1-cachyos-bore 
               `+oooooo:                 Uptime: 4 hours, 50 mins 
               -+oooooo+:                Packages: 2471 (pacman), 6 (flatpak) 
             `/:-:++oooo+:               Shell: zsh 5.9 
            `/++++/+++++++:              Resolution: 2560x1440 
           `/++++++++++++++:             DE: Hyprland 
          `/+++ooooooooooooo/`           WM: sway 
         ./ooosssso++osssssso+`          Theme: Adwaita [GTK2], Adwaita-dark [GTK3] 
        .oossssso-````/ossssss+`         Icons: Adwaita [GTK2/3] 
       -osssssso.      :ssssssso.        Terminal: vscode 
      :osssssss/        osssso+++.       CPU: AMD Ryzen 9 7950X (32) @ 5.881GHz 
     /ossssssss/        +ssssooo/-       GPU: AMD ATI Radeon RX 7900 XT/7900 XTX/7900M 
   `/ossssso+/:-        -:/+osssso+-     Memory: 19065MiB / 63999MiB 
  `+sso+:-`                 `.-/+oso:
 `++:.                           `-/+/                           
 .`                                 `/                           

@JohnRTitor

Obviously this support is still experimental, but it's a nice addition to have. Thoughts? @vaxerski

@vaxerski
Member

vaxerski commented May 5, 2024

looking at the drawbacks, I'm not convinced this is a good idea.

@JohnRTitor

I do agree that this should not be enabled by default :)
But if the user is adventurous enough to try :)

@In-line
Author

In-line commented May 5, 2024

Well, I'm not sure where it's mentioned that LTO is experimental. Both GCC and Clang claim it's mature. Maybe it was experimental a few years ago, but it isn't now. Chromium, Firefox, and many other much bigger and more complex projects use it.

@vaxerski Is there a good CPU bottleneck benchmark I can use to compare LTO and non-LTO builds?

@vaxerski
Member

vaxerski commented May 5, 2024

no clue, I've never used lto

@JohnRTitor

@In-line can you provide a "patch" for meson based building too? I'll try to build and test it on Nix.

@fufexan
Member

fufexan commented May 5, 2024

@JohnRTitor you can rebase this PR on top of #5667 to test. I'm going to merge that soon.

@JohnRTitor

JohnRTitor commented May 5, 2024

I am not the PR author this time :)
@In-line well, you heard fufexan :)

@Agent00Ming
Contributor

Agent00Ming commented May 5, 2024

I still think this should be left as an "option"; compile times will vary with hardware and feature sets.

Compilation time table for me:
Time    LTO OFF       LTO ON       Change
real    0m55.257s     0m38.746s    -30%
user    12m28.347s    7m28.746s    -40%
sys     0m20.171s     0m24.944s    +25%
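
The percentages in the table above can be rechecked from the raw durations. The sketch below assumes the values are in the `XmY.YYYs` format printed by the shell's `time` builtin; `to_seconds` and `change` are hypothetical helper names introduced only for illustration.

```shell
# to_seconds DURATION: convert a duration such as 12m28.347s to seconds.
to_seconds() {
    echo "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# change OFF ON: signed percentage change from the LTO-off duration
# to the LTO-on duration.
change() {
    awk -v off="$(to_seconds "$1")" -v on="$(to_seconds "$2")" \
        'BEGIN { printf "%+.1f%%\n", (on - off) / off * 100 }'
}

change 0m55.257s 0m38.746s     # real: prints -29.9%
change 12m28.347s 7m28.746s    # user: prints -40.0%
change 0m20.171s 0m24.944s     # sys:  prints +23.7%
```

The posted -30% and -40% match after rounding; the posted +25% for `sys` appears to be a looser rounding of roughly +23.7%.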

@fufexan
Member

fufexan commented May 5, 2024

> I am not the PR author this time :) @In-line well, you heard fufexan :)

I meant more as: clone repo, gh pr checkout 5874, checkout cmake, git rebase In-line:lto.

But the CMake PR is now merged, so a simple rebase should get you up and running.

@fufexan
Member

fufexan commented May 5, 2024

What starship reports in my case:
LTO on: 1m57s
LTO off: 2m44s

@JohnRTitor

GCC LTO by itself does not do much. Clang LTO, especially ThinLTO, is much better.

@JohnRTitor

> @vaxerski Is there a good CPU bottleneck benchmark I can use to compare LTO and non-LTO builds?

Maybe these are not what you are looking for, but can be helpful:

https://www.phoronix.com/review/clang-lto-kernel
https://www.phoronix.com/review/clang-12-opt
https://www.phoronix.com/review/gcc11-rocket-opts
They are pretty outdated though.

@JohnRTitor

Clang LTO: Finished at 20:34:56 after 1m3s
GCC LTO: Finished at 20:28:26 after 1m16s

@In-line
Author

In-line commented May 5, 2024

Hyprland isn't big enough for compilation time to be a CPU bottleneck on modern systems. I don't think compilation time is a metric where we would see a noticeable regression.

I meant CPU-bound benchmarks for Hyprland, to see how much difference LTO makes on weak systems with iGPUs, where the bottleneck might be on the CPU side. As LTO is a performance optimization, it should decrease Hyprland's executable size and increase its execution speed.

I was asking for any benchmarks I can run on a slow GPU to test the improvements that come with LTO.
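
The executable-size half of that claim is the easier one to check. A minimal sketch, assuming two already-built binaries (one with LTO, one without); `size_delta` is a hypothetical helper and only compares file sizes, so it says nothing about execution speed.

```shell
# size_delta BASELINE CANDIDATE: print both file sizes in bytes
# and the relative change from BASELINE to CANDIDATE.
size_delta() {
    base=$(wc -c < "$1")
    cand=$(wc -c < "$2")
    awk -v b="$base" -v c="$cand" \
        'BEGIN { printf "%d -> %d bytes (%+.1f%%)\n", b, c, (c - b) / b * 100 }'
}
```

With two build directories configured via `-DCMAKE_INTERPROCEDURAL_OPTIMIZATION=OFF` and `=ON` as in the earlier timing comment, a call might look like `size_delta build-nolto/Hyprland build-lto/Hyprland`. The directory names and the binary path here are assumptions, not taken from the PR.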

@In-line
Author

In-line commented May 5, 2024

@JohnRTitor Patches for Meson are ready

@nonetrix

nonetrix commented May 14, 2024

I don't know if this is a good idea either, especially if we don't at least benchmark it and see whether there is a meaningful improvement. Has anyone tried getting Hyprland to lag and compared it with and without LTO? A stress test would be a neat idea if someone would like to work on one, if it doesn't already exist; it could also prove useful for improving performance in general, without compiler flags, if we can profile it.

I have compiled my whole Gentoo system with LTO in the past, and NodeJS was the only thing that caused issues, so it's somewhat stable, I guess, but likely still not a good idea. I imagine you might get bigger gains from -O3 or -march=native, though the latter wouldn't always be practical, of course. Maybe this could be added as a build option for those who want it to be faster and don't mind possible bugs? But we would have to check whether it actually is faster, since sometimes it can make things slower.

@In-line
Author

In-line commented May 24, 2024

I think all these conversations about some abstract risk in enabling LTO are pointless, as Hyprland is already included in the ALHP project: https://status.alhp.dev/?pkgbase=hyprland

I don't understand what all the "risk" fuss is about to be fair.

@gnusenpai
Contributor

gnusenpai commented May 25, 2024

So are there any actual requirements for being included in ALHP, other than: "it builds, ship it"? I imagine getting this endorsed here officially will take a bit more than that.
