Enable LTO in supported compilers #5874

In-line · 2024-05-04T13:14:26Z

Describe your PR, what does it fix/add?

Enabled LTO, because why not?

Is there anything you want to mention? (unchecked code, possible bugs, found problems, breaking compatibility, etc.)

No

Is it ready for merging, or does it need work?

It's ready

Agent00Ming · 2024-05-04T13:34:10Z

Benefits of LTO

LTO can give double-digit performance boosts for many programs.
It can lower RAM usage per program, making it very useful on limited-memory systems.

Downsides of LTO

Can increase compile time by 2 to 3 times.
Uses more RAM during compiling.
Not all programs become faster or smaller.
There is an increased chance of finding build-time or runtime bugs while using it.
Always be prepared to try without it if something is acting odd.

(Source: Gentoo wiki)

In-line · 2024-05-04T13:54:51Z

Some stats on my machine (test before the patch)
cmake -G Ninja -B build/ -DCMAKE_BUILD_TYPE=Release -DCMAKE_INTERPROCEDURAL_OPTIMIZATION=ON/OFF

GCC (-flto=auto):
LTO ON: cmake --build build/ --clean-first 298.82s user 32.47s system 1880% cpu 17.613 total
LTO OFF: cmake --build build --clean-first 507.19s user 31.09s system 2270% cpu 23.704 total

Clang (-flto=thin):
LTO ON: cmake --build build/ --clean-first 276.76s user 10.66s system 1997% cpu 14.391 total
LTO OFF: cmake --build build --clean-first 308.75s user 10.49s system 2278% cpu 14.012 total
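
To put the timings above on a common scale, here is a minimal sketch that computes the relative change from LTO OFF to LTO ON. The helper name `pct_change` is made up for illustration, and the numbers are simply copied from the `time` output above. Each configuration was timed once, so the percentages are only indicative.

```shell
#!/bin/sh
# pct_change OLD NEW: print the change from OLD to NEW as a signed percentage.
# (Hypothetical helper, not part of cmake or any other tool.)
pct_change() {
    awk -v old="$1" -v new="$2" 'BEGIN { printf "%+.1f\n", (new - old) / old * 100 }'
}

# Wall-clock ("total") seconds, LTO OFF first, then LTO ON.
echo "GCC   wall: $(pct_change 23.704 17.613)%"
echo "Clang wall: $(pct_change 14.012 14.391)%"

# CPU ("user") seconds, LTO OFF first, then LTO ON.
echo "GCC   user: $(pct_change 507.19 298.82)%"
echo "Clang user: $(pct_change 308.75 276.76)%"
```

With these inputs the wall-clock change comes out at about -25.7% for GCC and +2.7% for Clang.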

❯ clang --version
clang version 17.0.6
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin

❯ gcc --version                     
gcc (GCC) 13.2.1 20240417
Copyright (C) 2023 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE

❯ neofetch                                
                   -`                    codemonkey@workstation-01 
                  .o+`                   ------------------------- 
                 `ooo/                   OS: Arch Linux x86_64 
                `+oooo:                  Kernel: 6.8.9-1-cachyos-bore 
               `+oooooo:                 Uptime: 4 hours, 50 mins 
               -+oooooo+:                Packages: 2471 (pacman), 6 (flatpak) 
             `/:-:++oooo+:               Shell: zsh 5.9 
            `/++++/+++++++:              Resolution: 2560x1440 
           `/++++++++++++++:             DE: Hyprland 
          `/+++ooooooooooooo/`           WM: sway 
         ./ooosssso++osssssso+`          Theme: Adwaita [GTK2], Adwaita-dark [GTK3] 
        .oossssso-````/ossssss+`         Icons: Adwaita [GTK2/3] 
       -osssssso.      :ssssssso.        Terminal: vscode 
      :osssssss/        osssso+++.       CPU: AMD Ryzen 9 7950X (32) @ 5.881GHz 
     /ossssssss/        +ssssooo/-       GPU: AMD ATI Radeon RX 7900 XT/7900 XTX/7900M 
   `/ossssso+/:-        -:/+osssso+-     Memory: 19065MiB / 63999MiB 
  `+sso+:-`                 `.-/+oso:
 `++:.                           `-/+/                           
 .`                                 `/

JohnRTitor · 2024-05-05T00:57:04Z

Obviously this support is still experimental, but it's a nice addition to have. Thoughts? @vaxerski

vaxerski · 2024-05-05T01:04:29Z

looking at the drawbacks, I'm not convinced this is a good idea.

JohnRTitor · 2024-05-05T01:09:11Z

I do agree that this should not be enabled by default :)
But if the user is adventurous enough to try :)

In-line · 2024-05-05T07:02:42Z

Well, I'm not sure where it's mentioned that LTO is experimental. Both GCC and Clang claim it's mature. Maybe it was experimental a few years ago, but it isn't now. Chromium, Firefox, and many other much bigger and more complex projects use it.

@vaxerski Is there a good CPU bottleneck benchmark I can use to compare LTO and non-LTO builds?

vaxerski · 2024-05-05T12:18:30Z

no clue, I've never used lto

JohnRTitor · 2024-05-05T12:38:13Z

@In-line can you provide a "patch" for meson based building too? I'll try to build and test it on Nix.

fufexan · 2024-05-05T12:59:16Z

@JohnRTitor you can rebase this PR on top of #5667 to test. I'm going to merge that soon.

JohnRTitor · 2024-05-05T13:09:57Z

I am not the PR author this time :)
@In-line well, you heard fufexan :)

Agent00Ming · 2024-05-05T13:31:07Z

I still think this should be left as an "option"; compile times will vary with hardware and feature sets.

Compilation time table for me:

LTO	OFF	ON	Change
real	0m55.257s	0m38.746s	-30%
user	12m28.347s	7m28.746s	-40%
sys	0m20.171s	0m24.944s	+25%
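
The percentages in the last column can be rechecked from the raw figures. A rough sketch, assuming the `<min>m<sec>s` format that the shell's `time` prints; `to_seconds` and `pct_change` are hypothetical helper names, not part of any tool:

```shell
#!/bin/sh
# to_seconds 12m28.347s: convert a "<min>m<sec>s" duration to plain seconds.
to_seconds() {
    printf '%s\n' "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# pct_change OFF ON: signed percentage change between two such durations.
pct_change() {
    awk -v off="$(to_seconds "$1")" -v on="$(to_seconds "$2")" \
        'BEGIN { printf "%+.1f\n", (on - off) / off * 100 }'
}

echo "real: $(pct_change 0m55.257s 0m38.746s)%"
echo "user: $(pct_change 12m28.347s 7m28.746s)%"
echo "sys:  $(pct_change 0m20.171s 0m24.944s)%"
```

For the values in the table this gives about -29.9%, -40.0%, and +23.7%, so the percentages listed there appear to be rounded estimates.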

fufexan · 2024-05-05T13:41:41Z

> I am not the PR author this time :)
> @In-line well, you heard fufexan :)

I meant more as: clone repo, gh pr checkout 5874, checkout cmake, git rebase In-line:lto.

But the CMake PR is now merged, so a simple rebase should get you up and running.

fufexan · 2024-05-05T14:07:49Z

What starship reports in my case:
LTO on: 1m57s
LTO off: 2m44s

JohnRTitor · 2024-05-05T14:32:17Z

GCC LTO by itself does not do much. Clang LTO, especially ThinLTO, is much better.

JohnRTitor · 2024-05-05T14:34:13Z

> @vaxerski Is there a good CPU bottleneck benchmark I can use to compare LTO and non-LTO builds?

Maybe these are not what you are looking for, but they may be helpful:

https://www.phoronix.com/review/clang-lto-kernel
https://www.phoronix.com/review/clang-12-opt
https://www.phoronix.com/review/gcc11-rocket-opts
They are pretty outdated though.

JohnRTitor · 2024-05-05T15:07:17Z

Clang LTO: Finished at 20:34:56 after 1m3s
GCC LTO: Finished at 20:28:26 after 1m16s

In-line · 2024-05-05T16:03:04Z

Hyprland isn't big enough for compilation to be CPU-bottlenecked on modern systems. I don't think compilation time is a metric that regresses noticeably for us.

I meant CPU bottleneck benchmarks for Hyprland itself, to see how much difference LTO makes on weak systems with iGPUs, where the bottleneck might be on the CPU side. As LTO is a performance optimization, it should decrease the Hyprland executable size and increase its execution speed.

I was asking for any benchmarks I can run on a slow GPU to test the improvements that come with LTO.

In-line · 2024-05-05T19:17:51Z

@JohnRTitor Patches for Meson are ready

nonetrix · 2024-05-14T23:44:35Z

I don't know if this is a good idea either, especially if we don't at least benchmark it and see whether there is a meaningful improvement. Has anyone tried getting Hyprland to lag and compared builds with and without LTO? A stress test would be a neat idea if someone would like to work on one, assuming it doesn't already exist; it could also prove useful for improving performance in general, without compiler flags, if we can profile it. I have compiled my whole system on Gentoo with LTO in the past, and NodeJS was the only thing that caused issues, so it's somewhat stable, I guess, but likely still not a good idea. I imagine you might get bigger gains from -O3 or -march=native, although the latter wouldn't always be practical, of course. Maybe this could be added as a build option for those who want it to be faster and don't mind possible bugs? But we would have to check whether it actually is faster, since sometimes it can make things slower.

In-line · 2024-05-24T21:28:59Z

I think all these conversations about some abstract risk in enabling LTO are pointless, as Hyprland is already included in the ALHP project: https://status.alhp.dev/?pkgbase=hyprland

To be fair, I don't understand what all the "risk" fuss is about.

gnusenpai · 2024-05-25T23:41:57Z

So are there any actual requirements for being included in ALHP, other than: "it builds, ship it"? I imagine getting this endorsed here officially will take a bit more than that.

In-line force-pushed the lto branch from 45d5636 to c5290e1 · May 5, 2024 16:05

In-line added 2 commits May 5, 2024 23:17

Enable LTO in supported compilers for CMake

efd0a86

Enable LTO in suported compilers for Meson

5f733d0

In-line force-pushed the lto branch from c5290e1 to 5f733d0 · May 5, 2024 19:17
