Use zstd compression #478

Closed
probonopd opened this issue Sep 16, 2017 · 60 comments · May be fixed by #1091 or AppImageCommunity/libappimage#158

Comments

@probonopd
Member

Add zstd compression and decompression support to SquashFS. zstd is a
great fit for SquashFS because it can compress at ratios approaching xz,
while decompressing twice as fast as zlib. For SquashFS in particular,
it can decompress as fast as lzo and lz4. It also has the flexibility
to turn down the compression ratio for faster compression times.

The compression benchmark is run on the file tree from the SquashFS archive
found in ubuntu-16.10-desktop-amd64.iso [1]. It uses `mksquashfs` with the
default block size (128 KB) and various compression algorithms/levels.
xz and zstd are also benchmarked with 256 KB blocks. The decompression
benchmark times how long it takes to `tar` the file tree into `/dev/null`.
See the benchmark file in the upstream zstd source repository located under
`contrib/linux-kernel/squashfs-benchmark.sh` [2] for details.

I ran the benchmarks on an Ubuntu 14.04 VM with 2 cores and 4 GiB of RAM.
The VM is running on a MacBook Pro with a 3.1 GHz Intel Core i7 processor,
16 GB of RAM, and an SSD.

| Method         | Ratio | Compression MB/s | Decompression MB/s |
|----------------|-------|------------------|--------------------|
| gzip           |  2.92 |               15 |                128 |
| lzo            |  2.64 |              9.5 |                217 |
| lz4            |  2.12 |               94 |                218 |
| xz             |  3.43 |              5.5 |                 35 |
| xz 256 KB      |  3.53 |              5.4 |                 40 |
| zstd 1         |  2.71 |               96 |                210 |
| zstd 5         |  2.93 |               69 |                198 |
| zstd 10        |  3.01 |               41 |                225 |
| zstd 15        |  3.13 |             11.4 |                224 |
| zstd 16 256 KB |  3.24 |              8.1 |                210 |

This patch was written by Sean Purcell <me@seanp.xyz>, but I will be
taking over the submission process.

[1] http://releases.ubuntu.com/16.10/
[2] https://github.com/facebook/zstd/blob/dev/contrib/linux-kernel/squashfs-benchmark.sh

zstd source repository: https://github.com/facebook/zstd

Signed-off-by: Sean Purcell <me@seanp.xyz>
Signed-off-by: Nick Terrell <terrelln@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
Acked-by: Phillip Lougher <phillip@squashfs.org.uk>
@develar
Contributor

develar commented Oct 7, 2017

It would be amazing. electron-builder uses lzma for Windows NSIS targets, with an expected size of 35 MB. AppImage uses deflate and comes out at ~51 MB (slightly larger than it needs to be, because 7za is not used and the compression is not as good; 7za can compress the same data to 50 MB, a 1.5 MB difference).

Yes, we use lzma2 for windows portable and users don't complain that start time is slow, but anyway, for now I decided to not use xz for AppImage because of decompression speed.

@Calinou

Calinou commented Oct 8, 2017

Will AppImages using zstd work on older kernels, such as the ones found in Ubuntu LTS, CentOS or Debian?

@probonopd
Member Author

probonopd commented Oct 9, 2017

If we implemented zstd, we would implement it in FUSE, which means it would be independent of the kernel. But we would lose the ability to loop-mount such AppImages using old kernels (is anyone ever doing that, anyway?)

@probonopd
Member Author

probonopd commented Oct 9, 2017

@develar if it is not too much trouble, could you give the following numbers please?

  • Windows uncompressed __ MB, Windows compressed with lzma2 __ MB = __% compression ratio
  • Linux uncompressed __ MB, Linux compressed AppImage (without xz) __ MB = __% compression ratio

@TheAssassin
Member

But we would lose the ability to loop-mount such AppImages using the kernel

We wouldn't lose it, but it won't work on old systems, right? I assume that on newer systems, it should work out of the box, but on older ones, it'd produce an error message.

@probonopd
Member Author

Correct. Sorry, I meant "old kernels". Corrected above.

@probonopd
Member Author

Zstandard support has been added to squashfs-tools, so if I am not mistaken nothing is stopping us now from using Zstandard for AppImages.

Volunteers for a PR?

@probonopd
Member Author

sudo apt install make gcc

zstd library

git clone https://github.com/facebook/zstd/
cd zstd
sudo make install
cd ..

squashfs-tools

git clone https://github.com/plougher/squashfs-tools
cd squashfs-tools/squashfs-tools
make -j8 ZSTD_SUPPORT=1 GZIP_SUPPORT=0 COMP_DEFAULT=zstd mksquashfs
cd ../..

squashfuse

git clone https://github.com/vasi/squashfuse
...
--with-zstd=PREFIX

@probonopd
Member Author

https://github.com/vasi/squashfuse/releases/tag/0.1.101 now contains zstd support.

@TheAssassin
Member

We should update it in a PR branch, and check whether it builds with that new version.

@Ambyjkl

Ambyjkl commented Apr 25, 2020

Any updates on this issue? I think by this point, zstd support is pretty universal

@probonopd
Member Author

Hi @Ambyjkl looks like no one has done any work on this. First step would be to scientifically prove that it actually provides some advantage over what we have in place now.

@Ambyjkl

Ambyjkl commented Nov 27, 2020

@probonopd sorry I kinda forgot about this issue. From my understanding, using zstd would be a one-line change here, right?
https://github.com/AppImage/AppImageKit/blob/master/src/appimagetoolnoglib.c#L158

@brunoais

@Ambyjkl I believe the compression level should be selectable, so that one can balance the space saved against the resources spent on compression.

@denji

denji commented Dec 1, 2020

https://github.com/mhx/dwarfs

@brunoais

brunoais commented Dec 1, 2020

@denji I suggest you open a new issue with that instead

@mgord9518

mgord9518 commented Apr 13, 2022

@denji how does it compare to SquashFS? From what I read it seems to target extremely redundant data, which a lot of AppImages aren't (likely only contain a few text files and sparse data, while precompressed images and binary take up the bulk).

EDIT: after doing a little more reading on it and some of my own tests, DwarFS is really impressive. I got 2x better compression than SquashFS for a given AppImage, both using ZSTD. A big issue is its size, though: static linking gives >8MB executables, which is enormous for this kind of use case. Compressing the runtime could drop that quite a bit, but it would still require medium to large apps for that weight to break even.

@probonopd ZSTD gets significantly better performance at the same or even better compression (>3x read speed at same compression; ~2.7x at max). This gain is definitely enough to warrant adding support, if not defaulting to it. I can send a PR when I have some extra free time if implementing it is the only issue.

@Ambyjkl

Ambyjkl commented Apr 13, 2022

@mgord9518 thanks for taking time to add this :). I got started with adding zstd support here #1091, you can reference that. Do make sure your code is well-tested or communicate properly that tests will be added in the future (or bad things will happen)

@TheAssassin
Member

Zstd seems more and more promising nowadays. We should really pick up #1091 again. I'll provide feedback asap.

@mgord9518

@TheAssassin

Reading the makefile for AppImageKit's runtime, there's a TODO mentioning splitting apart the builds per each compression type (which would be absolutely necessary with ZSTD considering its relatively larger library size), would the preferred method of achieving this be environment variables?

Eg:

make WITH_ZSTD=1

make WITH_GZIP=1

Etc...

@probonopd
Member Author

probonopd commented May 29, 2022

@gsantner what you are suggesting is roughly along the lines of what @TheAssassin and I had brainstormed recently. So I think we should turn it into action soon-ish. By the way, we are always looking for volunteers. :-)

Like where could I find LibreWolf (firefox)?

appimage.github.io lists AppImages that have passed the automated test on the oldest still-supported LTS release of Ubuntu. I tried to add LibreWolf three times, but all tests so far have failed:

OK, it's now 2022 so maybe I should give it another try...

...and it seems like the test finally succeeded 👍

https://appimage.github.io/LibreWolf/

@probonopd
Member Author

probonopd commented May 29, 2022

don't notice any (better or worse) performance difference

Maybe, if you have some time, you could benchmark this with different parameters for block sizes and compression levels? And document the result, so that we can pick ideal defaults in upcoming tools.

@Samueru-sama

Hi, I would like to recompress some appimages I have to zstd to improve the execution time, I did some tests by manually compressing the .AppDir to gzip vs zstd:1 and the difference is quite significant in the decompression speed:

tar -xzf Brave.AppDIr.tar.gz  2.90s user 0.75s system 116% cpu 3.127 total
tar -I zstd -xf Brave.AppDIr.tar.zst  0.63s user 0.58s system 202% cpu 0.599 total

If I try to use appimage tool it tells me that "Only gzip (faster execution, larger files) and xz (slower execution, smaller files) compression is supported at the moment".

Sorry if this isn't the right place to ask this.

@probonopd
Member Author

Please try the next-generation https://github.com/AppImage/appimagetool/ - does it work with that one?

@Samueru-sama

Samueru-sama commented Feb 13, 2024

Please try the next-generation https://github.com/AppImage/appimagetool/ - does it work with that one?

Thanks, this one works. Here are some benchmarks:

The new Brave zstd appimage is 150 MB while the original appimage is 161 MB.

Original:

./Brave.AppImage --appimage-extract > /dev/null  1.84s user 0.29s system 85% cpu 2.499 total
./Brave.AppImage --appimage-extract > /dev/null  1.77s user 0.30s system 84% cpu 2.440 total

Zstd:

./Bravezstd.AppImage --appimage-extract  0.64s user 0.30s system 99% cpu 0.936 total
./Bravezstd.AppImage --appimage-extract  0.60s user 0.38s system 99% cpu 0.980 total

The extraction time is significantly faster.

(Something interesting: running --appimage-extract on the zstd AppImage doesn't print the list of files on the terminal, but running it on the original AppImage does.)

And here are benchmarks of the startup time of the appimage:

Original AppImage:

Time taken: 2.49 seconds
Time taken: 2.69 seconds
Time taken: 2.70 seconds

Zstd appimage:

Time taken: 4.19 seconds
Time taken: 4.19 seconds
Time taken: 4.15 seconds

Huh, that's weird. It takes longer to start with zstd. How can I find which compression algo is being used on the original appimage? And do I have to do something other than ./appimagetool-x86_64.AppImage --comp zstd Brave.AppDir when creating zstd appimage that improves the startup time?

Also what's the default compression level when using zstd and can it be changed?

@probonopd
Member Author

That's weird indeed. I mean, we picked zstandard because it should be faster at runtime... unfortunately I don't know all the answers, more investigation is welcome. At least the algorithm, block size, and compression level play into this. We have to make tradeoffs between image size, runtime speed, and zsync efficiency. To find out the compression algorithm of the first AppImage, you might run it with --appimage-offset and then remove that number of bytes from the beginning of the file (in other words, make a copy of the file skipping that number of bytes). This way, you get the "pure" squashfs file, which you can run the file command on. With some luck it should show you the compression algorithm. Maybe running unsquashfs on it shows even more information, I am not sure whether it shows the compression level and block size, though.
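The procedure described above can be sketched as a small shell helper. This is only an illustration under assumptions: the function name `extract_squashfs` and the file names in the usage comments are made up, and it relies on the AppImage printing its payload offset when run with --appimage-offset, as described in the comment.

```shell
#!/bin/sh
# Copy everything after the first OFFSET bytes of FILE_IN into FILE_OUT.
# tail -c +N starts output at byte N (1-based), hence OFFSET + 1.
extract_squashfs() {
    file_in=$1
    offset=$2
    file_out=$3
    tail -c "+$((offset + 1))" "$file_in" > "$file_out"
}

# Hypothetical usage (file names are placeholders):
#   offset=$(./Brave.AppImage --appimage-offset)
#   extract_squashfs Brave.AppImage "$offset" Brave.squashfs
#   file Brave.squashfs           # may name the compression algorithm
#   unsquashfs -s Brave.squashfs  # prints the superblock information
```

The superblock output of unsquashfs -s includes the compression algorithm and block size; whether the compression level is reported may depend on the squashfs-tools version and on how the image was built.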

@Samueru-sama

That's weird indeed. I mean, we picked zstandard because it should be faster at runtime... unfortunately I don't know all the answers, more investigation is welcome. At least the algorithm, block size, and compression level play into this. We have to make tradeoffs between image size, runtime speed, and zsync efficiency. To find out the compression algorithm of the first AppImage, you might run it with --appimage-offset and then remove that number of bytes from the beginning of the file (in other words, make a copy of the file skipping that number of bytes). This way, you get the "pure" squashfs file, which you can run the file command on. With some luck it should show you the compression algorithm. Maybe running unsquashfs on it shows even more information, I am not sure whether it shows the compression level and block size, though.

I just checked the source of the appimage, they use gzip: https://github.com/srevinsaju/Brave-AppImage/blob/master/.github/workflows/release.yml

./appimagetool.AppDir/AppRun --comp gzip "$APPDIR" -n -u

I just went and recreated the steps, this time recompressing the extracted dir to gzip with the older version of appimagetool I have (the newer one tells me that it can only do zstd) and now it took 2.7s to start. So that rules out an issue with recreating the appimage.

If you want you can try to replicate my results, I extracted it, renamed it to Brave.AppDir and used appimage tool to turn it into a zstd appimage and compare the startup time of both.

What's the default zstd compression level used by appimagetool? If it is something like 9, maybe that is what causes the increased delay. Many of the tests and benchmarks I've seen indicate that anything higher than 2 is rarely needed: it costs something like a 20% decrease in decompression speed for a 2% gain in compression ratio.

@probonopd
Member Author

probonopd commented Feb 13, 2024

Yes, we have never really systematically analyzed the optimal compression levels.

Looks like we are currently invoking mksquashfs without any particular compression level, so whatever it uses by default gets used:

https://github.com/AppImage/appimagetool/blob/bfe6e0c1c663b6f58c0759d942abc3a6d1729c75/src/appimagetool.c#L156-L164

Maybe you'd like to add a different compression level there and compare the results. Thanks for your contribution!

@Samueru-sama

Samueru-sama commented Feb 13, 2024

Yes, we have never really systematically analyzed the optimal compression levels.

Looks like we are currently invoking mksquashfs without any particular compression level, so whatever it uses by default gets used:

https://github.com/AppImage/appimagetool/blob/bfe6e0c1c663b6f58c0759d942abc3a6d1729c75/src/appimagetool.c#L156-L164

Maybe you'd like to add a different compression level there and compare the results. Thanks for your contribution!

If it is using the default, it means that it is using zstd:3. The difference between levels 1 and 3 isn't big enough that I would expect it to cause the nearly 2-second extra delay. But it would need to be tested either way.

Thanks for pointing out what to edit in the code of appimage tool, unfortunately that is beyond my knowledge at this point, the build instructions are for docker (which I don't know how to use) and there isn't an aur appimagetool-git package that I could use as reference to make it.

I might test other appimages gzip vs zstd to see if it still takes longer to start with those.

EDIT: Doing the same test with the librewolf appimage.

Original startup time:

Time taken: 2.70 seconds
Time taken: 2.81 seconds
Time taken: 2.70 seconds

Zstd:

Time taken: 3.79 seconds
Time taken: 3.57 seconds
Time taken: 3.67 seconds

It is also slower with the default zstd compression.

@JulianGro

I wonder if a bigger AppImage, like Overte (>500MiB) would be an interesting benchmark. https://overte.org/downloads.html
Though, like you said, the ~2 second delay seems fishy. Zstd is way faster than gzip, at least that is my experience when using them as BTRFS filesystem compression.

@mgord9518

@JulianGro Depends on implementation. Using libdeflate closes the gap but ZSTD is still faster. Libdeflate is 2x as fast as zlib, while zstd is 3x as fast for normal usage.

@probonopd
Member Author

probonopd commented Feb 14, 2024

Thanks for pointing out what to edit in the code of appimage tool, unfortunately that is beyond my knowledge at this point, the build instructions are for docker

Alright, I see. A much easier and quicker way to do some tests would be to

  1. Extract an example AppImage, e.g., the LibreWolf one
  2. Using mksquashfs, make different images out of it with different algorithms, compression levels, block sizes
  3. Download https://github.com/AppImage/type2-runtime/releases/download/continuous/runtime-x86_64
  4. Using cat, create AppImage files by combining the runtime with the squashfs file (just append the squashfs to the runtime)
  5. Benchmark the resulting AppImages
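Steps 2 and 4 above can be sketched roughly as follows. The function names, output file names, and the list of levels are assumptions for illustration only; the mksquashfs options used (-comp, -Xcompression-level, -noappend) are standard squashfs-tools options.

```shell
#!/bin/sh
# Build one SquashFS image from an AppDir at a given zstd level.
# -noappend overwrites an existing image instead of appending to it.
make_image() {
    appdir=$1
    level=$2
    mksquashfs "$appdir" "zstd$level.squashfs" \
        -comp zstd -Xcompression-level "$level" -noappend
}

# Concatenate runtime and SquashFS image into a new executable file.
# Writing to a new file keeps the downloaded runtime reusable.
assemble_appimage() {
    runtime=$1
    image=$2
    out=$3
    cat "$runtime" "$image" > "$out" && chmod +x "$out"
}

# Hypothetical usage:
#   for level in 1 3 6 15; do
#       make_image LibreWolf.AppDir "$level"
#       assemble_appimage runtime-x86_64 "zstd$level.squashfs" \
#           "LibreWolf-zstd$level.AppImage"
#   done
```

To vary the block size as well, mksquashfs accepts -b (for example -b 256K); that option is left out here to keep the sketch short.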

Please let me know if you'd like me to elaborate on how exactly to do that.

@Samueru-sama

Samueru-sama commented Feb 14, 2024

Thanks for pointing out what to edit in the code of appimage tool, unfortunately that is beyond my knowledge at this point, the build instructions are for docker

Alright, I see. A much easier and quicker way to do some tests would be to

  1. Extract an example AppImage, e.g., the LibreWolf one
  2. Using mksquashfs, make different images out of it with different algorithms, compression levels, block sizes
  3. Download https://github.com/AppImage/type2-runtime/releases/download/continuous/runtime-x86_64
  4. Using cat, create AppImage files by combining the runtime with the squashfs file (just append the squashfs to the runtime)
  5. Benchmark the resulting AppImages

Please let me know if you'd like me to elaborate on how exactly to do that.

Alright I think I got it. The only thing I'm not sure if I did right was the append part with cat, I did this with the zstd1 squashfs image:

cat zstd1.squashfs >> runtime-x86_64

And now runtime became the appimage, its size increased to match the size of the squashfs image and it launches librewolf when run so I assume I got it right.

Here are the results for several AppImages. I made several copies of the runtime, appended squashfs images with different compression levels to each, and benchmarked the startup time of each. I only changed the compression level, nothing else:

zstd3 (which is 118MB):

Time taken: 1.82 seconds
Time taken: 1.93 seconds
Time taken: 1.93 seconds

zstd1 (125MB):

Time taken: 1.82 seconds
Time taken: 1.82 seconds
Time taken: 1.93 seconds

zstd6 (110MB):

Time taken: 1.93 seconds
Time taken: 2.04 seconds
Time taken: 2.04 seconds

And for reference, here is the startup time of the original LibreWolf AppImage (likely gzip), which is 112MB:

Time taken: 2.36 seconds
Time taken: 2.9 seconds
Time taken: 2.58 seconds

This is good, even zstd6 is faster by a considerable amount!

And here is the startup time of librewolf zstd appimage, created using ./appimagetool-x86_64.AppImage --comp zstd librewolf.AppDir

I think appimagetool is doing more than just using the default zstd 3 compression level, because the resulting AppImage is 97 MB in size?! That suggests it is using a very high compression level.

Time taken: 3.43 seconds
Time taken: 3.43 seconds
Time taken: 3.64 seconds

That explains everything. Something in appimagetool produces a very highly compressed AppImage. On one hand that is good, because the size shrinks considerably; on the other hand the application is now considerably slower on startup.

Maybe for applications like web browsers the documentation should state not to go too hard on the compression level. Even zstd1 is already smaller than a native package of LibreWolf would be (340 MB), so I only see downsides to using too high a compression level for this type of application.

Thanks for the help probono, I hope I did all the steps correctly.

Here is how I'm benchmarking the startup time, it is a launcher script that launches the appimage in the same directory and will stop counting once the i3wm window class matches the one of the appimage which for librewolf is 'LibreWolf' (I have i3 configured to automatically focus on new windows once they spawn).

#!/bin/bash

# Get the directory of the script
DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" &> /dev/null && pwd )"

# Start the application (replace 'APPLICATION' with the actual application name)
"$DIR/APPLICATION" &

# Get the start time in nanoseconds
start_time=$(date +%s%N)

# Loop until the window class matches "WINDOWCLASSHERE"
while true; do
    window_class=$(i3-msg -t get_tree | jq -r ".. | select(.focused? == true) | .window_properties.class")
    if [[ "$window_class" == "WINDOWCLASSHERE" ]]; then
        break
    fi
    # Sleep for a bit to prevent high CPU usage
    sleep 0.1
done

# Get the end time in nanoseconds
end_time=$(date +%s%N)

# Calculate the time taken and convert to seconds
time_taken=$(echo "scale=2; ($end_time - $start_time) / 1000000000" | bc)

# Output the time it took to run
echo "Time taken: $time_taken seconds"

The benchmark script was written by AI, so no idea how terrible it might be, but it has worked well so far haha.

Edit: I also tested brave with the mksquashfs and append method, the startup time is 1.63 seconds when using zstd1, very good!

@brunoais

The benchmark script was written by AI, so no idea how terrible it might be, but it has worked well so far haha.

The main issue is the 0.1 s sleep, which means the numbers you show are only reliable to within about -0.2 to 0 seconds. The second decimal digit carries no reliable information, and the first decimal digit is so-so.

As long as you stay in the realm of the 0.5s, it's ok

@mralusw

mralusw commented Apr 5, 2024

@Samueru-sama @probonopd the mksquashfs default is zstd:15. Nothing to do with appimagetool.

Luckily it's simple enough to change the level, by passing appimagetool the --mksquashfs-opt flag twice (because mksquashfs doesn't take -Xcompression-level=N — it insists on two separate CLI args; hey, what did you expect from a tool that seems bent on having inconvenient usage?):

appimagetool --comp zstd --mksquashfs-opt -Xcompression-level --mksquashfs-opt 3 ...

@mralusw

mralusw commented Apr 5, 2024

@probonopd if you want to implement different compression levels, I'd suggest accepting "zstd:N" (just as mount.btrfs accepts). There are tools that call appimagetool internally (linuxdeploy), and an extra level of CLI arg wrapping gets a bit insane.
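As a rough illustration of the "zstd:N" idea, a wrapper could split such a spec into the two-argument form that mksquashfs expects. The function name `comp_to_mksquashfs_args` and the spec syntax are hypothetical; appimagetool does not necessarily accept this form.

```shell
#!/bin/sh
# Translate a hypothetical "ALGO[:LEVEL]" spec into mksquashfs arguments.
comp_to_mksquashfs_args() {
    spec=$1
    algo=${spec%%:*}               # text before the first ':'
    case $spec in
        *:*) level=${spec#*:} ;;   # text after the first ':'
        *)   level= ;;
    esac
    if [ -n "$level" ]; then
        printf '%s\n' "-comp $algo -Xcompression-level $level"
    else
        printf '%s\n' "-comp $algo"
    fi
}

comp_to_mksquashfs_args zstd:3
comp_to_mksquashfs_args gzip
```

Note that not every mksquashfs compressor takes -Xcompression-level (xz, for instance, does not), so a real implementation would need per-algorithm handling.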

@probonopd
Member Author

Well, ideally we find the optimal ("balanced") settings and set them for everyone. So that we make the choice, and not every application developer has to think about which compression level to use.

@mralusw

mralusw commented Apr 5, 2024

Well, ideally we find the optimal ("balanced") settings and set them for everyone. So that we make the choice, and not every application developer has to think about which compression level to use.

So in essence you're suggesting making it worse than it already is (because right now I can set all the knobs, including block size, to what I know is good for my specific use case). It is a trend I find in a lot of software nowadays, let's not name names.

Have your mythical "balanced" setting, but don't make me break open and re-create the appimage if your compression level, squashfs block size etc don't suit me.

@mralusw

mralusw commented Apr 5, 2024

Frankly I doubt there is an optimal level. It depends on the contents, but also on the purpose and nature of the appimage. The app can start fast or slow independently from the unpacked size (maybe it takes ten seconds to scan the system or connect to a server before it's even usable). The developer might favor a small size for easy downloads, or little compression for a fast startup, and you wouldn't know. Etcetera.

Researching the "balanced" setting would be a full time job of surveying your "customers" (appimage developers), and still half of them, by definition won't be pleased with the default. It's your call, but maybe there are more useful things to implement (if anything new gets implemented at this point).

@Samueru-sama

If you ask me the optimal level is zstd:1, it gives similar comp ratios to gzip while being much faster at decompression.

Something similar also happens when zstd is used on Btrfs for filesystem-level compression: it is only zstd:1 that makes sense, and sometimes zstd:2 is better under some workloads, but the difference is less than 2%.

@mgord9518

@Samueru-sama If you're going to go that low, you might as well use lz4_hc, which also has compression similar to gzip but is about twice as fast as zstd and has a much smaller space overhead in the runtime.

zstd shines at mid-high compression levels, so "optimal" is probably in the range of 12-17 (15 is the default level for mksquashfs). So you'd get better compression than gzip and still get substantially faster decompression

Developers should definitely be able to choose something else if it provides a better experience for their application though

@Samueru-sama

@Samueru-sama If you're going to go that low, you might as well use lz4_hc, which also has compression similar to gzip but is about twice as fast as zstd and has a much smaller space overhead in the runtime.

zstd shines at mid-high compression levels, so "optimal" is probably in the range of 12-17 (15 is the default level for mksquashfs). So you'd get better compression than gzip and still get substantially faster decompression

Developers should definitely be able to choose something else if it provides a better experience for their application though

How can I benchmark lz4_hc? I only found a github issue where the user was told to use lz4:9 instead.

Either way I just did a quick test using the librewolf AppDIr as a compressed tar file:

~/ time unzstd Librewolf-zstd-15.tar.zst
Librewolf-zstd-15.tar.zst: 349124608 bytes                                     
unzstd Librewolf-zstd-15.tar.zst  0.55s user 0.25s system 152% cpu 0.529 total

~/ time unzstd Librewolf-zstd-12.tar.zst
Librewolf-zstd-12.tar.zst: 349124608 bytes                                     
unzstd Librewolf-zstd-12.tar.zst  0.53s user 0.29s system 148% cpu 0.551 total

~/ time unzstd Librewolf-zstd-1.tar.zst 
Librewolf-zstd-1.tar.zst: 349124608 bytes                                      
unzstd Librewolf-zstd-1.tar.zst  0.50s user 0.24s system 150% cpu 0.491 total

~/ time unlz4 Librewolf-lz4-9.tar.lz4 
Decoding file Librewolf-lz4-9.tar 
Librewolf-lz4-9.tar. : decoded 349124608 bytes                                 
unlz4 Librewolf-lz4-9.tar.lz4  0.36s user 0.18s system 99% cpu 0.538 total

And the sizes of the files were 86.5MiB for zstd:15, 87.7MiB for zstd:12, 116.7MiB for zstd:1 and 124.8MiB for lz4:9.
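For comparison with the ratio column in the table at the top of the thread, the sizes above can be turned into compression ratios. A minimal sketch, assuming the 349124608-byte figure printed by unzstd is the uncompressed size and the MiB figures are the compressed sizes; the helper name `ratio` is made up:

```shell
#!/bin/sh
# Compression ratio = uncompressed bytes / compressed size (given in MiB).
ratio() {
    LC_ALL=C awk -v bytes="$1" -v mib="$2" \
        'BEGIN { printf "%.2f\n", bytes / (mib * 1048576) }'
}

ratio 349124608 86.5     # zstd:15
ratio 349124608 87.7     # zstd:12
ratio 349124608 116.7    # zstd:1
ratio 349124608 124.8    # lz4:9
```

These are ratios for a single tar stream, so they are not directly comparable to block-wise SquashFS compression.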

Which yeah these benchmarks give the idea that zstd:12-15 is good, but as seen in the tests above when actually opening the appimage the default zstd compression of mksquashfs was adding a considerable delay, even slower than gzip.

@mralusw

mralusw commented Apr 6, 2024

If you measure sub-second timings (which have a considerable variance as I can see from the various timings you quoted) use sharkdp/hyperfine. E.g. hyperfine -r 10 'cmd1 ...' 'cmd2 ...'. Otherwise you're measuring noise.
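Applied to the extraction test from earlier in the thread, that could look roughly like this. It is a sketch, not a tested recipe: the function name and AppImage file names are placeholders, and it assumes hyperfine is installed. The --prepare command runs before every timed run and removes the previous extraction so that each run starts from the same state.

```shell
#!/bin/sh
# Compare extraction time of two AppImages in the current directory.
# --appimage-extract unpacks into ./squashfs-root, which --prepare deletes.
bench_extract() {
    hyperfine --runs 10 --prepare 'rm -rf squashfs-root' \
        "./$1 --appimage-extract" \
        "./$2 --appimage-extract"
}

# Hypothetical usage:
#   bench_extract Brave.AppImage Bravezstd.AppImage
```

hyperfine reports the mean and standard deviation per command, which makes it easier to see whether a difference exceeds the run-to-run noise.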

@mgord9518

@Samueru-sama What I do is make SquashFS images of various levels, then test reading and other simple file operations. FUSE also caches files, so you get a significant speed-up after the first read. I've never had any issues with lz4_hc, what was the issue you saw about?

As for the AppImages, I'm not sure why higher zstd ratios would be so much slower, that shouldn't be the case because using squashfuse I didn't seem to notice any massive slow-downs, certainly not any that would make it slower than gzip.

When benchmarking SquashFS compression, tar files should be avoided, they give a good indication as to how fast the decompression itself is as a whole, but speed and file size might be different when compressed into blocks, like in SquashFS. Use mksquashfs then mount with squashfuse and test that instead
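A minimal sketch of that approach, assuming squashfuse and fusermount are installed and GNU date is available for nanosecond timestamps; the function names are made up for illustration:

```shell
#!/bin/sh
# Read every file below DIR once and print the number of bytes streamed.
# Piping into wc forces tar to read file contents (GNU tar skips reading
# them when the archive is written directly to /dev/null).
read_tree() {
    tar -cf - -C "$1" . | wc -c
}

# Mount IMAGE with squashfuse, time one read of the whole tree, unmount.
bench_image() {
    image=$1
    mnt=$(mktemp -d)
    squashfuse "$image" "$mnt" || return 1
    time_start=$(date +%s%N)
    read_tree "$mnt" > /dev/null
    time_end=$(date +%s%N)
    fusermount -u "$mnt"
    rmdir "$mnt"
    echo "$image: $(( (time_end - time_start) / 1000000 )) ms"
}

# Hypothetical usage:
#   bench_image zstd1.squashfs
#   bench_image zstd15.squashfs
```

Because of the caching mentioned above, only the first read after mounting reflects decompression cost, which is why each call mounts the image afresh. The kernel page cache for the image file itself may still be warm between calls.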

@Samueru-sama

Samueru-sama commented Apr 6, 2024

@Samueru-sama What I do is make SquashFS images of various levels, then test reading and other simple file operations. FUSE also caches files, so you get a significant speed-up after the first read. I've never had any issues with lz4_hc, what was the issue you saw about?

As for the AppImages, I'm not sure why higher zstd ratios would be so much slower, that shouldn't be the case because using squashfuse I didn't seem to notice any massive slow-downs, certainly not any that would make it slower than gzip.

When benchmarking SquashFS compression, tar files should be avoided, they give a good indication as to how fast the decompression itself is as a whole, but speed and file size might be different when compressed into blocks, like in SquashFS. Use mksquashfs then mount with squashfuse and test that instead

I haven't had issues with lz4_hc (because I've never used it lol), just that I asked you how to benchmark it because when I searched for it the info I got is that I should use lz4:9 instead. I asked you because you said that instead of zstd:1 one should use lz4_hc.

I just repeated my tests of zstd (defaults which seems to be zstd 15) vs gzip (and modified the script a little bit) with the librewolf appimage:

(Once again, all I did here was --appimage-extract the LibreWolf AppImage and make a new zstd AppImage with appimagetool using its defaults; the name of each AppImage in the results indicates its type.)

~/ ./benchmark-startup.sh original-LibreWolf.x86_64.AppImage
Time taken: 2.32 seconds
~/ ./benchmark-startup.sh original-LibreWolf.x86_64.AppImage
Time taken: 2.15 seconds
~/ ./benchmark-startup.sh original-LibreWolf.x86_64.AppImage
Time taken: 1.99 seconds
~/ ./benchmark-startup.sh original-LibreWolf.x86_64.AppImage
Time taken: 1.92 seconds
~/ ./benchmark-startup.sh original-LibreWolf.x86_64.AppImage
Time taken: 2.04 seconds
~/ ./benchmark-startup.sh zstd-LibreWolf.x86_64.AppImage
Time taken: 2.78 seconds
~/ ./benchmark-startup.sh zstd-LibreWolf.x86_64.AppImage
Time taken: 3.00 seconds
~/ ./benchmark-startup.sh zstd-LibreWolf.x86_64.AppImage
Time taken: 2.79 seconds
~/ ./benchmark-startup.sh zstd-LibreWolf.x86_64.AppImage
Time taken: 2.73 seconds
~/ ./benchmark-startup.sh zstd-LibreWolf.x86_64.AppImage
Time taken: 2.73 seconds
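To judge whether the gap between the two sets of runs above exceeds the run-to-run noise, the mean and sample standard deviation can be computed from the quoted timings. A small sketch; the helper name `timing_stats` is made up:

```shell
#!/bin/sh
# Print sample count, mean and sample standard deviation of the arguments.
timing_stats() {
    printf '%s\n' "$@" | LC_ALL=C awk '
        { sum += $1; sumsq += $1 * $1; n++ }
        END {
            mean = sum / n
            var = (n > 1) ? (sumsq - n * mean * mean) / (n - 1) : 0
            if (var < 0) var = 0    # guard against rounding error
            printf "n=%d mean=%.2f sd=%.2f\n", n, mean, sqrt(var)
        }'
}

timing_stats 2.32 2.15 1.99 1.92 2.04   # original AppImage runs
timing_stats 2.78 3.00 2.79 2.73 2.73   # zstd AppImage runs
```

With only five runs per variant this is a rough indication at best, but it shows whether the difference in means is several standard deviations wide.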

And just for the heck of it, I decided to make again the zstd1 level one appimage using the method that @mralusw suggested:

appimagetool --comp zstd --mksquashfs-opt -Xcompression-level --mksquashfs-opt 3 ./LibreWolf.AppDir

And I think it did produce a zstd:1 AppImage, because this AppImage is 117.6 MiB. However, when I tested it, it wasn't faster than the original AppImage (gzip), which is odd because in the older tests zstd1 was significantly faster than the others:

~/ ./benchmark-startup.sh zstd:1-LibreWolf.x86_64.AppImage                     
Time taken: 2.18 seconds
~/ ./benchmark-startup.sh zstd:1-LibreWolf.x86_64.AppImage
Time taken: 2.09 seconds
~/ ./benchmark-startup.sh zstd:1-LibreWolf.x86_64.AppImage
Time taken: 1.99 seconds
~/ ./benchmark-startup.sh zstd:1-LibreWolf.x86_64.AppImage
Time taken: 2.10 seconds
~/ ./benchmark-startup.sh zstd:1-LibreWolf.x86_64.AppImage
Time taken: 2.06 seconds

So I went and made the AppImage again manually, using mksquashfs ./LibreWolf.AppDir Manual-zstd-1 -comp zstd -Xcompression-level 1 followed by cat Manual-zstd-1 >> runtime-x86_64
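The manual assembly step can be sketched as a small helper. This is only an illustration under stated assumptions: `assemble_appimage` and the output file name are hypothetical, and unlike the `cat ... >> runtime-x86_64` command above it writes to a copy, so the runtime file is left unmodified and can be reused.

```bash
# Hypothetical helper: build an AppImage by appending a SquashFS image
# to a copy of the runtime. Arguments: runtime, squashfs image, output file.
assemble_appimage() {
    cp -- "$1" "$3" || return 1      # start from an untouched copy of the runtime
    cat -- "$2" >> "$3" || return 1  # append the SquashFS payload
    chmod +x -- "$3"                 # an AppImage must be executable
}

# Usage (output file name is a placeholder):
#   mksquashfs ./LibreWolf.AppDir Manual-zstd-1 -comp zstd -Xcompression-level 1
#   assemble_appimage runtime-x86_64 Manual-zstd-1 manual-zstd-1.AppImage
```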

This zstd:1 AppImage is 122.5 MiB (no idea why there's a size difference); however, it does start up quite fast, like it should:

~/ ./benchmark-startup.sh manual-zstd:1-LibreWolf.x86_64.AppImage
Time taken: 1.50 seconds
~/ ./benchmark-startup.sh manual-zstd:1-LibreWolf.x86_64.AppImage
Time taken: 1.49 seconds
~/ ./benchmark-startup.sh manual-zstd:1-LibreWolf.x86_64.AppImage
Time taken: 1.51 seconds
~/ ./benchmark-startup.sh manual-zstd:1-LibreWolf.x86_64.AppImage
Time taken: 1.50 seconds
~/ ./benchmark-startup.sh manual-zstd:1-LibreWolf.x86_64.AppImage
Time taken: 1.48 seconds
~/ ./benchmark-startup.sh manual-zstd:1-LibreWolf.x86_64.AppImage
Time taken: 1.51 seconds

So, hmm, yeah, there's something other than the zstd compression level at play here. This is the updated script I'm using; for it to work, you need to run the test under i3wm:

#!/bin/bash

# Get the directory of the script
DIR="$(dirname "$(readlink -f "${0}")")"

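# Start the application in the background: the first argument is the file
# name (relative to the script's directory), the rest are passed through
"$DIR/$1" "${@:2}" &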

# Get the start time in nanoseconds
start_time=$(date +%s%N)

# Loop until the class of the focused window is 'LibreWolf'
while true; do
    window_class=$(i3-msg -t get_tree | jq -r ".. | select(.focused? == true) | .window_properties.class")
    if [[ "$window_class" == "LibreWolf" ]]; then
        break
    fi
    # Sleep for a bit to prevent high CPU usage
    sleep 0.01
done

# Get the end time in nanoseconds
end_time=$(date +%s%N)

# Calculate the time taken and convert to seconds
time_taken=$(echo "scale=2; ($end_time - $start_time) / 1000000000" | bc)

# Stop the application, then output the time it took to start
killall librewolf
echo "Time taken: $time_taken seconds"
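Since each result above comes from invoking the script several times by hand, the repetition can be scripted. This is a minimal sketch; `repeat_runs` is a hypothetical helper name and not part of the script above.

```bash
# Hypothetical helper: run the given command a fixed number of times.
# Arguments: run count, then the command and its arguments.
repeat_runs() {
    local count=$1 i=0
    shift
    while [ "$i" -lt "$count" ]; do
        "$@"
        i=$((i + 1))
    done
}

# Usage:
#   repeat_runs 5 ./benchmark-startup.sh zstd-LibreWolf.x86_64.AppImage
```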

And if you want the AppImages I made for these tests, just let me know and I'll give you a link.

Edit: Also, in case you're wondering, here is the startup time of a native LibreWolf (no AppImage):

~/ ./benchmark-startup.sh librewolf
Time taken: .94 seconds
~/ ./benchmark-startup.sh librewolf
Time taken: .91 seconds
~/ ./benchmark-startup.sh librewolf
Time taken: .88 seconds
~/ ./benchmark-startup.sh librewolf
Time taken: .93 seconds
~/ ./benchmark-startup.sh librewolf
Time taken: .89 seconds
~/ ./benchmark-startup.sh librewolf
Time taken: .90 seconds

mralusw commented Apr 6, 2024

@Samueru-sama

> And just for the heck of it, I decided to make again the zstd1 level one appimage using the method that @mralusw suggested:
>
> appimagetool --comp zstd --mksquashfs-opt -Xcompression-level --mksquashfs-opt 3 ./LibreWolf.AppDir

That makes a zstd:3, not a zstd:1. Look at the args. Every appimagetool --mksquashfs-opt arg passes the next arg to mksquashfs.

Samueru-sama commented Apr 6, 2024

> @Samueru-sama
>
> > And just for the heck of it, I decided to make again the zstd1 level one appimage using the method that @mralusw suggested:
> >
> > appimagetool --comp zstd --mksquashfs-opt -Xcompression-level --mksquashfs-opt 3 ./LibreWolf.AppDir
>
> That makes a zstd:3, not a zstd:1. Look at the args. Every appimagetool --mksquashfs-opt arg passes the next arg to mksquashfs.

Oh, my bad. I did actually change the --mksquashfs-opt 3 to 1 in my tests; I just forgot to change the number when I pasted it here. I just checked my zsh history and it does have the 1 in there.

And just to be extra sure, I built it again, this time making sure that it is --mksquashfs-opt 1, and the resulting size is 117.6 MiB like I said before; changing it to 3 makes it 104.8 MiB instead.
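The flag pairing discussed above (one --mksquashfs-opt per argument forwarded to mksquashfs) is easy to get wrong by hand. A minimal sketch that builds the pairs and prints the resulting command for inspection instead of running it; `forward_to_mksquashfs` and `APPIMAGETOOL_OPTS` are hypothetical names introduced only for this illustration.

```bash
# Hypothetical helper: fill the array APPIMAGETOOL_OPTS with one
# --mksquashfs-opt flag per argument that should reach mksquashfs.
forward_to_mksquashfs() {
    APPIMAGETOOL_OPTS=()
    local arg
    for arg in "$@"; do
        APPIMAGETOOL_OPTS+=(--mksquashfs-opt "$arg")
    done
}

forward_to_mksquashfs -Xcompression-level 1
# Print the command rather than executing it, so the level can be checked first.
echo appimagetool --comp zstd "${APPIMAGETOOL_OPTS[@]}" ./LibreWolf.AppDir
```

Printing the command first makes it easy to confirm that the compression level being passed is the intended one before building.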
