-
-
Notifications
You must be signed in to change notification settings - Fork 647
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
check fail_on on more paths in jpeg load #3293
base: master
Are you sure you want to change the base?
Conversation
Some truncated JPEG files can cause out of order reads since jpeg_read_scanlines() will longjmp out before it gets a chance to update y_pos. So we need to check fail_on on the longjmp path and the out of order path. See: #3289
Using the truncated jpg in the linked discussion, before this patch I see:
And after:
|
Wow, I never thought of reducing the severity of this "error". I think this might help for #2757. Using the same reproducer as mentioned in #2757 (reply in thread) and testing this PR in combination with commit kleisauke@0eeabf8, I see this on both $ vips black x.jpg 539 960
$ VIPS_STALL=1 vips reducev x.jpg x.v 2
vips_worker_work_unit: stall done, releasing y = 0 ... [x3]
$ sha256sum x.v
7be26c512c54b5e966f1239ff215e84418659ad267f35d9a9c5dbc73b5359073 x.v
$ vips black x2.jpg 10000 10000
$ VIPS_STALL=1 vips reducev x2.jpg x2.v 2
vips_worker_work_unit: stall done, releasing y = 0 ... [x39]
$ sha256sum x2.v
220ee900f622fb50d24a0b629ad0108822b3e442ee587b85a860c1b50a70d325 x2.v Thus, it produces identical images without causing an out of order read error. I double-checked just to be sure. /me falls off chair Perhaps the PNG and TIFF loaders needs something similar?: libvips/libvips/foreign/spngload.c Lines 520 to 522 in 07d3a53
libvips/libvips/foreign/vipspng.c Lines 717 to 719 in 07d3a53
libvips/libvips/foreign/tiff2vips.c Lines 2537 to 2540 in 07d3a53
FWIW, I confirm that a similar change to the spng loader also solves the issue mentioned in #3270 (comment). |
Huh this is a strange one, you're right. I made a branch with this fail-on change, plus your remove seq from reduce patch: https://github.com/libvips/libvips/tree/remove-seq-from-reduce And I can't provoke an out of order read either:
All seem to work. I wonder if this is due to this fail-on change though? Could it be caused by the new threadpool sizing system? That removes unnecessary threads from pools, so you're less likely to get threads running ahead of the group and triggering misordered reads. |
Oh haha this fails instantly:
|
I tried with a real image (not just black):
And verified that this produced the correct result:
And it all seems to work (on the If I try git master and remove the |
... so I think I don't have a good understanding of what's happening. I might go for a cycle ride and see if anything pops into my head. |
Indeed, it's a bit strange. Good idea to check non-black images as well. I'm also not sure why I never see the Have a nice cycle ride! |
FWIW, I just pushed commit 937ecf7 to that branch, to ensure it doesn't use |
Ah, this error is only displayed when a non-zero exit code is returned. It can be noticed when you pass |
I found an out of order read error in the wild with certain operations and parameters that would produce a non-deterministic image on the Detailsimport pyvips
pyvips.leak_set(True)
pyvips.cache_set_max_files(0)
# https://wsrv.nl/zebra.jpg
# vipsthumbnail zebra.jpg -o x.jpg -s "2135x1800!"
filename = 'x.jpg'
im = pyvips.Image.thumbnail(filename, 720)
im = im.flipver()
im.write_to_file('x2.jpg')
im = None
# Ensure leak checker is run, see:
# https://github.com/libvips/libvips/commit/01a9947006753d6bd4674483c619c7d2ae79bfbc
pyvips.ffi.cdef('void vips_shutdown();')
pyvips.vips_lib.vips_shutdown()
Though, the produced image on the master branch is also incorrect, so perhaps this is another issue. |
Ah, I wasn't focused enough, |
FWIW, increasing the check-nondeterminism.sh#!/usr/bin/env bash
set -ex
# Usage: ./check-nondeterminism.sh 100
n=$1
# Prepare test image
curl -LO https://github.com/libvips/vips-bench/raw/master/sample2.v
vips replicate sample2.v t1.v 40 30
vips extract_area t1.v x.jpg 0 0 10000 10000
rm t1.v sample2.v
# Do first run, check sha256 of produced image
VIPS_STALL=1 vips reducev x.jpg x.v 4 --vips-leak
expected_sha256=$(sha256sum x.v | awk '{ print $1 }')
# Check for any nondeterminism
for (( i = 0; i < n; i++ )); do
echo "run $i"
VIPS_STALL=1 vips reducev x.jpg x.v 4
echo "$expected_sha256 x.v" | sha256sum --check
done |
After a couple hours of debugging (:sweat_smile:), I might be close to a possible fix. Commit 2603bba adapts the libvips/libvips/conversion/tilecache.c Lines 956 to 957 in 76f2178
libvips/libvips/conversion/tilecache.c Lines 908 to 909 in 76f2178
Using the above
So, it bumped the libvips/libvips/conversion/tilecache.c Lines 951 to 952 in 76f2178
It does not seem to cause an out of order read error. This same "trick" can also be applied to this test: $ vips black x.jpg 10000 10000
$ VIPS_STALL=1 vips reducev x.jpg x.v 2 --vips-leak
vips_line_cache_build: max_tiles = 64
vips_worker_work_unit: stall done, releasing y = 0 ...
vips_line_cache_gen: bumped max_tiles to 66
...
vips_threadset_free: peak of 8 threads
memory: high-water mark 49.61 MB
error buffer: VipsJpeg: out of order read at line 776 And hardcoding the $ vips black x.jpg 10000 10000
$ VIPS_STALL=1 vips reducev x.jpg x.v 2 --vips-leak
vips_line_cache_build: max_tiles = 66
vips_worker_work_unit: stall done, releasing y = 0 ... [x39]
vips_threadset_free: peak of 8 threads
memory: high-water mark 49.61 MB |
I think the issue is that some old tiles are being reused too early. By the time it bumps the While investigating this, I found a few minor improvements, which I just pushed to the |
Some truncated JPEG files can cause out of order reads since
jpeg_read_scanlines()
willlongjmp()
out before it gets a chance to updatey_pos
. So we need to checkfail_on
on thelongjmp()
path and the out of order path.See: #3289