8.15 core dumps when processing lots of files #3786

Open
knoppmyth opened this issue Dec 16, 2023 · 8 comments
@knoppmyth

Seeing the speed improvements in 8.15, I updated my Docker environment and was impressed with the speeds I was seeing. However, when processing a lot of images, it core dumps.

(process:9): GLib-ERROR **: 21:47:58.734: ../../../glib/gmem.c:108: failed to allocate 5665 bytes
Trace/breakpoint trap (core dumped)

To recreate:
Get the tarball here
sha256 sum is: ec450666eca8d233254d8a7ca807f01d1ebe7da5f8d92cd76a855dd47665936a
It contains two Dockerfiles (one for 8.14 and one for 8.15), a main.py, two shell scripts to enter the containers, and an SVS file where I'm seeing the issue. The core dump doesn't occur with 8.14.
NOTE: I'm seeing the issue with other large files as well. Smaller files work without issue.
Untar in ~/tmp.

Create the containers:
docker build -f Dockerfile -t libvips . && docker build -f Dockerfile_814 -t libvips:8.14 .

Enter the containers:
./enter_814.sh
./enter_815.sh

Run main.py in the containers:
./main.py slides/70946.svs

In the 8.14 container it runs as expected. In 8.15, however, it core dumps when it starts the tiles_to_wsi function.

My host OS is Arch Linux. The CPU is a Ryzen 5 5600x3d with 96 GB of RAM.

Please let me know what other information I can provide. Please let me know when you have the tarball so I can remove it on my end.

Thanks!

@knoppmyth knoppmyth added the bug label Dec 16, 2023
@jcupitt
Member

jcupitt commented Dec 17, 2023

Hi @knoppmyth,

Oh dear, that sounds bad. I'll investigate later this week.

@knoppmyth
Author

Thanks @jcupitt

@jcupitt
Member

jcupitt commented Dec 20, 2023

I downloaded the tarball, thanks.

I was able to reproduce this on Ubuntu 23.10 with libvips master and this program:

#!/usr/bin/env python3

import sys
import pyvips

input_directory = sys.argv[1]
output_file = sys.argv[2]
tiles_across = int(sys.argv[3])
tiles_down = int(sys.argv[4])

print(f"loading {tiles_across * tiles_down} tiles ...")
tiles = [pyvips.Image.new_from_file(f"{input_directory}/{x}_{y}.jpg",
                                    access="sequential")
         for y in range(tiles_down)
         for x in range(tiles_across)]

joined = pyvips.Image.arrayjoin(tiles, across=tiles_across)

print("saving ...")
joined.write_to_file(output_file)

It works if I run it like this:

john@banana ~/x/tmp/images/70946.svs/tiles/orig/orig_files $ ~/try/arrayjoin2.py 0 ~/x.tif[compression=jpeg,tile] 366 176
loading 64416 tiles ...
saving ...
john@banana ~/x/tmp/images/70946.svs/tiles/orig/orig_files $

But fails if I run:

john@banana ~/x/tmp/images/70946.svs/tiles/orig/orig_files $ ~/try/arrayjoin2.py 0 ~/x.tif[compression=jpeg,tile] 366 177
loading 64782 tiles ...
(process:181986): GLib-ERROR **: 15:30:25.997: ../../../glib/gmem.c:136: failed to allocate 7044 bytes
Trace/breakpoint trap (core dumped)
john@banana ~/x/tmp/images/70946.svs/tiles/orig/orig_files $

That's suspiciously close to 2^16. I wonder if it's overflowing 16 bits for a file descriptor somewhere?
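
The arithmetic behind that hunch can be checked quickly. The sketch below is not one of the scripts from this thread: it compares the two tile counts above with 2^16 and prints the per-process open-file limit using Python's standard `resource` module, on the assumption that a descriptor limit is the first thing to rule out. The helper name `headroom` is made up for illustration.

```python
#!/usr/bin/env python3
# Diagnostic sketch only; not part of the original repro.
import resource

LIMIT_16_BIT = 2 ** 16  # 65536


def headroom(tiles_across, tiles_down, limit=LIMIT_16_BIT):
    """Return how far the tile count for this grid sits below `limit`."""
    return limit - tiles_across * tiles_down


if __name__ == "__main__":
    # 176 rows worked and 177 rows failed in the Python run above.
    for tiles_down in (176, 177):
        n_tiles = 366 * tiles_down
        print(f"366 x {tiles_down} = {n_tiles} tiles, "
              f"{headroom(366, tiles_down)} below 2^16")

    # Soft and hard limits on open file descriptors for this process.
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print(f"RLIMIT_NOFILE: soft={soft} hard={hard}")
```

Both counts sit below 65536, so an overflow on the tile count alone would not explain the failure. Whatever is being exhausted would need some per-process overhead on top of one item per tile.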

@jcupitt
Member

jcupitt commented Dec 20, 2023

I tried in C and it also fails, so it's not Python:

/* compile with:
 *
 * gcc -g -Wall try345.c `pkg-config vips --cflags --libs`
 */

#include <stdio.h>
#include <stdlib.h>

#include <vips/vips.h>

int
main(int argc, char **argv)
{
    if (VIPS_INIT(argv[0]))
        vips_error_exit(NULL);

    if (argc != 5)
        vips_error_exit("usage: %s base-dir outfile tiles-across tiles-down",
            argv[0]);

    VipsObject *context = VIPS_OBJECT(vips_image_new());
    int tiles_across = atoi(argv[3]);
    int tiles_down = atoi(argv[4]);
    int n_tiles = tiles_across * tiles_down;
    VipsImage **tiles = (VipsImage **)
        vips_object_local_array(context, n_tiles);

    printf("loading %d tiles ...\n", n_tiles);
    for (int y = 0; y < tiles_down; y++)
        for (int x = 0; x < tiles_across; x++) {
            int i = x + y * tiles_across;

            char filename[256];

            snprintf(filename, 256, "%s/%d_%d.jpg", argv[1], x, y);
            if (!(tiles[i] = vips_image_new_from_file(filename,
                            "access", VIPS_ACCESS_SEQUENTIAL,
                            NULL)))
                vips_error_exit(NULL);
        }

    printf("assembling ...\n");
    VipsImage *image;
    if (vips_arrayjoin(tiles, &image, n_tiles, "across", tiles_across, NULL))
        vips_error_exit(NULL);

    printf("saving ...\n");
    if (vips_image_write_to_file(image, argv[2], NULL))
        vips_error_exit(NULL);

    return 0;
}

I see:

john@banana ~/x/tmp/images/70946.svs/tiles/orig/orig_files $ ~/try/a.out 0 ~/x.tif[compression=jpeg,tile] 366 175
loading 64050 tiles ...
assembling ...
saving ...
john@banana ~/x/tmp/images/70946.svs/tiles/orig/orig_files $

But 176 fails.
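
Since the first failing row count differs between the Python and C runs (177 and 176), it may help to pin down the exact threshold for each program. This is a minimal sketch, assuming the failure is monotonic (every count above the first failing one also fails) and that a crash shows up as a non-zero exit status. `first_failing` and `run_repro` are hypothetical helpers, and the command line simply mirrors the invocations above.

```python
#!/usr/bin/env python3
# Bisection sketch; helper names are illustrative, not from the thread.
import subprocess
import sys


def first_failing(succeeds, lo, hi):
    """Return the smallest n in (lo, hi] for which succeeds(n) is False.

    Assumes succeeds(lo) is True, succeeds(hi) is False, and that the
    failure is monotonic in n.
    """
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if succeeds(mid):
            lo = mid
        else:
            hi = mid
    return hi


def run_repro(program, tile_dir, output, tiles_across, tiles_down):
    """Run one repro attempt; True if the process exited cleanly.

    Note: a missing tile file also gives a non-zero exit, so check the
    program's output before trusting the result.
    """
    result = subprocess.run(
        [program, tile_dir, output, str(tiles_across), str(tiles_down)])
    return result.returncode == 0


if __name__ == "__main__" and len(sys.argv) == 4:
    program, tile_dir, output = sys.argv[1:4]
    # Bounds are assumptions: one row is taken to work, 177 rows failed above.
    n = first_failing(
        lambda rows: run_repro(program, tile_dir, output, 366, rows),
        lo=1, hi=177)
    print(f"first failing tiles_down: {n}")
```

Run once against the Python script and once against the C binary. If the thresholds stay one row apart, that would suggest the limit counts something the Python process uses slightly less of.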

@jcupitt
Member

jcupitt commented Dec 20, 2023

It could be a stack overflow, I guess. I'll check.
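
One way to probe the stack-overflow idea is to rerun the same command under a different stack limit and see whether the failure point moves. This is a sketch under stated assumptions: it sets `RLIMIT_STACK` in the child process through `preexec_fn` (POSIX only), the helper name `run_with_stack_limit` is hypothetical, and whether the stack limit has any bearing on this crash is untested.

```python
#!/usr/bin/env python3
# Stack-limit probe sketch; not part of the original repro.
import resource
import subprocess
import sys


def run_with_stack_limit(cmd, stack_bytes):
    """Run cmd with its soft RLIMIT_STACK set to stack_bytes.

    The request is clamped to the hard limit, which an unprivileged
    process cannot exceed.
    """
    def set_limit():
        # Runs in the child between fork and exec.
        _, hard = resource.getrlimit(resource.RLIMIT_STACK)
        soft = stack_bytes
        if hard != resource.RLIM_INFINITY:
            soft = min(soft, hard)
        resource.setrlimit(resource.RLIMIT_STACK, (soft, hard))

    return subprocess.run(cmd, preexec_fn=set_limit,
                          capture_output=True, text=True)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Pass the repro command and its arguments after the script name.
    result = run_with_stack_limit(sys.argv[1:], 64 * 1024 * 1024)
    # A negative status means the child was killed by a signal.
    print(f"exit status: {result.returncode}")
    sys.stderr.write(result.stderr)
```

If the first failing tile count stays the same under a much larger limit, a stack overflow becomes less likely, though a library that sets its own thread stack sizes would not be affected by this limit.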

@knoppmyth
Author

@jcupitt Thanks for taking a deeper look.

@jaume-pinyol

jaume-pinyol commented Apr 18, 2024

Hi @jcupitt! I'm having the same issue with 8.15.2. What is the status? I was going to see if I could spot the problem, but since this issue is some months old, maybe someone is already working on it.

@jcupitt
Member

jcupitt commented Apr 18, 2024

Sorry, I've been distracted on other projects. Please have a go if you have time!
