Hey there, @jaybfn! Great to see you back diving into more adventures with code. How's everything going?

Yes, your assumption is plausible. The issue you're facing could indeed be due to the images being in a vector format (.svg). The code you're using relies on PIL/Pillow (the Python Imaging Library), which works with raster formats and cannot open SVG files, so any SVG image would fail when the code attempts to open it.

As for why the LlamaIndex codebase uses PIL/Pillow for image processing instead of a library that supports both raster and vector graphics, it's likely due to its simplicity and direct support for the primary image manipulation tasks the codebase needs, such as converting images to and from base64 strings. The primary concern was handling raster images in common formats like JPEG, for which PIL/Pillow is well suited thanks to its ease of use and broad support for image file formats and operations. Vector graphics support was probably not a requirement, which would explain why no library covering both raster and vector graphics is used.

To resolve your issue, you might need to convert your SVG images to a raster format that PIL/Pillow can handle, such as PNG or JPEG, before running the code. Alternatively, you could modify the code to use a library that supports SVG, such as CairoSVG or svglib, but this would require a good understanding of both the code and the library you're integrating.

I hope this helps! If you have any more questions, feel free to ask.
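Here is a minimal sketch of the conversion route, assuming you already have the raw bytes of each extracted image. The helper names is_svg and to_png_bytes are hypothetical and not part of LlamaParse or LlamaIndex, the SVG check is only a heuristic, and CairoSVG has to be installed separately. Whether your PDF's images actually arrive as SVG is something you would need to confirm first.

```python
def is_svg(data: bytes) -> bool:
    """Heuristic check (an assumption, not a full parser): does this look like SVG markup?"""
    # Skip a UTF-8 byte order mark and leading whitespace, then inspect the first bytes.
    head = data[:1024].lstrip(b"\xef\xbb\xbf \t\r\n").lower()
    return head.startswith(b"<svg") or (head.startswith(b"<?xml") and b"<svg" in head)


def to_png_bytes(data: bytes) -> bytes:
    """Rasterize SVG input to PNG bytes; return any other input unchanged."""
    if not is_svg(data):
        return data
    # Imported lazily so raster-only inputs work without CairoSVG installed.
    import cairosvg  # third-party: pip install cairosvg

    # With no write_to argument, svg2png returns the PNG data as bytes.
    return cairosvg.svg2png(bytestring=data)
```

The result can then be opened with Pillow in the usual way, for example Image.open(io.BytesIO(to_png_bytes(raw))), since Image.open accepts a file-like object.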
-
Hello all, I am using the code at "https://github.com/run-llama/llama_parse/blob/main/examples/demo_json.ipynb" to reproduce its results on my own document. When I pass in my PDF, it extracts only some of the images and leaves the rest as blank JSON. Could this be because those images are in a vector (.svg) format?