Memory leak when opening an invalid PDF with no %%EOF in tail #3344
I cannot reproduce this with the current version, PyMuPDF-1.24.1. What version of PyMuPDF are you using?
I am using 1.23.4
There have been quite a few improvements to memory handling since 1.23.4, so it would be worth retrying with the latest version, 1.24.1.
Ok, I'll check it out.
Closing this because we have been waiting for information for over a month.
Description of the bug
If the file bytes are cut off prematurely, fitz opens the PDF and reports 0 pages, but also leaks memory.
How to reproduce the bug
You can reproduce this bug by taking a large PDF file and removing the last 50% of its bytes.
If you repeatedly load files like this, memory leaks even when you call doc.close().
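The reproduction steps above can be sketched roughly as follows. This is only an illustration, not the reporter's actual script: the helper names truncate_copy and open_repeatedly, the iteration count, and the file names in the usage note are all assumptions. Memory growth is not measured here and has to be watched in an external tool such as Task Manager.

```python
import os


def truncate_copy(src_path, dst_path, keep_fraction=0.5):
    """Write the first keep_fraction of src_path's bytes to dst_path."""
    size = os.path.getsize(src_path)
    keep = int(size * keep_fraction)
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        dst.write(src.read(keep))
    return keep  # number of bytes written


def open_repeatedly(path, iterations=1000):
    """Open and close the file many times; watch process memory externally."""
    import fitz  # PyMuPDF; imported lazily so truncate_copy works without it

    for _ in range(iterations):
        doc = fitz.open(path)
        _ = doc.page_count  # the report describes 0 pages for a truncated file
        doc.close()
```

For example, truncate_copy("large.pdf", "truncated.pdf") followed by open_repeatedly("truncated.pdf"), where both file names are placeholders.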
As a workaround, you can check whether the file has an %%EOF marker with this code. If you call it before opening the document with fitz.open(), you can return 0 pages without triggering the memory leak.
    import os

    def has_eof_marker(file_path):
        """Return True if b'%%EOF' appears in the last 1 KB of the file."""
        try:
            with open(file_path, 'rb') as file:
                # Seek to the last 1 KB, or to the start for smaller files
                size = file.seek(0, os.SEEK_END)
                file.seek(max(size - 1024, 0))
                # Read the tail and check whether %%EOF is in it
                tail = file.read()
                return b'%%EOF' in tail
        except OSError as e:
            print(f"Error reading file: {e}")
            return False
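One way to wire such a check in front of the open call might look like the sketch below. The names tail_has_eof and page_count_or_zero and the tail_size parameter are made up for illustration. The check is a heuristic: it only inspects the tail of the file, so it says nothing about whether the rest of the PDF is intact.

```python
import os


def tail_has_eof(path, tail_size=1024):
    """Heuristic: True if b'%%EOF' occurs in the last tail_size bytes."""
    with open(path, "rb") as f:
        size = f.seek(0, os.SEEK_END)
        f.seek(max(size - tail_size, 0))
        return b"%%EOF" in f.read()


def page_count_or_zero(path):
    """Return the page count, or 0 without opening if the tail lacks %%EOF."""
    if not tail_has_eof(path):
        return 0
    import fitz  # PyMuPDF; only imported when the file passes the check

    doc = fitz.open(path)
    try:
        return doc.page_count
    finally:
        doc.close()
```

Files that fail the check are never handed to fitz, so the open call that the report associates with the leak is skipped for them.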
PyMuPDF version
1.23.8 or earlier
Operating system
Windows
Python version
3.11