No text detected in pdf #620

Open
rafaeldepablo opened this issue Sep 16, 2022 · 2 comments
Labels
bug Something isn't working

Comments

@rafaeldepablo

SABADELL_GOBIERNO_CORPORATIVO_2022.pdf
Summary
Great software.

I'm running into strange behavior with some PDFs: apparently no text is found except on the first page.

The PDF files are normal; their text can be copied and searched.

The tables, however, are detected, even though their text comes out blank.

Steps To Reproduce

Load the attached PDF and try to process it.

Expected behavior
The text is processed

Actual behavior
No text is identified

Screenshots
image

Environment

sudo docker run -p 3001:3001 axarev/parsr:latest

Thanks in advance

rafaeldepablo added the bug label Sep 16, 2022
@NgoDuyVu1993

Hi @rafaeldepablo, ignore my comment if you find it irrelevant. I am not on the Parsr team; I had a problem with table detection, so I looked around to see if anyone else had the same one. I tried running your document, and Parsr detects its text without problems.

image

image

I think you may have missed something in the settings when you uploaded the document. Here is how I configured it:

image

@rafaeldepablo
Author

Thanks

I tried again and it crashed, but when I retried once more it worked.

Regards

rafa
