Python-PDF-Scraper

Previous version

In this version, I used the pdfquery library to convert the PDF to XML and then pulled the specific data I needed out of that XML.
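The PDF-to-XML approach can be sketched roughly as below. This is a minimal illustration and not the original script: the helper names `bbox_selector` and `scrape_region`, the bounding-box coordinates, and the choice of the `LTTextLineHorizontal` element are all assumptions made for the example.

```python
def bbox_selector(x0, y0, x1, y1, element="LTTextLineHorizontal"):
    """Build a pdfquery selector for elements fully inside a bounding box.

    Coordinates are in PDF points. The default element type is an
    assumption; other layout elements can be passed instead.
    """
    return f'{element}:in_bbox("{x0},{y0},{x1},{y1}")'


def scrape_region(pdf_path, bbox, xml_path=None):
    """Return the text found inside bbox (hypothetical helper).

    If xml_path is given, the parsed layout tree is also written out as
    XML, which is handy for looking up the coordinates of a field.
    """
    import pdfquery  # imported here so bbox_selector works without it

    pdf = pdfquery.PDFQuery(pdf_path)
    pdf.load()  # parse the PDF into an XML-like element tree
    if xml_path:
        pdf.tree.write(xml_path, pretty_print=True, encoding="utf-8")
    return pdf.pq(bbox_selector(*bbox)).text()
```

A call such as `scrape_region("input.pdf", (315, 680, 395, 700), xml_path="input.xml")` would write the XML and return the text in that region; the file names and box here are placeholders.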

Newer version

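This version uses PyMuPDF (imported in code as fitz) to extract the highlighted text from a PDF. It requires no XML conversion and is a lot faster and fairly more accurate. Before running it, run the command: pip install PyMuPDF. The fitz module ships with PyMuPDF; the separate package named fitz on PyPI is unrelated and should not be installed alongside it.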
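One way to read highlighted text with PyMuPDF is sketched below. It is an assumption about the general technique, not a copy of main.py: the helper names are hypothetical, and a word is treated as highlighted when its center point falls inside a highlight quad's bounding box, which is a simplification.

```python
HIGHLIGHT = 8  # PDF annotation type number for highlights


def quad_rects(vertices):
    """Turn a flat list of (x, y) quad corners into bounding boxes.

    Highlight annotations store four corner points per highlighted line.
    """
    rects = []
    for i in range(0, len(vertices) - 3, 4):
        xs = [p[0] for p in vertices[i:i + 4]]
        ys = [p[1] for p in vertices[i:i + 4]]
        rects.append((min(xs), min(ys), max(xs), max(ys)))
    return rects


def words_in_rect(words, rect):
    """Pick the words whose center lies inside rect.

    Each word is a tuple starting with (x0, y0, x1, y1, text, ...),
    the shape returned by page.get_text("words").
    """
    x0, y0, x1, y1 = rect
    picked = []
    for w in words:
        cx, cy = (w[0] + w[2]) / 2, (w[1] + w[3]) / 2
        if x0 <= cx <= x1 and y0 <= cy <= y1:
            picked.append(w[4])
    return picked


def extract_highlights(pdf_path):
    """Return one string per highlight annotation in the PDF."""
    import fitz  # PyMuPDF; imported here so the helpers work without it

    results = []
    with fitz.open(pdf_path) as doc:
        for page in doc:
            words = page.get_text("words")
            for annot in page.annots():
                if annot.type[0] != HIGHLIGHT:
                    continue
                found = []
                for rect in quad_rects(annot.vertices or []):
                    found.extend(words_in_rect(words, rect))
                results.append(" ".join(found))
    return results
```

Calling `extract_highlights("BUSTA_PAGA - 2.pdf")` on the sample file in this repository would return a list with the text under each highlight, assuming the file contains highlight annotations.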

uzairkabeer1/Python-PDF-Scraper
