PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.
Updated May 17, 2024 - Python
Web Crawler/Spider for NodeJS + server-side jQuery ;-)
Meltano: the declarative code-first data integration engine that powers your wildest data and ML-powered product ideas. Say goodbye to writing, maintaining, and scaling your own API integrations.
Export definitions, and notes regarding how they work, for extracting data from MySchoolSask (an implementation of Follett Aspen)
Basic data extraction from the GEIPAN website
Extracts data from a spreadsheet and outputs its contents to a '.SQL' file. Data extraction tool useful for people using SQL Server Express with no access to SSMS addon and import wizard.
Singer Tap for dbt API v2 built with the Meltano SDK
Singer tap for the StackExchange API
Make PDF Files Accessible, Extract Data from PDF, Convert PDF to HTML, Fill-in PDF Form, Stamp PDF and more...
PDFix SDK samples for Java Maven. PDF manipulation, content extraction, conversion, accessibility and more...
SQLiteDiskExplorer enables you to explore, catalog, and batch extract SQLite files from disks and removable media.
Crawly, a high-level web crawling & scraping framework for Elixir.
A tool to replace data in a Unity Asset Bundle from modified files.
Mercy is an open-source Rust crate and CLI designed for building cybersecurity utilities and projects.
This UiPath project automates the process of extracting data from an Excel sheet and filling out a Google Form with the extracted information.
Extract structured data from any unstructured web page
Tiny helpful projects
This API extracts Japanese company names from text.