khankhattak1/pdf_annotation_extraction

Pdf Annotation Extraction Script

This is a PDF annotation extraction tool written in Python and deployed on Streamlit. It identifies and extracts PDF annotations such as underlined text, highlighted text, strikethrough text, and text comments.

Its main purpose is to make annotations in PDF documents easy to access and view.
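As a rough illustration of what "extracting annotations" involves, here is a minimal sketch. It assumes the third-party `pypdf` package, which this README does not confirm; `annotation.py` may use a different library and approach. The names `SUBTYPE_LABELS`, `annotation_kind`, and `list_annotations` are hypothetical. The sketch only reports each annotation's type and any attached comment, not the marked-up text itself.

```python
# Sketch only: pypdf is an assumption; annotation.py may work differently.

# Annotation subtype names from the PDF specification, mapped to the
# annotation kinds this README mentions.
SUBTYPE_LABELS = {
    "/Highlight": "Highlighted Text",
    "/Underline": "Underlined Text",
    "/StrikeOut": "Strikethrough Text",
    "/Text": "Text Comment",
}


def annotation_kind(subtype):
    """Return a readable label for a PDF annotation subtype."""
    return SUBTYPE_LABELS.get(str(subtype), "Other")


def list_annotations(pdf_path):
    """Collect the page number, kind, and comment of every annotation."""
    # Imported here so annotation_kind() stays usable without pypdf installed.
    from pypdf import PdfReader

    reader = PdfReader(pdf_path)
    found = []
    for page_number, page in enumerate(reader.pages, start=1):
        if "/Annots" not in page:
            continue  # this page has no annotations
        for ref in page["/Annots"]:
            annot = ref.get_object()
            subtype = annot["/Subtype"] if "/Subtype" in annot else ""
            comment = annot["/Contents"] if "/Contents" in annot else ""
            found.append(
                {
                    "page": page_number,
                    "kind": annotation_kind(subtype),
                    "comment": str(comment),
                }
            )
    return found
```

Calling `list_annotations("example.pdf")` (a placeholder file name) would return one dictionary per annotation. Annotation types not listed in the mapping, such as links, are labeled "Other".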

Prerequisites

  • Python 3.10 or above
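To confirm that the interpreter meets this requirement before installing anything, a small check like the following may help. It is only a sketch; the function name `meets_minimum` is hypothetical and not part of this project.

```python
import sys

# Minimum version stated in the prerequisites.
MINIMUM = (3, 10)


def meets_minimum(version_info=None):
    """Return True if the given (or running) Python is at least 3.10."""
    current = tuple(version_info or sys.version_info)[:2]
    return current >= MINIMUM


if __name__ == "__main__":
    print("OK" if meets_minimum() else "Python 3.10 or above is required")
```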

Instructions

In a terminal, run the following commands from the project directory.

Create a virtual environment:
                python -m venv virtual_environment_name_of_your_choice
Activate the virtual environment (Windows):
               .\virtual_environment_name_of_your_choice\Scripts\activate
Install the dependencies:
                pip install -r requirements.txt
Run annotation.py:
                streamlit run annotation.py

Deployed Application URL

Streamlit link: https://pdfannotationextraction-tool.streamlit.app/

NOTE: Please suggest improvements or new features in the Discussions tab.