Text Extraction and Excel Parsing from Images

This Python script extracts text from images using Tesseract OCR and organizes it into an Excel file.

Features

- Automated Installation: Checks for the required Python modules (pytesseract, openpyxl, pandas) and installs any that are missing.
- Text Extraction: Uses Tesseract OCR to extract text from images.
- Data Parsing: Parses the extracted text for contact names and times seen, and writes them to an Excel file.
- Logging: Logs informational messages, warnings, and errors to make tracking and debugging easier.
- User Interaction: Prompts the user for the image folder and output folder paths.
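
The Data Parsing feature can be pictured with a small sketch. The README does not describe the layout of the OCR text, so the line shape assumed here (a contact name followed by a clock time such as `10:45 AM`), the regular expression, the column labels, and the function name `parse_contacts` are all assumptions for illustration, not the logic used in main.py.

```python
import re

# Assumed line shape: "<contact name> <H:MM or HH:MM> [AM|PM]".
# The real OCR output handled by main.py may look different.
LINE_PATTERN = re.compile(
    r"^(?P<name>.+?)\s+(?P<time>\d{1,2}:\d{2}(?:\s?[AaPp][Mm])?)\s*$"
)


def parse_contacts(text):
    """Return one dict per line that matches the assumed name/time shape."""
    rows = []
    for line in text.splitlines():
        match = LINE_PATTERN.match(line.strip())
        if match:
            rows.append({
                "Contact Name": match.group("name"),
                "Time Seen": match.group("time"),
            })
    return rows


# Lines that do not match (OCR noise, headers) are skipped.
sample = "Alice Smith 10:45 AM\nnoise from OCR\nBob 9:05"
rows = parse_contacts(sample)
```

A list of dicts like `rows` can be handed to `pandas.DataFrame(rows).to_excel("contacts.xlsx", index=False)`, which relies on openpyxl for `.xlsx` files; the file name here is only a placeholder.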

Usage

Ensure Python is installed.
Install Tesseract OCR:
- Windows:
  - Download the installer from https://github.com/UB-Mannheim/tesseract/wiki.
  - Run the installer and follow the installation instructions.
  - Add the Tesseract installation directory to the system's PATH environment variable.
  - Click here to watch how to install Tesseract OCR on Windows.
- Linux:
  - Use your package manager to install Tesseract OCR. For example, on Ubuntu:
```
sudo apt-get update
sudo apt-get install tesseract-ocr
```
- macOS:
  - Install Tesseract OCR using Homebrew:
```
brew install tesseract
```
Clone or download the repository.
Place the images to be processed in the Images folder.
Run the script (main.py).
Follow the prompts to input image and output folder paths.
View the generated Excel files in the output folder.
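
The run described above amounts to collecting the image files in a folder and passing each one to Tesseract. A minimal sketch of that loop follows. The names `find_images`, `ocr_image`, `ocr_folder`, and `IMAGE_SUFFIXES`, as well as the list of extensions, are hypothetical and not taken from main.py; `pytesseract.image_to_string` and Pillow's `Image.open` are the actual library calls (Pillow is installed as a dependency of pytesseract).

```python
from pathlib import Path

# Assumed set of extensions; main.py may accept a different set.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".bmp", ".tif", ".tiff"}


def find_images(folder):
    """Return the image files directly inside `folder`, sorted by path."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def ocr_image(path):
    """Run Tesseract on one image and return the recognized text."""
    # Imported lazily so find_images works without the OCR stack installed.
    import pytesseract
    from PIL import Image

    with Image.open(path) as img:
        return pytesseract.image_to_string(img)


def ocr_folder(folder):
    """Map each image file name in `folder` to its OCR text."""
    return {p.name: ocr_image(p) for p in find_images(folder)}
```

Calling `ocr_folder("Images")` would return a dictionary keyed by file name, provided Tesseract itself is installed and on PATH.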

Dependencies

Python 3.x
Tesseract OCR
Required Python modules: pytesseract, openpyxl, pandas
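
A dependency check along these lines could be used to confirm the requirements before running the script. This is a sketch under stated assumptions, not the check that main.py performs: the function names `missing_modules`, `install_modules`, and `tesseract_on_path` are hypothetical, and it assumes the Tesseract executable is named `tesseract` and is reachable through PATH.

```python
import importlib.util
import shutil
import subprocess
import sys

# Module names listed in this README; each matches its pip package name.
REQUIRED_MODULES = ["pytesseract", "openpyxl", "pandas"]


def missing_modules(names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def install_modules(names):
    """Install packages with pip, using the interpreter running this script."""
    if names:
        subprocess.check_call([sys.executable, "-m", "pip", "install", *names])


def tesseract_on_path(executable="tesseract"):
    """Return the full path of the Tesseract binary, or None if not found."""
    return shutil.which(executable)
```

A typical use would be `install_modules(missing_modules(REQUIRED_MODULES))`, followed by a warning to the user if `tesseract_on_path()` returns `None`. Tesseract itself is not a Python package, so pip cannot install it.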

Author

LAKSHMI

Contribution

JETUR GAVLI

License

This project is licensed under the MIT License.
