prp-e/persian_ocr_project

Persian OCR Project

This repository is my biggest FLOSS project. I have had it in mind since last year, when I was working on an automatic license plate recognition project. I wanted to build a large FLOSS project, and that is when the idea of an OCR project came to mind.

I have now started working on the idea, and this repository will be updated at every phase of the project.

Important Notes

  • This project is published under the GNU GPL version 3.0 license. I assure everyone concerned that as long as I, Muhammadreza Haghiri, am in charge of this project, the license will remain the same.
  • If any party decides to acquire this project and wants to change the license, I'll try to negotiate to keep it Free (as in freedom).
  • The .gitignore file excludes image files. This doesn't mean that our dataset won't be free. The dataset will grow too large for this repository, so we've decided to keep the images out of it, but we'll let you download the data (raw or labeled) in the near future.

Project Technical Details

  • Programming Language: Python 3 (3.9 on the local machine; the version on remote machines depends on where we run our tasks)
  • AI library: PyTorch
  • Model: YOLOv5
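
Since the model is YOLOv5, every training image needs a text label file with one row per object, in the form `class x_center y_center width height`, where the coordinates are normalized to the range 0 to 1. The snippet below is a minimal sketch of that conversion from a pixel bounding box. The function name `to_yolo_label` and the choice of six decimal places are our own assumptions for illustration, not code from this repository.

```python
def to_yolo_label(class_id, box, image_width, image_height):
    """Convert a pixel box (x_min, y_min, x_max, y_max) to one YOLO label row.

    Hypothetical helper for illustration; the name and the rounding
    precision are assumptions, not part of this project.
    """
    x_min, y_min, x_max, y_max = box
    # YOLO stores the box center and size, each divided by the image size.
    x_center = (x_min + x_max) / 2 / image_width
    y_center = (y_min + y_max) / 2 / image_height
    width = (x_max - x_min) / image_width
    height = (y_max - y_min) / image_height
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


if __name__ == "__main__":
    # A digit of class 3 occupying pixels (100, 50)-(200, 150) in a 400x200 image.
    print(to_yolo_label(3, (100, 50, 200, 150), 400, 200))
```

Each returned row would be written to a `.txt` file that shares its name with the image it describes.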

Models and Datasets

Models for June 23rd 2022

Results

Number recognition

  • From the input data:

    data

  • Screenshot from Telegram:

    data

  • Final tests

    data

Letter recognition

Final test on letters

Project Phases

This part is divided into two. The first part is the lab phase, in which we work as a group of data scientists and AI enthusiasts to develop and deploy our model. The second part is the business/product phase, in which we try to present the result as a product to the outside world.

Lab phases

  • Number recognition
    • Data generation using Zarnevis.
    • Training YOLOv5 on generated data.
    • Testing the result.
      • Test on different numbers written in the same font.
      • Test on the same or different numbers written in different fonts.
      • Test on hand-written numbers to find out how accurate our model is.
    • Asking participants to write down some random numbers (Data generation for hand-written numbers)
    • Training YOLOv5 on both hand-written and digital numbers.
    • Final Tests
      • Test on different numbers, both hand-written and digital.
  • Letter recognition
    • Data generation using Zarnevis.
      • Instead of generating our own data, we used data gathered from Shotor.
    • Training YOLOv5 on generated data.
    • Testing the results.
      • Test on different words with the same font.
      • Test on the same or different words written in different fonts.
    • Gathering hand-written words data.
    • Final tests.
  • Word detection
    • Training the YOLOv5 model to detect words in a sentence.
      • Getting books and articles
      • Converting PDFs to Images for the sake of labeling
      • Creating a labeled dataset for words, numbers and maybe English words
  • Punctuation Detection
    • Training the YOLOv5 model to detect punctuation marks.
  • Jupyter notebook for people who want to test the model.
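
For the number recognition phase, a detector such as YOLOv5 returns one box per digit in no particular order, so the boxes have to be turned back into a number. The snippet below is a minimal sketch of that step: it drops low-confidence boxes, sorts the rest from left to right (Persian numbers are written with the most significant digit on the left), and joins the digits. The detection tuple layout, the mapping of class ids 0 to 9 onto the digits ۰ to ۹, and the name `read_number` are assumptions for illustration and may not match the project's trained classes.

```python
# Assumption: class id i corresponds to the Persian digit at index i.
PERSIAN_DIGITS = "۰۱۲۳۴۵۶۷۸۹"


def read_number(detections, min_confidence=0.5):
    """Join detected digits into a string, ordered left to right.

    Each detection is a hypothetical tuple:
    (class_id, confidence, x_min, y_min, x_max, y_max).
    """
    # Keep only detections the model is reasonably sure about.
    kept = [d for d in detections if d[1] >= min_confidence]
    # Order by horizontal box center so digits appear in reading order.
    kept.sort(key=lambda d: (d[2] + d[4]) / 2)
    return "".join(PERSIAN_DIGITS[d[0]] for d in kept)


if __name__ == "__main__":
    detections = [
        (2, 0.90, 120, 10, 160, 60),
        (1, 0.95, 10, 10, 50, 60),
        (7, 0.30, 60, 10, 100, 60),  # below the threshold, dropped
        (0, 0.80, 65, 12, 105, 58),
    ]
    print(read_number(detections))
```

This sketch handles a single line of digits only; multi-line input would need the boxes grouped by row first.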

Business/Product phases

  • Designing a web service for production.

About

A FLOSS project for Persian Optical Character Recognition
