Ayonija/ScannedPDF-to-SearchablePDF

ScannedPDF-to-SearchablePDF

This is a C# project that converts a scanned PDF into a searchable PDF using Google Tesseract.

  • To launch this project, download it and open it in Visual Studio (with Microsoft .NET support) or any other suitable IDE.

  • Using the NuGet package installer, install all the packages listed in the packages.config file at the versions specified there.

  • Next, open the Program.cs file and change the input PDF and output PDF paths. Also check the path of the tessdata folder (the one you just downloaded) and set it in Program.cs (the training_data variable).
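
Before running the conversion, it can help to confirm that the three paths from the last step actually resolve. The following is a minimal sketch, not code from this repository: only the training_data name comes from Program.cs, while inputPdf, outputPdf, the PathCheck class, and all example paths are hypothetical placeholders. It relies on System.IO only and assumes the tessdata folder holds Tesseract's usual .traineddata language files.

```csharp
using System;
using System.IO;

class PathCheck
{
    static void Main()
    {
        // Hypothetical example paths; replace them with your own locations.
        string inputPdf = @"C:\pdfs\scanned.pdf";
        string outputPdf = @"C:\pdfs\searchable.pdf";
        // Same variable name as in Program.cs; the value here is an assumption.
        string training_data = @"C:\tessdata";

        // The scanned input PDF must already exist.
        if (!File.Exists(inputPdf))
            Console.WriteLine("Input PDF not found: " + inputPdf);

        // The tessdata folder must exist and contain at least one language file.
        if (!Directory.Exists(training_data))
            Console.WriteLine("tessdata folder not found: " + training_data);
        else if (Directory.GetFiles(training_data, "*.traineddata").Length == 0)
            Console.WriteLine("No .traineddata files in: " + training_data);

        // The output file need not exist yet, but its parent folder should.
        string outputDir = Path.GetDirectoryName(Path.GetFullPath(outputPdf));
        if (outputDir != null && !Directory.Exists(outputDir))
            Console.WriteLine("Output folder not found: " + outputDir);

        Console.WriteLine("Path check finished.");
    }
}
```

Any message printed before "Path check finished." points at a path that should be corrected in Program.cs before the project is run.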

The training data (tessdata folder) comes from Google Tesseract and can be replaced with newer versions as they are released.