ahmedalm1/form-recognizer-json-to-csv


Converting receipt line-items extracted using Form Recognizer from JSON to CSV format

By default, Form Recognizer returns its results in JSON format, which is a convenient way of storing the information extracted from scanned documents. Converting that JSON output to CSV is typically straightforward, and the CSV can then be used for further analysis, for example in Power BI. One way of achieving that conversion is to use the Form Recognizer REST API with Python; an example can be found in this repo: Quickstart: Extract invoice data using the Form Recognizer REST API with Python.

In the case of receipts or invoices, however, converting line items from JSON to CSV can be challenging because of their dynamic nature: the number of line items varies from one document to the next. In this repo, we tackle the challenge of storing line items extracted from receipts in CSV format, using a simple Logic App flow.

Pre-requisites

  • Azure Subscription
  • Azure Logic App
  • Storage Account
  • Form Recognizer
  • Sample receipt

Steps

Step 1: Setting up the environment

Create a new Resource Group in your Azure Subscription and provision the following resources:

  • Storage Account
  • Logic App
  • Form Recognizer

image

You can also deploy the required resources using this ARM template:

Deploy to Azure

Step 2: Setting up the Logic App

  1. Create a blank Logic App.

image

  2. Search for "Azure Blob Storage" and select the "When a blob is added or modified" trigger.

image

  3. You will be required to create a connection to the storage account. Fill in the information and click on Create to proceed.

image

  4. You will be required to select a container in the storage account to monitor. Fill in the information and click on New step to proceed.

image

  5. From the "Azure Blob Storage" list of actions, select "Get blob content".

image

  6. Fill in the information and click on New step to proceed.

image

  7. Search for "Form Recognizer" and select "Analyze Receipt" from the list of actions.

image

  8. You will be required to create a connection to Form Recognizer. Fill in the information and click on Create to proceed.

image

  9. Fill in the information and click on New step to proceed.

image

  10. From the "Variables" list of actions, select "Initialize variable".

image

  11. Create an empty array variable that will hold the line items. Fill in the information and click on New step to proceed.

image

  12. From the "Control" list of actions, select "For each".

image

  13. Search for "documentResults" and select it as the loop parameter. Click on Add an action inside the loop to proceed.

image

  14. Add another "For each" loop. Search for "items" and select "Items field Items" as the loop parameter. Click on Add an action inside the second loop to proceed.

image

  15. From the "Data Operations" list of actions, select "Compose".

image

  16. Use the following structure as input and replace the placeholders with the corresponding dynamic values. Click on Add an action inside the second loop to proceed.
{
  "item": "<Item field value Name>",
  "price": "<Item field value Price>",
  "quantity": "<Item field value Quantity>",
  "total_price": "<Item field value Total price>"
}

image

  17. From the "Variables" list of actions, select "Append to array variable".

image

  18. Select the name of the array and choose "Outputs" of "Compose" as the value. Click on New step outside the loops to proceed.

image
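
For readers who find code easier to follow than designer screenshots, the two nested "For each" loops, the "Compose" action and the "Append to array variable" action can be sketched in Python as below. This is only an illustration, not part of the Logic App. The `sample_result` payload is made up, and its field names (`analyzeResult`, `documentResults`, `fields`, `Items`, `valueArray`, `valueObject`, `Name`, `Price`, `Quantity`, `TotalPrice`) are an assumption based on the v2.1 receipt response schema; check them against your own Form Recognizer output. The helper names `field_value` and `collect_line_items` are hypothetical.

```python
# Hypothetical, trimmed-down analysis result. The field names are assumed
# to follow the v2.1 receipt schema and may differ in other API versions.
sample_result = {
    "analyzeResult": {
        "documentResults": [
            {
                "fields": {
                    "Items": {
                        "type": "array",
                        "valueArray": [
                            {"type": "object", "valueObject": {
                                "Name": {"type": "string", "valueString": "Coffee"},
                                "Quantity": {"type": "number", "valueNumber": 2},
                                "Price": {"type": "number", "valueNumber": 1.5},
                                "TotalPrice": {"type": "number", "valueNumber": 3.0},
                            }},
                            {"type": "object", "valueObject": {
                                "Name": {"type": "string", "valueString": "Bagel"},
                                "TotalPrice": {"type": "number", "valueNumber": 2.25},
                            }},
                        ],
                    }
                }
            }
        ]
    }
}


def field_value(field):
    """Return the typed value of a field, or "" when the field is missing."""
    if not field:
        return ""
    for key in ("valueString", "valueNumber"):
        if key in field:
            return field[key]
    # Fall back to the raw recognized text if no typed value is present.
    return field.get("text", "")


def collect_line_items(result):
    """Flatten every receipt line item into a list of flat dictionaries."""
    line_items = []  # "Initialize variable": the empty array
    for document in result["analyzeResult"]["documentResults"]:  # outer "For each"
        items = document.get("fields", {}).get("Items") or {}
        for item in items.get("valueArray", []):  # inner "For each"
            obj = item.get("valueObject", {})
            row = {  # "Compose": same keys as the structure in the steps above
                "item": field_value(obj.get("Name")),
                "price": field_value(obj.get("Price")),
                "quantity": field_value(obj.get("Quantity")),
                "total_price": field_value(obj.get("TotalPrice")),
            }
            line_items.append(row)  # "Append to array variable"
    return line_items
```

Line items that lack a field (the second item in the sample has no price or quantity) end up with an empty string in that column, so every row keeps the same four keys.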

  19. From the "Data Operations" list of actions, select "Create CSV table".

image

  20. Fill in the information and click on New step to proceed.

image
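
As a rough Python equivalent of what a "Create CSV table" step does with the array of line items, the sketch below turns a list of flat dictionaries into CSV text using only the standard library. The function name `create_csv_table` is hypothetical, and taking the column headers from the keys of the first row is an assumption made for this illustration.

```python
import csv
import io


def create_csv_table(rows):
    """Render a list of flat dictionaries as CSV text with a header row."""
    if not rows:
        return ""
    buffer = io.StringIO()
    # Assumption: every row has the same keys, so the first row defines the columns.
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(create_csv_table([
        {"item": "Coffee", "price": 1.5, "quantity": 2, "total_price": 3.0},
    ]))
```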

  21. From the "Azure Blob Storage" list of actions, select "Create blob".

image

  22. Fill in the information. For "Blob name", use a concat function to append "-line-items.csv" to the file name for the generated CSV file. For "Blob content", use "Outputs" of the "Create CSV table" action. Save your Logic App to proceed.

image
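
The naming rule for the generated file can be expressed in Python as follows. This is a minimal sketch with a hypothetical function name; it takes the instruction literally and appends the suffix to the full file name, extension included. Whether the original flow strips the extension first is not stated here.

```python
def line_items_blob_name(source_name):
    """Append the "-line-items.csv" suffix to the uploaded file's name."""
    return f"{source_name}-line-items.csv"


if __name__ == "__main__":
    # The sample receipt used in the testing step is "Receipt.png".
    print(line_items_blob_name("Receipt.png"))
```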

Step 3: Testing the Logic App

  1. From "Run Trigger", click on "Run".

image

  2. Upload the sample receipt "Receipt.png" to the container in the storage account.

Receipt

  3. Wait for the Logic App flow to finish.

image

  4. The resulting CSV file will contain all line items from the receipt.

image

image

License

For all licensing information refer to LICENSE.

About

A Logic App that converts JSON output from Form Recognizer to CSV format, for line items extracted from receipts.
