With the push toward digitization over the last few years, many businesses now want to automatically process the hundreds of receipts they receive. Traditionally, a person would look at each paper receipt, manually extract the relevant information, and enter it into a database, a process that is laborious and expensive. Receipt extraction technology speeds this up by using OCR to scan a picture of the receipt and extract the data in seconds. It is a way of automating receipt scanning and extraction to collect information faster.
Receipt OCR is a tool that uses OCR to extract and digitize meaningful data from scanned or PDF receipts. Fields commonly captured by receipt OCR include description, quantity, due date, line items, merchant and store information, unit price, bill-to, receipt number, total amount, and tax amount.
This technology works in several steps. The first step is image preprocessing: scanned receipts are usually noisy, so noise removal and grayscale conversion are needed for the text extraction engines to work well. The next step is text detection with OCR (Optical Character Recognition), which extracts text from various file types: PDF, DOCX, JPEG, PNG, etc. The goal of this step is only to get the text in the document, without dealing with the document's structure.
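To make the preprocessing step more concrete, here is a minimal sketch of grayscale conversion followed by noise removal. It is written in plain Python on a tiny in-memory image so it can run anywhere. The function names (`to_grayscale`, `median_filter`) are ours, the median filter is just one common denoising choice, and a real pipeline would more likely rely on an image library.

```python
from statistics import median

def to_grayscale(rgb_image):
    """Convert rows of (R, G, B) tuples to rows of 0-255 luminance values."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in rgb_image
    ]

def median_filter(gray, radius=1):
    """Replace each pixel by the median of its neighbourhood (clipped at the borders)."""
    height, width = len(gray), len(gray[0])
    filtered = []
    for y in range(height):
        row = []
        for x in range(width):
            window = [
                gray[j][i]
                for j in range(max(0, y - radius), min(height, y + radius + 1))
                for i in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            row.append(int(median(window)))
        filtered.append(row)
    return filtered

# Illustrative input: a 3x3 white patch with one dark noise pixel in the middle.
white, black = (255, 255, 255), (0, 0, 0)
image = [
    [white, white, white],
    [white, black, white],
    [white, white, white],
]

gray = to_grayscale(image)
clean = median_filter(gray)
print(clean)
```

The isolated dark pixel is replaced by the value of its neighbours, which is the kind of speckle removal that helps the OCR engine in the next step.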
The final step is data extraction and categorization, which classifies the extracted text into keys and tags such as tax and total amount. It is based on deep learning algorithms and NER (named entity recognition). The final result of the parsing is a structured, machine-readable form, often a JSON, XML, or even CSV file, which makes it easy to store in a database and analyze automatically.
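As a rough illustration of what "structured output" means here, the sketch below turns raw OCR text into a small JSON document. Note that it uses simple regular expressions as a stand-in for the deep learning and NER models described above, and that the field names (`merchant_name`, `tax`, `total`) and the sample text are assumptions made for the example.

```python
import json
import re

def parse_receipt_text(text):
    """Map raw OCR text to a few structured fields using regular expressions."""
    def find_amount(label):
        # Matches lines such as "TOTAL: 7.70" or "Tax 0.70" (label at line start).
        pattern = rf"^{label}\s*:?\s*\$?(\d+\.\d{{2}})\s*$"
        match = re.search(pattern, text, re.IGNORECASE | re.MULTILINE)
        return float(match.group(1)) if match else None

    lines = [line.strip() for line in text.splitlines() if line.strip()]
    return {
        # Assumption: the first non-empty line is the merchant name.
        "merchant_name": lines[0] if lines else None,
        "tax": find_amount("tax"),
        "total": find_amount("total"),
    }

# Illustrative OCR output, not taken from the benchmark.
ocr_text = """ACME MARKET
Coffee 2 x 3.50
TAX: 0.70
TOTAL: 7.70
"""

parsed = parse_receipt_text(ocr_text)
print(json.dumps(parsed, indent=2))
```

Once the receipt is in this form, storing it in a database or exporting it to CSV is straightforward.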
Receipt OCR is mostly used to automate and optimize supply chain management, which is the backbone of many businesses. Managing tasks, information, and production is essential to keeping production costs under control. A digitized supply chain benefits these companies by helping ensure on-time delivery. The key to digitization is automating the capture and management of data, much of which comes in the form of receipts and invoices. Having an employee enter receipts manually has a negative impact across the supply chain and leads to unnecessary delays. Digitizing receipt processing can lead to substantial gains in time and efficiency.
For our study of receipt OCR APIs, we chose 8 provider APIs that perform well according to many blog articles and rankings.
This is the list of provider APIs we are going to test. Note that other APIs and open source solutions also exist.
As mentioned previously, receipt OCR APIs are mainly used for supply chain management and receivables automation, where the goal is a fully digitized supply chain. In this use case, we received 11 receipts from different stores, each one a scanned image. We tested each of the 8 APIs on each of the 11 documents and benchmarked the results. Of course, for a real project you will need to test on a representative sample of your data to get reliable results.
In our benchmark, we wanted to compare the performance of the APIs on various fields: customer and store information (full name, address), invoice number, receipt total, tax, and line items. Some providers extract other fields from the receipt, but since we only want specific information, we focus on these. Each API returns a JSON response, from which we extract the specific fields.
While using different APIs for receipt digitization, we met some challenges. Some providers perform well on basic information like name, address, and total but don't retrieve line items and taxes, while others perform well on taxes and line items but not on the invoice number and basic information. Some providers retrieve the majority of the fields but not the line items.
Another challenge concerns the returned keys: some APIs return the store name and the headline under separate keys, while others return a single key holding the full store name, which includes both fields. To make this easier and work around these challenges, we used Eden AI to call the APIs from the different vendors. Eden AI lets us get results from multiple receipt parser APIs with a single request and returns a standardized response for all of them.
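To illustrate the key mismatch described above, here is a small sketch of what standardizing two differently shaped responses could look like if you did it by hand. The provider names, response shapes, and key names are all hypothetical and do not reflect any real provider's API or the Eden AI response format.

```python
def standardize(provider, response):
    """Map a provider-specific response to one common schema (hypothetical shapes)."""
    if provider == "provider_a":
        # Hypothetical provider A: store name and headline come as separate keys.
        parts = (response.get("store_name"), response.get("headline"))
        merchant = " ".join(part for part in parts if part)
        return {"merchant_name": merchant, "total": float(response["total"])}
    if provider == "provider_b":
        # Hypothetical provider B: one key already holds the full store name.
        return {
            "merchant_name": response["merchant"]["full_name"],
            "total": float(response["amount_total"]),
        }
    raise ValueError(f"Unknown provider: {provider}")

# Illustrative raw responses for the same receipt.
raw_a = {"store_name": "ACME", "headline": "MARKET", "total": "7.70"}
raw_b = {"merchant": {"full_name": "ACME MARKET"}, "amount_total": 7.7}

standard_a = standardize("provider_a", raw_a)
standard_b = standardize("provider_b", raw_b)
print(standard_a == standard_b)
```

With a standardized response, this mapping layer is no longer something you have to write and maintain for each vendor.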
You don't need to do any response formatting to compare them, so if you want to combine results from multiple providers, it can be done easily with a few lines of code.
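As a sketch of what those few lines might look like, the example below merges already standardized results field by field, taking each field from the first provider in a priority list that returned a value. The provider names, the data, and the `combine_results` helper are hypothetical.

```python
def combine_results(results, priority):
    """Pick each field from the first provider in its priority list that returned a value.

    results:  {provider: {field: value}}
    priority: {field: [providers, most trusted first]}
    """
    combined = {}
    for field, providers in priority.items():
        combined[field] = None
        for provider in providers:
            value = results.get(provider, {}).get(field)
            if value is not None:
                combined[field] = value
                break
    return combined

# Illustrative standardized results for one receipt (made-up data).
results = {
    "provider_a": {"merchant_name": "ACME MARKET", "total": 7.70, "line_items": None},
    "provider_b": {
        "merchant_name": None,
        "total": 7.70,
        "line_items": [{"description": "Coffee", "quantity": 2, "unit_price": 3.50}],
    },
}
priority = {
    "merchant_name": ["provider_a", "provider_b"],
    "total": ["provider_a", "provider_b"],
    "line_items": ["provider_b", "provider_a"],
}

receipt = combine_results(results, priority)
print(receipt)
```

The priority lists would typically come from your own benchmark, with the most reliable provider for each field listed first.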
Alternatively, you can use the web interface, where you import the receipt and choose the providers you want to test.
You can also track and evaluate your cost for each provider available for receipt parsing. Since this includes the pricing per request, it gives you an idea of the budget for your project.
Here are the results we got for this use case (percentage of recognition):
Please note that the results represent the percentage of receipts for which the extracted value is exactly right. A prediction that is close to the real field without being exact is counted as a bad prediction, e.g. returning the store headline as the store name. Warning: these results are not a general measure of performance, which will always depend on your dataset. You cannot know in advance which providers will be best for your data and use case. All of these providers give good results on some types of receipts and in some languages.
The best way to get the highest performance always depends on the data. For some use cases, an API from provider A will be the best; for others, provider B's API will be better. For a more complex use case, a combination may be needed, and provider C + provider D will be the best option.
With Eden AI, you can quickly get results from various providers, which gives you a better idea of which solution fits you best.
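The scoring rule described above (exact match per receipt, reported as a percentage) can be sketched as follows. The sample data is made up for illustration and is not the benchmark data, and `field_accuracy` is our own helper name.

```python
def field_accuracy(predictions, ground_truth, field):
    """Percentage of receipts whose predicted value for `field` exactly matches the truth."""
    correct = sum(
        1
        for predicted, expected in zip(predictions, ground_truth)
        if predicted.get(field) == expected[field]
    )
    return 100.0 * correct / len(ground_truth)

# Made-up ground truth and predictions for four receipts.
ground_truth = [
    {"store_name": "ACME MARKET", "total": 7.70},
    {"store_name": "Blue Cafe", "total": 12.00},
    {"store_name": "City Books", "total": 30.50},
    {"store_name": "Delta Fuel", "total": 45.00},
]
predictions = [
    {"store_name": "ACME MARKET", "total": 7.70},
    # Headline returned instead of the store name: counted as wrong.
    {"store_name": "Your neighbourhood cafe", "total": 12.00},
    {"store_name": "City Books", "total": 30.50},
    {"store_name": "Delta Fuel", "total": 45.00},
]

store_name_score = field_accuracy(predictions, ground_truth, "store_name")
total_score = field_accuracy(predictions, ground_truth, "total")
print(store_name_score, total_score)
```

Because near misses count as errors, this metric is strict. Depending on your use case, a softer comparison (for example, ignoring case or whitespace) may be more appropriate.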
The decision process is as follows:
1- First, you run your data on Eden AI to benchmark the APIs available on the market.
2- Then, either the results point you to one API that fits your needs, or they show that different providers give good results on different fields, in which case you can build your own custom solution by combining multiple providers.
3- This process helps you make the right choice for your project. Eden AI is the universal API that gives you flexibility in the use of all these receipt OCR engines so you can always get the best performance/cost ratio.
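The decision in step 2 can be sketched in a few lines: given per-field benchmark scores, pick the best provider for each field and check whether a single provider is enough. The scores below are invented purely for illustration and are not the results of this benchmark; the provider names and the `best_provider_per_field` helper are hypothetical as well.

```python
def best_provider_per_field(scores):
    """Return {field: provider with the highest score on that field}.

    scores: {provider: {field: percentage of exact matches}}
    """
    fields = sorted({field for per_field in scores.values() for field in per_field})
    return {
        field: max(scores, key=lambda provider: scores[provider].get(field, 0.0))
        for field in fields
    }

# Invented scores (in %), NOT the benchmark results from this article.
scores = {
    "provider_a": {"merchant_name": 90.9, "total": 81.8, "line_items": 27.3},
    "provider_b": {"merchant_name": 63.6, "total": 72.7, "line_items": 72.7},
}

choice = best_provider_per_field(scores)
providers_needed = set(choice.values())

if len(providers_needed) == 1:
    print("One provider is best on every field:", providers_needed)
else:
    print("Combine providers per field:", choice)
```

In this invented case no single provider wins on every field, so a combination would be the natural choice. Adding the price per request to the comparison would let you weigh performance against cost.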