TOP 10 Optical Character Recognition (OCR) API

Updated: 2 days ago



This article is brought to you by the Eden AI team. We let you test and use in production a large number of AI engines from different providers, directly through our API and platform. If you are a solution provider and want to integrate Eden AI, contact us at contact@edenai.co.

In this article, we look at how to integrate an Optical Character Recognition (OCR) engine into your project, and how to choose and access the right engine for your data.


Definition:


Optical Character Recognition, also called OCR, is a technology that recognizes text within a digital image. The basic process of OCR involves examining the text of a document and translating the characters into code that can be used for data processing. OCR systems combine hardware and software to convert physical documents into machine-readable text: hardware such as a scanner copies or reads the text, while software typically handles the advanced processing.


History:


OCR traces its roots back to telegraphy. On the eve of the First World War, physicist Emanuel Goldberg invented a machine that could read characters and convert them into telegraph code. In the 1920s, he went a step further and created the first electronic document retrieval system.

Early versions of OCR had to be trained with images of each character and were limited to recognising one font at a time. In the 1970s, inventor Ray Kurzweil commercialised “omni-font OCR”, which could process text printed in almost any font.

OCR technology became popular in the early 1990s, driven by efforts to digitize historic newspapers. In the early 2000s, OCR became available online as a cloud-based service, accessible via desktop and mobile applications.

Today, there’s a host of OCR service providers offering technology (often accessible via APIs) capable of recognising most characters and fonts to a high level of accuracy.


Top 10 OCR API:


Abbyy

ABBYY FineReader PDF is an optical character recognition (OCR) application developed by ABBYY, with support for PDF file editing. It converts image documents (photos, scans, PDF files) and screen captures into editable electronic formats.


Microsoft Azure - Available on Eden AI

The Computer Vision Read API is Microsoft Azure's latest OCR technology that extracts printed text (in several languages), handwritten text (in several languages), digits, and currency symbols from images and multi-page PDF documents. It's optimized to extract text from text-heavy images and multi-page PDF documents with mixed languages. It supports detecting both printed and handwritten text in the same image or document.




OCR Space - Available on Eden AI

The OCR.space online OCR service converts scans or (smartphone) images of text documents into editable files by using Optical Character Recognition (OCR). The OCR software can also extract text from PDF files. The online OCR service is free to use, with no registration necessary: you just upload your image files.




AWS - Available on Eden AI

Amazon Rekognition can detect text in images and videos and convert it into machine-readable text, which you can then use in your own applications. Amazon Rekognition is designed to detect words in English. It might also detect words in other languages that use the same characters, but it doesn't detect diacritics and other characters.




CloudMersive

The Cloudmersive Optical Character Recognition API returns image-to-text responses in JSON, text, or XML format. An API key is required to authenticate. Cloudmersive provides scalable computer vision and natural language APIs that convert scanned documents, or photos of documents and receipts, to digital text. It uses machine learning to automatically pre-process the image and then recognize text in more than 90 languages.


Google Cloud - Available on Eden AI

Google Cloud Vision includes OCR services: the Vision API can detect and extract text from images, and it also includes an OCR engine for extracting text from documents.




Base64.ai

Base64.ai is a cloud-based artificial intelligence service that instantly and accurately extracts text, data, handwriting, photos, and signatures from all types of documents, including IDs, driver licenses, passports, visas, receipts, invoices, forms, and hundreds of other document types worldwide. In seconds, Base64.ai discerns the document's type, extracts the relevant information, verifies the results, and integrates them into the customer's systems.


Dataleon

Dataleon provides Machine Learning tools for data automation and processing. Ready-to-use APIs for data recognition and extraction are available to accelerate AI-powered digital transformation. To address companies’ needs, Dataleon develops innovative, adjustable automation solutions that are available in the cloud and built on AI.


Rossum

Rossum solves four key steps in document-based processes at once: receiving documents across multiple channels, automated understanding, two-way communication to resolve exceptions, and acting on the data using in-depth integrations. Whether you receive invoices, purchase orders, claims, or any other documents, Rossum automates your business communication.


Tesseract (Bonus - Open Source)

Tesseract is an OCR engine with support for Unicode and the ability to recognize more than 100 languages out of the box. It can be trained to recognize other languages. It can be used directly or, for programmers, through an API to extract printed text from images. It can be used with the existing layout analysis to recognize text within a large document, or in conjunction with an external text detector to recognize text from an image of a single text line.


Use cases:


You can use OCR in numerous fields, and sometimes specific models are trained for those fields. Here are some common use cases:

  • healthcare: digitize a patient's entire medical history, including health reports, X-rays, disease history, treatment tracking, diagnoses, hospital records, insurance coverage, and payments

  • banking: convert scanned documents into searchable PDF files, so that specific information can be retrieved with a keyword search

  • legal: scan, store, and preserve printed documents (affidavits, judgments, declarations, notices, wills, etc.) in searchable databases

  • supply chain: instantly read batch codes, expiration dates, and serial numbers

  • insurance: automate claims processing with OCR and supporting technologies


Open source VS API


When you need an OCR engine, you have two options:

  • First option: open source OCR engines. Several exist and they are free to use. Some of them perform well, but they can be complex to set up and use: an open source AI library requires data science expertise, and you will need to set up a server internally to run the engine.

  • Second option: engines from your cloud provider. Cloud providers like Google Cloud, AWS, Microsoft Azure, Alibaba Cloud or IBM Watson all provide multiple AI engines, including OCR. This option looks very easy because you stay in a familiar environment where your company may already have expertise, and the engine is ready to use.

The only way to select the right provider is to benchmark different providers’ engines on your own data and either choose the best one or combine the results of several engines. You can also compare prices if cost is one of your priorities, and do the same for speed.

This method is the best in terms of performance and optimization, but it has many drawbacks:

  • you may not know every high-performing provider on the market

  • you need to subscribe to and contract with every provider

  • you need to master each provider's API documentation

  • you need to check their pricing

  • you need to process your data with each engine to carry out the benchmark
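
To make the benchmarking step concrete, here is a minimal sketch in Python of how providers could be ranked against a reference transcription. It assumes you score each engine with a character error rate (edit distance divided by reference length), which is one common choice among several. The function names, provider names, and sample strings are all hypothetical and only serve the illustration.

```python
from typing import Dict, List, Tuple


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: inserts, deletes and substitutions each cost 1."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance divided by reference length; lower is better."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


def rank_providers(reference: str, outputs: Dict[str, str]) -> List[Tuple[str, float]]:
    """Return (provider, error rate) pairs, most accurate first."""
    scores = {name: character_error_rate(reference, text)
              for name, text in outputs.items()}
    return sorted(scores.items(), key=lambda item: item[1])


# Hypothetical data: a hand-typed reference and made-up provider outputs.
reference = "Invoice 2041 due 30 June"
outputs = {
    "provider_a": "Invoice 2041 due 30 June",
    "provider_b": "Invoice 2O41 due 3O June",
    "provider_c": "lnvoice 2O41 due 30 Jun",
}
ranking = rank_providers(reference, outputs)
for name, rate in ranking:
    print(f"{name}: {rate:.3f}")
```

In practice you would average the error rate over a representative set of your own documents rather than a single line, and weigh it against price and speed.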

This is where Eden AI becomes very useful. You just have to create an Eden AI account to get access to many providers’ engines for many technologies, including OCR. The platform lets you benchmark and visualize results from different engines, and centralizes the cost of using different providers.

Eden AI provides the same easy-to-use API, with the same documentation, for every technology. You can use the Eden AI API to call OCR engines with the provider as a simple parameter. With only a few lines of code, you can set up your project in production.
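
To illustrate what "a provider as a simple parameter" can look like, here is a small Python sketch of the general pattern. It is not Eden AI's SDK or API: the names `register_engine`, `run_ocr`, `engine_a` and `engine_b` are hypothetical, and the engines are stand-ins that return placeholder strings instead of calling a real service.

```python
from typing import Callable, Dict, List

# An OCR engine is modeled as a function from image bytes to recognized text.
OcrEngine = Callable[[bytes], str]

_ENGINES: Dict[str, OcrEngine] = {}


def register_engine(name: str, engine: OcrEngine) -> None:
    """Make an engine selectable by name."""
    _ENGINES[name] = engine


def run_ocr(image: bytes, providers: List[str]) -> Dict[str, str]:
    """Run the same image through every requested provider."""
    unknown = [name for name in providers if name not in _ENGINES]
    if unknown:
        raise KeyError(f"unknown provider(s): {unknown}")
    return {name: _ENGINES[name](image) for name in providers}


# Stand-in engines for illustration only; real ones would call a vendor API.
register_engine("engine_a", lambda image: f"<text from engine_a, {len(image)} bytes>")
register_engine("engine_b", lambda image: f"<text from engine_b, {len(image)} bytes>")

# Switching or adding a provider only changes the list passed as a parameter.
results = run_ocr(b"\x89PNG...", providers=["engine_a", "engine_b"])
```

The calling code stays the same whichever engines are requested, which is what makes side-by-side comparison cheap.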



Test and API:


Here is the code in Python (GitHub repo) that lets you test Eden AI for OCR:

Eden AI Python SDK: Optical Character Recognition (OCR)

Results:


{'Google Cloud': 'In mid.-april Anglesey mortd his fami ly entounage from Rome to Naplos, there to await the arrival of', 'Amazon Web Services': 'In mid-april Anglesey mored his family and entorage from Rome to Naples, there to await the arrival of'}
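
The two outputs above disagree in a few places. As a minimal sketch, assuming a word-level comparison is good enough for a first look, Python's standard `difflib` module can list the spans where the providers differ. The helper name `disagreements` is hypothetical; the two strings are copied from the results above.

```python
import difflib
from typing import List, Tuple

# The two provider outputs shown in the results above.
results = {
    "Google Cloud": "In mid.-april Anglesey mortd his fami ly entounage from Rome to Naplos, there to await the arrival of",
    "Amazon Web Services": "In mid-april Anglesey mored his family and entorage from Rome to Naples, there to await the arrival of",
}


def disagreements(text_a: str, text_b: str) -> List[Tuple[str, str]]:
    """Return the word spans where two transcriptions differ."""
    words_a, words_b = text_a.split(), text_b.split()
    matcher = difflib.SequenceMatcher(None, words_a, words_b, autojunk=False)
    spans = []
    for tag, a_start, a_end, b_start, b_end in matcher.get_opcodes():
        if tag != "equal":
            spans.append((" ".join(words_a[a_start:a_end]),
                          " ".join(words_b[b_start:b_end])))
    return spans


diffs = disagreements(results["Google Cloud"], results["Amazon Web Services"])
for google_words, aws_words in diffs:
    print(f"Google Cloud: {google_words!r:25} | AWS: {aws_words!r}")
```

Spans where engines disagree are good candidates for manual review, or for a voting step when you combine more than two providers.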

Platform:


Eden AI Platform: Optical Character Recognition (OCR)

Conclusion:


There are numerous OCR engines available on the market: it’s impossible to know all of them, let alone which ones perform well. The best way to integrate OCR technology is a multi-cloud approach, which lets you reach the best performance and prices for your data and project. This approach may seem complex, but Eden AI simplifies it for you by centralizing the best providers’ APIs.
