Tutorial
Document Processing
8 min reading

How to use OCR Table to generate CSV with Python


Quickly and easily extract tables from documents and transform them into CSV in just a few simple steps!

Why convert to CSV?

CSV is a simple, human-readable, and widely supported format for tabular data. That makes it the go-to choice for data manipulation, analysis, and integration with existing systems, especially for businesses that rely on spreadsheets, databases, and data warehouses for decision-making.

That is precisely why we are offering a Python-based solution for converting JSON responses from the Eden AI OCR Table API into CSV format. By following these simple steps, you’ll acquire practical skills to streamline data processing and integration, so you get the most out of your digitized content.

How to convert a table into a CSV file

1. Get Response from OCR Table API

NOTE: This tutorial concentrates on simple tables that map cleanly onto the .csv format. Tables with many row and column spans are an entirely different challenge to represent in such a simple format.

First things first: we need to parse our document into JSON using the Eden AI API.

The API is asynchronous: the call returns right away with a job identifier while the document is processed in the background, so we can launch several requests at the same time without waiting for the previous one to finish. This is useful when you need to parse a document spanning multiple pages, which would take a long time to process.

However, for the purpose of this example, we will just send a very simple table, whose image URL appears as file_url in the snippet below.

Here is a code snippet to show you how to launch the job:


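import requests

# Replace the placeholder with your own Eden AI API key
headers = {"Authorization": "Bearer Your_API_Key"}
url = "https://api.edenai.run/v2/ocr/ocr_tables_async"

# Image containing the table to extract, and the provider that will process it
file_url = "https://developer.mozilla.org/en-US/docs/Learn/HTML/Tables/Basics/numbers-table.png"
provider = "amazon"

payload = {
    "providers": provider,
    "file_url": file_url,
    "language": "en",
}

# Launch the asynchronous job
response = requests.post(url, json=payload, headers=headers)
response.raise_for_status()  # stop early on an HTTP error
result = response.json()

# Identifier used later to fetch the result of the job
job_id = result['public_id']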
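
The snippet above reads `result['public_id']` directly, which ends in a bare `KeyError` if the response does not contain that key. Here is a minimal sketch of a guard, assuming only that a successful launch response carries a `public_id`. The helper name `extract_job_id` is hypothetical, and the shape of an error payload is not documented in this tutorial, so the whole response is simply reported back.

```python
def extract_job_id(result: dict) -> str:
    """Return the job id from a launch response, or raise a readable error.

    Hypothetical helper: it only assumes that a successful launch response
    contains a 'public_id' key, as in the snippet above.
    """
    job_id = result.get("public_id")
    if not job_id:
        # The error payload format is an assumption, so we surface
        # whatever the API sent back instead of parsing it.
        raise RuntimeError(f"Job was not launched, API answered: {result!r}")
    return job_id
```

It could be used as `job_id = extract_job_id(response.json())` in place of the direct key access.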

The API returns a public_id that we can now use to get the result of the job. Since we don’t know when it will finish, we will poll the job and check its status every 5 seconds.


import time

def poll_ocr_table_job(job_id: str, max_poll_count: int = 10, poll_interval_sec: int = 5) -> dict:
    """
    Poll the asynchronous job every `poll_interval_sec` seconds.
    Raises an Exception if the job is still not finished after `max_poll_count` polls.
    """
    for _ in range(max_poll_count):
        time.sleep(poll_interval_sec)
        response = requests.get(f"{url}/{job_id}", headers=headers)
        data = response.json()
        if data['status'] == 'finished':
            return data
    raise Exception("Call took too long.")


poll_response = poll_ocr_table_job(job_id)
# We know there is only one page and one table here.
# In practice, you can iterate over pages and create one CSV file per table found.
json_table = poll_response['results'][provider]['pages'][0]['tables'][0]
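
The comment above mentions iterating over pages and tables. Here is a minimal sketch of that idea, assuming the provider result keeps the same `pages` and `tables` nesting used in the line above. The helper name `iter_tables`, the sample data, and the file naming scheme are hypothetical.

```python
from typing import Iterator, Tuple


def iter_tables(provider_result: dict) -> Iterator[Tuple[int, int, dict]]:
    """Yield (page_index, table_index, table) for every table found."""
    for page_index, page in enumerate(provider_result.get("pages", [])):
        for table_index, table in enumerate(page.get("tables", [])):
            yield page_index, table_index, table


# Made-up stand-in for poll_response['results'][provider]:
# two tables on the first page, one on the second.
sample = {
    "pages": [
        {"tables": [{"rows": []}, {"rows": []}]},
        {"tables": [{"rows": []}]},
    ]
}

# One output file name per table found
names = [f"page{p}_table{t}.csv" for p, t, _ in iter_tables(sample)]
print(names)
```

Each yielded table can then go through the same conversion as `json_table` in the next step.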

2. Use the Python csv library to generate the CSV file

Now that we have the table, we need to format it as a list of lists of strings, each inner list representing a row.

Example:


[ ['header1', 'header2'], ['data1', 'data2']]

Here is how to do it:


csv_table = []
for row in json_table['rows']:
    csv_row = []

    for cell in row['cells']:
        csv_row.append(cell['text'])

    csv_table.append(csv_row)
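
The loop above assumes that every cell has a text value and that every row has the same number of cells. Here is a hedged variant that tolerates empty cells and pads short rows. Whether the API ever returns a missing text or rows of uneven length is an assumption, and both the helper name `table_to_rows` and the sample table are made up for illustration.

```python
def table_to_rows(table: dict) -> list:
    """Convert a table dict with 'rows' -> 'cells' -> 'text' into a list of rows."""
    rows = [
        # A missing or None text becomes an empty string
        [cell.get("text") or "" for cell in row.get("cells", [])]
        for row in table.get("rows", [])
    ]
    # Pad short rows so that every row has the same number of columns
    width = max((len(r) for r in rows), default=0)
    return [r + [""] * (width - len(r)) for r in rows]


# Made-up table with one empty cell and one short row
sample_table = {
    "rows": [
        {"cells": [{"text": "Name"}, {"text": "Age"}]},
        {"cells": [{"text": "Chris"}, {"text": None}]},
        {"cells": [{"text": "Dennis"}]},
    ]
}
print(table_to_rows(sample_table))
# [['Name', 'Age'], ['Chris', ''], ['Dennis', '']]
```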

Finally, we just need to create a CSV file and write the data into it:


import csv

# newline="" lets the csv module control line endings (avoids blank rows on Windows)
with open("table.csv", "w", newline="") as csvfile:
    tablewriter = csv.writer(csvfile)
    tablewriter.writerows(csv_table)

Here is the resulting CSV file:

Chris,38
Dennis,45
Sarah,29
Karen,47
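
The table in this example only contains plain names and numbers. Cells that contain commas or quotes need escaping, and `csv.writer` handles that with its default settings. The short check below writes to an in-memory buffer instead of a file; the sample rows are made up.

```python
import csv
import io

# Made-up rows: one cell contains a comma, another contains quotes
rows = [
    ["Name", "Note"],
    ["Chris", "likes tea, coffee"],
    ["Sarah", 'said "hi"'],
]

buffer = io.StringIO()
csv.writer(buffer).writerows(rows)
text = buffer.getvalue()
print(text)
# Name,Note
# Chris,"likes tea, coffee"
# Sarah,"said ""hi"""

# Reading the text back yields the original rows
read_back = list(csv.reader(io.StringIO(text)))
```

Because the writer quotes only the cells that need it, the file stays readable while still surviving a round trip through `csv.reader`.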

Conclusion

There you have it! We have successfully extracted a table from a document and transformed it into a CSV file. It is straightforward to do with Python, and it should not be hard to implement in other languages.

FAQ — use OCR Table to generate CSV with Python

You need an API key from your chosen AI provider. Eden AI lets you access multiple providers with a single key, removing the need for separate vendor accounts.
Any language that supports HTTP requests works — Python, JavaScript, PHP, Ruby, Go, and more. Ready-to-use code snippets are available for the most common languages.
Most developers complete a basic integration in under an hour using standardized API endpoints and ready-to-use code examples.
Implement exponential backoff for rate limit errors and use try-catch blocks for network failures. Eden AI's built-in fallback routing automatically redirects requests if a provider is unavailable.
Eden AI supports GDPR-compliant provider filtering and does not store or reuse your data, ensuring compliance with European privacy regulations.
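
The backoff advice in the FAQ above can be sketched as follows. This is a minimal illustration, assuming that rate limiting is signaled with HTTP status 429 (the standard "Too Many Requests" code; whether this API uses it is an assumption). The helper name `call_with_backoff` and its parameters are hypothetical.

```python
import random
import time


def call_with_backoff(call, max_retries=5, base_delay_sec=1.0, sleep=time.sleep):
    """Run `call()` and retry with exponentially growing delays on HTTP 429.

    `call` is any zero-argument function returning an object with a
    `status_code` attribute, for example a requests.Response.
    """
    for attempt in range(max_retries):
        response = call()
        if response.status_code != 429:
            return response
        # Delay doubles on each attempt: 1s, 2s, 4s, ... plus a little jitter
        delay = base_delay_sec * (2 ** attempt) + random.uniform(0, 0.1)
        sleep(delay)
    raise RuntimeError("Still rate limited after all retries.")
```

With the launch call from this tutorial, it could be used as `call_with_backoff(lambda: requests.post(url, json=payload, headers=headers))`. The `sleep` parameter exists only so that the waiting can be replaced in tests.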
