Understanding LLM Billing: From Characters to Tokens

Introduction

When working with the Eden AI API, the currency of your interactions isn't just data or results—it's tokens. But what are tokens exactly, and how do they relate to the characters in your text? This guide provides a practical understanding of LLM token billing and explains how various providers handle this crucial aspect of AI model usage.

The Challenge of LLM Token Billing

Large Language Model (LLM) token billing is at the heart of AI-powered applications. In this context, a “token” is not an arbitrary unit of measurement: it is a chunk of text produced by the model's tokenizer, and it is the unit the model actually reads and generates. Tokens can be as small as a single character or as large as an entire word, depending on the language and context. This variability is what makes understanding LLM billing both challenging and fascinating.

In the case of Eden AI specifically, tokenization is handled directly by the model behind each provider, so the same text can produce a different token count depending on the provider you call.

From Characters to Tokens: A Journey through Tokenization

Consider the sentence "Hello, World!". It has 13 characters, counting the comma, the space, and the exclamation mark. But how many tokens is it? This depends on the tokenization method used by the AI model. Some models might see "Hello" as one token and split ", World!" into three tokens (",", " World", and "!"), making four tokens in total. Other models might break it down differently.
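
To make the character-versus-token distinction concrete, here is a minimal sketch that counts the characters and then splits the sentence with a toy regular expression. The toy_split function and its pattern are assumptions made for illustration, not any provider's actual tokenizer; they simply reproduce the four-piece split described above.

```python
import re

# Toy pattern (an assumption for illustration): a word with an optional
# leading space, or a single punctuation mark.
TOY_PATTERN = re.compile(r" ?\w+|[^\w\s]")


def toy_split(text: str) -> list[str]:
    """Split text into rough token-like pieces. Not a real tokenizer."""
    return TOY_PATTERN.findall(text)


sentence = "Hello, World!"
pieces = toy_split(sentence)

print(len(sentence))  # number of characters in the string
print(pieces)         # token-like pieces produced by the toy pattern
print(len(pieces))    # number of pieces
```

A real tokenizer may split the same string differently, which is why the character count alone cannot tell you what a request will cost.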

For example, OpenAI's GPT models use Byte Pair Encoding (BPE) tokenization. BPE starts from individual bytes (or characters in textual data) and iteratively merges the most frequently occurring adjacent pair into a single new symbol, which is added to the vocabulary. This method is efficient for handling both common and rare words, and it often leads to tokens that correspond to common subwords in a language.
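
The merge loop at the core of BPE can be sketched in a few lines. This is a simplified, character-level illustration on a tiny made-up corpus; the function names are hypothetical, and production tokenizers work on bytes with a large, pre-trained merge table rather than learning merges on the fly.

```python
from collections import Counter


def most_frequent_pair(words):
    """Return the most frequent adjacent symbol pair, weighted by word count."""
    pairs = Counter()
    for symbols, freq in words.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs.most_common(1)[0][0] if pairs else None


def merge_pair(words, pair):
    """Replace every occurrence of `pair` with one merged symbol."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        key = tuple(out)
        merged[key] = merged.get(key, 0) + freq
    return merged


def learn_bpe(corpus, num_merges):
    """Learn up to `num_merges` merges from a whitespace-separated corpus."""
    words = dict(Counter(tuple(word) for word in corpus.split()))
    merges = []
    for _ in range(num_merges):
        pair = most_frequent_pair(words)
        if pair is None:
            break
        merges.append(pair)
        words = merge_pair(words, pair)
    return merges, words


merges, words = learn_bpe("low low low lower lowest", num_merges=2)
print(merges)  # the pairs that were merged, in order
print(words)   # each word as a tuple of symbols after merging
```

After two merges the frequent sequence "low" has become a single symbol, while the rarer suffixes of "lower" and "lowest" are still spelled out character by character. That is the sense in which BPE tokens tend to line up with common subwords.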

LLM Token Billing: Per-Request vs. Per-Token

LLMs can be billed either per request or per token. A per-request billing model charges a flat fee for each API call, regardless of the amount of data processed. A per-token model, on the other hand, charges based on the number of tokens processed.

Per-token billing can be more cost-effective for tasks that require processing small amounts of text, while per-request billing might be more economical for tasks that require processing large amounts of text in a single call.
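
The trade-off can be made concrete with a small calculation. The sketch below uses made-up prices (a flat fee of 0.01 per call and 0.002 per 1K tokens); these numbers and the function names are assumptions for illustration only, not any provider's actual rates.

```python
def per_token_cost(tokens: int, price_per_1k: float) -> float:
    """Cost of processing `tokens` tokens at a given price per 1K tokens."""
    return tokens / 1000 * price_per_1k


def break_even_tokens(flat_fee: float, price_per_1k: float) -> float:
    """Token count at which per-token cost equals the flat per-request fee."""
    return flat_fee / price_per_1k * 1000


def cheaper_model(tokens: int, flat_fee: float, price_per_1k: float) -> str:
    """Name the cheaper billing model for a single call of `tokens` tokens."""
    token_cost = per_token_cost(tokens, price_per_1k)
    if token_cost < flat_fee:
        return "per-token"
    if token_cost > flat_fee:
        return "per-request"
    return "equal"


# Hypothetical prices, chosen only to illustrate the comparison.
FLAT_FEE = 0.01
PRICE_PER_1K = 0.002

print(break_even_tokens(FLAT_FEE, PRICE_PER_1K))
print(cheaper_model(500, FLAT_FEE, PRICE_PER_1K))
print(cheaper_model(20_000, FLAT_FEE, PRICE_PER_1K))
```

Below the break-even point the per-token model is cheaper for a given call; above it, the flat fee wins.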

Unraveling the Token Count in Different Languages

Language is a beautiful, complex system, and it adds an extra layer of complexity to token billing. Different languages have different tokenization patterns, which can lead to different token counts for the same content.

Consider the word “communication”. In English, this is a single word with 13 characters. But how many tokens is it? It could be broken down into "commun" and "ication", making it two tokens. In French, the same word is “communication”—identical to the English version and therefore likely the same number of tokens. But in German, it’s “Kommunikation”, which might be tokenized differently.

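Languages written in non-Latin scripts add another wrinkle. In Chinese, a single character can represent a whole word or concept, so one character can carry far more information than one English letter. That does not automatically translate into fewer tokens, though: tokenizers trained mostly on English text often split Chinese or Arabic text into more tokens than the equivalent English content. Similarly, in highly agglutinative languages like Finnish or Turkish, a single word can carry the meaning of an entire English phrase, but such long words are frequently broken into several subword tokens. Token efficiency therefore depends on how well a given tokenizer's vocabulary covers the language, not only on the language itself.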
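
A rough way to see why the same content can cost a different number of tokens across languages is to compare character counts with UTF-8 byte counts, since byte-level tokenizers start from bytes, not characters. The sketch below is only a proxy under that assumption: the sample words are illustrative, and the real token count always depends on the specific tokenizer's vocabulary.

```python
def utf8_profile(text: str) -> dict:
    """Return the number of characters and UTF-8 bytes in `text`."""
    return {
        "characters": len(text),
        "utf8_bytes": len(text.encode("utf-8")),
    }


# Illustrative samples: English, German, Finnish, Chinese.
samples = {
    "English": "communication",
    "German": "Kommunikation",
    "Finnish": "viestint\u00e4",   # "viestintä"
    "Chinese": "\u901a\u4fe1",     # "通信"
}

for language, word in samples.items():
    print(language, word, utf8_profile(word))
```

The Chinese sample has only two characters but six bytes, so a byte-level tokenizer without dedicated vocabulary entries for those characters has more raw material to split than the character count suggests.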

How Eden AI Handles Token Billing

At Eden AI, we've built a system that takes into account the nuances of token billing across different providers and languages. Our API tracks the tokens used for your text generation tasks and bills you accordingly.

In our system, we’ve made the following design decisions to ensure fair and transparent billing:

  1. Token-Based Billing: We bill based on the number of tokens processed by the LLM, providing a fair reflection of the computational resources used.
  2. Transparency: We provide detailed billing information so that you always know exactly what you’re being billed for.
  3. Provider Flexibility: We support multiple LLM providers, each with their own tokenization methods. Our system handles the complexities of different tokenization methods, so you don’t have to.

LLM Token-Based Billing Comparison between Providers

Let’s take a look at some of the major LLM providers and compare their token-based billing. Note that these figures are approximate and can change, so always check with the provider for the most up-to-date pricing information.

OpenAI GPT-3.5

OpenAI’s GPT-3.5 family offers a range of models with different capabilities and pricing. Billing is per token, with the cost varying by model. For instance, the “gpt-3.5-turbo” model, one of the most capable and efficient in the family, is priced per 1K tokens, with separate rates for “input” and “output” tokens; output tokens cost more.

OpenAI GPT-4

OpenAI’s GPT-4 follows a similar per-token pricing model, but it’s typically more expensive due to its enhanced capabilities. Like GPT-3.5, it differentiates between “input” and “output” tokens, with output tokens costing more.

Anthropic Claude 3

Anthropic’s Claude 3 also uses a per-token billing model, though details can vary based on the specific model and its capabilities. It’s always advisable to check Anthropic’s official documentation for the most up-to-date pricing information.
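
Because these providers price input and output tokens separately, the cost of a single call is the sum of two terms. The sketch below shows that arithmetic with placeholder rates; the TokenPrice class, the request_cost function, and the prices are all hypothetical and should be replaced with figures from the provider's current pricing page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenPrice:
    """Price per 1K tokens, split by direction (placeholder values only)."""
    input_per_1k: float
    output_per_1k: float


def request_cost(price: TokenPrice, input_tokens: int, output_tokens: int) -> float:
    """Total cost of one call: input tokens plus output tokens, each at its own rate."""
    input_cost = input_tokens / 1000 * price.input_per_1k
    output_cost = output_tokens / 1000 * price.output_per_1k
    return input_cost + output_cost


# Hypothetical rates where output tokens cost twice as much as input tokens.
example_price = TokenPrice(input_per_1k=0.001, output_per_1k=0.002)
print(request_cost(example_price, input_tokens=1000, output_tokens=500))
```

With output priced higher than input, a long completion can dominate the bill even when the prompt is short, which is one reason to set max_tokens deliberately.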

How to Count Tokens when using the Eden AI API for Text Generation?

Understanding how to count tokens when using the Eden AI API for text generation is essential for managing costs and ensuring your prompts are effective.

When you make a request to the text generation endpoint, you’ll receive usage information in the response, including the number of tokens used for both the prompt and the completion. This information is detailed in the text generation API documentation.

Here’s a step-by-step guide for Python users:

Install the Necessary Libraries

The request below only needs the requests library. tiktoken is optional: it lets you estimate token counts locally for OpenAI models before sending a prompt. You can install both using pip.

pip install requests tiktoken

Import the Libraries

import requests

Make a Request and Count Tokens

# Eden AI text generation endpoint
url = "https://api.edenai.run/v2/text/generation"

# Request body: provider, prompt, and generation settings
payload = {
    "providers": "openai",
    "text": "Hello World, ",
    "temperature": 0,
    "max_tokens": 100,
    "model": "gpt-3.5-turbo",
}
headers = {
    "accept": "application/json",
    "content-type": "application/json",
    "authorization": "Bearer YOUR_EDENAI_API_KEY",  # replace with your own key
}

response = requests.post(url, json=payload, headers=headers, timeout=30)
response.raise_for_status()  # fail early on HTTP errors
data = response.json()

# Print the number of tokens used
print(f'Number of input tokens: {data["openai"]["nb_input_tokens"]}')
print(f'Number of output tokens: {data["openai"]["nb_output_tokens"]}')

In this example, we make a request to the text generation endpoint and print the number of input and output tokens. This information is crucial for billing purposes and helps you understand how your usage translates into tokens.
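
If you log usage for many requests, it can help to read the token counts defensively instead of indexing the response directly. The helper below is a sketch: extract_usage is a hypothetical function, the field names are the ones used in the example above, and the sample dictionary stands in for a parsed response with made-up values.

```python
def extract_usage(data: dict, provider: str) -> dict:
    """Read token counts for one provider, tolerating missing fields."""
    result = data.get(provider) or {}
    input_tokens = result.get("nb_input_tokens")
    output_tokens = result.get("nb_output_tokens")

    # Only compute a total when both counts are present.
    total_tokens = None
    if input_tokens is not None and output_tokens is not None:
        total_tokens = input_tokens + output_tokens

    return {
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "total_tokens": total_tokens,
    }


# Stand-in for `response.json()`; the numbers are made up for illustration.
sample_response = {"openai": {"nb_input_tokens": 4, "nb_output_tokens": 20}}
print(extract_usage(sample_response, "openai"))
```

Returning None for missing counts, instead of raising a KeyError, lets a logging pipeline keep running when a provider's response does not include usage data.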
