VIDEO | How to Generate Voice (Text-to-Speech) using Python

Welcome to our comprehensive tutorial on generating voice from text using AI and Python! Whether you're building a virtual assistant, creating audio content, or exploring the possibilities of AI-driven speech synthesis, this tutorial will equip you with the knowledge and tools you need.

What is Text-to-Speech (Voice Generation)?

Text-to-Speech (TTS), also known as voice generation, is a technology that converts written text into spoken words. Using advanced algorithms and machine learning, TTS systems can read text aloud in a natural-sounding voice. This technology has numerous applications, from assisting visually impaired individuals to enabling hands-free interaction with digital devices.

Applications of Text-to-Speech

Accessibility: TTS is widely used to assist people with visual impairments or reading disabilities, providing them with audio versions of written content.
Virtual Assistants: Digital assistants like Siri, Alexa, and Google Assistant use TTS to interact with users.
Content Creation: TTS can be used to generate audio versions of articles, books, and other text-based content.
Customer Service: Automated phone systems and chatbots often use TTS to provide information and support to customers.

How to Generate Voice from Text?

Step 1: Set Up Your Eden AI Account

1. Sign Up: If you don't have an Eden AI account, create a free one using the following link.

2. Access Speech Technologies: After logging in, navigate to the speech section of the platform.

3. Select Text-to-Speech: Choose the text-to-speech feature. You can also explore asynchronous text-to-speech depending on your needs.

Step 2: Live Test TTS Models on Eden AI

Choose Providers: Scroll down to see different providers on the right side and the live testing section at the bottom.
Configure Settings: Select your preferred language and the gender of the speaker (male or female).
Input Text: Enter a sample text, for example: "Hello, I'm an assistant. How can I help you?"
Download or Visualize: Run the test, and download the audio files or visualize the results.
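
The settings chosen in the live test (provider, language, speaker gender, sample text) correspond to fields in the JSON body of an API request. The sketch below shows one way to collect them in a single place. The helper name `build_tts_payload` is hypothetical, and the `option` field with its `MALE`/`FEMALE` values is an assumption about how the gender setting is passed, so check the Eden AI API reference before relying on it.

```python
# Hypothetical helper: build the JSON body for a text-to-speech request.
# ASSUMPTION: the speaker gender is sent in an "option" field as MALE/FEMALE.

def build_tts_payload(text, language="en-US", gender="FEMALE", provider="openai"):
    """Return a request body mirroring the choices made in the live test."""
    if not text or not text.strip():
        raise ValueError("text must not be empty")
    gender = gender.upper()
    if gender not in ("MALE", "FEMALE"):
        raise ValueError("gender must be 'MALE' or 'FEMALE'")
    return {
        "providers": provider,
        "language": language,
        "option": gender,  # assumed field name for the speaker gender
        "text": text.strip(),
    }


payload = build_tts_payload(
    "Hello, I'm an assistant. How can I help you?", gender="female"
)
print(payload)
```

Validating the inputs before sending the request keeps obvious mistakes (empty text, a misspelled gender) from costing an API call.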

Step 3: Implementing Text-to-Speech in Python

Now, let's implement this in Python. We'll show you how to perform text-to-speech synchronously and asynchronously.

Synchronous Text-to-Speech

1. Install Required Libraries: Ensure you have the necessary libraries installed. Use requests for making API calls.

pip install requests

2. Sample Code:


import requests
import base64

API_KEY = 'YOUR_EDEN_AI_API_KEY'
ENDPOINT = 'https://api.edenai.run/v2/audio/text_to_speech'

headers = {
    'Authorization': f'Bearer {API_KEY}',
    'Content-Type': 'application/json'
}

data = {
    'providers': 'openai',
    'language': 'en-US',
    'text': "Hi, how can I help you?"
}

# Send the text to the API; the audio comes back inside the JSON response
response = requests.post(ENDPOINT, headers=headers, json=data)

if response.status_code == 200:
    result = response.json()
    # The audio is a Base64 string stored under the provider's name
    audio_base64 = result['openai']['audio']
    audio_data = base64.b64decode(audio_base64)

    with open('output.wav', 'wb') as audio_file:
        audio_file.write(audio_data)
    print("Audio saved as output.wav")
else:
    print(f"Error: {response.status_code}")

3. Explanation:

This script sends a POST request to the Eden AI API endpoint with your API key.
The response contains the audio in Base64 format, which we decode and save as a .wav file.
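
The decode-and-save step can be tried without calling the API. The following is a minimal sketch, assuming the audio arrives as a plain Base64 string as described above. The helper name `save_base64_audio` and the dummy bytes are illustrative only and are not part of Eden AI.

```python
import base64
import tempfile
from pathlib import Path


def save_base64_audio(audio_base64, path):
    """Decode a Base64 string and write the raw bytes to `path`."""
    # validate=True raises binascii.Error on characters outside the
    # Base64 alphabet instead of silently discarding them
    audio_bytes = base64.b64decode(audio_base64, validate=True)
    Path(path).write_bytes(audio_bytes)
    return len(audio_bytes)


# Stand-in for the API response: encode a few dummy bytes ourselves
fake_audio = base64.b64encode(b"RIFF....WAVE").decode("ascii")
demo_path = Path(tempfile.gettempdir()) / "demo_output.wav"
written = save_base64_audio(fake_audio, demo_path)
print(f"Wrote {written} bytes to {demo_path}")
```

The dummy bytes are not playable audio; they only show that the bytes written to disk are exactly the bytes that were encoded.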

Asynchronous Text-to-Speech

1. Sample Code:


import requests
import time

API_KEY = 'YOUR_EDEN_AI_API_KEY'
ENDPOINT = 'https://api.edenai.run/v2/audio/text_to_speech_async'

headers = {
    'Authorization': f'Bearer {API_KEY}',
    'Content-Type': 'application/json'
}

data = {
    'providers': 'openai',
    'language': 'en-US',
    'text': "Hi, how could I help you?"
}

# Initiate the job
response = requests.post(ENDPOINT, headers=headers, json=data)

if response.status_code == 200:
    job_id = response.json()['job_id']
    audio_url = None  # Stays None if polling fails

    # Poll the job status until it completes
    status_endpoint = f'{ENDPOINT}/{job_id}'
    while True:
        status_response = requests.get(status_endpoint, headers=headers)
        if status_response.status_code != 200:
            print(f"Error: {status_response.status_code}")
            break
        status_data = status_response.json()
        if status_data['status'] == 'completed':
            audio_url = status_data['result']['audio_url']
            break
        print("Waiting for the job to complete...")
        time.sleep(5)  # Wait for 5 seconds before checking again

    # Download the audio file only if the job produced a URL
    if audio_url:
        audio_response = requests.get(audio_url)
        with open('output_async.wav', 'wb') as audio_file:
            audio_file.write(audio_response.content)
        print("Asynchronous audio saved as output_async.wav")
else:
    print(f"Error: {response.status_code}")

2. Explanation:

This script initiates an asynchronous text-to-speech job and retrieves the job ID.
It then polls the job status periodically until the job is completed.
Once completed, it downloads the audio file using the provided URL.
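
A polling loop like the one described above is usually safer with an upper time limit, so that a stuck job does not keep the script running forever. The sketch below is one possible way to add that; it is not part of the Eden AI API. The function name `poll_until_done`, the `failed` status value, and the example URL are assumptions made for illustration. The status source is passed in as a callable, so the loop can be exercised with fake data and no network access.

```python
import time


def poll_until_done(fetch_status, interval=5.0, timeout=120.0, sleep=time.sleep):
    """Call fetch_status() until it reports completion or the timeout expires.

    fetch_status is any callable returning a dict with a 'status' key,
    for example a small wrapper around requests.get(status_endpoint).json().
    """
    deadline = time.monotonic() + timeout
    while True:
        status_data = fetch_status()
        status = status_data.get("status")
        if status == "completed":
            return status_data
        if status == "failed":  # ASSUMPTION: hypothetical failure status value
            raise RuntimeError("job reported failure")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job did not complete within {timeout} seconds")
        sleep(interval)


# Demo with a fake job that completes on the third check
responses = iter([
    {"status": "processing"},
    {"status": "processing"},
    {"status": "completed", "result": {"audio_url": "https://example.com/audio.wav"}},
])
final = poll_until_done(lambda: next(responses), interval=0.01, timeout=5)
print(final["result"]["audio_url"])
```

Using `time.monotonic()` for the deadline avoids surprises if the system clock is adjusted while the script waits.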

Conclusion

You have now learned how to use Eden AI to generate voice from text both synchronously and asynchronously using Python. This powerful tool allows you to create AI workflows that incorporate the best Text-to-Speech Models.

Feel free to experiment with different providers and settings to find the best fit for your needs. Happy coding!

Benefits of using Eden AI's unified API

Using the Eden AI API is quick and easy.

Save time and cost

We offer a unified API for all providers that is simple and standard to use. You can switch providers quickly and still access their specific features (diarization, timestamps, noise filter, etc.).

Easy to integrate

The JSON output format is the same for all providers thanks to Eden AI's standardization work. The response elements are also standardized thanks to Eden AI's powerful matching algorithms.
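
Because each provider's result follows the same layout, one loop can handle the output of several providers. The sketch below assumes every provider entry carries its audio under an `audio` key, as in the synchronous example earlier. The sample response, the provider names other than `openai`, and the shape of the error entry are made up for illustration and should be checked against a real response.

```python
def collect_audio(result):
    """Split a multi-provider response into audio strings and failed providers."""
    audio_by_provider = {}
    failed = []
    for provider, payload in result.items():
        if isinstance(payload, dict) and payload.get("audio"):
            audio_by_provider[provider] = payload["audio"]
        else:
            failed.append(provider)
    return audio_by_provider, failed


# Hypothetical response: two providers returned audio, one returned an error
sample = {
    "openai": {"audio": "UklGRg=="},
    "google": {"audio": "UklGRg=="},
    "amazon": {"error": {"message": "quota exceeded"}},
}
ok, failed = collect_audio(sample)
print(sorted(ok), failed)
```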

Customization

With Eden AI you can integrate a third-party platform: we can quickly develop connectors. To go further and customize your API request with specific parameters, check out our documentation.

Next step in your project

The Eden AI team can help you with your Text-to-Speech integration project. This can be done by:

Organizing a product demo and a discussion to understand your needs better. You can book a time slot on this link: Contact
Testing the public version of Eden AI for free. Note that not all providers are available on this version; some are only available on the Enterprise version.
Benefiting from the support and advice of a team of experts to find the optimal combination of providers for your specific needs.
Integrating with a third-party platform: we can quickly develop connectors.

Create your Account on Eden AI

Try Eden AI for free.

You can directly start building now. If you have any questions, feel free to chat with us!
