Text-to-Speech (TTS) - MiniMax Audio AI | 40+ Languages & Studio Quality

Speech-2.6 Model Options

Choose the right model for your application

HD MODEL

Speech-2.6-HD

Best Quality

Highest audio fidelity
Studio-grade output
40+ languages
7 emotion types

Best for: Audiobooks, Podcasts, Premium Content

TURBO MODEL

Speech-2.6-Turbo

Best Speed

Real-time synthesis
Low latency
40+ languages
High quality output

Best for: Live Apps, Chatbots, Gaming

LITE MODEL

Speech-2.6-Lite

Best Value

Most cost-effective
Fast processing
40+ languages
Good quality

Best for: Testing, Prototypes, Large-scale
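The "Best for" guidance above can be captured in a small lookup. This is a minimal Python sketch, not an official client: `pick_model` is a hypothetical helper, and only the identifier `speech-2.6-hd` appears in the API example on this page. The identifiers `speech-2.6-turbo` and `speech-2.6-lite` are assumed to follow the same lowercase pattern and should be checked against the API documentation.

```python
# Use-case labels come from the "Best for" lines on the model cards.
# Only "speech-2.6-hd" is shown in this page's API example; the Turbo and
# Lite identifiers below are ASSUMED to follow the same naming pattern.
MODEL_FOR_USE_CASE = {
    "audiobooks": "speech-2.6-hd",
    "podcasts": "speech-2.6-hd",
    "premium content": "speech-2.6-hd",
    "live apps": "speech-2.6-turbo",
    "chatbots": "speech-2.6-turbo",
    "gaming": "speech-2.6-turbo",
    "testing": "speech-2.6-lite",
    "prototypes": "speech-2.6-lite",
    "large-scale": "speech-2.6-lite",
}


def pick_model(use_case: str) -> str:
    """Return the model identifier suggested for a use-case label."""
    key = use_case.strip().lower()
    if key not in MODEL_FOR_USE_CASE:
        raise ValueError(f"unknown use case: {use_case!r}")
    return MODEL_FOR_USE_CASE[key]
```

For example, `pick_model("Chatbots")` selects the Turbo identifier, matching the card that recommends it for low-latency applications.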

Powerful TTS Features

Everything you need for professional voice synthesis

40+ Languages

Comprehensive language support including English, Spanish, French, German, Japanese, Korean, Arabic, Hindi, Portuguese, Russian, and many more.

Emotion Control

Generate speech with 7 different emotions: neutral, happy, sad, angry, fearful, surprised, and disgusted. Perfect for dynamic storytelling.
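Because the emotion set is a fixed list, it can be validated on the client before a request is sent. This is a minimal Python sketch, assuming the API accepts exactly the seven lowercase names listed above (the request example on this page uses `"neutral"`); `normalize_emotion` is a hypothetical helper.

```python
# The seven emotion names listed on this page, assumed to be the exact
# lowercase strings the "emotion" request field accepts.
EMOTIONS = ("neutral", "happy", "sad", "angry", "fearful", "surprised", "disgusted")


def normalize_emotion(value: str) -> str:
    """Lowercase and trim an emotion name, rejecting anything not listed."""
    emotion = value.strip().lower()
    if emotion not in EMOTIONS:
        raise ValueError(f"unsupported emotion {value!r}; expected one of {EMOTIONS}")
    return emotion
```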

Speed Control

Adjust speech speed from 0.5x to 2.0x in 0.1 increments. Natural prosody and intonation are maintained at any speed.
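The stated range and step size can be enforced before building a request. This is a minimal Python sketch with a hypothetical helper, `normalize_speed`. Snapping to the nearest 0.1 is an assumption about how to handle in-between values; the page does not say whether the API rounds or rejects them.

```python
def normalize_speed(speed: float) -> float:
    """Check the 0.5x to 2.0x range, then snap to the nearest 0.1 step."""
    if not 0.5 <= speed <= 2.0:
        raise ValueError(f"speed {speed} is outside the 0.5 to 2.0 range")
    # Work in tenths to avoid accumulating floating-point error, then
    # round once more so the result prints as a clean one-decimal value.
    return round(round(speed * 10) / 10, 1)
```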

Voice Library

Access 300+ built-in voices spanning diverse accents, genders, and ages, including male, female, and child voices with regional accents.

Pitch Control

Fine-tune voice pitch to match your needs. Adjust from -12 to +12 semitones while maintaining natural voice quality.
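To give a sense of what the semitone range means: in twelve-tone equal temperament, a shift of n semitones multiplies frequency by 2^(n/12), so -12 halves the pitch and +12 doubles it. The Python sketch below computes that ratio. It illustrates the musical definition only; how the service applies the shift internally is not described on this page, and `semitone_ratio` is a hypothetical helper.

```python
def semitone_ratio(semitones: int) -> float:
    """Frequency multiplier for a pitch shift given in semitones."""
    if not -12 <= semitones <= 12:
        raise ValueError(f"pitch {semitones} is outside the -12 to +12 range")
    # Equal temperament: each semitone is a factor of 2 ** (1 / 12).
    return 2.0 ** (semitones / 12)
```

A shift of +7 semitones (a perfect fifth) gives a ratio of roughly 1.5.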

Real-Time Synthesis

Generate speech in real-time with ultra-low latency. Perfect for live applications, chatbots, and interactive experiences.

TTS Use Cases

Transform how you create audio content

Content Creation

Generate voiceovers for videos, podcasts, and audiobooks with studio-quality voices. Scale production without recording equipment.

• YouTube video narration
• Podcast production
• Audiobook creation
• Documentary voiceovers

E-Learning

Create engaging educational content with natural-sounding voices in any of the 40+ supported languages. Perfect for course creators and trainers.

• Online course narration
• Language learning apps
• Training materials
• Educational videos

Customer Service

Power IVR systems, voice assistants, and customer support with natural conversational voices that improve user experience.

• IVR menu systems
• Voice assistants
• Automated responses
• Chatbot voice integration

Gaming & Entertainment

Generate character dialogue, NPC voices, and dynamic narration for games, apps, and interactive media.

• Game character dialogue
• NPC voice generation
• Interactive storytelling
• Virtual avatar voices

TTS API Example

API Endpoint

POST https://api.minimax.io/v1/t2a_v2

Request

curl -X POST \
https://api.minimax.io/v1/t2a_v2 \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
  "model": "speech-2.6-hd",
  "text": "Hello world",
  "voice_id": "male-qn-qingse",
  "speed": 1.0,
  "vol": 1.0,
  "pitch": 0,
  "emotion": "neutral",
  "audio_sample_rate": 32000,
  "bitrate": 128000,
  "format": "mp3"
}'
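The same request can be issued from Python using only the standard library. This sketch mirrors the curl example field for field; the official API reference may group or name these fields differently, so treat the payload shape as an assumption carried over from this page. `build_tts_request` and `synthesize` are hypothetical helper names.

```python
import json
import urllib.request

API_URL = "https://api.minimax.io/v1/t2a_v2"


def build_tts_request(api_key: str, text: str, **overrides) -> urllib.request.Request:
    """Build (but do not send) a POST request matching the curl example."""
    # Defaults copied from the curl example on this page.
    payload = {
        "model": "speech-2.6-hd",
        "text": text,
        "voice_id": "male-qn-qingse",
        "speed": 1.0,
        "vol": 1.0,
        "pitch": 0,
        "emotion": "neutral",
        "audio_sample_rate": 32000,
        "bitrate": 128000,
        "format": "mp3",
    }
    payload.update(overrides)  # e.g. speed=1.5 or emotion="happy"
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def synthesize(api_key: str, text: str, **overrides) -> dict:
    """Send the request and return the decoded JSON response body."""
    request = build_tts_request(api_key, text, **overrides)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Separating request construction from sending keeps the payload easy to inspect and test without network access, for example `synthesize("YOUR_API_KEY", "Hello world", emotion="happy")`.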

Response

{
  "audio_file": "base64_audio_data",
  "trace_id": "abc123xyz",
  "base_resp": {
    "status_code": 0,
    "status_msg": "success"
  },
  "extra_info": {
    "audio_length": 2.5,
    "audio_size": 40960,
    "audio_sample_rate": 32000,
    "bitrate": 128000
  }
}
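A response of this shape might be handled as follows. This is a minimal Python sketch, assuming that `status_code` 0 means success (as in the example) and that `audio_file` holds base64-encoded audio, as its placeholder value suggests; the actual encoding should be confirmed in the API documentation. `extract_audio` is a hypothetical helper.

```python
import base64


def extract_audio(response: dict) -> bytes:
    """Return raw audio bytes from a response, or raise if the call failed."""
    base = response.get("base_resp", {})
    if base.get("status_code") != 0:
        raise RuntimeError(
            f"TTS request failed: {base.get('status_msg', 'unknown error')} "
            f"(trace_id={response.get('trace_id')})"
        )
    # ASSUMPTION: the audio payload is base64, per the placeholder above.
    return base64.b64decode(response["audio_file"], validate=True)
```

The returned bytes can then be written to a file opened in binary mode, with an extension matching the requested `format` (`mp3` in the example). Including `trace_id` in the error message makes failed requests easier to report.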


Supported Languages

English

Spanish

French

German

Japanese

Korean

Chinese

Arabic

Hindi

Portuguese

Russian

Italian

Turkish

Polish

Dutch

Thai

+ 24 more languages supported

Start Creating with MiniMax TTS

Generate studio-quality speech in 40+ languages with full emotion control


Free tier available • No credit card required • Commercial license included

Studio-Quality Text-to-Speech in 40+ Languages