Studio-Quality Text-to-Speech in 40+ Languages

Transform text into natural, expressive speech with MiniMax TTS. Powered by Speech-02 technology with emotion control, speed adjustment, and real-time synthesis.

No credit card required • Real-time synthesis • Commercial license included

Speech-2.6 Model Options

Choose the right model for your application

HD MODEL

Speech-2.6-HD

Best Quality

  • Highest audio fidelity
  • Studio-grade output
  • 40+ languages
  • 7 emotion types

Best for: Audiobooks, Podcasts, Premium Content

TURBO MODEL

Speech-2.6-Turbo

Best Speed

  • Real-time synthesis
  • Low latency
  • 40+ languages
  • High quality output

Best for: Live Apps, Chatbots, Gaming

LITE MODEL

Speech-2.6-Lite

Best Value

  • Most cost-effective
  • Fast processing
  • 40+ languages
  • Good quality

Best for: Testing, Prototypes, Large-scale Workloads
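The trade-offs between the three models can be captured in a small lookup. This is a minimal sketch, not an official client: `speech-2.6-hd` is the identifier used in the API example on this page, while `speech-2.6-turbo` and `speech-2.6-lite` are assumed from the display names and should be confirmed against the API reference. The priority labels and the `pick_model` helper are hypothetical.

```python
# Hypothetical mapping from a priority to a model identifier.
# Only "speech-2.6-hd" appears verbatim in this page's API example;
# the other two identifiers are assumed from the display names.
MODEL_BY_PRIORITY = {
    "quality": "speech-2.6-hd",    # audiobooks, podcasts, premium content
    "speed": "speech-2.6-turbo",   # live apps, chatbots, gaming
    "cost": "speech-2.6-lite",     # testing, prototypes, large-scale jobs
}


def pick_model(priority: str) -> str:
    """Return the model identifier for a priority, or raise on unknown input."""
    try:
        return MODEL_BY_PRIORITY[priority]
    except KeyError:
        expected = sorted(MODEL_BY_PRIORITY)
        raise ValueError(f"unknown priority {priority!r}; expected one of {expected}") from None


print(pick_model("speed"))
```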

Powerful TTS Features

Everything you need for professional voice synthesis

40+ Languages

Comprehensive language support including English, Spanish, French, German, Japanese, Korean, Arabic, Hindi, Portuguese, Russian, and many more.

Emotion Control

Generate speech with 7 different emotions: neutral, happy, sad, angry, fearful, surprised, and disgusted. Perfect for dynamic storytelling.
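Because the emotion set is a fixed list, it can be worth validating the value before sending a request. A minimal sketch, assuming the API expects the lowercase names listed above (the API example on this page uses `"emotion": "neutral"`); the `normalize_emotion` helper is hypothetical.

```python
# The seven emotions named on this page, in lowercase.
EMOTIONS = ("neutral", "happy", "sad", "angry", "fearful", "surprised", "disgusted")


def normalize_emotion(value: str) -> str:
    """Trim and lowercase an emotion name, rejecting anything outside the list."""
    cleaned = value.strip().lower()
    if cleaned not in EMOTIONS:
        raise ValueError(f"unsupported emotion {value!r}; expected one of {EMOTIONS}")
    return cleaned


print(normalize_emotion(" Happy "))
```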

Speed Control

Adjust speech speed from 0.5x to 2.0x in 0.1 increments while maintaining natural prosody and intonation.
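A client can snap arbitrary input to the supported range and step before calling the API. The sketch below assumes out-of-range values should be clamped rather than rejected, which is a design choice and not something this page specifies; `normalize_speed` is a hypothetical helper.

```python
def normalize_speed(speed: float) -> float:
    """Clamp speed to [0.5, 2.0] and snap it to the nearest 0.1 step."""
    clamped = min(max(speed, 0.5), 2.0)
    # Work in tenths so the result lands on a 0.1 step, then tidy float noise.
    return round(round(clamped * 10) / 10, 1)


for raw in (0.2, 1.0, 1.26, 3.0):
    print(raw, "->", normalize_speed(raw))
```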

Voice Library

Access 300+ built-in voices with diverse accents, genders, and ages, including male, female, and child voices with regional accents.

Pitch Control

Fine-tune voice pitch to match your needs. Adjust from -12 to +12 semitones while maintaining natural voice quality.
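To get a feel for the range: in standard equal temperament, a shift of n semitones multiplies frequency by 2^(n/12), so ±12 semitones is one octave up or down. The sketch below only illustrates that arithmetic. It assumes the pitch parameter follows equal-tempered semitones; this page does not describe how the service applies the shift internally, and `pitch_ratio` is a hypothetical helper.

```python
def pitch_ratio(semitones: int) -> float:
    """Frequency multiplier for an equal-tempered shift of `semitones`."""
    if not -12 <= semitones <= 12:
        raise ValueError("pitch must be between -12 and +12 semitones")
    return 2.0 ** (semitones / 12)


print(pitch_ratio(12))   # one octave up
print(pitch_ratio(-12))  # one octave down
```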

Real-Time Synthesis

Generate speech in real-time with ultra-low latency. Perfect for live applications, chatbots, and interactive experiences.

TTS Use Cases

Transform how you create audio content

Content Creation

Generate voiceovers for videos, podcasts, and audiobooks with studio-quality voices. Scale production without recording equipment.

  • YouTube video narration
  • Podcast production
  • Audiobook creation
  • Documentary voiceovers

E-Learning

Create engaging educational content with natural-sounding voices in 40+ languages. Perfect for course creators and trainers.

  • Online course narration
  • Language learning apps
  • Training materials
  • Educational videos

Customer Service

Power IVR systems, voice assistants, and customer support with natural conversational voices that improve user experience.

  • IVR menu systems
  • Voice assistants
  • Automated responses
  • Chatbot voice integration

Gaming & Entertainment

Generate character dialogue, NPC voices, and dynamic narration for games, apps, and interactive media.

  • Game character dialogue
  • NPC voice generation
  • Interactive storytelling
  • Virtual avatar voices

TTS API Example

API Endpoint

POST https://api.minimax.io/v1/t2a_v2

Request

curl -X POST \
  https://api.minimax.io/v1/t2a_v2 \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "speech-2.6-hd",
    "text": "Hello world",
    "voice_id": "male-qn-qingse",
    "speed": 1.0,
    "vol": 1.0,
    "pitch": 0,
    "emotion": "neutral",
    "audio_sample_rate": 32000,
    "bitrate": 128000,
    "format": "mp3"
  }'
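The same request can be built from Python using only the standard library. This is a sketch that mirrors the curl example field for field; the field names and flat body layout are taken from this page and should be checked against the official API reference before relying on them. `build_tts_request` is a hypothetical helper, and the code only constructs the request without sending it.

```python
import json
import urllib.request

API_URL = "https://api.minimax.io/v1/t2a_v2"


def build_tts_request(api_key: str, text: str, **overrides) -> urllib.request.Request:
    """Build a POST request whose body mirrors the curl example on this page."""
    payload = {
        "model": "speech-2.6-hd",
        "text": text,
        "voice_id": "male-qn-qingse",
        "speed": 1.0,
        "vol": 1.0,
        "pitch": 0,
        "emotion": "neutral",
        "audio_sample_rate": 32000,
        "bitrate": 128000,
        "format": "mp3",
    }
    payload.update(overrides)  # e.g. emotion="happy", speed=1.2
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_tts_request("YOUR_API_KEY", "Hello world", emotion="happy")
print(request.get_method(), request.full_url)
```

To send it, pass the object to `urllib.request.urlopen(request)` and read the JSON body from the returned response.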

Response

{
  "audio_file": "base64_audio_data",
  "trace_id": "abc123xyz",
  "base_resp": {
    "status_code": 0,
    "status_msg": "success"
  },
  "extra_info": {
    "audio_length": 2.5,
    "audio_size": 40960,
    "audio_sample_rate": 32000,
    "bitrate": 128000
  }
}
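A response shaped like the one above could be handled as follows. The sketch assumes that `audio_file` carries base64-encoded audio, as the placeholder value suggests, and that a `status_code` of 0 means success; the actual encoding and field names should be verified against the official API reference. `extract_audio` is a hypothetical helper, and the sample payload is fabricated purely to exercise it.

```python
import base64


def extract_audio(response: dict) -> bytes:
    """Return decoded audio bytes, or raise if the response reports an error."""
    status = response.get("base_resp", {})
    if status.get("status_code") != 0:
        message = status.get("status_msg", "unknown error")
        raise RuntimeError(f"TTS request failed: {message}")
    # Assumption: the audio payload is base64-encoded.
    return base64.b64decode(response["audio_file"])


# Stand-in response with fake audio bytes, for illustration only.
sample = {
    "audio_file": base64.b64encode(b"fake-mp3-bytes").decode("ascii"),
    "base_resp": {"status_code": 0, "status_msg": "success"},
}
audio = extract_audio(sample)
print(len(audio), "bytes decoded")
```

The decoded bytes can then be written to disk with `open("speech.mp3", "wb")`.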

Supported Languages

English
Spanish
French
German
Japanese
Korean
Chinese
Arabic
Hindi
Portuguese
Russian
Italian
Turkish
Polish
Dutch
Thai

+ 24 more languages supported

Start Creating with MiniMax TTS

Generate studio-quality speech in 40+ languages with full emotion control

Free tier available • No credit card required • Commercial license included