Integrate advanced text-to-speech and voice cloning capabilities into your applications with our powerful API.
To access MiniMax APIs, you need to obtain your GroupID and API Key:
https://api.minimax.io/
All API requests require Bearer token authentication:
Authorization: Bearer YOUR_API_KEY
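As a minimal sketch of the authentication rule above, the following Python helper attaches the Bearer header to a POST request using only the standard library. The helper name `build_request` and the environment variable `MINIMAX_API_KEY` are assumptions made for illustration, not part of the API; the request is only constructed here, not sent.

```python
import os
import urllib.request

API_BASE = "https://api.minimax.io"


def build_request(path: str, body: bytes, api_key: str) -> urllib.request.Request:
    """Return a POST request that carries the Bearer token header."""
    return urllib.request.Request(
        url=f"{API_BASE}{path}",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


# MINIMAX_API_KEY is a hypothetical variable name; store the key wherever
# your deployment keeps secrets rather than hard-coding it.
api_key = os.environ.get("MINIMAX_API_KEY", "YOUR_API_KEY")
req = build_request("/v1/t2a_v2", b"{}", api_key)
```

Passing `req` to `urllib.request.urlopen` would send it; keeping construction separate from sending makes the header logic easy to check in isolation.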
Ultimate similarity and ultra-high quality speech synthesis with real-time response and intelligent parsing.
Ultimate value with low latency, optimized for cost-effective real-time applications.
Stronger replication similarity with high-quality voice generation and superior rhythm.
Superior rhythm and stability with enhanced multilingual capabilities and low latency.
POST https://api.minimax.io/v1/t2a_v2
curl -X POST https://api.minimax.io/v1/t2a_v2 \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "speech-2.6-hd",
    "text": "Welcome to MiniMax Audio API. Transform your text into natural, expressive speech.",
    "voice_id": "male-qn-qingse",
    "speed": 1.0,
    "vol": 1.0,
    "pitch": 0,
    "audio_sample_rate": 32000,
    "bitrate": 128000,
    "format": "mp3"
  }'
{
  "audio_file": "base64_encoded_audio_data",
  "trace_id": "abc123-def456-ghi789",
  "base_resp": {
    "status_code": 0,
    "status_msg": "success"
  }
}
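A response shaped like the one above could be handled roughly as follows. This sketch assumes that `audio_file` holds standard base64 (as the placeholder value suggests) and that any `status_code` other than 0 signals a failure; the example only shows the success case, so treat both points as assumptions. The function name `extract_audio` is hypothetical.

```python
import base64
import json


def extract_audio(response_text: str) -> bytes:
    """Decode the audio bytes from a JSON response body, or raise on error."""
    payload = json.loads(response_text)
    base_resp = payload.get("base_resp", {})
    # Assumption: 0 means success, anything else is an error.
    if base_resp.get("status_code") != 0:
        raise RuntimeError(
            f"request failed: {base_resp.get('status_msg')} "
            f"(trace_id={payload.get('trace_id')})"
        )
    return base64.b64decode(payload["audio_file"])
```

The returned bytes can be written to a file opened in binary mode, for example `open("speech.mp3", "wb")`. Including `trace_id` in the error message keeps the identifier at hand when reporting a failed request.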
POST https://api.minimax.io/v1/voice_clone
curl -X POST https://api.minimax.io/v1/voice_clone \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "file_id": "YOUR_UPLOADED_FILE_ID",
    "voice_id": "my-custom-voice-01",
    "model": "speech-2.6-hd",
    "text": "This is a test of the cloned voice.",
    "need_noise_reduction": false,
    "need_volumn_normalization": false
  }'
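The JSON body of the voice cloning request above can also be assembled programmatically. The sketch below mirrors the fields of the curl example one for one; the function name `voice_clone_body` and its keyword arguments are hypothetical, and the field names are copied verbatim from the example, including the spelling of `need_volumn_normalization`.

```python
import json


def voice_clone_body(
    file_id: str,
    voice_id: str,
    preview_text: str,
    model: str = "speech-2.6-hd",
    noise_reduction: bool = False,
    volume_normalization: bool = False,
) -> bytes:
    """Serialize a voice_clone request body as UTF-8 encoded JSON."""
    payload = {
        "file_id": file_id,
        "voice_id": voice_id,
        "model": model,
        "text": preview_text,
        "need_noise_reduction": noise_reduction,
        # Key spelled exactly as in the curl example ("volumn").
        "need_volumn_normalization": volume_normalization,
    }
    return json.dumps(payload).encode("utf-8")


body = voice_clone_body(
    "YOUR_UPLOADED_FILE_ID",
    "my-custom-voice-01",
    "This is a test of the cloned voice.",
)
```

The resulting bytes are suitable as the `data` argument of an HTTP POST sent with the `Authorization` and `Content-Type` headers shown in the curl command.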
| Parameter | Type | Description | Required |
|---|---|---|---|
| model | string | Model to use (speech-2.6-hd, speech-2.6-turbo, etc.) | Yes |
| text | string | Text to synthesize (max 10,000 chars) | Yes |
| voice_id | string | System voice or custom cloned voice ID | Yes |
| speed | float | Speech speed (0.5 - 2.0) | No |
| vol | float | Volume level (0.1 - 10.0) | No |
| pitch | int | Pitch adjustment (-12 to 12) | No |
| format | string | Output format (mp3, pcm, flac, wav) | No |
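The limits in the parameter table can be checked on the client before a request is sent. The following is a sketch only: the function name `validate_tts_params` is hypothetical, and it assumes the listed ranges are inclusive at both ends, which the table does not state explicitly.

```python
ALLOWED_FORMATS = {"mp3", "pcm", "flac", "wav"}
MAX_TEXT_CHARS = 10_000


def validate_tts_params(params: dict) -> list:
    """Return a list of problems found; an empty list means the params look valid."""
    errors = []
    # Required string parameters.
    for name in ("model", "text", "voice_id"):
        if not isinstance(params.get(name), str) or not params[name]:
            errors.append(f"{name} is required and must be a non-empty string")
    text = params.get("text")
    if isinstance(text, str) and len(text) > MAX_TEXT_CHARS:
        errors.append(f"text exceeds {MAX_TEXT_CHARS} characters")
    # Optional numeric parameters; bounds assumed inclusive.
    if "speed" in params and not 0.5 <= params["speed"] <= 2.0:
        errors.append("speed must be between 0.5 and 2.0")
    if "vol" in params and not 0.1 <= params["vol"] <= 10.0:
        errors.append("vol must be between 0.1 and 10.0")
    if "pitch" in params:
        pitch = params["pitch"]
        if isinstance(pitch, bool) or not isinstance(pitch, int) or not -12 <= pitch <= 12:
            errors.append("pitch must be an integer between -12 and 12")
    if "format" in params and params["format"] not in ALLOWED_FORMATS:
        errors.append("format must be one of mp3, pcm, flac, wav")
    return errors
```

For example, `validate_tts_params({"model": "speech-2.6-hd", "text": "Hi", "voice_id": "male-qn-qingse", "speed": 3.0})` returns a single message about `speed`. The server remains the authority on what is accepted; a check like this only catches obvious mistakes early.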