# MiniMax Speech 2.6 Turbo

> Low-latency MiniMax Speech 2.6 Turbo brings multilingual, emotional text-to-speech to Replicate, with 300+ voices and pricing suited to real-time use.

- **Provider**: replicate
- **Model ID**: minimax/speech-2.6-turbo
- **Category**: tts_voice
- **Credits**: 108 per request
- **Pricing Type**: token_based

## API Endpoint

Base URL: https://api.core.today/v1

### Create Prediction
POST /predictions

### Get Status
GET /predictions/{job_id}

### Cancel
DELETE /predictions/{job_id}

## Authentication

Header: `X-API-Key: YOUR_API_KEY`
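The endpoints and the authentication header above can be combined into a small request builder. This is a minimal sketch using only the Python standard library; the helper name `build_request` and the `Content-Type: application/json` header are assumptions rather than documented requirements, and `abc123` is the placeholder job ID from the sample response. Nothing is sent until a request is passed to `urllib.request.urlopen`.

```python
import json
import urllib.request

BASE_URL = "https://api.core.today/v1"


def build_request(method, path, api_key, body=None):
    """Return an unsent urllib Request carrying the X-API-Key header.

    Hypothetical helper: the name and the JSON content type are assumptions.
    """
    headers = {"X-API-Key": api_key}
    data = None
    if body is not None:
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"  # assumed, not documented
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method
    )


# One request object per documented endpoint.
create = build_request(
    "POST",
    "/predictions",
    "YOUR_API_KEY",
    {"model": "minimax/speech-2.6-turbo", "input": {"text": "Hello"}},
)
status = build_request("GET", "/predictions/abc123", "YOUR_API_KEY")
cancel = build_request("DELETE", "/predictions/abc123", "YOUR_API_KEY")

# To actually send one of them:
# with urllib.request.urlopen(create) as response:
#     job = json.load(response)
```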

## Input Parameters

- `voice_id` (string, optional): Voice to synthesize. Pick any MiniMax system voice or a voice_id returned by https://replicate.com/minimax/voice-cloning. (Default: `Wise_Woman`)
- `channel` (string, optional): mono for 1 channel (default), stereo for 2 channels. (Default: `mono`; Options: `mono`, `stereo`)
- `english_normalization` (boolean, optional): Improve number/date reading for English text (adds a small amount of latency). (Default: `false`)
- `audio_format` (string, optional): File format for the generated audio. Choose mp3 for general use, wav/flac for lossless, or pcm for raw bytes. (Default: `mp3`; Options: `mp3`, `wav`, `flac`, `pcm`)
- `bitrate` (integer, optional): MP3 bitrate in bits per second. Only used when audio_format is mp3. (Default: `128000`; Options: `32000`, `64000`, `128000`, `256000`)
- `speed` (number, optional): Speech speed multiplier (0.5–2.0). Lower is slower, higher is faster. (Default: `1`; Range: min: 0.5, max: 2)
- `subtitle_enable` (boolean, optional): Return MiniMax subtitle metadata with sentence timestamps (non-streaming only). (Default: `false`)
- `language_boost` (string, optional): Optional language hint. Choose Automatic to let MiniMax detect the language, or pick a specific locale. (Default: `None`)
- `volume` (number, optional): Relative loudness. 1.0 is default MiniMax gain. Range 0–10. (Default: `1`; Range: min: 0, max: 10)
- `emotion` (string, optional): Desired delivery style. Use auto to let MiniMax choose, or pick a specific emotion. (Default: `auto`; Options: `auto`, `happy`, `sad`, `angry`, `fearful`, `disgusted`, `surprised`, `calm`, `fluent`, `neutral`)
- `sample_rate` (integer, optional): Audio sample rate in Hz. (Default: `32000`; Range: min: 8000, max: 44100)
- `text` (string, **required**): Text to narrate (max 10,000 characters). Use markers like <#0.5#> to insert pauses in seconds.
- `pitch` (integer, optional): Semitone offset applied to the voice (−12 to +12). (Default: `0`; Range: min: -12, max: 12)
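Checking an input against the options and ranges listed above before submitting can catch mistakes early. The following is a sketch only: `validate_input` is a hypothetical client-side helper, it treats an empty `text` as missing (an assumption), and the API performs its own validation, which may differ.

```python
# Options and ranges copied from the parameter list above.
ALLOWED = {
    "channel": {"mono", "stereo"},
    "audio_format": {"mp3", "wav", "flac", "pcm"},
    "bitrate": {32000, 64000, 128000, 256000},
    "emotion": {
        "auto", "happy", "sad", "angry", "fearful",
        "disgusted", "surprised", "calm", "fluent", "neutral",
    },
}
RANGES = {
    "speed": (0.5, 2),
    "volume": (0, 10),
    "sample_rate": (8000, 44100),
    "pitch": (-12, 12),
}
MAX_TEXT_CHARS = 10_000


def validate_input(params):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    text = params.get("text")
    if not isinstance(text, str) or not text:
        problems.append("text is required")
    elif len(text) > MAX_TEXT_CHARS:
        problems.append(f"text exceeds {MAX_TEXT_CHARS} characters")
    for name, options in ALLOWED.items():
        if name in params and params[name] not in options:
            problems.append(f"{name} must be one of {sorted(options)}")
    for name, (low, high) in RANGES.items():
        if name in params and not (low <= params[name] <= high):
            problems.append(f"{name} must be between {low} and {high}")
    return problems


print(validate_input({"text": "Hello", "speed": 3, "channel": "mono"}))
```

Parameters that are not supplied are skipped, on the assumption that the service applies the documented defaults.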

## Example Request

```json
{
  "model": "minimax/speech-2.6-turbo",
  "input": {
    "voice_id": "Wise_Woman",
    "channel": "mono",
    "audio_format": "mp3",
    "english_normalization": false,
    "bitrate": 128000,
    "speed": 1,
    "language_boost": "None",
    "subtitle_enable": false,
    "volume": 1,
    "emotion": "auto",
    "sample_rate": 32000,
    "text": "MiniMax just released Speech 2.6. It's really good. It builds on top of what existed before. The HD version is perfectly optimized for high-fidelity applications like voiceovers and audiobooks, and the Turbo variant is better for real-time applications with low latency.",
    "pitch": 0
  }
}
```
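Since only `text` is required, a request body can be much shorter than the full example above. The sketch below builds such a body and inserts a pause using the `<#0.5#>` marker syntax from the parameter list. The helpers `pause` and `make_payload` are hypothetical names, and the sketch assumes omitted optional fields fall back to their documented defaults.

```python
import json

MODEL_ID = "minimax/speech-2.6-turbo"


def pause(seconds):
    """Return a pause marker such as <#0.5#> for the given number of seconds."""
    return f"<#{seconds}#>"


def make_payload(text, **overrides):
    """Build a request body with the required text plus any optional overrides."""
    return {"model": MODEL_ID, "input": {"text": text, **overrides}}


text = "Welcome back." + pause(0.5) + "Here is today's summary."
payload = make_payload(text, emotion="calm", speed=1.1)
body = json.dumps(payload)  # JSON string to send as the POST /predictions body
print(body)
```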

## Response Format

```json
{
  "job_id": "abc123",
  "status": "pending",
  "provider": "replicate",
  "model": "minimax/speech-2.6-turbo",
  "created_at": "2026-01-01T00:00:00Z",
  "result": null,
  "error": null
}
```

Status values: `pending`, `processing`, `completed`, `failed`, `cancelled`

## Usage Flow

1. POST /predictions with model and input → receive job_id
2. Poll GET /predictions/{job_id} until status is `completed` or `failed`
3. On `completed`, the `result` field contains the output URL(s) or data; on `failed`, inspect the `error` field
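The polling step of this flow can be sketched as a loop with a timeout. This is an illustrative sketch, not a reference client: `wait_for_result` is a hypothetical helper, the status fetcher is injected so the loop can run here without network access, the result URL in the fake responses is made up, and `cancelled` is treated as a terminal state alongside `completed` and `failed` (an assumption based on the status list).

```python
import time

TERMINAL_STATES = {"completed", "failed", "cancelled"}


def wait_for_result(fetch_status, job_id, interval=1.0, timeout=60.0, sleep=time.sleep):
    """Call fetch_status(job_id) until the job reaches a terminal state.

    fetch_status is any callable returning the decoded JSON response of
    GET /predictions/{job_id} as a dict.
    """
    deadline = time.monotonic() + timeout
    while True:
        job = fetch_status(job_id)
        if job["status"] in TERMINAL_STATES:
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still {job['status']} after {timeout}s")
        sleep(interval)


# Stand-in for real HTTP calls: three canned responses in order.
_responses = iter([
    {"job_id": "abc123", "status": "pending", "result": None},
    {"job_id": "abc123", "status": "processing", "result": None},
    {"job_id": "abc123", "status": "completed",
     "result": "https://example.com/audio.mp3"},  # hypothetical result value
])


def fake_fetch(job_id):
    return next(_responses)


job = wait_for_result(fake_fetch, "abc123", interval=0, sleep=lambda seconds: None)
print(job["status"], job["result"])
```

In real use, `fetch_status` would issue the authenticated GET request and decode the JSON body; a longer `interval` keeps the polling load down.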

## Output Type

url

## Tags

text-to-speech, tts, audio, minimax, multilingual, voice-synthesis, real-time, low-latency

## Documentation

https://platform.minimax.io/docs/api-reference/speech-t2a-intro


## Token Pricing

- Input: 0.108 credits/token
- Output: 0.108 credits/token
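The per-token rates translate into a simple cost estimate. The sketch below is arithmetic only: `estimate_credits` is a hypothetical helper, and how the service counts input and output tokens for speech requests is not stated here, so the token counts must come from elsewhere. `Decimal` is used to avoid floating-point rounding in the rates.

```python
from decimal import Decimal

# Rates from the list above, in credits per token.
INPUT_RATE = Decimal("0.108")
OUTPUT_RATE = Decimal("0.108")


def estimate_credits(input_tokens, output_tokens=0):
    """Estimate the credit cost for the given token counts."""
    return INPUT_RATE * input_tokens + OUTPUT_RATE * output_tokens


print(estimate_credits(250, 750))
```

At these rates, 1,000 tokens in total cost 108 credits.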