ListenHub Voice

Turn text, reference voices, or an image into speech and sound effects with end-to-end async generation, then poll the task for the finished audio.

The ListenHub Voice API generates audio end to end: plain narration, sound effects, single-voice speech, multi-speaker dialogue, voice cloning from a reference clip, or image-to-audio. Generation is asynchronous: you submit a request, then poll the resulting task until it finishes. All endpoints live under https://api.marswave.ai/openapi/v1/listenhub-voice and authenticate with the header Authorization: Bearer $LISTENHUB_API_KEY. Endpoint paths on this page, such as POST /v1/listenhub-voice/generate, are relative to https://api.marswave.ai/openapi.

Every response is wrapped in { "code": 0, "message": "", "data": { ... } }. A non-zero code means an error — see Error Handling. The examples below read fields from data.
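
The envelope check can be centralized in one small helper. The following is a minimal sketch, not part of any official SDK: the names ListenHubError and unwrap are hypothetical, and the only behavior assumed is what is stated above (code 0 means success, anything else is an error).

```python
class ListenHubError(Exception):
    """Hypothetical exception raised when the envelope carries a non-zero code."""

    def __init__(self, code, message):
        super().__init__(f"ListenHub error {code}: {message}")
        self.code = code
        self.message = message


def unwrap(envelope):
    """Return envelope['data'] when code is 0, otherwise raise ListenHubError."""
    code = envelope.get("code")
    if code != 0:
        raise ListenHubError(code, envelope.get("message", ""))
    return envelope.get("data", {})


# Example with the envelope shape shown above.
ok = unwrap({"code": 0, "message": "", "data": {"taskId": "abc", "status": "pending"}})
print(ok["taskId"])  # prints: abc
```

Calling unwrap(response.json()) in place of response.json()['data'] makes a failed request raise instead of failing later with a missing key.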

Model and Limits

Item	Value
`model`	`listenhub-voice-1.0` (default, the only supported value)
Rate limit	5 requests per minute, per user, on `/generate`
`text`	Up to 1400 characters
`voices`	1–3 entries (omit for plain text / sound effects)
`durationHint`	1–110 seconds (used for the credit estimate and as a hint for the target length)
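
A client can stay under the 5 requests per minute limit by throttling itself before calling /generate. The sketch below is one possible approach and rests on an assumption: it treats the limit as a rolling 60-second window, which this page does not specify. The class name SlidingWindowLimiter is hypothetical.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `limit` calls in any rolling `window` seconds (assumed model)."""

    def __init__(self, limit=5, window=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock  # injectable so the limiter can be tested without waiting
        self.calls = deque()

    def wait_time(self):
        """Seconds to wait before the next call is allowed (0.0 if allowed now)."""
        now = self.clock()
        # Drop timestamps that have left the window.
        while self.calls and now - self.calls[0] >= self.window:
            self.calls.popleft()
        if len(self.calls) < self.limit:
            return 0.0
        return self.window - (now - self.calls[0])

    def record(self):
        """Record that a call was just made."""
        self.calls.append(self.clock())
```

Before each submission, sleep for wait_time() seconds, send the request, then call record(). The server remains the source of truth, so a 429 can still occur and should be handled.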

Voices

voices controls who speaks. Each entry is one of two kinds:

`type`	Required field	Description
`speaker`	`id`	A built-in voice — a ListenHub voice code or a platform `voice_type`. Do not send `url` for this type.
`reference`	`url`	A custom reference audio URL (http/https) to clone the voice from. Up to 30s, ≤10MB, `wav`/`mp3`/`pcm`/`ogg_opus`. Do not send `id` for this type.

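For multi-speaker dialogue, list 2–3 voices and prefix each line of text with @音频1, @音频2, … so that @音频N assigns the line to the Nth entry in voices. 音频 is Chinese for "audio"; the tag is shown here exactly as it appears in requests. Omit voices entirely to generate plain narration or pure sound effects.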
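
Building the tagged script by hand is error-prone, so a small helper can assemble it from (voice number, line) pairs. This is a sketch under one assumption: tagged lines are joined with single spaces, as in the dialogue request example on this page. The function name build_dialogue_text is hypothetical.

```python
def build_dialogue_text(turns):
    """Join (voice_number, line) pairs into one script with @音频N prefixes.

    voice_number is 1-based and refers to the position in the `voices` array.
    """
    parts = []
    for voice_number, line in turns:
        if voice_number not in (1, 2, 3):
            raise ValueError("voice_number must be 1, 2, or 3")
        parts.append(f"@音频{voice_number} {line.strip()}")
    # Assumption: a single space between tagged lines, as in the sample request.
    return " ".join(parts)


script = build_dialogue_text([(1, "Hi there!"), (2, "Hello, how can I help?")])
print(script)  # prints: @音频1 Hi there! @音频2 Hello, how can I help?
```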

voices and image are mutually exclusive: send at most one. A speaker entry must carry only id, and a reference entry must carry only url. A request that sends both voices and image, or a voice entry that mixes id and url, is rejected with 33004 (invalid parameters).
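
These rules can be checked locally before a request is sent, which avoids spending a rate-limited call on a body that would be rejected. The following is a sketch only: validate_request is a hypothetical helper, it covers just the constraints listed on this page, and it counts text length with Python's len(), which may not match how the server counts characters.

```python
def validate_request(body):
    """Return a list of problems found in a /generate request body (empty if none)."""
    problems = []
    text = body.get("text")
    if not isinstance(text, str) or not text:
        problems.append("text is required")
    elif len(text) > 1400:  # assumption: the server counts the same way as len()
        problems.append("text exceeds 1400 characters")

    voices = body.get("voices")
    if voices is not None and body.get("image") is not None:
        problems.append("voices and image are mutually exclusive")
    if voices is not None:
        if not 1 <= len(voices) <= 3:
            problems.append("voices must have 1-3 entries")
        for i, voice in enumerate(voices):
            kind = voice.get("type")
            if kind == "speaker":
                if "id" not in voice or "url" in voice:
                    problems.append(f"voices[{i}]: speaker needs id and no url")
            elif kind == "reference":
                if "url" not in voice or "id" in voice:
                    problems.append(f"voices[{i}]: reference needs url and no id")
            else:
                problems.append(f"voices[{i}]: unknown type {kind!r}")

    hint = body.get("durationHint")
    if hint is not None and not 1 <= hint <= 110:
        problems.append("durationHint must be 1-110")
    return problems
```

An empty list means the body passed these local checks; the API may still reject it for reasons the client cannot see, such as an unknown speaker id.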

Async Task Lifecycle

1. Submit a generation request to POST /v1/listenhub-voice/generate. The response carries a taskId and an initial status of pending.
2. Poll GET /v1/listenhub-voice/tasks/{taskId}. status moves through pending → generating → uploading → success.
3. On success, read audioUrl. On failed, read errorMessage.

Status	Meaning
`pending`	Created, waiting to be submitted for generation.
`generating`	Generation in progress. `audioUrl` is not yet available.
`uploading`	Generation finished, transferring the audio to storage.
`success`	Done. `audioUrl` is available.
`failed`	Failed at some stage. `errorMessage` explains why; any reserved credits are refunded.
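
The lifecycle above maps onto a simple polling loop that stops at either terminal status. This is a minimal sketch: wait_for_task is a hypothetical helper, get_task stands for any function that returns the data object of the task endpoint, and the 3-second interval and 300-second timeout are arbitrary defaults rather than documented recommendations.

```python
import time

TERMINAL_STATUSES = {"success", "failed"}


def wait_for_task(get_task, task_id, interval=3.0, timeout=300.0,
                  sleep=time.sleep, clock=time.monotonic):
    """Poll get_task(task_id) until the status is terminal, then return the task.

    get_task is any callable returning the `data` object of
    GET /v1/listenhub-voice/tasks/{taskId}.
    """
    deadline = clock() + timeout
    while True:
        task = get_task(task_id)
        if task["status"] in TERMINAL_STATUSES:
            return task
        if clock() >= deadline:
            raise TimeoutError(f"task {task_id} still {task['status']} after {timeout}s")
        sleep(interval)
```

The caller then reads audioUrl when the returned status is success and errorMessage when it is failed. Passing sleep and clock as parameters keeps the loop testable without real waiting.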

Create a ListenHub Voice Task

POST /v1/listenhub-voice/generate

Submit text (optionally with voices or a reference image) for end-to-end audio generation. Send the request body as JSON. The endpoint returns 202 with a taskId.

# Plain text / sound effects (no voices)
curl -X POST "https://api.marswave.ai/openapi/v1/listenhub-voice/generate" \
  -H "Authorization: Bearer $LISTENHUB_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "A gentle rain falls on a quiet street at midnight.",
    "durationHint": 20
  }'

# Single voice (speaker)
curl -X POST "https://api.marswave.ai/openapi/v1/listenhub-voice/generate" \
  -H "Authorization: Bearer $LISTENHUB_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Welcome to ListenHub. Here is your daily briefing.",
    "voices": [{ "type": "speaker", "id": "zh_female_warm" }]
  }'

# Multi-speaker dialogue: a cloned reference voice (@音频1) and a built-in speaker (@音频2)
curl -X POST "https://api.marswave.ai/openapi/v1/listenhub-voice/generate" \
  -H "Authorization: Bearer $LISTENHUB_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "@音频1 Hi there! @音频2 Hello, how can I help?",
    "voices": [
      { "type": "reference", "url": "https://example.com/host.mp3" },
      { "type": "speaker", "id": "zh_male_calm" }
    ]
  }'

const response = await fetch(
  'https://api.marswave.ai/openapi/v1/listenhub-voice/generate',
  {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${process.env.LISTENHUB_API_KEY}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      text: 'Welcome to ListenHub. Here is your daily briefing.',
      voices: [{ type: 'speaker', id: 'zh_female_warm' }],
      durationHint: 20,
    }),
  },
)
const { data } = await response.json()
console.log('Task ID:', data.taskId)

import os
import requests

response = requests.post(
    'https://api.marswave.ai/openapi/v1/listenhub-voice/generate',
    headers={'Authorization': f'Bearer {os.environ["LISTENHUB_API_KEY"]}'},
    json={
        'text': 'Welcome to ListenHub. Here is your daily briefing.',
        'voices': [{'type': 'speaker', 'id': 'zh_female_warm'}],
        'durationHint': 20,
    },
)
data = response.json()['data']
print('Task ID:', data['taskId'])

For image-to-audio, send an image object instead of voices (the two are mutually exclusive):

{
  "text": "Describe this scene as a short narrated clip.",
  "image": { "url": "https://example.com/scene.jpg" }
}

Request parameters:

Field	Type	Required	Description
`model`	string	No	`listenhub-voice-1.0`. Defaults to `listenhub-voice-1.0`
`text`	string	Yes	Script to speak. Up to 1400 characters. Use `@音频N` prefixes to assign dialogue lines to voices
`voices`	array	No	1–3 voice entries (see Voices). Omit for plain text / sound effects. Mutually exclusive with `image`
`image`	object	No	Reference image for image-to-audio. Provide exactly one of `url` (http/https) or `data` (Base64, optionally with a `data:image/...;base64,` prefix). One image, ≤10MB, `jpeg`/`png`/`webp`. Mutually exclusive with `voices`
`audioConfig`	object	No	Output tuning (see below)
`durationHint`	number	No	Target length, `1`–`110` seconds. Drives the credit estimate and hints the model
`watermark`	boolean	No	Add an audio watermark
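
When the image is a local file rather than a URL, the data variant of image can be built from the raw bytes. The sketch below is illustrative only: image_from_bytes is a hypothetical helper, and it assumes the 10MB limit applies to the raw bytes before Base64 encoding and means 10 × 1024 × 1024 bytes, neither of which this page states.

```python
import base64

MAX_IMAGE_BYTES = 10 * 1024 * 1024  # assumption: limit applies to raw bytes
MIME_TYPES = {"jpeg": "image/jpeg", "png": "image/png", "webp": "image/webp"}


def image_from_bytes(raw, fmt):
    """Build the `image` object with inline Base64 `data` from raw image bytes."""
    if fmt not in MIME_TYPES:
        raise ValueError(f"unsupported image format: {fmt}")
    if len(raw) > MAX_IMAGE_BYTES:
        raise ValueError("image is larger than 10MB")
    encoded = base64.b64encode(raw).decode("ascii")
    # The data: prefix is optional according to the table above; it is included here.
    return {"data": f"data:{MIME_TYPES[fmt]};base64,{encoded}"}
```

The returned dictionary goes in the image field of the request body, for example {'text': ..., 'image': image_from_bytes(raw, 'png')}, with voices omitted.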

audioConfig fields:

Field	Type	Required	Description
`speechRate`	number	No	Speaking rate, `-50`–`100`
`loudnessRate`	number	No	Loudness, `-50`–`100`
`pitchRate`	number	No	Pitch, `-12`–`12`
`format`	string	No	`mp3` (default), `wav`, `pcm`, or `ogg_opus`
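
The ranges in this table can be enforced when the audioConfig object is assembled. This is a sketch with a hypothetical helper name, build_audio_config; it assumes the listed ranges are inclusive at both ends.

```python
RANGES = {"speechRate": (-50, 100), "loudnessRate": (-50, 100), "pitchRate": (-12, 12)}
FORMATS = {"mp3", "wav", "pcm", "ogg_opus"}


def build_audio_config(**fields):
    """Return an audioConfig dict, raising ValueError for unknown or out-of-range fields."""
    config = {}
    for name, value in fields.items():
        if value is None:
            continue  # skip unset fields so the API default applies
        if name == "format":
            if value not in FORMATS:
                raise ValueError(f"format must be one of {sorted(FORMATS)}")
        elif name in RANGES:
            low, high = RANGES[name]
            if not low <= value <= high:  # assumption: bounds are inclusive
                raise ValueError(f"{name} must be between {low} and {high}")
        else:
            raise ValueError(f"unknown audioConfig field: {name}")
        config[name] = value
    return config


print(build_audio_config(speechRate=10, format="wav"))
# prints: {'speechRate': 10, 'format': 'wav'}
```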

Returns 202:

{
  "code": 0,
  "message": "",
  "data": {
    "taskId": "68e780390fc5c9a54f695a7e",
    "status": "pending"
  }
}

Get a Task

GET /v1/listenhub-voice/tasks/{taskId}

Fetch a single task. This is the endpoint you poll after submitting a generation request.

curl -X GET "https://api.marswave.ai/openapi/v1/listenhub-voice/tasks/{taskId}" \
  -H "Authorization: Bearer $LISTENHUB_API_KEY"

const taskId = '68e780390fc5c9a54f695a7e' // taskId returned by POST /generate
const response = await fetch(
  `https://api.marswave.ai/openapi/v1/listenhub-voice/tasks/${taskId}`,
  { headers: { Authorization: `Bearer ${process.env.LISTENHUB_API_KEY}` } },
)
const { data } = await response.json()
console.log('Status:', data.status)
if (data.status === 'success') console.log('Audio:', data.audioUrl)
if (data.status === 'failed') console.log('Error:', data.errorMessage)

import os
import requests

task_id = '68e780390fc5c9a54f695a7e'  # taskId returned by POST /generate

response = requests.get(
    f'https://api.marswave.ai/openapi/v1/listenhub-voice/tasks/{task_id}',
    headers={'Authorization': f'Bearer {os.environ["LISTENHUB_API_KEY"]}'},
)
data = response.json()['data']
print('Status:', data['status'])
if data['status'] == 'success':
    print('Audio:', data['audioUrl'])
elif data['status'] == 'failed':
    print('Error:', data['errorMessage'])

A successful task:

{
  "code": 0,
  "message": "",
  "data": {
    "id": "68e780390fc5c9a54f695a7e",
    "status": "success",
    "model": "listenhub-voice-1.0",
    "params": {
      "text": "Welcome to ListenHub. Here is your daily briefing.",
      "voices": [{ "type": "speaker", "id": "zh_female_warm" }]
    },
    "audioUrl": "https://assets.listenhub.ai/seed-audio/68e780390fc5c9a54f695a7e.mp3",
    "audioDuration": 18.4,
    "creditCharged": 12,
    "creditRefunded": 0,
    "createdAt": 1730000000000,
    "updatedAt": 1730000040000
  }
}

Task response fields:

Field	Type	Description
`id`	string	Task ID
`status`	string	`pending`, `generating`, `uploading`, `success`, or `failed`
`model`	string	`listenhub-voice-1.0`
`params`	object	Echo of the submitted request (sensitive image/audio payloads are stripped; an inline image shows as `{ "hasData": true }` with an optional `thumbnailUrl`)
`audioUrl`	string	Finished audio URL. Returned only when `status` is `success`
`audioDuration`	number	Audio length in seconds (the billed duration)
`creditCharged`	number	Credits actually charged (`0` if not yet charged)
`creditRefunded`	number	Credits refunded on failure (for reconciliation)
`errorMessage`	string	Failure reason. Returned only when `status` is `failed`
`createdAt`	number	Creation time (ms timestamp)
`updatedAt`	number	Last update time (ms timestamp)

List Tasks

GET /v1/listenhub-voice/tasks

List your ListenHub Voice tasks, newest first.

curl -X GET "https://api.marswave.ai/openapi/v1/listenhub-voice/tasks?page=1&pageSize=20&status=success" \
  -H "Authorization: Bearer $LISTENHUB_API_KEY"

const response = await fetch(
  'https://api.marswave.ai/openapi/v1/listenhub-voice/tasks?page=1&pageSize=20',
  { headers: { Authorization: `Bearer ${process.env.LISTENHUB_API_KEY}` } },
)
const { data } = await response.json()
console.log(`${data.total} tasks, showing ${data.items.length}`)

import os
import requests

response = requests.get(
    'https://api.marswave.ai/openapi/v1/listenhub-voice/tasks',
    headers={'Authorization': f'Bearer {os.environ["LISTENHUB_API_KEY"]}'},
    params={'page': 1, 'pageSize': 20},
)
data = response.json()['data']
print(data['total'], 'tasks, showing', len(data['items']))

Query parameters:

Field	Type	Required	Description
`page`	integer	No	Page number, min `1`. Defaults to `1`
`pageSize`	integer	No	Items per page, `1`–`100`. Defaults to `20`
`status`	string	No	Filter by `pending`, `generating`, `uploading`, `success`, or `failed`
`keyword`	string	No	Fuzzy match against the task `text`. Up to 64 characters

Response:

{
  "code": 0,
  "message": "",
  "data": {
    "items": [
      {
        "id": "68e780390fc5c9a54f695a7e",
        "status": "success",
        "model": "listenhub-voice-1.0",
        "audioUrl": "https://assets.listenhub.ai/seed-audio/68e780390fc5c9a54f695a7e.mp3",
        "audioDuration": 18.4,
        "creditCharged": 12,
        "creditRefunded": 0,
        "createdAt": 1730000000000,
        "updatedAt": 1730000040000
      }
    ],
    "page": 1,
    "pageSize": 20,
    "total": 1
  }
}

Each item carries the same fields as Get a Task.
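
To walk the full history rather than a single page, a client can keep requesting pages until it has seen total items. The generator below is a sketch: iter_tasks is a hypothetical name, and fetch_page stands for any function that returns the data object of the list endpoint for a given page and pageSize.

```python
def iter_tasks(fetch_page, page_size=20):
    """Yield every task, following page numbers until `total` items have been seen.

    fetch_page(page, page_size) returns the `data` object of
    GET /v1/listenhub-voice/tasks for that page.
    """
    page = 1
    seen = 0
    while True:
        data = fetch_page(page, page_size)
        items = data["items"]
        for item in items:
            yield item
        seen += len(items)
        # Stop on an empty page as well, in case `total` changes between requests.
        if not items or seen >= data["total"]:
            return
        page += 1
```

Because new tasks can be created while the loop runs, the result is a best-effort snapshot; items may shift between pages during iteration.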

Errors

Business errors return HTTP 400 with the specific code in the top-level code field.

Code	Meaning
`33001`	Task not found (or not owned by the current API user)
`33002`	Speaker not found for a `speaker` voice entry
`33003`	Generation service unavailable
`33004`	Invalid parameters (e.g. `voices` and `image` sent together, or a voice entry mixing `id` and `url`)
`33005`	Too many voices (max 3)
`33006`	Not enough credits
`33007`	Rate limited
`33008`	Generation timed out
`33009`	Per-user concurrency limit reached

HTTP status	Meaning
`400`	Invalid parameters or a business error — see the `33xxx` codes above
`429`	Rate limit exceeded (5 RPM per user on `/generate`)
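
Clients usually want to separate errors worth retrying from errors that need a changed request. The sketch below shows one way to do that. Which codes count as transient is an assumption made here, not something this page states: 33003, 33007, and 33009 are treated as retryable, and everything else as final. The names should_retry and backoff_delays are hypothetical.

```python
# Assumption: these codes describe temporary conditions and are safe to retry.
RETRYABLE_CODES = {33003, 33007, 33009}


def should_retry(http_status, code):
    """Return True for HTTP 429 or a business code assumed to be transient."""
    return http_status == 429 or code in RETRYABLE_CODES


def backoff_delays(attempts, base=2.0, cap=60.0):
    """Exponential backoff delays in seconds: base, 2*base, 4*base, ... up to cap."""
    return [min(cap, base * (2 ** i)) for i in range(attempts)]


print(should_retry(400, 33004))  # prints: False
print(backoff_delays(4))         # prints: [2.0, 4.0, 8.0, 16.0]
```

Codes such as 33004 (invalid parameters) or 33006 (not enough credits) are left out because resending the same request would presumably fail the same way.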

Credits

Credits are reserved at submission, confirmed on success, and refunded automatically on failure. Each task reports creditCharged (credits actually charged) and creditRefunded (credits refunded on failure) for reconciliation. The billed length is audioDuration. Check your live balance with GET /v1/user/subscription, and see Pricing for the credit-to-feature mapping.
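
For reconciliation, the per-task fields can be totaled over a list of tasks. This is a sketch with a hypothetical helper, summarize_credits; it reports charged and refunded credits separately rather than as a net figure, because this page does not define how the two fields relate arithmetically.

```python
def summarize_credits(tasks):
    """Total creditCharged, creditRefunded, and billed seconds over task dicts."""
    charged = sum(t.get("creditCharged", 0) for t in tasks)
    refunded = sum(t.get("creditRefunded", 0) for t in tasks)
    # Only successful tasks have finished audio, so only they contribute duration.
    billed_seconds = sum(
        t.get("audioDuration", 0) for t in tasks if t.get("status") == "success"
    )
    return {"charged": charged, "refunded": refunded, "billedSeconds": billed_seconds}
```

The input is a list of task objects in the shape returned by the task endpoints, so the items array of a List Tasks response can be passed in directly.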
