ListenHub Voice
Turn text, reference voices, or an image into speech and sound effects with end-to-end async generation, then poll the task for the finished audio.
The ListenHub Voice API generates audio end to end — plain narration, sound effects, single-voice speech, multi-speaker dialogue, voice cloning from a reference clip, or image-to-audio. Generation is asynchronous: you submit a request and poll a task until it finishes. All endpoints live under https://api.marswave.ai/openapi/v1/listenhub-voice and authenticate with Authorization: Bearer $LISTENHUB_API_KEY.
Every response is wrapped in { "code": 0, "message": "", "data": { ... } }. A non-zero code means an error — see Error Handling. The examples below read fields from data.
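A small helper can make the envelope convention explicit. This is a minimal sketch, not part of any official SDK: the names `ListenHubError` and `unwrap` are hypothetical, and it assumes only what is stated above (a top-level `code`, `message`, and `data`).

```python
from typing import Any, Dict, Optional


class ListenHubError(Exception):
    """Raised when the response envelope carries a non-zero code (hypothetical helper)."""

    def __init__(self, code: Optional[int], message: str) -> None:
        super().__init__(f"ListenHub error {code}: {message}")
        self.code = code
        self.message = message


def unwrap(envelope: Dict[str, Any]) -> Dict[str, Any]:
    """Return the `data` payload, or raise if `code` is anything other than 0."""
    code = envelope.get("code")
    if code != 0:
        raise ListenHubError(code, envelope.get("message", ""))
    return envelope.get("data", {})
```

With a helper like this, calling code can read `unwrap(response.json())["taskId"]` and handle failures in one `except ListenHubError` branch.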
Model and Limits
| Item | Value |
|---|---|
| model | listenhub-voice-1.0 (default, the only supported value) |
| Rate limit | 5 requests per minute, per user, on /generate |
| text | Up to 1400 characters |
| voices | 1–3 entries (omit for plain text / sound effects) |
| durationHint | 1–110 seconds (credit estimate + a hint for the target length) |
Voices
voices controls who speaks. Each entry is one of two kinds:
| type | Required field | Description |
|---|---|---|
| speaker | id | A built-in voice: a ListenHub voice code or a platform voice_type. Do not send url for this type. |
| reference | url | A custom reference audio URL (http/https) to clone the voice from. Up to 30s, ≤10MB, wav/mp3/pcm/ogg_opus. Do not send id for this type. |
For multi-speaker dialogue, list 2–3 voices and prefix each line of text with @音频1, @音频2, … (音频 is Chinese for "audio") to assign lines to voices in order: @音频1 maps to the first entry in voices, @音频2 to the second, and so on. Omit voices entirely to generate plain narration or pure sound effects.
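The prefixing rule can be sketched as a small text builder. This is an illustration under stated assumptions: `build_dialogue` is a hypothetical helper, joining lines with a newline is a guess at a reasonable separator, and counting the 1400-character limit with Python's `len` may not match how the service counts characters.

```python
from typing import Iterable, Tuple

MAX_TEXT_CHARS = 1400  # documented limit on `text`


def build_dialogue(lines: Iterable[Tuple[int, str]], voice_count: int) -> str:
    """Prefix each line with @音频N, where N is the 1-based position in `voices`."""
    if not 2 <= voice_count <= 3:
        raise ValueError("multi-speaker dialogue needs 2-3 voices")
    parts = []
    for voice_index, sentence in lines:
        if not 1 <= voice_index <= voice_count:
            raise ValueError(f"voice index {voice_index} is outside 1..{voice_count}")
        parts.append(f"@音频{voice_index} {sentence.strip()}")
    text = "\n".join(parts)  # assumption: one dialogue line per text line
    if len(text) > MAX_TEXT_CHARS:
        raise ValueError("text exceeds the 1400-character limit")
    return text


script = build_dialogue([(1, "Hi there!"), (2, "Hello, how can I help?")], voice_count=2)
```

The resulting string goes into the `text` field, alongside a `voices` array with the same number of entries as `voice_count`.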
voices and image are mutually exclusive — send at most one. A request that
includes both is rejected. A speaker entry must carry only id; a
reference entry must carry only url. Mixing them returns 33004 (invalid
params).
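Checking these rules on the client avoids spending a rate-limited request on a payload that would come back as 33004 or 33005. The sketch below is a hypothetical pre-flight check (`validate_voices` is not an API function), and it covers only the rules stated in this section.

```python
from typing import Any, Dict, List, Optional

# For each voice type: (field that must be present, field that must be absent).
VOICE_FIELDS = {"speaker": ("id", "url"), "reference": ("url", "id")}


def validate_voices(
    voices: Optional[List[Dict[str, Any]]],
    image: Optional[Dict[str, Any]] = None,
) -> None:
    """Raise ValueError if the voices/image combination breaks the documented rules."""
    if voices is not None and image is not None:
        raise ValueError("voices and image are mutually exclusive")
    if voices is None:
        return  # plain narration, sound effects, or image-to-audio
    if not 1 <= len(voices) <= 3:
        raise ValueError("voices must contain 1-3 entries")
    for position, voice in enumerate(voices, start=1):
        kind = voice.get("type")
        if kind not in VOICE_FIELDS:
            raise ValueError(f"voice {position}: type must be speaker or reference")
        required, forbidden = VOICE_FIELDS[kind]
        if not voice.get(required):
            raise ValueError(f"voice {position}: {kind} requires {required}")
        if forbidden in voice:
            raise ValueError(f"voice {position}: {kind} must not carry {forbidden}")
        if kind == "reference" and not str(voice["url"]).startswith(("http://", "https://")):
            raise ValueError(f"voice {position}: url must be http or https")
```

The check does not download the reference clip, so the 30-second, 10MB, and format limits are still enforced only by the server.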
Async Task Lifecycle
- Submit a generation request to POST /v1/listenhub-voice/generate. The response carries a taskId and an initial status of pending.
- Poll GET /v1/listenhub-voice/tasks/{taskId}. status moves through pending → generating → uploading → success.
- On success, read audioUrl. On failed, read errorMessage.
| Status | Meaning |
|---|---|
| pending | Created, waiting to be submitted for generation. |
| generating | Generation in progress. audioUrl is not yet available. |
| uploading | Generation finished, transferring the audio to storage. |
| success | Done. audioUrl is available. |
| failed | Failed at some stage. errorMessage explains why; any reserved credits are refunded. |
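The lifecycle above can be wrapped in a polling loop. The following is a sketch rather than a recommended client: `wait_for_task` is a hypothetical name, the 3-second interval and 300-second timeout are arbitrary defaults (this page does not state a recommended polling interval), and `fetch_task` stands in for whatever function calls the task endpoint and returns the `data` object.

```python
import time
from typing import Any, Callable, Dict


def wait_for_task(
    fetch_task: Callable[[str], Dict[str, Any]],
    task_id: str,
    interval_seconds: float = 3.0,   # assumption: not a documented value
    timeout_seconds: float = 300.0,  # assumption: not a documented value
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll until the task reaches success or failed, then return or raise."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        task = fetch_task(task_id)
        status = task["status"]
        if status == "success":
            return task  # audioUrl is available now
        if status == "failed":
            raise RuntimeError(task.get("errorMessage", "generation failed"))
        # pending, generating, uploading: keep waiting
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} still {status} after {timeout_seconds}s")
        sleep(interval_seconds)
```

Passing `fetch_task` and `sleep` as arguments keeps the loop independent of any HTTP library and makes it easy to exercise with a fake in tests.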
Create a ListenHub Voice Task
POST /v1/listenhub-voice/generate
Submit text (optionally with voices or a reference image) for end-to-end audio generation. The request body is JSON. Returns 202 with a taskId.
# Plain text / sound effects (no voices)
curl -X POST "https://api.marswave.ai/openapi/v1/listenhub-voice/generate" \
  -H "Authorization: Bearer $LISTENHUB_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "A gentle rain falls on a quiet street at midnight.",
    "durationHint": 20
  }'

# Single voice (speaker)
curl -X POST "https://api.marswave.ai/openapi/v1/listenhub-voice/generate" \
  -H "Authorization: Bearer $LISTENHUB_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Welcome to ListenHub. Here is your daily briefing.",
    "voices": [{ "type": "speaker", "id": "zh_female_warm" }]
  }'

# Multi-speaker dialogue: a voice cloned from a reference clip plus a built-in speaker
curl -X POST "https://api.marswave.ai/openapi/v1/listenhub-voice/generate" \
  -H "Authorization: Bearer $LISTENHUB_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "@音频1 Hi there! @音频2 Hello, how can I help?",
    "voices": [
      { "type": "reference", "url": "https://example.com/host.mp3" },
      { "type": "speaker", "id": "zh_male_calm" }
    ]
  }'

const response = await fetch(
  'https://api.marswave.ai/openapi/v1/listenhub-voice/generate',
  {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${process.env.LISTENHUB_API_KEY}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      text: 'Welcome to ListenHub. Here is your daily briefing.',
      voices: [{ type: 'speaker', id: 'zh_female_warm' }],
      durationHint: 20,
    }),
  },
)
// The payload sits under `data` in the response envelope.
const { data } = await response.json()
console.log('Task ID:', data.taskId)

import os
import requests

response = requests.post(
    'https://api.marswave.ai/openapi/v1/listenhub-voice/generate',
    headers={'Authorization': f'Bearer {os.environ["LISTENHUB_API_KEY"]}'},
    json={
        'text': 'Welcome to ListenHub. Here is your daily briefing.',
        'voices': [{'type': 'speaker', 'id': 'zh_female_warm'}],
        'durationHint': 20,
    },
)
data = response.json()['data']
print('Task ID:', data['taskId'])

For image-to-audio, send an image object instead of voices (the two are mutually exclusive):
{
  "text": "Describe this scene as a short narrated clip.",
  "image": { "url": "https://example.com/scene.jpg" }
}

Request parameters:
| Field | Type | Required | Description |
|---|---|---|---|
| model | string | No | listenhub-voice-1.0. Defaults to listenhub-voice-1.0 |
| text | string | Yes | Script to speak. Up to 1400 characters. Use @音频N prefixes to assign dialogue lines to voices |
| voices | array | No | 1–3 voice entries (see Voices). Omit for plain text / sound effects. Mutually exclusive with image |
| image | object | No | Reference image for image-to-audio. Provide exactly one of url (http/https) or data (Base64, optionally with a data:image/...;base64, prefix). One image, ≤10MB, jpeg/png/webp. Mutually exclusive with voices |
| audioConfig | object | No | Output tuning (see below) |
| durationHint | number | No | Target length, 1–110 seconds. Drives the credit estimate and hints the model |
| watermark | boolean | No | Add an audio watermark |
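The image field accepts either a URL or inline Base64 data, never both. The sketch below builds that object from one of the two sources. `build_image` is a hypothetical helper; it assumes "10MB" means 10 × 1024 × 1024 bytes measured on the raw file, which this page does not specify, and it always emits the optional data-URI prefix.

```python
import base64
from typing import Dict, Optional

MAX_IMAGE_BYTES = 10 * 1024 * 1024  # assumption: limit applies to the raw bytes
MIME_TYPES = {"jpeg": "image/jpeg", "png": "image/png", "webp": "image/webp"}


def build_image(
    url: Optional[str] = None,
    raw: Optional[bytes] = None,
    fmt: str = "jpeg",
) -> Dict[str, str]:
    """Return an `image` object carrying exactly one of `url` or `data`."""
    if (url is None) == (raw is None):
        raise ValueError("provide exactly one of url or raw")
    if url is not None:
        if not url.startswith(("http://", "https://")):
            raise ValueError("image url must be http or https")
        return {"url": url}
    if fmt not in MIME_TYPES:
        raise ValueError("image format must be jpeg, png, or webp")
    if len(raw) > MAX_IMAGE_BYTES:
        raise ValueError("image exceeds 10MB")
    encoded = base64.b64encode(raw).decode("ascii")
    return {"data": f"data:{MIME_TYPES[fmt]};base64,{encoded}"}
```

A typical call would read a local file in binary mode and pass the bytes as `raw`, then place the result under `image` in the request body.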
audioConfig fields:
| Field | Type | Required | Description |
|---|---|---|---|
| speechRate | number | No | Speaking rate, from -50 to 100 |
| loudnessRate | number | No | Loudness, from -50 to 100 |
| pitchRate | number | No | Pitch, from -12 to 12 |
| format | string | No | mp3 (default), wav, pcm, or ogg_opus |
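The ranges in this table are easy to check before submitting. This is a hypothetical client-side validator (`validate_audio_config` is not part of the API); it assumes the range endpoints are inclusive and, as a deliberate strictness choice, rejects field names that the table does not list.

```python
from typing import Any, Dict

NUMERIC_RANGES = {
    "speechRate": (-50, 100),
    "loudnessRate": (-50, 100),
    "pitchRate": (-12, 12),
}
FORMATS = ("mp3", "wav", "pcm", "ogg_opus")


def validate_audio_config(config: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `config` after checking every field against the table."""
    for key, value in config.items():
        if key == "format":
            if value not in FORMATS:
                raise ValueError(f"format must be one of {', '.join(FORMATS)}")
        elif key in NUMERIC_RANGES:
            low, high = NUMERIC_RANGES[key]
            # bool is a subclass of int in Python, so exclude it explicitly.
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                raise ValueError(f"{key} must be a number")
            if not low <= value <= high:
                raise ValueError(f"{key} must be between {low} and {high}")
        else:
            raise ValueError(f"unknown audioConfig field: {key}")
    return dict(config)
```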
Returns 202:
{
  "code": 0,
  "message": "",
  "data": {
    "taskId": "68e780390fc5c9a54f695a7e",
    "status": "pending"
  }
}

Get a Task
GET /v1/listenhub-voice/tasks/{taskId}
Fetch a single task. This is the endpoint you poll after submitting a generation request.
curl -X GET "https://api.marswave.ai/openapi/v1/listenhub-voice/tasks/{taskId}" \
  -H "Authorization: Bearer $LISTENHUB_API_KEY"

// taskId is the value returned by the generate endpoint.
const response = await fetch(
  `https://api.marswave.ai/openapi/v1/listenhub-voice/tasks/${taskId}`,
  { headers: { Authorization: `Bearer ${process.env.LISTENHUB_API_KEY}` } },
)
const { data } = await response.json()
console.log('Status:', data.status)
if (data.status === 'success') console.log('Audio:', data.audioUrl)

import os
import requests

# task_id is the value returned by the generate endpoint.
response = requests.get(
    f'https://api.marswave.ai/openapi/v1/listenhub-voice/tasks/{task_id}',
    headers={'Authorization': f'Bearer {os.environ["LISTENHUB_API_KEY"]}'},
)
data = response.json()['data']
print('Status:', data['status'])
if data['status'] == 'success':
    print('Audio:', data['audioUrl'])

A successful task:
{
  "code": 0,
  "message": "",
  "data": {
    "id": "68e780390fc5c9a54f695a7e",
    "status": "success",
    "model": "listenhub-voice-1.0",
    "params": {
      "text": "Welcome to ListenHub. Here is your daily briefing.",
      "voices": [{ "type": "speaker", "id": "zh_female_warm" }]
    },
    "audioUrl": "https://assets.listenhub.ai/seed-audio/68e780390fc5c9a54f695a7e.mp3",
    "audioDuration": 18.4,
    "creditCharged": 12,
    "creditRefunded": 0,
    "createdAt": 1730000000000,
    "updatedAt": 1730000040000
  }
}

Task response fields:
| Field | Type | Description |
|---|---|---|
| id | string | Task ID |
| status | string | pending, generating, uploading, success, or failed |
| model | string | listenhub-voice-1.0 |
| params | object | Echo of the submitted request (sensitive image/audio payloads are stripped; an inline image shows as { "hasData": true } with an optional thumbnailUrl) |
| audioUrl | string | Finished audio URL. Returned only when status is success |
| audioDuration | number | Audio length in seconds (the billed duration) |
| creditCharged | number | Credits actually charged (0 if not yet charged) |
| creditRefunded | number | Credits refunded on failure (for reconciliation) |
| errorMessage | string | Failure reason. Returned only when status is failed |
| createdAt | number | Creation time (ms timestamp) |
| updatedAt | number | Last update time (ms timestamp) |
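Because createdAt and updatedAt are millisecond timestamps, they need dividing by 1000 before use with most date libraries. The helpers below are an illustrative sketch with hypothetical names; they assume the timestamps count from the Unix epoch in UTC, which is the usual convention but is not stated on this page.

```python
from datetime import datetime, timezone
from typing import Any, Dict


def ms_to_datetime(ms: int) -> datetime:
    """Convert a millisecond Unix timestamp to an aware UTC datetime."""
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)


def seconds_between_updates(task: Dict[str, Any]) -> float:
    """Seconds elapsed between task creation and its last update."""
    return (task["updatedAt"] - task["createdAt"]) / 1000


sample = {"createdAt": 1730000000000, "updatedAt": 1730000040000}
elapsed = seconds_between_updates(sample)  # 40.0 for the sample task above
```

For a task in the success state, this difference approximates the end-to-end generation time.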
List Tasks
GET /v1/listenhub-voice/tasks
List your ListenHub Voice tasks, newest first.
curl -X GET "https://api.marswave.ai/openapi/v1/listenhub-voice/tasks?page=1&pageSize=20&status=success" \
  -H "Authorization: Bearer $LISTENHUB_API_KEY"

const response = await fetch(
  'https://api.marswave.ai/openapi/v1/listenhub-voice/tasks?page=1&pageSize=20',
  { headers: { Authorization: `Bearer ${process.env.LISTENHUB_API_KEY}` } },
)
const { data } = await response.json()
console.log(`${data.total} tasks, showing ${data.items.length}`)

import os
import requests

response = requests.get(
    'https://api.marswave.ai/openapi/v1/listenhub-voice/tasks',
    headers={'Authorization': f'Bearer {os.environ["LISTENHUB_API_KEY"]}'},
    params={'page': 1, 'pageSize': 20},
)
data = response.json()['data']
print(data['total'], 'tasks, showing', len(data['items']))

Query parameters:
| Field | Type | Required | Description |
|---|---|---|---|
| page | integer | No | Page number, min 1. Defaults to 1 |
| pageSize | integer | No | Items per page, 1–100. Defaults to 20 |
| status | string | No | Filter by pending, generating, uploading, success, or failed |
| keyword | string | No | Fuzzy match against the task text. Up to 64 characters |
Response:
{
  "code": 0,
  "message": "",
  "data": {
    "items": [
      {
        "id": "68e780390fc5c9a54f695a7e",
        "status": "success",
        "model": "listenhub-voice-1.0",
        "audioUrl": "https://assets.listenhub.ai/seed-audio/68e780390fc5c9a54f695a7e.mp3",
        "audioDuration": 18.4,
        "creditCharged": 12,
        "creditRefunded": 0,
        "createdAt": 1730000000000,
        "updatedAt": 1730000040000
      }
    ],
    "page": 1,
    "pageSize": 20,
    "total": 1
  }
}

Each item carries the same fields as Get a Task.
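Walking every page follows directly from the page, pageSize, and total fields. The generator below is a sketch: `iter_tasks` is a hypothetical name, and `fetch_page` stands in for any function that calls the list endpoint and returns the `data` object. It stops when a page comes back empty or when the running count reaches total.

```python
from typing import Any, Callable, Dict, Iterator


def iter_tasks(
    fetch_page: Callable[[int, int], Dict[str, Any]],
    page_size: int = 20,
) -> Iterator[Dict[str, Any]]:
    """Yield every task by requesting pages 1, 2, ... until all are seen."""
    if not 1 <= page_size <= 100:
        raise ValueError("pageSize must be between 1 and 100")
    page = 1
    seen = 0
    while True:
        data = fetch_page(page, page_size)
        items = data["items"]
        for item in items:
            yield item
        seen += len(items)
        # An empty page guards against `total` changing while we iterate.
        if not items or seen >= data["total"]:
            return
        page += 1
```

Since the list is sorted newest first, tasks created during iteration can shift items between pages; deduplicating by id is a reasonable precaution for long listings.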
Errors
Business errors return HTTP 400 with the specific code in the top-level code field.
| Code | Meaning |
|---|---|
| 33001 | Task not found (or not owned by the current API user) |
| 33002 | Speaker not found for a speaker voice entry |
| 33003 | Generation service unavailable |
| 33004 | Invalid parameters (e.g. voices and image sent together, or a voice entry mixing id and url) |
| 33005 | Too many voices (max 3) |
| 33006 | Not enough credits |
| 33007 | Rate limited |
| 33008 | Generation timed out |
| 33009 | Per-user concurrency limit reached |

| HTTP status | Meaning |
|---|---|
| 400 | Invalid parameters or a business error; see the 33xxx codes above |
| 429 | Rate limit exceeded (5 RPM per user on /generate) |
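A client typically wants to turn these codes into a message and a retry decision. The mapping below copies the meanings from the table; which codes are worth retrying is an assumption made for this sketch (service unavailable, rate limited, timed out, and concurrency limit look transient), not something this page specifies. `describe_error` and `should_retry` are hypothetical names.

```python
from typing import Optional

ERROR_MEANINGS = {
    33001: "Task not found (or not owned by the current API user)",
    33002: "Speaker not found for a speaker voice entry",
    33003: "Generation service unavailable",
    33004: "Invalid parameters",
    33005: "Too many voices (max 3)",
    33006: "Not enough credits",
    33007: "Rate limited",
    33008: "Generation timed out",
    33009: "Per-user concurrency limit reached",
}

# Assumption: these look transient; the rest need a changed request or account.
RETRYABLE_CODES = frozenset({33003, 33007, 33008, 33009})


def describe_error(code: int) -> str:
    """Human-readable meaning for a business error code."""
    return ERROR_MEANINGS.get(code, f"Unknown error code {code}")


def should_retry(http_status: int, code: Optional[int] = None) -> bool:
    """True for HTTP 429 or for a business code treated as transient."""
    return http_status == 429 or code in RETRYABLE_CODES
```

When retrying after a rate-limit response, waiting long enough to stay under 5 requests per minute on /generate avoids an immediate repeat of the same error.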
Credits
Credits are reserved at submission, confirmed on success, and refunded automatically on failure. Each task reports creditCharged (actually charged) and creditRefunded (refunded on failure) for reconciliation. The billed length is audioDuration. Check your live balance with GET /v1/user/subscription, and see Pricing for credit-to-feature mapping.