Text to Speech
Convert text to natural-sounding speech. Three endpoints cover everything from low-latency single-voice playback to AI-enhanced long-form audio.
ListenHub provides three text-to-speech endpoints for different use cases:
| Endpoint | Characteristics | Response | Best For |
|---|---|---|---|
| /v1/tts | Single voice, low latency | Sync, binary MP3 stream | Real-time playback, voice notifications |
| /v1/speech | Multi-voice scripts, sync | Sync, JSON with audioUrl | Audiobooks, radio dramas, dialogue |
| /v1/flow-speech/episodes | AI polish / direct, URL support | Async, poll by episodeId | Article narration, newsletter audio |
Streaming TTS
POST /v1/tts
Low-latency single-voice text-to-speech. Returns streaming binary audio (MP3). Ideal for real-time playback and in-app voice features.
curl -X POST "https://api.marswave.ai/openapi/v1/tts" \
  -H "Authorization: Bearer $LISTENHUB_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "input": "Hello, welcome to ListenHub text-to-speech.",
    "voice": "EN-Man-General-01"
  }' \
  --output output.mp3

const response = await fetch('https://api.marswave.ai/openapi/v1/tts', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${process.env.LISTENHUB_API_KEY}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    input: 'Hello, welcome to ListenHub text-to-speech.',
    voice: 'EN-Man-General-01',
  }),
});
const blob = await response.blob();
// Save or play the audio blob

import os
import requests

response = requests.post(
    'https://api.marswave.ai/openapi/v1/tts',
    headers={'Authorization': f'Bearer {os.environ["LISTENHUB_API_KEY"]}'},
    json={
        'input': 'Hello, welcome to ListenHub text-to-speech.',
        'voice': 'EN-Man-General-01',
    },
    stream=True
)
with open('output.mp3', 'wb') as f:
    for chunk in response.iter_content(chunk_size=8192):
        f.write(chunk)

Request parameters:
| Field | Type | Required | Description |
|---|---|---|---|
| input | string | Yes | Text to convert |
| voice | string | Yes | Speaker ID (i.e. speakerId) |
| model | string | No | Model name, defaults to flowtts |
The response is a binary MP3 audio stream, not JSON. Use streaming mode in your HTTP client. On error, the response falls back to a JSON error object.
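Because the same request can yield either audio bytes or a JSON error object, it can help to branch on the response before writing a file. The following is a minimal sketch, not an official client: `save_tts_audio` and `TTSError` are hypothetical names, and the sketch assumes the error fallback is served with a JSON `Content-Type` header, which this page does not confirm. It accepts any `requests`-style response object.

```python
class TTSError(Exception):
    """Raised when the TTS endpoint returns a JSON error instead of audio."""


def save_tts_audio(response, path, chunk_size=8192):
    """Write a streamed MP3 response to `path` and return the bytes written.

    `response` is any requests-style object exposing `headers`, `json()`
    and `iter_content()`. Assumption: an error body is sent with a JSON
    Content-Type, so that header is used to tell the two cases apart.
    """
    content_type = response.headers.get("Content-Type", "")
    if "json" in content_type.lower():
        # Error path: surface the JSON error object instead of saving it as audio.
        raise TTSError(response.json())

    written = 0
    with open(path, "wb") as f:
        for chunk in response.iter_content(chunk_size=chunk_size):
            if chunk:  # skip empty keep-alive chunks
                f.write(chunk)
                written += len(chunk)
    return written
```

With `requests`, the `response` argument would be the result of the `requests.post(..., stream=True)` call shown in the Python example above.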
Multi-Speaker Script to Audio
POST /v1/speech
Generate multi-speaker audio from prepared scripts. Each line can use a different voice. Returns the audio URL synchronously.
curl -X POST "https://api.marswave.ai/openapi/v1/speech" \
  -H "Authorization: Bearer $LISTENHUB_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "scripts": [
      {
        "content": "Welcome everyone to this episode.",
        "speakerId": "EN-Man-General-01"
      },
      {
        "content": "Today we are discussing an interesting topic.",
        "speakerId": "EN-Woman-General-01"
      },
      {
        "content": "Great, let us begin.",
        "speakerId": "EN-Man-General-01"
      }
    ]
  }'

const response = await fetch('https://api.marswave.ai/openapi/v1/speech', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${process.env.LISTENHUB_API_KEY}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    scripts: [
      { content: 'Welcome everyone to this episode.', speakerId: 'EN-Man-General-01' },
      { content: 'Today we are discussing an interesting topic.', speakerId: 'EN-Woman-General-01' },
      { content: 'Great, let us begin.', speakerId: 'EN-Man-General-01' },
    ],
  }),
});
const data = await response.json();
console.log(data.data.audioUrl);

import os
import requests

response = requests.post(
    'https://api.marswave.ai/openapi/v1/speech',
    headers={'Authorization': f'Bearer {os.environ["LISTENHUB_API_KEY"]}'},
    json={
        'scripts': [
            {'content': 'Welcome everyone to this episode.', 'speakerId': 'EN-Man-General-01'},
            {'content': 'Today we are discussing an interesting topic.', 'speakerId': 'EN-Woman-General-01'},
            {'content': 'Great, let us begin.', 'speakerId': 'EN-Man-General-01'},
        ]
    }
)
data = response.json()
print(data['data']['audioUrl'])

Request parameters:
| Field | Type | Required | Description |
|---|---|---|---|
| scripts | array | Yes | Script array |
| scripts[].content | string | Yes | Line text |
| scripts[].speakerId | string | Yes | Speaker ID |
| title | string | No | Custom title (auto-generated if omitted) |
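When a script lives in your own data structures, the request body from the table above can be assembled programmatically. The sketch below assumes dialogue is held as `(speaker_id, text)` pairs; `build_speech_payload` is a hypothetical helper, and its empty-input checks are local sanity checks rather than documented API rules.

```python
def build_speech_payload(lines, title=None):
    """Build a /v1/speech request body from (speaker_id, text) pairs.

    The non-empty checks are local sanity checks (an assumption),
    not validation rules stated by the API documentation.
    """
    scripts = []
    for speaker_id, text in lines:
        if not speaker_id or not text.strip():
            raise ValueError("each line needs a speaker ID and non-empty text")
        scripts.append({"content": text, "speakerId": speaker_id})
    if not scripts:
        raise ValueError("scripts must contain at least one line")

    payload = {"scripts": scripts}
    if title is not None:
        # title is optional; the service auto-generates one if it is omitted.
        payload["title"] = title
    return payload
```

The returned dictionary can be passed as the `json=` argument of the `requests.post` call shown above.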
Response example:
{
  "code": 0,
  "message": "",
  "data": {
    "audioUrl": "https://assets.listenhub.ai/listenhub-public-prod/podcast/example.mp3",
    "audioDuration": 12500,
    "subtitlesUrl": "https://assets.listenhub.ai/listenhub-public-prod/podcast/example.srt",
    "taskId": "1eed39d387a046c0a1213e6b8f139d77",
    "credits": 12
  }
}

Response fields:
| Field | Type | Description |
|---|---|---|
| audioUrl | string | MP3 audio file URL |
| audioDuration | integer | Audio duration in milliseconds |
| subtitlesUrl | string | SRT subtitle file URL |
| taskId | string | Task ID |
| credits | integer | Credits consumed |
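Since `audioDuration` is reported in milliseconds, callers usually convert it before display. The sketch below shows one way to unpack the response; `summarize_speech_response` is a hypothetical helper, and treating `code == 0` as the success marker is an assumption drawn only from the response example above.

```python
def summarize_speech_response(body):
    """Extract the audio URL and duration in seconds from a /v1/speech response."""
    # Assumption: code == 0 signals success, as in the response example.
    if body.get("code") != 0:
        raise RuntimeError(f"speech request failed: {body.get('message')!r}")
    data = body["data"]
    return {
        "audio_url": data["audioUrl"],
        "subtitles_url": data.get("subtitlesUrl"),
        "duration_seconds": data["audioDuration"] / 1000,  # milliseconds -> seconds
        "credits": data.get("credits"),
    }
```

Applied to the response example above, the helper reports a duration of 12.5 seconds and 12 credits consumed.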
Long-Form Text to Speech
POST /v1/flow-speech/episodes
Convert text or URL content to speech with AI enhancement or direct conversion. Runs as an async task: submit the request, then poll for results. Minimum 10 characters per request.
Smart Mode (AI Polish)
Auto-corrects grammar, formatting, and punctuation before synthesis:
curl -X POST "https://api.marswave.ai/openapi/v1/flow-speech/episodes" \
  -H "Authorization: Bearer $LISTENHUB_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "sources": [
      {
        "type": "text",
        "content": "welcome to listenhub this text is intentionally rough and punctuation will be improved automatically"
      }
    ],
    "speakers": [
      {"speakerId": "EN-Woman-General-01"}
    ],
    "language": "en",
    "mode": "smart"
  }'

const response = await fetch('https://api.marswave.ai/openapi/v1/flow-speech/episodes', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${process.env.LISTENHUB_API_KEY}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    sources: [
      {
        type: 'text',
        content: 'welcome to listenhub this text is intentionally rough and punctuation will be improved automatically',
      },
    ],
    speakers: [{ speakerId: 'EN-Woman-General-01' }],
    language: 'en',
    mode: 'smart',
  }),
});
const data = await response.json();
console.log('Episode ID:', data.data.episodeId);

import os
import requests

response = requests.post(
    'https://api.marswave.ai/openapi/v1/flow-speech/episodes',
    headers={'Authorization': f'Bearer {os.environ["LISTENHUB_API_KEY"]}'},
    json={
        'sources': [
            {
                'type': 'text',
                'content': 'welcome to listenhub this text is intentionally rough and punctuation will be improved automatically',
            }
        ],
        'speakers': [{'speakerId': 'EN-Woman-General-01'}],
        'language': 'en',
        'mode': 'smart',
    }
)
data = response.json()
print('Episode ID:', data['data']['episodeId'])

Direct Mode
Converts the text as-is, with no modification:
curl -X POST "https://api.marswave.ai/openapi/v1/flow-speech/episodes" \
  -H "Authorization: Bearer $LISTENHUB_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "sources": [
      {
        "type": "text",
        "content": "Welcome to ListenHub. This script is already finalized and should be converted as-is."
      }
    ],
    "speakers": [
      {"speakerId": "EN-Man-General-01"}
    ],
    "language": "en",
    "mode": "direct"
  }'

const response = await fetch('https://api.marswave.ai/openapi/v1/flow-speech/episodes', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${process.env.LISTENHUB_API_KEY}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    sources: [
      {
        type: 'text',
        content: 'Welcome to ListenHub. This script is already finalized and should be converted as-is.',
      },
    ],
    speakers: [{ speakerId: 'EN-Man-General-01' }],
    language: 'en',
    mode: 'direct',
  }),
});
const data = await response.json();
console.log('Episode ID:', data.data.episodeId);

import os
import requests

response = requests.post(
    'https://api.marswave.ai/openapi/v1/flow-speech/episodes',
    headers={'Authorization': f'Bearer {os.environ["LISTENHUB_API_KEY"]}'},
    json={
        'sources': [
            {
                'type': 'text',
                'content': 'Welcome to ListenHub. This script is already finalized and should be converted as-is.',
            }
        ],
        'speakers': [{'speakerId': 'EN-Man-General-01'}],
        'language': 'en',
        'mode': 'direct',
    }
)
data = response.json()
print('Episode ID:', data['data']['episodeId'])

Read Content from URL
curl -X POST "https://api.marswave.ai/openapi/v1/flow-speech/episodes" \
  -H "Authorization: Bearer $LISTENHUB_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "sources": [
      {
        "type": "url",
        "content": "https://example.com/article.html"
      }
    ],
    "speakers": [
      {"speakerId": "EN-Woman-General-01"}
    ],
    "language": "en",
    "mode": "smart"
  }'

const response = await fetch('https://api.marswave.ai/openapi/v1/flow-speech/episodes', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${process.env.LISTENHUB_API_KEY}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    sources: [{ type: 'url', content: 'https://example.com/article.html' }],
    speakers: [{ speakerId: 'EN-Woman-General-01' }],
    language: 'en',
    mode: 'smart',
  }),
});
const data = await response.json();
console.log('Episode ID:', data.data.episodeId);

import os
import requests

response = requests.post(
    'https://api.marswave.ai/openapi/v1/flow-speech/episodes',
    headers={'Authorization': f'Bearer {os.environ["LISTENHUB_API_KEY"]}'},
    json={
        'sources': [{'type': 'url', 'content': 'https://example.com/article.html'}],
        'speakers': [{'speakerId': 'EN-Woman-General-01'}],
        'language': 'en',
        'mode': 'smart',
    }
)
data = response.json()
print('Episode ID:', data['data']['episodeId'])

Request Parameters
| Field | Type | Required | Description |
|---|---|---|---|
| sources | array | Yes | Content source, max 1 item |
| sources[].type | string | Yes | text or url |
| sources[].content | string | Yes | Text content or URL |
| speakers | array | Yes | Speaker list, 1-2 items |
| speakers[].speakerId | string | Yes | Speaker ID |
| language | string | No | Language: en, zh, ja |
| mode | string | No | smart (AI polish) or direct (as-is) |
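The constraints in this table (one source, one or two speakers, fixed sets of modes and languages, and the 10-character minimum) can be checked client-side before submitting. The sketch below is one possible way to do that; `build_flow_speech_payload` is a hypothetical helper, and applying the 10-character minimum only to `text` sources is an assumption.

```python
VALID_SOURCE_TYPES = {"text", "url"}
VALID_MODES = {"smart", "direct"}
VALID_LANGUAGES = {"en", "zh", "ja"}
MIN_TEXT_LENGTH = 10  # "Minimum 10 characters per request"


def build_flow_speech_payload(content, speaker_ids, source_type="text",
                              language=None, mode=None):
    """Validate inputs and build a /v1/flow-speech/episodes request body."""
    if source_type not in VALID_SOURCE_TYPES:
        raise ValueError(f"source_type must be one of {sorted(VALID_SOURCE_TYPES)}")
    # Assumption: the length minimum is enforced here for text sources only.
    if source_type == "text" and len(content) < MIN_TEXT_LENGTH:
        raise ValueError(f"text must be at least {MIN_TEXT_LENGTH} characters")
    if not 1 <= len(speaker_ids) <= 2:
        raise ValueError("speakers must contain 1 or 2 items")
    if language is not None and language not in VALID_LANGUAGES:
        raise ValueError(f"language must be one of {sorted(VALID_LANGUAGES)}")
    if mode is not None and mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {sorted(VALID_MODES)}")

    payload = {
        # The API accepts at most one source, so exactly one is sent.
        "sources": [{"type": source_type, "content": content}],
        "speakers": [{"speakerId": speaker_id} for speaker_id in speaker_ids],
    }
    if language is not None:
        payload["language"] = language
    if mode is not None:
        payload["mode"] = mode
    return payload
```

Optional fields are left out of the body when not given, so the service's own defaults apply.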
Query Results
GET /v1/flow-speech/episodes/{episodeId}
Poll with the returned episodeId until processStatus is success:
curl -X GET "https://api.marswave.ai/openapi/v1/flow-speech/episodes/{episodeId}" \
  -H "Authorization: Bearer $LISTENHUB_API_KEY"

const response = await fetch(
  `https://api.marswave.ai/openapi/v1/flow-speech/episodes/${episodeId}`,
  {
    headers: { 'Authorization': `Bearer ${process.env.LISTENHUB_API_KEY}` },
  }
);
const data = await response.json();
console.log('Status:', data.data.processStatus);
console.log('Audio URL:', data.data.audioUrl);

import os
import requests

response = requests.get(
    f'https://api.marswave.ai/openapi/v1/flow-speech/episodes/{episode_id}',
    headers={'Authorization': f'Bearer {os.environ["LISTENHUB_API_KEY"]}'}
)
data = response.json()
print('Status:', data['data']['processStatus'])
print('Audio URL:', data['data']['audioUrl'])

Response when complete (processStatus: "success"):
{
  "code": 0,
  "message": "",
  "data": {
    "episodeId": "{episodeId}",
    "processStatus": "success",
    "credits": 18,
    "title": "Article Title",
    "audioUrl": "https://assets.listenhub.ai/listenhub-public-prod/podcast/{episodeId}.mp3",
    "audioStreamUrl": "https://assets.listenhub.ai/listenhub-public-prod/podcast/{episodeId}.m3u8",
    "scripts": "Full narration script text..."
  }
}

Long-form text-to-speech typically completes in 1-2 minutes. Recommended polling: wait 30 seconds after creation, then check every 10 seconds. On failure, processStatus is failed and failCode indicates the reason.
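The recommended polling schedule (30 seconds, then every 10 seconds) can be wrapped in a small loop. This is a minimal sketch under stated assumptions: `poll_episode` is a hypothetical helper, `fetch_episode` is a caller-supplied function that returns the `data` object of the GET response, the 300-second timeout is an arbitrary choice, and any status other than `success` or `failed` is treated as still in progress.

```python
import time


def poll_episode(fetch_episode, initial_delay=30, interval=10, timeout=300,
                 sleep=time.sleep):
    """Poll until processStatus is "success" or "failed".

    `fetch_episode` is a zero-argument callable returning the `data`
    object of GET /v1/flow-speech/episodes/{episodeId}. The timeout is
    an assumption of this sketch, not a documented limit.
    """
    sleep(initial_delay)  # recommended wait after creation
    waited = initial_delay
    while True:
        episode = fetch_episode()
        status = episode.get("processStatus")
        if status == "success":
            return episode
        if status == "failed":
            raise RuntimeError(f"episode failed: failCode={episode.get('failCode')}")
        # Any other status is assumed to mean the task is still running.
        if waited + interval > timeout:
            raise TimeoutError(f"episode not ready after {waited} seconds")
        sleep(interval)
        waited += interval
```

In practice `fetch_episode` could wrap the `requests.get` call shown above and return `response.json()['data']`. The injectable `sleep` argument keeps the loop testable without real delays.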