PixVerse Video
Create PixVerse AI video tasks across nine capabilities: text-to-video, image-to-video, transition, multi-transition, fusion, restyle, mimic, lip sync, and marketing agents.
PixVerse generates short videos asynchronously across nine capabilities. Submit a generation request, then poll the task until it reaches success or failed. Tasks created here are queried through the same AI Video task, list, share, and delete endpoints.
All endpoints on this page use the OpenAPI base URL
https://api.marswave.ai/openapi and authenticate with your API key via the
Authorization: Bearer $LISTENHUB_API_KEY header.
Endpoints
| Method | Path | Purpose |
|---|---|---|
| POST | /v1/video-generation/pixverse/generate | Create a PixVerse generation task. |
| POST | /v1/video-generation/pixverse/estimate-credits | Estimate credits before generation. |
Region routing follows language. The default en uses the PixVerse
international service; zh uses the China service. PixVerse provider keys,
internal media IDs, trace IDs, and raw provider responses are never returned to
the client.
Capabilities
capability is required. It selects the generation mode and decides which assets and nested fields are needed.
| Capability | What it does | Required input |
|---|---|---|
| text_to_video | Generate from a text prompt only | prompt; no assets |
| image_to_video | Animate from one or more images | prompt + 1-10 images |
| transition | Transition between two images | exactly 2 images + prompt |
| multi_transition | Multi-clip transition sequence | pixverse.multiTransition (2-7 clips); no top-level assets |
| fusion | Compose subjects/backgrounds by reference | pixverse.imageReferences (1-8) + prompt containing each @refName |
| restyle | Restyle a prior PixVerse video | sourceTaskId (or pixverse.sourceVideoId) + pixverse.restyleId; no assets |
| mimic | Apply a motion video to a subject image | exactly 1 image + 1 video |
| lip_sync | Lip-sync a video to audio or TTS | 1 video (or sourceTaskId) + 1 audio or pixverse.tts |
| agent | Marketing agent (ad_master / promo_mix) | pixverse.agentType + product images |
Request Parameters
POST /v1/video-generation/pixverse/generate
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| capability | string | Yes | - | One of the nine capabilities above. |
| model | string | No | pixverse | PixVerse model version: pixverse, v6, v5, or v4.5. |
| language | string | No | en | Service region: en (international) or zh (China). |
| prompt | string | No | - | Up to 2048 characters. Required for text_to_video, image_to_video, transition, fusion, agent. |
| quality | string | No | 720p | 360p, 540p, 720p, or 1080p. multi_transition defaults to 360p. |
| aspectRatio | string | No | 16:9 | 9:16, 16:9, 1:1, 4:3, or 3:4. agent defaults to 9:16. |
| duration | integer | No | 5 | Output seconds, 1-60. agent accepts only 20, 30, or 60 (default 30). |
| sourceTaskId | string | No | - | A prior succeeded PixVerse task to reuse (restyle / lip_sync source video). |
| images | array | No | [] | Up to 10 items, each { url, duration? }. |
| videos | array | No | [] | Up to 2 items, each { url, duration? }. |
| audios | array | No | [] | Up to 1 item, { url, duration? }. |
| pixverse | object | No | {} | Capability-specific options. See Nested pixverse Object. |
Each asset's url is required; the optional duration is in seconds (0-180).
Nested pixverse Object
| Field | Type | Used by | Description |
|---|---|---|---|
| agentType | string | agent | ad_master or promo_mix. |
| motionMode | string | optional | Motion preset. |
| cameraMovement | string | optional | Camera movement preset. |
| templateId | string/number | optional | Template identifier. |
| sourceVideoId | string/number | restyle/lip_sync | Provider source video id (alternative to sourceTaskId). |
| restyleId | string/number | restyle | Required restyle style id. |
| multiTransition | array | multi_transition | 2-7 clips, each { imageUrl, duration (0-30), prompt }. |
| imageReferences | array | fusion | 1-8 refs, each { type, imageUrl, refName }; type is subject or background. |
| tts | object | lip_sync | { speakerId, content } to drive lip sync from synthesized speech. |
| soundEffectSwitch | boolean | optional | Enable generated sound effects. |
| soundEffectContent | string | optional | Sound-effect description. |
| lipSyncTtsSwitch | boolean | optional | Enable TTS-driven lip sync. |
| lipSyncTtsSpeakerId | string | optional | Speaker id for TTS lip sync. |
| lipSyncTtsContent | string | optional | Spoken text for TTS lip sync. |
| brandSticker | object | agent | { imageUrl, position }; position is one of up, down, left, right, upper_left, lower_left, upper_right, lower_right. |
| introOutroClip | object | agent | { videoUrl, position }; position is start or end. |
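As an illustration of how the nested object is used, the sketch below assembles a multi_transition request body from a list of clips and enforces the documented 2-7 clip and 0-30 second limits before anything is sent. The function name and the tuple-based input shape are hypothetical choices made for this example; only the field names come from the tables above.

```python
def build_multi_transition_request(clips, quality="360p", language="en"):
    """Build a generate body for the multi_transition capability.

    clips is a list of (image_url, duration_seconds, prompt) tuples.
    This helper is illustrative and not part of any official SDK.
    """
    if not 2 <= len(clips) <= 7:
        raise ValueError("multiTransition needs 2-7 clips")

    items = []
    for image_url, duration, prompt in clips:
        if not 0 <= duration <= 30:
            raise ValueError(f"clip duration {duration} is outside 0-30")
        items.append({"imageUrl": image_url, "duration": duration, "prompt": prompt})

    # multi_transition takes no top-level images/videos/audios:
    # every clip lives under pixverse.multiTransition.
    return {
        "capability": "multi_transition",
        "language": language,
        "quality": quality,
        "pixverse": {"multiTransition": items},
    }
```

The returned dictionary can be passed as the JSON body of the generate request.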
refName Format
A refName must match ^[A-Za-z][A-Za-z0-9_]{0,31}$ — it starts with a letter and contains only letters, digits, and underscores.
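A client can check the pattern locally before submitting a fusion request. The following is a minimal sketch; the helper name is hypothetical, and the regular expression is the one quoted above.

```python
import re

# Pattern copied from the documentation: a letter, then up to 31
# letters, digits, or underscores (32 characters at most).
REF_NAME_PATTERN = re.compile(r"^[A-Za-z][A-Za-z0-9_]{0,31}$")


def is_valid_ref_name(name):
    """Return True if name satisfies the documented refName pattern."""
    # fullmatch avoids accepting a name with a trailing newline.
    return REF_NAME_PATTERN.fullmatch(name) is not None
```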
Per-Capability Constraints
The generation request is validated per capability. The most common rules:
| Capability | Constraint |
|---|---|
| mimic | quality is locked to 720p. Needs exactly 1 image + 1 video. Video duration, if given, must be 5-30s. |
| agent | quality must be 720p or 1080p; duration must be 20, 30, or 60. |
| agent promo_mix | Needs at least 4 product images. |
| agent ad_master | Needs at least 1 product image and no video. |
| multi_transition | Default quality is 360p. Use pixverse.multiTransition; no top-level images/videos/audios. |
| fusion | The prompt must contain @refName for every entry in pixverse.imageReferences. |
| transition | Exactly 2 images. |
| restyle | Requires a source (sourceTaskId or pixverse.sourceVideoId) plus pixverse.restyleId. |
| lip_sync | Needs a source video (1 video or sourceTaskId) plus exactly one audio source — either 1 audio (5-60s) or pixverse.tts, not both. |
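Some of these rules can be checked on the client before a request is sent, which avoids spending a rate-limited call on a body that will be rejected. The sketch below covers only a subset (transition, mimic, the lip_sync audio source, and the fusion prompt); the function name is hypothetical, and the way an @refName mention is detected in the prompt is an assumption, since the server's exact matching rule is not documented here.

```python
import re


def check_request(body):
    """Return a list of problems found in a generate body (empty if none).

    Covers only a few documented rules; the server remains the authority.
    """
    problems = []
    capability = body.get("capability")
    images = body.get("images", [])
    videos = body.get("videos", [])
    audios = body.get("audios", [])
    pixverse = body.get("pixverse", {})

    if capability == "transition":
        if len(images) != 2:
            problems.append("transition needs exactly 2 images")
    elif capability == "mimic":
        if len(images) != 1 or len(videos) != 1:
            problems.append("mimic needs exactly 1 image and 1 video")
        if body.get("quality", "720p") != "720p":
            problems.append("mimic quality is locked to 720p")
    elif capability == "lip_sync":
        has_audio = len(audios) == 1
        has_tts = "tts" in pixverse
        if has_audio == has_tts:  # neither or both
            problems.append("lip_sync needs exactly one of audios or pixverse.tts")
    elif capability == "fusion":
        prompt = body.get("prompt", "")
        for ref in pixverse.get("imageReferences", []):
            name = ref["refName"]
            # Assumption: "@hero" must not be satisfied by "@hero2".
            mention = "@" + re.escape(name) + r"(?![A-Za-z0-9_])"
            if not re.search(mention, prompt):
                problems.append(f"prompt is missing @{name}")
    return problems
```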
Pricing
PixVerse uses a provider-credit pricing model: ListenHub credits are derived from the provider's quoted cost. Because cost depends on capability, quality, duration, and asset mix, always call estimate-credits before generate to show the user an accurate cost. Credits are charged when the task is created and refunded automatically if generation fails.
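The estimate-then-generate flow can be sketched as follows. This is an illustrative helper, not an official client: post_json and generate_within_budget are hypothetical names, the budget check is a design choice for the example, and sending agentType nested under pixverse in the estimate body is an assumption based on the generate request shape.

```python
import json
import os
import urllib.request

BASE_URL = "https://api.marswave.ai/openapi"
ESTIMATE_FIELDS = ("capability", "model", "language", "duration", "quality")


def post_json(path, body):
    """POST a JSON body to the OpenAPI base URL and return the parsed JSON."""
    request = urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {os.environ['LISTENHUB_API_KEY']}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def generate_within_budget(body, max_credits, post=post_json):
    """Estimate first; create the task only if the estimate fits the budget."""
    estimate_body = {key: body[key] for key in ESTIMATE_FIELDS if key in body}
    agent_type = body.get("pixverse", {}).get("agentType")
    if agent_type:
        # Assumption: the estimate endpoint accepts the same nesting as generate.
        estimate_body["pixverse"] = {"agentType": agent_type}

    estimate = post("/v1/video-generation/pixverse/estimate-credits", estimate_body)
    credits = estimate["data"]["credits"]
    if credits > max_credits:
        raise ValueError(f"estimated {credits} credits exceeds budget of {max_credits}")

    return post("/v1/video-generation/pixverse/generate", body)["data"]
```

Passing post as a parameter keeps the flow testable without network access.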
Create PixVerse Task
POST /v1/video-generation/pixverse/generate
Returns a taskId and episodeId. Poll GET /v1/video-generation/tasks/{taskId} until the task status is success or failed.
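A polling loop for this step might look like the sketch below. The task endpoint's response schema is not shown on this page, so reading the status from data.status is an assumption based on the create response; the function names, the 5-second interval, and the 10-minute timeout are likewise illustrative choices.

```python
import json
import os
import time
import urllib.request

BASE_URL = "https://api.marswave.ai/openapi"


def fetch_task(task_id):
    """GET the task and return the parsed JSON body."""
    request = urllib.request.Request(
        f"{BASE_URL}/v1/video-generation/tasks/{task_id}",
        headers={"Authorization": f"Bearer {os.environ['LISTENHUB_API_KEY']}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def wait_for_task(task_id, fetch=fetch_task, interval=5.0, timeout=600.0,
                  sleep=time.sleep):
    """Poll until the task status is success or failed, then return its data."""
    deadline = time.monotonic() + timeout
    while True:
        data = fetch(task_id)["data"]
        # Assumption: the status field sits at data.status, as in the
        # create response.
        if data["status"] in ("success", "failed"):
            return data
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} still {data['status']} after {timeout}s")
        sleep(interval)
```

Keep the interval comfortably above one second so polling does not compete with other requests for rate-limit headroom.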
Text to Video
curl -X POST "https://api.marswave.ai/openapi/v1/video-generation/pixverse/generate" \
-H "Authorization: Bearer $LISTENHUB_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"capability": "text_to_video",
"model": "pixverse",
"language": "en",
"prompt": "A neon-lit street in the rain, cinematic slow dolly shot",
"quality": "720p",
"aspectRatio": "16:9",
"duration": 5
}'

const response = await fetch(
'https://api.marswave.ai/openapi/v1/video-generation/pixverse/generate',
{
method: 'POST',
headers: {
Authorization: `Bearer ${process.env.LISTENHUB_API_KEY}`,
'Content-Type': 'application/json',
},
body: JSON.stringify({
capability: 'text_to_video',
model: 'pixverse',
language: 'en',
prompt: 'A neon-lit street in the rain, cinematic slow dolly shot',
quality: '720p',
aspectRatio: '16:9',
duration: 5,
}),
},
)
const data = await response.json()
console.log('Task ID:', data.data.taskId)

import os
import requests
response = requests.post(
'https://api.marswave.ai/openapi/v1/video-generation/pixverse/generate',
headers={'Authorization': f'Bearer {os.environ["LISTENHUB_API_KEY"]}'},
json={
'capability': 'text_to_video',
'model': 'pixverse',
'language': 'en',
'prompt': 'A neon-lit street in the rain, cinematic slow dolly shot',
'quality': '720p',
'aspectRatio': '16:9',
'duration': 5,
},
)
data = response.json()
print('Task ID:', data['data']['taskId'])

Lip Sync
Provide one source video (or a sourceTaskId) plus exactly one audio source: either one audios item or pixverse.tts.
curl -X POST "https://api.marswave.ai/openapi/v1/video-generation/pixverse/generate" \
-H "Authorization: Bearer $LISTENHUB_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"capability": "lip_sync",
"quality": "720p",
"videos": [
{ "url": "https://example.com/talking-head.mp4", "duration": 12 }
],
"pixverse": {
"tts": {
"speakerId": "en_male_001",
"content": "Welcome back to the channel. Today we are shipping something new."
}
}
}'

const response = await fetch(
'https://api.marswave.ai/openapi/v1/video-generation/pixverse/generate',
{
method: 'POST',
headers: {
Authorization: `Bearer ${process.env.LISTENHUB_API_KEY}`,
'Content-Type': 'application/json',
},
body: JSON.stringify({
capability: 'lip_sync',
quality: '720p',
videos: [
{ url: 'https://example.com/talking-head.mp4', duration: 12 },
],
pixverse: {
tts: {
speakerId: 'en_male_001',
content:
'Welcome back to the channel. Today we are shipping something new.',
},
},
}),
},
)
const data = await response.json()
console.log('Task ID:', data.data.taskId)

import os
import requests
response = requests.post(
'https://api.marswave.ai/openapi/v1/video-generation/pixverse/generate',
headers={'Authorization': f'Bearer {os.environ["LISTENHUB_API_KEY"]}'},
json={
'capability': 'lip_sync',
'quality': '720p',
'videos': [
{'url': 'https://example.com/talking-head.mp4', 'duration': 12}
],
'pixverse': {
'tts': {
'speakerId': 'en_male_001',
'content': 'Welcome back to the channel. Today we are shipping something new.',
}
},
},
)
data = response.json()
print('Task ID:', data['data']['taskId'])

Response:
{
"code": 0,
"message": "",
"data": {
"taskId": "665f1d4e8b3a3f001234abcd",
"episodeId": "665f1d4e8b3a3f001234abce",
"status": "generating"
}
}

Estimate Credits
POST /v1/video-generation/pixverse/estimate-credits
Estimate the credit cost before creating a task.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| capability | string | Yes | - | One of the nine capabilities. |
| model | string | No | pixverse | pixverse, v6, v5, or v4.5. |
| language | string | No | en | en (international) or zh (China). |
| duration | integer | No | 5 | 1-60 seconds (agent: 20, 30, or 60). |
| quality | string | No | 720p | 360p, 540p, 720p, or 1080p (multi_transition: 360p). |
| pixverse.agentType | string | No | - | ad_master or promo_mix (required for agent). |
curl -X POST "https://api.marswave.ai/openapi/v1/video-generation/pixverse/estimate-credits" \
-H "Authorization: Bearer $LISTENHUB_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"capability": "text_to_video",
"model": "pixverse",
"quality": "720p",
"duration": 5
}'

Response:
{
"code": 0,
"message": "",
"data": {
"tokens": 155520,
"credits": 12
}
}

Rate Limit
PixVerse generation shares the AI video generation rate limit of 5 RPM per user on the generate endpoint. Exceeding it returns error 29998 (429). Implement exponential backoff on retries.
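One way to implement the retry is exponential backoff with random jitter, sketched below. The helper name, the attempt count, and the delay bounds are illustrative assumptions, and send stands for any caller-supplied function that performs the request and returns an (HTTP status, parsed body) pair.

```python
import random
import time


def call_with_backoff(send, max_attempts=5, base_delay=1.0, max_delay=30.0,
                      sleep=time.sleep):
    """Call send() until it returns a non-429 status or attempts run out.

    send must return a (http_status, body) tuple. The last response is
    returned either way, so the caller can inspect a final 429.
    """
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            # The ceiling doubles on each attempt: 1s, 2s, 4s, ... up to
            # max_delay. A random wait below the ceiling spreads out
            # clients that were throttled at the same moment.
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, ceiling))
    return status, body
```

With a limit of 5 requests per minute, a max_delay of at least several seconds gives the window time to recover.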
Error Codes
| Code | HTTP | Meaning |
|---|---|---|
| 32001 | 404 | Task not found. |
| 32002 | 402 | Not enough credits. |
| 32003 | 500 | Provider error during generation. |
| 32004 | 400 | Invalid parameters or unsupported capability combination. |
| 32005 | 403 | The task exists but does not belong to the current API user. |
| 32006 | 400 | Audio input requires at least one image or video. |
| 32007 | 429 | Upstream provider throttling or video concurrency-slot exhaustion (distinct from the per-user 29998 request-rate limit). |
| 32008 | 400 | Content rejected by moderation. |
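Client code can turn these codes into a simple retry-or-fail decision. The sketch below assumes error responses use the same code/message envelope as the success responses shown earlier, which this page does not confirm. Treating only the two 429 codes as retryable is a judgment call for the example, not a documented rule.

```python
# Meanings copied from the table above, plus the per-user rate limit code.
ERROR_MEANINGS = {
    29998: "Per-user request-rate limit exceeded.",
    32001: "Task not found.",
    32002: "Not enough credits.",
    32003: "Provider error during generation.",
    32004: "Invalid parameters or unsupported capability combination.",
    32005: "The task exists but does not belong to the current API user.",
    32006: "Audio input requires at least one image or video.",
    32007: "Upstream provider throttling or video concurrency-slot exhaustion.",
    32008: "Content rejected by moderation.",
}

# Assumption: only throttling errors are worth retrying automatically.
RETRYABLE_CODES = {29998, 32007}


def classify_response(body):
    """Return ("ok" | "retry" | "fail", meaning) for a parsed response body."""
    code = body.get("code")
    if code == 0:
        return "ok", ""
    meaning = ERROR_MEANINGS.get(code) or body.get("message") or "Unknown error."
    kind = "retry" if code in RETRYABLE_CODES else "fail"
    return kind, meaning
```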