
PixVerse Video

Create PixVerse AI video tasks across nine capabilities: text-to-video, image-to-video, single and multi-clip transitions, fusion, restyle, mimic, lip sync, and marketing agents.

PixVerse generates short videos asynchronously. Submit a generation request, then poll the task until its status is success or failed. Tasks created here are queried through the same AI Video task, list, share, and delete endpoints.

All endpoints on this page use the OpenAPI base URL https://api.marswave.ai/openapi and authenticate with your API key via the Authorization: Bearer $LISTENHUB_API_KEY header.

Endpoints

| Method | Path | Purpose |
| --- | --- | --- |
| POST | /v1/video-generation/pixverse/generate | Create a PixVerse generation task. |
| POST | /v1/video-generation/pixverse/estimate-credits | Estimate credits before generation. |

Region routing follows the language parameter. The default, en, uses the PixVerse international service; zh uses the China service. PixVerse provider keys, internal media IDs, trace IDs, and raw provider responses are never returned to the client.

Capabilities

capability is required. It selects the generation mode and decides which assets and nested fields are needed.

| Capability | What it does | Required input |
| --- | --- | --- |
| text_to_video | Generate from a text prompt only | prompt; no assets |
| image_to_video | Animate from one or more images | prompt + 1-10 images |
| transition | Transition between two images | exactly 2 images + prompt |
| multi_transition | Multi-clip transition sequence | pixverse.multiTransition (2-7 clips); no top-level assets |
| fusion | Compose subjects/backgrounds by reference | pixverse.imageReferences (1-8) + prompt containing each @refName |
| restyle | Restyle a prior PixVerse video | sourceTaskId (or pixverse.sourceVideoId) + pixverse.restyleId; no assets |
| mimic | Apply a motion video to a subject image | exactly 1 image + 1 video |
| lip_sync | Lip-sync a video to audio or TTS | 1 video (or sourceTaskId) + 1 audio or pixverse.tts |
| agent | Marketing agent (ad_master / promo_mix) | pixverse.agentType + product images |

Request Parameters

POST /v1/video-generation/pixverse/generate

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| capability | string | Yes | - | One of the nine capabilities above. |
| model | string | No | pixverse | PixVerse model version: pixverse, v6, v5, or v4.5. |
| language | string | No | en | Service region: en (international) or zh (China). |
| prompt | string | No | - | Up to 2048 characters. Required for text_to_video, image_to_video, transition, fusion, agent. |
| quality | string | No | 720p | 360p, 540p, 720p, or 1080p. multi_transition defaults to 360p. |
| aspectRatio | string | No | 16:9 | 9:16, 16:9, 1:1, 4:3, or 3:4. agent defaults to 9:16. |
| duration | integer | No | 5 | Output length in seconds, 1-60. agent accepts only 20, 30, or 60 (default 30). |
| sourceTaskId | string | No | - | A prior PixVerse task that succeeded, reused as the source video for restyle or lip_sync. |
| images | array | No | [] | Up to 10 items, each { url, duration? }. |
| videos | array | No | [] | Up to 2 items, each { url, duration? }. |
| audios | array | No | [] | Up to 1 item, { url, duration? }. |
| pixverse | object | No | {} | Capability-specific options. See Nested pixverse Object. |

Each asset's url is required; the optional duration is in seconds (0-180).

Nested pixverse Object

| Field | Type | Used by | Description |
| --- | --- | --- | --- |
| agentType | string | agent | ad_master or promo_mix. |
| motionMode | string | optional | Motion preset. |
| cameraMovement | string | optional | Camera movement preset. |
| templateId | string/number | optional | Template identifier. |
| sourceVideoId | string/number | restyle/lip_sync | Provider source video id (alternative to sourceTaskId). |
| restyleId | string/number | restyle | Required restyle style id. |
| multiTransition | array | multi_transition | 2-7 clips, each { imageUrl, duration (0-30), prompt }. |
| imageReferences | array | fusion | 1-8 refs, each { type, imageUrl, refName }; type is subject or background. |
| tts | object | lip_sync | { speakerId, content } to drive lip sync from synthesized speech. |
| soundEffectSwitch | boolean | optional | Enable generated sound effects. |
| soundEffectContent | string | optional | Sound-effect description. |
| lipSyncTtsSwitch | boolean | optional | Enable TTS-driven lip sync. |
| lipSyncTtsSpeakerId | string | optional | Speaker id for TTS lip sync. |
| lipSyncTtsContent | string | optional | Spoken text for TTS lip sync. |
| brandSticker | object | agent | { imageUrl, position }; position is one of up, down, left, right, upper_left, lower_left, upper_right, lower_right. |
| introOutroClip | object | agent | { videoUrl, position }; position is start or end. |
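
To illustrate how the nested fields combine with the top-level ones, the sketch below assembles a multi_transition request body and checks the clip count and per-clip duration ranges listed in the table. The helper name build_multi_transition_body is hypothetical, and treating all three clip fields as mandatory is an assumption, not documented behavior.

```python
def build_multi_transition_body(clips, quality="360p"):
    """Build a multi_transition request body (hypothetical helper).

    clips: list of dicts with imageUrl, duration, and prompt keys.
    """
    if not 2 <= len(clips) <= 7:
        raise ValueError("multiTransition needs 2-7 clips")
    for index, clip in enumerate(clips):
        # Assumption: every clip carries all three fields.
        for key in ("imageUrl", "duration", "prompt"):
            if key not in clip:
                raise ValueError(f"clip {index} is missing {key}")
        if not 0 <= clip["duration"] <= 30:
            raise ValueError(f"clip {index} duration must be 0-30 seconds")
    return {
        "capability": "multi_transition",
        "quality": quality,
        # This capability takes no top-level images/videos/audios.
        "pixverse": {"multiTransition": [dict(clip) for clip in clips]},
    }
```

The returned dictionary can be sent as the JSON body of the generate request; the server remains the authority on what is actually accepted.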

refName Format

A refName must match ^[A-Za-z][A-Za-z0-9_]{0,31}$: it starts with a letter, contains only letters, digits, and underscores, and is 1 to 32 characters long.
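
A client can apply the same pattern before submitting a fusion request. The sketch below is a minimal local check; the function names are hypothetical, and the rule that an @refName token must not be followed by another name character is an assumption about how the prompt is matched, not documented behavior.

```python
import re

# Pattern copied from the refName rule above.
REF_NAME_RE = re.compile(r"^[A-Za-z][A-Za-z0-9_]{0,31}$")


def is_valid_ref_name(name):
    """Return True if name satisfies the documented refName pattern."""
    return REF_NAME_RE.fullmatch(name) is not None


def missing_refs(prompt, ref_names):
    """Return the refNames that have no @refName token in the prompt."""
    missing = []
    for name in ref_names:
        # Assumption: "@cat" should not be satisfied by "@cat2",
        # so reject a match that continues with a name character.
        token = r"@" + re.escape(name) + r"(?![A-Za-z0-9_])"
        if not re.search(token, prompt):
            missing.append(name)
    return missing
```

An empty list from missing_refs means every reference is mentioned in the prompt.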

Per-Capability Constraints

The generation request is validated per capability. The most common rules:

| Capability | Constraint |
| --- | --- |
| mimic | quality is locked to 720p. Needs exactly 1 image + 1 video. Video duration, if given, must be 5-30s. |
| agent | quality must be 720p or 1080p; duration must be 20, 30, or 60. |
| agent promo_mix | Needs at least 4 product images. |
| agent ad_master | Needs at least 1 product image and no video. |
| multi_transition | Default quality is 360p. Use pixverse.multiTransition; no top-level images/videos/audios. |
| fusion | The prompt must contain @refName for every entry in pixverse.imageReferences. |
| transition | Exactly 2 images. |
| restyle | Requires a source (sourceTaskId or pixverse.sourceVideoId) plus pixverse.restyleId. |
| lip_sync | Needs a source video (1 video or sourceTaskId) plus exactly one audio source: either 1 audio (5-60s) or pixverse.tts, not both. |
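
Checking some of these rules locally can avoid a round trip that ends in error 32004. The sketch below covers only a subset of the table (transition, mimic, agent, restyle, lip_sync) and uses the defaults documented on this page; preflight_errors is a hypothetical helper, and the server remains the authority on validation.

```python
def preflight_errors(body):
    """Return a list of rule violations; an empty list means none were found."""
    capability = body.get("capability")
    images = body.get("images", [])
    videos = body.get("videos", [])
    audios = body.get("audios", [])
    pixverse = body.get("pixverse", {})
    errors = []

    if capability == "transition":
        if len(images) != 2:
            errors.append("transition needs exactly 2 images")

    elif capability == "mimic":
        if len(images) != 1 or len(videos) != 1:
            errors.append("mimic needs exactly 1 image and 1 video")
        if body.get("quality", "720p") != "720p":
            errors.append("mimic quality is locked to 720p")

    elif capability == "agent":
        if body.get("quality", "720p") not in ("720p", "1080p"):
            errors.append("agent quality must be 720p or 1080p")
        if body.get("duration", 30) not in (20, 30, 60):
            errors.append("agent duration must be 20, 30, or 60")
        agent_type = pixverse.get("agentType")
        if agent_type == "promo_mix" and len(images) < 4:
            errors.append("promo_mix needs at least 4 product images")
        if agent_type == "ad_master" and (len(images) < 1 or videos):
            errors.append("ad_master needs at least 1 product image and no video")

    elif capability == "restyle":
        has_source = "sourceTaskId" in body or "sourceVideoId" in pixverse
        if not has_source or "restyleId" not in pixverse:
            errors.append("restyle needs a source and pixverse.restyleId")

    elif capability == "lip_sync":
        if len(videos) != 1 and "sourceTaskId" not in body:
            errors.append("lip_sync needs 1 video or sourceTaskId")
        # Exactly one audio source: an audios item or pixverse.tts.
        if (len(audios) == 1) == ("tts" in pixverse):
            errors.append("lip_sync needs exactly one of: 1 audio or pixverse.tts")

    return errors
```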

Pricing

PixVerse uses a provider-credit pricing model: ListenHub credits are derived from the provider's quoted cost. Because cost depends on capability, quality, duration, and asset mix, always call estimate-credits before generate to show the user an accurate cost. Credits are charged when the task is created and refunded automatically if generation fails.

Create PixVerse Task

POST /v1/video-generation/pixverse/generate

Returns a taskId and episodeId. Poll GET /v1/video-generation/tasks/{taskId} until the task is success or failed.

Text to Video

curl -X POST "https://api.marswave.ai/openapi/v1/video-generation/pixverse/generate" \
  -H "Authorization: Bearer $LISTENHUB_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "capability": "text_to_video",
    "model": "pixverse",
    "language": "en",
    "prompt": "A neon-lit street in the rain, cinematic slow dolly shot",
    "quality": "720p",
    "aspectRatio": "16:9",
    "duration": 5
  }'

JavaScript:

const response = await fetch(
  'https://api.marswave.ai/openapi/v1/video-generation/pixverse/generate',
  {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${process.env.LISTENHUB_API_KEY}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      capability: 'text_to_video',
      model: 'pixverse',
      language: 'en',
      prompt: 'A neon-lit street in the rain, cinematic slow dolly shot',
      quality: '720p',
      aspectRatio: '16:9',
      duration: 5,
    }),
  },
)
const data = await response.json()
console.log('Task ID:', data.data.taskId)

Python:

import os
import requests

response = requests.post(
    'https://api.marswave.ai/openapi/v1/video-generation/pixverse/generate',
    headers={'Authorization': f'Bearer {os.environ["LISTENHUB_API_KEY"]}'},
    json={
        'capability': 'text_to_video',
        'model': 'pixverse',
        'language': 'en',
        'prompt': 'A neon-lit street in the rain, cinematic slow dolly shot',
        'quality': '720p',
        'aspectRatio': '16:9',
        'duration': 5,
    },
)
data = response.json()
print('Task ID:', data['data']['taskId'])

Lip Sync

Provide one source video (or a sourceTaskId) plus exactly one audio source: either one audios item or pixverse.tts.

curl -X POST "https://api.marswave.ai/openapi/v1/video-generation/pixverse/generate" \
  -H "Authorization: Bearer $LISTENHUB_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "capability": "lip_sync",
    "quality": "720p",
    "videos": [
      { "url": "https://example.com/talking-head.mp4", "duration": 12 }
    ],
    "pixverse": {
      "tts": {
        "speakerId": "en_male_001",
        "content": "Welcome back to the channel. Today we are shipping something new."
      }
    }
  }'

JavaScript:

const response = await fetch(
  'https://api.marswave.ai/openapi/v1/video-generation/pixverse/generate',
  {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${process.env.LISTENHUB_API_KEY}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      capability: 'lip_sync',
      quality: '720p',
      videos: [
        { url: 'https://example.com/talking-head.mp4', duration: 12 },
      ],
      pixverse: {
        tts: {
          speakerId: 'en_male_001',
          content:
            'Welcome back to the channel. Today we are shipping something new.',
        },
      },
    }),
  },
)
const data = await response.json()
console.log('Task ID:', data.data.taskId)

Python:

import os
import requests

response = requests.post(
    'https://api.marswave.ai/openapi/v1/video-generation/pixverse/generate',
    headers={'Authorization': f'Bearer {os.environ["LISTENHUB_API_KEY"]}'},
    json={
        'capability': 'lip_sync',
        'quality': '720p',
        'videos': [
            {'url': 'https://example.com/talking-head.mp4', 'duration': 12}
        ],
        'pixverse': {
            'tts': {
                'speakerId': 'en_male_001',
                'content': 'Welcome back to the channel. Today we are shipping something new.',
            }
        },
    },
)
data = response.json()
print('Task ID:', data['data']['taskId'])

Response:

{
  "code": 0,
  "message": "",
  "data": {
    "taskId": "665f1d4e8b3a3f001234abcd",
    "episodeId": "665f1d4e8b3a3f001234abce",
    "status": "generating"
  }
}
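
Once a taskId is returned, the task has to be polled until it finishes. The sketch below is one way to do that with the standard library. It assumes the task endpoint uses the same envelope as the response above, with the status at data.status; the function names, the 5-second interval, and the 10-minute timeout are illustrative choices, not documented values.

```python
import json
import os
import time
import urllib.request

BASE_URL = "https://api.marswave.ai/openapi"


def fetch_task(task_id):
    """GET the task once and return the decoded JSON body."""
    request = urllib.request.Request(
        f"{BASE_URL}/v1/video-generation/tasks/{task_id}",
        headers={"Authorization": f"Bearer {os.environ['LISTENHUB_API_KEY']}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def wait_for_task(task_id, fetch=fetch_task, interval=5.0, timeout=600.0,
                  sleep=time.sleep):
    """Poll until the task status is success or failed, then return the body."""
    deadline = time.monotonic() + timeout
    while True:
        task = fetch(task_id)
        # Assumption: the status lives at data.status, as in the create response.
        status = task["data"]["status"]
        if status in ("success", "failed"):
            return task
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} still {status} after {timeout}s")
        sleep(interval)
```

The fetch and sleep parameters exist so the loop can be exercised without network access or real delays.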

Estimate Credits

POST /v1/video-generation/pixverse/estimate-credits

Estimate the credit cost before creating a task.

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| capability | string | Yes | - | One of the nine capabilities. |
| model | string | No | pixverse | pixverse, v6, v5, or v4.5. |
| language | string | No | en | en (international) or zh (China). |
| duration | integer | No | 5 | 1-60 seconds (agent: 20, 30, or 60). |
| quality | string | No | 720p | 360p, 540p, 720p, or 1080p. multi_transition defaults to 360p. |
| pixverse.agentType | string | No | - | ad_master or promo_mix (required for agent). |

curl -X POST "https://api.marswave.ai/openapi/v1/video-generation/pixverse/estimate-credits" \
  -H "Authorization: Bearer $LISTENHUB_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "capability": "text_to_video",
    "model": "pixverse",
    "quality": "720p",
    "duration": 5
  }'

Response:

{
  "code": 0,
  "message": "",
  "data": {
    "tokens": 155520,
    "credits": 12
  }
}
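
The estimate can gate the real request. The sketch below calls estimate-credits first and only submits the generation when the quoted credits fit a caller-supplied budget. The helper names and the max_credits parameter are hypothetical, and sending agentType nested under pixverse in the estimate body is an assumption based on the generate request shape.

```python
import json
import os
import urllib.request

BASE_URL = "https://api.marswave.ai/openapi"
ESTIMATE_KEYS = ("capability", "model", "language", "duration", "quality")


def post_json(path, body):
    """POST a JSON body and return the decoded JSON response.

    urllib raises urllib.error.HTTPError for non-2xx responses.
    """
    request = urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {os.environ['LISTENHUB_API_KEY']}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def generate_within_budget(body, max_credits, post=post_json):
    """Estimate first; create the task only if it fits max_credits."""
    estimate_body = {key: body[key] for key in ESTIMATE_KEYS if key in body}
    agent_type = body.get("pixverse", {}).get("agentType")
    if agent_type:
        # Assumption: the estimate endpoint accepts the nested form.
        estimate_body["pixverse"] = {"agentType": agent_type}

    estimate = post("/v1/video-generation/pixverse/estimate-credits", estimate_body)
    credits = estimate["data"]["credits"]
    if credits > max_credits:
        raise ValueError(f"estimated {credits} credits exceeds budget {max_credits}")
    return post("/v1/video-generation/pixverse/generate", body)
```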

Rate Limit

PixVerse generation shares the AI video generation rate limit of 5 requests per minute (RPM) per user on the generate endpoint. Exceeding it returns error code 29998 (HTTP 429). Implement exponential backoff on retries.
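
A minimal sketch of exponential backoff follows. It assumes the caller supplies a send callable that performs the request and returns an (http_status, payload) pair; the function names, the base delay, the cap, and the added jitter are illustrative choices, not values specified by the API.

```python
import random
import time


def backoff_delays(retries, base=1.0, cap=60.0):
    """Delays that double each attempt: base, 2*base, 4*base, ... up to cap."""
    return [min(cap, base * 2 ** attempt) for attempt in range(retries)]


def call_with_backoff(send, retries=5, base=1.0, cap=60.0, sleep=time.sleep):
    """Call send(); on HTTP 429 wait with growing delays and try again.

    send is a caller-supplied function returning (http_status, payload).
    """
    for delay in backoff_delays(retries, base, cap):
        status, payload = send()
        if status != 429:
            return status, payload
        # Random jitter spreads out clients that were throttled together.
        sleep(delay + random.uniform(0, delay / 2))
    # Final attempt after the last wait; its result is returned as-is.
    return send()
```

With a limit of 5 RPM, a base delay in the range of several seconds may be more practical than the 1-second default used here.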

Error Codes

| Code | HTTP | Meaning |
| --- | --- | --- |
| 32001 | 404 | Task not found. |
| 32002 | 402 | Not enough credits. |
| 32003 | 500 | Provider error during generation. |
| 32004 | 400 | Invalid parameters or unsupported capability combination. |
| 32005 | 403 | The task exists but does not belong to the current API user. |
| 32006 | 400 | Audio input requires at least one image or video. |
| 32007 | 429 | Upstream provider throttling or video concurrency-slot exhaustion (distinct from the per-user 29998 request-rate limit). |
| 32008 | 400 | Content rejected by moderation. |
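
One way to act on these codes is to group them by the response a client should take. The sketch below assumes error responses carry the code in the same top-level code field as successful responses; the grouping and action names are a suggested policy, not part of the API.

```python
# Suggested client-side grouping of the documented codes (an assumption,
# not an official policy).
ACTIONS = {
    29998: "retry_later",       # per-user request-rate limit
    32007: "retry_later",       # upstream throttling or no concurrency slot
    32002: "add_credits",
    32003: "provider_failure",  # whether to resubmit is a judgment call
    32004: "fix_request",
    32006: "fix_request",
    32008: "fix_request",       # content rejected by moderation
    32001: "check_task_id",
    32005: "check_task_id",
}


def classify_response(payload):
    """Map a decoded response body to a coarse action name."""
    code = payload.get("code")
    if code == 0:
        return "ok"
    return ACTIONS.get(code, "unknown")
```

Only the retry_later group is a natural fit for automatic retries; the others generally need a changed request or account state first.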
