OpenAPI Commands
Reference for every `listenhub openapi` command — the API-key namespace for scripts and CI.
The `listenhub openapi` namespace runs every command against your API key instead of an OAuth login. Use it on servers and in CI, where you control the environment and want a long-lived credential rather than an interactive browser flow.
Before you start, make the key available: set `LISTENHUB_API_KEY` or store it with `listenhub openapi config set-key`. See Authentication for where credentials live and how the CLI resolves them.
export LISTENHUB_API_KEY="lh_sk_..."
listenhub openapi speakers list --language en
Conventions used on this page
Every command in this namespace shares the same behavior:
- Output. Human-readable text by default; `--json` / `-j` prints machine-readable JSON to `stdout` (errors go to `stderr`). Pipe JSON into `jq`.
- Async creation. Commands that start generation submit a job and then poll until it reaches a terminal state, printing a spinner. Polling runs on a 10-second interval. `--no-wait` returns the ID immediately and exits `0`; `--timeout <seconds>` caps the wait (default varies by command, noted per group below).
- Exit codes. `0` success, `1` error, `2` auth, `3` timeout.
- Credits. Generation consumes credits. Use the relevant `estimate` command before creating, and check your balance with `listenhub openapi subscription`. Never assume a fixed cost.
Every command accepts `--help` / `-h`. Run `listenhub openapi <group> <command> --help` to see the exact flags for your installed version.
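The exit codes listed above are enough for a script to tell a timeout from an auth failure. The following is a minimal sketch, not part of the CLI: `describe_exit` and `run_and_report` are hypothetical helper names, and any code outside 0 to 3 is simply reported as unknown.

```shell
# Map the documented exit codes (0 success, 1 error, 2 auth, 3 timeout)
# to a short label. Anything else is reported as "unknown".
describe_exit() {
  case "$1" in
    0) echo "success" ;;
    1) echo "error" ;;
    2) echo "auth" ;;
    3) echo "timeout" ;;
    *) echo "unknown" ;;
  esac
}

# Hypothetical wrapper: run any command, log its exit code and label
# to stderr, and pass the original exit code through to the caller.
run_and_report() {
  local code=0
  "$@" || code=$?
  echo "exit $code ($(describe_exit "$code"))" >&2
  return "$code"
}
```

A CI step could then call `run_and_report listenhub openapi subscription --json` and branch on the returned code, for example retrying only when it is `3`.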
config
Manage the stored API key. The CLI reads `LISTENHUB_API_KEY` first, then the file written by `set-key`.
| Command | Description |
|---|---|
| `config set-key` | Prompt for a key and store it at `~/.config/listenhub/openapi.json` (mode `0600`). The key must start with `lh_sk_`. |
| `config show` | Show the configured key (masked) and its source (`env` or `file`). Add `--json`. Exits `1` if nothing is configured. |
| `config clear` | Remove the stored key file. |
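In CI it can help to fail fast when the key is missing or malformed, before any generation command runs. The sketch below checks only the `lh_sk_` prefix noted in the table. Whether the CLI applies the same prefix check to the environment variable is an assumption, and `looks_like_api_key` and `require_api_key` are hypothetical helper names.

```shell
# Return 0 if the value has the lh_sk_ prefix followed by at least one
# character. No length or character set is assumed beyond the prefix.
looks_like_api_key() {
  case "$1" in
    lh_sk_?*) return 0 ;;
    *)        return 1 ;;
  esac
}

# Hypothetical CI guard: return 2 (the documented auth exit code)
# when LISTENHUB_API_KEY is unset or does not look like a key.
require_api_key() {
  if ! looks_like_api_key "${LISTENHUB_API_KEY:-}"; then
    echo "LISTENHUB_API_KEY is missing or does not start with lh_sk_" >&2
    return 2
  fi
}
```

Calling `require_api_key || exit $?` at the top of a job surfaces a bad secret immediately instead of partway through a pipeline.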
listenhub openapi config set-key
listenhub openapi config show --json
speakers
List the voices available to your account. The ID column is the `speakerId` you pass to creation commands.
| Command | Options | Description |
|---|---|---|
| `speakers list` | `--language <lang>`, `--json` | List speakers, optionally filtered by language. |
listenhub openapi speakers list --language en
Prints a table of Name, ID, Gender, and Language.
tts, audio-speech, speech
Three ways to turn text into speech. The first two stream binary audio to a file; the third returns a hosted audio URL.
| Command | Description |
|---|---|
| `tts` | Text-to-speech, streamed to a local file. |
| `audio-speech` | Same as `tts`, on the OpenAI `/v1/audio/speech`-compatible route. |
| `speech` | Create speech and get back a hosted `audioUrl` (plus duration, credits, and subtitles when available). |
`tts` and `audio-speech` share these options:
| Option | Default | Description |
|---|---|---|
| `--text <text>` | required | Text to convert. |
| `--voice <speakerId>` | required | Speaker ID. |
| `--output <file>` | required | Output file path. |
| `--format <format>` | `mp3` | One of `mp3`, `opus`, `aac`, `flac`, `wav`, `pcm`. |
`speech` options:
| Option | Description |
|---|---|
| `--script <content>` | Script text (required). |
| `--speaker-id <id>` | Speaker ID (required). |
| `--json`, `-j` | Output JSON. |
# Stream an MP3 to disk
listenhub openapi tts \
--text "Welcome to ListenHub." \
--voice <speaker-id> \
--output welcome.mp3
# Get a hosted audio URL instead
listenhub openapi speech --script "Welcome to ListenHub." --speaker-id <speaker-id>
flow-speech
Flow Speech turns sources or scripts into a narrated episode. Creation commands poll until `processStatus` is `success`; the default `--timeout` is 300 seconds.
| Command | Description |
|---|---|
| `flow-speech create` | Create an episode from `--source-url` / `--source-text`. |
| `flow-speech get <episodeId>` | Fetch episode details. |
| `flow-speech tts` | Create an episode directly from scripts (no source extraction). |
| `flow-speech text-stream <episodeId>` | Stream generated text over SSE. |
`flow-speech create` options:
| Option | Default | Description |
|---|---|---|
| `--source-url <url>` | — | Source URL. Repeatable. |
| `--source-text <text>` | — | Source text. Repeatable. |
| `--speaker-id <id>` | required | Speaker ID. Repeatable. At least one is required. |
| `--mode <mode>` | `smart` | `smart` or `direct`. |
| `--lang <lang>` | auto | Language code. |
| `--no-wait` | — | Return the episode ID without polling. |
| `--timeout <seconds>` | `300` | Polling timeout. |
| `--json`, `-j` | — | Output JSON. |
At least one `--source-url` or `--source-text` is required.
`flow-speech tts` options:
| Option | Default | Description |
|---|---|---|
| `--script <content>` | required | Script content. Repeatable. At least one required. |
| `--speaker-id <id>` | required | Speaker ID. Repeatable. At least one required. |
| `--title <title>` | — | Episode title. |
| `--no-wait`, `--timeout <seconds>` (300), `--json` | — | As above. |
Scripts and speakers are paired by position: the first `--script` uses the first `--speaker-id`, and so on. If there are more scripts than speakers, the first speaker is reused.
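To make the pairing rule concrete, here is a small sketch that reproduces it in plain shell. It models only the rule as described above, not the CLI's own code, and `speaker_for_script` is a hypothetical helper name.

```shell
# speaker_for_script N SPEAKER...
# Print the speaker paired with the Nth script (1-based): match by
# position, and fall back to the first speaker when there are fewer
# speakers than scripts.
speaker_for_script() {
  local n="$1"
  shift
  local first="$1"
  if [ "$n" -le "$#" ]; then
    shift $((n - 1))
    echo "$1"
  else
    echo "$first"
  fi
}
```

With two speakers, `speaker_for_script 2 host guest` prints `guest`, while `speaker_for_script 3 host guest` falls back to `host`.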
`flow-speech text-stream <episodeId>` requires `--event <event>`, one of `script` or `outline`. It writes the raw SSE stream to `stdout`.
# From a source URL, two voices
listenhub openapi flow-speech create \
--source-url "https://example.com/article" \
--speaker-id <host-id> \
--speaker-id <guest-id>
# Directly from scripts
listenhub openapi flow-speech tts \
--script "Hello and welcome." --speaker-id <host-id> \
--script "Glad to be here." --speaker-id <guest-id> \
--title "Episode 1"
podcast
Generate a podcast episode. You can produce text and audio in one step, or split the two: generate the text content first, review or stream it, then generate audio. Creation commands default to a 300-second `--timeout`.
| Command | Description |
|---|---|
| `podcast create` | Generate a full episode (text + audio) from `--query` and/or sources. |
| `podcast get <episodeId>` | Fetch episode details. |
| `podcast text-content` | Generate the script only, no audio. Polls until `contentStatus` is `text-success`. |
| `podcast generate-audio <episodeId>` | Generate audio for an existing text episode. Polls until `contentStatus` is `audio-success`. |
| `podcast text-stream <episodeId>` | Stream generated text over SSE. |
`podcast create` options:
| Option | Default | Description |
|---|---|---|
| `--query <text>` | — | Topic or prompt for the episode. |
| `--source-url <url>` | — | Source URL. Repeatable. |
| `--source-text <text>` | — | Source text. Repeatable. |
| `--speaker-id <id>` | required | Speaker ID. Repeatable. At least one required. Pass more than one for a multi-voice episode. |
| `--mode <mode>` | — | Generation mode. |
| `--lang <lang>` | auto | Language code. |
| `--no-wait`, `--timeout <seconds>` (300), `--json` | — | Standard async flags. |
`podcast text-content` takes the same source and speaker options (`--query`, `--source-url`, `--source-text`, `--speaker-id`, `--mode`) plus the async flags. At least one of `--query`, `--source-url`, or `--source-text` is required.
`podcast text-stream <episodeId>` requires `--event <event>`, one of `script` or `outline`.
# One-shot: text + audio
listenhub openapi podcast create \
--query "AI agent trends in 2026" \
--speaker-id <host-id> \
--mode quick
# Two-step: text first, then audio
ID=$(listenhub openapi podcast text-content \
--query "Weekly recap" --speaker-id <host-id> --no-wait -j | jq -r '.episodeId')
listenhub openapi podcast text-stream "$ID" --event script
listenhub openapi podcast generate-audio "$ID"
storybook
Storybook produces explainer and slides episodes, optionally with video. Creation polls until `processStatus` is `success`; the default `--timeout` is 300 seconds.
| Command | Description |
|---|---|
| `storybook create` | Create an episode from sources. |
| `storybook get <episodeId>` | Fetch episode details. |
| `storybook generate-video <episodeId>` | Kick off video generation for an episode. |
`storybook create` options:
| Option | Default | Description |
|---|---|---|
| `--source-url <url>` | — | Source URL. Repeatable. |
| `--source-text <text>` | — | Source text. Repeatable. |
| `--speaker-id <id>` | — | Speaker ID. Repeatable. Optional. |
| `--skip-audio` | off | Generate without audio. |
| `--style <style>` | — | Storybook style. |
| `--mode <mode>` | `info` | One of `info`, `story`, `slides`. |
| `--lang <lang>` | auto | Language code. |
| `--no-wait`, `--timeout <seconds>` (300), `--json` | — | Standard async flags. |
listenhub openapi storybook create \
--source-url "https://example.com/explainer" \
--mode slides --skip-audio
image
Generate an AI image from a prompt, optionally conditioned on reference images. `--reference` accepts both local file paths and URLs; local files are read and sent inline, URLs are passed by reference.
| Option | Description |
|---|---|
| `--prompt <text>` | Image description (required). |
| `--provider <provider>` | Provider name (required). |
| `--model <model>` | Model name. |
| `--size <size>` | One of `1K`, `2K`, `4K`. |
| `--ratio <ratio>` | One of `16:9`, `4:3`, `1:1`, `3:4`, `9:16`, `21:9`. |
| `--reference <path-or-url>` | Reference image, local path or URL. Repeatable. |
| `--json`, `-j` | Output JSON. |
listenhub openapi image create \
--provider <provider> \
--prompt "A neon-lit city skyline at dusk" \
--ratio 16:9 --size 2K
video
AI video generation. This group has two surfaces: the generic `video` commands (text/image/reference driven) and the `video pixverse` subcommands (PixVerse capability API). Both poll until `status` is `success` with a default `--timeout` of 1200 seconds, and both take a 24-character hex task ID.
| Command | Description |
|---|---|
| `video create` | Create a generation task. |
| `video get <taskId>` | Fetch task details. |
| `video list` | List tasks. |
| `video estimate` | Estimate credits before creating. |
| `video pixverse generate` | Create a PixVerse task. |
| `video pixverse estimate` | Estimate credits for a PixVerse task. |
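Because both surfaces take a 24-character hex task ID, a script can reject malformed IDs before calling `video get`. This is a sketch only: `is_task_id` is a hypothetical helper, and accepting upper-case hex digits is an assumption, since the page only says "hex".

```shell
# Return 0 if $1 is exactly 24 hexadecimal characters.
# ASSUMPTION: upper-case A-F is accepted alongside lower-case a-f.
is_task_id() {
  case "$1" in
    *[!0-9a-fA-F]*) return 1 ;;   # contains a non-hex character
  esac
  [ "${#1}" -eq 24 ]
}
```

For example, `is_task_id "$ID" || exit 1` guards a later `listenhub openapi video get "$ID"` against an empty or truncated variable.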
video create
The prompt is required; everything else selects an input mode. Frame mode (`--first-frame` / `--last-frame`) and reference mode (`--reference-image` / `--reference-video` / `--reference-audio`) are mutually exclusive.
| Option | Default | Description |
|---|---|---|
| `--prompt <text>` | required | Video description / prompt. |
| `--first-frame <url>` | — | First-frame image URL. |
| `--last-frame <url>` | — | Last-frame image URL. Requires `--first-frame`. |
| `--reference-image <url>` | — | Reference image URL. Repeatable, max 9. |
| `--reference-video <url>` | — | Reference video URL. Repeatable, max 3. Requires `--input-video-duration`. |
| `--reference-audio <url>` | — | Reference audio URL. Repeatable, max 3. Requires `--reference-image` or `--reference-video`. |
| `--input-video-duration <seconds>` | — | Input video duration, 2–15. Required with `--reference-video`. |
| `--model <model>` | — | Model name, e.g. `doubao-seedance-2-pro`. |
| `--resolution <res>` | — | One of `480p`, `720p`, `1080p`. |
| `--ratio <ratio>` | — | One of `16:9`, `4:3`, `1:1`, `3:4`, `9:16`, `21:9`. |
| `--duration <seconds>` | — | Output duration, 4–15. |
| `--no-generate-audio` | audio on | Disable audio generation. |
| `--seed <number>` | — | Random seed, -1 to 4294967295. |
| `--no-wait`, `--timeout <seconds>` (1200), `--json` | — | Standard async flags. |
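The flag dependencies in the table above can be checked locally before a job is submitted. The sketch below encodes only the rules stated on this page (prompt required, frame and reference modes exclusive, and the three "requires" relationships). `check_video_flags` is a hypothetical helper, and it looks at flag names only, so a flag value that happens to equal a flag name would confuse it.

```shell
# check_video_flags ARGS...
# Return 0 if the argument list satisfies the documented rules for
# `video create`, otherwise print the first violation and return 1.
check_video_flags() {
  local prompt=0 first=0 last=0 img=0 vid=0 aud=0 dur=0 arg
  for arg in "$@"; do
    case "$arg" in
      --prompt)               prompt=1 ;;
      --first-frame)          first=1 ;;
      --last-frame)           last=1 ;;
      --reference-image)      img=1 ;;
      --reference-video)      vid=1 ;;
      --reference-audio)      aud=1 ;;
      --input-video-duration) dur=1 ;;
    esac
  done
  if [ "$prompt" -eq 0 ]; then
    echo "--prompt is required" >&2; return 1
  fi
  if [ $((first + last)) -gt 0 ] && [ $((img + vid + aud)) -gt 0 ]; then
    echo "frame mode and reference mode are mutually exclusive" >&2; return 1
  fi
  if [ "$last" -eq 1 ] && [ "$first" -eq 0 ]; then
    echo "--last-frame requires --first-frame" >&2; return 1
  fi
  if [ "$vid" -eq 1 ] && [ "$dur" -eq 0 ]; then
    echo "--reference-video requires --input-video-duration" >&2; return 1
  fi
  if [ "$aud" -eq 1 ] && [ $((img + vid)) -eq 0 ]; then
    echo "--reference-audio requires --reference-image or --reference-video" >&2; return 1
  fi
  return 0
}
```

The CLI presumably performs its own validation as well; a local check like this mainly saves a round trip in scripts that assemble the argument list dynamically.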
# Text to video
listenhub openapi video create \
--prompt "A timelapse of clouds over a mountain range" \
--model doubao-seedance-2-pro \
--resolution 1080p --duration 8
# First/last frame interpolation
listenhub openapi video create \
--prompt "Smooth morph between the two frames" \
--first-frame "https://example.com/a.jpg" \
--last-frame "https://example.com/b.jpg"
video list and estimate
`video list` options: `--page <n>` (default 1), `--page-size <n>` (default 20), `--status <status>` (one of `pending`, `generating`, `uploading`, `success`, `failed`), `--json`.
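A script that creates a task with `--no-wait` may want to poll for completion on its own schedule. The sketch below assumes that `video get <taskId> --json` returns an object with a top-level `status` field carrying the same values as the `--status` filter above; that JSON shape is an assumption, so check the output of your installed version first. `wait_for_video` is a hypothetical helper and requires `jq`.

```shell
# wait_for_video TASK_ID [TIMEOUT_SECONDS] [INTERVAL_SECONDS]
# Returns 0 on success, 1 on failed, 3 if the deadline passes first
# (chosen to mirror the CLI's documented exit codes).
wait_for_video() {
  local id="$1" timeout="${2:-1200}" interval="${3:-10}"
  local waited=0 status
  while :; do
    # ASSUMPTION: the JSON output has a top-level "status" field.
    status=$(listenhub openapi video get "$id" --json | jq -r '.status')
    case "$status" in
      success) return 0 ;;
      failed)  return 1 ;;
    esac
    # Any other status (pending, generating, uploading) keeps waiting.
    [ "$waited" -ge "$timeout" ] && return 3
    sleep "$interval"
    waited=$((waited + interval))
  done
}
```

The defaults of 1200 and 10 seconds follow the timeout and polling interval this page documents for the video group.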
`video estimate` requires `--model`, `--resolution`, and `--duration`, and accepts `--ratio`, `--has-video-input`, and `--input-video-duration` (required when `--has-video-input` is set):
listenhub openapi video estimate \
--model doubao-seedance-2-pro --resolution 1080p --duration 8
video pixverse
PixVerse exposes atomic generation capabilities plus a marketing agent. Pick one with `--capability`:
| Capability | What it does |
|---|---|
| `text_to_video` | Generate from a text prompt. |
| `image_to_video` | Animate a still image. |
| `transition` | Transition between two assets. |
| `multi_transition` | Transition across multiple assets. |
| `fusion` | Fuse multiple inputs into one clip. |
| `restyle` | Restyle an existing PixVerse video. |
| `mimic` | Mimic a reference motion/style. |
| `lip_sync` | Drive lip motion from audio or TTS. |
| `agent` | Marketing agent (`ad_master`, `promo_mix`). |
Shared enums:
- Model (`--model`): `pixverse`, `v6`, `v5`, `v4.5` (default `pixverse`).
- Language / region (`--language`): `zh`, `en` (default `en`).
- Quality (`--quality`): `360p`, `540p`, `720p`, `1080p` (default `720p`).
- Aspect ratio (`--aspect-ratio`): `9:16`, `16:9`, `1:1`, `4:3`, `3:4` (default `16:9`).
- Agent type (`--agent-type`, with `--capability agent`): `ad_master`, `promo_mix`.
`video pixverse generate` options:
| Option | Default | Description |
|---|---|---|
| `--capability <capability>` | required | One of the capabilities above. |
| `--model <model>` | `pixverse` | Model. |
| `--language <lang>` | `en` | Service region. |
| `--prompt <text>` | — | Prompt, max 2048 chars. |
| `--quality <quality>` | `720p` | Output quality. |
| `--aspect-ratio <ratio>` | `16:9` | Aspect ratio. |
| `--duration <seconds>` | `5` | Integer 1–60. |
| `--source-task-id <id>` | — | Reuse a prior succeeded PixVerse task (for `restyle` / `lip_sync`). |
| `--image <url[:duration]>` | — | Image asset, optional `:duration` suffix. Repeatable, max 10. |
| `--video <url[:duration]>` | — | Video asset, optional `:duration` suffix. Repeatable, max 2. |
| `--audio <url[:duration]>` | — | Audio asset, optional `:duration` suffix. Repeatable, max 1. |
| `--agent-type <type>` | — | `ad_master` or `promo_mix` (with `--capability agent`). |
| `--source-video-id <id>` | — | PixVerse source video id (`restyle`). |
| `--restyle-id <id>` | — | PixVerse restyle id (`restyle`). |
| `--lip-sync-tts` | off | Enable lip-sync TTS (`--capability lip_sync`). |
| `--lip-sync-speaker-id <id>` | — | Lip-sync TTS speaker id. |
| `--lip-sync-content <text>` | — | Lip-sync TTS content. |
| `--pixverse-json <json>` | — | Escape hatch: raw JSON for the nested `pixverse` object. Merged with flag-derived fields; flags win. |
| `--no-wait`, `--timeout <seconds>` (1200), `--json` | — | Standard async flags. |
Asset flags accept an optional trailing `:duration` in seconds, for example `https://example.com/clip.mp4:5`. Only a trailing `:<integer>` is treated as a duration, so URLs with their own colons are safe.
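The suffix rule can be illustrated with a few lines of shell. This sketch follows the rule as described above (only a trailing `:<integer>` counts as a duration) and is not the CLI's own parser; `split_asset` is a hypothetical helper name.

```shell
# split_asset ASSET
# Print "URL DURATION" when ASSET ends in :<integer>, otherwise print
# ASSET unchanged.
split_asset() {
  local asset="$1" suffix
  suffix="${asset##*:}"            # text after the last colon
  case "$suffix" in
    ''|*[!0-9]*)
      echo "$asset" ;;             # empty or non-numeric: no duration
    *)
      if [ "$suffix" = "$asset" ]; then
        echo "$asset"              # no colon at all
      else
        echo "${asset%:*} $suffix"
      fi ;;
  esac
}
```

Under this reading the scheme colon in `https://` is never mistaken for a duration, but a bare host with a numeric port and no path (for example `https://example.com:8080`) would be, so include a path in asset URLs when scripting against this rule.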
# Lip-sync from TTS
listenhub openapi video pixverse generate \
--capability lip_sync \
--video "https://example.com/face.mp4" \
--lip-sync-tts \
--lip-sync-speaker-id <speaker-id> \
--lip-sync-content "Hi, here's our product update."
# Marketing agent
listenhub openapi video pixverse generate \
--capability agent \
--agent-type ad_master \
--prompt "30-second ad for a noise-cancelling headset" \
--image "https://example.com/product.jpg"
`video pixverse estimate` takes `--capability` (required), plus `--model`, `--language`, `--quality`, `--duration`, and `--agent-type`:
listenhub openapi video pixverse estimate \
--capability text_to_video --quality 1080p --duration 5
music
AI music generation, backed by Mureka. Async commands (`generate`, `remix`, `instrumental`, `soundtrack`, `track`) poll until `status` is `success`; the default `--timeout` is 600 seconds. The analysis commands (`recognize`, `describe`, `stem`) run synchronously.
| Command | Description |
|---|---|
| `music generate` | Generate from a prompt and/or lyrics. |
| `music remix [audio]` | Remix an existing song with new lyrics. |
| `music instrumental` | Generate a standalone instrumental. |
| `music soundtrack` | Generate music from an image or video. |
| `music track [audio]` | Generate a single instrument or vocal track. |
| `music recognize` | Recognize lyrics with timestamps from audio. |
| `music describe` | Analyze audio: description, tags, genres, instruments. |
| `music stem` | Separate audio into stems, returns download URLs. |
| `music list` | List music tasks. |
| `music get <taskId>` | Fetch task details. |
Model values across the generation commands: `auto`, `mureka-7.6`, `mureka-8`, `mureka-9`, `mureka-o2`.
`music generate` options:
| Option | Description |
|---|---|
| `--prompt <text>` | Music description. At least one of `--prompt` or `--lyrics` is required. |
| `--lyrics <text>` | Song lyrics. |
| `--style <text>` | Music style / mood. |
| `--title <text>` | Track title. |
| `--model <model>` | One of the model values above. |
| `--instrumental` | Instrumental only, no vocals. |
| `--vocal-id <id>` | Reusable vocal id. |
| `--no-wait`, `--timeout <seconds>` (600), `--json` | Standard async flags. |
`music remix [audio]` takes the audio as a positional file, `--audio-url <url>`, or `--provider-song-id <id>` (exactly one). Requires `--lyrics` and `--prompt`.
`music instrumental` requires exactly one of `--prompt` or `--reference-audio <file>` (mp3/m4a, max 10MB); accepts `--model`.
`music soundtrack` requires exactly one of `--image <file>` or `--video <file>`; accepts `--prompt` and `--model`.
`music track [audio]` takes the audio as a positional file or `--provider-song-id <id>` (exactly one). Requires `--generate-type` (`Vocals`, `Instrumental`, `Drums`, `Bass`, `Guitar`, …) and `--prompt`; `--lyrics` is required when `--generate-type` is `Vocals`. Optional: `--vocal-gender <male|female>`, `--generate-start <seconds>`, `--generate-end <seconds>`.
`music recognize`, `music describe`, and `music stem` each require `--audio <file>` (mp3/m4a, max 10MB). `music stem` also accepts `--model` (`audio-separation-1` or `audio-separation-2`).
`music list` options: `--page <n>` (default 1), `--page-size <n>` (default 20), `--status <status>` (`pending`, `generating`, `uploading`, `success`, `failed`), `--json`.
# Generate from a prompt
listenhub openapi music generate \
--prompt "Upbeat synthwave with a driving bassline" --title "Night Drive"
# Separate an mp3 into stems
listenhub openapi music stem --audio track.mp3
content
Extract readable content from a URL, optionally summarized. Async; polls until `status` is `completed` with a default `--timeout` of 300 seconds.
| Command | Description |
|---|---|
| `content extract` | Extract content from a URL. |
| `content get <taskId>` | Fetch the extraction result. |
`content extract` options:
| Option | Default | Description |
|---|---|---|
| `--url <url>` | required | URL to extract from. |
| `--summarize` | off | Summarize the extracted content. |
| `--max-length <n>` | — | Maximum content length. |
| `--no-wait`, `--timeout <seconds>` (300), `--json` | — | Standard async flags. |
listenhub openapi content extract --url "https://example.com/article" --summarize
subscription
Show your subscription plan and credit balance: total available credits, the monthly allotment used/total, permanent credits, plan name, and expiry.
| Command | Options | Description |
|---|---|---|
| `subscription` | `--json` | Show subscription and credits info. |
listenhub openapi subscription --json
Next steps
- CLI overview. The two auth modes, command namespaces, and global flags at a glance.
- Authentication. OAuth login vs. API key, where credentials live, and switching between them.
- OAuth commands. Every bare `listenhub` command for interactive use under your signed-in account.
- JavaScript SDK. The library the CLI wraps; call ListenHub from code with the `OpenAPIClient`.