OAuth Commands

Full reference for the listenhub OAuth commands — auth, podcast, tts, explainer, slides, music, image, video, speakers, lyrics, and creation.

This page documents the OAuth command set: the listenhub <domain> ... commands that act as your signed-in user account. They require a browser login (listenhub auth login) and read tokens from ~/.config/listenhub/credentials.json. For the API-key command set used in scripts and CI, see OpenAPI commands.

Conventions

These behaviors apply across the commands below.

  • Global flags. Every command accepts --json / -j (machine-readable output on stdout, errors on stderr) and --help / -h. Creation commands also accept --no-wait (return the ID immediately without polling) and --timeout <seconds> (cap how long polling waits; the default varies per command).
  • Polling. Creation commands submit a job, then poll status every 10 seconds until it reaches a terminal state. On timeout the command exits with code 3; the job keeps running server-side and you can fetch it later by ID.
  • Language auto-detection. Where a command has --lang and you omit it, the CLI infers the language from your input text: Kana → ja, other CJK characters → zh, otherwise en.
  • Speaker resolution. --speaker <name> is resolved to an inner ID by listing speakers for the detected language; --speaker-id <id> is passed through directly. If you supply neither, the CLI picks a default voice for the detected language.
  • File vs. URL auto-detect. Flags that accept <path-or-url> auto-detect their input: an http(s) URL is passed through unchanged, while a local path is validated (extension and size) and uploaded to cloud storage before the API call. Supported uploads — audio: .mp3, .wav, .flac, .m4a, .ogg, .aac (max 20 MB); image: .jpg, .jpeg, .png, .webp, .gif (max 10 MB); video: .mp4, .mov (max 50 MB). Some Mureka music subcommands accept local files only (no URL) with their own limits, noted per command.
  • Exit codes. 0 success, 1 error, 2 authentication required or invalid, 3 timeout.
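A wrapper script can mirror the language rule and the exit codes above. The Python sketch below is an illustration only: detect_lang and describe_exit are hypothetical helper names, and the Unicode ranges used for Kana and CJK are assumptions, because the CLI's exact ranges are not stated on this page.

```python
# Illustrative only: the CLI's real character ranges are not documented here.
EXIT_CODES = {
    0: "success",
    1: "error",
    2: "authentication required or invalid",
    3: "timeout",
}


def _is_kana(ch: str) -> bool:
    # Assumed range: Hiragana (U+3040-U+309F) plus Katakana (U+30A0-U+30FF).
    return "\u3040" <= ch <= "\u30ff"


def _is_cjk(ch: str) -> bool:
    # Assumed range: CJK Unified Ideographs (U+4E00-U+9FFF).
    return "\u4e00" <= ch <= "\u9fff"


def detect_lang(text: str) -> str:
    """Apply the documented order: Kana -> ja, other CJK -> zh, otherwise en."""
    if any(_is_kana(ch) for ch in text):
        return "ja"
    if any(_is_cjk(ch) for ch in text):
        return "zh"
    return "en"


def describe_exit(code: int) -> str:
    """Translate a listenhub exit code into the meaning listed above."""
    return EXIT_CODES.get(code, "unknown exit code")
```

Under this sketch, any Kana character makes the text Japanese even when it also contains Han characters, which matches the order in which the rule is stated.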

Generation consumes credits. The OAuth command set has no per-command credit estimator except video estimate. Before generating, check your remaining balance with listenhub openapi subscription, and use listenhub openapi video estimate (or listenhub video estimate) for video cost. Never assume a fixed cost — query it.
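One way to follow this advice in a script is to query the estimate before submitting a job. The Python sketch below only assembles the documented video estimate flags; the helper names are hypothetical, and because the shape of the --json output is not specified on this page, the output is returned unparsed.

```python
import subprocess
from typing import List


def video_estimate_args(model: str, resolution: str, duration: int) -> List[str]:
    """Build the argv for `listenhub video estimate` from its required flags."""
    return [
        "listenhub", "video", "estimate",
        "--model", model,
        "--resolution", resolution,
        "--duration", str(duration),
        "--json",
    ]


def run_estimate(model: str, resolution: str, duration: int) -> str:
    """Run the estimate and return raw stdout (JSON of unspecified shape)."""
    result = subprocess.run(
        video_estimate_args(model, resolution, duration),
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        # With --json, errors are written to stderr.
        raise RuntimeError(f"estimate failed ({result.returncode}): {result.stderr.strip()}")
    return result.stdout
```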

auth

Manage your login session.

listenhub auth login
listenhub auth logout
listenhub auth status [-j]

Command | Description
login | Open the browser to complete OAuth. On success, writes tokens to ~/.config/listenhub/credentials.json (mode 0600). Tokens refresh automatically.
logout | Revoke tokens and remove the stored credentials.
status | Show the current login state. Accepts -j.

listenhub auth login
listenhub auth status
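A script can check that the credentials file is present with the documented permissions before calling other commands. The sketch below is a local sanity check only, on the assumption of a POSIX file system; credentials_look_ok is a hypothetical helper, and it does not verify that the stored tokens are still valid (listenhub auth status does that).

```python
import stat
from pathlib import Path

# Path and mode as documented for `listenhub auth login`.
CREDENTIALS_PATH = Path.home() / ".config" / "listenhub" / "credentials.json"


def credentials_look_ok(path: Path = CREDENTIALS_PATH) -> bool:
    """Return True if the file exists and its permission bits are exactly 0600."""
    try:
        mode = stat.S_IMODE(path.stat().st_mode)
    except FileNotFoundError:
        return False
    return mode == 0o600
```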

podcast

Generate a podcast episode from a topic and/or reference sources.

listenhub podcast create [options]
listenhub podcast list [options]

podcast create

Flag | Values | Default | Meaning
--query <text> | string | - | Topic or prompt for the episode.
--source-url <url> | URL (repeatable) | [] | Reference URL to ground the episode. Repeat for multiple.
--source-text <text> | string (repeatable) | [] | Reference text to ground the episode. Repeat for multiple.
--mode <mode> | quick, deep, debate | quick | Generation mode.
--lang <lang> | en, zh, ja | auto | Output language. Auto-detected from --query if omitted.
--speaker <name> | string (repeatable) | - | Speaker by name. One speaker → solo; two or more → multi-voice.
--speaker-id <id> | string (repeatable) | - | Speaker by inner ID. Use instead of --speaker.
--no-wait | flag | poll | Return the episode ID immediately without polling.
--timeout <seconds> | number | 300 | Polling timeout.
--json, -j | flag | false | JSON output.

listenhub podcast create --query "AI agent trends in 2026" --mode quick
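Repeatable flags such as --source-url and --speaker are passed once per value. The Python sketch below shows one way to assemble such a command line from lists; podcast_create_args is a hypothetical helper that only uses flags from the table above.

```python
from typing import List, Sequence

PODCAST_MODES = ("quick", "deep", "debate")


def podcast_create_args(
    query: str,
    mode: str = "quick",
    source_urls: Sequence[str] = (),
    speakers: Sequence[str] = (),
    no_wait: bool = False,
) -> List[str]:
    """Build the argv for `listenhub podcast create`."""
    if mode not in PODCAST_MODES:
        raise ValueError(f"--mode must be one of {', '.join(PODCAST_MODES)}")
    args = ["listenhub", "podcast", "create", "--query", query, "--mode", mode]
    for url in source_urls:
        args += ["--source-url", url]  # one flag per reference URL
    for name in speakers:
        args += ["--speaker", name]  # one name: solo; two or more: multi-voice
    if no_wait:
        args.append("--no-wait")
    return args
```

The resulting list can be handed to subprocess.run without shell quoting concerns.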

podcast list

Flag | Values | Default | Meaning
--page <n> | number | 1 | Page number.
--page-size <n> | number | 20 | Items per page.
--json, -j | flag | false | JSON output.

listenhub podcast list --page 1 --page-size 20

tts

Convert text to speech in one voice.

listenhub tts create [options]
listenhub tts list [options]

tts create

Flag | Values | Default | Meaning
--text <text> | string | - | Text to convert to speech.
--source-url <url> | URL (repeatable) | [] | Reference URL. Repeat for multiple.
--source-text <text> | string (repeatable) | [] | Reference text. Repeat for multiple.
--mode <mode> | smart, direct | smart | smart rewrites the input for speech; direct reads it as-is.
--lang <lang> | en, zh, ja | auto | Output language. Auto-detected from --text if omitted.
--speaker <name> | string | - | Speaker by name.
--speaker-id <id> | string | - | Speaker by inner ID.
--no-wait | flag | poll | Return the ID immediately without polling.
--timeout <seconds> | number | 300 | Polling timeout.
--json, -j | flag | false | JSON output.

listenhub tts create --text "Hello, world" --lang en

tts list

Flag | Values | Default | Meaning
--page <n> | number | 1 | Page number.
--page-size <n> | number | 20 | Items per page.
--json, -j | flag | false | JSON output.

listenhub tts list

explainer

Generate an explainer video (narrated visual segments).

listenhub explainer create [options]
listenhub explainer list [options]

explainer create

Audio narration is on by default; pass --skip-audio to produce a silent video.

Flag | Values | Default | Meaning
--query <text> | string | - | Topic or prompt.
--source-url <url> | URL (repeatable) | [] | Reference URL. Repeat for multiple.
--source-text <text> | string (repeatable) | [] | Reference text. Repeat for multiple.
--mode <mode> | info, story | info | Generation mode.
--lang <lang> | en, zh, ja | auto | Output language. Auto-detected from --query if omitted.
--speaker <name> | string | - | Speaker by name.
--speaker-id <id> | string | - | Speaker by inner ID.
--skip-audio | flag | false | Skip audio narration (silent video).
--image-size <size> | 2K, 4K | 2K | Rendered image resolution.
--aspect-ratio <ratio> | 16:9, 9:16, 1:1 | 16:9 | Frame aspect ratio.
--style <style> | string | - | Visual style hint.
--no-wait | flag | poll | Return the ID immediately without polling.
--timeout <seconds> | number | 300 | Polling timeout.
--json, -j | flag | false | JSON output.

listenhub explainer create --query "How vaccines work" --mode info --aspect-ratio 16:9

explainer list

Flag | Values | Default | Meaning
--page <n> | number | 1 | Page number.
--page-size <n> | number | 20 | Items per page.
--json, -j | flag | false | JSON output.

listenhub explainer list

slides

Generate a slide deck from a topic and/or sources.

listenhub slides create [options]
listenhub slides list [options]

slides create

Slides are silent by default. Pass --no-skip-audio to add voice narration.

Flag | Values | Default | Meaning
--query <text> | string | - | Topic or prompt.
--source-url <url> | URL (repeatable) | [] | Reference URL. Repeat for multiple.
--source-text <text> | string (repeatable) | [] | Reference text. Repeat for multiple.
--lang <lang> | en, zh, ja | auto | Output language. Auto-detected from --query if omitted.
--speaker <name> | string | - | Speaker by name (used when narration is enabled).
--speaker-id <id> | string | - | Speaker by inner ID.
--no-skip-audio | flag | silent | Generate voice narration (off by default).
--image-size <size> | 2K, 4K | 2K | Rendered image resolution.
--aspect-ratio <ratio> | 16:9, 9:16, 1:1 | 16:9 | Slide aspect ratio.
--style <style> | string | - | Visual style hint.
--no-wait | flag | poll | Return the ID immediately without polling.
--timeout <seconds> | number | 300 | Polling timeout.
--json, -j | flag | false | JSON output.

listenhub slides create --query "Q3 product roadmap" --no-skip-audio
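explainer create and slides create have opposite audio defaults: explainers are narrated unless --skip-audio is passed, and slides are silent unless --no-skip-audio is passed. A wrapper that exposes a single "narrated" switch could translate it as in the Python sketch below; narration_flags is a hypothetical helper, not part of the CLI.

```python
from typing import List


def narration_flags(command: str, narrated: bool) -> List[str]:
    """Return the flag (if any) that yields the requested audio behavior."""
    if command == "explainer":
        # Narration is on by default; a flag is needed only to turn it off.
        return [] if narrated else ["--skip-audio"]
    if command == "slides":
        # Narration is off by default; a flag is needed only to turn it on.
        return ["--no-skip-audio"] if narrated else []
    raise ValueError(f"no audio toggle documented for: {command}")
```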

slides list

Flag | Values | Default | Meaning
--page <n> | number | 1 | Page number.
--page-size <n> | number | 20 | Items per page.
--json, -j | flag | false | JSON output.

listenhub slides list

music

Generate, transform, and analyze music. Generation subcommands poll (default timeout 600 seconds); the analysis subcommands (recognize, describe, stem) are synchronous and print immediately. Subcommands marked Mureka accept only local files for their file inputs and do not auto-detect URLs; music remix takes a URL through its separate --audio-url flag.

listenhub music generate [options]
listenhub music cover --audio <path-or-url> [options]
listenhub music extend --audio <path-or-url> --model <v> --continue-at <s> [options]
listenhub music remix [audio] [options]
listenhub music instrumental [options]
listenhub music soundtrack [options]
listenhub music track [audio] [options]
listenhub music recognize --audio <path> [-j]
listenhub music describe --audio <path> [-j]
listenhub music stem --audio <path> [options]
listenhub music list [options]
listenhub music get <taskId> [-j]

music generate

Generate music from a text prompt.

Flag | Values | Default | Meaning
--prompt <text> | string (required) | - | Music description.
--style <text> | string | - | Style or mood.
--title <text> | string | - | Track title.
--instrumental | flag | false | Instrumental only, no vocals.
--no-wait | flag | poll | Return the ID immediately without polling.
--timeout <seconds> | number | 600 | Polling timeout.
--json, -j | flag | false | JSON output.

listenhub music generate --prompt "Upbeat electronic dance" --style "EDM" --title "Night Drive"

music cover

Create a cover from reference audio. --audio accepts a local file or URL.

Flag | Values | Default | Meaning
--audio <path-or-url> | path or URL (required) | - | Reference audio.
--prompt <text> | string | - | Music description.
--style <text> | string | - | Style or mood.
--title <text> | string | - | Track title.
--instrumental | flag | false | Instrumental only, no vocals.
--no-wait | flag | poll | Return the ID immediately without polling.
--timeout <seconds> | number | 600 | Polling timeout.
--json, -j | flag | false | JSON output.

listenhub music cover --audio ./original.mp3 --title "My Remix"

music extend

Extend music from a reference, continuing from a chosen time point. --audio accepts a local file or URL.

Flag | Values | Default | Meaning
--audio <path-or-url> | path or URL (required) | - | Reference audio.
--model <version> | V4, V4_5, V4_5PLUS, V4_5ALL, V5, V5_5 (required) | - | Model version.
--continue-at <seconds> | number (required) | - | Time point to start extending from.
--prompt <text> | string | - | Lyrics or description.
--style <text> | string | - | Style or mood.
--title <text> | string | - | Track title.
--instrumental | flag | false | Instrumental only, no vocals.
--negative-tags <text> | string | - | Styles to exclude.
--vocal-gender <gender> | m, f | - | Vocal gender.
--style-weight <weight> | number, 0 to 1 | - | Style guidance weight.
--weirdness <weight> | number, 0 to 1 | - | Creativity/weirdness constraint.
--audio-weight <weight> | number, 0 to 1 | - | Input audio influence weight.
--no-wait | flag | poll | Return the ID immediately without polling.
--timeout <seconds> | number | 600 | Polling timeout.
--json, -j | flag | false | JSON output.

listenhub music extend --audio ./song.mp3 --model V5 --continue-at 30

music remix

Remix an existing song with new lyrics (Mureka). Provide exactly one source: a positional [audio] local file (.mp3/.m4a, max 10 MB), --audio-url, or --provider-song-id.

Flag | Values | Default | Meaning
[audio] | local file | - | Reference audio file (positional).
--audio-url <url> | URL | - | Reference audio URL instead of a file.
--provider-song-id <id> | string | - | Mureka song id instead of a file.
--lyrics <text> | string (required) | - | Lyrics for the remixed song.
--prompt <text> | string (required) | - | Music description.
--no-wait | flag | poll | Return the ID immediately without polling.
--timeout <seconds> | number | 600 | Polling timeout.
--json, -j | flag | false | JSON output.

listenhub music remix ./original.mp3 --lyrics "New verse..." --prompt "Lo-fi hip hop"
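The "exactly one source" rule for music remix can be checked before the command is run. The Python sketch below is a hypothetical pre-flight helper (remix_source_args and remix_args are not CLI features); it checks only the source count and the documented file extensions, not the 10 MB size limit.

```python
from pathlib import Path
from typing import List, Optional

REMIX_EXTENSIONS = {".mp3", ".m4a"}


def remix_source_args(
    audio: Optional[str] = None,
    audio_url: Optional[str] = None,
    provider_song_id: Optional[str] = None,
) -> List[str]:
    """Return the argv fragment for the single remix source that was given."""
    given = [v for v in (audio, audio_url, provider_song_id) if v is not None]
    if len(given) != 1:
        raise ValueError("provide exactly one of [audio], --audio-url, --provider-song-id")
    if audio is not None:
        if Path(audio).suffix.lower() not in REMIX_EXTENSIONS:
            raise ValueError("local remix audio must be .mp3 or .m4a")
        return [audio]  # positional argument
    if audio_url is not None:
        return ["--audio-url", audio_url]
    return ["--provider-song-id", str(provider_song_id)]


def remix_args(lyrics: str, prompt: str, **source: Optional[str]) -> List[str]:
    """Build the full argv; --lyrics and --prompt are both required."""
    return (
        ["listenhub", "music", "remix"]
        + remix_source_args(**source)
        + ["--lyrics", lyrics, "--prompt", prompt]
    )
```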

music instrumental

Generate a standalone instrumental (Mureka). Provide exactly one of --prompt or --reference-audio.

Flag | Values | Default | Meaning
--prompt <text> | string | - | Music description.
--reference-audio <path> | local file | - | Reference audio (.mp3/.m4a, max 10 MB).
--model <version> | auto, mureka-7.6, mureka-8, mureka-o2 | - | Model version.
--no-wait | flag | poll | Return the ID immediately without polling.
--timeout <seconds> | number | 600 | Polling timeout.
--json, -j | flag | false | JSON output.

listenhub music instrumental --prompt "Cinematic orchestral build-up" --model mureka-8

music soundtrack

Generate music from an image or video (Mureka). Provide exactly one of --image or --video.

Flag | Values | Default | Meaning
--image <path> | local file | - | Source image (.jpg/.jpeg/.png/.webp, max 10 MB).
--video <path> | local file | - | Source video (.mp4/.mov/.avi/.mkv/.webm, max 10 MB).
--prompt <text> | string | - | Music description.
--model <version> | auto, mureka-7.6, mureka-8, mureka-9, mureka-o2 | - | Model version.
--no-wait | flag | poll | Return the ID immediately without polling.
--timeout <seconds> | number | 600 | Polling timeout.
--json, -j | flag | false | JSON output.

listenhub music soundtrack --image ./cover.png --prompt "Dreamy synthwave"

music track

Generate a single instrument or vocal track (Mureka). Provide exactly one source: a positional [audio] local file (.mp3/.m4a/.wav, max 10 MB) or --provider-song-id. --lyrics is required when --generate-type is Vocals.

Flag | Values | Default | Meaning
[audio] | local file | - | Reference audio file (positional).
--provider-song-id <id> | string | - | Mureka song id instead of a file.
--generate-type <type> | Vocals, Instrumental, Drums, Bass, Guitar, Keyboard, Percussion, Strings, Synth, FX, Brass, Woodwinds (required) | - | Track type to generate.
--prompt <text> | string (required) | - | Music description.
--lyrics <text> | string | - | Lyrics. Required when --generate-type is Vocals.
--vocal-gender <gender> | male, female | - | Vocal gender.
--generate-start <seconds> | number | - | Range start in seconds.
--generate-end <seconds> | number | - | Range end in seconds.
--no-wait | flag | poll | Return the ID immediately without polling.
--timeout <seconds> | number | 600 | Polling timeout.
--json, -j | flag | false | JSON output.

listenhub music track ./song.mp3 --generate-type Drums --prompt "Punchy breakbeat"
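The conditional requirement on --lyrics is easy to miss when the track type comes from user input. The Python sketch below collects the documented rules for --generate-type, --prompt, and --lyrics into one check; track_errors is a hypothetical helper, and the CLI may report these conditions differently.

```python
from typing import List, Optional

TRACK_TYPES = (
    "Vocals", "Instrumental", "Drums", "Bass", "Guitar", "Keyboard",
    "Percussion", "Strings", "Synth", "FX", "Brass", "Woodwinds",
)


def track_errors(generate_type: str, prompt: str, lyrics: Optional[str] = None) -> List[str]:
    """Return a list of problems; an empty list means the options are consistent."""
    errors = []
    if generate_type not in TRACK_TYPES:
        errors.append(f"unknown --generate-type: {generate_type}")
    if not prompt:
        errors.append("--prompt is required")
    if generate_type == "Vocals" and not lyrics:
        errors.append("--lyrics is required when --generate-type is Vocals")
    return errors
```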

music recognize

Recognize lyrics with timestamps from audio (Mureka). Synchronous; prints immediately.

Flag | Values | Default | Meaning
--audio <path> | local file (required) | - | Audio file (.mp3/.m4a, max 10 MB).
--json, -j | flag | false | JSON output.

listenhub music recognize --audio ./song.mp3

music describe

Analyze audio — description, tags, genres, instruments (Mureka). Synchronous.

Flag | Values | Default | Meaning
--audio <path> | local file (required) | - | Audio file (.mp3/.m4a, max 10 MB).
--json, -j | flag | false | JSON output.

listenhub music describe --audio ./song.mp3

music stem

Separate audio into stems and return download URLs (Mureka). Synchronous.

Flag | Values | Default | Meaning
--audio <path> | local file (required) | - | Audio file (.mp3/.m4a, max 10 MB).
--model <model> | audio-separation-1, audio-separation-2 | - | Separation model.
--json, -j | flag | false | JSON output.

listenhub music stem --audio ./song.mp3 --model audio-separation-2

music list

Flag | Values | Default | Meaning
--page <n> | number | 1 | Page number.
--page-size <n> | number | 20 | Items per page.
--status <status> | pending, generating, uploading, success, failed | - | Filter by status.
--json, -j | flag | false | JSON output.

listenhub music list --status success

music get

Flag | Values | Default | Meaning
<taskId> | string (required) | - | Music task ID (positional).
--json, -j | flag | false | JSON output.

listenhub music get <task-id>
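When a creation command times out (exit code 3) or is started with --no-wait, the task can be followed up by ID. The Python sketch below shows a polling loop in the style described under Conventions. It is an illustration under assumptions: fetch_status is a caller-supplied function (for example, one that runs listenhub music get <taskId> -j and extracts the status, whose JSON field name is not documented here), and treating success and failed as the terminal states is inferred from the --status filter values.

```python
import time
from typing import Callable

# Assumption: of the documented status values, these two are terminal.
TERMINAL_STATES = {"success", "failed"}


class PollTimeout(Exception):
    """Raised when the task has not finished within the timeout."""


def wait_for_task(
    fetch_status: Callable[[], str],
    timeout: float = 600,
    interval: float = 10,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll until a terminal state is reached or the timeout would be exceeded."""
    deadline = clock() + timeout
    while True:
        status = fetch_status()
        if status in TERMINAL_STATES:
            return status
        if clock() + interval > deadline:
            # The task keeps running server-side; it can be fetched later by ID.
            raise PollTimeout(f"still '{status}' after {timeout} seconds")
        sleep(interval)
```

The sleep and clock parameters exist only so the loop can be exercised without real waiting.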

image

Generate and manage AI images.

listenhub image create --prompt <text> [options]
listenhub image list [options]
listenhub image get <id> [-j]
listenhub image delete <id...> [-j]

image create

--reference accepts a local file or URL and can be repeated up to 5 times.

Flag | Values | Default | Meaning
--prompt <text> | string (required) | - | Image description.
--model <model> | string | - | Model name.
--lang <lang> | string | - | Prompt language hint.
--aspect-ratio <ratio> | string | 1:1 | Aspect ratio.
--size <size> | 1K, 2K, 4K | 2K | Image size.
--reference <path-or-url> | path or URL (repeatable, max 5) | [] | Reference image.
--no-wait | flag | poll | Return the ID immediately without polling.
--timeout <seconds> | number | 120 | Polling timeout.
--json, -j | flag | false | JSON output.

listenhub image create --prompt "a dragon in watercolor style" --reference ./sketch.png
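The file-versus-URL handling described under Conventions applies to --reference. The Python sketch below approximates that check on the client side, as an illustration rather than the CLI's actual logic: classify_reference and reference_args are hypothetical helpers, and interpreting "10 MB" as 10 × 1024 × 1024 bytes is an assumption.

```python
from pathlib import Path
from typing import List, Sequence
from urllib.parse import urlparse

IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp", ".gif"}
MAX_IMAGE_BYTES = 10 * 1024 * 1024  # assumption: "10 MB" means 10 MiB
MAX_REFERENCES = 5


def classify_reference(value: str) -> str:
    """Return 'url' for http(s) input, 'upload' for a valid local image file."""
    if urlparse(value).scheme.lower() in ("http", "https"):
        return "url"  # passed through unchanged
    path = Path(value)
    if path.suffix.lower() not in IMAGE_EXTENSIONS:
        raise ValueError(f"unsupported image extension: {path.suffix or '(none)'}")
    if path.stat().st_size > MAX_IMAGE_BYTES:
        raise ValueError(f"{value} is larger than 10 MB")
    return "upload"  # the CLI uploads it before the API call


def reference_args(values: Sequence[str]) -> List[str]:
    """Build repeated --reference flags, enforcing the documented maximum of 5."""
    if len(values) > MAX_REFERENCES:
        raise ValueError("--reference can be repeated at most 5 times")
    args: List[str] = []
    for value in values:
        classify_reference(value)  # raises ValueError on a bad local file
        args += ["--reference", value]
    return args
```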

image list

Flag | Values | Default | Meaning
--page <n> | number | 1 | Page number.
--page-size <n> | number | 20 | Items per page.
--json, -j | flag | false | JSON output.

image get / image delete

Command | Argument | Meaning
image get <id> | image ID | Get image details. Accepts -j.
image delete <id...> | one or more image IDs | Delete one or more AI images. Accepts -j.

listenhub image get <id>
listenhub image delete <id1> <id2>

video

Generate video with SeeDance models. Image/video/audio inputs accept <path-or-url> (local files are uploaded, URLs pass through).

listenhub video create --prompt <text> [options]
listenhub video get <taskId> [-j]
listenhub video list [options]
listenhub video estimate [options]

video create

Flag | Values | Default | Meaning
--prompt <text> | string (required) | - | Video description.
--model <model> | happyhorse, doubao-seedance-2-pro, doubao-seedance-2-fast | happyhorse | Generation model.
--resolution <res> | 480p, 720p, 1080p | - | Output resolution.
--ratio <ratio> | 16:9, 4:3, 1:1, 3:4, 9:16, 21:9, 4:5, 5:4 | - | Aspect ratio.
--duration <seconds> | number, 3 to 15 | - | Video duration.
--first-frame <path-or-url> | path or URL | - | First frame image.
--last-frame <path-or-url> | path or URL | - | Last frame image (requires --first-frame).
--reference-image <path-or-url> | path or URL (repeatable, max 9) | [] | Reference image.
--reference-video <path-or-url> | path or URL (repeatable, max 3) | [] | Reference video.
--reference-audio <path-or-url> | path or URL (repeatable, max 3) | [] | Reference audio.
--input-video-duration <seconds> | number, 2 to 15 | - | Reference video duration. Required with --reference-video.
--no-generate-audio | flag | audio on | Disable audio generation.
--audio-setting <mode> | auto, origin | - | Audio handling for video-edit.
--seed <number> | number, -1 to 4294967295 | - | Random seed.
--no-wait | flag | poll | Return the ID immediately without polling.
--timeout <seconds> | number | 1200 | Polling timeout.
--json, -j | flag | false | JSON output.

listenhub video create --prompt "A cat playing piano" --resolution 720p --duration 5
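Several video create flags depend on one another. The Python sketch below gathers the dependencies and repeat limits stated in the table into one pre-flight check; video_create_errors is a hypothetical helper, and it deliberately leaves numeric ranges to the CLI.

```python
from typing import List, Optional, Sequence


def video_create_errors(
    first_frame: Optional[str] = None,
    last_frame: Optional[str] = None,
    reference_images: Sequence[str] = (),
    reference_videos: Sequence[str] = (),
    reference_audios: Sequence[str] = (),
    input_video_duration: Optional[float] = None,
) -> List[str]:
    """Return a list of problems; an empty list means the flags are consistent."""
    errors = []
    if last_frame and not first_frame:
        errors.append("--last-frame requires --first-frame")
    if reference_videos and input_video_duration is None:
        errors.append("--input-video-duration is required with --reference-video")
    if len(reference_images) > 9:
        errors.append("--reference-image can be repeated at most 9 times")
    if len(reference_videos) > 3:
        errors.append("--reference-video can be repeated at most 3 times")
    if len(reference_audios) > 3:
        errors.append("--reference-audio can be repeated at most 3 times")
    return errors
```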

video estimate

Estimate credit cost before creating.

Flag | Values | Default | Meaning
--model <model> | string (required) | - | Model name.
--resolution <res> | string (required) | - | Resolution.
--duration <seconds> | number (required) | - | Duration.
--ratio <ratio> | string | 16:9 | Aspect ratio.
--has-video-input | flag | false | Has reference video input.
--input-video-duration <seconds> | number | - | Reference video duration.
--json, -j | flag | false | JSON output.

listenhub video estimate --model doubao-seedance-2-pro --resolution 1080p --duration 10

video get / video list

Command | Flags | Meaning
video get <taskId> | -j | Get video task details.
video list | --page, --page-size, --status (pending/generating/uploading/success/failed), -j | List video tasks.

listenhub video list --status success
listenhub video get <task-id>

speakers

List the voices available to your account.

listenhub speakers list [options]

Flag | Values | Default | Meaning
--lang <lang> | en, zh, ja | - | Filter by language.
--json, -j | flag | false | JSON output.

Prints a table of Name, ID, Gender, and Personality. The ID column is what you pass to --speaker-id on creation commands.

listenhub speakers list --lang en

lyrics

Generate song lyrics from a prompt.

listenhub lyrics generate --prompt <text> [options]
listenhub lyrics list [options]
listenhub lyrics get <taskId> [-j]

lyrics generate

Flag | Values | Default | Meaning
--prompt <text> | string (required, max 200 chars) | - | Lyrics description.
--no-wait | flag | poll | Return the ID immediately without polling.
--timeout <seconds> | number | 120 | Polling timeout.
--json, -j | flag | false | JSON output.

listenhub lyrics generate --prompt "A hopeful anthem about new beginnings"

lyrics list / lyrics get

Command | Flags | Meaning
lyrics list | --page, --page-size, --status (pending/generating/success/failed), -j | List lyrics tasks.
lyrics get <taskId> | -j | Get lyrics task details.

listenhub lyrics list --status success
listenhub lyrics get <task-id>

creation

Fetch or delete any creation (episode, image, or other generated item) by ID.

listenhub creation get <id> [-j]
listenhub creation delete <id...> [-j]

Command | Argument | Meaning
creation get <id> | creation ID | Get creation details. Accepts -j.
creation delete <id...> | one or more IDs | Delete one or more creations. Accepts -j.

listenhub creation get <id>
listenhub creation delete <id1> <id2>
