ListenHub OpenAPI

Core Concepts

Key entities, generation modes, and data flow in ListenHub OpenAPI.

2. Core Concepts

1. Basic Terms

  • Episode: The basic content unit in ListenHub
    • Each episode has a unique episodeId
    • Contains audio, scripts, and metadata
  • Speaker: Defines the voice characteristics used for generation
    • Identified by speakerId
    • Includes attributes such as language and gender
    • To list the available speakers, call GET /v1/speakers/list
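
As a rough illustration of the speaker lookup mentioned above, the sketch below builds (but does not send) a request to GET /v1/speakers/list. Only the path comes from this page. The host in BASE_URL is a placeholder, and the bearer-token Authorization header is an assumption, since authentication is not described here.

```python
import urllib.request

# Assumption: placeholder host; replace with the real ListenHub API base URL.
BASE_URL = "https://api.example.com"


def build_speakers_request(api_key: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (without sending) a GET request for the speaker list."""
    return urllib.request.Request(
        url=f"{base_url}/v1/speakers/list",
        method="GET",
        # Assumption: bearer-token auth; this page does not specify the scheme.
        headers={"Authorization": f"Bearer {api_key}"},
    )


req = build_speakers_request("YOUR_API_KEY")
print(req.get_method(), req.full_url)
```

Passing the request to urllib.request.urlopen would perform the call. The response layout is not documented on this page, so the sketch stops before parsing it.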

2. Generation Modes

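| Mode       | Sub-mode | Characteristics                                | Typical use cases                               | Generation time | API endpoint             |
|------------|----------|------------------------------------------------|-------------------------------------------------|-----------------|--------------------------|
| Podcast    | deep     | In-depth analysis with higher content quality  | Professional knowledge sharing, deep commentary | 2-4 min         | /v1/podcast/episodes     |
| Podcast    | quick    | Faster generation with efficiency priority     | News briefs, time-sensitive content             | 1-2 min         | /v1/podcast/episodes     |
| Podcast    | debate   | Two-host debate style output                   | Opinion discussions, multi-angle analysis       | 2-4 min         | /v1/podcast/episodes     |
| FlowSpeech | smart    | AI improves readability and fixes text issues  | Fix awkward sentences and typos                 | 1-2 min         | /v1/flow-speech/episodes |
| FlowSpeech | direct   | Direct text-to-speech conversion               | Well-prepared scripts and announcements         | 1-2 min         | /v1/flow-speech/episodes |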

Important: Podcast mode supports selecting 1-2 speakers.
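
The mode table and the speaker limit above can be captured in a small lookup helper. This is a minimal sketch: the sub-mode names, endpoints, and the 1-2 speaker rule come from this page, while the function name endpoint_for and the choice to raise ValueError are illustrative. No speaker limit is stated for FlowSpeech, so none is enforced.

```python
# Endpoint per sub-mode, transcribed from the generation modes table.
ENDPOINTS = {
    "deep": "/v1/podcast/episodes",
    "quick": "/v1/podcast/episodes",
    "debate": "/v1/podcast/episodes",
    "smart": "/v1/flow-speech/episodes",
    "direct": "/v1/flow-speech/episodes",
}
PODCAST_SUB_MODES = {"deep", "quick", "debate"}


def endpoint_for(sub_mode: str, speaker_ids: list) -> str:
    """Return the endpoint for a sub-mode, checking the Podcast speaker limit."""
    if sub_mode not in ENDPOINTS:
        raise ValueError(f"unknown sub-mode: {sub_mode!r}")
    # Podcast mode supports selecting 1-2 speakers.
    if sub_mode in PODCAST_SUB_MODES and not 1 <= len(speaker_ids) <= 2:
        raise ValueError("Podcast mode requires 1 or 2 speakers")
    return ENDPOINTS[sub_mode]


print(endpoint_for("debate", ["speaker-a", "speaker-b"]))
```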

3. Data Stream Types

  • Text stream (Server-Sent Events format)
    • Podcast: outline and scripts are usually available after 20-60 seconds
    • FlowSpeech: outline and scripts are usually available after about 3 seconds
  • Audio outputs
    • Streaming audio (M3U8): suitable for real-time playback; returned in the audioStreamUrl field
    • Full audio (MP3): suitable for download and offline playback; returned in the audioUrl field
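
To make the two output types concrete, the sketch below reads the data payloads from a Server-Sent Events text stream and picks an audio field by playback need. The parsing follows the general SSE line format. The JSON payloads in the sample and the assumption that audioStreamUrl and audioUrl sit at the top level of an episode object are hypothetical, as this page does not show the response layout.

```python
import json
from typing import Iterable, Iterator, Optional


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each complete event in an SSE stream."""
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith(":"):
            continue  # comment or keep-alive line
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # One optional space after the colon is not part of the value.
            buffer.append(value[1:] if value.startswith(" ") else value)
        # Other fields (event:, id:, retry:) are ignored in this sketch.


def pick_audio_url(episode: dict, live: bool) -> Optional[str]:
    """Use the M3U8 stream for real-time playback, the MP3 otherwise."""
    return episode.get("audioStreamUrl" if live else "audioUrl")


# Hypothetical payloads; the real event schema is not shown on this page.
sample = [
    ": keep-alive\n",
    'data: {"type": "outline"}\n',
    "\n",
    'data: {"type": "script"}\n',
    "\n",
]
events = [json.loads(payload) for payload in iter_sse_data(sample)]
print(events)
```

An event that is still incomplete when the stream ends is dropped, which matches how SSE clients normally treat a stream that closes without a final blank line.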

3. Playground Quick Experience

1. Multi-speaker TTS

ListenHub Playground provides an online multi-speaker speech synthesis demo that can be tested without writing code.

URL: https://assets.listenhub.ai/listenhub-public-prod/static/playgroud-tts.html

Highlights:

  • Multi-role dialogue generation in a single request
  • Flexible assignment of speaker per script line
  • Online script editing with instant listening preview

Typical scenarios:

  • Audiobook and radio drama production
  • Conversational content generation
  • Rapid product demo production
