TTS

Convert text or URL content into natural-sounding speech audio. Two modes: single-voice narration for everyday reading and casual TTS, and multi-character scripts for dialogue and dubbed content.

For AI Agents: The full content of this page is available as text at https://listenhub.ai/docs/en/skills/tts.mdx. Use WebFetch to read it before helping the user with this skill.

Trigger

Invoke this skill with /tts, or use any of these phrases:

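| Phrase | Language |
| --- | --- |
| read aloud / read this aloud | English |
| TTS / text to speech | English |
| voice narration | English |
| 朗读这段 | Chinese ("read this passage aloud") |
| 配音 / 语音合成 | Chinese ("dubbing" / "speech synthesis") |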

Requires ListenHub Skills to be installed — see Getting Started.

Quick Example

Read this article aloud: https://en.wikipedia.org/wiki/Podcast

The AI fetches the content, selects a voice, and generates natural speech audio.

When to Use TTS vs Podcast

Both skills can produce multi-speaker audio, but they serve different purposes:

| Use case | Skill |
| --- | --- |
| Topic-based discussion with natural conversation flow | Podcast |
| Precise control over every line and speaker | TTS (Multi-Character) |
| Reading an article or text aloud | TTS (Single Voice) |

Two Modes

Single Voice

Convert text or URL content to speech with a single voice. Fast and simple (about 1-2 minutes).

Best for reading articles aloud, casual TTS conversion, and everyday voice narration.

Processing modes:

| Mode | Description |
| --- | --- |
| direct | Reads text exactly as provided (default) |
| smart | Auto-fixes grammar and punctuation before reading |

Multi-Character Script

Multi-character audio with per-segment voice assignment. Moderate speed (about 2-3 minutes).

Best for dialogue dubbing, multi-character narration, and scripted voiced content.

Script format:

{
  "scripts": [
    {"content": "Hello everyone, welcome to the show.", "speakerId": "cozy-man-english"},
    {"content": "Thanks for having me!", "speakerId": "travel-girl-english"}
  ]
}

Each segment is spoken by the assigned speaker in order.
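
As a rough illustration of the script format, the sketch below assembles the `scripts` structure from an ordered list of (speaker, line) pairs. The helper name `build_script` is hypothetical and not part of ListenHub; only the `scripts`, `content`, and `speakerId` field names and the two speaker IDs come from the example above.

```python
import json


def build_script(lines):
    """Turn ordered (speaker_id, text) pairs into the script structure shown above.

    Hypothetical helper for illustration only; not a ListenHub function.
    """
    segments = []
    for speaker_id, text in lines:
        text = text.strip()
        if not text:
            # Skip blank lines rather than emit an empty segment.
            continue
        segments.append({"content": text, "speakerId": speaker_id})
    return {"scripts": segments}


script = build_script([
    ("cozy-man-english", "Hello everyone, welcome to the show."),
    ("travel-girl-english", "Thanks for having me!"),
])
print(json.dumps(script, indent=2))
```

Because segments are spoken in order, the list order passed to the helper is the playback order.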

Parameters

| Parameter | Options | Default |
| --- | --- | --- |
| Input | Text or URL | |
| Language | zh (Chinese), en (English) | Auto-detected |
| Mode | direct, smart (Single Voice only) | direct |
| Path | Single Voice, Multi-Character Script | Single Voice |
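
To make the defaults and the "Single Voice only" constraint concrete, here is a minimal sketch that models the parameters as a Python dataclass. The class `TTSOptions` and its field names are hypothetical and do not reflect the actual ListenHub API; treating Input as required is also an assumption, since the table lists no default for it.

```python
from dataclasses import dataclass
from typing import Optional

LANGUAGES = {"zh", "en"}
MODES = {"direct", "smart"}
PATHS = {"Single Voice", "Multi-Character Script"}


@dataclass
class TTSOptions:
    """Hypothetical container mirroring the parameter table; not a ListenHub type."""

    input: str                      # text or URL
    language: Optional[str] = None  # None stands for auto-detection
    mode: str = "direct"
    path: str = "Single Voice"

    def __post_init__(self):
        if not self.input:
            # Assumption: an input is always required.
            raise ValueError("input (text or URL) is required")
        if self.language is not None and self.language not in LANGUAGES:
            raise ValueError(f"unsupported language: {self.language!r}")
        if self.mode not in MODES:
            raise ValueError(f"unknown mode: {self.mode!r}")
        if self.path not in PATHS:
            raise ValueError(f"unknown path: {self.path!r}")
        if self.mode == "smart" and self.path != "Single Voice":
            # The table limits smart mode to Single Voice.
            raise ValueError("smart mode applies to Single Voice only")


options = TTSOptions("https://en.wikipedia.org/wiki/Podcast")
print(options.mode, "/", options.path)
```

Constructing `TTSOptions("Hi", mode="smart", path="Multi-Character Script")` raises `ValueError` in this sketch, which matches the restriction in the table.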

When to Use Each Mode

| Scenario | Mode |
| --- | --- |
| Read an article or text aloud | Single Voice |
| Casual TTS conversion | Single Voice |
| Dialogue with multiple characters | Multi-Character Script |
| Precise per-line voice control | Multi-Character Script |

Multi-Speaker Script Tips

  • Keep segments at natural speech boundaries (sentences or paragraphs)
  • Alternate speakers for a dialogue feel
  • Each speakerId must be a valid ID from the speakers API
  • All speakers should share the same language
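
The tips above can be approximated with a small pre-flight check. This is only a sketch: `check_script` is a hypothetical helper, and the `speaker_languages` mapping (speakerId to language code) is assumed to be prepared by the caller from the speakers API, whose response format is not described on this page. The `"en"` values below are likewise assumed for illustration.

```python
def check_script(script, speaker_languages):
    """Return a list of warnings for a multi-character script.

    speaker_languages: caller-supplied dict mapping each valid speakerId
    to a language code (assumed to come from the speakers API).
    """
    warnings = []
    languages = set()
    previous = None
    for index, segment in enumerate(script.get("scripts", [])):
        speaker = segment.get("speakerId")
        if speaker not in speaker_languages:
            warnings.append(f"segment {index}: unknown speakerId {speaker!r}")
        else:
            languages.add(speaker_languages[speaker])
        if previous is not None and speaker == previous:
            # Only a style hint: alternating speakers reads more like dialogue.
            warnings.append(f"segment {index}: same speaker as the previous segment")
        previous = speaker
    if len(languages) > 1:
        warnings.append(f"speakers mix languages: {sorted(languages)}")
    return warnings


known = {"cozy-man-english": "en", "travel-girl-english": "en"}  # assumed codes
script = {
    "scripts": [
        {"content": "Hello everyone, welcome to the show.", "speakerId": "cozy-man-english"},
        {"content": "Thanks for having me!", "speakerId": "travel-girl-english"},
    ]
}
print(check_script(script, known))  # prints []
```

The function returns warnings instead of raising, since repeating a speaker is a stylistic choice while an unknown `speakerId` is more likely a real error.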

Limits

  • FlowTTS text input: max 10,000 characters
  • For longer content, use a URL input — the API fetches and processes it automatically
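
A minimal sketch of how a caller might apply this limit before submitting: send text directly when it fits, and fall back to a URL otherwise. The helper `choose_input` and the dict it returns are hypothetical, and `len()` counts Unicode code points, which may not match how the service counts characters.

```python
MAX_TEXT_CHARS = 10_000  # text input limit stated above


def choose_input(text=None, url=None):
    """Pick text when it fits within the limit, otherwise fall back to a URL.

    Hypothetical helper; the returned dict shape is illustrative only.
    """
    if text is not None and len(text) <= MAX_TEXT_CHARS:
        return {"type": "text", "value": text}
    if url:
        return {"type": "url", "value": url}
    if text is None:
        raise ValueError("provide text or a URL")
    raise ValueError(
        f"text is {len(text)} characters (limit {MAX_TEXT_CHARS}); supply a URL instead"
    )


print(choose_input(text="Short passage to read aloud.")["type"])
print(choose_input(text="x" * 10_001, url="https://en.wikipedia.org/wiki/Podcast")["type"])
```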

Output

After generation:

  • Listen link — stream on ListenHub
  • Audio download — say "download audio" to save locally

API Reference

See the TTS API endpoints for technical details.
