Speech Recognition (ASR)

Transcribe audio files to text with coli asr, which runs local speech recognition models entirely on your machine. No ListenHub API key is required, and no internet connection is needed after setup.

This skill requires the coli CLI tool; see Prerequisites below.

Prerequisites

Install the coli CLI before using this skill:

npm install -g @marswave/coli

Optional but recommended: Install ffmpeg to support more audio formats (MP4, M4A, AAC, etc.):

# macOS
brew install ffmpeg

# Ubuntu / Debian
sudo apt install ffmpeg

WAV files work without ffmpeg. Other formats require it.
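
The WAV rule above can be sketched as a small shell check. `needs_ffmpeg` is a hypothetical helper name, not part of coli, and it only looks at the file extension rather than the actual audio container.

```shell
# Hypothetical helper: succeeds (exit 0) when the file is not a WAV file,
# i.e. when ffmpeg is needed according to the rule above.
needs_ffmpeg() {
  local ext
  # Lowercase the extension so "TALK.WAV" is treated like "talk.wav".
  ext="$(printf '%s' "${1##*.}" | tr '[:upper:]' '[:lower:]')"
  [ "$ext" != "wav" ]
}

# Warn early instead of failing in the middle of a transcription.
if needs_ffmpeg meeting.m4a && ! command -v ffmpeg >/dev/null 2>&1; then
  echo "Install ffmpeg before transcribing meeting.m4a" >&2
fi
```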

On first transcription, coli automatically downloads the required speech model (~60 MB) to ~/.coli/models/.
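
A minimal sketch of checking these prerequisites by hand. It relies only on standard shell built-ins and never invokes coli itself; `check_prereqs` is a hypothetical function name, and the model check only tests whether the directory named above exists.

```shell
# Hypothetical helper: report which prerequisites are present.
check_prereqs() {
  if command -v coli >/dev/null 2>&1; then
    echo "coli: found"
  else
    echo "coli: missing (npm install -g @marswave/coli)"
  fi

  if command -v ffmpeg >/dev/null 2>&1; then
    echo "ffmpeg: found"
  else
    echo "ffmpeg: missing (only WAV input will work)"
  fi

  # The model directory is created on first transcription.
  if [ -d "$HOME/.coli/models" ]; then
    echo "models: present"
  else
    echo "models: not downloaded yet"
  fi
}

check_prereqs
```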

Trigger

Invoke this skill with /asr, or use any of these phrases:

Phrase	Language
`transcribe` / `transcribe this`	English
`ASR`	English
`转录` / `识别音频`	Chinese
`语音转文字`	Chinese
`把这段音频转成文字`	Chinese

Quick Example

Transcribe this file: meeting.m4a

The AI checks prerequisites, reads your config, confirms the settings, and runs the transcription locally. The result appears directly in the conversation.

Models

Model	Languages	Notes
`sensevoice` (default)	Chinese, English, Japanese, Korean, Cantonese	Also detects language, emotion, and audio events
`whisper-tiny.en`	English only	Lighter model

sensevoice is recommended for multilingual content or when the language is unknown.
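
One possible selection policy, sketched as a shell function. `pick_model` and its language-code argument are hypothetical; only the two model names come from the table above. Preferring the lighter model for known-English audio is a judgment call, not a documented rule.

```shell
# Hypothetical helper: choose a model name from a language hint.
# Anything other than "en" (including no hint at all) falls back to the
# multilingual default.
pick_model() {
  case "$1" in
    en) echo "whisper-tiny.en" ;;
    *)  echo "sensevoice" ;;
  esac
}

pick_model en   # prints: whisper-tiny.en
pick_model      # prints: sensevoice
```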

Output

Transcript file name:

{audio-filename}-transcript.md

The Markdown file includes a front-matter header with source file, date, model, duration, and detected language.
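
The naming pattern can be sketched as follows. `transcript_name` is a hypothetical helper, and it assumes that {audio-filename} means the file name without its directory or extension; the source does not say which directory the file is written to.

```shell
# Hypothetical helper: derive the transcript file name from an audio path.
# ASSUMPTION: the directory and the final extension are both dropped.
transcript_name() {
  local base
  base="$(basename "$1")"
  echo "${base%.*}-transcript.md"
}

transcript_name recordings/meeting.m4a   # prints: meeting-transcript.md
```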

Composability

This skill produces transcript text that can be passed directly to other skills:

Transcribe a recorded interview → feed into /podcast as reference material
Transcribe a voice memo → use as input for /explainer
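
When a transcript is handed to another tool, the front-matter header usually needs to be removed first. A minimal sketch, assuming the header uses the common convention of `---` delimiter lines at the top of the file; `transcript_body` is a hypothetical helper name.

```shell
# Hypothetical helper: print a transcript without its front-matter header.
# ASSUMPTION: the header starts with "---" on line 1 and ends at the next
# "---" line. Files without a header are printed unchanged.
transcript_body() {
  awk '
    NR == 1 && $0 == "---" { in_fm = 1; next }   # opening delimiter
    in_fm && $0 == "---"   { in_fm = 0; next }   # closing delimiter
    !in_fm                 { print }             # body text
  ' "$1"
}

# Example: transcript_body meeting-transcript.md > meeting-body.txt
```

Only the first pair of delimiters is treated as the header, so a later `---` line in the transcript body is preserved.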

API Reference

No API calls. This skill uses the local coli asr command only.
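
For running the command by hand, a guarded wrapper might look like the sketch below. `run_asr` is a hypothetical name, and passing the audio path as a positional argument to `coli asr` is an assumption; consult the CLI's own help output for the exact syntax and available options.

```shell
# Hypothetical wrapper around the local command.
run_asr() {
  if ! command -v coli >/dev/null 2>&1; then
    echo "coli not found; run: npm install -g @marswave/coli" >&2
    return 127
  fi
  if [ ! -f "$1" ]; then
    echo "no such file: $1" >&2
    return 1
  fi
  # ASSUMPTION: the audio path is passed as a positional argument.
  coli asr "$1"
}

# Example: run_asr meeting.m4a
```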
