ListenHub Skills

Content Parser

Extract structured content from any URL. Supports articles, YouTube videos, tweets, WeChat posts, PDFs, and more. Use standalone or as a preprocessing step for other skills.

For AI Agents: The full content of this page is available as text at https://listenhub.ai/docs/en/skills/content-parser.mdx. Use WebFetch to read it before helping the user with this skill.

Trigger

Invoke this skill with /content-parser, or use any of these phrases:

| Phrase | Language |
| --- | --- |
| parse this URL / parse this | English |
| extract content / extract from | English |
| 解析链接 (parse link) | Chinese |
| 提取内容 (extract content) | Chinese |

Requires ListenHub Skills to be installed — see Getting Started.

Quick Example

Parse this article: https://en.wikipedia.org/wiki/Topology

The AI extracts the content, returns a preview, and offers to save or use it in another skill.

Supported Sources

| Platform | URL patterns | Content type |
| --- | --- | --- |
| YouTube | youtube.com/watch?v=, youtu.be/ | Video transcripts |
| Bilibili | bilibili.com/video/ | Video transcripts |
| Twitter/X | twitter.com/, x.com/ | Tweets (profile or single) |
| WeChat | mp.weixin.qq.com/s/ | Public articles |
| PDF | Direct .pdf URL | Document text |
| DOCX | Direct .docx URL | Document text |
| Images | Direct image URL | OCR / description |
| Any webpage | Any HTTP(S) URL | Article text |
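
To make the URL patterns listed above concrete, here is a minimal Python sketch that maps a URL to one of the platform labels. It is an illustration only and not how ListenHub routes requests: the function name `classify_source`, the list of image extensions, and the exact matching rules are assumptions.

```python
from urllib.parse import parse_qs, urlparse

# Assumed set of image extensions; the source only says "Direct image URL".
IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg", ".gif", ".webp")


def classify_source(url: str) -> str:
    """Return the platform label for a URL (hypothetical helper)."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        raise ValueError(f"unsupported URL scheme: {url!r}")

    # Normalise the host so "www.youtube.com" and "youtube.com" match alike.
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    path = parsed.path
    lower_path = path.lower()

    if host == "youtube.com" and path == "/watch" and "v" in parse_qs(parsed.query):
        return "YouTube"
    if host == "youtu.be" and len(path) > 1:
        return "YouTube"
    if host == "bilibili.com" and path.startswith("/video/"):
        return "Bilibili"
    if host in ("twitter.com", "x.com"):
        return "Twitter/X"
    if host == "mp.weixin.qq.com" and path.startswith("/s/"):
        return "WeChat"
    if lower_path.endswith(".pdf"):
        return "PDF"
    if lower_path.endswith(".docx"):
        return "DOCX"
    if lower_path.endswith(IMAGE_EXTENSIONS):
        return "Images"
    # Everything else falls through to generic article extraction.
    return "Any webpage"


if __name__ == "__main__":
    print(classify_source("https://en.wikipedia.org/wiki/Topology"))
```

Platform hosts are checked before file extensions in this sketch so that a platform URL that happens to end in `.pdf` still resolves to its platform; the real precedence is not documented here.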

Parameters

| Parameter | Description | Default |
| --- | --- | --- |
| summarize | Generate a summary | false |
| maxLength | Maximum content length (characters) | 100,000 (max 500,000) |
| twitter.count | Tweets to fetch (Twitter/X profiles only) | 20 (max 100) |
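
The defaults and upper bounds above can be captured in a small options object. The following Python sketch is hypothetical: the class name `ParseOptions`, the nested payload shape inferred from the dotted name `twitter.count`, and the choice to reject out-of-range values (rather than clamp them) are all assumptions, not documented behaviour.

```python
from dataclasses import dataclass

# Defaults and limits taken from the parameter list above.
DEFAULT_MAX_LENGTH = 100_000
MAX_LENGTH_LIMIT = 500_000
DEFAULT_TWEET_COUNT = 20
TWEET_COUNT_LIMIT = 100


@dataclass(frozen=True)
class ParseOptions:
    """Hypothetical container for Content Parser parameters."""

    summarize: bool = False
    max_length: int = DEFAULT_MAX_LENGTH
    twitter_count: int = DEFAULT_TWEET_COUNT

    def __post_init__(self) -> None:
        # Assumption: out-of-range values are rejected, not silently clamped.
        if not 1 <= self.max_length <= MAX_LENGTH_LIMIT:
            raise ValueError(f"maxLength must be between 1 and {MAX_LENGTH_LIMIT}")
        if not 1 <= self.twitter_count <= TWEET_COUNT_LIMIT:
            raise ValueError(f"twitter.count must be between 1 and {TWEET_COUNT_LIMIT}")

    def to_payload(self) -> dict:
        # Assumed shape: "twitter.count" is treated as a nested "twitter" object.
        return {
            "summarize": self.summarize,
            "maxLength": self.max_length,
            "twitter": {"count": self.twitter_count},
        }


if __name__ == "__main__":
    print(ParseOptions(summarize=True, max_length=50_000).to_payload())
```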

Common Use Cases

Standalone Extraction

Extract the content from this YouTube video: https://youtube.com/watch?v=...

Parse + Generate

Combine with other skills in a single conversation:

Parse this article and turn it into a podcast: https://example.com/article
Extract this YouTube video and make an explainer video from it

See the Composing Skills guide for more workflow examples.

Output

After extraction completes, the AI receives the following structured data:

  • content: full extracted text (up to maxLength, default 100,000 characters)
  • metadata: title, author, publication date, and other page metadata
  • references: URLs referenced in the content
  • Summary: returned instead of the full content when summarize is enabled
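
As a rough picture of the structure described above, the result could be modelled as follows. This is a sketch under assumptions: the class name `ParseResult`, the field types, the lowercase `summary` attribute, and the `preview` helper are hypothetical and not part of the documented output.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ParseResult:
    """Hypothetical model of the extraction result."""

    content: Optional[str]         # full text, absent when a summary is returned
    metadata: dict                 # title, author, publication date, ...
    references: list               # URLs referenced in the content
    summary: Optional[str] = None  # set only when summarize is enabled


def preview(result: ParseResult, limit: int = 200) -> str:
    """Return a short preview, preferring the summary when one exists."""
    text = result.summary if result.summary is not None else (result.content or "")
    if len(text) <= limit:
        return text
    return text[:limit].rstrip() + "..."


if __name__ == "__main__":
    result = ParseResult(
        content="Topology is concerned with the properties of a geometric object...",
        metadata={"title": "Topology"},
        references=[],
    )
    print(preview(result, limit=40))
```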

Limitations

  • Paywalled content may not be accessible
  • JavaScript-rendered content may be partially extracted
  • Content longer than maxLength (default 100,000 characters, maximum 500,000) is truncated
  • Some platforms may block automated access

Credits

Content extraction is billed based on extracted content length:

| Rule | Details |
| --- | --- |
| Pre-deduction | 5 credits reserved when extraction starts |
| Rate | 100 credits per 100,000 characters; if the actual cost is less than the 5 pre-deducted credits, only the actual amount is charged |
| Default limit | 100,000 characters (100 credits) |
| Maximum limit | 500,000 characters (500 credits) |
| Failure refund | All pre-deducted credits refunded on failure |
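
The billing rules above work out to 1 credit per 1,000 characters. The Python sketch below estimates a charge under those rules. The function names are hypothetical, and rounding partial credits up is an assumption, since the rounding behaviour is not stated here.

```python
# Figures taken from the billing rules above.
CREDITS_PER_BLOCK = 100
CHARS_PER_BLOCK = 100_000
PRE_DEDUCTION = 5
MAX_CHARS = 500_000


def extraction_cost(char_count: int) -> int:
    """Estimate credits for a given extracted length (hypothetical helper)."""
    if char_count < 0:
        raise ValueError("char_count must be non-negative")
    billable = min(char_count, MAX_CHARS)
    # Assumption: partial credits round up. Integer ceiling division avoids
    # floating-point error.
    return (billable * CREDITS_PER_BLOCK + CHARS_PER_BLOCK - 1) // CHARS_PER_BLOCK


def settle(char_count: int, succeeded: bool) -> dict:
    """Reconcile the 5-credit reservation against the final cost."""
    if not succeeded:
        # On failure the whole reservation is refunded.
        return {"charged": 0, "refunded": PRE_DEDUCTION, "extra_deducted": 0}
    cost = extraction_cost(char_count)
    return {
        "charged": cost,
        "refunded": max(PRE_DEDUCTION - cost, 0),
        "extra_deducted": max(cost - PRE_DEDUCTION, 0),
    }


if __name__ == "__main__":
    print(settle(2_000, succeeded=True))
    print(settle(100_000, succeeded=True))
    print(settle(100_000, succeeded=False))
```

Under these assumptions a 2,000-character extraction costs 2 credits, so 3 of the 5 reserved credits are returned, while a 100,000-character extraction costs 100 credits in total.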

API Reference

See the Content Parser API endpoints for technical details.
