Content Parser
Extract and parse content from any URL — articles, videos, tweets, PDFs, and more.
Extract structured content from any URL. Supports articles, YouTube videos, tweets, WeChat posts, PDFs, and more. Use standalone or as a preprocessing step for other skills.
For AI Agents: The full content of this page is available as text at https://listenhub.ai/docs/en/skills/content-parser.mdx. Use WebFetch to read it before helping the user with this skill.
Trigger
Invoke this skill with /content-parser, or use any of these phrases:
| Phrase | Language |
|---|---|
| parse this URL / parse this | English |
| extract content / extract from | English |
| 解析链接 | Chinese |
| 提取内容 | Chinese |
Requires ListenHub Skills to be installed — see Getting Started.
Quick Example
Parse this article: https://en.wikipedia.org/wiki/Topology
The AI extracts the content, returns a preview, and offers to save it or use it in another skill.
Supported Sources
| Platform | URL patterns | Content type |
|---|---|---|
| YouTube | youtube.com/watch?v=, youtu.be/ | Video transcripts |
| Bilibili | bilibili.com/video/ | Video transcripts |
| Twitter/X | twitter.com/, x.com/ | Tweets (profile or single) |
| WeChat | mp.weixin.qq.com/s/ | Public articles |
| PDF | Direct .pdf URL | Document text |
| DOCX | Direct .docx URL | Document text |
| Images | Direct image URL | OCR / description |
| Any webpage | Any HTTP(S) URL | Article text |
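As a rough illustration of how the URL patterns in the table could be matched, here is a minimal Python sketch. It is not ListenHub's actual routing logic: the function name `classify_source`, the returned labels, and the list of image extensions are all assumptions made for this example.

```python
from urllib.parse import urlparse, parse_qs

# Assumed set of image extensions; the docs only say "Direct image URL".
IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg", ".gif", ".webp")


def classify_source(url: str) -> str:
    """Return a hypothetical source label for a URL, following the table above."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        raise ValueError("only HTTP(S) URLs are supported")

    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    path = parsed.path
    lower_path = path.lower()

    # youtube.com/watch?v=... or youtu.be/...
    if host == "youtube.com" and path == "/watch" and "v" in parse_qs(parsed.query):
        return "youtube"
    if host == "youtu.be" and len(path) > 1:
        return "youtube"
    if host == "bilibili.com" and path.startswith("/video/"):
        return "bilibili"
    if host in ("twitter.com", "x.com"):
        return "twitter"
    if host == "mp.weixin.qq.com" and path.startswith("/s/"):
        return "wechat"
    # Direct file links are recognized by extension.
    if lower_path.endswith(".pdf"):
        return "pdf"
    if lower_path.endswith(".docx"):
        return "docx"
    if lower_path.endswith(IMAGE_EXTENSIONS):
        return "image"
    # Anything else is treated as a generic webpage.
    return "webpage"
```

For example, `classify_source("https://en.wikipedia.org/wiki/Topology")` falls through every platform check and is treated as a generic webpage.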
Parameters
| Parameter | Description | Default |
|---|---|---|
| summarize | Generate a summary | false |
| maxLength | Maximum content length (characters) | 100,000 (max 500,000) |
| twitter.count | Tweets to fetch (Twitter/X profiles only) | 20 (max 100) |
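The defaults and upper bounds above can be captured in a small validation helper. The following Python sketch is illustrative only: the class `ParseOptions`, the lower bound of 1, and the nested payload shape (inferred from the dotted name `twitter.count`) are assumptions, not a documented request format.

```python
from dataclasses import dataclass

# Defaults and maximums taken from the parameter table.
MAX_LENGTH_DEFAULT = 100_000
MAX_LENGTH_CAP = 500_000
TWITTER_COUNT_DEFAULT = 20
TWITTER_COUNT_CAP = 100


@dataclass(frozen=True)
class ParseOptions:
    """Hypothetical container for the three documented parameters."""
    summarize: bool = False
    max_length: int = MAX_LENGTH_DEFAULT
    twitter_count: int = TWITTER_COUNT_DEFAULT

    def __post_init__(self) -> None:
        # A lower bound of 1 is an assumption; the docs only state maximums.
        if not 1 <= self.max_length <= MAX_LENGTH_CAP:
            raise ValueError(f"maxLength must be between 1 and {MAX_LENGTH_CAP}")
        if not 1 <= self.twitter_count <= TWITTER_COUNT_CAP:
            raise ValueError(f"twitter.count must be between 1 and {TWITTER_COUNT_CAP}")

    def to_payload(self) -> dict:
        # Assumed shape: "twitter.count" is read as a nested "twitter" object.
        return {
            "summarize": self.summarize,
            "maxLength": self.max_length,
            "twitter": {"count": self.twitter_count},
        }
```

Calling `ParseOptions().to_payload()` yields the documented defaults, while `ParseOptions(max_length=600_000)` raises a `ValueError` because it exceeds the 500,000-character maximum.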
Common Use Cases
Standalone Extraction
Extract the content from this YouTube video: https://youtube.com/watch?v=...
Parse + Generate
Combine with other skills in a single conversation:
Parse this article and turn it into a podcast: https://example.com/article
Extract this YouTube video and make an explainer video from it
See the Composing Skills guide for more workflow examples.
Output
After extraction completes, the AI receives the following structured data:
- content: full extracted text (up to maxLength, default 100,000 characters)
- metadata: title, author, publication date, and other page metadata
- references: URLs referenced in the content
- Summary: returned instead of the full content if summarize was enabled
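One way to picture the structured data listed above is as a small record type. This Python sketch is an assumption-laden illustration: the class name `ParsedContent`, the lowercase `summary` attribute, and the field types are hypothetical and may not match the real response schema.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ParsedContent:
    """Hypothetical model of the extraction result described above."""
    content: Optional[str] = None                    # full extracted text
    metadata: dict = field(default_factory=dict)     # title, author, date, ...
    references: list = field(default_factory=list)   # URLs found in the content
    summary: Optional[str] = None                    # set when summarize was enabled

    def text(self) -> str:
        # The summary is returned instead of the full content when present.
        if self.summary is not None:
            return self.summary
        return self.content or ""
```

A downstream step could then call `text()` without caring whether summarization was requested.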
Limitations
- Paywalled content may not be accessible
- JavaScript-rendered content may be partially extracted
- Very long content may be truncated
- Some platforms may block automated access
Credits
Content extraction is billed based on extracted content length:
| Rule | Details |
|---|---|
| Pre-deduction | 5 credits reserved when extraction starts |
| Rate | 100 credits per 100,000 characters; if the actual cost is less than the 5 pre-deducted credits, only the actual amount is charged |
| Default limit | 100,000 characters (100 credits) |
| Maximum limit | 500,000 characters (500 credits) |
| Failure refund | All pre-deducted credits refunded on failure |
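The billing rules above work out to 1 credit per 1,000 characters. The Python sketch below estimates the charge under one loudly stated assumption: partial credits are rounded up, which the table does not specify. The function names `estimate_credits` and `settle` are hypothetical.

```python
# Values taken from the credits table.
PRE_DEDUCTION = 5          # credits reserved when extraction starts
CREDITS_PER_BLOCK = 100    # credits charged per block
BLOCK_CHARS = 100_000      # characters per block


def estimate_credits(extracted_chars: int) -> int:
    """Estimate the credits for a given extracted length.

    Assumption: fractional credits round up; the docs do not state the rounding rule.
    """
    if extracted_chars < 0:
        raise ValueError("extracted_chars must be non-negative")
    # Integer ceiling division avoids floating-point rounding surprises.
    return (extracted_chars * CREDITS_PER_BLOCK + BLOCK_CHARS - 1) // BLOCK_CHARS


def settle(extracted_chars: int, succeeded: bool) -> int:
    """Return the final charge; a failed extraction refunds the pre-deduction in full."""
    if not succeeded:
        return 0
    return estimate_credits(extracted_chars)
```

Under these assumptions, a 2,500-character extraction would cost 3 credits, less than the 5 credits reserved up front, and a 100,000-character extraction would cost 100 credits.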
API Reference
See the Content Parser API endpoints for technical details.