ListenHubOpenAPI
API Reference

Content Extract

Asynchronously extract text content from URLs including web articles, Twitter/X profiles, tweets, YouTube videos, and WeChat Official Account posts.

Content Extract

Extract text content from any URL asynchronously. Submit a URL to create a task, then poll for results.

Supported Sources

X / Twitter

Profile pages and individual tweets — fetch recent posts from any public account, or extract a single tweet with its full context.

YouTube

Extract video transcripts and metadata from any public YouTube video URL.

WeChat Official Accounts

Extract article content from WeChat Official Account posts (mp.weixin.qq.com).

Web Articles

Any publicly accessible web page — article text, metadata, and reference links.

Use Cases

  • Podcast source material -- extract articles, tweets, or WeChat posts as input for podcast generation
  • Content summarization -- pull long-form content and generate summaries
  • Social media monitoring -- batch-extract tweets from key accounts
  • Research aggregation -- collect and structure content from multiple URLs

Create Extraction Task

POST /v1/content/extract

Request example:

curl -X POST "https://api.marswave.ai/openapi/v1/content/extract" \
  -H "Authorization: Bearer $LISTENHUB_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "source": {
      "type": "url",
      "uri": "https://example.com/article"
    }
  }'
const response = await fetch('https://api.marswave.ai/openapi/v1/content/extract', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${process.env.LISTENHUB_API_KEY}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    source: {
      type: 'url',
      uri: 'https://example.com/article',
    },
  }),
});
const data = await response.json();
const taskId = data.data.taskId;
console.log('Task ID:', taskId);
import os
import requests

response = requests.post(
    'https://api.marswave.ai/openapi/v1/content/extract',
    headers={'Authorization': f'Bearer {os.environ["LISTENHUB_API_KEY"]}'},
    json={
        'source': {
            'type': 'url',
            'uri': 'https://example.com/article',
        }
    }
)
data = response.json()
task_id = data['data']['taskId']
print('Task ID:', task_id)

With options (summarize + max length):

curl -X POST "https://api.marswave.ai/openapi/v1/content/extract" \
  -H "Authorization: Bearer $LISTENHUB_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "source": {
      "type": "url",
      "uri": "https://example.com/long-article"
    },
    "options": {
      "summarize": true,
      "maxLength": 5000
    }
  }'
const response = await fetch('https://api.marswave.ai/openapi/v1/content/extract', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${process.env.LISTENHUB_API_KEY}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    source: {
      type: 'url',
      uri: 'https://example.com/long-article',
    },
    options: {
      summarize: true,
      maxLength: 5000,
    },
  }),
});
const data = await response.json();
console.log(data);
import os
import requests

response = requests.post(
    'https://api.marswave.ai/openapi/v1/content/extract',
    headers={'Authorization': f'Bearer {os.environ["LISTENHUB_API_KEY"]}'},
    json={
        'source': {
            'type': 'url',
            'uri': 'https://example.com/long-article',
        },
        'options': {
            'summarize': True,
            'maxLength': 5000,
        },
    }
)
data = response.json()
print(data)

Twitter/X profile URL with tweet count:

# For Twitter/X profile URLs, use the twitter option to control how many tweets to fetch
curl -X POST "https://api.marswave.ai/openapi/v1/content/extract" \
  -H "Authorization: Bearer $LISTENHUB_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "source": {
      "type": "url",
      "uri": "https://x.com/elonmusk"
    },
    "options": {
      "twitter": {
        "count": 50
      }
    }
  }'
const response = await fetch('https://api.marswave.ai/openapi/v1/content/extract', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${process.env.LISTENHUB_API_KEY}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    source: {
      type: 'url',
      uri: 'https://x.com/elonmusk',
    },
    options: {
      twitter: {
        count: 50,
      },
    },
  }),
});
const data = await response.json();
console.log(data);
import os
import requests

response = requests.post(
    'https://api.marswave.ai/openapi/v1/content/extract',
    headers={'Authorization': f'Bearer {os.environ["LISTENHUB_API_KEY"]}'},
    json={
        'source': {
            'type': 'url',
            'uri': 'https://x.com/elonmusk',
        },
        'options': {
            'twitter': {
                'count': 50,
            },
        },
    }
)
data = response.json()
print(data)

WeChat Official Account article:

curl -X POST "https://api.marswave.ai/openapi/v1/content/extract" \
  -H "Authorization: Bearer $LISTENHUB_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "source": {
      "type": "url",
      "uri": "https://mp.weixin.qq.com/s/XXXXXXXXXXXXXXXX"
    }
  }'
const response = await fetch('https://api.marswave.ai/openapi/v1/content/extract', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${process.env.LISTENHUB_API_KEY}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    source: {
      type: 'url',
      uri: 'https://mp.weixin.qq.com/s/XXXXXXXXXXXXXXXX',
    },
  }),
});
const data = await response.json();
const taskId = data.data.taskId;
import os
import requests

response = requests.post(
    'https://api.marswave.ai/openapi/v1/content/extract',
    headers={'Authorization': f'Bearer {os.environ["LISTENHUB_API_KEY"]}'},
    json={
        'source': {
            'type': 'url',
            'uri': 'https://mp.weixin.qq.com/s/XXXXXXXXXXXXXXXX',
        }
    }
)
data = response.json()
task_id = data['data']['taskId']

Request body:

FieldTypeRequiredDescription
sourceobjectYesSource to extract from
source.typestringYesMust be "url"
source.uristringYesThe URL to extract content from
optionsobjectNoExtraction options
options.summarizebooleanNoWhether to generate a summary
options.maxLengthintegerNoMaximum content length (default: 100,000, max: 500,000)
options.twitterobjectNoTwitter/X specific options
options.twitter.countintegerNoNumber of tweets to fetch (1-100, default 20)

Response example:

{
  "code": 0,
  "message": "success",
  "data": {
    "taskId": "{taskId}"
  }
}

Query Task Status

GET /v1/content/extract/{taskId}

Request example:

curl -X GET "https://api.marswave.ai/openapi/v1/content/extract/{taskId}" \
  -H "Authorization: Bearer $LISTENHUB_API_KEY"
const result = await fetch(`https://api.marswave.ai/openapi/v1/content/extract/${taskId}`, {
  headers: {
    'Authorization': `Bearer ${process.env.LISTENHUB_API_KEY}`,
  },
});
const data = await result.json();
console.log('Status:', data.data.status);
import os
import requests

result = requests.get(
    f'https://api.marswave.ai/openapi/v1/content/extract/{task_id}',
    headers={'Authorization': f'Bearer {os.environ["LISTENHUB_API_KEY"]}'}
)
data = result.json()
print('Status:', data['data']['status'])

Path parameters:

FieldTypeDescription
taskIdstring24-character hex string returned from the create endpoint

Response when complete (status: "completed"):

{
  "code": 0,
  "message": "success",
  "data": {
    "taskId": "{taskId}",
    "status": "completed",
    "createdAt": "2025-04-09T12:00:00Z",
    "data": {
      "content": "The extracted article text content...",
      "metadata": {
        "title": "Article Title",
        "author": "Author Name",
        "publishedAt": "2025-04-01T08:00:00Z"
      },
      "references": [
        "https://example.com/related-article"
      ]
    },
    "credits": 5,
    "failCode": null,
    "message": null
  }
}

While in progress, status is "processing" and data is null. On failure, status is "failed" and failCode / message describe the error.

Response fields:

FieldTypeDescription
taskIdstringTask identifier
statusstringprocessing, completed, or failed
createdAtstringISO 8601 timestamp
dataobjectExtracted content (null until completed)
data.contentstringExtracted text content
data.metadataobjectPage metadata (title, author, etc.)
data.referencesarrayReferenced URLs found in the content
creditsintegerCredits consumed
failCodestringError code (null on success)
messagestringError message (null on success)

Notes

Twitter/X profile URLs:

  • When the source URL is a Twitter/X profile (e.g. https://x.com/username), the API fetches recent tweets
  • Use options.twitter.count to control how many tweets to retrieve (1-100, default 20)
  • This option is ignored for non-Twitter URLs

WeChat Official Accounts:

  • Use the full mp.weixin.qq.com/s/... article URL
  • Extracted content includes the article body text and metadata

Task lifecycle:

StatusDescription
processingExtraction is in progress
completedContent extracted successfully
failedExtraction failed -- check failCode and message

Polling recommendations:

  • Initial wait: 5 seconds after task creation
  • Poll interval: 5 seconds
  • Typical completion time: 10-30 seconds depending on URL complexity

On this page