getcontentapi
v1.0.0
contentapi
Official TypeScript/Node.js SDK for ContentAPI — extract structured content from any URL.
Features
- 🌐 Web extraction — articles, blogs, docs → markdown/text
- 📺 YouTube — transcripts, metadata, AI summaries
- 🐦 Twitter/X — tweets and thread extraction
- 🤖 Reddit — posts with comments
- 🔍 Search — web search with structured results
- 📦 Batch — extract multiple URLs in one call
- 🧠 Summarize — AI-powered content summarization
- ⚡ Zero runtime dependencies (uses native fetch)
- 🔄 Automatic retries with exponential backoff
- 📝 Full TypeScript types with strict mode
- 📦 ESM + CommonJS dual build
Requirements
- Node.js 18+ (uses native fetch)
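If you want to fail early on an unsupported runtime, a version check along these lines is one option. This is a minimal sketch: `meetsNodeRequirement` is a hypothetical helper and not part of the SDK, and it only compares the major version against the documented minimum of 18.

```typescript
// Hypothetical helper (not part of the SDK): checks a Node.js version
// string such as "18.17.0" or "v18.17.0" against a minimum major version.
function meetsNodeRequirement(version: string, minMajor: number = 18): boolean {
  // Drop an optional leading "v" and keep the part before the first dot.
  const [majorPart = ""] = version.replace(/^v/, "").split(".");
  const major = parseInt(majorPart, 10);
  // parseInt returns NaN for unparseable input; treat that as unsupported.
  return !isNaN(major) && major >= minMajor;
}

console.log(meetsNodeRequirement("18.17.0")); // true
console.log(meetsNodeRequirement("16.20.2")); // false
```

In a Node.js process the version string would typically come from `process.versions.node`.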
Installation
npm install contentapi
Quick Start
import { ContentAPI } from 'contentapi';
const client = new ContentAPI({ apiKey: 'sk_live_...' });
// Extract a web page
const page = await client.web.extract('https://example.com', { format: 'markdown' });
console.log(page.title, page.content, page.word_count);
Usage
Initialize
import { ContentAPI } from 'contentapi';
const client = new ContentAPI({
  apiKey: 'sk_live_...', // Required
  baseUrl: 'https://...', // Optional (default: https://api.getcontentapi.com/api/v1)
  timeout: 30000, // Optional, in milliseconds (default: 30s)
  maxRetries: 3, // Optional (default: 3)
});
Web Extraction
const result = await client.web.extract('https://example.com', {
  format: 'markdown', // 'markdown' | 'text' (default: 'markdown')
});
console.log(result.title); // Page title
console.log(result.content); // Extracted content
console.log(result.word_count); // Word count
console.log(result.author); // Author (if available)
console.log(result.content_type); // 'article', 'page', etc.
console.log(result.cache_hit); // Whether result was cached
Web Metadata
const meta = await client.web.metadata('https://example.com');
console.log(meta.title, meta.description, meta.og_image);
YouTube
// Get transcript with segments
const transcript = await client.youtube.transcript('https://youtube.com/watch?v=dQw4w9WgXcQ', {
  language: 'en',
});
console.log(transcript.title);
console.log(transcript.full_text);
console.log(transcript.word_count);
transcript.transcript.forEach(s => {
  console.log(`[${s.start}s] ${s.text}`);
});
// Get metadata
const meta = await client.youtube.metadata('https://youtube.com/watch?v=dQw4w9WgXcQ');
console.log(meta.title, meta.view_count, meta.duration, meta.channel);
// Get AI summary (uses /summarize endpoint)
const summary = await client.youtube.summary('https://youtube.com/watch?v=dQw4w9WgXcQ');
console.log(summary.summary, summary.key_points);
Twitter/X
// Extract a thread
const thread = await client.twitter.thread('https://x.com/user/status/123456789');
// Extract a single tweet
const tweet = await client.twitter.tweet('https://x.com/user/status/123456789');
Reddit
const post = await client.reddit.post('https://reddit.com/r/programming/comments/...', {
  comments: 50, // Max comments (default 25, max 100)
  depth: 5, // Reply depth (default 3, max 10)
});
Search
const results = await client.search('python RAG tutorial', {
  count: 5, // 1-10 results (default 5)
  extract: true, // Also extract content from result pages
  country: 'US',
  language: 'en',
});
Batch Extraction
const batch = await client.batch([
  'https://example.com',
  'https://youtube.com/watch?v=dQw4w9WgXcQ',
  'https://reddit.com/r/programming/comments/...',
]);
Summarize
// Summarize a URL
const summary = await client.summarize('https://example.com/long-article');
console.log(summary.summary, summary.key_points, summary.topics);
// Summarize raw content
const summary2 = await client.summarize('', {
  content: 'Long text to summarize...',
  title: 'Optional title for context',
});
Error Handling
The SDK provides typed error classes for precise error handling:
import {
  ContentAPI,
  ContentAPIError,
  AuthenticationError,
  RateLimitError,
  QuotaExceededError,
  ExtractionError,
} from 'contentapi';
const client = new ContentAPI({ apiKey: 'sk_live_...' });
try {
  const result = await client.web.extract('https://example.com');
} catch (error) {
  if (error instanceof AuthenticationError) {
    // Invalid or expired API key (401/403)
    console.error('Auth failed:', error.message);
  } else if (error instanceof RateLimitError) {
    // Too many requests (429); the SDK retries automatically
    console.error('Rate limited. Retry after:', error.retryAfter, 'seconds');
  } else if (error instanceof QuotaExceededError) {
    // Account quota exhausted (402 or 429 with QUOTA_EXCEEDED)
    console.error('Quota exceeded:', error.message);
  } else if (error instanceof ExtractionError) {
    // Bad request / extraction failed (400/422)
    console.error('Extraction failed:', error.message);
  } else if (error instanceof ContentAPIError) {
    // Other API error
    console.error(`API Error [${error.status}]: ${error.message} (${error.code})`);
  }
}
Error Properties
| Error Class | Status | Properties |
|---|---|---|
| ContentAPIError | varies | status, code, message |
| AuthenticationError | 401/403 | Inherits from ContentAPIError |
| RateLimitError | 429 | retryAfter |
| QuotaExceededError | 402/429 | Inherits from ContentAPIError |
| ExtractionError | 400/422 | url |
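The status-to-class mapping in the table can be sketched as a small lookup. This is only an illustration of the documented mapping: `classifyStatus` is a hypothetical helper, not an SDK export, and the SDK's internal logic may differ.

```typescript
// Names of the SDK error classes listed in the table above.
type ErrorClassName =
  | "AuthenticationError"
  | "RateLimitError"
  | "QuotaExceededError"
  | "ExtractionError"
  | "ContentAPIError";

// Hypothetical helper (not part of the SDK): returns the class name the
// table associates with an HTTP status and an optional API error code.
function classifyStatus(status: number, code?: string): ErrorClassName {
  if (status === 401 || status === 403) return "AuthenticationError";
  // 402, or 429 carrying QUOTA_EXCEEDED, means the account quota is exhausted.
  if (status === 402 || (status === 429 && code === "QUOTA_EXCEEDED")) {
    return "QuotaExceededError";
  }
  if (status === 429) return "RateLimitError";
  if (status === 400 || status === 422) return "ExtractionError";
  return "ContentAPIError"; // any other API error
}

console.log(classifyStatus(429)); // RateLimitError
console.log(classifyStatus(429, "QUOTA_EXCEEDED")); // QuotaExceededError
console.log(classifyStatus(500)); // ContentAPIError
```

Note that 429 is shared by two classes, so the quota check has to run before the plain rate-limit check.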
Retries
The SDK automatically retries on:
- 429 Too Many Requests — with exponential backoff + jitter
- 503 Service Unavailable — with exponential backoff + jitter
Default: 3 retries (1s → 2s → 4s base delay with random jitter).
const client = new ContentAPI({
  apiKey: 'sk_live_...',
  maxRetries: 5, // More retries
  timeout: 60000, // Longer timeout per request
});
Set maxRetries: 0 to disable retries.
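To make the 1s → 2s → 4s schedule concrete, here is a rough sketch of exponential backoff with additive jitter. The function `backoffDelay`, the 250 ms jitter range, and the additive form are assumptions for illustration; the SDK's actual formula is not documented in this README.

```typescript
// Assumed shape: delay = baseMs * 2^attempt + random jitter.
// The jitter range (jitterMs) is a guess, not a documented SDK value.
function backoffDelay(
  attempt: number, // 0 for the first retry, 1 for the second, ...
  baseMs: number = 1000,
  jitterMs: number = 250,
  random: () => number = Math.random,
): number {
  const exponential = baseMs * Math.pow(2, attempt);
  return exponential + Math.floor(random() * jitterMs);
}

// With jitter disabled, the base delays match the documented 1s, 2s, 4s.
const noJitter = () => 0;
const delays = [0, 1, 2].map((n) => backoffDelay(n, 1000, 250, noJitter));
console.log(delays); // 1000, 2000, 4000
```

Passing the random source as a parameter keeps the schedule deterministic in tests while still spreading out retries in normal use.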
TypeScript
All response types are fully typed and exported:
import type {
  WebExtractResult,
  WebMetadataResult,
  YouTubeTranscriptResult,
  YouTubeMetadataResult,
  SummarizeResult,
  TwitterThreadResult,
  RedditPostResult,
  SearchResult,
  BatchResult,
  TranscriptSegment,
} from 'contentapi';
CommonJS
const { ContentAPI } = require('contentapi');
const client = new ContentAPI({ apiKey: 'sk_live_...' });
client.web.extract('https://example.com')
  .then(result => console.log(result.title));
API Endpoints Reference
| Method | SDK Method | HTTP |
|---|---|---|
| Web extract | client.web.extract(url) | GET /web/extract?url=... |
| Web metadata | client.web.metadata(url) | GET /web/metadata?url=... |
| YouTube transcript | client.youtube.transcript(url) | GET /youtube/transcript?url=... |
| YouTube metadata | client.youtube.metadata(url) | GET /youtube/metadata?url=... |
| YouTube summary | client.youtube.summary(url) | POST /summarize |
| Twitter thread | client.twitter.thread(url) | GET /twitter/thread?url=... |
| Twitter tweet | client.twitter.tweet(url) | GET /twitter/tweet?url=... |
| Reddit post | client.reddit.post(url) | GET /reddit/post?url=... |
| Search | client.search(query) | GET /web/search?q=... |
| Batch | client.batch(urls) | POST /batch |
| Summarize | client.summarize(url) | POST /summarize |
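For the GET endpoints in the table, the target URL has to be percent-encoded when passed as a query parameter. The sketch below shows how such a request URL could be assembled with the standard `URL` API; `buildRequestUrl` is a hypothetical helper and not part of the SDK, which handles this internally.

```typescript
// Default base URL taken from the Initialize section.
const DEFAULT_BASE_URL = "https://api.getcontentapi.com/api/v1";

// Hypothetical helper (not part of the SDK): builds a GET request URL
// with percent-encoded query parameters.
function buildRequestUrl(
  path: string,
  params: Record<string, string>,
  baseUrl: string = DEFAULT_BASE_URL,
): string {
  // Strip trailing slashes so the base and path join with exactly one "/".
  const url = new URL(baseUrl.replace(/\/+$/, "") + path);
  Object.keys(params).forEach((key) => {
    const value = params[key];
    if (value !== undefined) url.searchParams.set(key, value);
  });
  return url.toString();
}

console.log(buildRequestUrl("/web/extract", { url: "https://example.com" }));
// https://api.getcontentapi.com/api/v1/web/extract?url=https%3A%2F%2Fexample.com
```

Using `searchParams.set` rather than string concatenation avoids double-encoding and keeps characters such as `:` and `/` in the target URL from being read as part of the API path.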
License
MIT — see LICENSE.
