mcp-content
v0.10.1
Published
Thin MCP client for the mcp-content pipeline: researches a keyword and writes one ICP-aligned article. Scrapes locally through MCP Scraper; all generation (prompts + models) runs on a private access-gated server.
Readme
mcp-content
A standalone TypeScript MCP server that researches a keyword and writes one ICP-aligned article through a fully sequential, atomic-step pipeline.
No REST API. No front end. No embeddings, no Python, no Modal, no clustering.
This package is a thin client. It scrapes locally through MCP Scraper and orchestrates the run, but every generation step (note-taking, ICP inference, headings, writing) is delegated to a private access-gated server that holds the prompts, the model strategy, and the OpenRouter key. The client ships no prompts and no model logic.
How it works
The orchestrating agent (Claude or Codex) calls the Step Tools in order. Each tool does one atomic thing against a Session Folder on disk and returns only a small summary — bulk scraped pages, transcripts, and the 30 parallel OpenRouter transform calls never pass through the agent's context.
start_run(keyword) → sessionId
ingest_paa(sessionId) → 50 PAA questions [MCP Scraper: harvest_paa]
ingest_serp(sessionId) → AI overview + ~20 organic (2 pages) + YT urls, scrape each page
[MCP Scraper: search_serp pages=2, extract_url]
ingest_youtube(sessionId) → transcribe SERP-surfaced videos [MCP Scraper: youtube_transcribe] (optional)
ingest_reddit(sessionId) → browser-agent thread search + scrape [MCP Scraper: browser_*, extract_url] (optional)
ingest_research(sessionId) → agent-packet deep research → Brief + Evidence [MCP Scraper: workflow_run] (optional)
research_stats(sessionId) → stat-targeted harvest → sourced stats bank [MCP Scraper + server: /api/extract-stats] (optional)
transform_sources(sessionId) → Note Versions, up to 8 parallel server calls [server: /api/transform]
run_textrazor(sessionId) → entities for top 5 SERP sources [server: /api/textrazor] (optional)
infer_icp(sessionId) → icp.md / icp.json [server: /api/icp]
propose_headings(sessionId) → question-formatted H1–H3, sources mapped per section [server: /api/headings]
write_sections(sessionId) → one section per server call, intro skipped [server: /api/write-section]
assemble_article(sessionId) → article.md
improve(sessionId, focus?) → critique → more research → re-craft (optional) [server: /api/critique]
add_images(sessionId) → 5-stage visual plan → fal.ai photos + baked text overlays (opt) [server: /api/image-plan, /api/image, /api/compose]
add_graphics(sessionId) → infographic GATE → SVG data graphics, gated & inserted (opt) [server: /api/infographic-plan, /api/graphic]
publish_preview(sessionId) → ephemeral 24h Vercel URL (Article + Analysis tabs) [server: /api/publish]improve runs before the image steps (it regenerates the article from sections). Image system:
add_images— a 5-stage art-director plan (analyze → expand → finalize by visual-communication principles → text-overlay design → prompt) curating ~3-5 photos (one hero), generated with fal.ainano-banana-provia the async queue. Overlay text is composed as crisp SVG and baked into the photo with sharp (/api/compose) — AI image models can't render legible text, so text lives in a vector layer.research_stats— runs stat-targeted searches via MCP Scraper, scrapes results, and extracts a deduped, source-attributed stats bank (value · claim · source · year). Feeds the narrative gate so data viz is grounded and honest (NN/g's "informational honesty").add_graphics— a narrative gate (/api/visual-plan): an art director finds the article's core story, designs ONE composite lead-infographic (tall multi-panel SVG: header → hero stat → icon flow → comparison → source) plus 1-2 supporting graphics, grounding every number in the stats bank. Full data-viz toolkit, all real SVG: lead-infographic, statement-card, stat-card, diagram (flow/timeline), breakdown, comparison, bar-chart, trend, before-after, pictograph, ranked-bars, gauge, funnel, quadrant, scatter, stacked-100, quote-card — with an icon library and NN/g rules (one focal point, honest scale, limited palette, hierarchy). Never forced.
YouTube and Reddit run by default. Set includeYoutube: false / includeReddit: false on start_run to skip them.
Session Folder layout
sessions/<keyword>-<id>/
├── manifest.json run options, per-step status, source registry
├── raw/ paa.json, serp.json, <sourceId>.txt, textrazor-entities.json
├── notes/ <sourceId>.json (Note Versions)
├── icp.md / icp.json
├── headings.json
├── sections/NN.md
└── article.md final outputGeneration server
All generation is delegated to a private server (the mcp-content-server Vercel project). The server holds the prompts, the model profiles (transform → Qwen 3.6; icp/headings/writer → GPT-OSS-120B Nitro, with Qwen fallbacks), the OpenRouter key, and the TextRazor key. It validates MCP_CONTENT_API_KEY on every request. The client never sees prompts or model ids — only the tool results.
Setup
npm install
npm run buildSet environment (see .env.example):
MCP_CONTENT_API_KEY— required. Your issued access key. The server returns 401 without it.MCP_CONTENT_SERVER_URL— optional. Defaults to the hosted server; override for your own deployment orvercel dev.MCP_SCRAPER_CMD/MCP_SCRAPER_ARGS— how to launch MCP Scraper (defaults to~/Desktop/mcp-scraper/dist/bin/mcp-scraper.js).MCP_CONTENT_MAX_PARALLEL— transform fan-out cap (default 8).MCP_CONTENT_SESSIONS_DIR— where Session Folders live (default./sessions).
OpenRouter and TextRazor keys are not set on the client — they live on the server.
Register with an MCP client
{
"mcpServers": {
"mcp-content": {
"command": "npx",
"args": ["-y", "mcp-content"],
"env": { "MCP_CONTENT_API_KEY": "..." }
}
}
}The same client must also have MCP Scraper available to the host environment, since mcp-content launches it as a child stdio client.
MCP Scraper
mcp-content depends on the MCP Scraper server for all external scraping. Tool-name and argument shapes are read defensively (src/scraper/parse.ts) so minor response-shape differences degrade gracefully rather than crash.
Develop
npm run dev # run the server from source over stdio
npm run typecheck
npm test