contextmd-cli
v1.0.2
The ultimate Agentic AI Context Generator for Documentation.
🧠 ContextMD
🚀 Want to maintain your AI documentation effortlessly?
Check out SuperDocs.cloud — The platform to build, host, and maintain AI-ready documentation in 1-click directly from your GitHub repository.
- Always up to date. Zero config. Production ready.
ContextMD is the ultimate terminal utility for turning complex documentation websites into a single, high-density AI Context File.
Modern LLMs and Agents (like Claude 3.5 Sonnet, GPT-4o, or Gemini 1.5 Pro) are powerful, but they struggle to navigate multi-page documentation sites effectively. They get lost in navigation bars, footers, duplicate content, and fragmented pages.
ContextMD solves this. It crawls, cleans, and refines entire documentation sites into a single context.md file that you can drop directly into your LLM's context window.
☁️ Go Pro with SuperDocs.cloud
Love the CLI but want automation?
SuperDocs.cloud takes this concept to the enterprise level:
- 🔄 Auto-Sync: Automatically updates your context whenever you push to GitHub.
- 🌍 Hosted URLs: Get a permanent, shareable URL for your documentation context.
- 🧠 Smart Versioning: Maintain context for multiple versions of your docs (v1, v2).
- ⚡ 1-Click Setup: Just connect your repo, and we handle the scraping, cleaning, and hosting.
Start Building on SuperDocs.cloud →
✨ Features
- 🕷️ Deep Crawling: Intelligently traverses documentation sites, following links and building a comprehensive map of the content.
- 🧠 AI-Powered Refinement: Uses OpenAI's models (configurable) to "read" each page and rewrite it for machine comprehension, stripping fluff and prioritizing logic, API signatures, and examples.
- 🧹 Noise Reduction: Automatically detects and separates main content from sidebars, headers, footers, and advertisements.
- ⚡ High Performance: Concurrent processing with a beautiful, real-time CLI dashboard.
- 📄 Single File Output: Produces a consolidated Markdown file with clear headers and structure, perfect for RAG systems or direct LLM context.
🚀 Installation
Ensure you have Node.js 18+ installed.
Global Install (Recommended)
npm install -g contextmd-cli
Run via npx (No install required)
npx contextmd https://docs.example.com
🛠️ Usage
Quick Start
- Get an OpenAI API Key: ContextMD uses AI to compress and refine the content.
- Run the tool:
export OPENAI_API_KEY=sk-proj-...
contextmd https://docs.turso.tech
This will generate a context.md file in your current directory.
Command Line Options
Usage: contextmd [options] <url>
Arguments:
  url                   Base URL of the documentation to convert
Options:
  -k, --key <key>       OpenAI API Key (can also be set via OPENAI_API_KEY env var)
  -o, --output <path>   Output file path (default: "context.md")
  -l, --limit <number>  Max pages to crawl (default: "100")
  -h, --help            display help for command
Examples
Crawl a specific documentation site with a page limit:
contextmd https://developer.spotify.com/documentation/web-api --limit 50
Save to a specific location:
contextmd https://stripe.com/docs/api -o ./stripe-context.md
🏗️ How It Works
ContextMD operates in a three-stage pipeline:
The Crawler:
- Starts at the provided url.
- Uses a Breadth-First Search (BFS) algorithm to find internal links.
- Filters out external links, social media, and irrelevant pages.
- Respects the --limit flag to prevent infinite loops on massive sites.
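The crawl steps above can be sketched as a breadth-first traversal with a page cap. This is a minimal illustration, not ContextMD's actual source: `crawlBfs` and the `linksOf` callback are hypothetical names, and `linksOf` stands in for fetching a page and extracting its links, which a real crawler would do asynchronously. "Internal" is assumed here to mean "same origin as the start URL".

```typescript
// Hypothetical stand-in for "fetch the page and extract its <a href> values".
type LinkExtractor = (url: string) => string[];

// Breadth-first crawl that stays on the start URL's origin and stops
// once `limit` pages have been collected.
function crawlBfs(startUrl: string, limit: number, linksOf: LinkExtractor): string[] {
  const origin = new URL(startUrl).origin;

  // Resolve a raw href against its page, drop the #fragment, and
  // reject anything that points to another origin (external links).
  const normalize = (raw: string, base: string): string | null => {
    try {
      const u = new URL(raw, base);
      u.hash = "";
      return u.origin === origin ? u.toString() : null;
    } catch {
      return null; // malformed href: skip it
    }
  };

  const start = normalize(startUrl, startUrl);
  if (start === null || limit <= 0) return [];

  const visited = new Set<string>([start]); // prevents revisiting pages
  const queue: string[] = [start];          // FIFO queue gives BFS order
  const order: string[] = [];

  while (queue.length > 0 && order.length < limit) {
    const current = queue.shift() as string;
    order.push(current);
    for (const href of linksOf(current)) {
      const next = normalize(href, current);
      if (next !== null && !visited.has(next)) {
        visited.add(next);
        queue.push(next);
      }
    }
  }
  return order;
}
```

The `visited` set is what keeps cyclic links from looping forever; the `limit` check bounds the total work on very large sites.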
The Processor (The "Brain"):
- Downloads the raw HTML of each discovered page.
- Uses turndown and cheerio to convert HTML to Markdown.
- AI Step: Sends the raw Markdown to an LLM with a specialized system prompt designed to:
  - Summarize verbose sections.
  - Preserve code blocks and API schemas exactly.
  - Remove marketing fluff.
  - Standardize formatting.
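To make the AI step more concrete, the sketch below builds the kind of chat request such a step might send. Everything here is an assumption for illustration: the README does not show the real system prompt, so `SYSTEM_PROMPT` is paraphrased from the four goals listed above, and `buildRefinementRequest` and the default model name are hypothetical. Only the general `{ model, messages }` shape follows the OpenAI Chat Completions format.

```typescript
interface ChatMessage {
  role: "system" | "user";
  content: string;
}

interface RefinementRequest {
  model: string;
  messages: ChatMessage[];
}

// Assumed wording, paraphrasing the goals in the README; not the tool's actual prompt.
const SYSTEM_PROMPT = [
  "You rewrite documentation pages for consumption by LLMs.",
  "- Summarize verbose sections.",
  "- Preserve code blocks and API schemas exactly.",
  "- Remove marketing fluff.",
  "- Standardize formatting as Markdown.",
].join("\n");

// Builds the request body for one page. No network call is made here;
// the caller would pass the result to its HTTP client or SDK of choice.
function buildRefinementRequest(
  pageUrl: string,
  markdown: string,
  model: string = "gpt-4o-mini" // placeholder default; the tool's model is configurable
): RefinementRequest {
  return {
    model,
    messages: [
      { role: "system", content: SYSTEM_PROMPT },
      { role: "user", content: `Source: ${pageUrl}\n\n${markdown}` },
    ],
  };
}
```

Keeping the request builder separate from the network call makes the prompt easy to inspect and test without spending API credits.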
The Compiler:
- Stitches all processed pages into a single context.md file.
- Adds a metadata header and table of contents structure (implicitly via markdown headers).
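The compile stage can be pictured as a simple join over the processed pages. The sketch below is one possible layout under stated assumptions: the `ProcessedPage` shape, the `compileContext` name, and the exact header fields are hypothetical, since the README does not specify the output format beyond "a metadata header" and Markdown headers per page.

```typescript
// Hypothetical shape of one page after the processing stage.
interface ProcessedPage {
  url: string;
  title: string;
  markdown: string;
}

// Joins all pages into one Markdown document. Each page gets its own
// "##" heading, so the headings double as an implicit table of contents.
function compileContext(
  sourceUrl: string,
  pages: ProcessedPage[],
  generatedAt: Date = new Date()
): string {
  const header = [
    `# Context for ${sourceUrl}`,
    "",
    `- Source: ${sourceUrl}`,
    `- Pages: ${pages.length}`,
    `- Generated: ${generatedAt.toISOString()}`,
  ].join("\n");

  const sections = pages.map(
    (p) => `## ${p.title}\n\nSource: ${p.url}\n\n${p.markdown.trim()}`
  );

  // A horizontal rule between sections keeps page boundaries visible.
  return [header, ...sections].join("\n\n---\n\n") + "\n";
}
```

Writing the returned string to the path given by `--output` would then produce the final file.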
📦 For Developers
Want to build this from source?
Clone the repo:
git clone https://github.com/UditAkhourii/contextmd.git
cd contextmd
Install dependencies:
npm install
Build:
npm run build
Run locally:
node dist/index.js https://example.com
🤝 Contributing
We welcome contributions! Please open an issue or submit a PR if you have ideas for:
- Support for local LLMs (Ollama, etc.).
- Better crawling heuristics for SPAs (single-page apps).
- Additional output formats (JSON, JSONL for fine-tuning).
📄 License
CC BY-NC 4.0 © Udit Akhouri
