npm package discovery and stats viewer.

Discover Tips

  • General search

    [free text search, go nuts!]

  • Package details

    pkg:[package-name]

  • User packages

    @[username]

Sponsor

Optimize Toolset

I’ve always been into building performant and accessible sites, but lately I’ve been taking it extremely seriously. So much so that I’ve been building a tool to help me optimize and monitor the sites that I build to make sure that I’m making an attempt to offer the best experience to those who visit them. If you’re into performant, accessible and SEO friendly sites, you might like it too! You can check it out at Optimize Toolset.

About

Hi, 👋, I’m Ryan Hefner  and I built this site for me, and you! The goal of this site was to provide an easy way for me to check the stats on my npm packages, both for prioritizing issues and updates, and to give me a little kick in the pants to keep up on stuff.

As I was building it, I realized that I was actually using the tool to build the tool, and figured I might as well put this out there and hopefully others will find it to be a fast and useful way to search and browse npm packages as I have.

If you’re interested in other things I’m working on, follow me on Twitter or check out the open source projects I’ve been publishing on GitHub.

I am also working on a Twitter bot for this site to tweet the most popular, newest, random packages from npm. Please follow that account now and it will start sending out packages soon–ish.

Open Software & Tools

This site wouldn’t be possible without the immense generosity and tireless efforts from the people who make contributions to the world and share their work via open source initiatives. Thank you 🙏

© 2026 – Pkg Stats / Ryan Hefner

mdrip

v0.1.7

Published

Fetch markdown snapshots of web pages using Cloudflare Markdown for Agents

Readme

mdrip

Fetch clean markdown snapshots of any web page — optimized for AI agents, RAG pipelines, and context-aware workflows.

Reduces token overhead by ~90% compared to raw HTML while preserving the content structure LLMs need.

Why

AI agents and LLMs work better with markdown than HTML. Feeding raw HTML into a context window wastes tokens on tags, scripts, styles, and boilerplate. mdrip solves this by fetching any URL and returning clean, structured markdown.

  • ~90% fewer tokens than raw HTML
  • Automatic HTML-to-markdown fallback when native markdown isn't available
  • Works everywhere — CLI, Node.js, Cloudflare Workers, or via remote MCP
  • Token-aware — reports estimated token counts so you can manage context budgets

Sites that support Cloudflare's Markdown for Agents return markdown natively at the edge. For all other sites, mdrip's built-in converter handles headings, links, lists, code blocks, tables, blockquotes, and more, while filtering hidden/non-visible content (including hidden attributes, aria-hidden, inline hidden styles, templates/forms, and HTML comments).

Installation

npm install -g mdrip

Or use directly with npx:

npx mdrip <url>

CLI Usage

Fetch pages

# Fetch one page
mdrip https://example.com/docs/getting-started

# Fetch multiple pages
mdrip https://example.com/docs https://example.com/api

# Custom timeout (ms)
mdrip https://example.com --timeout 45000

# Strict mode — only accept native markdown, no HTML fallback
mdrip https://example.com --no-html-fallback

# Raw mode — print markdown to stdout, no file writes
mdrip https://example.com --raw

List fetched pages

mdrip list
mdrip list --json

Remove pages

mdrip remove https://example.com/docs/getting-started

Clean snapshots

# Remove all
mdrip clean

# Remove only one domain
mdrip clean --domain example.com

Raw mode for agent runtimes

--raw prints markdown to stdout and skips all file writes and prompts. Useful for piping content directly into agent loops.

mdrip https://example.com --raw | your-agent-cli

Programmatic API

npm install mdrip

Method reference

| Import path | Method | Returns | Purpose | |---|---|---|---| | mdrip | fetchMarkdown(url, options?) | Promise<MarkdownResponse> | Fetch one URL to markdown with metadata | | mdrip | fetchRawMarkdown(url, options?) | Promise<string> | Fetch one URL to markdown string only | | mdrip/node | fetchMarkdown(url, options?) | Promise<MarkdownResponse> | Node entrypoint alias for in-memory fetch | | mdrip/node | fetchRawMarkdown(url, options?) | Promise<string> | Node entrypoint alias for markdown-only fetch | | mdrip/node | fetchToStore(url, options?) | Promise<FetchResult> | Fetch one URL and persist to mdrip/pages/... | | mdrip/node | fetchManyToStore(urls, options?) | Promise<FetchResult[]> | Fetch many URLs and persist successful results | | mdrip/node | listStoredPages(cwd?) | Promise<PageEntry[]> | List tracked snapshots from mdrip/sources.json |

FetchMarkdownOptions supports: timeoutMs, userAgent, htmlFallback, fetchImpl, tokenModel, tokenEncoding. Default token encoding is o200k_base (recommended for GPT-4o/4.1/5-family style counting). StoreFetchOptions extends that with cwd.

Workers / Edge / In-memory

import { fetchMarkdown } from "mdrip";

const page = await fetchMarkdown("https://example.com/docs");

console.log(page.markdown);       // clean markdown content
console.log(page.markdownTokens); // estimated token count
console.log(page.source);         // "cloudflare-markdown" or "html-fallback"

Node.js (fetch and store to disk)

import { fetchToStore, listStoredPages } from "mdrip/node";

const result = await fetchToStore("https://example.com/docs", {
  cwd: process.cwd(),
});

if (result.success) {
  console.log(`Saved to ${result.path}`);
}

const pages = await listStoredPages(process.cwd());

Remote MCP + HTTP API

mdrip is available as a remote service at mdrip.createmcp.dev with MCP transports and a direct JSON API.

| Endpoint | Transport | Use case | |---|---|---| | /mcp | Streamable HTTP MCP | Recommended for MCP clients | | /sse | SSE MCP | Legacy MCP client compatibility | | /api | JSON over HTTP | Direct non-MCP integration |

MCP tools

fetch_markdown:

  • Inputs: url (required), timeout_ms (optional), html_fallback (optional)
  • Output: markdown + metadata (resolvedUrl, status, contentType, source, markdownTokens, contentSignal)

batch_fetch_markdown:

  • Inputs: urls (required array, 1-10), timeout_ms (optional), html_fallback (optional)
  • Output: one result per URL, with success/error details

HTTP API (/api)

GET /api expects query params:

  • url (required)
  • timeout (optional ms)
  • html_fallback (optional true/false)
curl "https://mdrip.createmcp.dev/api?url=https://example.com&timeout=30000&html_fallback=true"

POST /api supports both single and batch bodies:

{ "url": "https://example.com", "timeout_ms": 30000, "html_fallback": true }
{
  "urls": ["https://example.com", "https://example.com/docs"],
  "timeout_ms": 30000,
  "html_fallback": true
}

Single responses return one fetch result object. Batch responses return { "results": [...] } with success: true|false per URL.

Claude Desktop

Add to claude_desktop_config.json:

{
  "mcpServers": {
    "mdrip": {
      "command": "npx",
      "args": ["mcp-remote", "https://mdrip.createmcp.dev/mcp"]
    }
  }
}

Claude Code

claude mcp add mdrip-remote --transport sse https://mdrip.createmcp.dev/sse

Cloudflare AI Playground

Enter mdrip.createmcp.dev/sse at playground.ai.cloudflare.com.

OpenClaw Integration

Option 1: Dedicated OpenClaw skill (recommended)

Install skills from this repo in your OpenClaw workspace:

cd ~/.openclaw/workspace
npx skills add charl-kruger/mdrip

Enable the OpenClaw-focused skill in ~/.openclaw/openclaw.json:

{
  "skills": {
    "entries": {
      "mdrip-openclaw": {
        "enabled": true
      }
    }
  }
}

If you want OpenClaw to load skills directly from this local repository:

{
  "skills": {
    "load": {
      "extraDirs": ["<absolute-path-to-fetchmd>/skills"]
    }
  }
}

Option 2: Direct CLI usage in OpenClaw workflows

# In-memory markdown only (best for tool pipelines)
mdrip https://example.com/docs --raw

# Persist reusable snapshots in workspace
mdrip https://example.com/docs https://example.com/api
mdrip list --json

File modifications

On first run, mdrip can optionally update:

  • .gitignore — adds mdrip/
  • tsconfig.json — excludes mdrip/
  • AGENTS.md — adds a section pointing agents to your snapshots

Choice is stored in mdrip/settings.json. Use --modify or --modify=false to skip the prompt.

--raw mode bypasses this entirely.

Output structure

mdrip/
├── settings.json
├── sources.json
└── pages/
    └── example.com/
        └── docs/
            └── getting-started/
                └── index.md

Benchmark

Measured on February 13, 2026 (values vary as pages change):

| Page | Mode | Chars saved | Tokens saved | |------|------|------------:|-------------:| | blog.cloudflare.com/markdown-for-agents | cloudflare-markdown | 94.3% | 96.2% | | developers.cloudflare.com/.../markdown-for-agents | cloudflare-markdown | 95.5% | 97.3% | | en.wikipedia.org/wiki/Markdown | html-fallback | 73.8% | 76.4% | | github.com/cloudflare/skills | html-fallback | 96.7% | 98.0% | | Average | | 90.1% | 92.0% |

pnpm build && pnpm benchmark

Token counts use mdrip's tokenizer-based estimator (default encoding: o200k_base).

AI Skills

This repo includes an AI-consumable skills catalog in skills/, following the agentskills format.

  • mdrip: general-purpose mdrip skill (CLI, package APIs, remote MCP/API)
  • mdrip-openclaw: OpenClaw-focused skill and config/workflow reference
npx skills add charl-kruger/mdrip

Requirements

  • Node.js 18+

Author

Charl Kruger

License

Apache-2.0