
intercept-mcp

Give your AI the ability to read the web. One command, no API keys required.

Without it, your AI hits a URL and gets a 403, a paywall, or a wall of raw HTML. With intercept, it almost always gets the content: clean markdown, ready to use.

Handles tweets, YouTube videos (with transcripts when available), arXiv papers, PDFs, Wikipedia articles, and GitHub repos. If the first strategy fails, it tries up to 14 more before giving up.

Works with any MCP client: Claude Code, Claude Desktop, Codex, Cursor, Windsurf, Cline, and more.

Install

Claude Code

claude mcp add intercept -s user -- npx -y intercept-mcp

Codex

codex mcp add intercept -- npx -y intercept-mcp

Cursor

Settings → MCP → Add Server:

{
  "mcpServers": {
    "intercept": {
      "command": "npx",
      "args": ["-y", "intercept-mcp"]
    }
  }
}

Windsurf

Settings → MCP → Add Server → same JSON config as above.

Claude Desktop

Add to your claude_desktop_config.json:

{
  "mcpServers": {
    "intercept": {
      "command": "npx",
      "args": ["-y", "intercept-mcp"]
    }
  }
}

Other MCP clients

Any client that supports stdio MCP servers can run npx -y intercept-mcp.

No API keys needed for the fetch tool.

How it works

URLs are processed in four stages:

1. Site-specific handlers

Known URL patterns are routed to dedicated handlers before the fallback pipeline:

| Pattern | Handler | What you get |
|---------|---------|--------------|
| twitter.com/*/status/*, x.com/*/status/* | Twitter/X | Tweet text, author, media, engagement stats (via third-party APIs) |
| youtube.com/watch?v=*, youtu.be/* | YouTube | Title, channel, duration, views, description, transcript (when captions available) |
| arxiv.org/abs/*, arxiv.org/pdf/* | arXiv | Paper metadata, authors, abstract, categories |
| *.pdf | PDF | Extracted text (text-layer PDFs only) |
| *.wikipedia.org/wiki/* | Wikipedia | Clean article content via Wikimedia REST API |
| github.com/{owner}/{repo} | GitHub | Raw README.md content |
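
For illustration, this routing step behaves like a first-match scan over URL patterns. The sketch below assumes that model; the patterns and handlers are simplified stand-ins, not the package's actual source:

// Hypothetical sketch of stage 1 routing (handlers reduced to stubs).
type Handler = (url: URL) => Promise<string | null>;

const handlers: Array<{ pattern: RegExp; handle: Handler }> = [
  { pattern: /(?:twitter|x)\.com\/[^/]+\/status\//, handle: async (u) => `tweet: ${u.href}` },
  { pattern: /youtube\.com\/watch|youtu\.be\//, handle: async (u) => `video: ${u.href}` },
  { pattern: /arxiv\.org\/(?:abs|pdf)\//, handle: async (u) => `paper: ${u.href}` },
];

// First matching handler wins; null means "fall through to the pipeline".
async function route(url: URL): Promise<string | null> {
  for (const { pattern, handle } of handlers) {
    if (pattern.test(url.href)) return handle(url);
  }
  return null;
}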

2. Shared cache (agentsweb.org)

Before hitting any fetcher, every request checks agentsweb.org — a global shared markdown cache for AI agents. If another agent already fetched this URL, you get the result in under 50ms.

Every successful fetch contributes back automatically. Entries gain trust through a self-healing consensus model: when independent instances fetch the same URL and confirm the same content, confidence increases.

Opt out entirely with INTERCEPT_SHARED_CACHE=false, or use read-only mode (consume but never contribute) with INTERCEPT_CACHE_READ_ONLY=true.

agentsweb.org API

agentsweb.org also exposes standalone endpoints for direct use:

  • /web?q= — search the web
  • /research?q= — search + fetch + cache in one call
  • /fetch?url= — fetch on demand, auto-cached

See agentsweb.org/docs for full API documentation.
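
For example, hitting the documented /fetch endpoint directly could look like the sketch below. The exact response shape isn't specified here, so the body is treated as opaque markdown; see agentsweb.org/docs for the real contract:

// Direct call to the documented /fetch endpoint (Node 18+ has global fetch).
async function fetchViaSharedCache(target: string): Promise<string> {
  const res = await fetch(
    `https://agentsweb.org/fetch?url=${encodeURIComponent(target)}`
  );
  if (!res.ok) throw new Error(`agentsweb.org returned ${res.status}`);
  return res.text();
}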

3. Fallback pipeline

If no handler matches (or the handler returns nothing), the URL enters the multi-tier pipeline:

| Tier | Fetcher | Strategy |
|------|---------|----------|
| 0 | agentsweb.org | Global shared markdown cache — instant if another agent already fetched this URL |
| 1 | Cloudflare Browser Run | JS rendering + markdown extraction (optional, needs API token) |
| 1 | Jina Reader | Clean markdown extraction service |
| 2 | Wayback Machine | Archived version from archive.org |
| 2 | archive.ph | Archived snapshots via timemap API + stealth TLS fetch |
| 2 | Google Cache | Google's cached page version |
| 2 | Arquivo.pt | Portuguese web archive (broad international coverage) |
| 2 | Codetabs | CORS proxy |
| 3 | Raw fetch | Direct GET with browser headers + Turndown markdown conversion |
| 3 | Stealth fetch | Browser TLS fingerprint impersonation via got-scraping (opt-in, see below) |
| 4 | RSS, CrossRef, Semantic Scholar, HN, Reddit | Metadata / discussion fallbacks |
| 5 | OG Meta | Open Graph tags (guaranteed fallback) |

Tier 2 fetchers run in parallel. When multiple succeed, the highest quality result wins. All other tiers run sequentially.

All fetchers return proper Markdown (headings, links, bold, tables, code blocks) via Turndown — not plain text.
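
Conceptually, the pipeline is a loop over tiers that stops at the first usable result. A minimal sketch of that pattern follows; fetcher wiring and the quality scorer are simplified, and the sketch parallelizes every tier for brevity, whereas the real pipeline only parallelizes tier 2:

// Illustrative tiered-fallback loop, not the package's actual source.
type Fetcher = (url: string) => Promise<string | null>;

async function runPipeline(url: string, tiers: Fetcher[][]): Promise<string | null> {
  for (const tier of tiers) {
    // Run the tier's fetchers concurrently; failures are tolerated.
    const settled = await Promise.allSettled(tier.map((f) => f(url)));
    const ok = settled
      .filter((r): r is PromiseFulfilledResult<string> => r.status === "fulfilled" && !!r.value)
      .map((r) => r.value);
    if (ok.length > 0) {
      // Keep the highest-quality result, crudely approximated here by length.
      return ok.sort((a, b) => b.length - a.length)[0];
    }
    // Otherwise fall through to the next tier.
  }
  return null; // every tier failed
}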

4. Caching

Results are cached in-memory with a TTL (by default 60 min for successes and 5 min for failures; see INTERCEPT_CACHE_TTL_MS and INTERCEPT_CACHE_FAILURE_TTL_MS below), capped at 250 entries with LRU eviction (INTERCEPT_CACHE_SIZE). Failed URLs are cached too, to prevent re-attempting known-dead URLs.
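
That behavior maps onto a standard TTL-plus-LRU cache. A sketch, assuming a Map-based implementation (a JavaScript Map iterates in insertion order, so re-inserting on read makes the first key the least recently used):

// Illustrative TTL + LRU cache; defaults mirror the env-var table below.
interface Entry { value: string; expires: number }

class TtlLruCache {
  private entries = new Map<string, Entry>();
  constructor(private maxSize = 250, private ttlMs = 3_600_000) {}

  get(key: string): string | undefined {
    const e = this.entries.get(key);
    if (!e) return undefined;
    if (Date.now() > e.expires) { this.entries.delete(key); return undefined; }
    // Re-insert to mark this key as most recently used.
    this.entries.delete(key);
    this.entries.set(key, e);
    return e.value;
  }

  // Failures would be stored the same way, with the 5-minute TTL override.
  set(key: string, value: string, ttlMs = this.ttlMs): void {
    if (this.entries.size >= this.maxSize && !this.entries.has(key)) {
      // Evict the least recently used entry (first in iteration order).
      this.entries.delete(this.entries.keys().next().value!);
    }
    this.entries.set(key, { value, expires: Date.now() + ttlMs });
  }
}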

Tools

fetch

Fetch a URL and return its content as clean markdown.

  • url (string, required) — URL to fetch
  • maxTier (number, optional, 1-5) — Stop at this tier for speed-sensitive cases
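
For example, calling the fetch tool from your own code with the official MCP TypeScript SDK might look like this (a sketch; the client metadata and target URL are placeholders):

import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

// Spawn intercept-mcp over stdio, the same way the editors above do.
const transport = new StdioClientTransport({
  command: "npx",
  args: ["-y", "intercept-mcp"],
});
const client = new Client({ name: "example-client", version: "1.0.0" });
await client.connect(transport);

// maxTier: 2 stops after the archive fetchers for speed-sensitive cases.
const result = await client.callTool({
  name: "fetch",
  arguments: { url: "https://example.com/article", maxTier: 2 },
});
console.log(result);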

search

Search the web and return results.

  • query (string, required) — Search query
  • count (number, optional, 1-20, default 5) — Number of results

Uses Brave Search API if BRAVE_API_KEY is set, then SearXNG if SEARXNG_URL is set, then DuckDuckGo as an unreliable last resort.
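
That provider order is a configured-first fallback chain, sketched below with stub providers (the stubs are placeholders, not the real integrations):

// Stubs standing in for the real Brave / SearXNG / DuckDuckGo calls.
const braveSearch = async (q: string): Promise<string[]> => [`brave: ${q}`];
const searxngSearch = async (q: string): Promise<string[]> => [`searxng: ${q}`];
const duckDuckGoSearch = async (q: string): Promise<string[]> => [`ddg: ${q}`];

// Try each provider in order, skipping the ones that aren't configured.
async function search(query: string): Promise<string[]> {
  const providers = [
    () => (process.env.BRAVE_API_KEY ? braveSearch(query) : Promise.resolve(null)),
    () => (process.env.SEARXNG_URL ? searxngSearch(query) : Promise.resolve(null)),
    () => duckDuckGoSearch(query), // unconditional last resort
  ];
  for (const p of providers) {
    const results = await p().catch(() => null);
    if (results && results.length > 0) return results;
  }
  return [];
}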

Prompts

research-topic

Search for a topic and fetch the top results for a multi-source summary.

  • topic (string) — The topic to research
  • depth (string, default "3") — Number of top results to fetch

extract-article

Fetch a URL and extract the key points from the content.

  • url (string) — The URL to fetch and summarize

Environment variables

| Variable | Required | Description |
|----------|----------|-------------|
| BRAVE_API_KEY | No | Brave Search API key for search |
| SEARXNG_URL | No | Self-hosted SearXNG instance URL (recommended) |
| CF_API_TOKEN | No | Cloudflare API token with "Browser Rendering - Edit" permission |
| CF_ACCOUNT_ID | No | Cloudflare account ID (required if CF_API_TOKEN is set) |
| USE_STEALTH_FETCH | No | Set to true to enable stealth fetcher (see warning below) |
| INTERCEPT_SHARED_CACHE | No | Set to false to disable the agentsweb.org shared cache |
| INTERCEPT_CACHE_READ_ONLY | No | Set to true to consume but never contribute to the shared cache |
| INTERCEPT_CACHE_TTL_MS | No | In-memory cache TTL for successful fetches in ms (default 3600000 = 60 min) |
| INTERCEPT_CACHE_FAILURE_TTL_MS | No | In-memory cache TTL for failed fetches in ms (default 300000 = 5 min) |
| INTERCEPT_CACHE_SIZE | No | Max in-memory cache entries (default 250) |
| HTTPS_PROXY / HTTP_PROXY | No | Standard proxy passthrough — routes all outbound fetches (including stealth) through the proxy. Honors NO_PROXY. |

Search: Has a DuckDuckGo fallback but it's rate-limited and unreliable. For production use, self-host SearXNG and set SEARXNG_URL (see below), or get a Brave Search API key.

Fetch: Works without any keys. Set CF_API_TOKEN + CF_ACCOUNT_ID to enable Cloudflare Browser Run (formerly Browser Rendering) for JavaScript-heavy pages (SPAs, React sites).

Stealth fetch (USE_STEALTH_FETCH)

Use at your own risk. When enabled, this adds a fetcher that impersonates real browser TLS fingerprints (Chrome/Firefox cipher suites, HTTP/2 settings, header ordering) using got-scraping. This can bypass bot detection and CAPTCHA triggers on sites that would otherwise block automated requests.

This fetcher runs at tier 3 after the regular raw fetch. If the raw fetch gets blocked (CAPTCHA, Cloudflare challenge, 403), the stealth fetcher retries with browser impersonation.

This may violate the terms of service of some websites. The authors of intercept-mcp take no responsibility for how this feature is used. It is disabled by default and must be explicitly opted into.

Bring-your-own proxy (HTTPS_PROXY)

If raw fetches start getting flagged, the most effective fix is usually a clean outbound IP — not a fancier fingerprint. intercept-mcp honors the standard HTTPS_PROXY / HTTP_PROXY / NO_PROXY env vars, so you can route all outbound traffic through whatever proxy you already have:

HTTPS_PROXY=http://user:pass@proxy.example.com:8080 npx intercept-mcp

This works with any HTTP(S) proxy — a self-hosted Squid, a Tailscale exit node, a $5 VPS running 3proxy, or commercial residential proxies (Bright Data, Oxylabs, etc.). The stealth fetcher and got-scraping calls also pick this up automatically.

Self-hosting SearXNG

For reliable search, self-host SearXNG with Docker. A config is included in the repo:

git clone https://github.com/bighippoman/intercept-mcp.git
cd intercept-mcp/searxng && docker compose up -d

Then set SEARXNG_URL=http://localhost:8888. No rate limits, no CAPTCHAs, aggregates Google + Bing + DuckDuckGo + Wikipedia + Brave.

Or use any existing SearXNG instance — just set SEARXNG_URL to its URL.

URL normalization

Incoming URLs are automatically cleaned:

  • Strips 60+ tracking params (UTM, click IDs, analytics, A/B testing, etc.)
  • Removes hash fragments
  • Upgrades to HTTPS
  • Cleans AMP artifacts
  • Preserves functional params (ref, format, page, offset, limit)
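
A simplified version of those steps using the WHATWG URL API (the real list of stripped params is far longer than this sample, and AMP cleanup is omitted):

// Illustrative URL cleaning; TRACKING_PARAMS is a small sample of the 60+.
const TRACKING_PARAMS = ["utm_source", "utm_medium", "utm_campaign", "gclid", "fbclid"];

function normalizeUrl(raw: string): string {
  const url = new URL(raw);
  if (url.protocol === "http:") url.protocol = "https:"; // upgrade to HTTPS
  url.hash = "";                                         // drop fragments
  for (const p of TRACKING_PARAMS) url.searchParams.delete(p);
  return url.toString();
}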

Content quality detection

Each fetcher result is scored for quality. Automatic fail on:

  • CAPTCHA / Cloudflare challenges
  • Login walls
  • HTTP error pages in body
  • Content under 200 characters
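
As a sketch, those automatic-fail rules amount to a single predicate over the fetched body (the patterns here are illustrative; the real scorer is more thorough):

// Hypothetical quality gate matching the checks listed above.
function passesQualityCheck(body: string): boolean {
  if (body.length < 200) return false;                              // too short
  if (/captcha|cf-challenge/i.test(body)) return false;             // CAPTCHA / Cloudflare challenge
  if (/log in to continue|sign in to view/i.test(body)) return false; // login wall
  if (/^\s*(403|404|500)\b.*error/im.test(body)) return false;      // HTTP error page in body
  return true;
}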

Requirements

  • Node.js >= 18
  • No API keys required for basic use