@crawlbrulee/mcp

v0.1.1

Official MCP server for the crawlbrulee API — scrape and map URLs from MCP-aware agents (Claude Code, Codex, Cursor).

🍮 crawlbrulee MCP

The official Model Context Protocol server for the crawlbrulee web-scraping API. Lets MCP-aware AI agents — Claude Code, Codex, Cursor, Claude Desktop — scrape pages, map sites, and check their crawlbrulee usage as native tool calls.

  • npx-runnable — zero install.
  • Wraps the @crawlbrulee/sdk under the hood; this MCP is just a thin protocol adapter.
  • Stdio transport for terminal-based agents.
  • Strict, fully-described tool schemas — agents see what every parameter does without reading docs.

Status: 0.1.x (beta). The tool surface is still stabilizing; expect minor changes between 0.x releases.


Install

# Claude Code
claude mcp add crawlbrulee \
  --env CRAWLBRULEE_API_KEY=cble_... \
  -- npx -y @crawlbrulee/mcp

# Cursor — add to ~/.cursor/mcp.json:
{
  "mcpServers": {
    "crawlbrulee": {
      "command": "npx",
      "args": ["-y", "@crawlbrulee/mcp"],
      "env": { "CRAWLBRULEE_API_KEY": "cble_..." }
    }
  }
}

The same pattern works for Codex, Claude Desktop, and any other host that accepts a stdio MCP launch command — set command: npx, args: ["-y", "@crawlbrulee/mcp"], and forward CRAWLBRULEE_API_KEY via the env block.

Configuration

| Env var | Required | Description |
| --- | --- | --- |
| CRAWLBRULEE_API_KEY | yes | API key sent as Authorization: Bearer …. Get one at https://crawlbrulee.com. |

The MCP reads the env var on first tool invocation — not at startup — so a typo in your config surfaces as a clear tool-error message rather than the server failing to come up.
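The lazy lookup described above could be sketched roughly as follows. This is an illustration only, not the package's actual source: `resolveApiKey`, `withApiKey`, `ToolResult`, and the exact error wording are hypothetical names and assumptions, and the environment is passed in as a plain object so the sketch stays self-contained.

```typescript
// Hypothetical sketch; names and error wording are assumptions, not the package's code.
type Env = Record<string, string | undefined>;

interface ToolResult {
  isError: boolean;
  content: { type: "text"; text: string }[];
}

// Look the key up only when a tool is actually invoked, never at startup.
function resolveApiKey(env: Env): string | undefined {
  const key = env["CRAWLBRULEE_API_KEY"]?.trim();
  return key ? key : undefined;
}

// Wrap a tool handler so a missing key becomes a tool-level error result
// instead of preventing the server from starting.
function withApiKey(
  env: Env,
  handler: (apiKey: string) => ToolResult,
): ToolResult {
  const apiKey = resolveApiKey(env);
  if (apiKey === undefined) {
    return {
      isError: true,
      content: [
        {
          type: "text",
          // Assumed wording; only the missing_api_key code comes from the README.
          text: "[missing_api_key] CRAWLBRULEE_API_KEY is not set",
        },
      ],
    };
  }
  return handler(apiKey);
}
```

With this shape, a host that launches the server with a misspelled variable name still gets a running server, and the first tool call reports the problem in its result.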


Tools

scrape

Fetch a single URL and return the requested content (markdown, cleaned HTML, raw HTML, links, images, screenshot, page metadata).

Input — only url is required; everything else has sane defaults.

{
  "url": "https://example.com",
  "extract": {
    "markdown": true,
    "links": true,
    "screenshot": { "type": "full_page", "device_mode": "desktop" }
  },
  "require_js": false,
  "proxy": "basic",
  "exclude_selectors": ["nav", "footer"],
  "cache": { "max_age": 3600 },
  "location": { "locale": "en-US", "country": "US" }
}

Output — full scrape result. Screenshots are returned as signed download URLs the agent can fetch separately.
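For callers that assemble scrape inputs in code, a typed helper along these lines may be convenient. It is a sketch under assumptions: the field names mirror the example input above, but the value types (for instance, treating `proxy` and the screenshot options as plain strings) are guesses, and `ScrapeInput` and `buildScrapeInput` are hypothetical names rather than SDK exports.

```typescript
// Field names follow the README example; the value types are assumptions.
interface ScrapeInput {
  url: string;
  extract?: {
    markdown?: boolean;
    links?: boolean;
    screenshot?: { type: string; device_mode: string };
  };
  require_js?: boolean;
  proxy?: string;
  exclude_selectors?: string[];
  cache?: { max_age: number };
  location?: { locale: string; country: string };
}

// Build an input with only `url` set unless the caller overrides something,
// leaving every other field to the server-side defaults.
function buildScrapeInput(
  url: string,
  overrides: Omit<ScrapeInput, "url"> = {},
): ScrapeInput {
  // Cheap client-side check; the server remains the authority on valid URLs.
  if (!/^https?:\/\/\S+$/i.test(url)) {
    throw new TypeError(`Expected an http(s) URL, got: ${url}`);
  }
  return { ...overrides, url };
}
```

Calling `buildScrapeInput("https://example.com")` yields an object containing only `url`, which matches the statement that every other field is optional.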

map

Build (or fetch a cached) link-map for a website. Combines sitemap discovery with homepage link extraction. Use this to enumerate a site before scraping selected pages.

{
  "url": "https://example.com",
  "sitemap_only": false,
  "types": { "internal": true, "external": false, "internal_subdomains": true },
  "max_urls": 5000,
  "page": 1,
  "limit": 1000
}
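
If `page` is 1-based and `limit` is the page size (both assumptions; the README only shows the fields), the inputs needed to walk a full map could be generated as below. `MapPageInput` and `mapPageRequests` are hypothetical names used for illustration.

```typescript
// Subset of the map input shown above; pagination semantics are assumed.
interface MapPageInput {
  url: string;
  max_urls: number;
  page: number;
  limit: number;
}

// Produce one map input per page, assuming 1-based pages of `limit` URLs each.
function mapPageRequests(
  url: string,
  maxUrls: number,
  limit: number,
): MapPageInput[] {
  if (!Number.isInteger(maxUrls) || !Number.isInteger(limit) || maxUrls < 1 || limit < 1) {
    throw new RangeError("maxUrls and limit must be positive integers");
  }
  const pages = Math.ceil(maxUrls / limit);
  return Array.from({ length: pages }, (_, index) => ({
    url,
    max_urls: maxUrls,
    page: index + 1,
    limit,
  }));
}
```

With the example values (`max_urls` of 5000 and `limit` of 1000) this yields five inputs, for pages 1 through 5. A real caller would likely stop early once a page comes back with fewer than `limit` entries.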

usage

Returns the current billing-cycle snapshot: total / used / available credits, used quota percent, max concurrency, and cycle reset timestamp. Takes no arguments.

whoami

Returns the organization name, token name, and truncated token preview for the configured API key. Useful for confirming which account is in use before credit-consuming operations.


Errors

Every tool returns an MCP error envelope (isError: true) when the API call fails. The error text follows a stable format:

[<errorName>] <message> (HTTP <status>)

Agents can branch on the errorName code. The set comes from the SDK's ApiErrorName union plus two synthetic codes added by this MCP (missing_api_key, internal_error):

| Code | Meaning |
| --- | --- |
| missing_api_key | CRAWLBRULEE_API_KEY is not set in the MCP host's env. |
| invalid_credentials | Server rejected the API key (revoked, wrong env, etc.). |
| too_many_requests | Rate limit hit; back off and retry. |
| usage_allocation_error | Plan credit / concurrency cap exceeded. Show usage to user. |
| validation_error | Input failed server validation. |
| invalid_url | Target URL was rejected before fetching. |
| blocked_url | Target URL is on the blocklist. |
| antibot_blocked | Origin's anti-bot defenses blocked the fetch. |
| scrape_error | Origin returned an error during scraping. |
| not_found | Async job ID unknown (reserved for future async tools). |
| request_timeout | Network / read timeout. Safe to retry. |
| client_closed_request | Caller cancelled before completion. |
| internal_server_error | Unhandled server-side failure. |
| crawlbrulee_error | SDK error without a typed name. |
| internal_error | Bug in this MCP; please open an issue. |
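
An agent-side helper for branching on these codes might look like the sketch below. It parses the documented `[<errorName>] <message> (HTTP <status>)` text and maps a few codes to follow-up actions based on the meanings listed above. The function names, the `Action` labels, and the choice to treat the HTTP suffix as optional are assumptions made for illustration.

```typescript
interface ParsedToolError {
  code: string;
  message: string;
  status: number | undefined;
}

// Matches "[<errorName>] <message> (HTTP <status>)".
// Treating the "(HTTP …)" suffix as optional is an assumption.
const ERROR_PATTERN = /^\[([a-z_]+)\]\s+([\s\S]*?)(?:\s+\(HTTP (\d{3})\))?$/;

function parseToolError(text: string): ParsedToolError | undefined {
  const match = ERROR_PATTERN.exec(text.trim());
  if (match === null) return undefined;
  const status = match[3];
  return {
    code: match[1] ?? "",
    message: match[2] ?? "",
    status: status !== undefined ? Number(status) : undefined,
  };
}

// Codes the table describes as retryable.
const RETRYABLE = new Set(["too_many_requests", "request_timeout"]);

type Action = "retry" | "show_usage" | "fix_config" | "give_up";

// Hypothetical policy derived from the "Meaning" column.
function nextAction(error: ParsedToolError): Action {
  if (RETRYABLE.has(error.code)) return "retry";
  if (error.code === "usage_allocation_error") return "show_usage";
  if (error.code === "missing_api_key" || error.code === "invalid_credentials") {
    return "fix_config";
  }
  return "give_up";
}
```

Text that does not match the pattern yields `undefined`, so a caller can fall back to showing the raw error message.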


Development

pnpm install
pnpm typecheck   # tsc --noEmit
pnpm lint        # eslint
pnpm test        # vitest run
pnpm build       # tsup → dist/index.js with shebang
pnpm verify      # all of the above

Run the built MCP locally:

CRAWLBRULEE_API_KEY=cble_... node ./dist/index.js

It will block waiting for an MCP client on stdio. Combine with the MCP Inspector for interactive debugging.

Schema sync

The src/schemas/*.ts files are vendored copies of the canonical Zod schemas in crawlbrulee/packages/shared/core/src/model/common/Api*.ts. The ecosystem policy is that tool repos do not import the shared subtree. When the canonical schemas change:

  1. Copy the updated Api*.ts and supporting ScrapeScreenshot*.ts files into src/schemas/.
  2. Keep the // VENDORED from … banner intact and update the path if the source moved.
  3. Re-run pnpm verify.

A future @crawlbrulee/types npm package will replace this manual sync.
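
Until then, the banner rule from step 2 could be checked mechanically. The sketch below takes file contents as a plain name-to-source map and reports files with no line starting with `// VENDORED from `. The helper name is hypothetical, and because the rest of the banner text is not given here, only that prefix is matched.

```typescript
// Only the prefix is known from the README; the remainder of the banner is not assumed.
const BANNER_PREFIX = "// VENDORED from ";

// Return the names of files that contain no line starting with the banner prefix.
function filesMissingBanner(files: Record<string, string>): string[] {
  return Object.keys(files)
    .filter((name) => {
      const source = files[name] ?? "";
      const lines = source.split(/\r?\n/);
      return !lines.some((line) => line.trim().startsWith(BANNER_PREFIX));
    })
    .sort();
}
```

A small script could read `src/schemas/*.ts` into such a map and fail the build when the returned list is non-empty.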

License

AGPL-3.0-only