browsepilot

v0.1.0

🧭 BrowsePilot

MCP server for AI browser automation. Semantic page snapshots, annotated screenshots, and full browser control — built for Claude, GPT, and any MCP client.

BrowsePilot gives AI models a deep understanding of web pages through three-tree merge snapshots — combining accessibility trees, DOM structure, and runtime state into a single, AI-readable page model. No other browser MCP server does this.

Quick Start

npx browsepilot

That's it. Add it to Claude Desktop:

{
  "mcpServers": {
    "browsepilot": {
      "command": "npx",
      "args": ["browsepilot"]
    }
  }
}

Connect to Your Existing Chrome

Keep your logins, cookies, and extensions. Launch Chrome with remote debugging:

# macOS
/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome --remote-debugging-port=9222

# Linux
google-chrome --remote-debugging-port=9222

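# Windows (default install path; adjust if Chrome is installed elsewhere)
"C:\Program Files\Google\Chrome\Application\chrome.exe" --remote-debugging-port=9222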

Then configure:

{
  "mcpServers": {
    "browsepilot": {
      "command": "npx",
      "args": ["browsepilot", "--connect-existing", "--cdp-port", "9222"]
    }
  }
}
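
Before pointing BrowsePilot at a running Chrome, it can help to confirm that the debugging port actually answers. When remote debugging is enabled, Chrome serves version metadata at /json/version on that port. The sketch below is not part of BrowsePilot; checkCdpEndpoint and FetchLike are hypothetical names, and the fetch function is passed in so the check can be exercised without a browser.

```typescript
// Hypothetical helper (not part of BrowsePilot): checks that Chrome's
// remote-debugging HTTP endpoint answers on the given port.

// Minimal shape of the fetch function we need, so a stub can be injected.
type FetchLike = (url: string) => Promise<{
  ok: boolean;
  json(): Promise<unknown>;
}>;

interface CdpVersionInfo {
  browser: string;
  webSocketDebuggerUrl: string;
}

async function checkCdpEndpoint(
  port: number,
  fetchFn: FetchLike,
): Promise<CdpVersionInfo | null> {
  try {
    const res = await fetchFn(`http://127.0.0.1:${port}/json/version`);
    if (!res.ok) return null;
    const body = (await res.json()) as Record<string, unknown>;
    const browser = body["Browser"];
    const wsUrl = body["webSocketDebuggerUrl"];
    if (typeof browser !== "string" || typeof wsUrl !== "string") return null;
    return { browser, webSocketDebuggerUrl: wsUrl };
  } catch {
    // Connection refused or invalid JSON: treat as "not reachable".
    return null;
  }
}
```

On Node.js 18+, the built-in fetch can be passed directly, for example checkCdpEndpoint(9222, fetch). A null result suggests Chrome is not listening on that port.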

What Makes This Different

Three-Tree Merge Snapshots

Every other browser MCP server gives you either a raw accessibility tree or a screenshot. BrowsePilot merges three data sources via CDP into a unified page model:

  1. Accessibility tree — semantic roles, names, values
  2. DOM snapshot — visibility, bounding boxes, paint order, computed styles
  3. Runtime evaluation — scroll position, shadow DOM, scrollable containers, page type
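
To make the merge idea concrete, here is a minimal sketch of joining the three sources. It is illustrative only: the types, the mergeTrees function, and the choice of a backend node ID as the join key are assumptions, not BrowsePilot's actual internals.

```typescript
// Illustrative only: these shapes and the join key are assumptions.
interface AxNode { backendNodeId: number; role: string; name: string; value?: string }
interface DomNode { backendNodeId: number; visible: boolean; bounds: [number, number, number, number] }
interface RuntimeState { scrollY: number; viewportHeight: number; pageHeight: number }

interface MergedElement {
  ref: string;
  role: string;
  name: string;
  value: string | undefined;
  bounds: [number, number, number, number];
}

interface PageModel {
  pagesAbove: number;
  pagesBelow: number;
  elements: MergedElement[];
}

function mergeTrees(ax: AxNode[], dom: DomNode[], runtime: RuntimeState): PageModel {
  // Index DOM data by node ID so each accessibility node can find its layout info.
  const domById = new Map<number, DomNode>();
  for (const d of dom) domById.set(d.backendNodeId, d);

  const elements: MergedElement[] = [];
  for (const node of ax) {
    const d = domById.get(node.backendNodeId);
    // Drop nodes that have no DOM counterpart or are not visible.
    if (!d || !d.visible) continue;
    elements.push({
      ref: `e${elements.length + 1}`,
      role: node.role,
      name: node.name,
      value: node.value,
      bounds: d.bounds,
    });
  }

  // Express scroll position in viewport-sized "pages".
  const below = runtime.pageHeight - runtime.scrollY - runtime.viewportHeight;
  return {
    pagesAbove: runtime.scrollY / runtime.viewportHeight,
    pagesBelow: Math.max(0, below) / runtime.viewportHeight,
    elements,
  };
}
```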

The result is a structured, categorized view of the page:

Page: Google Search [search] | https://www.google.com/
Scroll: 0.0 pages above | 3.2 pages below
Stats: 47 links | 12 interactive | 3 inputs | 340 total
[Start of page]

── INPUTS ──
  e1  textbox  "Search"  = ""
* e5  option   "Suggested: weather"

── ACTIONS ──
  e2  button   "Google Search"
  e3  button   "I'm Feeling Lucky"

── LINKS ──
  e10  link  "Gmail"
  e11  link  "Images"

── SCROLL CONTAINERS ──
  sc1  div#results — 4200px tall, at top, 3400px remaining
[End of page]

Notes on the snapshot format:

  • * = element appeared since last snapshot (change detection)
  • Elements organized by category (INPUTS, ACTIONS, LINKS, etc.)
  • Scroll position awareness (pages above/below)
  • Page type detection (auth, form, search, data-table, article, dashboard)
  • Shadow DOM and scrollable container awareness
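
The categorized layout and the * change marker could be produced along these lines. This is a sketch under stated assumptions: the role-to-category mapping, the use of element refs as the identity for change detection, and the renderSnapshot name are all hypothetical.

```typescript
// Illustrative sketch; the role-to-category mapping and the identity key
// used for change detection are assumptions, not BrowsePilot's logic.
interface SnapshotElement { ref: string; role: string; name: string }

type Category = "INPUTS" | "ACTIONS" | "LINKS" | "OTHER";

const INPUT_ROLES = new Set(["textbox", "combobox", "checkbox", "option"]);

function categorize(role: string): Category {
  if (INPUT_ROLES.has(role)) return "INPUTS";
  if (role === "button") return "ACTIONS";
  if (role === "link") return "LINKS";
  return "OTHER";
}

function renderSnapshot(current: SnapshotElement[], previousRefs: Set<string>): string {
  // Group rendered lines by category, keeping first-seen category order.
  const groups = new Map<Category, string[]>();
  for (const el of current) {
    // "*" flags elements that were absent from the previous snapshot.
    const marker = previousRefs.has(el.ref) ? " " : "*";
    const line = `${marker} ${el.ref}  ${el.role}  "${el.name}"`;
    const cat = categorize(el.role);
    const lines = groups.get(cat) ?? [];
    lines.push(line);
    groups.set(cat, lines);
  }

  const sections: string[] = [];
  groups.forEach((lines, cat) => {
    sections.push([`── ${cat} ──`].concat(lines).join("\n"));
  });
  return sections.join("\n\n");
}
```

With an empty previousRefs set (a first snapshot), this sketch marks every element with *; a real implementation would likely special-case that.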

Annotated Screenshots

Visual context with numbered bounding boxes on interactive elements, color-coded by type:

  • 🔵 Blue = inputs
  • 🟢 Green = actions/buttons
  • 🟠 Orange = links
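
As a rough illustration of the color-coding scheme, the sketch below turns a list of element boxes into an SVG overlay with numbered labels. It is not how BrowsePilot renders annotations (the Requirements section suggests it uses the canvas package); SVG is swapped in here to stay dependency-free, and the hex color values and the toSvgOverlay name are assumptions.

```typescript
// Illustrative only: BrowsePilot's real renderer and exact colors are not
// documented here. The hex values below are assumed stand-ins.
type Kind = "input" | "action" | "link";

interface Box { kind: Kind; x: number; y: number; width: number; height: number }

const COLORS: Record<Kind, string> = {
  input: "#1e88e5",  // blue
  action: "#43a047", // green
  link: "#fb8c00",   // orange
};

function toSvgOverlay(boxes: Box[], width: number, height: number): string {
  const shapes = boxes.map((b, i) => {
    const color = COLORS[b.kind];
    // One outlined rectangle plus a 1-based number label per element.
    return (
      `<rect x="${b.x}" y="${b.y}" width="${b.width}" height="${b.height}" ` +
      `fill="none" stroke="${color}" stroke-width="2"/>` +
      `<text x="${b.x + 2}" y="${b.y + 12}" font-size="12" fill="${color}">${i + 1}</text>`
    );
  });
  return (
    `<svg xmlns="http://www.w3.org/2000/svg" width="${width}" height="${height}">` +
    shapes.join("") +
    `</svg>`
  );
}
```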

Resilient Actions

Click, type, and other actions use multi-strategy fallback chains:

  • Playwright click → scroll into view → partial name match → CDP click
  • If one strategy fails, the next one is tried automatically
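
A fallback chain of this kind can be sketched generically. The Strategy interface and runWithFallback function below are hypothetical names for illustration; the actual strategies BrowsePilot runs are the ones listed above, and their implementation is not shown here.

```typescript
// Generic fallback-chain sketch (hypothetical names, not BrowsePilot's API).
// A strategy attempts the action and rejects (throws) if it cannot complete it.
interface Strategy<T> { name: string; run: () => Promise<T> }

interface ChainResult<T> { value: T; usedStrategy: string; failures: string[] }

async function runWithFallback<T>(strategies: Strategy<T>[]): Promise<ChainResult<T>> {
  const failures: string[] = [];
  for (const s of strategies) {
    try {
      const value = await s.run();
      // First success wins; earlier failures are kept for diagnostics.
      return { value, usedStrategy: s.name, failures };
    } catch (err) {
      const reason = err instanceof Error ? err.message : String(err);
      failures.push(`${s.name}: ${reason}`);
    }
  }
  throw new Error(`All strategies failed: ${failures.join("; ")}`);
}
```

Recording which strategy succeeded, and why the earlier ones failed, makes it easier to report back to the model what actually happened.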

Tools (22)

| Tool | Description |
|------|-------------|
| browser_snapshot | Semantic page snapshot with element refs |
| browser_screenshot | Annotated screenshot with bounding boxes |
| browser_navigate | Navigate to URL |
| browser_click | Click element by ref |
| browser_type | Type text into element |
| browser_press | Press keyboard key/shortcut |
| browser_scroll | Scroll page or element into view |
| browser_select | Select dropdown option |
| browser_hover | Hover over element |
| browser_fill_form | Fill multiple form fields |
| browser_evaluate | Run JavaScript on page |
| browser_extract | Extract structured data (tables, lists, articles, links) |
| browser_read_text | Read text content |
| browser_force_click | JavaScript click (for covered/shadow DOM elements) |
| browser_wait | Wait for conditions |
| browser_tab_new | Open new tab |
| browser_tab_switch | Switch tabs |
| browser_tab_list | List open tabs |
| browser_tab_close | Close tab |
| browser_go_back | Navigate back |
| browser_go_forward | Navigate forward |
| browser_scroll_container | Scroll embedded containers |
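
MCP clients invoke these tools with JSON-RPC tools/call requests. The sketch below builds such requests; the envelope follows the MCP specification, but the argument names (url, ref) are assumptions, since this README does not list each tool's input schema. A client can discover the real schemas with a tools/list request.

```typescript
// Shape of an MCP tools/call request (JSON-RPC 2.0 envelope).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

let nextId = 1;

// Hypothetical helper: wraps a tool name and arguments in the envelope.
function buildToolCall(name: string, args: Record<string, unknown>): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Argument names below are assumed, not taken from BrowsePilot's schemas.
const navigate = buildToolCall("browser_navigate", { url: "https://www.google.com/" });
const click = buildToolCall("browser_click", { ref: "e2" });
```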

Options

npx browsepilot [options]

--headless            Run Chrome in headless mode
--connect-existing    Connect to already-running Chrome
--cdp-port <port>     CDP port (default: random)
--user-data-dir <dir> Chrome profile directory
-v, --version         Show version
-h, --help            Show help
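
If you generate client configuration programmatically, the flags above map onto an args array in a straightforward way. The helper below is a hypothetical sketch (LaunchOptions and buildMcpServerEntry are not part of the package); it only uses the flags documented in this section.

```typescript
// Hypothetical helper: builds an mcpServers entry from the documented flags.
interface LaunchOptions {
  headless?: boolean;
  connectExisting?: boolean;
  cdpPort?: number;
  userDataDir?: string;
}

function buildMcpServerEntry(opts: LaunchOptions): { command: string; args: string[] } {
  const args = ["browsepilot"];
  if (opts.headless) args.push("--headless");
  if (opts.connectExisting) args.push("--connect-existing");
  if (opts.cdpPort !== undefined) args.push("--cdp-port", String(opts.cdpPort));
  if (opts.userDataDir !== undefined) args.push("--user-data-dir", opts.userDataDir);
  return { command: "npx", args };
}

// Produces the same entry as the "Connect to Your Existing Chrome" config.
const configJson = JSON.stringify(
  { mcpServers: { browsepilot: buildMcpServerEntry({ connectExisting: true, cdpPort: 9222 }) } },
  null,
  2,
);
```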

Requirements

  • Node.js 18+
  • Chrome, Brave, Edge, or Chromium installed (auto-detected)
  • Optional: canvas npm package for annotated screenshots

How It Works

MCP Client (Claude Desktop, Cursor, etc.)
  → stdio transport
    → BrowsePilot MCP Server
      → Chrome via Playwright + raw CDP
        → Three-tree merge snapshot
        → Annotated screenshots
        → All browser actions

BrowsePilot doesn't run its own AI; it's a pure tool layer. Your MCP client's model makes the decisions, and BrowsePilot executes them.
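
For readers curious about the stdio transport step, MCP sends JSON-RPC messages over stdin/stdout as newline-delimited JSON, one message per line. The sketch below shows that framing in isolation; LineFramer is a hypothetical name, and real clients and servers would normally rely on an MCP SDK rather than hand-rolling this.

```typescript
// Accumulates raw stdio text and returns complete JSON-RPC messages.
// Hypothetical illustration of newline-delimited framing, not BrowsePilot code.
class LineFramer {
  private buffer = "";

  push(chunk: string): unknown[] {
    this.buffer += chunk;
    const parts = this.buffer.split("\n");
    // The last part is an incomplete line (or "" if the chunk ended with "\n").
    this.buffer = parts.pop() ?? "";
    const messages: unknown[] = [];
    for (const line of parts) {
      const trimmed = line.trim();
      if (trimmed.length > 0) messages.push(JSON.parse(trimmed));
    }
    return messages;
  }
}
```

Because chunks from a pipe can split a message at any byte, the framer holds back the trailing partial line until its newline arrives.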

License

MIT