browser-agent-mcp

v0.2.0

Chrome extension and MCP server for comprehensive browser automation with AI agents

Browser Agent MCP

A Chrome extension and MCP (Model Context Protocol) server that enables AI agents to fully automate and interact with web browsers using your existing Chrome instance.

🎯 Project Goals

This project aims to create the most comprehensive browser automation MCP server, combining the best features of existing solutions while addressing their limitations:

  • Full browser automation (navigation, clicking, typing, form filling)
  • Complete Chrome DevTools access (console logs, network monitoring, element inspection)
  • DOM manipulation and querying (search elements, extract data, modify content)
  • Works with your existing Chrome browser (no separate browser instances required)
  • Reliable and fast (direct extension APIs, no flaky WebSocket connections)

🚀 Why This Project?

Current Browser MCP Limitations

| Tool | Navigation | DevTools | Existing Browser | Reliability |
| ---------------------------- | ---------- | -------- | ---------------- | ----------- |
| Microsoft Playwright MCP | ✅ | ❌ | ❌ | ✅ |
| Browser MCP | ❌ | ⚠️ | ✅ | ⚠️ |
| BrowserTools MCP | ❌ | ✅ | ✅ | ⚠️ |
| Our Solution | ✅ | ✅ | ✅ | ✅ |

Key Problems We're Solving

  1. Navigation Gap: Existing browser-based MCPs don't provide navigation automation
  2. Reliability Issues: Current solutions are flaky and prone to connection drops
  3. Limited DevTools Access: Most tools only provide basic console/network monitoring
  4. Complex Setup: CDP-based solutions require debug mode and complex configuration

🏗️ Architecture Overview

AI Agent (Claude/Cursor/etc.)
    ↓ MCP Protocol
MCP Server (Node.js)
    ↓ WebSocket/Native Messaging
Chrome Extension
    ↓ Chrome APIs + Content Scripts
Web Pages (DOM Manipulation)

Components

  1. Chrome Extension

    • Content scripts for DOM interaction
    • Background service worker for tab management
    • DevTools integration for debugging access
    • Native messaging for MCP communication
  2. MCP Server

    • Implements Model Context Protocol
    • Translates AI commands to browser actions
    • Manages extension communication
    • Provides structured responses to AI agents
  3. Communication Bridge

    • WebSocket or Native Messaging
    • Real-time bidirectional communication
    • Event streaming for live updates
    • Error handling and reconnection logic
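Whichever transport is used, the bridge has to match each response from the extension to the command that triggered it, and fail outstanding commands when the connection drops. The following is a minimal sketch of that bookkeeping, assuming a JSON envelope with a numeric `id`. The names `BridgeRequest`, `BridgeResponse`, and `PendingRequests` are hypothetical and not part of this project; no wire format has been defined yet.

```typescript
// Hypothetical message envelopes; the project does not define a wire format yet.
interface BridgeRequest {
  id: number;
  method: string;
  params: Record<string, unknown>;
}

interface BridgeResponse {
  id: number;
  ok: boolean;
  result?: unknown;
  error?: string;
}

type ResponseHandler = (response: BridgeResponse) => void;

// Tracks in-flight requests so responses can arrive in any order.
class PendingRequests {
  private nextId = 1;
  private handlers = new Map<number, ResponseHandler>();

  // Builds a request envelope and remembers who is waiting for the answer.
  create(
    method: string,
    params: Record<string, unknown>,
    onResponse: ResponseHandler
  ): BridgeRequest {
    const id = this.nextId++;
    this.handlers.set(id, onResponse);
    return { id, method, params };
  }

  // Routes a response to its waiting handler; returns false for unknown ids.
  settle(response: BridgeResponse): boolean {
    const handler = this.handlers.get(response.id);
    if (!handler) return false;
    this.handlers.delete(response.id);
    handler(response);
    return true;
  }

  // On disconnect, fail everything still in flight so callers are not left waiting.
  failAll(reason: string): void {
    this.handlers.forEach((handler, id) => {
      handler({ id, ok: false, error: reason });
    });
    this.handlers.clear();
  }

  // Number of requests still waiting for a response.
  pendingCount(): number {
    return this.handlers.size;
  }
}
```

In this sketch the MCP server would serialize the object returned by `create` onto the transport, call `settle` for every incoming message, and call `failAll` from its disconnect handler before attempting to reconnect.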

🛠️ Planned Features

Core Automation

  • Navigation: navigate(url), back(), forward(), reload()
  • Element Interaction: click(selector), type(selector, text), submit(form)
  • Waiting: waitForElement(selector), waitForNavigation(), waitForText(text)
  • Scrolling: scrollTo(selector), scrollIntoView(element)
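One way to model the planned calls above is as a discriminated union that the MCP server validates before forwarding anything to the extension. This is only a sketch: the `AutomationCommand` type, the `validateCommand` function, and the `timeoutMs` parameter are assumptions made for illustration, not the project's API.

```typescript
// Hypothetical command shapes mirroring some of the planned navigation and interaction calls.
type AutomationCommand =
  | { kind: "navigate"; url: string }
  | { kind: "back" }
  | { kind: "forward" }
  | { kind: "reload" }
  | { kind: "click"; selector: string }
  | { kind: "type"; selector: string; text: string }
  | { kind: "waitForElement"; selector: string; timeoutMs: number };

// Returns a list of problems; an empty list means the command looks usable.
function validateCommand(command: AutomationCommand): string[] {
  const problems: string[] = [];
  switch (command.kind) {
    case "navigate":
      if (!/^https?:\/\//.test(command.url)) {
        problems.push("url must start with http:// or https://");
      }
      break;
    case "click":
    case "type":
    case "waitForElement":
      if (command.selector.trim() === "") {
        problems.push("selector must not be empty");
      }
      if (command.kind === "waitForElement" && command.timeoutMs <= 0) {
        problems.push("timeoutMs must be positive");
      }
      break;
    case "back":
    case "forward":
    case "reload":
      break; // no parameters to check
  }
  return problems;
}
```

Rejecting malformed commands on the server side keeps error messages close to the AI agent and avoids a round trip to the browser for requests that cannot succeed.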

Advanced Interaction

  • Form Handling: fillForm(data), selectOption(selector, value), uploadFile(selector, path)
  • Drag & Drop: dragAndDrop(from, to)
  • Keyboard/Mouse: pressKey(key), hover(selector), rightClick(selector)
  • Multi-tab: openTab(url), switchTab(index), closeTab()
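The multi-tab calls imply that the server keeps some notion of an "active" tab that `switchTab(index)` and `closeTab()` operate on. The sketch below models that state in memory so the index semantics are explicit. `TabRegistry` and `TrackedTab` are hypothetical names, and a real implementation would drive Chrome's tab APIs from the extension rather than a plain array.

```typescript
// Hypothetical in-memory model of the tabs the agent has opened.
interface TrackedTab {
  id: number;
  url: string;
}

class TabRegistry {
  private tabs: TrackedTab[] = [];
  private activeIndex = -1; // -1 means no tab is open
  private nextId = 1;

  // Opens a tab and makes it the active one.
  openTab(url: string): TrackedTab {
    const tab: TrackedTab = { id: this.nextId++, url };
    this.tabs.push(tab);
    this.activeIndex = this.tabs.length - 1;
    return tab;
  }

  // Activates the tab at a zero-based index; rejects out-of-range values.
  switchTab(index: number): TrackedTab {
    if (!Number.isInteger(index) || index < 0 || index >= this.tabs.length) {
      throw new RangeError(`no tab at index ${index}`);
    }
    this.activeIndex = index;
    return this.tabs[index];
  }

  // Closes the active tab; the tab that takes its position (or the last one) becomes active.
  closeTab(): TrackedTab | undefined {
    if (this.activeIndex < 0) return undefined;
    const closed = this.tabs.splice(this.activeIndex, 1)[0];
    this.activeIndex = Math.min(this.activeIndex, this.tabs.length - 1);
    return closed;
  }

  // Returns the active tab, or undefined when nothing is open.
  active(): TrackedTab | undefined {
    return this.activeIndex >= 0 ? this.tabs[this.activeIndex] : undefined;
  }
}
```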

DevTools Integration

  • Console Access: Live console logs, error monitoring, JavaScript execution
  • Network Monitoring: Request/response tracking, performance metrics
  • Element Inspector: DOM tree access, CSS inspection, element highlighting
  • Performance: Memory usage, CPU profiling, page load metrics
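Live console access needs somewhere to hold captured messages between agent requests without growing unbounded on noisy pages. A bounded buffer is one simple option. The sketch below assumes three log levels and a fixed capacity; `ConsoleLogBuffer` and `ConsoleEntry` are hypothetical names, and how entries are captured from the page is left out.

```typescript
type ConsoleLevel = "log" | "warn" | "error";

interface ConsoleEntry {
  level: ConsoleLevel;
  text: string;
  timestamp: number; // milliseconds since the epoch
}

// Keeps only the most recent `capacity` entries, discarding the oldest first.
class ConsoleLogBuffer {
  private entries: ConsoleEntry[] = [];
  private capacity: number;

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  // Stores one captured console message.
  record(level: ConsoleLevel, text: string, timestamp: number = Date.now()): void {
    this.entries.push({ level, text, timestamp });
    if (this.entries.length > this.capacity) {
      this.entries.shift();
    }
  }

  // Returns buffered entries in arrival order, optionally filtered by level.
  query(level?: ConsoleLevel): ConsoleEntry[] {
    return this.entries.filter(
      (entry) => level === undefined || entry.level === level
    );
  }
}
```

An agent asking only for errors would call `query("error")`, while `query()` returns everything still in the buffer.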

DOM & Data Extraction

  • Element Querying: querySelector(), findByText(), findByAttribute()
  • Data Extraction: getText(), getAttribute(), getHTML(), getTableData()
  • Page Analysis: getLinks(), getForms(), getImages(), getMetadata()
  • Screenshot: captureScreenshot(), captureElement(selector)
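As an illustration of what a call like `getTableData()` might return, the sketch below turns table rows into one object per data row, keyed by the header cells. It takes plain string arrays so it can run outside a browser; in a content script the rows would first be read from the table's cells. The function name `tableRowsToRecords` and the `columnN` fallback for empty headers are assumptions.

```typescript
// Converts table rows (first row = headers) into one object per data row.
function tableRowsToRecords(rows: string[][]): Record<string, string>[] {
  if (rows.length === 0) return [];
  const headerRow = rows[0];
  const dataRows = rows.slice(1);

  // Empty header cells get a positional name so no column is silently lost.
  const headers = headerRow.map(
    (header, index) => header.trim() || `column${index + 1}`
  );

  return dataRows.map((cells) => {
    const record: Record<string, string> = {};
    headers.forEach((header, index) => {
      // Short rows are padded with empty strings.
      record[header] = (cells[index] ?? "").trim();
    });
    return record;
  });
}
```

Returning keyed records rather than raw cell arrays gives the AI agent structured data it can reason about without having to infer column positions.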

AI-Friendly Features

  • Smart Element Detection: Find clickable elements, form fields, navigation menus
  • Content Understanding: Extract structured data, identify page sections
  • Error Recovery: Automatic retry logic, fallback selectors
  • Context Awareness: Track page state, navigation history, user sessions
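The error recovery idea (retry logic plus fallback selectors) can be sketched as a small helper that tries each selector in order and repeats the whole pass a limited number of times. This is an illustration under assumptions: `findWithFallbacks`, `ElementLookup`, and `maxAttempts` are hypothetical names, and the delay a real implementation would insert between attempts is omitted to keep the example synchronous.

```typescript
// A lookup returns the matched element handle, or null when nothing matches.
type ElementLookup<T> = (selector: string) => T | null;

interface SelectorMatch<T> {
  selector: string; // the selector that finally matched
  element: T;
  attempt: number; // 1-based pass on which the match happened
}

// Tries every selector in order, repeating the pass up to maxAttempts times.
function findWithFallbacks<T>(
  selectors: string[],
  lookup: ElementLookup<T>,
  maxAttempts: number = 3
): SelectorMatch<T> | null {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    for (const selector of selectors) {
      const element = lookup(selector);
      if (element !== null) {
        return { selector, element, attempt };
      }
    }
    // A real implementation would wait here before the next pass.
  }
  return null;
}
```

In a content script the lookup could simply wrap `document.querySelector`, with selectors listed from most to least specific. Reporting which selector matched lets the agent learn that its preferred selector has stopped working.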

🎯 Target Use Cases

Web Automation

  • Form Filling: Automatically fill out job applications, surveys, registrations
  • Data Extraction: Scrape product information, research data, contact details
  • Testing: Automated UI testing, regression testing, accessibility testing
  • Monitoring: Track website changes, price monitoring, availability checking

AI Agent Integration

  • Research Tasks: Navigate websites, extract information, compile reports
  • E-commerce: Product research, price comparison, order tracking
  • Social Media: Content posting, engagement tracking, audience analysis
  • Productivity: Calendar management, email automation, document processing

Development & Debugging

  • Performance Analysis: Page speed testing, resource optimization
  • Accessibility Auditing: WCAG compliance checking, screen reader testing
  • Cross-browser Testing: Compatibility verification, feature detection
  • API Testing: Frontend-backend integration testing

🚀 Getting Started

Prerequisites

  • Chrome browser (latest version)
  • Node.js 18+
  • MCP-compatible AI client (Claude Desktop, Cursor, etc.)

Installation

# Clone the repository
git clone https://github.com/your-username/browser-agent-mcp.git
cd browser-agent-mcp

# Install dependencies
npm install

# Build the extension
npm run build

# Load extension in Chrome
# 1. Open chrome://extensions/
# 2. Enable "Developer mode"
# 3. Click "Load unpacked" and select the dist/ folder

# Start the MCP server
npm run start

Configuration

{
  "mcpServers": {
    "browser-agent": {
      "command": "node",
      "args": ["path/to/browser-agent-mcp/server.js"]
    }
  }
}

🤝 Contributing

We welcome contributions! This project aims to be the definitive browser automation solution for AI agents.

Development Priorities

  1. Core automation features (navigation, clicking, typing)
  2. DevTools integration (console, network, elements)
  3. Reliability improvements (error handling, reconnection)
  4. AI-friendly APIs (smart element detection, context awareness)
  5. Performance optimization (efficient DOM queries, memory management)

Areas for Contribution

  • Extension Development: Chrome APIs, content scripts, background workers
  • MCP Server: Protocol implementation, command handling, response formatting
  • Testing: Automated testing, browser compatibility, edge case handling
  • Documentation: API docs, tutorials, example use cases

📋 Roadmap

Minimum Viable Product (MVP)

Core features needed for basic AI agent browser interaction:

  • [ ] DOM Read Access: Query elements, extract text/attributes, get page structure
  • [ ] Console Logs Read Access: Monitor JavaScript console output, errors, warnings
  • [ ] Network Logs Read Access: Track HTTP requests/responses, API calls, resource loading

Phase 1: Foundation (Weeks 1-2)

  • [ ] Basic Chrome extension structure (manifest, content scripts, background worker)
  • [ ] MCP server implementation (protocol handling, command routing)
  • [ ] MVP: DOM read access (querySelector, getText, getAttributes, getHTML)
  • [ ] MVP: Console logs monitoring (capture console.log, errors, warnings)
  • [ ] MVP: Network request tracking (monitor XHR, fetch, resource requests)

Phase 2: Core Automation (Weeks 3-4)

  • [ ] Navigation commands (navigate, back, forward, reload)
  • [ ] Element interaction (click, type, submit)
  • [ ] Basic waiting mechanisms (waitForElement, waitForNavigation)
  • [ ] Screenshot capabilities (captureScreenshot, captureElement)

Phase 3: Advanced Features (Weeks 5-6)

  • [ ] Form automation (fillForm, selectOption, uploadFile)
  • [ ] Multi-tab management (openTab, switchTab, closeTab)
  • [ ] Advanced DOM querying (findByText, findByAttribute, getTableData)
  • [ ] Drag & drop functionality (dragAndDrop)

Phase 4: AI Optimization (Weeks 7-8)

  • [ ] Smart element detection (find clickable elements, form fields, navigation menus)
  • [ ] Context-aware responses (track page state, navigation history)
  • [ ] Error recovery mechanisms (automatic retry logic, fallback selectors)
  • [ ] Performance optimization (efficient DOM queries, memory management)

Phase 5: Polish & Distribution (Weeks 9-10)

  • [ ] Comprehensive testing (automated testing, browser compatibility)
  • [ ] Documentation completion (API docs, tutorials, example use cases)
  • [ ] Chrome Web Store preparation (store listing, permissions review)
  • [ ] Community feedback integration (user testing, feature requests)

📄 License

MIT License - see LICENSE for details.

🙏 Acknowledgments

Inspired by and building upon:


Goal: Create the most comprehensive, reliable, and AI-friendly browser automation MCP that works seamlessly with your existing Chrome browser.