mcp-browser-agent

v0.2.1

Published

4 months ago

MCP server giving AI agents structured perception of web pages — semantic UI tree, actions, and linting via Playwright + CDP

Vulnerabilities: 0 High, 0 Medium, 0 Low

grg-akshay

mcp ai agent ui web accessibility testing playwright perception linting model-context-protocol

Installation

npm install mcp-browser-agent

Chromium is downloaded automatically by Playwright on first run. If you need to install it manually:

npx playwright install chromium

Setup

Claude Code

claude mcp add mcp-browser-agent -- npx mcp-browser-agent serve

Cursor

Add to ~/.cursor/mcp.json:

{
  "mcpServers": {
    "mcp-browser-agent": {
      "command": "npx",
      "args": ["mcp-browser-agent", "serve"]
    }
  }
}

Windsurf

Add to ~/.codeium/windsurf/mcp_config.json:

{
  "mcpServers": {
    "mcp-browser-agent": {
      "command": "npx",
      "args": ["mcp-browser-agent", "serve"]
    }
  }
}

VS Code (Copilot)

Add to .vscode/mcp.json in your project:

{
  "servers": {
    "mcp-browser-agent": {
      "command": "npx",
      "args": ["mcp-browser-agent", "serve"]
    }
  }
}

Other MCP Clients

The server uses stdio transport. Run npx mcp-browser-agent serve and connect via stdin/stdout.
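
As a rough illustration of what "connect via stdin/stdout" involves, the sketch below builds the newline-delimited JSON-RPC 2.0 messages that an MCP client writes to the server's stdin. It does not spawn the server or read responses. The client name and the `url` argument of `ui_connect` are assumptions made for illustration; the real input schema is reported by the server in response to a `tools/list` request.

```javascript
// Sketch only: frames JSON-RPC 2.0 messages for MCP's stdio transport,
// which carries one JSON message per line.

let nextId = 1;

/** Serialize a request (expects a response) as one line ending in "\n". */
function frameRequest(method, params) {
  return JSON.stringify({ jsonrpc: "2.0", id: nextId++, method, params }) + "\n";
}

/** Serialize a notification (no id, no response expected). */
function frameNotification(method, params) {
  return JSON.stringify({ jsonrpc: "2.0", method, params }) + "\n";
}

// 1. Handshake request sent first by the client.
const initialize = frameRequest("initialize", {
  protocolVersion: "2024-11-05",
  capabilities: {},
  clientInfo: { name: "example-client", version: "0.0.0" }, // hypothetical client
});

// 2. Sent after the server has answered the initialize request.
const initialized = frameNotification("notifications/initialized");

// 3. Tool call. The `url` argument name is an assumption, not a documented schema.
const connect = frameRequest("tools/call", {
  name: "ui_connect",
  arguments: { url: "https://example.com" },
});

// A real client would write these to the child process's stdin in order,
// waiting for each response on stdout before continuing.
process.stdout.write(initialize + initialized + connect);
```

In practice an MCP client library handles this framing, so hand-written messages are mainly useful for debugging.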

CLI (standalone)

No MCP client is needed; run the commands directly from the terminal:

npx mcp-browser-agent observe https://example.com
npx mcp-browser-agent lint https://example.com
npx mcp-browser-agent screenshot https://example.com
npx mcp-browser-agent act click 'role=button name~="Submit"' -u https://example.com

MCP Tools

| Tool | Description |
|------|-------------|
| ui_connect | Connect to a URL and start a browser session |
| ui_observe | Get the full semantic UI tree, ranked actions, and lint issues |
| ui_locate | Find elements by semantic query (e.g., role=button name~="Submit") |
| ui_act | Execute actions: click, type, select, scroll, focus, upload |
| ui_screenshot | Capture a screenshot of the current page |
| ui_lint | Run UI detectors/linters (overflow, contrast, target-size, etc.) |
| ui_navigate | Navigate to a different URL |
| ui_state | Get page state info and transition history |
| ui_watch | Start/stop continuous hot observation for DOM changes |

What Agents See

When an agent calls ui_observe, it receives:

Semantic UI tree — a pruned accessibility tree with layout, styles, and stable IDs
Ranked actions — interactive elements scored by visibility, size, and role
Issues — UI lint findings (overflow, contrast, target-size, focus-visible, heading-scale, text-truncation, misalignment, spacing, line-length, layout-shift)
State — page state fingerprint and transition history
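
To make "scored by visibility, size, and role" concrete, here is a toy ranking function. It is not the package's scoring code: the element shape (`role`, `visible`, `width`, `height`), the role weights, and the 44x44 px saturation point are all invented for illustration.

```javascript
// Toy sketch of ranking interactive elements. All weights are assumptions.

const ROLE_WEIGHTS = { button: 1.0, textbox: 0.9, link: 0.8 }; // invented values
const DEFAULT_ROLE_WEIGHT = 0.5;
const FULL_SIZE_AREA = 44 * 44; // area (px^2) at which the size score saturates

/** Score one element in [0, 1]; hidden elements always score 0. */
function scoreAction(element) {
  if (!element.visible) return 0;
  const area = element.width * element.height;
  const sizeScore = Math.min(area / FULL_SIZE_AREA, 1);
  const roleScore = ROLE_WEIGHTS[element.role] ?? DEFAULT_ROLE_WEIGHT;
  return sizeScore * roleScore;
}

/** Return a new array sorted from highest to lowest score. */
function rankActions(elements) {
  return elements
    .map((element) => ({ ...element, score: scoreAction(element) }))
    .sort((a, b) => b.score - a.score);
}

const ranked = rankActions([
  { role: "button", name: "Hidden", visible: false, width: 100, height: 40 },
  { role: "link", name: "Tiny", visible: true, width: 10, height: 10 },
  { role: "button", name: "Submit", visible: true, width: 100, height: 40 },
]);
console.log(ranked.map((element) => element.name));
```

Multiplying the factors means a hidden or zero-sized element can never outrank a visible one, whatever its role.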

Locate Query Language

Find elements using a semantic query syntax:

role=button name~="Submit"           # button with name containing "Submit"
role=link name="Sign in"             # exact name match
role=textbox in=form                 # textbox inside a form
role=heading level=2                 # h2 heading

Use as a Library

import { BrowserManager, observe, locate, act } from 'mcp-browser-agent';

const manager = new BrowserManager();
const session = await manager.connect('https://example.com');

// Observe the page
const result = await observe(session);
console.log(result.ui_tree);

// Find elements
const matches = locate(result.ui_tree, { role: 'button', namePattern: /Submit/ });

// Take actions
await act(session, { action: 'click', target: 'role=button name~="Submit"' });

Detectors

The package includes 10 built-in UI detectors:

| Detector | What it catches |
|----------|----------------|
| overflow | Content overflowing its container |
| contrast | Text with insufficient color contrast (WCAG AA) |
| target-size | Interactive elements below minimum tap/click size |
| focus-visible | Missing focus indicators on interactive elements |
| heading-scale | Skipped heading levels (e.g., h1 → h3) |
| text-truncation | Text being clipped or truncated |
| misalignment | Elements misaligned with their siblings |
| spacing | Padding/margins not following a consistent scale |
| line-length | Text lines exceeding recommended character count |
| layout-shift | Elements that may cause layout shift |
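
For a sense of what two of these checks involve, the sketch below implements the WCAG 2.x contrast-ratio formula (AA requires at least 4.5:1 for normal text and 3:1 for large text) and a simple skipped-heading check. This is an independent illustration of the underlying rules, not the package's detector code; the function names and input shapes are assumptions.

```javascript
// Sketch of the rules behind the contrast and heading-scale detectors.

/** Convert an 8-bit sRGB channel to linear light (WCAG 2.x definition). */
function channelToLinear(value8bit) {
  const c = value8bit / 255;
  return c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
}

/** Relative luminance of an [r, g, b] color with 0-255 channels. */
function relativeLuminance([r, g, b]) {
  return (
    0.2126 * channelToLinear(r) +
    0.7152 * channelToLinear(g) +
    0.0722 * channelToLinear(b)
  );
}

/** Contrast ratio between two colors, from 1 (identical) up to 21. */
function contrastRatio(foreground, background) {
  const a = relativeLuminance(foreground);
  const b = relativeLuminance(background);
  return (Math.max(a, b) + 0.05) / (Math.min(a, b) + 0.05);
}

/** WCAG AA thresholds: 4.5:1 for normal text, 3:1 for large text. */
function passesAA(foreground, background, isLargeText = false) {
  return contrastRatio(foreground, background) >= (isLargeText ? 3 : 4.5);
}

/** Report each place where a heading level jumps by more than one. */
function findSkippedHeadings(levels) {
  const issues = [];
  for (let i = 1; i < levels.length; i++) {
    if (levels[i] - levels[i - 1] > 1) {
      issues.push({ index: i, from: levels[i - 1], to: levels[i] });
    }
  }
  return issues;
}

console.log(passesAA([200, 200, 200], [255, 255, 255]));
console.log(findSkippedHeadings([1, 3, 4, 2, 4]));
```

Moving back up the hierarchy (for example, h4 followed by h2) is not treated as a skip here; only downward jumps of more than one level are reported.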

Requirements

Node.js >= 20

License

MIT
