@uditgoenka/better-testing
v0.3.25
AGPL-3.0 browser-testing MCP server with lifecycle-safe process management.
Better Testing
Give Claude Code, OpenAI Codex, Cursor, Gemini CLI, OpenCode, Kiro, Autohand, and other MCP clients a reliable browser-testing workflow.
"Install → Register MCP → Run tests. Zero required dependencies."
Playwright-powered browser automation. 41 MCP tools. 7 agents supported. Visual regression. Self-healing selectors. Memory profiling. Network mocking. XSS scanning. Session persistence. Turbo mode. Cookie management. npm audit. Annotated screenshots. Dev server mode with HMR detection.
How It Works · Commands · Quick Start · MCP Clients · Troubleshooting
Why This Exists
Browser-testing agents often fail for boring reasons: stale helper processes, bad MCP registration, slow startup, terminal shutdown races, or cleanup that misses child processes.
Better Testing focuses on the operational layer that keeps those sessions stable:
- Native stdio MCP server with zero production dependencies
- Playwright-powered browser automation with visual overlay
- Multi-browser support: Chromium, Firefox, WebKit
- Provider setup for Claude Code, Codex, Cursor, Gemini CLI, OpenCode, Kiro, Autohand
- Advanced performance: Core Web Vitals + INP + LoAF script attribution + resource breakdown
- Compact/interactive snapshot modes (60-80% token reduction for AI agents)
- Snapshot diffing with Myers algorithm (track page state changes)
- Session recording with Playwright traces
- Security audit (headers, mixed content, insecure forms, open redirects)
- Sensitive data filtering in console logs (API keys, tokens, passwords auto-masked)
- Stale process detection and cleanup
- Machine-readable diagnostics for automation
- Automatic browser cleanup on session disconnect
Better Testing vs Playwright
Better Testing is built on top of Playwright — it adds an MCP server layer, AI-agent integrations, and 41 specialized tools that raw Playwright doesn't provide out of the box.
| Capability | Better Testing | Playwright (raw) |
| --- | --- | --- |
| MCP server (stdio JSON-RPC) | Built-in, zero-config | Not available |
| AI agent support | Claude Code, Codex, Cursor, Gemini CLI, OpenCode, Kiro, Autohand | Manual integration required |
| Tools | 41 MCP tools callable by AI agents | Script-only API |
| Setup | npm install -g → init --agent all → done | Write test scripts, configure runner |
| Visual overlay | Cursor glow, click animations, action labels via Shadow DOM | None |
| Annotated screenshots | Arrows, circles, rects, text callouts, numbered markers | page.screenshot() only |
| Snapshot modes | Full, compact (-60% tokens), interactive (-80% tokens) | Full ARIA tree only |
| Snapshot diffing | Built-in Myers algorithm diff between snapshots | Manual comparison |
| Self-healing selectors | Fallback chain with ref IDs from ARIA tree | Fixed selectors |
| Visual regression | Pixel-diff with pixelmatch, threshold control | Requires toHaveScreenshot() config |
| Dev server mode (HMR) | Auto-detects Vite, Next.js, Webpack, Turbopack HMR events | Not available |
| Watch mode | Re-run BT tools on HMR events, self-terminating | Requires external file watcher |
| Persistent sessions | persistent: true — no idle timeout, survives MCP reconnects | Session per test file |
| Security audit | Headers, mixed content, insecure forms, open redirects, CSP analysis | Not included |
| XSS scanner | Non-destructive payload injection + detection | Not included |
| Form fuzzer | Boundary, XSS, SQLi, format fuzzing | Not included |
| npm audit | Built-in vulnerability scanning via MCP tool | Not included |
| Memory profiling | CDP HeapProfiler leak detection | Manual CDP scripting |
| Performance metrics | Core Web Vitals + INP + LoAF + Server Timing + resource breakdown | Basic page.metrics() |
| Network mocking | MCP tool for route interception + response override | page.route() API |
| HAR export | Filtered HAR with sensitive data masking, 5K entry cap | routeFromHAR() only |
| WebSocket monitoring | CDP frame capture with pattern matching | Not built-in |
| Cookie management | Import/export with domain validation, masking, sameSite normalization | context.addCookies() only |
| Session recording | Playwright traces via MCP tools (start/stop/save) | context.tracing API |
| CDP connection | Auto-discover running Chrome, retry with backoff, safe disconnect | connectOverCDP() only |
| System Chrome | channel: "chrome" for Keychain, profiles, extensions | channel option available |
| Turbo mode | Performance flags + analytics/tracker blocking | Manual arg configuration |
| Device emulation | Full Playwright device registry via MCP param | devices descriptor |
| Parallel sessions | Multi-context with session IDs, close_all, state isolation | Multi-context via API |
| Test codegen | Generate test scripts from recorded interactions | codegen CLI tool |
| Lighthouse | Built-in CLI integration via MCP tool | Separate tool |
| Stale process cleanup | Auto-detect and clean orphaned browser workers | Manual process management |
| Plugin conflict detection | Auto-scans for conflicting MCP plugins at startup | Not applicable |
| Sensitive data filtering | Auto-masks API keys, tokens, passwords in console logs | Not included |
| Diagnostics | doctor tool: registration, Node version, Playwright status, conflicts | Not included |
| CI integration | add github-action generates workflow YAML | Manual workflow setup |
| Multi-browser | Chromium, Firefox, WebKit via single MCP param | Chromium, Firefox, WebKit via API |
| Dependencies | Zero production deps (Playwright is optional peer) | Playwright is the dependency |
| License | AGPL-3.0 | Apache-2.0 |
TL;DR: Playwright is the engine. Better Testing is the cockpit — it wraps Playwright with an MCP server, 41 AI-callable tools, automatic lifecycle management, security scanning, visual regression, dev server integration, and zero-config agent setup.
How It Works
  INSTALL             REGISTER            RUN                 RECOVER
┌──────────┐        ┌──────────┐        ┌──────────┐        ┌──────────┐
│ npm      │───────▶│ Claude   │───────▶│ Native   │───────▶│ Doctor   │
│ package  │        │ Codex    │        │ MCP      │        │ Cleanup  │
│ 0 deps   │        │ Cursor   │        │ Server   │        │ Reports  │
│          │        │ Gemini   │        │ 41 tools │        │          │
│          │        │ OpenCode │        │          │        │          │
└──────────┘        └──────────┘        └──────────┘        └──────────┘
The MCP server exposes these tools:
Core Tools
| Tool | What it does |
| --- | --- |
| version | Returns the installed Better Testing version |
| doctor | Reports stale helpers and provider registration state |
| cleanup | Cleans stale helpers with Better Testing rules |
Browser Tools
Requires Playwright as a peer dependency (npm install playwright && npx playwright install chromium).
| Tool | What it does |
| --- | --- |
| open | Launch a browser (Chromium/Firefox/WebKit), connect via CDP to running Chrome, or use system Chrome via channel. Supports cloud dashboards (AWS, Azure, GCP, Vercel, etc.) |
| playwright | Execute Playwright JS code against the current page |
| screenshot | Capture a PNG screenshot with optional rich annotations (arrows, circles, rects, text callouts, markers) |
| snapshot | Get ARIA accessibility tree with ref IDs. Modes: full, compact (-60% tokens), interactive (-80% tokens) |
| console_logs | Get collected browser console messages (sensitive data auto-filtered) |
| network_requests | Get collected network requests with status codes |
| performance | Collect Core Web Vitals (FCP, LCP, CLS, INP), LoAF with script attribution, Server Timing, resource breakdown |
| accessibility | Run WCAG accessibility audit using axe-core |
| start_recording | Start recording a Playwright trace (screenshots + snapshots) |
| save_recording | Stop recording and save the trace as a .zip file |
| security_audit | Run security checks: headers, mixed content, insecure forms, open redirects |
| import_cookies | Import cookies from JSON array or file path into the browser context |
| export_cookies | Export cookies from the browser context, with optional domain filter and masking |
| test_suggestions | Crawl the current page and return categorized interactive elements for test planning |
| npm_audit | Run npm audit on a project directory and return structured vulnerability results |
| diff_snapshot | Take a new snapshot and diff against the previous one (Myers algorithm) |
| close | Close the browser session and clean up resources |
Visual Overlay
When running in headed mode, Better Testing injects a visual overlay into the page:
- Blue cursor with glow effect that moves smoothly to each target element
- Click animation with scale-down press and expanding ripple effect
- Pulsing highlights on targeted elements (bright-dull-bright breathing animation)
- Action labels showing what the agent is doing (Clicking, Typing, Hovering, etc.)
The overlay is injected via Shadow DOM and does not interfere with the page being tested.
Multi-Browser Support
The open tool accepts a browser parameter to choose the engine:
{ "url": "https://example.com", "browser": "firefox" }
Supported engines: chromium (default), firefox, webkit.
Install additional browsers:
npx playwright install firefox
npx playwright install webkit
Custom Timeouts
The open tool accepts a timeout parameter (seconds) for Playwright code execution:
{ "url": "https://example.com", "timeout": 60 }
Default execution timeout: 30 seconds. Idle session timeout: 120 seconds (configurable).
Session Recording
Record Playwright traces for debugging and replay:
- Open a browser session
- Call start_recording to begin capturing screenshots and DOM snapshots
- Interact with the page using playwright, snapshot, etc.
- Call save_recording to stop and save the trace
Traces save to .better-testing/traces/trace-{timestamp}.zip. View with:
npx playwright show-trace .better-testing/traces/trace-*.zip
Security Audit
The security_audit tool checks the current page for:
- Security headers: Content-Security-Policy, X-Frame-Options, X-Content-Type-Options, Strict-Transport-Security, Referrer-Policy, Permissions-Policy
- Mixed content: HTTP resources loaded on HTTPS pages
- Insecure forms: Forms posting to HTTP endpoints
- Open redirects: Links with redirect parameters (?redirect=, ?url=, ?next=)
Returns a structured report with pass/fail/warning status per check.
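The header and open-redirect checks listed above can be approximated in a few lines. This is a minimal sketch, not the tool's actual implementation: the helper names `checkSecurityHeaders` and `findRedirectLinks` are hypothetical, the report shape is an assumption, and only the pass/fail statuses are modeled (the warning status is left out).

```javascript
// Header names come from the list above, lower-cased because HTTP header
// names are case-insensitive.
const REQUIRED_HEADERS = [
  "content-security-policy",
  "x-frame-options",
  "x-content-type-options",
  "strict-transport-security",
  "referrer-policy",
  "permissions-policy",
];

// Hypothetical helper: one { check, status } entry per expected header.
function checkSecurityHeaders(headers) {
  const present = new Set(Object.keys(headers).map((name) => name.toLowerCase()));
  return REQUIRED_HEADERS.map((name) => ({
    check: name,
    status: present.has(name) ? "pass" : "fail",
  }));
}

// Query parameters named in the open-redirect bullet above.
const REDIRECT_PARAMS = ["redirect", "url", "next"];

// Hypothetical helper: returns the hrefs whose query string carries a
// redirect-style parameter. Relative hrefs are resolved against baseUrl.
function findRedirectLinks(hrefs, baseUrl) {
  return hrefs.filter((href) => {
    try {
      const params = new URL(href, baseUrl).searchParams;
      return REDIRECT_PARAMS.some((param) => params.has(param));
    } catch {
      return false; // unparseable hrefs are ignored in this sketch
    }
  });
}
```

A link flagged this way is only a candidate; confirming an open redirect still requires checking where the server actually sends the browser.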
Turbo Mode
Launch a speed-optimized browser session by passing turbo: true to the open tool:
{ "url": "https://example.com", "turbo": true }
Turbo mode applies Chromium performance flags, blocks analytics/tracking scripts (Google Analytics, Mixpanel, Segment, HubSpot, Hotjar, Facebook Pixel, etc.), and skips overlay animations. Blocked requests appear in network_requests with status: "blocked".
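To illustrate the kind of decision turbo mode makes per request, here is a rough sketch. The domain list is illustrative only and is not the list Better Testing ships; `isTrackerUrl` and `shouldBlock` are hypothetical names, and the request shape (`url`, `resourceType`) is an assumption modeled on what Playwright exposes for a request.

```javascript
// Illustrative tracker domains (assumption, not the project's real blocklist).
const TRACKER_DOMAINS = [
  "google-analytics.com",
  "mixpanel.com",
  "segment.com",
  "hotjar.com",
  "connect.facebook.net",
];

// True when the URL's host is a listed domain or one of its subdomains.
function isTrackerUrl(url) {
  const host = new URL(url).hostname;
  return TRACKER_DOMAINS.some((domain) => host === domain || host.endsWith("." + domain));
}

// Combines tracker blocking with an optional list of resource types,
// mirroring the turbo and blockResources options described in this section.
function shouldBlock(request, blockResources = []) {
  return isTrackerUrl(request.url) || blockResources.includes(request.resourceType);
}
```

Matching on the host suffix rather than a substring avoids blocking unrelated domains that merely contain a tracker's name.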
Additional resource blocking with blockResources:
{ "url": "https://example.com", "turbo": true, "blockResources": ["image", "media", "font"] }
CDP Connection (Cloud Dashboards)
Connect to an already-running Chrome instance via Chrome DevTools Protocol. Perfect for testing authenticated cloud dashboards (AWS, Azure, GCP, Vercel, Railway, Render, etc.):
{ "cdp": "http://127.0.0.1:9222", "url": "https://console.aws.amazon.com" }
Start Chrome with remote debugging first: chrome --remote-debugging-port=9222
Auto-discover a running Chrome instance:
{ "cdp": true }
Safe close: Better Testing does not kill the external Chrome when the session closes; it only disconnects. Your Chrome keeps running with all tabs intact.
Retry: CDP connections retry with exponential backoff (5 attempts) to handle Chrome startup races.
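The retry behavior can be pictured with a small sketch. The attempt count of 5 comes from the text above; the 250 ms base delay, the doubling factor, and the function names `backoffDelays` and `retryWithBackoff` are assumptions made for illustration, since the actual timings are not documented here.

```javascript
// Delays between consecutive attempts: baseMs, 2*baseMs, 4*baseMs, ...
// With N attempts there are N - 1 waits.
function backoffDelays(attempts, baseMs) {
  return Array.from({ length: attempts - 1 }, (_, i) => baseMs * 2 ** i);
}

// Calls connect() until it resolves or the attempts run out, sleeping
// between failures. The sleep function is injectable so callers can
// substitute a fake in tests.
async function retryWithBackoff(connect, options = {}) {
  const {
    attempts = 5,
    baseMs = 250, // assumed base delay
    sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
  } = options;
  const delays = backoffDelays(attempts, baseMs);
  let lastError;
  for (let i = 0; i < attempts; i++) {
    try {
      return await connect();
    } catch (err) {
      lastError = err;
      if (i < delays.length) await sleep(delays[i]);
    }
  }
  throw lastError;
}

console.log(backoffDelays(5, 250)); // [ 250, 500, 1000, 2000 ]
```

Under these assumed numbers the total wait before giving up is 3.75 seconds, which is usually enough to cover a Chrome instance that is still opening its debugging port.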
System Chrome (channel)
Launch system-installed Chrome or Edge instead of Playwright's bundled Chromium. Enables macOS Keychain access, user profiles, and extensions:
{ "url": "https://portal.azure.com", "channel": "chrome" }
Supported channels: chrome, chrome-canary, msedge, msedge-dev.
Cookie Tools
Import cookies from JSON or a file:
{ "cookies": [{ "name": "session", "value": "abc123", "domain": ".example.com", "path": "/" }] }
Export cookies with optional domain filter:
{ "domain": ".example.com", "raw": true }
By default, exported cookie values are masked using length-aware masking. Pass raw: true to see full values.
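The exact masking rule is not spelled out here, so the following is only one plausible reading of "length-aware": short values are hidden entirely, while longer values keep two characters at each end. The threshold of 8 and the names `maskValue` and `exportCookies` are assumptions for illustration.

```javascript
// Assumed rule: values of 8 characters or fewer are fully masked;
// longer values keep their first and last two characters.
function maskValue(value) {
  if (value.length <= 8) return "*".repeat(value.length);
  return value.slice(0, 2) + "*".repeat(value.length - 4) + value.slice(-2);
}

// Filters by exact domain when one is given and masks values unless raw is set.
function exportCookies(cookies, { domain, raw = false } = {}) {
  return cookies
    .filter((cookie) => !domain || cookie.domain === domain)
    .map((cookie) => (raw ? cookie : { ...cookie, value: maskValue(cookie.value) }));
}

console.log(maskValue("abc123"));     // ******
console.log(maskValue("0123456789")); // 01******89
```

Keeping the masked string the same length as the original lets a reader tell a truncated token from a complete one without exposing it.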
Annotated Screenshots
Take screenshots with rich visual annotations pointing out issues for developers. Supports arrows, circles, rectangles, text callouts, and numbered markers:
{
  "annotations": [
    { "type": "arrow", "target": { "selector": ".submit-btn" }, "label": "8px gap expected, 0px found" },
    { "type": "circle", "target": { "ref": 3 }, "label": "missing required attribute" },
    { "type": "rect", "target": { "selector": ".card" }, "label": "overflow hidden clips content" },
    { "type": "text", "target": { "x": 200, "y": 100 }, "label": "expected: blue, actual: gray" },
    { "type": "marker", "target": { "selector": "#logo" }, "label": "contrast ratio 2.1:1" }
  ]
}
Target elements by CSS selector, ref ID (from snapshot), or x/y coordinates. Annotations use a separate Shadow DOM host for CSS isolation. Invalid or hidden targets are skipped with detailed reasons (partial success). Screenshots saved to .better-testing/annotations/.
Annotations are red by default (distinct from blue ref labels). Override per annotation with color. Max 50 annotations per screenshot. Use annotate: true for simple auto-labels, or annotations: [...] for rich visual callouts.
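An agent or script can sanity-check an annotations payload before sending it. The sketch below encodes only the rules stated in this section (five annotation types, three target forms, at most 50 entries); `validateAnnotations` is a hypothetical client-side helper, not part of the package.

```javascript
const ANNOTATION_TYPES = new Set(["arrow", "circle", "rect", "text", "marker"]);
const MAX_ANNOTATIONS = 50;

// A target is usable if it has a CSS selector, an integer ref ID,
// or numeric x/y coordinates.
function hasValidTarget(target) {
  if (!target || typeof target !== "object") return false;
  if (typeof target.selector === "string" && target.selector !== "") return true;
  if (Number.isInteger(target.ref)) return true;
  return Number.isFinite(target.x) && Number.isFinite(target.y);
}

// Returns a list of human-readable problems; an empty list means the
// payload satisfies the documented constraints.
function validateAnnotations(annotations) {
  const errors = [];
  if (annotations.length > MAX_ANNOTATIONS) {
    errors.push(`too many annotations: ${annotations.length} > ${MAX_ANNOTATIONS}`);
  }
  annotations.forEach((annotation, index) => {
    if (!ANNOTATION_TYPES.has(annotation.type)) {
      errors.push(`annotation ${index}: unknown type "${annotation.type}"`);
    }
    if (!hasValidTarget(annotation.target)) {
      errors.push(`annotation ${index}: target needs selector, ref, or x/y`);
    }
  });
  return errors;
}
```

This only checks shape. Whether a selector or ref resolves to a visible element is still decided by the screenshot tool, which skips targets it cannot find.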
Test Suggestions
Crawl the current page and get categorized interactive elements for test planning:
{ "scope": "form", "categories": ["inputs", "buttons"] }
Returns structured element inventory (forms, links, buttons, inputs, images, headings, landmarks, custom elements) capped at 200 elements.
npm Audit
Run vulnerability scanning on any project directory:
{ "directory": "/path/to/project", "minLevel": "high" }
Returns structured results with vulnerability counts by severity level.
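The effect of minLevel can be sketched as a filter over per-severity counts. The severity names and their order follow npm's own levels; the input shape (an object of counts keyed by severity) and the helper name `filterBySeverity` are assumptions, since the tool's exact result format is not shown here.

```javascript
// npm's severity levels from least to most severe.
const SEVERITY_ORDER = ["info", "low", "moderate", "high", "critical"];

// Keeps only the counts at or above minLevel; missing levels count as 0.
function filterBySeverity(counts, minLevel) {
  const start = SEVERITY_ORDER.indexOf(minLevel);
  if (start === -1) throw new Error(`unknown severity level: ${minLevel}`);
  const kept = {};
  for (const level of SEVERITY_ORDER.slice(start)) {
    kept[level] = counts[level] ?? 0;
  }
  return kept;
}

console.log(filterBySeverity({ info: 1, low: 2, moderate: 3, high: 4, critical: 0 }, "high"));
// { high: 4, critical: 0 }
```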
Browser Lifecycle
- Browser auto-closes when the MCP session disconnects (stdin pipe closes)
- Idle timeout: browser closes after 120 seconds of inactivity (configurable via open timeout)
- Clean signal handling: SIGTERM/SIGINT properly terminate the browser process
Snapshot + Ref Workflow
1. Call `snapshot` to get the ARIA tree with [ref=N] IDs
2. Use `ref(N)` in `playwright` tool code to interact with elements
3. Example: await ref(1).click() // clicks element with ref=1
Commands
| Command | What it does |
| --- | --- |
| better-testing init --agent all | Register the skill and MCP server for local agents |
| better-testing init | Interactive setup wizard (choose agents, browser mode) |
| better-testing mcp | Start the native Better Testing stdio MCP server |
| better-testing doctor --mcp | Check provider registration and stale helpers |
| better-testing cleanup --dry-run | Preview stale helper cleanup |
| better-testing cleanup | Clean stale helper processes |
| better-testing agents --detect | Detect supported local agents |
| better-testing add github-action | Generate a GitHub Actions workflow for CI browser testing |
| better-testing classify-diff | Classify browser-visible diffs from stdin |
| better-testing skill-copy <src> <dest> | Copy the Better Testing skill into an agent skill folder |
| better-testing pr --title <title> | Create a pull request with the GitHub CLI |
The add github-action command accepts these flags:
better-testing add github-action --force            # Overwrite existing workflow
better-testing add github-action --dry-run          # Preview without writing
better-testing add github-action --node-version 22  # Set Node.js version
Quick Start
1. Install
npm install -g @uditgoenka/better-testing
Install Playwright for browser tools (Chromium by default):
npm install playwright && npx playwright install chromium
For multi-browser testing, install additional engines:
npx playwright install firefox webkit
Verify the global binary:
better-testing --version
Output should be:
0.3.25
2. Register Local Agents
better-testing init --agent all
This registers:
- Claude Code MCP
- Codex MCP
- Cursor MCP
- Gemini CLI MCP
- OpenCode MCP
- Kiro MCP
- Autohand MCP
- /better-testing skill files where supported
3. Check Setup
better-testing doctor --mcp
For JSON automation:
better-testing doctor --mcp --json
4. Use It
In Claude Code:
/better-testing
In Codex:
$better-testing
In any MCP client, configure the server command:
better-testing mcp
MCP Clients
Better Testing supports MCP clients that implement stdio transport.
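Over stdio, an MCP client writes one JSON-RPC message per line to the server's stdin and reads replies from its stdout. The sketch below shows only the message framing for the first request. The helper names are hypothetical, the request mirrors the initialize payload used in the Troubleshooting section, and the assumption that the server name sits under result.serverInfo follows the usual JSON-RPC response shape rather than anything verified against this package.

```javascript
// Builds a newline-terminated initialize request, the first message a
// client sends after spawning `better-testing mcp`.
function buildInitializeRequest(id) {
  const message = {
    jsonrpc: "2.0",
    id,
    method: "initialize",
    params: { protocolVersion: "2024-11-05", capabilities: {} },
  };
  return JSON.stringify(message) + "\n";
}

// Extracts the server name from one response line, or null if absent.
function readServerName(line) {
  const message = JSON.parse(line);
  return message.result?.serverInfo?.name ?? null;
}
```

A client would write the string from buildInitializeRequest to the child process and pass each stdout line to readServerName; a value of "better-testing" confirms that the right server answered.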
Claude Code
claude mcp add --scope user better-testing -- better-testing mcp
Codex
Add this to ~/.codex/config.toml:
[mcp_servers.better-testing]
command = "better-testing"
args = ["mcp"]
startup_timeout_sec = 20
Cursor
Add this to ~/.cursor/mcp.json:
{
  "mcpServers": {
    "better-testing": {
      "command": "better-testing",
      "args": ["mcp"]
    }
  }
}
Gemini CLI
Add this to ~/.gemini/settings.json:
{
  "mcpServers": {
    "better-testing": {
      "command": "better-testing",
      "args": ["mcp"]
    }
  }
}
OpenCode
Add this to your OpenCode MCP config:
{
  "mcpServers": {
    "better-testing": {
      "command": "better-testing",
      "args": ["mcp"]
    }
  }
}
Any stdio MCP Client
{
  "command": "better-testing",
  "args": ["mcp"]
}
Troubleshooting
MCP client times out
Run:
better-testing doctor --mcp --json
Then re-register:
better-testing init --agent all
Restart the client after registration.
MCP reconnect fails
Verify the server by sending an initialize request:
printf '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{}}}\n' | better-testing mcp
The response should include:
{
  "serverInfo": {
    "name": "better-testing"
  }
}
Stale helpers remain
Preview cleanup:
better-testing cleanup --dry-run
Run cleanup:
better-testing cleanup
Browser stays open after session
This should not happen with v0.2.0+. The browser auto-closes when the MCP pipe disconnects or after 120 seconds of inactivity. If it persists:
better-testing cleanup --includeBrowserWorkers
Development
npm test
npm run check
npm run pack:dry
Package rules:
- No required dependencies (Playwright, pixelmatch, pngjs are optional peer dependencies)
- Native MCP server by default
- AGPL-3.0 license
- Public package: @uditgoenka/better-testing
Roadmap
- ~~Add browser action tools inside the native MCP server~~ (shipped in v0.1.34)
- ~~Add visual overlay with cursor, click animations, and highlights~~ (shipped in v0.1.35)
- ~~Add multi-browser support: Chromium, Firefox, WebKit~~ (shipped in v0.1.39)
- ~~Add custom execution and idle timeouts~~ (shipped in v0.1.39)
- ~~Add session recording with Playwright traces~~ (shipped in v0.1.39)
- ~~Add security audit tool (headers, mixed content, forms, redirects)~~ (shipped in v0.1.39)
- ~~Add GitHub Actions workflow generator (bt add github-action)~~ (shipped in v0.1.39)
- ~~Add advanced performance metrics: INP, LoAF, Server Timing, resource breakdown~~ (shipped in v0.2.0)
- ~~Add snapshot diffing with Myers algorithm~~ (shipped in v0.2.0)
- ~~Add compact/interactive snapshot modes (60-80% token reduction)~~ (shipped in v0.2.0)
- ~~Add sensitive data filtering in console logs~~ (shipped in v0.2.0)
- ~~Add input validation (viewport, timeouts) and recording safety guards~~ (shipped in v0.2.0)
- ~~Add visual regression with pixel-diff (pixelmatch)~~ (shipped in v0.3.0)
- ~~Add network mocking and interception~~ (shipped in v0.3.0)
- ~~Add self-healing selectors with fallback chain~~ (shipped in v0.3.0)
- ~~Add auth session persistence (save/load cookies + storage)~~ (shipped in v0.3.0)
- ~~Add memory leak detection (CDP HeapProfiler, Chromium)~~ (shipped in v0.3.0)
- ~~Add test codegen from session interactions~~ (shipped in v0.3.0)
- ~~Add parallel browser sessions~~ (shipped in v0.3.0)
- ~~Add responsive breakpoint sweep~~ (shipped in v0.3.0)
- ~~Add XSS payload scanner (non-destructive)~~ (shipped in v0.3.0)
- ~~Add cloud browser connection (generic wsEndpoint)~~ (shipped in v0.3.0)
- ~~Add HAR export with sensitive data filtering~~ (shipped in v0.3.0)
- ~~Add device emulation (full Playwright registry)~~ (shipped in v0.3.0)
- ~~Add form fuzzer (boundary, XSS, SQLi, format)~~ (shipped in v0.3.0)
- ~~Add WebSocket monitoring (CDP frame capture)~~ (shipped in v0.3.0)
- ~~Add Lighthouse integration via CLI~~ (shipped in v0.3.0)
- ~~Add CDP connection (attach to running Chrome DevTools)~~ (shipped in v0.3.0)
- ~~Add annotated screenshots (numbered labels on interactive elements)~~ (shipped in v0.3.0)
- ~~Add browser border glow effect (pulsing blue while active)~~ (shipped in v0.3.0)
- ~~Change license to AGPL-3.0~~ (shipped in v0.3.0)
- ~~Add cookie import/export tools~~ (shipped in v0.3.12)
- ~~Add test suggestions tool (DOM element inventory)~~ (shipped in v0.3.12)
- ~~Add npm audit tool (vulnerability scanning)~~ (shipped in v0.3.12)
- ~~Add turbo mode (performance flags + analytics blocking)~~ (shipped in v0.3.12)
- ~~Add enhanced security audit (cookie attributes + CSP analysis)~~ (shipped in v0.3.12)
- ~~Add /bt skill shortcut~~ (shipped in v0.3.12)
- ~~Add rich annotated screenshots (arrows, circles, rects, text, markers)~~ (shipped in v0.3.13)
- ~~Add stale ref enriched error messages~~ (shipped in v0.3.13)
- ~~Fix CDP close killing external Chrome (graceful disconnect)~~ (shipped in v0.3.14)
- ~~Add CDP connection retry with exponential backoff~~ (shipped in v0.3.14)
- ~~Add system Chrome channel support (chrome, msedge, chrome-canary)~~ (shipped in v0.3.14)
- ~~Make url optional for CDP/WS connections~~ (shipped in v0.3.14)
- ~~Add auto-restart Chrome for CDP when running without --remote-debugging-port~~ (shipped in v0.3.15)
- ~~Fix cookie import compatibility (sameSite normalization, Chrome field stripping)~~ (shipped in v0.3.16)
- ~~Fix critical: session-store unsanitized cookies, lighthouse category injection~~ (shipped in v0.3.16)
- ~~Fix high: XSS detection, memory profiler, codegen escaping, viewport null crash, WS race, enum mismatch~~ (shipped in v0.3.16)
- ~~Fix MCP transport crash (process guards, async dispatch safety, idle timer race)~~ (shipped in v0.3.17)
- ~~Default to headless browser mode for MCP environments~~ (shipped in v0.3.18)
- ~~Add persistent browser sessions (no idle timeout)~~ (shipped in v0.3.23)
- ~~Add dev server mode with HMR event detection (Vite, Next.js, Webpack, Turbopack)~~ (shipped in v0.3.23)
- ~~Add dev_status tool (HMR event queue + long-poll)~~ (shipped in v0.3.23)
- ~~Add watch tool (run BT tools on HMR events, self-terminating)~~ (shipped in v0.3.23)
- ~~Add watch_cancel tool~~ (shipped in v0.3.23)
- ~~Add "Better Testing vs Playwright" comparison table to README~~ (shipped in v0.3.24)
- Add dual-engine accessibility (IBM Equal Access + axe-core)
- Add Windows process-tree cleanup parity
- Add HTTP MCP server transport mode
- Add TypeScript SDK for programmatic use
License
Better Testing is open source software released under the GNU Affero General Public License v3.0 (AGPL-3.0).
