better-playwright-mcp3
v3.2.0
Published
Better Playwright MCP v3 - High-performance browser automation with intelligent content search
Maintainers
Readme
better-playwright-mcp3
A high-performance Playwright MCP (Model Context Protocol) server with intelligent content search capabilities for browser automation.
Features
- 🎭 Full Playwright browser automation via MCP
- 🏗️ Client-server architecture with HTTP API
- 📍 Ref-based element identification system (
[ref=e1],[ref=e2], etc.) - 🔍 Powerful regex-based content search using ripgrep
- 💾 Persistent browser profiles with Chrome
- 🚀 Optimized for AI assistant integration with minimal token usage
- 📄 Semantic HTML snapshots using Playwright's internal APIs
- ⚡ High-performance search with 100-line safety limit
Installation
Global Installation (for CLI usage)
npm install -g better-playwright-mcp3Local Installation (for SDK usage)
npm install better-playwright-mcp3Usage
As a JavaScript/TypeScript SDK
Prerequisites:
First, start the HTTP server:
npx better-playwright-mcp3@latest serverThen use the SDK in your code:
import { PlaywrightClient } from 'better-playwright-mcp3';
async function automateWebPage() {
// Connect to the HTTP server (must be running)
const client = new PlaywrightClient('http://localhost:3102');
// Create a page
const { pageId, success } = await client.createPage(
'my-page', // page name
'Test page', // description
'https://example.com' // URL
);
// Search for specific content (regex by default)
const searchResult = await client.searchSnapshot(pageId, 'Example', { ignoreCase: true });
console.log(searchResult);
// Returns matching lines with match count and truncation status
// Search with regular expressions (default behavior)
const prices = await client.searchSnapshot(pageId, '\\$[0-9]+\\.\\d{2}', { lineLimit: 10 });
// Search multiple patterns (OR)
const links = await client.searchSnapshot(pageId, 'link|button|input', { ignoreCase: true });
// Interact with the page using ref identifiers
await client.browserClick(pageId, 'e3'); // Click element
await client.browserType(pageId, 'e4', 'Hello World'); // Type text
await client.browserHover(pageId, 'e2'); // Hover over element
// Navigation
await client.browserNavigate(pageId, 'https://google.com');
await client.browserNavigateBack(pageId);
await client.browserNavigateForward(pageId);
// Scrolling
await client.scrollToBottom(pageId);
await client.scrollToTop(pageId);
// Waiting
await client.waitForTimeout(pageId, 2000); // Wait 2 seconds
await client.waitForSelector(pageId, 'body');
// Take screenshots
const screenshot = await client.screenshot(pageId, true); // Full page
// Clean up
await client.closePage(pageId);
}Available Methods:
- Page Management:
createPage,closePage,listPages - Navigation:
browserNavigate,browserNavigateBack,browserNavigateForward - Interaction:
browserClick,browserType,browserHover,browserSelectOption - Advanced Actions:
browserPressKey,browserFileUpload,browserHandleDialog - Page Structure:
getOutline- Get page structure with intelligent folding, limited to 100 lines (NEW in v3.1.0) - Content Search:
searchSnapshot- Search page content with regex patterns (powered by ripgrep) - Screenshots:
screenshot- Capture page as image - Scrolling:
scrollToBottom,scrollToTop - Waiting:
waitForTimeout,waitForSelector
MCP Server Mode
The MCP server requires an HTTP server to be running. You need to start both:
Step 1: Start the HTTP server
npx better-playwright-mcp3@latest serverStep 2: In another terminal, start the MCP server
npx better-playwright-mcp3@latestThe MCP server will:
- Start listening on stdio for MCP protocol messages
- Connect to the HTTP server on port 3102
- Route browser automation commands through the HTTP server
Standalone HTTP Server Mode
You can run the HTTP server independently:
npx better-playwright-mcp3@latest serverOptions:
-p, --port <number>- Server port (default: 3102)--host <string>- Server host (default: localhost)--headless- Run browser in headless mode--chromium- Use Chromium instead of Chrome--no-user-profile- Do not use persistent user profile--user-data-dir <path>- User data directory
MCP Tools
When used with AI assistants, the following tools are available:
Page Management
createPage- Create a new browser page with name and descriptionclosePage- Close a specific pagelistPages- List all managed pages with titles and URLs
Browser Actions
browserClick- Click an element using its ref identifierbrowserType- Type text into an elementbrowserHover- Hover over an elementbrowserSelectOption- Select options in a dropdownbrowserPressKey- Press keyboard keysbrowserFileUpload- Upload files to file inputbrowserHandleDialog- Handle browser dialogs (alert, confirm, prompt)browserNavigate- Navigate to a URLbrowserNavigateBack- Go back to previous pagebrowserNavigateForward- Go forward to next pagescrollToBottom- Scroll to bottom of page/elementscrollToTop- Scroll to top of page/elementwaitForTimeout- Wait for specified millisecondswaitForSelector- Wait for element to appear
Content Search & Screenshots
searchSnapshot- Search page content using regex patterns (powered by ripgrep)screenshot- Take a screenshot (PNG/JPEG)
Architecture
This project implements a two-tier architecture optimized for minimal token usage:
- MCP Server - Communicates with AI assistants via Model Context Protocol
- HTTP Server - Controls browser instances and provides grep-based search
AI Assistant <--[MCP Protocol]--> MCP Server <--[HTTP]--> HTTP Server <---> Browser
|
v
ripgrep engineKey Design Principles
- Minimal Token Usage: Operations return only success status, not full page content
- On-Demand Search: Content is retrieved via regex patterns when needed
- Performance: Uses ripgrep for 10x+ faster searching than JavaScript implementations
- Safety: Hard limit of 100 lines per search to prevent context overflow
Ref-Based Element System
Elements in snapshots are identified using ref attributes (e.g., [ref=e1], [ref=e2]). This system:
- Provides stable identifiers for elements
- Works with Playwright's internal
aria-refselectors - Enables precise element targeting across page changes
Example snapshot:
- generic [ref=e2]:
- heading "Example Domain" [level=1] [ref=e3]
- paragraph [ref=e4]: This domain is for use in illustrative examples
- link "More information..." [ref=e5] [cursor=pointer]Examples
Creating and Navigating Pages
// Create a page
const { pageId, success } = await client.createPage(
'shopping',
'Amazon shopping page',
'https://amazon.com'
);
// Navigate to another URL
await client.browserNavigate(pageId, 'https://google.com');
// Go back/forward
await client.browserNavigateBack(pageId);
await client.browserNavigateForward(pageId);Getting Page Structure (NEW in v3.1.0)
// Get page outline with intelligent folding (limited to 100 lines)
const outline = await client.getOutline(pageId);
console.log(outline);
// Returns structured outline with first element as sample:
// - article [ref=e31]:
// - heading "Product Title" [ref=e32]
// - generic "$99" [ref=e33]
// - article (... and 23 more similar) [ref=e34-e57]Searching Content
// Search for text (case insensitive)
const results = await client.searchSnapshot(pageId, 'product', { ignoreCase: true });
// Search with regular expression (default behavior)
const emails = await client.searchSnapshot(pageId, '[a-zA-Z0-9]+@[a-zA-Z0-9]+\\.[a-z]+');
// Search multiple patterns (OR)
const buttons = await client.searchSnapshot(pageId, 'button|submit|click', { ignoreCase: true });
// Search for prices with dollar sign
const prices = await client.searchSnapshot(pageId, '\\$\\d+\\.\\d{2}');
// Limit number of result lines
const firstTen = await client.searchSnapshot(pageId, 'item', { lineLimit: 10 });Search Options:
pattern(required) - Regex pattern to search forignoreCase(optional) - Case insensitive search (default: false)lineLimit(optional) - Maximum lines to return (default: 100, max: 100)
Response Format:
result- Matched text contentmatchCount- Total number of matches foundtruncated- Whether results were truncated due to line limit
Note: All patterns are treated as regular expressions by default. Results are limited to 100 lines maximum to prevent excessive data retrieval.
Interacting with Elements
// Click on element using its ref identifier
await client.browserClick(pageId, 'e3');
// Type text into input field
await client.browserType(pageId, 'e4', 'search query');
// Hover over element
await client.browserHover(pageId, 'e2');
// Press keyboard key
await client.browserPressKey(pageId, 'Enter');Scrolling and Waiting
// Scroll page
await client.scrollToBottom(pageId);
await client.scrollToTop(pageId);
// Wait operations
await client.waitForTimeout(pageId, 2000); // Wait 2 seconds
await client.waitForSelector(pageId, '#my-element');Best Practices for AI Assistants
Recommended Workflow: Outline First, Then Precise Actions
When using this library with AI assistants, follow this optimized workflow for maximum efficiency:
1. Start with Page Outline (Always First Step)
// Always begin by getting the page structure overview
const outline = await client.getOutline(pageId);
// Returns a 100-line intelligently compressed view of the pageThe outline provides:
- Complete page structure with intelligent folding
- First element of each group preserved as a sample
- All ref identifiers for precise element targeting
- Clear indication of repetitive patterns (e.g., "... and 47 more similar products")
2. Use Outline to Guide Precise Searches
// Based on outline understanding, perform targeted searches
const searchResults = await client.searchSnapshot(pageId, 'specific term', {
ignoreCase: true,
lineLimit: 10
});
// Now you know exactly what to search for and where it might be3. Take Actions with Verified Ref IDs
// Use ref IDs discovered from outline or grep, not guesswork
await client.browserClick(pageId, 'e42'); // Ref ID confirmed from outlineWhy This Approach?
Token Efficiency: 100-line outline + targeted grep searches use far fewer tokens than full snapshots (often 3000+ lines)
Accuracy: The outline shows actual page structure, preventing incorrect assumptions about element locations
Smart Compression: The folding algorithm preserves the first element of each group as a sample, so AI understands the pattern without seeing all repetitions
Anti-Patterns to Avoid
❌ Don't blindly try random ref IDs (e.g., trying e1, e2, e3 without verification) ❌ Don't request full snapshots that exceed token limits ❌ Don't make assumptions about page structure without checking the outline first ❌ Don't use generic search patterns when specific ones would be more efficient
Example: Searching Amazon Products
// GOOD: Outline-first approach
const outline = await client.getOutline(pageId);
// Outline shows: "- listitem [ref=e234]: [first product details]"
// "- listitem (... and 47 more similar) [refs: e235, e236, ...]"
// Now search for specific product attribute
const prices = await client.searchSnapshot(pageId, '\\$\\d+\\.\\d{2}', { lineLimit: 10 });
// BAD: Blind searching without context
const results = await client.searchSnapshot(pageId, 'product', { ignoreCase: true }); // Too generic
await client.browserClick(pageId, 'e1'); // Guessing ref IDsThis design enables AI assistants to efficiently navigate and automate complex web pages within their context window limitations.
Development
Prerequisites
- Node.js >= 18.0.0
- TypeScript
- Chrome or Chromium browser
Building from Source
# Clone the repository
git clone https://github.com/yourusername/better-playwright-mcp.git
cd better-playwright-mcp
# Install dependencies
npm install
# Build the project
npm run build
# Run in development mode
npm run devProject Structure
better-playwright-mcp3/
├── src/
│ ├── index.ts # Main export file
│ ├── mcp-server.ts # MCP server implementation
│ ├── client/
│ │ └── playwright-client.ts # HTTP client for browser automation
│ └── server/
│ └── playwright-server.ts # HTTP server controlling browsers
├── bin/
│ └── cli.js # CLI entry point
├── package.json
├── tsconfig.json
└── README.mdTroubleshooting
Common Issues
Port already in use
- Change the port using
-pflag:npx better-playwright-mcp2 server -p 3103 - Or set environment variable:
PORT=3103 npx better-playwright-mcp2 server
- Change the port using
Browser not launching
- Ensure Chrome or Chromium is installed
- Try using
--chromiumflag for Chromium - Check system resources
Element not found
- Verify the ref identifier exists
- Use
searchSnapshot()to search for elements - Wait for elements using
waitForSelector()
Search returns too many results
- Use more specific patterns
- Use
lineLimitoption to limit results - Leverage regex features for precise matching
Debug Mode
Enable detailed logging:
DEBUG=* npx better-playwright-mcp2Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
License
MIT
