mcprune
v0.1.0
Published
MCP middleware that prunes Playwright accessibility snapshots for LLM agents. Zero ML, 75-95% token reduction, all refs preserved.
Maintainers
Readme
┌─┐
│m│cprune
└─┘Your AI agent doesn't need 100K tokens to click a button. Prune Playwright snapshots down to what matters. Zero ML, 75-95% reduction.
mcprune is an MCP middleware that sits between your AI agent and Playwright MCP. It intercepts every browser snapshot and prunes it down to only what the agent needs: interactive elements, prices, headings, and refs to click.
Agent <-> mcprune (proxy) <-> Playwright MCP <-> Browser
|
prune() + summarize()
75-95% token reductionBefore / After
Amazon search page — raw Playwright snapshot: ~100,000 tokens. Includes every pixel-level wrapper, tracking URL, sidebar filter, energy label, legal footer, and duplicated link.
After mcprune: ~14,000 tokens. Product titles, prices, ratings, color options, "Add to basket" buttons, and clickable refs. Everything an agent needs to shop.
Amazon product page: ~28,000 tokens -> ~3,300 tokens (88% reduction). Full buy flow preserved.
Install
npm install -g mcpruneThat's it. This installs mcprune, Playwright MCP, and downloads Chromium (~400MB) automatically.
Quick start
Interactive setup
npx mcpruneShows an interactive menu to configure your MCP client.
Add to your MCP client
Add to your Claude Desktop, Cursor, or any MCP client config:
{
"mcpServers": {
"browser": {
"command": "npx",
"args": ["mcprune"]
}
}
}For Claude Code:
claude mcp add browser -- npx mcpruneThe agent gets all Playwright browser tools (browser_navigate, browser_click, browser_type, browser_snapshot, etc.) with automatic pruning on every response.
Options
--headless— run browser without visible window--mode act|browse|navigate|full|auto— pruning mode (default:auto)
Example with options:
{
"mcpServers": {
"browser": {
"command": "npx",
"args": ["mcprune", "--headless"]
}
}
}CLI commands
npx mcprune # Interactive setup
npx mcprune --help # Show usage + lifetime token savingsAs a library
import { prune, summarize } from 'mcprune';
const snapshot = await page.locator('body').ariaSnapshot();
const pruned = prune(snapshot, {
mode: 'act',
context: 'iPhone 15 price' // optional: keywords for relevance filtering
});
const summary = summarize(snapshot);
// -> "Apple iPhone 15 (128GB) - Black | pick color(5), set quantity, add to basket, buy now, 91 links"How it works
A 9-step rule-based pipeline. No ML, no embeddings, no API calls.
| Step | What | Why |
|------|------|-----|
| 1. Extract regions | Keep landmarks matching the mode (act -> main only) | Drop banner, footer, sidebar in action mode |
| 2. Prune nodes | Drop paragraphs, images, descriptions. Keep interactive elements, prices, short labels | Core reduction — 50-60% happens here |
| 3. Collapse wrappers | generic > generic > button "Buy" -> button "Buy" | Playwright trees are deeply nested |
| 4. Clean up | Trim combobox options, drop orphaned headings | A 50-option dropdown -> just the combobox name |
| 5. Dedup links | One link per unique text per product card | Amazon cards have 3+ links to the same product |
| 6. Drop noise | Energy labels, product sheets, ad feedback, "view options" | These repeat 10-30x per search page |
| 7. Truncate footer | Everything after "back to top" is noise | Corporate links, legal text, subsidiaries |
| 8. Drop filters | Sidebar refinement panels | 20+ collapsible filter groups on Amazon |
| 9. Serialize | Back to YAML, strip URLs, clean tracking params | URLs were 62% of output — agents click by ref |
Auto mode detection
mcprune automatically detects the right pruning mode per snapshot:
- URL patterns: Known sites (Amazon -> act, MDN -> browse)
- Content analysis: Paragraph/interactive ratio, price patterns, code blocks
- Fallback: Defaults to act mode when uncertain
Context-aware pruning
When the agent types a search query, mcprune captures it as context. Product cards that don't match any keywords are collapsed to just their title, while matching products keep full details.
Agent types "iPhone 15" in search box
-> mcprune captures context: ["iphone", "15"]
-> Matching cards: full price, rating, colors, buttons
-> Non-matching cards: title onlyPruning modes
The mode controls only how mcprune prunes the snapshot. Playwright MCP executes all browser actions identically regardless of mode.
| Mode | Regions kept | Pipeline | Use case |
|------|-------------|----------|----------|
| act | main only | All 9 steps | Shopping, forms, taking actions |
| browse | main only | Steps 1-4 + 9 (skip e-commerce noise removal) | Docs, articles, reading content |
| navigate | main + banner + nav + search | All 9 steps | Site exploration |
| full | All landmarks | All 9 steps | Debugging, full page view |
| auto | Auto-detected | Varies | Default — picks act or browse per snapshot |
Browse mode preserves paragraphs, code blocks, term/definition pairs, inline links, all headings, and figure captions — content that act mode drops because agents taking actions don't need article text.
Performance
Tested live via MCP proxy:
| Page | Raw | Pruned | Reduction | |------|-----|--------|-----------| | Amazon NL search (30 products) | ~100K tokens | ~14K tokens | 85.8% | | Amazon NL product page | ~28K tokens | ~3.3K tokens | 88.0% | | Wikipedia article (browse) | ~54K tokens | ~8.6K tokens | 84.0% | | MDN docs (browse) | ~10K tokens | ~5.5K tokens | ~43% | | Python docs (browse) | ~22K tokens | ~17K tokens | ~23% | | Amazon product (fixture) | ~1.2K tokens | ~289 tokens | 76.5% |
All refs ([ref=eN]) are preserved. The agent can click, type, and interact with every element in the pruned output.
Token savings tracking
mcprune tracks cumulative token savings in ~/.mcprune/stats.json. View with:
npx mcprune --helpTest
npm test # 161 testsProject structure
mcprune/
mcp-server.js MCP proxy entry point — TTY/help routing, spawns Playwright MCP
postinstall.js Downloads Chromium on npm install
src/
prune.js 9-step pruning pipeline + summarize(), mode-aware filtering
parse.js Playwright ariaSnapshot YAML -> tree
serialize.js Tree -> YAML, URL cleaning
roles.js ARIA role taxonomy (LANDMARKS, INTERACTIVE, STRUCTURAL, ...)
proxy-utils.js Extracted proxy logic (snapshot detection, context, mode detection)
cli.js Interactive TTY menu (install/uninstall)
stats.js Token savings tracking (~/.mcprune/stats.json)
test/
cli.test.js 4 CLI tests
stats.test.js 12 stats tests
parse.test.js 8 parser tests
prune.test.js 12 prune + summarize tests
proxy.test.js 24 proxy utility tests
edge-cases.test.js 77 edge case + browse mode + regression tests
fixtures/ 9 real-world page snapshots (e-commerce, docs, forums, gov)
scripts/ Dev tools for capturing live snapshots
blueprint.md Detailed technical documentation
docs/ Structured project documentationLicense
MIT
