barebrowse
v0.5.3
Authenticated web browsing for autonomous agents via CDP. URL in, pruned ARIA snapshot out.
~~~~~~~~~~~~~~~~~~~~
~~~ .---------. ~~~
~~~ | · clear | ~~~
~~~ | · focus | ~~~
~~~ '---------' ~~~
~~~~~~~~~~~~~~~~~~~~
barebrowse
Your agent browses like you do -- same browser, same logins, same cookies. Prunes pages down to what matters. 40-90% fewer tokens, zero wasted context.
What this is
barebrowse gives your AI agent a real browser. Navigate, read, interact, move on.
It uses the browser you already have -- your sessions, your cookies. Pages come back stripped to what matters -- 40-90% fewer tokens than raw output.
No Playwright. Zero dependencies. No bundled browser. No 200MB download.
Install
npm install barebrowse
Requires Node.js >= 22 and any installed Chromium-based browser.
Three ways to use it
1. CLI session -- for coding agents and quick testing
barebrowse open https://example.com # Start session + navigate
barebrowse snapshot # ARIA snapshot → .barebrowse/page-*.yml
barebrowse click 8 # Click element
barebrowse close # End session
Outputs go to .barebrowse/ as files -- agents read them with their file tools, no token waste in tool responses.
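Because snapshots land on disk rather than in tool responses, an agent harness has to locate the newest file itself. The following is a minimal sketch, assuming snapshot names match the page-*.yml pattern shown above and that modification time identifies the latest one. The helper `latestSnapshot`, the entry shape, and the sample file names are hypothetical and not part of barebrowse.

```javascript
// Hypothetical helper, not part of barebrowse: pick the newest snapshot file.
// `entries` is a list of { name, mtimeMs } objects, for example built from
// fs.readdirSync(".barebrowse") combined with fs.statSync(path).mtimeMs.
function latestSnapshot(entries) {
  const snapshots = entries.filter((e) => /^page-.*\.yml$/.test(e.name));
  if (snapshots.length === 0) return null;
  // The entry with the largest modification time wins.
  return snapshots.reduce((a, b) => (b.mtimeMs > a.mtimeMs ? b : a)).name;
}

// Made-up directory listing for illustration.
const example = [
  { name: "page-a.yml", mtimeMs: 1000 },
  { name: "page-b.yml", mtimeMs: 2000 },
  { name: "notes.txt", mtimeMs: 3000 },
];
console.log(latestSnapshot(example)); // page-b.yml
```

Keeping the selection logic free of file-system calls makes it easy to test; the caller decides how the directory listing is produced.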
Teach your agent the commands by installing the skill file (a markdown reference the agent reads as context). The CLI tool itself still needs npm install barebrowse -- the skill just teaches the agent how to use it.
Claude Code: Copy commands/barebrowse/SKILL.md to .claude/skills/barebrowse/SKILL.md (project) or run barebrowse install --skill (global).
Other agents: Copy commands/barebrowse.md to your agent's command/skill directory.
For writing your own skill files for other CLI tools: docs/skill-template.md.
2. MCP server -- for Claude Desktop, Cursor, and other MCP clients
Claude Code:
claude mcp add barebrowse -- npx barebrowse mcp
Claude Desktop / Cursor:
npx barebrowse install
Or manually add to your config (claude_desktop_config.json, .cursor/mcp.json):
{
  "mcpServers": {
    "barebrowse": {
      "command": "npx",
      "args": ["barebrowse", "mcp"]
    }
  }
}
VS Code (.vscode/mcp.json):
{
  "servers": {
    "barebrowse": {
      "command": "npx",
      "args": ["barebrowse", "mcp"]
    }
  }
}
12 tools: browse, goto, snapshot, click, type, press, scroll, back, forward, drag, upload, pdf. Plus assess (privacy scan) if wearehere is installed. Session runs in hybrid mode with automatic cookie injection.
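When a config file already lists other MCP servers, the barebrowse entry has to be merged in rather than pasted over the file. This is a rough sketch of that merge, using the same command and args as the manual configs above. The function name `withBarebrowse` and the sample `other` server are hypothetical.

```javascript
// Hypothetical helper: add the barebrowse entry to an MCP client config object
// without disturbing servers that are already registered.
// `rootKey` is "mcpServers" for Claude Desktop / Cursor and "servers" for VS Code.
function withBarebrowse(config, rootKey = "mcpServers") {
  const entry = { command: "npx", args: ["barebrowse", "mcp"] };
  return {
    ...config,
    [rootKey]: { ...(config[rootKey] || {}), barebrowse: entry },
  };
}

// Made-up existing config with one unrelated server.
const existing = { mcpServers: { other: { command: "node", args: ["other.js"] } } };
const merged = withBarebrowse(existing);
console.log(JSON.stringify(merged, null, 2));
```

The function returns a new object and leaves its input untouched, so the caller can compare the two before writing anything back to disk.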
3. Library -- for agentic automation
Import barebrowse in your agent code. One-shot reads, interactive sessions, full observe-think-act loops. Works with any LLM orchestration library. Ships with a ready-made adapter for bareagent (17 tools, auto-snapshot after every action).
For code examples, API reference, and wiring instructions, see barebrowse.context.md -- the full integration guide.
Three modes
| Mode | What happens | Best for |
|------|-------------|----------|
| Headless (default) | Launches a fresh Chromium, no UI | Fast automation, scraping, reading pages |
| Headed | Connects to your running browser on CDP port | Bot-detected sites, visual debugging, CAPTCHAs |
| Hybrid | Tries headless first, falls back to headed if blocked | General-purpose agent browsing |
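The hybrid row describes a headless-first strategy with a headed fallback. The sketch below illustrates that control flow only. The callbacks `fetchHeadless` and `fetchHeaded`, the `looksBlocked` heuristic, and its thresholds are all assumptions made for this example; barebrowse's actual detection logic is not shown in this README, and real browser calls would be asynchronous.

```javascript
// Assumed heuristic: treat near-empty output or a challenge phrase as "blocked".
function looksBlocked(snapshot) {
  const text = snapshot.trim();
  return text.length < 20 || /verify you are human|access denied/i.test(text);
}

// Try headless first; fall back to headed only when the result looks blocked.
// The two fetch functions are injected stand-ins, kept synchronous for brevity.
function browseHybrid(url, { fetchHeadless, fetchHeaded }) {
  const first = fetchHeadless(url);
  if (!looksBlocked(first)) return { mode: "headless", snapshot: first };
  return { mode: "headed", snapshot: fetchHeaded(url) };
}

const result = browseHybrid("https://example.com", {
  fetchHeadless: () => "", // simulate a near-empty response
  fetchHeaded: () => "heading: Example Domain\nlink: More information",
});
console.log(result.mode); // headed
```

Injecting the two fetchers keeps the fallback decision separate from how each browser is launched, which also makes the decision testable without a browser.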
What it handles automatically
Cookie consent walls (29 languages), login walls (cookie extraction from your browsers), bot detection (stealth patches + automatic headed fallback on challenge pages, error pages, and near-empty responses), permission prompts, SPA navigation, JS dialogs, off-screen elements, pre-filled inputs, ARIA noise, and profile locking. The agent doesn't think about any of it.
What the agent sees
Raw ARIA output from a page is noisy -- decorative wrappers, hidden elements, structural junk. The pruning pipeline (ported from mcprune) strips it down to what matters.
| Page | Raw | Pruned | Reduction |
|------|-----|--------|-----------|
| example.com | 377 chars | 45 chars | 88% |
| Hacker News | 51,726 chars | 27,197 chars | 47% |
| Wikipedia (article) | 109,479 chars | 40,566 chars | 63% |
| DuckDuckGo | 42,254 chars | 5,407 chars | 87% |
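The Reduction column can be reproduced from the raw and pruned character counts, assuming it is computed as (raw - pruned) / raw and rounded to the nearest percent. A short check under that assumption:

```javascript
// Reduction = (raw - pruned) / raw, as a whole-number percentage.
function reductionPercent(rawChars, prunedChars) {
  return Math.round(((rawChars - prunedChars) / rawChars) * 100);
}

// Character counts copied from the table above.
const rows = [
  ["example.com", 377, 45],
  ["Hacker News", 51726, 27197],
  ["Wikipedia (article)", 109479, 40566],
  ["DuckDuckGo", 42254, 5407],
];
for (const [page, raw, pruned] of rows) {
  console.log(`${page}: ${reductionPercent(raw, pruned)}%`);
}
```

Note that the table measures characters, so it is only a proxy for the token savings quoted elsewhere; the exact token reduction depends on the tokenizer.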
Two pruning modes: act (default) keeps interactive elements and visible labels -- for clicking, typing, navigating. read keeps all text content -- for reading articles and extracting information.
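To make the difference between the two modes concrete, here is a toy illustration. It is not the mcprune pipeline: the node shape `{ role, name, children }`, the list of interactive roles, and the output format are all assumptions for this example, and the act branch is simplified to interactive elements only.

```javascript
// Toy illustration only; the real pruning pipeline is not shown in this README.
// Nodes are hypothetical { role, name, children } objects.
const INTERACTIVE = new Set(["button", "link", "textbox", "checkbox", "combobox"]);

function prune(node, mode = "act") {
  const lines = [];
  const visit = (n) => {
    const hasText = Boolean(n.name && n.name.trim());
    // act: keep interactive roles; read: keep anything that carries text.
    const keep = mode === "act" ? INTERACTIVE.has(n.role) : hasText;
    if (keep) lines.push(`${n.role}: ${n.name || ""}`.trim());
    (n.children || []).forEach(visit);
  };
  visit(node);
  return lines;
}

// Made-up page tree: one paragraph and one button inside a decorative wrapper.
const page = {
  role: "main",
  name: "",
  children: [
    { role: "paragraph", name: "Welcome to the demo." },
    { role: "generic", name: "", children: [{ role: "button", name: "Sign in" }] },
  ],
};
console.log(prune(page, "act")); // only the button survives
console.log(prune(page, "read")); // the paragraph and the button survive
```

In both modes the empty wrapper nodes disappear, which is the kind of structural noise the pruning step is meant to remove.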
Actions
Everything the agent can do through barebrowse:
| Action | What it does |
|--------|-------------|
| Navigate | Load a URL, wait for page load, auto-dismiss consent |
| Back / Forward | Browser history navigation |
| Snapshot | Pruned ARIA tree with [ref=N] markers. Two modes: act (buttons, links, inputs) and read (full text). 40-90% token reduction. |
| Click | Scroll into view + mouse click at element center |
| Type | Focus + insert text, with option to clear existing content first |
| Press | Special keys: Enter, Tab, Escape, Backspace, Delete, arrows, Space |
| Scroll | Mouse wheel up or down |
| Hover | Move mouse to element center (triggers tooltips, hover states) |
| Select | Set dropdown value (native select or custom dropdown) |
| Drag | Drag one element to another (Kanban boards, sliders) |
| Upload | Set files on a file input element |
| Screenshot | Page capture as base64 PNG/JPEG/WebP |
| PDF | Export page as PDF |
| Assess | Privacy scan: score (0-100), risk level, 10-category breakdown. Requires npm install wearehere. |
| Tabs | List open tabs, switch between them |
| Wait for content | Poll for text or CSS selector to appear on page |
| Wait for navigation | SPA-aware: works for full page loads and pushState |
| Wait for network idle | Resolve when no pending requests for 500ms |
| Dialog handling | Auto-dismiss JS alert/confirm/prompt dialogs |
| Save state | Export cookies + localStorage to JSON |
| Inject cookies | Extract from Firefox/Chromium and inject via CDP |
| Raw CDP | Escape hatch for any Chrome DevTools Protocol command |
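Actions such as Click and Type target elements by the [ref=N] markers that Snapshot emits. A harness may want to check that a ref exists before acting on it. The sketch below extracts the markers from snapshot text; only the [ref=N] syntax comes from this README, while the sample lines and the `extractRefs` helper are made up for illustration.

```javascript
// Collect [ref=N] markers so an agent can validate a ref before acting on it.
// Returns a Map from ref number to the rest of the line it appeared on.
function extractRefs(snapshot) {
  const refs = new Map();
  for (const line of snapshot.split("\n")) {
    const match = line.match(/\[ref=(\d+)\]/);
    if (match) refs.set(Number(match[1]), line.replace(match[0], "").trim());
  }
  return refs;
}

// Hypothetical snapshot text; the real format is documented in barebrowse.context.md.
const sample = [
  '- heading "Example Domain"',
  '- link "More information" [ref=8]',
  '- textbox "Search" [ref=12]',
].join("\n");

const refs = extractRefs(sample);
console.log(refs.has(8), refs.get(12));
```

A lookup like `refs.has(8)` lets the agent fail fast on a stale ref instead of clicking an element that no longer exists after navigation.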
Tested against
16+ sites across 8 countries, all consent dialogs dismissed, all interactions working:
Google, YouTube, BBC, Wikipedia, GitHub, DuckDuckGo, Hacker News, Amazon DE, The Guardian, Spiegel, Le Monde, El Pais, Corriere, NOS, Bild, Nu.nl, Booking, NYT, Stack Overflow, CNN, Reddit
Context file
barebrowse.context.md is the full integration guide. Feed it to an AI assistant or read it yourself -- it covers the complete API, snapshot format, interaction loop, auth options, bareagent wiring, MCP setup, and gotchas. Everything you need to wire barebrowse into a project.
How it works
URL -> find/launch browser (chromium.js)
-> WebSocket CDP connection (cdp.js)
-> stealth patches before page scripts (stealth.js, headless only)
-> suppress all permission prompts (Browser.setPermission)
-> extract + inject cookies from your browser (auth.js)
-> navigate to URL, wait for load
-> detect + dismiss cookie consent dialogs (consent.js)
-> get full ARIA accessibility tree (aria.js)
-> 9-step pruning pipeline from mcprune (prune.js)
-> dispatch real input events: click/type/scroll (interact.js)
-> agent-ready snapshot with [ref=N] markers
11 modules, 2,400 lines, zero required dependencies.
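The WebSocket CDP step in the pipeline above exchanges JSON messages: each command carries a unique `id`, a `method`, and `params`, and the browser echoes the `id` in its reply, while events arrive without one. The following sketch shows that framing only. The `CdpFramer` class is hypothetical and is not barebrowse's cdp.js; the sample reply and event payloads are made up.

```javascript
// Minimal CDP message framing, independent of any actual socket.
class CdpFramer {
  constructor() {
    this.nextId = 1;
    this.pending = new Map(); // id -> method name, awaiting a reply
  }

  // Serialize a command ready to be sent over the WebSocket.
  command(method, params = {}) {
    const id = this.nextId++;
    this.pending.set(id, method);
    return JSON.stringify({ id, method, params });
  }

  // Match an incoming message to its command; events carry no `id`.
  receive(raw) {
    const msg = JSON.parse(raw);
    if (msg.id === undefined) return { kind: "event", method: msg.method };
    const method = this.pending.get(msg.id);
    this.pending.delete(msg.id);
    return { kind: "reply", method, result: msg.result, error: msg.error };
  }
}

const cdp = new CdpFramer();
const wire = cdp.command("Page.navigate", { url: "https://example.com" });
console.log(wire);
console.log(cdp.receive('{"id":1,"result":{"frameId":"F1"}}'));
console.log(cdp.receive('{"method":"Page.loadEventFired","params":{"timestamp":1}}'));
```

In a real client the string returned by `command` would be passed to the socket's `send`, and `receive` would be called from its message handler.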
Requirements
- Node.js >= 22 (built-in WebSocket, built-in SQLite)
- Any Chromium-based browser installed (Chrome, Chromium, Brave, Edge, Vivaldi)
- Tested on Linux (Fedora/KDE). Cookie paths for macOS and Windows exist but are untested.
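A script that depends on barebrowse can check the Node.js requirement up front and fail with a clear message. This is a small sketch of such a preflight check; the helper names `nodeMajor` and `meetsRequirement` are hypothetical.

```javascript
// Parse the major version from a Node.js version string such as "22.3.0".
// Defaults to the version of the running process.
function nodeMajor(version = process.versions.node) {
  return Number(version.split(".")[0]);
}

// True when the given (or current) Node.js version satisfies >= 22.
function meetsRequirement(version) {
  return nodeMajor(version) >= 22;
}

console.log(meetsRequirement("22.3.0")); // true
console.log(meetsRequirement("20.11.1")); // false
if (!meetsRequirement()) {
  console.error(`Node.js >= 22 required, found ${process.versions.node}`);
}
```

The check compares only the major version, which matches how the requirement is stated above.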
The bare ecosystem
Three vanilla JS modules. Zero dependencies. Same API patterns.
| | bareagent | barebrowse | baremobile |
|---|---|---|---|
| Does | Gives agents a think→act loop | Gives agents a real browser | Gives agents an Android device |
| How | Goal in → coordinated actions out | URL in → pruned snapshot out | Screen in → pruned snapshot out |
| Replaces | LangChain, CrewAI, AutoGen | Playwright, Selenium, Puppeteer | Appium, Espresso, UIAutomator2 |
| Interfaces | Library · CLI · subprocess | Library · CLI · MCP | Library · CLI · MCP |
| Solo or together | Orchestrates both as tools | Works standalone | Works standalone |
What you can build:
- Headless automation — scrape sites, fill forms, extract data, monitor pages on a schedule
- QA & testing — automated test suites for web and Android apps without heavyweight frameworks
- Personal AI assistants — chatbots that browse the web or control your phone on your behalf
- Remote device control — manage Android devices over WiFi, including on-device via Termux
- Agentic workflows — multi-step tasks where an AI plans, browses, and acts across web and mobile
Why this exists: Most automation stacks ship 200MB of opinions before you write a line of code. These don't. Install, import, go.
License
MIT
