@youtyan/browser-pilot
v2.3.2
Browser automation MCP server for Claude Code — control Chrome via natural language
browser-pilot
An MCP server for controlling Chrome browsers from Claude Code. It operates real browsers through a Chrome Extension.
Quick Start
1. Run setup
npx @youtyan/browser-pilot setup

This will:
- Detect your MCP host (Claude Code / Cursor / Windsurf) and generate config
- Install skills
- Guide you through Chrome Extension installation
2. Load the Chrome Extension
Open chrome://extensions, enable Developer Mode, click "Load unpacked", and select the extension path shown by setup.
3. Verify connection
npx @youtyan/browser-pilot doctor

No token is required; the Extension connects automatically.
Skills
Seven skills are included:
| Skill | Description |
|---|---|
| bp-usage | Browser operation guide and tool quick reference |
| bp-testing | Agent-driven web app testing |
| bp-test-scripts | Code-driven browser testing (HTTP API) |
| bp-gemini-image | Image generation via Gemini Web UI |
| bp-x-operation | X (Twitter) operations (post, search, collect) |
| bp-annotate-coords | Screenshot annotation with coordinates and bounds |
| bp-generate-manual | Site operation manual generation with screenshots/video |
You can also install skills with npx skills add:
npx skills add https://github.com/youtyan/browser-pilot

Available Tools (19)
Core (9)
| Tool | Description |
|---|---|
| browser_state | Page content (mode: page) or element search (mode: find). Replaces browser_get_page + browser_find_elements |
| browser_click | Click by text, CSS selector, ref, or backendNodeId |
| browser_type | Set form values, select options, press keys (action: input/select/key) |
| browser_scroll | Scroll page or element |
| browser_wait | Wait for element appear/disappear/count change, or simple ms delay. assert param for error on timeout |
| browser_batch | Execute multiple actions in one round-trip |
| browser_tab | Tab management + navigation (list, connect, create, navigate, back, forward, reload, action log, manual/test generation) |
| browser_upload | Upload or paste files |
| browser_screenshot | Take screenshot (full page or element, with savePath) |
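MCP hosts normally issue these tool calls for you, but it can help to see what a call looks like on the wire. The following is a minimal sketch of the JSON-RPC 2.0 `tools/call` envelope defined by the Model Context Protocol. The tool names and the `mode` and `action` parameters come from the table above; the other argument names (`query`, `selector`, `value`) and the `buildToolCall` helper are hypothetical and are not taken from browser-pilot's actual schemas.

```typescript
// JSON-RPC 2.0 envelope for an MCP "tools/call" request.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

let nextId = 1;

// Hypothetical helper: wraps a tool name and its arguments in the envelope.
function buildToolCall(
  name: string,
  args: Record<string, unknown>,
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// `mode` is documented above; `query` is an assumed parameter name.
const findButtons = buildToolCall("browser_state", {
  mode: "find",
  query: "Submit",
});

// `action` is documented above; `selector` and `value` are assumed names.
const typeEmail = buildToolCall("browser_type", {
  action: "input",
  selector: "#email",
  value: "user@example.com",
});

console.log(JSON.stringify(findButtons, null, 2));
console.log(JSON.stringify(typeEmail, null, 2));
```

Check the input schema each tool reports through the MCP `tools/list` method before relying on any argument name shown here.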
Debug (2)
| Tool | Description |
|---|---|
| browser_debug | Console, network, performance, health, CSS inspect, interaction diagnosis (action parameter) |
| browser_eval | Execute JavaScript in page context (CDP direct, fast). First-class tool, no env var gate |
Advanced (3)
| Tool | Description |
|---|---|
| browser_mouse | Mouse actions (hover, dblclick, drag, contextmenu) |
| browser_extract | Schema-first structured data extraction |
| browser_artifacts | Record video, save PDF, annotate screenshots (action: record/pdf/annotate) |
Storage (1)
| Tool | Description |
|---|---|
| browser_storage | Cookie, localStorage, sessionStorage management (target + action params) |
Integrations (4)
| Tool | Description |
|---|---|
| gemini_generate_image | Generate images via Gemini Web UI |
| x_collect_tweets | Collect tweets from X (Twitter) with scroll and dedup |
| x_post | Post to X (Twitter) |
| x_interact | Batch interact on X (like, repost, follow) |
Security Considerations
browser-pilot is a powerful browser automation tool. Understand the following risks before use.
- `browser_eval` executes arbitrary JavaScript in the MAIN world of Chrome tabs. This gives full access to the page's DOM, cookies, localStorage, and more.
- The MCP server binds to `localhost:18888`. Any local process can connect (no token required for WebSocket).
- The HTTP API (`/api/*`) is gated by a bearer token stored in `~/.browser-pilot-token` with `0600` permissions.
- The Chrome Extension has `<all_urls>` permission. It can access any website.
- Only use this tool in trusted environments.
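For scripts that call the HTTP API, the token handling described above might look like the following sketch. It assumes the token file holds the bare token as text and that the API accepts a standard `Authorization: Bearer` header. The helper names (`isOwnerOnly`, `authHeaders`, `loadToken`) are hypothetical and not part of browser-pilot. The permission check only applies on POSIX systems, since mode bits are not meaningful on Windows.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Token location as stated in the security notes.
const TOKEN_PATH = path.join(os.homedir(), ".browser-pilot-token");

// True when no group/other permission bits are set (for example 0600).
function isOwnerOnly(mode: number): boolean {
  return (mode & 0o077) === 0;
}

// Builds the header for a bearer-token request (assumed header format).
function authHeaders(token: string): Record<string, string> {
  return { Authorization: `Bearer ${token.trim()}` };
}

// Reads the token, refusing to continue if other users could read the file.
function loadToken(tokenPath: string = TOKEN_PATH): string {
  const stat = fs.statSync(tokenPath);
  if (!isOwnerOnly(stat.mode)) {
    throw new Error(`${tokenPath} is accessible to other users; expected 0600`);
  }
  return fs.readFileSync(tokenPath, "utf8").trim();
}
```

A caller would pass `authHeaders(loadToken())` as the headers of a request to a route under `/api/*` on `localhost:18888`. The individual routes are not listed in this README, so none are shown here.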
Development
See DEVELOPMENT.md for setup, architecture, adding tools, and build instructions.
Updating
npx @youtyan/browser-pilot update

License
MIT
