browserman-cli
v0.3.0
BrowserMan CLI for device authorization and MCP browser control
BrowserMan CLI
Connect AI agents to BrowserMan with a device-authorization CLI and MCP bridge.
BrowserMan CLI lets a user run setup from the terminal, approve access on the web, and then start an MCP server that targets their approved BrowserMan browser scope.
No dashboard code-copy flow required. Run setup in the terminal, approve on the web, then hand the resulting access to your agent.
What this npm package includes
This npm package contains the BrowserMan CLI and MCP bridge.
It is intended for:
- browserman setup
- browserman setup start
- browserman setup status
- browserman setup finish
- browserman doctor
- browserman revoke
- browserman mcp
It does not bundle the BrowserMan web server or Chrome extension. Those live in the main BrowserMan project and service.
Architecture
AI Agent <--> MCP Server <--> BrowserMan Server <--> Chrome Extension
(Claude)      (stdio/SSE)     (Express + WS)         (Manifest V3)
- Chrome Extension: Runs in your browser, executes commands via Chrome DevTools Protocol
- Server: Relay that bridges API calls to the extension via WebSocket
- MCP Server: Exposes browser tools to AI agents using the Model Context Protocol
- CLI: Starts the MCP server with your credentials
Install
npm install -g browserman-cli
One-off usage without global install:
npx browserman-cli setup
After install, the main commands are:
browserman setup
browserman setup start --json
browserman setup status --json
browserman setup finish --resume <deviceCode> --json
browserman doctor
browserman browser list --json
- For any command that accepts --browser, you can pass a browser id, slug, or exact name.
browserman page open --url https://example.com
browserman page read --json
browserman revoke
browserman mcp
Direct CLI automation
These commands are the primary way for AI agents, shell scripts, and skills to use BrowserMan without MCP.
browserman browser list --json
- For any command that accepts --browser, you can pass a browser id, slug, or exact name.
browserman browser current --json
browserman browser ping --json
browserman page open --url https://example.com --json
browserman page read --json
browserman page click --ref 12 --json
browserman page type --text "hello" --json
browserman page press --key Enter --json
browserman page eval --js "document.title" --json
browserman page url --json
browserman page form --ref 12 --value "kinhunt" --json
browserman page scroll --direction down --pixels 500 --json
browserman page screenshot --out ./page.png --json
browserman script list --json
browserman script run --site x.com --action search --text "browserman" --json
Important behavior for agent users:
- browserman page ... commands now operate on the selected browser's current active page by default.
- You do not need to manually provide a tabId or sessionId for normal CLI usage.
- page open will reuse the active tab when possible.
- If the selected browser has no usable tab yet, BrowserMan CLI will create a session-backed tab automatically and continue.
- For the most stable agent integrations, prefer --json on browser/page/script commands.
Recommended mental model:
- browserman browser current --json
- browserman page open --url ... --json
- browserman page read --json
- browserman page click/type/press/... --json
Use browserman mcp when you specifically need an MCP server for an MCP-compatible client.
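The recommended sequence above can be driven from a script. The sketch below is a minimal Python wrapper, assuming that browserman is on PATH and that each --json command prints a single JSON document on stdout. The README only documents multi-line output for browserman setup --json, so the single-document assumption is unverified for other commands. The helper names build_argv and run_json are hypothetical.

```python
import json
import subprocess


def build_argv(*parts):
    """Build a browserman command line and make sure --json is present."""
    argv = ["browserman", *parts]
    if "--json" not in argv:
        argv.append("--json")
    return argv


def run_json(*parts):
    """Run one command and parse stdout as JSON (assumes a single JSON document)."""
    completed = subprocess.run(
        build_argv(*parts), capture_output=True, text=True, check=True
    )
    return json.loads(completed.stdout)


# The recommended order: confirm the browser, open a page, read it, then act.
STEPS = [
    ("browser", "current"),
    ("page", "open", "--url", "https://example.com"),
    ("page", "read"),
    ("page", "click", "--ref", "12"),
]

if __name__ == "__main__":
    # Print the commands instead of running them, so this is safe without a setup.
    for step in STEPS:
        print(" ".join(build_argv(*step)))
```

Swapping the print call for run_json(*step) would execute each step against the selected browser.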
Quick Start
1. Start the BrowserMan service
git clone <repo-url> && cd browserman
npm install
npm start
# Server running on http://localhost:3100
2. Sign in on the web and connect your browser
- Open the BrowserMan website or local dashboard in your browser.
- Sign in or create an account.
- Install the Chrome extension from the extension/ folder.
- Open the BrowserMan side panel and connect this browser to your account.
The normal product path is account-first:
- sign in
- connect browser
- connect AI agent
You do not need to manually mint a long-lived bm_key_... or paste a bm_ext_... secret for the default CLI flow.
3. Approve CLI access from the terminal
Run setup from the terminal:
browserman setup
BrowserMan CLI prints a stable URL and code like:
BrowserMan authorization required.
Open: https://browserman.run/activate
Code: ABCD-EFGH
Then:
- Open https://browserman.run/activate
- Sign in if needed
- Enter the code from the terminal
- Review and narrow the final scope on the web
- Approve the request
- Return to CLI — BrowserMan finishes setup automatically
For agent-friendly automation:
browserman setup --json
For a task-oriented flow that does not depend on one long-lived interactive command:
browserman setup start --json
browserman setup status --json
browserman setup finish --resume <deviceCode> --json
Recommended standard flow for AI agents plus a human approver:
- Agent runs browserman setup start --json --no-open.
- Agent reads userCode, activateUrl, and finishCommand from the JSON result.
- Agent shows the human the activation URL and code.
- Human opens the web page, signs in, reviews scope, and approves.
- Agent runs the exact finishCommand from the JSON result.
- Agent expects a final setup_result with status: "completed" and code: "setup_completed".
A successful setup start --json result has this shape:
- status: "waiting_for_user"
- code: "device_authorization_required"
- userCode
- activateUrl
- deviceCode
- finishCommand
- resumeCommand
- startCommand
If no local setup task exists yet, setup status --json and setup finish --json return a structured setup_result with:
- status: "not_started"
- code: "setup_state_missing"
- startCommand: "browserman setup start"
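An agent can turn these result shapes into a next step with a small dispatcher. The following is a sketch only: the function name next_action is hypothetical, the field names come from the shapes listed above, and the sample values are the placeholders used elsewhere in this README rather than real output.

```python
def next_action(result):
    """Map one parsed setup result to what the agent should do next."""
    status = result.get("status")
    if status == "waiting_for_user":
        # Show the human where to approve, then run the exact finishCommand.
        return (
            f"Open {result['activateUrl']} and enter code {result['userCode']}; "
            f"after approval run: {result['finishCommand']}"
        )
    if status == "not_started":
        # No local setup task exists yet, so start one.
        return f"Run: {result['startCommand']}"
    if status == "completed":
        return "Setup completed"
    return f"Unhandled status: {status!r}"


# Placeholder payloads modelled on the documented shapes.
started = {
    "status": "waiting_for_user",
    "code": "device_authorization_required",
    "userCode": "ABCD-EFGH",
    "activateUrl": "https://browserman.run/activate",
    "deviceCode": "bm_dev_xxx",
    "finishCommand": "browserman setup finish --resume bm_dev_xxx",
}
missing = {
    "status": "not_started",
    "code": "setup_state_missing",
    "startCommand": "browserman setup start",
}

print(next_action(started))
print(next_action(missing))
```

Returning a message for unknown statuses, rather than raising, keeps the agent loop alive if the CLI adds new states.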
Important JSON contract for agents:
- browserman setup --json may emit multiple JSON lines, not just one.
- Progress lines use kind: "setup_event".
- The final command outcome uses kind: "setup_result".
- Agents should branch on kind, not on line position alone.
- setup_result payloads now expose stable top-level fields like status, code, resumable, deviceCode, userCode, activateUrl, pollAfterMs, expiresAt, statusCommand, finishCommand, and resumeCommand.
Typical agent pattern:
- Read each JSON line.
- If kind === "setup_event", update progress state.
- If kind === "setup_result", treat it as the final return value for that CLI call.
- If status === "interrupted" and resumable === true, tell the user to approve on the web and then run finishCommand.
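The pattern above might look like this in Python. It is a minimal sketch, not the CLI's own code: consume_setup_output and needs_resume are hypothetical names, and the two sample lines reuse the placeholder payloads from this README.

```python
import json


def consume_setup_output(lines):
    """Split `browserman setup --json` output into progress events and the final result."""
    events, result = [], None
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue
        message = json.loads(raw)
        kind = message.get("kind")
        if kind == "setup_event":
            events.append(message)  # progress only
        elif kind == "setup_result":
            result = message  # final return value for this CLI call
    return events, result


def needs_resume(result):
    """True when the call was interrupted but can be finished later."""
    return (
        result is not None
        and result.get("status") == "interrupted"
        and result.get("resumable") is True
    )


SAMPLE_OUTPUT = [
    '{"kind":"setup_event","type":"waiting_for_user","status":"waiting_for_user",'
    '"code":"approval_pending","resumable":true,"deviceCode":"bm_dev_xxx"}',
    '{"kind":"setup_result","ok":true,"command":"setup","status":"interrupted",'
    '"code":"setup_waiting_for_user","resumable":true,"deviceCode":"bm_dev_xxx",'
    '"finishCommand":"browserman setup finish --resume bm_dev_xxx"}',
]

events, result = consume_setup_output(SAMPLE_OUTPUT)
if needs_resume(result):
    print("Approve on the web, then run:", result["finishCommand"])
```

Branching on kind rather than on line position means extra progress lines do not change which message is treated as final.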
Example event line:
{"kind":"setup_event","type":"waiting_for_user","status":"waiting_for_user","code":"approval_pending","resumable":true,"deviceCode":"bm_dev_xxx"}
Example final result line:
{"kind":"setup_result","ok":true,"command":"setup","status":"interrupted","code":"setup_waiting_for_user","resumable":true,"deviceCode":"bm_dev_xxx","finishCommand":"browserman setup finish --resume bm_dev_xxx"}
4. Validate the saved delegated setup
browserman doctor
browserman browser list --json
- For any command that accepts --browser, you can pass a browser id, slug, or exact name.
browserman page open --url https://example.com --json
browserman page read --json
5. Connect AI agents
Claude Desktop / Claude Code
After browserman setup, BrowserMan stores delegated config locally. The simplest MCP config is:
{
"mcpServers": {
"browserman": {
"command": "browserman",
"args": ["mcp"]
}
}
}
If you need to override the saved config explicitly:
{
"mcpServers": {
"browserman": {
"command": "browserman",
"args": [
"mcp",
"--server", "http://localhost:3100",
"--token", "bm_dlg_xxx",
"--browser", "ext_xxx"
]
}
}
}
Cursor / Other MCP Clients
Use the same pattern: prefer saved delegated config from browserman setup, and only fall back to explicit --server / --token / --browser when you need a fully scripted override.
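For scripted installs, the MCP entry can be generated rather than hand-edited. The sketch below assumes the client reads a JSON object with an mcpServers key, as in the examples above. The function name with_browserman_server is hypothetical, and where a given client stores its config file is not covered here.

```python
import json


def with_browserman_server(config, server=None, token=None, browser=None):
    """Return a copy of an MCP client config with a "browserman" entry added."""
    args = ["mcp"]
    # Only add explicit overrides; otherwise the saved delegated config is used.
    for flag, value in (("--server", server), ("--token", token), ("--browser", browser)):
        if value is not None:
            args += [flag, value]
    servers = dict(config.get("mcpServers", {}))
    servers["browserman"] = {"command": "browserman", "args": args}
    return {**config, "mcpServers": servers}


# Default: rely on the config saved by `browserman setup`.
print(json.dumps(with_browserman_server({}), indent=2))

# Fully scripted override with placeholder values.
print(json.dumps(
    with_browserman_server(
        {}, server="http://localhost:3100", token="bm_dlg_xxx", browser="ext_xxx"
    ),
    indent=2,
))
```

The function copies the existing mcpServers mapping, so other servers already present in the config are preserved.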
SSE Transport (Remote)
browserman mcp \
--server http://localhost:3100 \
--token bm_dlg_xxx \
--browser ext_xxx \
--transport sse \
--port 3001
# MCP server available at http://localhost:3001/sse
MCP Tools
Browser Discovery
| Tool | Description |
|------|-------------|
| browserman_list_browsers | List visible connected browsers and the saved default when available |
| browserman_current_browser | Show which browser MCP will target by default |
Core Browser Operations
| Tool | Description |
|------|-------------|
| browser_status | Check if extension is connected |
| browser_navigate | Go to a URL |
| browser_read_page | Get accessibility tree (element refs) |
| browser_screenshot | Capture page screenshot |
| browser_click | Click element by ref |
| browser_type | Type text at cursor |
| browser_form_input | Set form field value by ref |
| browser_press_key | Press keyboard key |
| browser_scroll | Scroll page or element into view |
| browser_evaluate | Run JavaScript in page |
| browser_get_url | Get current URL |
| browser_new_tab | Open new tab |
| browser_upload_file | Upload file to input |
| browser_task_complete | Signal task done (hides overlay) |
Platform Automation Scripts
| Tool | Description |
|------|-------------|
| browser_list_scripts | List available platforms and their actions |
| browser_run_script | Execute a pre-built multi-step automation script |
BrowserMan ships with pre-built scripts that combine multiple low-level operations into single, reliable actions. For sites without scripts, the AI falls back to core browser tools.
Detailed social media script reference: docs/social-media-platform-scripts.md
Supported Platforms
For the full action inventory implemented in code, see docs/social-media-platform-scripts.md.
X (Twitter) — x.com
| Action | Description |
|--------|-------------|
| post | Create a tweet (with optional media) |
| like | Like a tweet by URL |
| reply | Reply to a tweet |
| retweet | Retweet a post |
| quote_retweet | Quote retweet with comment |
| bookmark | Bookmark a tweet |
| search | Search tweets |
| get_timeline | Read the home timeline |
| follow / unfollow | Follow or unfollow a user |
| get_notifications | Read your notifications |
LinkedIn — linkedin.com
| Action | Description |
|--------|-------------|
| post | Create a post (with optional images) |
| article | Publish a long-form article (Markdown supported) |
| like | Like/react to a post |
| comment | Comment on a post |
| get_feed | Read the LinkedIn feed |
| follow / unfollow | Follow or unfollow a user/company |
| get_notifications | Read your notifications |
| send_connection | Send a connection request (with optional note) |
| search | Search people, posts, or companies |
Reddit — reddit.com
| Action | Description |
|--------|-------------|
| post | Submit a post to a subreddit |
| comment | Comment on a post |
| vote | Upvote or downvote |
| get_feed | Read a subreddit or home feed |
| subscribe / unsubscribe | Join or leave a subreddit |
| get_notifications | Read inbox/notifications |
| search | Search Reddit |
Medium — medium.com
| Action | Description |
|--------|-------------|
| article | Publish an article (Markdown supported) |
| get_feed | Read the Medium feed |
| follow | Follow an author |
| clap | Clap for an article (1–50 claps) |
Any website works. For sites without pre-built scripts, the AI uses core browser tools (navigate, read, click, type, etc.) to automate any task.
How It Works
Two Operation Modes
Pre-built Scripts: Single-command automation for supported platforms. The AI calls browser_run_script with a platform and action (e.g., x.com + post), and BrowserMan handles the entire multi-step flow.
Core Browser Tools: Direct low-level control for any website. The AI reads the page structure, identifies elements, and performs actions step by step. This is the fallback for sites without pre-built scripts.
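The choice between the two modes amounts to a lookup against the script inventory. The sketch below hard-codes a subset of the inventory copied from the Supported Platforms tables for illustration; a real agent would presumably get the current list from browser_list_scripts. The names SCRIPTS and choose_mode are hypothetical.

```python
# Subset of the documented inventory (x.com and medium.com tables), for illustration only.
SCRIPTS = {
    "x.com": {
        "post", "like", "reply", "retweet", "quote_retweet", "bookmark",
        "search", "get_timeline", "follow", "unfollow", "get_notifications",
    },
    "medium.com": {"article", "get_feed", "follow", "clap"},
}


def choose_mode(site, action):
    """Prefer a pre-built script; fall back to core browser tools otherwise."""
    if action in SCRIPTS.get(site, set()):
        return "browser_run_script"
    return "core_browser_tools"


print(choose_mode("x.com", "post"))
print(choose_mode("example.com", "post"))
```

Keeping the fallback as the default branch means an unknown site or action never blocks the task; it only costs more low-level steps.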
Smart Tab Management (v0.2.0)
BrowserMan manages browser tabs intelligently:
- Tab Reuse — Navigating to a platform reuses an existing tab instead of opening a new one
- Auto Cleanup — Tabs created by scripts are automatically closed when done; tabs that existed before are left alone
- Unified Tab Group — All BrowserMan-managed tabs are organized in a single Chrome Tab Group for a clean workspace
- Activity-Driven Lifecycle — Sessions stay alive based on activity, not arbitrary timeouts
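Tab reuse and auto cleanup can be pictured with a toy model. This is not BrowserMan's implementation, only an illustration of the two rules stated above: reuse a tab that already exists for a site, and close only tabs that were created during the run. The class name TabTracker and the integer tab ids are hypothetical.

```python
class TabTracker:
    """Toy model of tab reuse and cleanup; not BrowserMan's actual code."""

    def __init__(self, open_tabs):
        self.tabs = dict(open_tabs)  # site -> tab id, tabs that already existed
        self.created = set()  # ids of tabs this tracker opened itself
        self._next_id = max(self.tabs.values(), default=0) + 1

    def tab_for(self, site):
        """Reuse an existing tab for the site, or open (and remember) a new one."""
        if site not in self.tabs:
            self.tabs[site] = self._next_id
            self.created.add(self._next_id)
            self._next_id += 1
        return self.tabs[site]

    def cleanup(self):
        """Close only the tabs that were created here; return their ids."""
        closed = sorted(self.created)
        self.tabs = {s: t for s, t in self.tabs.items() if t not in self.created}
        self.created.clear()
        return closed


tracker = TabTracker({"x.com": 7})
tracker.tab_for("x.com")       # reuses the existing tab
tracker.tab_for("reddit.com")  # opens a new tab
print(tracker.cleanup())       # only the newly created tab is closed
```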
REST API
All API endpoints require an Authorization: Bearer <token> header. Both session tokens (bm_sess_) and API keys (bm_key_) are accepted.
Auth
| Method | Endpoint | Description |
|--------|----------|-------------|
| POST | /auth/register | Create account (email, password) |
| POST | /auth/login | Login, get session token |
| POST | /auth/logout | Invalidate session |
| GET | /auth/me | Get current user |
Commands
# Send a command to the browser
curl -X POST http://localhost:3100/api/command \
-H "Authorization: Bearer bm_key_xxx" \
-H "Content-Type: application/json" \
-d '{
"extension": "bm_ext_xxx",
"action": "navigate",
"params": {"url": "https://example.com"}
}'
Extensions
| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | /api/extensions | List your extensions |
| POST | /api/extensions | Create extension |
| PATCH | /api/extensions/:id | Update name |
| DELETE | /api/extensions/:id | Delete extension |
API Keys
| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | /api/keys | List keys (prefix only) |
| POST | /api/keys | Create key (full key shown once) |
| DELETE | /api/keys/:id | Revoke key |
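The curl call in the Commands section can also be built from a script. The sketch below uses only the Python standard library and mirrors that example's URL, headers, and body. The function name build_command_request is hypothetical, the token and extension values are placeholders, and the response format is not documented here, so the request is constructed but not sent.

```python
import json
import urllib.request


def build_command_request(base_url, token, extension, action, params):
    """Build (but do not send) a POST /api/command request."""
    body = json.dumps(
        {"extension": extension, "action": action, "params": params}
    ).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/command",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


request = build_command_request(
    "http://localhost:3100",
    "bm_key_xxx",
    "bm_ext_xxx",
    "navigate",
    {"url": "https://example.com"},
)
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request), which needs a running server and real credentials.
```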
Deployment
Fly.io
# Install flyctl, then:
fly launch # first time
fly deploy # subsequent deploys
fly secrets set PORT=8080 # if needed
The included fly.toml and Dockerfile are pre-configured:
- Multi-stage build (compiles native SQLite bindings)
- Persistent volume at /app/data for the database
- Auto-stop when idle, auto-start on request
- Force HTTPS
Docker
docker build -t browserman .
docker run -p 3100:8080 -v browserman-data:/app/data browserman
Environment Variables
| Variable | Default | Description |
|----------|---------|-------------|
| PORT | 3100 | Server port |
| HOSTNAME | 0.0.0.0 | Bind address |
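As an illustration of how these two variables and their defaults combine, here is a small sketch. It is not the server's code (the server is a Node/Express app); the function name load_bind_settings is hypothetical and only the variable names and defaults come from the table above.

```python
import os


def load_bind_settings(env):
    """Resolve bind address and port from an environment mapping, using the documented defaults."""
    host = env.get("HOSTNAME", "0.0.0.0")
    port = int(env.get("PORT", "3100"))
    return host, port


print(load_bind_settings({}))                # documented defaults
print(load_bind_settings({"PORT": "8080"}))  # e.g. the value set for Fly.io
print(load_bind_settings(os.environ))        # whatever the current shell exports
```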
Development
npm run dev # Start with --watch (auto-reload)
npm run seed # Create test user/extension/key
node scripts/test-auth.sh # Test auth flow
Runtime data note:
data/ is a local runtime data directory used for the SQLite database and other machine-local state. It is not part of the source tree and is ignored by git.
Security Notes
- Passwords stored as salted SHA-256 hashes
- API keys and session tokens stored as SHA-256 hashes (never plaintext)
- Extension keys (bm_ext_) are stored in plaintext for WebSocket auth lookup
- Sessions expire after 30 days
- All API endpoints validate ownership (users can only access their own resources)
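The hashing scheme described above can be sketched with the Python standard library. This is an illustration under assumptions, not the project's code: the 16-byte salt, the salt-then-password concatenation, and the hex encoding are all guesses, and the function names are hypothetical.

```python
import hashlib
import hmac
import secrets


def hash_token(token):
    """SHA-256 hex digest of an API key or session token, suitable as a lookup key."""
    return hashlib.sha256(token.encode("utf-8")).hexdigest()


def hash_password(password, salt=None):
    """Return (salt, digest) for a salted SHA-256 password hash."""
    if salt is None:
        salt = secrets.token_bytes(16)  # assumed salt size
    digest = hashlib.sha256(salt + password.encode("utf-8")).hexdigest()
    return salt, digest


def verify_password(password, salt, expected):
    """Recompute the salted hash and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)


salt, stored = hash_password("correct horse")
print(verify_password("correct horse", salt, stored))
print(verify_password("wrong", salt, stored))
```

Storing only the digest of a key or token means the server can still look up a presented credential by hashing it, without keeping the plaintext.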
License
MIT
