@browserbasehq/browse-cli
v0.6.0
Published
Browser automation CLI for AI agents, built on Stagehand
Downloads
6,259
Readme
Browse CLI
Browser automation CLI for AI agents. Built on Stagehand, providing raw browser control without requiring LLM integration.
Installation
npm install -g @browserbasehq/browse-cliRequires Chrome/Chromium installed on the system.
Quick Start
# Navigate to a URL (auto-starts browser daemon)
browse open https://example.com
# Take a snapshot to get element refs
browse snapshot -c
# Click an element by ref
browse click @0-5
# Type text
browse type "Hello, world!"
# Take a screenshot
browse screenshot ./page.png
# Stop the browser
browse stopHow It Works
Browse uses a daemon architecture for fast, stateful interactions:
- First command auto-starts a Chrome browser daemon
- Subsequent commands reuse the same browser session
- State persists between commands (cookies, refs, etc.)
- Multiple sessions supported via
--sessionorBROWSE_SESSIONenv var
Self-Healing Sessions
The CLI automatically recovers from stale sessions. If the daemon or Chrome crashes:
- Detects the failure
- Cleans up stale processes and files
- Restarts the daemon
- Retries the command
Agents don't need to handle recovery - commands "just work".
Commands
Navigation
browse open <url> [--wait load|domcontentloaded|networkidle] [-t|--timeout ms]
browse reload
browse back
browse forwardThe --timeout flag (default: 30000ms) controls how long to wait for the page load state. Use longer timeouts for slow-loading pages:
browse open https://slow-site.com --timeout 60000Click Actions
browse click <ref> [-b left|right|middle] [-c count] # Click by ref (e.g., @0-5)
browse click_xy <x> <y> [--button] [--xpath] # Click at coordinatesCoordinate Actions
browse hover <x> <y> [--xpath]
browse scroll <x> <y> <deltaX> <deltaY> [--xpath]
browse drag <fromX> <fromY> <toX> <toY> [--steps n] [--xpath]Keyboard
browse type <text> [-d delay] [--mistakes]
browse press <key> # e.g., Enter, Tab, Cmd+AForms
browse fill <selector> <value> [--no-press-enter]
browse select <selector> <values...>
browse highlight <selector> [-d duration]Page Info
browse get url
browse get title
browse get text <selector>
browse get html <selector>
browse get value <selector>
browse get box <selector> # Returns center coordinates
browse snapshot [-c|--compact] # Accessibility tree with refs
browse screenshot [path] [-f|--full-page] [-t png|jpeg]Waiting
browse wait load [state]
browse wait selector <selector> [-t timeout] [-s visible|hidden|attached|detached]
browse wait timeout <ms>Multi-Tab
browse pages # List all tabs
browse newpage [url] # Open new tab
browse tab_switch <n> # Switch to tab by index
browse tab_close [n] # Close tab (default: last)Network Capture
Capture HTTP requests to the filesystem for inspection:
browse network on # Start capturing requests
browse network off # Stop capturing
browse network path # Get capture directory path
browse network clear # Clear captured requestsCaptured requests are saved as directories:
/tmp/browse-default-network/
001-GET-api.github.com-repos/
request.json # method, url, headers, body
response.json # status, headers, body, durationDaemon Control
browse start # Explicitly start daemon
browse stop [--force] # Stop daemon
browse status # Check daemon status
browse env [target] # Show or switch environment: local | remoteEnvironment Switching (Local vs Remote)
Use environment switching when an agent should keep the same command flow, but the browser runtime needs to change:
localruns Chrome on your machine (best for local debugging/dev loops)remoteruns a Browserbase session (best for anti-bot hardening and cloud runs)
# Show active environment (if running) and desired environment for next start
browse env
# Switch current session to Browserbase (restarts daemon if needed)
browse env remote
# Switch back to local Chrome (clean isolated browser by default)
browse env localLocal Browser Strategies
By default, browse env local launches a clean isolated local browser.
Use browse env local --auto-connect to opt into reusing an already-running
Chrome with remote debugging enabled. If no debuggable Chrome is found, it
falls back to launching an isolated browser.
# Use a clean isolated browser (default)
browse env local
# Auto-discover local Chrome, fallback to isolated
browse env local --auto-connect
# Attach to a specific CDP target (port or URL)
browse env local 9222
browse env local ws://localhost:9222/devtools/browser/...Auto-discovery checks:
DevToolsActivePortfiles in well-known Chrome/Chromium/Brave user-data directories- Common debugging ports (9222, 9229)
To make your Chrome discoverable:
- Open
chrome://inspect/#remote-debugging - Check the box "Allow remote debugging for this browser instance"
For more information, see the Chrome DevTools docs.
Use browse status to see which strategy was resolved:
browse status
# {"running":true,"session":"default","mode":"local","localStrategy":"isolated","localSource":"isolated"}General Behavior
- Environment is scoped per
--session browse env <target>persists an override and restarts the daemonbrowse stopclears the override so next start falls back to env-var-based auto detection- Auto detection defaults to:
remotewhenBROWSERBASE_API_KEYis setlocalotherwise
Global Options
| Option | Description |
|--------|-------------|
| --session <name> | Session name for multiple browsers (default: "default") |
| --headless | Run Chrome in headless mode |
| --headed | Run Chrome with visible window (default) |
| --ws <url\|port> | One-shot CDP connection (bypasses daemon) |
| --json | Output as JSON |
Environment Variables
| Variable | Description |
|----------|-------------|
| BROWSE_SESSION | Default session name (alternative to --session) |
| BROWSERBASE_API_KEY | Browserbase API key (required for browse env remote) |
Element References
After running browse snapshot, you can reference elements by their ref ID:
# Get snapshot with refs
browse snapshot -c
# Output includes refs like [0-5], [1-2], etc.
# RootWebArea "Example" url="https://example.com"
# [0-0] link "Home"
# [0-1] link "About"
# [0-2] button "Sign In"
# Click using ref (multiple formats supported)
browse click @0-2 # @ prefix
browse click 0-2 # Plain ref
browse click ref=0-2 # Explicit prefixThe full snapshot output includes mappings:
- xpathMap: Cross-frame XPath selectors
- cssMap: Fast CSS selectors when available
- urlMap: Extracted URLs from links
Multiple Sessions
Run multiple browser instances simultaneously:
# Terminal 1
BROWSE_SESSION=session1 browse open https://google.com
# Terminal 2
BROWSE_SESSION=session2 browse open https://github.com
# Or use --session flag
browse --session work open https://slack.com
browse --session personal open https://twitter.comDirect CDP Connection
Opt into using an existing Chrome instance:
To make your Chrome discoverable:
- Open
chrome://inspect/#remote-debugging - Check the box "Allow remote debugging for this browser instance"
- Re-run the CLI with auto-connect enabled.
For more information, see the Chrome DevTools docs.
# Auto-discover Chrome with remote debugging enabled
browse env local --auto-connect
browse open https://example.com
# Or target a specific port / WebSocket URL
browse env local 9222
browse --ws ws://localhost:9222/devtools/browser/... open https://example.comOptimal AI Workflow
- Navigate to target page (browser auto-starts)
- Snapshot to get the accessibility tree with refs
- Click/Fill using refs directly (e.g.,
@0-5) - Re-snapshot after actions to verify state changes
- Stop when done
browse open https://example.com
browse snapshot -c
# [0-5] textbox: Search
# [0-8] button: Submit
browse fill @0-5 "my query"
browse click @0-8
browse snapshot -c # Verify result
browse stopTroubleshooting
Chrome not found
The CLI uses your system Chrome/Chromium. If not found:
# macOS - Install Chrome or set path
export CHROME_PATH=/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome
# Linux - Install chromium
sudo apt install chromium-browserStale daemon
If the daemon becomes unresponsive:
browse stop --forcePermission denied on socket
# Clean up stale socket files
rm /tmp/browse-*.sock /tmp/browse-*.pidPlatform Support
- macOS (Intel and Apple Silicon)
- Linux (x64 and arm64)
Windows support requires WSL or TCP socket implementation.
Development
# Clone and setup (in monorepo)
cd packages/cli
pnpm install # Install dependencies first!
pnpm run build # Build the CLI
# Run without building (for development)
pnpm run dev -- <command>
# Or with tsx directly
npx tsx src/index.ts <command>
# Run linting and formatting
pnpm run lint
pnpm run formatLicense
MIT - see LICENSE
Related
- Stagehand - AI web browser automation framework
- Browserbase - Cloud browser infrastructure
