@browserbasehq/browse-cli

v0.6.0

Published

13 days ago

Browser automation CLI for AI agents, built on Stagehand

Downloads

6,259

0High
0Medium
0Low

browser automation cli ai agent chrome cdp web-scraping testing stagehand

Browse CLI

Browser automation CLI for AI agents. Built on Stagehand, providing raw browser control without requiring LLM integration.

Installation

npm install -g @browserbasehq/browse-cli

Requires Chrome/Chromium installed on the system.

Quick Start

# Navigate to a URL (auto-starts browser daemon)
browse open https://example.com

# Take a snapshot to get element refs
browse snapshot -c

# Click an element by ref
browse click @0-5

# Type text
browse type "Hello, world!"

# Take a screenshot
browse screenshot ./page.png

# Stop the browser
browse stop

How It Works

Browse uses a daemon architecture for fast, stateful interactions:

First command auto-starts a Chrome browser daemon
Subsequent commands reuse the same browser session
State persists between commands (cookies, refs, etc.)
Multiple sessions supported via --session or BROWSE_SESSION env var

Self-Healing Sessions

The CLI automatically recovers from stale sessions. If the daemon or Chrome crashes:

Detects the failure
Cleans up stale processes and files
Restarts the daemon
Retries the command

Agents don't need to handle recovery - commands "just work".

Commands

Navigation

browse open <url> [--wait load|domcontentloaded|networkidle] [-t|--timeout ms]
browse reload
browse back
browse forward

The --timeout flag (default: 30000ms) controls how long to wait for the page load state. Use longer timeouts for slow-loading pages:

browse open https://slow-site.com --timeout 60000

Click Actions

browse click <ref> [-b left|right|middle] [-c count]  # Click by ref (e.g., @0-5)
browse click_xy <x> <y> [--button] [--xpath]          # Click at coordinates

Coordinate Actions

browse hover <x> <y> [--xpath]
browse scroll <x> <y> <deltaX> <deltaY> [--xpath]
browse drag <fromX> <fromY> <toX> <toY> [--steps n] [--xpath]

Keyboard

browse type <text> [-d delay] [--mistakes]
browse press <key>  # e.g., Enter, Tab, Cmd+A

Forms

browse fill <selector> <value> [--no-press-enter]
browse select <selector> <values...>
browse highlight <selector> [-d duration]

Page Info

browse get url
browse get title
browse get text <selector>
browse get html <selector>
browse get value <selector>
browse get box <selector>  # Returns center coordinates

browse snapshot [-c|--compact]  # Accessibility tree with refs
browse screenshot [path] [-f|--full-page] [-t png|jpeg]

Waiting

browse wait load [state]
browse wait selector <selector> [-t timeout] [-s visible|hidden|attached|detached]
browse wait timeout <ms>

Multi-Tab

browse pages          # List all tabs
browse newpage [url]  # Open new tab
browse tab_switch <n> # Switch to tab by index
browse tab_close [n]  # Close tab (default: last)

Network Capture

Capture HTTP requests to the filesystem for inspection:

browse network on     # Start capturing requests
browse network off    # Stop capturing
browse network path   # Get capture directory path
browse network clear  # Clear captured requests

Captured requests are saved as directories:

/tmp/browse-default-network/
  001-GET-api.github.com-repos/
    request.json      # method, url, headers, body
    response.json     # status, headers, body, duration

Daemon Control

browse start          # Explicitly start daemon
browse stop [--force] # Stop daemon
browse status         # Check daemon status
browse env [target]   # Show or switch environment: local | remote

Environment Switching (Local vs Remote)

Use environment switching when an agent should keep the same command flow, but the browser runtime needs to change:

local runs Chrome on your machine (best for local debugging/dev loops)
remote runs a Browserbase session (best for anti-bot hardening and cloud runs)

# Show active environment (if running) and desired environment for next start
browse env

# Switch current session to Browserbase (restarts daemon if needed)
browse env remote

# Switch back to local Chrome (clean isolated browser by default)
browse env local

Local Browser Strategies

By default, browse env local launches a clean isolated local browser. Use browse env local --auto-connect to opt into reusing an already-running Chrome with remote debugging enabled. If no debuggable Chrome is found, it falls back to launching an isolated browser.

# Use a clean isolated browser (default)
browse env local

# Auto-discover local Chrome, fallback to isolated
browse env local --auto-connect

# Attach to a specific CDP target (port or URL)
browse env local 9222
browse env local ws://localhost:9222/devtools/browser/...

Auto-discovery checks:

DevToolsActivePort files in well-known Chrome/Chromium/Brave user-data directories
Common debugging ports (9222, 9229)

To make your Chrome discoverable:

Open chrome://inspect/#remote-debugging
Check the box "Allow remote debugging for this browser instance"

For more information, see the Chrome DevTools docs.

Use browse status to see which strategy was resolved:

browse status
# {"running":true,"session":"default","mode":"local","localStrategy":"isolated","localSource":"isolated"}

General Behavior

Environment is scoped per --session
browse env <target> persists an override and restarts the daemon
browse stop clears the override so next start falls back to env-var-based auto detection
Auto detection defaults to:
- remote when BROWSERBASE_API_KEY is set
- local otherwise

Global Options

| Option | Description | |--------|-------------| | --session <name> | Session name for multiple browsers (default: "default") | | --headless | Run Chrome in headless mode | | --headed | Run Chrome with visible window (default) | | --ws <url\|port> | One-shot CDP connection (bypasses daemon) | | --json | Output as JSON |

Environment Variables

| Variable | Description | |----------|-------------| | BROWSE_SESSION | Default session name (alternative to --session) | | BROWSERBASE_API_KEY | Browserbase API key (required for browse env remote) |

Element References

After running browse snapshot, you can reference elements by their ref ID:

# Get snapshot with refs
browse snapshot -c

# Output includes refs like [0-5], [1-2], etc.
# RootWebArea "Example" url="https://example.com"
#   [0-0] link "Home"
#   [0-1] link "About"
#   [0-2] button "Sign In"

# Click using ref (multiple formats supported)
browse click @0-2       # @ prefix
browse click 0-2        # Plain ref
browse click ref=0-2    # Explicit prefix

The full snapshot output includes mappings:

xpathMap: Cross-frame XPath selectors
cssMap: Fast CSS selectors when available
urlMap: Extracted URLs from links

Multiple Sessions

Run multiple browser instances simultaneously:

# Terminal 1
BROWSE_SESSION=session1 browse open https://google.com

# Terminal 2
BROWSE_SESSION=session2 browse open https://github.com

# Or use --session flag
browse --session work open https://slack.com
browse --session personal open https://twitter.com

Direct CDP Connection

Opt into using an existing Chrome instance:

To make your Chrome discoverable:

Open chrome://inspect/#remote-debugging
Check the box "Allow remote debugging for this browser instance"
Re-run the CLI with auto-connect enabled.

For more information, see the Chrome DevTools docs.

# Auto-discover Chrome with remote debugging enabled
browse env local --auto-connect
browse open https://example.com

# Or target a specific port / WebSocket URL
browse env local 9222
browse --ws ws://localhost:9222/devtools/browser/... open https://example.com

Optimal AI Workflow

Navigate to target page (browser auto-starts)
Snapshot to get the accessibility tree with refs
Click/Fill using refs directly (e.g., @0-5)
Re-snapshot after actions to verify state changes
Stop when done

browse open https://example.com
browse snapshot -c
# [0-5] textbox: Search
# [0-8] button: Submit
browse fill @0-5 "my query"
browse click @0-8
browse snapshot -c  # Verify result
browse stop

Troubleshooting

Chrome not found

The CLI uses your system Chrome/Chromium. If not found:

# macOS - Install Chrome or set path
export CHROME_PATH=/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome

# Linux - Install chromium
sudo apt install chromium-browser

Stale daemon

If the daemon becomes unresponsive:

browse stop --force

Permission denied on socket

# Clean up stale socket files
rm /tmp/browse-*.sock /tmp/browse-*.pid

Platform Support

macOS (Intel and Apple Silicon)
Linux (x64 and arm64)

Windows support requires WSL or TCP socket implementation.

Development

# Clone and setup (in monorepo)
cd packages/cli
pnpm install         # Install dependencies first!
pnpm run build       # Build the CLI

# Run without building (for development)
pnpm run dev -- <command>

# Or with tsx directly
npx tsx src/index.ts <command>

# Run linting and formatting
pnpm run lint
pnpm run format

License

MIT - see LICENSE

Stagehand - AI web browser automation framework
Browserbase - Cloud browser infrastructure