@browseragentprotocol/cli

v0.9.0

Published

3 months ago

BAP CLI - AI-native browser automation from the command line

pyyush

bap cli browser automation ai agent playwright web-scraping composite-actions semantic-selectors

CLI-first browser automation from the command line. BAP defaults to installed Chrome, an auto-detected profile when available, and a persistent daemon, so agents can work against a real browser instead of starting fresh every time. It works like playwright-cli and adds composite actions (bap act), semantic selectors, and structured extraction.

Quick Start

npx @browseragentprotocol/cli goto https://example.com --observe
npx @browseragentprotocol/cli click role:button:"Get Started"

Or install globally:

npm i -g @browseragentprotocol/cli
bap open https://example.com

By default, the CLI prefers headful Chrome with a persistent session. Use --headless for CI or --no-profile for a fresh automation browser. Chrome can restrict automation of a live default profile, so a dedicated --profile <dir> is the most reliable production setup when you need cookies and long-lived state.
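One way to apply these defaults in a script is to choose flags from the environment. The sketch below assumes your CI system exports a `CI` variable (common, but not guaranteed) and uses a hypothetical `BAP_PROFILE_DIR` variable to name the dedicated profile directory. Only `--headless`, `--no-profile`, and `--profile` come from this README.

```shell
# Pick bap flags for the current environment.
# Assumptions: CI systems export CI (e.g. CI=true); BAP_PROFILE_DIR is a
# hypothetical variable naming a dedicated Chrome profile directory.
bap_flags() {
  if [ -n "${CI:-}" ]; then
    # CI: run without a window and without a user profile
    echo "--headless --no-profile"
  elif [ -n "${BAP_PROFILE_DIR:-}" ]; then
    # Dedicated profile for cookies and long-lived state
    echo "--profile $BAP_PROFILE_DIR"
  fi
  # Otherwise print nothing and let the CLI defaults apply
}
```

A call such as `bap $(bap_flags) goto https://example.com` relies on unquoted word splitting, so this sketch does not handle profile paths that contain spaces.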

Why BAP CLI?

Composite Actions — Fewer Commands, Fewer Tokens

Execute multi-step flows in one command instead of one-at-a-time:

# playwright-cli: 3 commands, 3 snapshots, 3 LLM reasoning cycles
playwright-cli fill e5 "user@example.com"
playwright-cli fill e8 "password123"
playwright-cli click e12

# bap: 1 command, 1 snapshot, 1 LLM reasoning cycle
bap act fill:e5="user@example.com" fill:e8="password123" click:e12

Semantic Selectors — Resilient to Layout Changes

Target elements by their purpose, not their position:

bap click role:button:"Submit"
bap fill label:"Email" "user@example.com"
bap act fill:role:textbox:"Email"="user@example.com" \
        fill:role:textbox:"Password"="secret" \
        click:role:button:"Sign in"

Structured Extraction — Validated JSON Output

bap extract --fields="title,price,rating"
bap extract --schema=product.json
bap extract --list="product"

Commands

Navigation

bap open [url]              # Browser lifecycle command, optionally navigate
bap goto <url>              # Recommended for "open this URL"
bap goto <url> --observe    # Fused: navigate + observe in 1 server call
bap back                    # Go back
bap forward                 # Go forward
bap reload                  # Reload page

For most agent-driven browsing, prefer bap goto. It reuses the active page when possible and supports fused observation. Use bap open when you specifically want to open a browser first or inspect an existing session.

Interaction

bap click <selector>        # Click element
bap fill <selector> <value> # Fill input field
bap type <text>             # Type into focused element
bap press <key>             # Press keyboard key
bap select <selector> <val> # Select dropdown option
bap check <selector>        # Check checkbox
bap uncheck <selector>      # Uncheck checkbox
bap hover <selector>        # Hover over element
bap scroll [dir] [--pixels=N]  # Scroll page (up/down/left/right)
bap scroll <selector>       # Scroll element into view

Observation

bap observe                 # Interactive elements (default max 50)
bap observe --full          # Full accessibility tree
bap observe --forms         # Form fields only
bap observe --navigation    # Navigation elements only
bap observe --max=20        # Limit elements
bap observe --diff          # Incremental: only show changes since last observation
bap observe --tier=minimal  # Response tier: full, interactive, minimal
bap snapshot [--file=F]     # YAML accessibility snapshot
bap screenshot [--file=F]   # PNG screenshot

Composite Actions

bap act <step1> <step2> ... # Execute multiple steps atomically
bap act <steps> --observe   # Fused: act + observe in 1 server call

Step syntax: action:selector=value or action:selector

# Login flow in one command
bap act fill:role:textbox:"Email"="user@example.com" \
        fill:role:textbox:"Password"="secret" \
        click:role:button:"Sign in"

# Accept cookies + navigate
bap act click:text:"Accept" goto:https://example.com/app

# Fill and submit a search
bap act fill:role:searchbox:"Search"="query here" press:Enter

# Fused act + observe (1 server call instead of 3)
bap act click:e3 --observe --tier=interactive
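When step values come from variables, it helps to know that the shell removes the double quotes in the examples above before bap sees them: fill:role:textbox:"Email"="x" arrives as the single argument fill:role:textbox:Email=x. The hypothetical helper `bap_step` below builds arguments of the same shape. How bap parses values that themselves contain `=` or `:` is not covered in this README, so treat those cases as untested.

```shell
# Build one `bap act` step as a single argument: action:selector[=value].
# bap_step is a hypothetical helper, not part of the CLI.
bap_step() {
  if [ "$#" -ge 3 ]; then
    printf '%s:%s=%s' "$1" "$2" "$3"   # action:selector=value
  else
    printf '%s:%s' "$1" "$2"           # action:selector
  fi
}

# Example (not executed here): quote each step so it stays one argument.
#   EMAIL="user@example.com"
#   bap act "$(bap_step fill 'role:textbox:Email' "$EMAIL")" \
#           "$(bap_step click 'role:button:Sign in')"
```

Quoting each `$(...)` keeps a selector name with spaces, such as Sign in, inside one argument.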

Sessions & Tabs

bap -s=<name> <command>     # Named session
bap sessions                # List active sessions
bap tabs                    # List open tabs
bap tab-new [url]           # Open new tab
bap tab-select <N>          # Switch to tab
bap frames                  # List frames
bap frame-switch <id>       # Switch to frame
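To avoid repeating -s=<name> on every call, a script can pin the session in a small wrapper. This is a sketch only: `bap_s` and the `BAP_SESSION` variable are hypothetical names, and the default session name `checkout` is a placeholder.

```shell
# Run every bap command in one named session.
# bap_s and BAP_SESSION are hypothetical; only the -s=<name> flag is from
# the README.
BAP_SESSION="${BAP_SESSION:-checkout}"

bap_s() {
  bap -s="$BAP_SESSION" "$@"
}

# Example (not executed here):
#   bap_s goto https://example.com --observe
#   bap_s tabs
```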

Recipes

bap recipe login <url> --user=<u> --pass=<p>
bap recipe fill-form <url> --data=data.json
bap recipe wait-for <selector> [--timeout=ms]

Configuration

bap config                  # View all settings
bap config browser firefox  # Set default browser
bap config headless false   # Disable headless mode
bap install-skill           # Install skill to detected AI agents
bap skill init              # Install skill to current project

Selectors

| Selector | Example | When to use |
| ---------------------- | ------------------------- | ---------------------------------------------- |
| e<N> | e15 | From snapshot refs (playwright-cli compatible) |
| role:<role>:"<name>" | role:button:"Submit" | By ARIA role and name |
| text:"<content>" | text:"Sign in" | By visible text |
| label:"<text>" | label:"Email" | Form fields by label |
| placeholder:"<text>" | placeholder:"Search..." | By placeholder text |
| testid:"<id>" | testid:"submit-btn" | By data-testid |
| css:<selector> | css:.btn-primary | CSS selector |
| xpath:<path> | xpath://button | XPath selector |
| coords:<x>,<y> | coords:100,200 | By coordinates |

Global Options

-s=<name>              Named session
-p, --port <N>         Server port (default: 9222)
-b, --browser <name>   Browser: chrome, chromium, firefox, webkit, edge
--headless             Headless mode for CI/background runs
--no-headless          Show browser window (default)
--profile <path>       Chrome profile dir (default: auto-detect)
--no-profile           Fresh browser, no user profile
-v, --verbose          Verbose output
--observe              Fused observation (for goto, act)
--diff                 Incremental observation (for observe)
--tier=<tier>          Response tier: full, interactive, minimal
--max=<N>              Limit elements (default: 50)

Architecture

BAP CLI communicates with a BAP Playwright server over WebSocket:

bap <command>
    ↕ WebSocket (JSON-RPC 2.0)
BAP Playwright Server (auto-started as background daemon)
    ↕ Playwright
Browser (Chromium / Firefox / WebKit)

The server starts automatically on first use and persists across commands. Use bap close-all to stop it.
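For orientation, a JSON-RPC 2.0 request is a JSON object with jsonrpc, id, method, and params members. The sketch below prints one such envelope. The method name `example/navigate` and its `url` parameter are placeholders invented for illustration, since this README does not list the server's actual methods.

```shell
# Print a JSON-RPC 2.0 request envelope.
# Hypothetical: the method name and params used below are placeholders,
# not documented BAP server methods.
jsonrpc_request() {
  # $1 = request id, $2 = method name, $3 = params as a JSON object
  printf '{"jsonrpc":"2.0","id":%s,"method":"%s","params":%s}\n' "$1" "$2" "$3"
}

jsonrpc_request 1 "example/navigate" '{"url":"https://example.com"}'
```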

Output

Commands produce concise, AI-agent-friendly output:

### Page
- URL: https://example.com/dashboard
- Title: Dashboard
### Snapshot
[Snapshot](.bap/snapshot-2026-02-16T19-30-42.yml)

Files are saved to .bap/ in the current directory:

Snapshots: .bap/snapshot-<timestamp>.yml
Screenshots: .bap/screenshot-<timestamp>.png
Extractions: .bap/extraction-<timestamp>.json
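A follow-up script often needs the most recent snapshot. The sketch below assumes timestamps keep the fixed-width form shown in the example above (2026-02-16T19-30-42), so sorting by name also sorts by time. `latest_snapshot` is a hypothetical helper.

```shell
# Print the newest snapshot file in a directory (default: .bap),
# or nothing if none exist.
# Assumes fixed-width timestamps in file names, so name order is time order.
latest_snapshot() {
  ls "${1:-.bap}"/snapshot-*.yml 2>/dev/null | LC_ALL=C sort | tail -n 1
}

# Example (not executed here):
#   cat "$(latest_snapshot)"
```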

AI Agent Integration

BAP CLI includes a SKILL.md file that teaches AI coding agents how to use it effectively. Install it to your agent:

bap install-skill           # Auto-detect and install to all agents
bap install-skill --dry-run # Preview what would be installed

Supports 13 AI coding agent platforms, including Claude Code, Codex CLI, Gemini CLI, Cursor, GitHub Copilot, Windsurf, Roo Code, Amp, Deep Agents, and OpenCode.

Migrating from playwright-cli

BAP is a drop-in replacement for playwright-cli. All e<N> refs from snapshots work identically:

| playwright-cli | bap |
| ------------------------------- | -------------------- |
| playwright-cli open [url] | bap open [url] |
| playwright-cli click e15 | bap click e15 |
| playwright-cli fill e5 "text" | bap fill e5 "text" |
| playwright-cli snapshot | bap snapshot |
| playwright-cli screenshot | bap screenshot |

BAP adds composite actions, semantic selectors, smart observation, and structured extraction on top.
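For scripts that use only the commands in the table above, the rename can be mechanical. The sketch below rewrites a leading playwright-cli to bap with sed. `migrate_to_bap` is a hypothetical helper, and its output should be reviewed, because playwright-cli commands outside the table are not confirmed here as equivalent.

```shell
# Rewrite leading `playwright-cli` invocations to `bap`, reading stdin.
# Leading indentation is preserved; comment lines are left alone.
migrate_to_bap() {
  sed 's/^\([[:space:]]*\)playwright-cli /\1bap /'
}

printf '%s\n' 'playwright-cli click e15' 'playwright-cli snapshot' | migrate_to_bap
```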

Requirements

Node.js >= 20.0.0
Playwright browsers (npx playwright install chromium)
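A setup script can check the Node.js requirement before installing. This is a minimal sketch that compares only the major version; `node_major_ok` is a hypothetical helper.

```shell
# Return success if a Node.js version string (e.g. v20.11.1) has major >= 20.
node_major_ok() {
  major="${1#v}"        # strip the leading "v"
  major="${major%%.*}"  # keep the part before the first dot
  [ "$major" -ge 20 ] 2>/dev/null
}

# Report on the installed Node.js, if any.
if command -v node >/dev/null 2>&1; then
  if node_major_ok "$(node --version)"; then
    echo "Node.js version OK"
  else
    echo "Node.js >= 20.0.0 is required" >&2
  fi
fi
```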

License

Apache-2.0
