open-browse

v0.1.0

Published

19 days ago

TypeScript-native browser automation tool for AI agents


linc.3395

browser automation playwright typescript ai-agent cli web-scraping testing

open-browse

TypeScript-native browser automation CLI — optimized for AI agents

43 commands for browser control via the CLI, grouped into the nine categories below. Every command returns structured JSON, so AI agent workflows can consume the output with zero parsing ambiguity.

Install

npm install -g open-browse

Playwright's Chromium build is installed automatically by the package's postinstall script.

Quick Start

# Launch browser (auto-sets as current session)
browser-driver open --url https://app.com --name my-session

# See what's on the page
browser-driver state

# Interact
browser-driver click "#login-btn"
browser-driver fill "#email" "user@example.com"

# Take screenshot (base64 to stdout)
browser-driver screenshot

# Done
browser-driver close
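When an agent drives the CLI from code instead of a shell, passing arguments as an array avoids shell-quoting problems with selectors such as `#login-btn`. The following is a minimal sketch of that idea. `buildArgv` is a hypothetical helper, not part of the package; the command names and flags are the ones listed in this README.

```typescript
// Hypothetical helper: turns a command, its positional arguments, and its
// flags into an argv array that can be handed to a process-spawning API.
type FlagValue = string | number | boolean | undefined;

function buildArgv(
  command: string,
  positional: string[] = [],
  flags: { [name: string]: FlagValue } = {}
): string[] {
  const argv: string[] = [command].concat(positional);
  for (const name of Object.keys(flags)) {
    const value = flags[name];
    if (value === undefined || value === false) continue; // omit unset flags
    argv.push("--" + name);
    if (value !== true) argv.push(String(value)); // boolean flags carry no value
  }
  return argv;
}

const openArgs = buildArgv("open", [], { url: "https://app.com", name: "my-session" });
const fillArgs = buildArgv("fill", ["#email", "user@example.com"]);
```

The resulting arrays could then be passed, for example, to Node's `child_process.spawnSync("browser-driver", openArgs)`.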

Command Reference

Session (5)

| Command | Description |
|---------|-------------|
| `open --url <url> --name <name> [--use]` | Launch browser, create session, auto-set as current |
| `close` | Close session and browser |
| `sessions` | List all active sessions |
| `use <name>` | Set current working session |
| `current` | Show current session |

Navigation (4)

| Command | Description |
|---------|-------------|
| `goto <url>` | Navigate to URL |
| `back` / `forward` / `reload` | History navigation |

Inspection (4)

| Command | Description |
|---------|-------------|
| `state` | Page URL, title, and interactive elements |
| `dom <selector>` | Get outer HTML |
| `inspect <selector>` | Element details (tag, bbox, styles) |
| `source` | Full page HTML source |

Interaction (8)

| Command | Description |
|---------|-------------|
| `click <selector>` | Click an element |
| `fill <selector> <text>` | Fill input field |
| `select <selector> <value>` | Select dropdown option |
| `hover <selector>` | Hover over element |
| `scroll-to <selector>` | Scroll element into viewport |
| `scroll-by --y <px>` | Scroll by pixel offset |
| `screenshot [--out <file>]` | Screenshot (base64 to stdout, or written to file) |
| `upload <selector> <path>` | Upload file to input |

Wait (3)

| Command | Description |
|---------|-------------|
| `wait <selector> [--timeout <ms>]` | Wait for element |
| `wait-text <text>` | Wait for text to appear |
| `wait-url <pattern>` | Wait for URL match |

Get (6)

| Command | Description |
|---------|-------------|
| `get-title` / `get-url` | Page title or URL |
| `get-text <selector>` | Element text content |
| `get-value <selector>` | Input/select value |
| `get-html <selector>` | Element HTML |
| `get-attributes <selector>` | All attributes |

Cookies (5)

| Command | Description |
|---------|-------------|
| `cookies-get` / `cookies-clear` | Get or clear all cookies |
| `cookies-set <name> <value>` | Set cookie |
| `cookies-export` / `cookies-import <file>` | Export/import JSON |

Snapshot & Diff (5)

| Command | Description |
|---------|-------------|
| `snapshot <name> [--scope <sel>]` | Save DOM + styles snapshot |
| `diff dom/elements/style/attrs <a> <b>` | Compare snapshots |
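Diffing saves tokens because only the changes are reported, not the whole page. The sketch below illustrates the idea for attributes only. It assumes a snapshot can be modeled as a map from selector to attribute map; the real snapshot format is not described here, so `Snapshot`, `AttrChange`, and `diffAttrs` are hypothetical names and this is not the tool's implementation.

```typescript
// Assumed shapes: selector -> (attribute name -> value).
type Attrs = { [name: string]: string | undefined };
type Snapshot = { [selector: string]: Attrs | undefined };

interface AttrChange {
  selector: string;
  attr: string;
  before: string | undefined; // undefined means the attribute was absent
  after: string | undefined;
}

// Returns the keys of both objects without duplicates, in first-seen order.
function unionKeys(x: object, y: object): string[] {
  const keys = Object.keys(x);
  for (const key of Object.keys(y)) {
    if (keys.indexOf(key) === -1) keys.push(key);
  }
  return keys;
}

// Reports every attribute whose value differs between the two snapshots.
function diffAttrs(a: Snapshot, b: Snapshot): AttrChange[] {
  const changes: AttrChange[] = [];
  for (const selector of unionKeys(a, b)) {
    const before: Attrs = a[selector] ?? {};
    const after: Attrs = b[selector] ?? {};
    for (const attr of unionKeys(before, after)) {
      if (before[attr] !== after[attr]) {
        changes.push({ selector, attr, before: before[attr], after: after[attr] });
      }
    }
  }
  return changes;
}
```

Unchanged elements and attributes produce no output at all, which is where the size reduction comes from.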

Script System (3)

| Command | Description |
|---------|-------------|
| `run <script> [--args <json>]` | Execute a script |
| `list-scripts` | List available scripts |
| `describe <script> [--json]` | Show script docs |
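Because `run` takes its arguments as a JSON string, it is easy to get the quoting wrong when the command is assembled by hand. A small sketch of building the argv programmatically follows. `buildRunArgv` and the script name `login-flow` are hypothetical and used only for illustration.

```typescript
// Hypothetical helper: builds the argv for `run <script> [--args <json>]`.
// JSON.stringify produces the single string that follows --args.
function buildRunArgv(script: string, args?: object): string[] {
  const argv = ["run", script];
  if (args !== undefined) {
    argv.push("--args", JSON.stringify(args));
  }
  return argv;
}

const runArgv = buildRunArgv("login-flow", { user: "demo" });
```

Passing this array directly to a spawn API keeps the JSON intact, with no shell escaping of the quotes.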

Agent Workflow Example

# Setup
browser-driver open --url https://app.com/login --name task1

# Observe → Act → Wait → Verify
browser-driver state
browser-driver fill "#email" "user@example.com"
browser-driver fill "#password" "secret123"
browser-driver click "#login-btn"
browser-driver wait-url "**/dashboard" --timeout 10000
browser-driver get-title
# → { "title": "Dashboard - App" }
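The same Observe → Act → Wait → Verify sequence can be sketched in TypeScript. This is an illustration under assumptions: `Runner` and `login` are hypothetical names, and the runner is injected so that the flow can be exercised without a browser. In real use it might wrap a synchronous spawn of `browser-driver` and parse the JSON it prints.

```typescript
// A runner executes one CLI command (given as argv) and returns its parsed
// JSON output. It is injected so the flow does not depend on a real browser.
type Runner = (argv: string[]) => unknown;

function login(run: Runner, email: string, password: string): string {
  run(["state"]);                                          // observe
  run(["fill", "#email", email]);                          // act
  run(["fill", "#password", password]);
  run(["click", "#login-btn"]);
  run(["wait-url", "**/dashboard", "--timeout", "10000"]); // wait
  // verify: the { "title": ... } shape is taken from the example output above
  const result = run(["get-title"]) as { title?: unknown };
  if (typeof result.title !== "string") {
    throw new Error("get-title returned no title");
  }
  return result.title;
}
```

Keeping the runner as a parameter also makes it straightforward to log or replay the exact command sequence an agent issued.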

Key Features

- JSON everywhere: stdout for success, stderr for errors, no parsing ambiguity
- Session persistence: daemon keeps browser alive between commands (~50ms per command)
- state command: returns all interactive elements indexed, the agent's primary view
- snapshot + diff: token-efficient change detection (125x fewer tokens than raw HTML)
- Self-discovery: list-scripts + describe --json for runtime command discovery
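The stdout/stderr split can be turned into a single result type on the caller's side. The sketch below is one way to do that, not the package's own API: `RawResult`, `CommandResult`, `parseResult`, and `safeParse` are hypothetical names, and treating any non-empty stderr as a failure is an assumption.

```typescript
// Raw process output, e.g. as collected from a spawn call.
interface RawResult {
  stdout: string;
  stderr: string;
}

type CommandResult =
  | { ok: true; data: unknown }
  | { ok: false; error: unknown };

// Falls back to the raw string when the text is not valid JSON.
function safeParse(text: string): unknown {
  try {
    return JSON.parse(text);
  } catch {
    return text;
  }
}

// Assumption: a failed command writes JSON to stderr, so any non-empty
// stderr is treated as an error; otherwise stdout holds the success payload.
function parseResult(raw: RawResult): CommandResult {
  const errText = raw.stderr.trim();
  if (errText.length > 0) {
    return { ok: false, error: safeParse(errText) };
  }
  return { ok: true, data: safeParse(raw.stdout.trim()) };
}
```

An agent can then branch on `ok` once, instead of inspecting two streams after every command.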

Architecture

CLI (yargs) → HTTP → Daemon (Node.js) → Browser (Playwright)
                         ↓
                    Session Map
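To make the Session Map box in the diagram concrete, here is a minimal in-memory sketch of how a daemon could track named sessions and a current one. It is an illustration only: the class, its method names, and its error behavior (for example rejecting a duplicate name) are assumptions, and the session payload is generic so the sketch has no Playwright dependency.

```typescript
// Hypothetical session registry keyed by session name.
class SessionMap<S> {
  // Object.create(null) avoids collisions with inherited keys like "constructor".
  private sessions: { [name: string]: S | undefined } = Object.create(null);
  private currentName: string | null = null;

  // Mirrors `open`: register a session and make it current.
  open(name: string, session: S): void {
    if (this.sessions[name] !== undefined) {
      throw new Error('session "' + name + '" already exists'); // assumed behavior
    }
    this.sessions[name] = session;
    this.currentName = name;
  }

  // Mirrors `use <name>`: switch the current session.
  use(name: string): void {
    if (this.sessions[name] === undefined) {
      throw new Error('no session named "' + name + '"');
    }
    this.currentName = name;
  }

  // Mirrors `current`: the session that later commands act on.
  current(): { name: string; session: S } {
    const name = this.currentName;
    const session = name === null ? undefined : this.sessions[name];
    if (name === null || session === undefined) {
      throw new Error("no current session");
    }
    return { name, session };
  }

  // Mirrors `close`: drop the current session.
  close(): void {
    const name = this.current().name;
    delete this.sessions[name];
    this.currentName = null;
  }

  // Mirrors `sessions`: list the names of all active sessions.
  list(): string[] {
    return Object.keys(this.sessions);
  }
}
```

Because the daemon process owns this map, each CLI invocation only needs to send a short HTTP request naming the command, which fits the per-command latency described under Key Features.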

License

MIT
