@browser-cli/cli
v0.3.1
Browser-CLI
Skill-powered browser automation CLI for AI agents — real browser extensions, optional CDP, no Playwright.
Why Browser-CLI?
Most browser automation tools (Playwright, Puppeteer, Selenium) rely on CDP or WebDriver protocols — running a headless or debug-mode browser that doesn't behave like a real user's browser. Browser-CLI takes a different approach:
- Real browser, zero fingerprint: Runs inside your actual Chrome/Firefox via a lightweight extension. No `navigator.webdriver`, no headless flags, no CDP traces by default, so it behaves exactly like a human user's browser, minimizing the risk of triggering anti-bot detection.
- Same session, same identity: Operates in your existing browser with all your cookies, login state, and extensions intact. No separate browser profile or cold start.
- Skill-first design: Ships with a skill definition so AI agents (Claude Code, etc.) can call `/browser-cli` as a tool and automate tasks autonomously.
- CDP-free by default: The extension communicates over WebSocket, with no dependency on browser drivers. Opt-in `--debugger` flag uses CDP for trusted (`isTrusted=true`) input events when needed.
- Agent-friendly output: Accessibility snapshots with element refs (`@e1`, `@e2`), semantic locators, structured errors with hints, and `--json` mode.
Architecture
CLI (client) ── NDJSON / Unix socket ──→ Daemon (server) ── JSON / WebSocket ──→ Extension (browser)

The CLI sends commands to a background daemon, which relays them over WebSocket to a browser extension. The extension executes commands via Chrome APIs or content scripts and returns results through the same path. CDP-free by default: just a lightweight extension in your real browser. Opt-in --debugger flag uses CDP for trusted input events when needed.
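The CLI-to-daemon hop uses NDJSON, meaning one JSON document per line. The sketch below shows how such framing could be encoded and incrementally decoded when a socket delivers data in arbitrary chunks. The `CommandMessage` shape and the `NdjsonDecoder` class are assumptions made for illustration; the real protocol types live in `@browser-cli/shared` and may look different.

```typescript
// Hypothetical message shape, for illustration only.
interface CommandMessage {
  id: number;
  command: string;
  args: Record<string, unknown>;
}

// Serialize one message as a single NDJSON line.
// JSON.stringify escapes newlines inside strings, so the trailing "\n"
// is the only raw newline on the wire.
function encodeNdjson(message: unknown): string {
  return JSON.stringify(message) + "\n";
}

// Incrementally decode NDJSON from chunks that may split a line anywhere.
class NdjsonDecoder {
  private buffer = "";

  // Feed one chunk; returns every message completed by this chunk.
  push(chunk: string): unknown[] {
    this.buffer += chunk;
    const lines = this.buffer.split("\n");
    // The last element is an unfinished line ("" if the chunk ended on a newline).
    this.buffer = lines.pop() ?? "";
    return lines
      .filter((line) => line.trim() !== "")
      .map((line) => JSON.parse(line) as unknown);
  }
}

const request: CommandMessage = {
  id: 1,
  command: "navigate",
  args: { url: "https://example.com" },
};
const wire = encodeNdjson(request);

const decoder = new NdjsonDecoder();
const afterPartial = decoder.push(wire.slice(0, 10)); // incomplete line, nothing decoded yet
const afterRest = decoder.push(wire.slice(10)); // line completed, one message decoded
```

Buffering until a newline arrives is what lets the receiver tolerate chunk boundaries that fall in the middle of a message.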
Features
Page & Content
- Navigate (goto, back, forward, reload), take screenshots
- Extract clean readable Markdown via Defuddle — strips nav, ads, and boilerplate
- Accessibility snapshots with element refs (`@e1`, `@e2`) for precise interaction
- Evaluate JavaScript in page context
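To make the idea of element refs concrete, here is a minimal sketch of how a snapshot could number interactive nodes as `@e1`, `@e2`, and so on while walking an accessibility tree. Everything in it is an assumption for illustration: the `AxNode` shape, the set of roles treated as interactive, and the text layout are not taken from Browser-CLI's actual output.

```typescript
// Hypothetical, simplified accessibility node.
interface AxNode {
  role: string;
  name?: string;
  children?: AxNode[];
}

// Assumed set of roles that receive a ref; the real tool may choose differently.
const INTERACTIVE_ROLES = new Set(["button", "link", "textbox", "checkbox", "combobox"]);

// Walk the tree depth-first, render one indented line per node,
// and hand out @eN refs to interactive nodes in document order.
function renderSnapshot(root: AxNode): { text: string; refs: Map<string, AxNode> } {
  const refs = new Map<string, AxNode>();
  const lines: string[] = [];

  const visit = (node: AxNode, depth: number): void => {
    let line = `${"  ".repeat(depth)}- ${node.role}`;
    if (node.name) line += ` "${node.name}"`;
    if (INTERACTIVE_ROLES.has(node.role)) {
      const ref = `@e${refs.size + 1}`;
      refs.set(ref, node);
      line += ` [${ref}]`;
    }
    lines.push(line);
    for (const child of node.children ?? []) visit(child, depth + 1);
  };

  visit(root, 0);
  return { text: lines.join("\n"), refs };
}

const page: AxNode = {
  role: "main",
  children: [
    { role: "heading", name: "Sign in" },
    { role: "textbox", name: "Email" },
    { role: "button", name: "Submit" },
  ],
};
const snapshot = renderSnapshot(page);
```

An agent reading such a snapshot can then refer to a ref instead of composing a selector, which is the kind of interaction the refs are meant to enable.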
Interaction
- Actions — click, fill, type, press, hover, drag & drop, scroll, check/uncheck, select, upload
- Semantic locators — find elements by role, text, label, placeholder, alt, title, testid, xpath
- `find` command: locate and act in one step, e.g. `find 'role=button[name="Submit"]'`
- Wait for selector, URL, duration, text, load state, or custom function
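The locator strings above follow a `type=value` pattern, with an optional `[name="..."]` qualifier on role locators. As a rough sketch of how such a string could be split into structured parts, the function below parses that shape. The grammar is inferred only from the examples in this README, and `parseLocator` and `ParsedLocator` are hypothetical names, not part of Browser-CLI.

```typescript
// Locator types listed in the feature list above.
const LOCATOR_TYPES = [
  "role", "text", "label", "placeholder", "alt", "title", "testid", "xpath",
] as const;
type LocatorType = (typeof LOCATOR_TYPES)[number];

interface ParsedLocator {
  type: LocatorType;
  value: string;
  name?: string; // only set for role locators with a [name="..."] qualifier
}

// Returns null when the input is not a semantic locator (e.g. a CSS selector).
function parseLocator(input: string): ParsedLocator | null {
  const eq = input.indexOf("=");
  if (eq <= 0) return null;

  const rawType = input.slice(0, eq);
  if (!(LOCATOR_TYPES as readonly string[]).includes(rawType)) return null;
  const type = rawType as LocatorType;

  const rest = input.slice(eq + 1);
  // Assumption: only role locators accept the [name="..."] qualifier.
  const qualified = /^(.*)\[name="([^"]*)"\]$/.exec(rest);
  if (type === "role" && qualified) {
    return { type, value: qualified[1] ?? "", name: qualified[2] ?? "" };
  }
  return { type, value: rest };
}

const submit = parseLocator('role=button[name="Submit"]');
const email = parseLocator("label=Email");
const css = parseLocator("#submit");
```

Splitting on the first `=` keeps values that themselves contain `=` intact, which matters for `xpath=` locators with attribute predicates.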
Browser State
- Tabs, windows, tab groups; frame switching
- Cookies, localStorage, sessionStorage
- Network interception (block, redirect, track)
- Dialogs (alert/confirm/prompt), console logs
Scripting
- `script` command: run multi-step automation as a single ES module (Node.js process)
- Browser SDK with 1:1 CLI command mapping (`browser.navigate()`, `browser.click()`, etc.)
- Supports stdin, file input, CLI arguments, and per-command timeouts
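As an illustration of scripting against an SDK that mirrors CLI commands, the sketch below writes a two-step flow against a small interface and exercises it with a recording stub. Only `navigate()` and `click()` are named in this README; their parameter and return types, the `BrowserSdk` interface, and the way a real script receives its `browser` object are all assumptions here. See skills/browser-cli/SKILL.md for the actual contract.

```typescript
// Assumed subset of the SDK surface; signatures are guesses for illustration.
interface BrowserSdk {
  navigate(url: string): Promise<unknown>;
  click(selector: string): Promise<unknown>;
}

// A multi-step flow written once against the interface.
async function openAndSubmit(browser: BrowserSdk, url: string): Promise<void> {
  await browser.navigate(url);
  await browser.click('role=button[name="Submit"]');
}

// Recording stub so the flow can be dry-run without a browser or daemon.
function createRecordingBrowser(): { browser: BrowserSdk; calls: string[] } {
  const calls: string[] = [];
  const browser: BrowserSdk = {
    async navigate(url) {
      calls.push(`navigate ${url}`);
    },
    async click(selector) {
      calls.push(`click ${selector}`);
    },
  };
  return { browser, calls };
}
```

Keeping the flow as a function of the `browser` object makes it easy to dry-run the step order before pointing it at a live session.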
Data & Config
- Query text, HTML, value, attributes, element state, count, bounding box
- Viewport, geolocation, media preferences, custom headers
- Bookmarks & history management
Site-Specific Guides — Pre-built Automation for Popular Sites
Browser-CLI ships with site-specific guides that contain tested CSS selectors, extraction scripts, and interaction patterns for popular websites. When an AI agent automates a known site, it can skip trial-and-error DOM exploration and use the pre-built scripts directly, which saves tokens and improves accuracy.
| Site | What's Covered |
| --------------------- | -------------------------------------------------------- |
| google.com | Search results extraction, pagination, "People also ask" |
| scholar.google.com | Academic paper search, citations, metadata |
| mail.google.com | Gmail inbox, email reading, compose, labels |
| youtube.com | Video search, transcript extraction, captions |
| x.com | Timeline, tweets, search, profiles |
| reddit.com | Feeds, posts, threaded comments, subreddit search |
| news.ycombinator.com | Front page, comments, search |
| linkedin.com | Company pages, people search, feed, job listings |
| discord.com | Servers, channels, messages, search, members |
| quora.com | Question search, answers extraction |
| xiaohongshu.com | Search, note detail, comments |
| weixin.sogou.com | WeChat article search |
| jira-datacenter | Self-hosted Jira issue tracking, agile boards |
| opensearch-dashboards | Self-hosted log analytics (Kibana fork) |
Each guide includes ready-to-use browser-cli eval scripts, key selectors, pagination/scroll patterns, and site-specific gotchas (auth requirements, shadow DOM, SPA caveats). Community contributions welcome — use the site-guide skill to interactively explore a site's live DOM and generate a tested guide, or see the contributing guide for manual authoring.
Quick Start
npm install -g @browser-cli/cli

Then install the browser extension from GitHub Releases and load it into Chrome or Firefox. For detailed steps (extension loading, daemon connection, troubleshooting), see the Setup Guide.
# Start the daemon and verify connection
browser-cli start
browser-cli status
# Navigate and inspect
browser-cli navigate https://example.com
browser-cli snapshot -ic
# Interact with elements
browser-cli click 'role=button[name="Submit"]'
browser-cli fill 'label=Email' 'user@example.com'
browser-cli find 'role=button[name="Submit"]'
# Extract page content
browser-cli markdown
# Stop the daemon
browser-cli stop

For the full command reference, including all operations, selector types, semantic locators, and workflow examples, see skills/browser-cli/SKILL.md.
Install as Claude Code Skill
Browser-CLI ships with a skill definition that lets AI agents use it as a tool. To install it in Claude Code:
# Add the marketplace
/plugin marketplace add six-ddc/browser-cli
# Install the skill
/plugin install browser-cli@six-ddc/browser-cli

Once installed, agents can invoke /browser-cli directly in Claude Code conversations:
# /browser-cli navigate to hacker news and get the top 3 stories

Development
Prerequisites
- Node.js >= 20
- pnpm >= 10
- Bun (used to build the CLI)
Setup
git clone https://github.com/six-ddc/browser-cli.git
cd browser-cli
pnpm install
pnpm build
# Link the CLI command globally for local development
cd apps/cli && pnpm link --global

After linking, the browser-cli command will be available system-wide. Changes to the code will be reflected after running pnpm build (no need to re-link).
# Start extension in dev mode (hot reload)
pnpm --filter @browser-cli/extension dev
# Build CLI only
pnpm --filter @browser-cli/cli build
# Run quality checks
pnpm lint # ESLint
pnpm typecheck # TypeScript
pnpm test # Vitest
# Format code
pnpm format

To unlink the global command:
cd apps/cli && pnpm unlink --global

Packages
| Package | Path | Description |
| ------------------------ | ----------------- | -------------------------------------------------- |
| @browser-cli/cli | apps/cli | CLI client + daemon process |
| @browser-cli/extension | apps/extension | Browser extension — Chrome + Firefox (WXT + React) |
| @browser-cli/shared | packages/shared | Protocol types, Zod schemas, constants |
