agentbrowse

v0.2.0

Published

18 hours ago

Agent-browser CLI: drive any website from the terminal.

mandarsagarwagh

agentbrowse

Drive any website from the terminal — built for AI coding agents.

Agents (Claude Code, Codex, …) are great at running CLIs and clumsy at clicking through web UIs. agentbrowse gives them a clean, parseable surface: open a page, read it as token-bounded markdown, follow links, fill and submit forms, and operate behind a login — all from terminal commands, with a persistent browser session that survives across invocations.

There is no separate web interface to wire up. The agent runs agentbrowse and gets structured output back.

Install

npm install -g agentbrowse
# first run downloads the browser:
npx playwright install chromium

Quickstart

agentbrowse open https://example.com    # navigate the session
agentbrowse read                        # current page as clean markdown
agentbrowse links                       # numbered, followable links
agentbrowse click "Learn more"          # click by visible text...
agentbrowse click 2                     # ...or by a number from links/find
agentbrowse read --json                 # structured output for machines
agentbrowse stop                        # end the session (frees the browser)

read/links accept an optional URL to open first, so agentbrowse read https://x.com is "open then read" in one step.

Make your agent use it by default

Install agentbrowse as a skill so your coding agent reaches for it automatically on web tasks — no prompting required:

npx agentbrowse skill           # auto-detects Claude Code, Codex, Cursor, Gemini, Windsurf in this project
npx agentbrowse skill --global  # or install once at the user level

It writes each agent's native format — a SKILL.md for Claude Code, a .cursor/rules rule for Cursor, an AGENTS.md block for Codex and others — and is safe to re-run (idempotent). Preview with --print; target one with npx agentbrowse skill claude|codex|cursor|gemini|windsurf.
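One common way to make a "write a block into AGENTS.md" step safe to re-run is to wrap the block in markers and replace whatever sits between them. The sketch below illustrates that general technique only; the marker strings and the `upsert_block` helper are hypothetical and are not taken from agentbrowse itself.

```python
import re

# Hypothetical markers for illustration; the delimiters agentbrowse
# actually writes (if any) are not documented in this README.
BEGIN = "<!-- agentbrowse:begin -->"
END = "<!-- agentbrowse:end -->"


def upsert_block(document: str, body: str) -> str:
    """Insert or replace the marked block so repeated runs converge."""
    block = f"{BEGIN}\n{body.strip()}\n{END}"
    pattern = re.compile(re.escape(BEGIN) + r".*?" + re.escape(END), re.DOTALL)
    if pattern.search(document):
        # Replace the existing block in place. Using a function as the
        # replacement keeps backslashes in `block` from being interpreted.
        return pattern.sub(lambda _match: block, document, count=1)
    if not document.strip():
        # Empty file: the block is the whole document.
        return block + "\n"
    # No block yet: append it, separated from prior content by a blank line.
    return document.rstrip("\n") + "\n\n" + block + "\n"
```

Running the helper twice with the same body returns the same text, which is the property "idempotent" refers to.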

Claude Code plugin — install the skill from this repo's built-in marketplace:

/plugin marketplace add mandarwagh9/agentbrowse-skill
/plugin install agentbrowse@agentbrowse

How it works

A background browser daemon (auto-spawned per session, local socket only) holds a live Playwright page, so state persists between separate commands — open, then later click, then read, all hit the same page. The daemon self-stops after inactivity.

Sessions are isolated by --session <id> (default default), each with its own cookies and saved auth.
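Because state lives in the daemon rather than in the calling process, a script can issue separate invocations and still act on the same page. A minimal Python sketch of such a wrapper follows; `build_argv` and `run` are hypothetical helper names, and placing `--session` and `--json` after the command's arguments is an assumption about flag order, not something this README specifies.

```python
import subprocess
from typing import Optional, Sequence


def build_argv(command: str, args: Sequence[str] = (),
               session: Optional[str] = None, json_output: bool = True) -> list:
    """Build the argument list for one agentbrowse invocation."""
    argv = ["agentbrowse", command, *args]
    if session is not None:
        # Assumption: flags are accepted after the positional arguments.
        argv += ["--session", session]
    if json_output:
        argv.append("--json")
    return argv


def run(command: str, args: Sequence[str] = (), **options):
    """Run one command; each call is a separate process sharing the session."""
    return subprocess.run(build_argv(command, args, **options),
                          capture_output=True, text=True)
```

With this, `run("open", ["https://example.com"], session="work")` followed later by `run("read", session="work")` would address the same session, while omitting `session` leaves the CLI on its default.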

Commands

| Command | What it does |
|---|---|
| open <url> | Navigate the session to a URL |
| read [url] | Current page (or open <url> first) as token-bounded markdown (--max-chars, --page) |
| links [url] | Numbered, followable links (--filter) |
| snapshot [url] | Accessibility-tree view: every actionable element with a stable [ref], role, name, state (--filter, --max, --json). The robust way to act |
| find <text> | Locate elements by visible text (falls back to accessible name); numbers reusable by click |
| click <target> | Click by a snapshot ref (robust), visible text, a links/find number, or a CSS selector |
| type <field> <text> | Type into a field (CSS selector or bare name) |
| fill -f name=value … | Fill form fields |
| submit [form] | Submit the current form |
| login <url> | Open a real browser to authenticate once; persists the session for headless reuse |
| session save\|load\|clear | Manage saved auth/session state |
| stop | Stop the session's browser daemon |

Add --json to any command for structured output. Errors go to stderr as { "error": { code, message } } with a non-zero exit code (2 usage, 3 navigation, 4 target-not-found, 5 daemon).
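A caller can branch on these exit codes and read the JSON error from stderr. The Python sketch below shows one way to do that. `CliError` and `parse_error` are hypothetical names, and the sketch assumes stderr holds a single JSON document; if it does not parse, the raw text is kept as the message.

```python
import json
from dataclasses import dataclass
from typing import Optional

# Exit-code meanings as listed above; anything else is reported as "unknown".
EXIT_KINDS = {2: "usage", 3: "navigation", 4: "target-not-found", 5: "daemon"}


@dataclass
class CliError:
    exit_code: int
    kind: str
    code: str
    message: str


def parse_error(exit_code: int, stderr: str) -> Optional[CliError]:
    """Turn an exit code plus stderr text into a structured error, or None."""
    if exit_code == 0:
        return None
    kind = EXIT_KINDS.get(exit_code, "unknown")
    text = stderr.strip()
    try:
        # Assumption: stderr is exactly one { "error": { code, message } } object.
        err = json.loads(text)["error"]
        return CliError(exit_code, kind, str(err["code"]), str(err["message"]))
    except (ValueError, KeyError, TypeError):
        # Not the expected JSON shape: fall back to the raw stderr text.
        return CliError(exit_code, kind, "", text)
```

An agent loop might retry on "navigation", run `snapshot` or `find` again on "target-not-found", and stop on "usage".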

Authentication

agentbrowse login https://site/login opens a real browser window for you to log in, covering the SSO, MFA, and captcha steps that an agent can't complete. On success it saves the session's cookies locally (~/.webcli/sessions/<id>/, gitignored); the agent then operates headlessly with that session. Credentials are typed into the browser by a human and are never passed as CLI arguments.

Site manifests (optional)

Point agentbrowse at a site.agent.json to expose named, high-level commands for a specific site:

agentbrowse --site ./notion.agent.json search "roadmap"

A manifest declares pages, selectors, and commands as ordered steps with {pages.*} / {selectors.*} / {arg} / ${ENV} interpolation. Schema version: webcli-manifest-v0. See AGENTS.md for the agent-usage guide and the manifest format.
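To make the four placeholder forms concrete, here is a small Python sketch of how such interpolation could be resolved. It is an illustration under assumptions, not the tool's implementation: the `interpolate` function and the manifest layout (a dict with `pages` and `selectors` maps) are hypothetical, and the authoritative format is the one described in AGENTS.md.

```python
import os
import re
from typing import Mapping, Optional

# Matches ${ENV_NAME} first, then {pages.x}, {selectors.x}, or {arg}.
_TOKEN = re.compile(r"\$\{(\w+)\}|\{([\w.]+)\}")


def interpolate(template: str, manifest: Mapping, args: Mapping,
                env: Optional[Mapping] = None) -> str:
    """Expand placeholders in one step string; raises KeyError if one is missing."""
    env = os.environ if env is None else env

    def resolve(match):
        env_name, ref = match.group(1), match.group(2)
        if env_name is not None:
            return env[env_name]             # ${ENV}
        section, _, key = ref.partition(".")
        if key and section in ("pages", "selectors"):
            return manifest[section][key]    # {pages.*} / {selectors.*}
        return str(args[ref])                # {arg}

    return _TOKEN.sub(resolve, template)
```

For example, with a hypothetical manifest whose `pages.search` is `https://example.com/search`, the template `{pages.search}?q={query}` and the argument `query="roadmap"` would expand to `https://example.com/search?q=roadmap`.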

License

Free to use, including commercially — but not copyable. You may install, run, and use agentbrowse; you may not copy, fork, redistribute, or modify it. See LICENSE. (Versions 0.0.1–0.1.1 were released under MIT and keep that license.)
