agentic-browser
v2.5.1
CLI and MCP server for AI agents to control Chrome via CDP
Agentic Browser
CLI and MCP server to control a local Chrome session for AI agents.
Purpose
- Starts a managed Chrome session.
- Accepts commands (for example navigate).
- Returns structured JSON output that an LLM can parse directly.
- Optimized for low-latency command execution by reusing CDP connections.
Requirements
- Node.js 20+
- Installed Chrome
Install
npm install agentic-browser -g
Build (Development)
npm install
npm run build
Quality Checks
npm run format
npm run lint
npm test
Use Your Existing Chrome
By default, agentic-browser launches a fresh Chrome with a throwaway profile. If you want to use your logged-in sessions, cookies, or extensions, there are two ways:
Option 1: Control your running Chrome (recommended)
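This lets agentic-browser attach to your own Chrome, with your tabs, extensions, and logged-in sessions intact. Chrome has to be started with remote debugging enabled, so one relaunch is needed; after that there is no need to log in again.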
Step 1. Quit Chrome, then relaunch it with remote debugging enabled:
# macOS
/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome --remote-debugging-port=9222
# Linux
google-chrome --remote-debugging-port=9222
# Windows
chrome.exe --remote-debugging-port=9222
Chrome opens normally with all your tabs, extensions, and sessions intact.
Step 2. Connect agentic-browser:
agentic-browser agent start --cdp-url http://127.0.0.1:9222
Stopping the session will not close your Chrome.
Option 2: Launch a new Chrome with your profile
Important: You must quit Chrome first. Chrome locks its profile directory — if Chrome is already running, this command will fail.
# Quit Chrome, then:
agentic-browser agent start --user-profile default
This launches a new Chrome window using your default profile. You can also pass a custom profile path:
agentic-browser agent start --user-profile /path/to/chrome/profile
Default profile locations per platform:
- macOS: ~/Library/Application Support/Google/Chrome
- Linux: ~/.config/google-chrome
- Windows: %LOCALAPPDATA%\Google\Chrome\User Data
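The platform defaults listed above can be expressed as a small lookup. This is a minimal sketch: the function name and parameters are hypothetical, and how agentic-browser itself resolves the value default is not documented here.

```typescript
// Default Chrome profile roots as listed above, keyed by Node's process.platform values.
type Platform = "darwin" | "linux" | "win32";

// `home` and `localAppData` are passed in so the function stays pure and testable.
// Hypothetical helper, not part of the agentic-browser API.
function defaultChromeProfileDir(platform: Platform, home: string, localAppData = ""): string {
  switch (platform) {
    case "darwin":
      return `${home}/Library/Application Support/Google/Chrome`;
    case "linux":
      return `${home}/.config/google-chrome`;
    case "win32":
      return `${localAppData}\\Google\\Chrome\\User Data`;
  }
}

const linuxDir = defaultChromeProfileDir("linux", "/home/me");
```

The result could be passed to --user-profile as an absolute path when a script needs to be explicit about which profile it opens.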
Environment variables
These options can also be set via environment variables (CLI flags take precedence):
| Variable | Example | Description |
| ------------------------------ | ----------------------------- | ------------------------------- |
| AGENTIC_BROWSER_CDP_URL | http://127.0.0.1:9222 | Connect to a running Chrome |
| AGENTIC_BROWSER_USER_PROFILE | default or an absolute path | Launch with a real profile |
| AGENTIC_BROWSER_HEADLESS | true | Run Chrome in headless mode |
| AGENTIC_BROWSER_USER_AGENT | MyBot/1.0 | Override the browser user-agent |
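The precedence rule (CLI flags over environment variables) can be sketched as follows. The option shape and function name are illustrative assumptions, and treating only the literal string "true" as enabling headless mode is also an assumption rather than documented behavior.

```typescript
// Hypothetical option shape; only the four documented variables are modelled.
interface StartOptions {
  cdpUrl?: string;
  userProfile?: string;
  headless?: boolean;
  userAgent?: string;
}

type Env = Record<string, string | undefined>;

// A CLI flag wins whenever it is set; otherwise fall back to the environment variable.
function resolveStartOptions(flags: StartOptions, env: Env): StartOptions {
  return {
    cdpUrl: flags.cdpUrl ?? env.AGENTIC_BROWSER_CDP_URL,
    userProfile: flags.userProfile ?? env.AGENTIC_BROWSER_USER_PROFILE,
    // Assumption: only the literal string "true" enables headless mode.
    headless: flags.headless ?? (env.AGENTIC_BROWSER_HEADLESS === "true" ? true : undefined),
    userAgent: flags.userAgent ?? env.AGENTIC_BROWSER_USER_AGENT,
  };
}

const resolved = resolveStartOptions(
  { userAgent: "FlagBot/2.0" },
  { AGENTIC_BROWSER_USER_AGENT: "MyBot/1.0", AGENTIC_BROWSER_HEADLESS: "true" },
);
// resolved.userAgent comes from the flag; resolved.headless comes from the environment.
```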
Agent Commands (Recommended for LLMs)
The agent subcommand manages session state, auto-restarts on disconnect, generates command IDs, and retries failed commands automatically:
agentic-browser agent start
agentic-browser agent start --cdp-url http://127.0.0.1:9222
agentic-browser agent start --user-profile default
agentic-browser agent start --headless
agentic-browser agent start --user-agent "MyBot/1.0"
agentic-browser agent status
agentic-browser agent navigate https://example.com
agentic-browser agent click "#login"
agentic-browser agent content --mode summary
agentic-browser agent content --mode html --selector main
agentic-browser agent elements
agentic-browser agent elements --roles button,link --limit 20
agentic-browser agent memory-search "navigate:example.com" --domain example.com
agentic-browser agent stop
agentic-browser agent cleanup --dry-run --max-age-days 7
Discover Interactive Elements
List all clickable/interactive elements on the current page:
agentic-browser agent elements
agentic-browser agent elements --roles button,link,input --visible-only --limit 30
agentic-browser agent elements --selector "#main-content"
Returns a JSON array of elements with selectors and fallback selectors usable in the typed agent commands like agent click, agent type, and agent select:
{
"ok": true,
"action": "elements",
"elements": [
{
"selector": "#login-btn",
"fallbackSelectors": ["button[aria-label=\"Login\"]"],
"role": "button",
"text": "Login",
"enabled": true
}
],
"totalFound": 42,
"truncated": true
}
MCP responses are compact: visible, actions, and tagName are omitted to reduce token usage, and enabled is omitted when it is true. Responses also include a summary block with countsByRole and primaryActions so an LLM can identify the main controls faster. The full element shape is available via the programmatic API.
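A consumer of the elements response might type it and pick selectors along these lines. This is a sketch based only on the sample response above; the interface and function names are hypothetical, and fields not shown in the sample are left out.

```typescript
// Shape taken from the sample response above; `enabled` is optional because
// the compact MCP form omits it when it is true.
interface InteractiveElement {
  selector: string;
  fallbackSelectors?: string[];
  role: string;
  text?: string;
  enabled?: boolean;
}

interface ElementsResponse {
  ok: boolean;
  action: string;
  elements: InteractiveElement[];
  totalFound: number;
  truncated: boolean;
}

// Selectors to try in order: the primary one first, then each fallback.
function selectorCandidates(el: InteractiveElement): string[] {
  return [el.selector, ...(el.fallbackSelectors ?? [])];
}

// Find an enabled element by role and case-insensitive text match.
function findElement(res: ElementsResponse, role: string, text: string): InteractiveElement | undefined {
  const wanted = text.toLowerCase();
  return res.elements.find(
    (el) => el.role === role && (el.enabled ?? true) && (el.text ?? "").toLowerCase().includes(wanted),
  );
}

const sample: ElementsResponse = JSON.parse(
  '{"ok":true,"action":"elements","elements":[{"selector":"#login-btn","fallbackSelectors":["button[aria-label=\\"Login\\"]"],"role":"button","text":"Login","enabled":true}],"totalFound":42,"truncated":true}',
);
const login = findElement(sample, "button", "login");
const candidates = login ? selectorCandidates(login) : [];
```

When truncated is true, as in the sample, narrowing the query with --roles or --selector is likely more reliable than searching a partial list.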
When an element lives inside an open shadow root or a same-origin iframe, discovery returns a composed locator using >>> to cross boundaries, for example iframe[name="checkout"] >>> button[aria-label="Pay"]. The same locator string works with agent click, agent type, MCP browser_interact, and fallback selectors.
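To see how a composed locator breaks down, the string can be split on the >>> separator. This is a simplified sketch with a hypothetical helper name; it does not reflect how agentic-browser parses locators internally, and a literal >>> inside a quoted attribute value would be split incorrectly.

```typescript
// Split a composed locator into per-boundary segments.
// Simplification: a literal ">>>" inside a quoted attribute value would be split too.
function splitComposedLocator(locator: string): string[] {
  return locator
    .split(">>>")
    .map((segment) => segment.trim())
    .filter((segment) => segment.length > 0);
}

const segments = splitComposedLocator('iframe[name="checkout"] >>> button[aria-label="Pay"]');
// The leading segments name the iframe or shadow host; the last segment names the element itself.
```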
agent content and MCP browser_get_content now default to summary, which returns a low-token overview with title, headings, landmarks, primaryActions, inputs, alerts, and iframe metadata. Use a11y when you need the deeper accessibility structure, and text or html only when you need raw content.
## MCP Server
### Quick Setup
```bash
npx agentic-browser setup
```
Detects your AI tools (Claude Code, Cursor) and writes the MCP config automatically.
Manual Configuration
Add to your MCP config (.mcp.json, .cursor/mcp.json, etc.):
{
"mcpServers": {
"agentic-browser": {
"command": "npx",
"args": ["agentic-browser", "mcp"]
}
}
}
Low-Level CLI Commands
For direct control without session state management:
1. Start a Session
agentic-browser session:start
agentic-browser session:start --cdp-url http://127.0.0.1:9222
agentic-browser session:start --user-profile default
agentic-browser session:start --headless
agentic-browser session:start --user-agent "MyBot/1.0"
2. Read Session Status
agentic-browser session:status <sessionId>
3. Run a Command (navigate / interact)
agentic-browser command:run <sessionId> <commandId> navigate '{"url":"https://example.com"}'
agentic-browser command:run <sessionId> cmd-2 interact '{"action":"click","selector":"a"}'
More interact actions:
- {"action":"type","selector":"input[name=q]","text":"innoq"}
- {"action":"press","key":"Enter"}
- {"action":"waitFor","selector":"main","timeoutMs":4000}
- {"action":"goBack"}: browser back
- {"action":"goForward"}: browser forward
- {"action":"refresh"}: reload page
- {"action":"dialog"}: accept a JS dialog (alert/confirm/prompt)
- {"action":"dialog","text":"dismiss"}: dismiss a dialog
- {"action":"dialog","value":"answer"}: respond to a prompt dialog
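When command:run is driven from a script, the interact payloads listed above can be modelled as a union type and serialized into an argument vector. This is a sketch: the type and function names are illustrative, the session ID is a placeholder, and the optional fields are inferred from the examples rather than from a published schema.

```typescript
// Payload shapes copied from the actions listed above; the type name is illustrative.
type InteractPayload =
  | { action: "click"; selector: string }
  | { action: "type"; selector: string; text: string }
  | { action: "press"; key: string }
  | { action: "waitFor"; selector: string; timeoutMs?: number }
  | { action: "goBack" | "goForward" | "refresh" }
  | { action: "dialog"; text?: "dismiss"; value?: string };

// Build the arguments that follow `agentic-browser` on the command line.
// Passing them as an array (for example to child_process.execFile) avoids shell quoting of the JSON.
function interactArgs(sessionId: string, commandId: string, payload: InteractPayload): string[] {
  return ["command:run", sessionId, commandId, "interact", JSON.stringify(payload)];
}

// "session-1" is a placeholder; use the ID returned by session:start.
const args = interactArgs("session-1", "cmd-2", { action: "click", selector: "a" });
```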
4. Read Page Content
agentic-browser page:content <sessionId> --mode summary
agentic-browser page:content <sessionId> --mode title
agentic-browser page:content <sessionId> --mode html --selector main
5. Rotate Session Token
agentic-browser session:auth <sessionId>
6. Restart / Stop / Cleanup
agentic-browser session:restart <sessionId>
agentic-browser session:stop <sessionId>
agentic-browser session:cleanup --max-age-days 7
7. Task Memory
agentic-browser memory:search "navigate:example.com" --domain example.com --limit 5
agentic-browser memory:inspect <insightId>
agentic-browser memory:verify <insightId>
agentic-browser memory:stats
Recommended Agent Flow
1. agent start: launch Chrome and persist the session.
2. agent elements: discover what's on the page.
3. agent navigate, agent click, agent type, agent select, and related typed commands: execute actions using discovered selectors.
4. agent content: read page content after actions.
5. agent memory-search: reuse known selectors for repeated tasks.
6. agent stop: terminate when done.
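The flow above might be scripted roughly as follows. This is a sketch under stated assumptions: the run function is a hypothetical wrapper that executes one agentic-browser invocation and returns its parsed JSON, the result shape is reduced to the fields used here, and the sketch navigates before discovering elements so that there is a page to inspect.

```typescript
// Minimal result shape; real responses carry more fields.
interface AgentResult {
  ok: boolean;
  elements?: { selector: string; role: string; text?: string }[];
}

// `run` executes one CLI invocation and returns its parsed JSON. It is injected
// so the flow can be exercised without Chrome (hypothetical wrapper, not a library API).
type Runner = (args: string[]) => Promise<AgentResult>;

async function clickFirstControl(run: Runner, url: string): Promise<AgentResult> {
  await run(["agent", "start"]);
  try {
    await run(["agent", "navigate", url]);
    const found = await run(["agent", "elements", "--roles", "button,link", "--limit", "20"]);
    const target = found.elements?.[0];
    if (target) {
      await run(["agent", "click", target.selector]);
    }
    return await run(["agent", "content", "--mode", "summary"]);
  } finally {
    // Always stop the session, even if a step above throws.
    await run(["agent", "stop"]);
  }
}
```

Injecting the runner keeps the sequencing logic separate from process spawning, which makes it straightforward to test with a stub.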
Anti-Detection
Chrome is launched with stealth flags (--disable-blink-features=AutomationControlled, --disable-infobars, --excludeSwitches=enable-automation) and runtime patches that remove navigator.webdriver, fake plugins and languages, patch chrome.runtime, clean Function.prototype.toString, and patch Permissions.query. The HeadlessChrome token is automatically stripped from the user-agent string. Stealth is always on — no configuration needed.
Important Notes for LLMs
- Exactly one managed session is supported at a time.
- Session state is persisted in ~/.agentic-browser/ (override with AGENTIC_BROWSER_DIR).
- All commands print exactly one JSON line to stdout.
- The typed agent commands are the preferred interface for LLMs. Use low-level command:run only when you need raw JSON payload control.
- Parse only stdout as the result object, and use the exit code to determine success or failure.
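The last two notes can be combined into a small result handler. This is a sketch with hypothetical names; it assumes the caller has already captured stdout and the exit code from one CLI invocation, and it assumes a nonzero exit code always means failure.

```typescript
interface CommandResult {
  ok: boolean;
  [key: string]: unknown;
}

// Interpret one CLI invocation: the exit code decides success, stdout carries the JSON line.
function parseCommandOutput(stdout: string, exitCode: number): CommandResult {
  const line = stdout.trim();
  if (exitCode !== 0) {
    // Assumption: a nonzero exit code always signals failure; include stdout for context.
    throw new Error(`command failed with exit code ${exitCode}: ${line}`);
  }
  if (line.length === 0) {
    throw new Error("command succeeded but printed nothing on stdout");
  }
  return JSON.parse(line) as CommandResult;
}

const result = parseCommandOutput('{"ok":true,"action":"elements","elements":[]}\n', 0);
```

Anything written to stderr is ignored here on purpose, since only stdout is described as carrying the result.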
Programmatic API
import { createAgenticBrowserCore } from "agentic-browser";
const core = createAgenticBrowserCore();
const session = await core.startSession();
await core.runCommand({
sessionId: session.sessionId,
type: "navigate",
payload: { url: "https://example.com" },
});
const elements = await core.getInteractiveElements({
sessionId: session.sessionId,
roles: ["button", "link"],
visibleOnly: true,
limit: 30,
});
const memory = core.searchMemory({
taskIntent: "navigate:example.com",
siteDomain: "example.com",
limit: 3,
});
await core.stopSession(session.sessionId);
Connect to your running Chrome
// Chrome must be running with --remote-debugging-port=9222
const core = createAgenticBrowserCore({
env: { ...process.env, AGENTIC_BROWSER_CDP_URL: "http://127.0.0.1:9222" },
});
const session = await core.startSession();
// Stopping the session will NOT close your Chrome
Launch Chrome with your real profile
// Chrome must be closed first
const core = createAgenticBrowserCore({
env: { ...process.env, AGENTIC_BROWSER_USER_PROFILE: "default" },
});
const session = await core.startSession();
Documentation
npm run docs:dev # Dev server at localhost:5173
npm run docs:build # Static build