chrome-ai-action
v2.0.2
Published
AI agent browser automation — control Chrome programmatically from AI agents (opencode, etc.) via CDP
Maintainers
Readme
Chrome AI Action
Browser automation bridge for AI agents (like opencode). Controls Chrome programmatically via CDP (Chrome DevTools Protocol) — navigate, click, type, screenshot, extract content, and more.
Architecture
┌─────────────────────────────────────────┐
│ AI Agent │
│ (opencode, custom LLM agent) │
└──────────────────┬──────────────────────┘
│ HTTP JSON (127.0.0.1:9876)
▼
┌─────────────────────────────────────────┐
│ Chrome AI Action Bridge (CDP) │
│ native-host/native_host_bridge.js │
└──────────────────┬──────────────────────┘
│ CDP WebSocket (ws://127.0.0.1:9222)
▼
┌─────────────────────────────────────────┐
│ Chrome │
│ (--remote-debugging-port=9222) │
└─────────────────────────────────────────┘Quick Start
1. Start Chrome with Remote Debugging
# Windows
"C:\Program Files\Google\Chrome\Application\chrome.exe" --remote-debugging-port=9222
# macOS
/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome --remote-debugging-port=9222
# Linux
google-chrome --remote-debugging-port=92222. Install & Start the Bridge
# Install globally
npm install -g chrome-ai-action
# Start the bridge
chrome-ai-action
# Or with custom port
chrome-ai-action --port 9877The bridge will be available at http://127.0.0.1:9876.
API Protocol
Send JSON messages to http://127.0.0.1:9876:
{"type": "action", "action": "navigate", "params": {"url": "https://example.com"}}Message Types
| Type | Description |
|------|-------------|
| ping | Health check |
| action | Execute a browser action |
| batch | Execute multiple actions sequentially |
Available Actions
Navigation
navigate, goBack, goForward, reload, getUrl, getTitle
Page Content
getText, getHtml, getLinks, getHeadings, getMetaTags, getFormFields, getFocusableElements
Element Interaction
click, type, pressKey, scroll, scrollIntoView, findElement
Data Extraction
getValue, getAttribute, getBoundingBox, getCookies, getPerformanceMetrics
JavaScript
evaluate
Screenshot
screenshot
Tab Management
listTabs, newTab, closeTab, switchTab, getCurrentTab
Waiting
waitForElement, waitForTimeout
Project Structure
chrome-ai-action/
├── index.js # CLI entry point
├── native-host/
│ └── native_host_bridge.js # HTTP + CDP WebSocket bridge (core)
├── test/
│ └── e2e.js # Integration tests
├── package.json
└── README.mdRequirements
- Chrome 116+ (with
--remote-debugging-port=9222) - Node.js 18+
License
MIT
