@felixisaac/tandem
v1.5.0
Browser automation for AI agents via Chrome extension + Native Messaging. Works with OpenCode, Claude Code, and any MCP-compatible agent.
Tandem
A hardened, multi-agent fork of opencode-browser by Benjamin Shafii.
Browser automation for AI agents via Chrome extension + Native Messaging. Works with Claude Code, OpenCode, Cursor, Windsurf, VS Code Copilot, Gemini CLI, Codex, and any MCP-compatible agent.
Runs inside your existing Chrome session — shares your logins, cookies, and bookmarks. No separate profiles, no re-authentication.
Agent tabs open in a dedicated Tandem window (cyan tab group) so your browsing is never interrupted.
Why not Playwright / DevTools?
Chrome 136+ blocks --remote-debugging-port on your default profile. DevTools-based tools trigger a security prompt every time. This project uses Chrome's Native Messaging API — the same approach Anthropic uses for Claude in Chrome — so automation works silently against your live session.
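For context, Chrome's Native Messaging protocol exchanges JSON messages over the host's stdin/stdout, each prefixed with a 32-bit length in native byte order. The sketch below shows that framing in Node.js. The function names are illustrative rather than taken from Tandem's source, and a little-endian machine is assumed.

```javascript
// Illustrative sketch of Native Messaging framing; not Tandem's actual code.
// Each message is UTF-8 JSON preceded by a 32-bit length in native byte
// order (little-endian is assumed here, which covers x86 and ARM desktops).

function encodeMessage(message) {
  const body = Buffer.from(JSON.stringify(message), "utf8");
  const header = Buffer.alloc(4);
  header.writeUInt32LE(body.length, 0);
  return Buffer.concat([header, body]);
}

// Decodes every complete message in `buffer` and returns the leftover bytes
// so the caller can prepend them to the next chunk read from stdin.
function decodeMessages(buffer) {
  const messages = [];
  let offset = 0;
  while (buffer.length - offset >= 4) {
    const length = buffer.readUInt32LE(offset);
    if (buffer.length - offset - 4 < length) break; // incomplete message
    const json = buffer.subarray(offset + 4, offset + 4 + length).toString("utf8");
    messages.push(JSON.parse(json));
    offset += 4 + length;
  }
  return { messages, rest: buffer.subarray(offset) };
}
```

Returning the unread remainder lets a host handle messages that arrive split across several stdin chunks.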
Installation
```bash
npx @felixisaac/tandem install
```
The installer:
- Copies the extension to `~/.tandem/extension/`
- Opens Chrome so you can load the unpacked extension
- Registers the native messaging host (registry on Windows, `NativeMessagingHosts` on macOS/Linux)
- Optionally updates your agent config file
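To illustrate the host registration step, here is a rough sketch of a native messaging host manifest and the per-user directory Chrome reads it from. The manifest fields and directories follow Chrome's Native Messaging documentation. The host name, extension ID, and host path are placeholders, not the values the Tandem installer writes.

```javascript
// Sketch of a native messaging host manifest and where Chrome looks for it.
// HOST_NAME, the extension ID, and the host path are placeholders, not
// values taken from Tandem.
const os = require("node:os");
const path = require("node:path");

const HOST_NAME = "com.example.tandem"; // hypothetical host name

function buildManifest(hostPath, extensionId) {
  return {
    name: HOST_NAME,
    description: "Example native messaging host",
    path: hostPath, // absolute path to the host executable
    type: "stdio",
    allowed_origins: [`chrome-extension://${extensionId}/`],
  };
}

// Per-user manifest directory for Google Chrome on macOS and Linux.
// On Windows the manifest path is stored in the registry instead, under
// HKCU\Software\Google\Chrome\NativeMessagingHosts\<host name>.
function manifestDir(platform = process.platform, home = os.homedir()) {
  if (platform === "darwin") {
    return path.join(home, "Library", "Application Support", "Google", "Chrome", "NativeMessagingHosts");
  }
  if (platform === "linux") {
    return path.join(home, ".config", "google-chrome", "NativeMessagingHosts");
  }
  return null; // registry-based on Windows
}
```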
Agent Setup
After installing, the server lives at ~/.tandem/server.js. Configure your agent:
```bash
claude mcp add -s user browser -- node ~/.tandem/server.js
```
Adds the server globally, making it available in every Claude Code session.
Add to opencode.json (project) or ~/.config/opencode/opencode.json (global):
```json
{
  "mcp": {
    "browser": {
      "type": "local",
      "command": ["node", "~/.tandem/server.js"],
      "enabled": true
    }
  }
}
```
Add to ~/.cursor/mcp.json (global) or .cursor/mcp.json (project):
```json
{
  "mcpServers": {
    "browser": {
      "command": "node",
      "args": ["~/.tandem/server.js"]
    }
  }
}
```
Restart Cursor after saving.
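The JSON configs in this section pass `~/.tandem/server.js` as a literal argument. Whether a given client expands `~` is not stated here, and clients that spawn the command without a shell typically do not. If the server fails to start, one option is to write the absolute path instead. The sketch below prints an `mcpServers` entry with the home directory resolved; `expandHome` and `mcpServersConfig` are illustrative helper names.

```javascript
// Sketch: resolve "~" to an absolute path before writing an MCP config.
// expandHome and mcpServersConfig are hypothetical helpers, not part of Tandem.
const os = require("node:os");
const path = require("node:path");

function expandHome(p, home = os.homedir()) {
  if (p === "~") return home;
  if (p.startsWith("~/")) return path.join(home, p.slice(2));
  return p; // already absolute or relative; leave untouched
}

function mcpServersConfig(serverPath = "~/.tandem/server.js", home = os.homedir()) {
  return {
    mcpServers: {
      browser: { command: "node", args: [expandHome(serverPath, home)] },
    },
  };
}

// Print a config fragment to paste into the client's JSON file.
console.log(JSON.stringify(mcpServersConfig(), null, 2));
```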
Add to ~/.codeium/windsurf/mcp_config.json:
```json
{
  "mcpServers": {
    "browser": {
      "command": "node",
      "args": ["~/.tandem/server.js"]
    }
  }
}
```
Add to .vscode/mcp.json (workspace) or open MCP: Open User Configuration from the command palette:
```json
{
  "servers": {
    "browser": {
      "type": "stdio",
      "command": "node",
      "args": ["~/.tandem/server.js"]
    }
  }
}
```
Reload the window after saving (Ctrl+Shift+P → Developer: Reload Window).
Add to ~/.gemini/settings.json:
```json
{
  "mcpServers": {
    "browser": {
      "command": "node",
      "args": ["~/.tandem/server.js"]
    }
  }
}
```
Add to ~/.codex/config.toml:
```toml
[mcp_servers.browser]
command = "node"
args = ["~/.tandem/server.js"]
```
Available Tools
Page Interaction
| Tool | Description |
|------|-------------|
| browser_snapshot | Start here. Accessibility tree with CSS selectors — low token cost (200–1500 tokens) |
| browser_snapshot_cached | Cached snapshot — re-uses last result if URL unchanged and < 30 s old. Use in multi-step flows to skip re-layout |
| browser_invalidate_cache | Clear snapshot cache for a tab (or all tabs) — call after mutating actions |
| browser_page_text | Plain text (innerText) — cheapest way to read page content |
| browser_screenshot | Visual capture — use only when layout matters (500–3000 tokens) |
| browser_navigate | Navigate to a URL |
| browser_click | Click an element by CSS selector |
| browser_double_click | Double-click an element (text selection, file open) |
| browser_right_click | Right-click to trigger context menu |
| browser_hover | Hover mouse over element — triggers mouseover/mousemove for hover menus |
| browser_drag_drop | Drag element to target element or coordinates |
| browser_select_option | Select a <select> option by value or label text |
| browser_type | Type text into an input |
| browser_keyboard | Send key events (Enter, Escape, Tab, ctrl+a, …) |
| browser_dialog_handle | Accept or dismiss alert/confirm/prompt dialogs |
| browser_wait_for_selector | Wait for element to appear — essential for SPAs |
| browser_scroll | Scroll page or element into view |
| browser_wait | Wait for a fixed duration (capped at 30s) |
| browser_execute | Run JavaScript via chrome.debugger — works on all pages including CSP-strict sites |
| browser_storage_inspect | Read localStorage or sessionStorage from a tab |
| browser_print_to_pdf | Print page to PDF via Chrome's print engine (base64) |
| browser_performance | CDP Performance metrics: heap size, DOM nodes, layout count |
| browser_device_emulate | Emulate mobile viewport via CDP; reset=true restores desktop |
| browser_get_element_info | Get precise bounds, center, visibility, and z-index for any element |
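The caching rule described for browser_snapshot_cached (reuse the last result if the URL is unchanged and the entry is under 30 seconds old) can be sketched as follows. This is a minimal illustration of the rule as stated in the table, not Tandem's implementation; the class and method names are hypothetical.

```javascript
// Hypothetical sketch of the caching rule described for browser_snapshot_cached:
// reuse the last snapshot when the URL is unchanged and the entry is < 30 s old.
const MAX_AGE_MS = 30_000;

class SnapshotCache {
  constructor() {
    this.entries = new Map(); // tabId -> { url, takenAt, snapshot }
  }

  get(tabId, url, now = Date.now()) {
    const entry = this.entries.get(tabId);
    if (!entry) return null;
    const fresh = entry.url === url && now - entry.takenAt < MAX_AGE_MS;
    return fresh ? entry.snapshot : null;
  }

  set(tabId, url, snapshot, now = Date.now()) {
    this.entries.set(tabId, { url, takenAt: now, snapshot });
  }

  // Clears one tab's entry, or every entry when tabId is omitted.
  invalidate(tabId) {
    if (tabId === undefined) this.entries.clear();
    else this.entries.delete(tabId);
  }
}
```

Calling `invalidate` after a click or form submission mirrors the advice to call browser_invalidate_cache after mutating actions.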
Tab Management
| Tool | Description |
|------|-------------|
| browser_status | Show connection status + current tab claims |
| browser_list_claims | List per-session tab ownership claims |
| browser_claim_tab | Claim a specific tab for this session |
| browser_release_tab | Release a claimed tab |
| browser_open_tab | Open and claim a fresh agent tab for this session |
| browser_get_tabs | List all open tabs |
| browser_new_tab | Open a new tab in the agent window |
| browser_close_tab | Close a tab |
| browser_switch_tab | Focus a tab (use to hand off to user) |
| browser_new_window | Open a new browser window (supports incognito) |
| browser_open_batch | Open up to 20 URLs as tabs in one call |
| browser_deduplicate_tabs | Find and close duplicate-URL tabs (supports dry-run) |
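As a rough illustration of what a dry-run deduplication pass might look like, the sketch below keeps the first tab for each URL and reports the rest. The tab shape (`id`, `url`) mirrors Chrome's `tabs.Tab` objects; the function name, options, and `closeTab` callback are assumptions and may differ from what browser_deduplicate_tabs does internally.

```javascript
// Sketch of duplicate-URL detection with a dry-run option. The tab shape
// ({ id, url }) mirrors chrome.tabs.Tab; closeTab is a hypothetical callback.
function deduplicateTabs(tabs, { dryRun = true, closeTab = () => {} } = {}) {
  const seen = new Set();
  const duplicates = [];
  for (const tab of tabs) {
    if (seen.has(tab.url)) duplicates.push(tab.id); // later copy of a seen URL
    else seen.add(tab.url);
  }
  // In dry-run mode nothing is closed; the caller only gets the candidate list.
  if (!dryRun) duplicates.forEach((id) => closeTab(id));
  return duplicates;
}
```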
Tab Groups
| Tool | Description |
|------|-------------|
| browser_get_tab_groups | List all tab groups with colors, titles, and member tabs |
| browser_create_tab_group | Create a new group from tab IDs with optional title and color |
| browser_update_tab_group | Rename, recolor, or collapse/expand a group |
| browser_move_to_group | Move tabs into an existing group |
Sessions
| Tool | Description |
|------|-------------|
| browser_session_save | Save all open tabs as a named session to Chrome storage |
| browser_session_restore | Restore a saved session (opens all saved URLs) |
History & Bookmarks
| Tool | Description |
|------|-------------|
| browser_search_history | Search history by keyword, URL, and date range |
| browser_recent_browsing | Recent visits from the last N hours |
| browser_history_stats | Total entries, date range, top visited domains |
| browser_get_bookmarks | Full bookmarks tree |
Browser Utilities
| Tool | Description |
|------|-------------|
| browser_downloads | List recent Chrome downloads |
| browser_notify | Send a Chrome desktop notification (supports buttons) |
| browser_storage_read | Read Chrome extension storage (local or sync) |
Cookie Management
| Tool | Description |
|------|-------------|
| browser_get_cookies | Get cookies for a tab's URL via CDP |
| browser_get_all_cookies | Get all browser cookies (optional domain filter) |
| browser_set_cookie | Set a cookie with full control over flags and expiry |
| browser_delete_cookies | Delete cookies by name, domain, or URL |
Network & Emulation
| Tool | Description |
|------|-------------|
| browser_network_conditions | Throttle bandwidth / add latency / go offline — presets: offline, slow-2g, 2g, 3g, fast-3g, 4g |
| browser_geolocation | Override GPS coordinates; reset=true clears |
| browser_user_agent | Override UA string (presets: mobile-android, mobile-ios), timezone, and locale |
| browser_inject_script | Inject JS at document start on every page load; returns scriptId |
| browser_block_urls | Block URL patterns from loading (ads, analytics, images); reset=true clears |
MCP Resources
| URI | Description |
|-----|-------------|
| tandem://agents-guide | Full AGENTS.md behavioral guide — fetch via resources/read for runtime agent context |
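Fetching the guide uses the standard MCP `resources/read` method. The sketch below builds that JSON-RPC 2.0 request. The request shape follows the MCP specification; the helper name and the `id` value are arbitrary.

```javascript
// Builds an MCP resources/read request (JSON-RPC 2.0). The request shape
// follows the MCP specification; the helper name and id are illustrative.
function resourcesReadRequest(uri, id = 1) {
  return { jsonrpc: "2.0", id, method: "resources/read", params: { uri } };
}

// Serialize the request for the agents guide resource.
console.log(JSON.stringify(resourcesReadRequest("tandem://agents-guide")));
```

Most MCP clients expose this through their own API, so building the request by hand is only needed when speaking to the server directly.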
Per-tab Ownership
Tandem enforces per-session tab claims so multiple agents can share one Chrome session without stepping on each other.
- If you omit `tabId`, Tandem auto-creates (once) and reuses a default agent tab per MCP client session.
- If you pass an explicit `tabId`, Tandem will reject the call if that tab is claimed by another session.
- Use `browser_open_tab` to open a tab that is automatically claimed for your session.
- Claims expire after inactivity (`TANDEM_CLAIM_TTL_MS`, default 5 minutes). Debug with `browser_status` / `browser_list_claims`.
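The ownership rules above can be sketched as a small claim registry with an inactivity TTL. This is a hypothetical model of the described behavior, not Tandem's code. Only `TANDEM_CLAIM_TTL_MS` and its 5-minute default come from the text; the class and method names are assumptions.

```javascript
// Hypothetical sketch of per-session tab claims with an inactivity TTL.
// Only TANDEM_CLAIM_TTL_MS and its 5-minute default come from the README.
const DEFAULT_TTL_MS = 5 * 60 * 1000;
const ttlMs = Number(process.env.TANDEM_CLAIM_TTL_MS) || DEFAULT_TTL_MS;

class TabClaims {
  constructor(ttl = ttlMs) {
    this.ttl = ttl;
    this.claims = new Map(); // tabId -> { sessionId, lastActive }
  }

  // Returns the live owner of a tab, dropping the claim if it has expired.
  ownerOf(tabId, now = Date.now()) {
    const claim = this.claims.get(tabId);
    if (!claim) return null;
    if (now - claim.lastActive >= this.ttl) {
      this.claims.delete(tabId);
      return null;
    }
    return claim.sessionId;
  }

  // Claims the tab (or refreshes an existing claim); throws if another
  // session currently owns it.
  claim(tabId, sessionId, now = Date.now()) {
    const owner = this.ownerOf(tabId, now);
    if (owner !== null && owner !== sessionId) {
      throw new Error(`tab ${tabId} is claimed by another session`);
    }
    this.claims.set(tabId, { sessionId, lastActive: now });
  }

  // Releases the tab only if the caller is the current owner.
  release(tabId, sessionId) {
    const claim = this.claims.get(tabId);
    if (claim && claim.sessionId === sessionId) this.claims.delete(tabId);
  }
}
```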
Agent Instructions
See AGENTS.md for full behavioral guidance:
- Snapshot-first workflow (token efficiency)
- Waiting for dynamic elements before clicking
- Hand-off pattern for login walls / CAPTCHAs
- Error recovery table
- Security rules and prompt injection protection
Claude Code users: see CLAUDE.md.
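A snapshot-first flow that waits for a dynamic element before clicking might look like the following sequence of MCP `tools/call` requests. The tool names come from the tables above and the request envelope follows the MCP specification. The argument names (`url`, `selector`) and the example URL are assumptions and may not match Tandem's actual input schemas.

```javascript
// Sketch of a snapshot-first flow as MCP tools/call requests (JSON-RPC 2.0).
// Tool names come from the README tables; the argument names (url, selector)
// are assumptions and may not match Tandem's actual input schemas.
let nextId = 1;

function toolCall(name, args = {}) {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

const flow = [
  toolCall("browser_navigate", { url: "https://example.com" }),
  toolCall("browser_snapshot"), // cheap accessibility tree with selectors
  toolCall("browser_wait_for_selector", { selector: "#submit" }),
  toolCall("browser_click", { selector: "#submit" }),
];
```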
Architecture
- `src/server.js`: MCP server; exposes tools, enforces rate limits, routes to host
- `src/host.js`: Native messaging host; bridges socket ↔ Chrome, handles auth
- `extension/background.js`: Service worker; executes Chrome APIs, runs JS via `chrome.debugger`
Multiple agents can share one browser session simultaneously.
Security model
The agent acts as you inside your real Chrome session. The trust boundary is the URL space, not the JS the agent runs.
- URL blocklist: banking, email, OAuth, password managers, crypto blocked by default. Extend via `~/.tandem/blocklist.txt` (one regex per line, `#` for comments). The host pushes updates to the extension on reconnect.
- Socket auth: 256-bit token rotated on every host start. Server reads token fresh on each connect.
- Prompt injection: agents are instructed never to act on instructions in page content (see AGENTS.md). This remains an unsolved problem industry-wide; the URL blocklist is your safety net.
- `browser_execute`: runs arbitrary JS as you via `chrome.debugger`. No content filter (regex blocklists are trivially bypassable); 50KB result cap as DoS guard. Don't run untrusted agents.
- Logging: `~/.tandem/logs/host.log` redacts `code`, `text`, and URL query strings; rotates at 5MB; dir is `0700`.
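The blocklist file format (one regex per line, `#` for comments) could be parsed along these lines. This is a sketch under stated assumptions: case-insensitive matching, skipping blank lines, and treating only lines that start with `#` as comments are choices made here, and the function names are hypothetical. An invalid pattern makes `new RegExp` throw, which a real loader would need to handle.

```javascript
// Sketch of loading a blocklist in the documented format: one regex per
// line, with "#" starting a comment line. Blank lines are skipped. The
// case-insensitive flag is an assumption, not documented Tandem behavior.
function parseBlocklist(text) {
  return text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line !== "" && !line.startsWith("#"))
    .map((line) => new RegExp(line, "i"));
}

// A URL is blocked when any pattern matches it.
function isBlocked(url, patterns) {
  return patterns.some((re) => re.test(url));
}
```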
Known limitation (Windows): the named pipe has the default ACL (Everyone). Any process running as the same Windows user can connect. Token rotation per host start limits exposure. Run only agents you trust.
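For illustration, generating a fresh 256-bit token at startup and checking a presented token in constant time can be done with Node's `crypto` module as sketched below. How Tandem stores and exchanges its token is not described here, so the function names and the hex encoding are assumptions.

```javascript
// Sketch of a per-start 256-bit token and a constant-time check.
// This illustrates the idea only; it is not Tandem's implementation.
const crypto = require("node:crypto");

// 32 random bytes = 256 bits, hex-encoded to 64 characters.
function generateToken() {
  return crypto.randomBytes(32).toString("hex");
}

function tokensMatch(expected, presented) {
  const a = Buffer.from(expected, "utf8");
  const b = Buffer.from(presented, "utf8");
  // timingSafeEqual throws on unequal lengths, so check length first.
  return a.length === b.length && crypto.timingSafeEqual(a, b);
}
```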
Platform Support
| Platform | Status |
|----------|--------|
| Windows | ✓ (named pipe, registry) |
| macOS | ✓ |
| Linux | ✓ |
Uninstall
```bash
npx @felixisaac/tandem uninstall
```
Then remove the extension from Chrome and optionally delete ~/.tandem/.
Logs
```
~/.tandem/logs/host.log
```
License
MIT — forked from benjaminshafii/opencode-browser
