@agent-browser-io/browser

v0.3.0

Published

3 months ago

Token efficient agent browser

agent-browser-io

This package lets AI agents control a real browser (navigate, click, type, and interact via ASCII wireframes) in a token-efficient way. Use it from MCP clients (e.g. Cursor, Claude Desktop) or from code with the Vercel AI SDK.

Ways to use:

MCP — Add the included MCP server to Cursor or another MCP client so the AI can drive a browser (see How to add MCP).
Vercel AI SDK — Use createBrowserTools(browser) with generateText({ tools, ... }) in your app (see Vercel AI SDK).
CLI — Run the interactive CLI for manual testing (npx @agent-browser-io/browser or agent-browser-cli after install).

Install

npm install @agent-browser-io/browser

How to add MCP

MCP (Model Context Protocol) lets AI assistants in Cursor or Claude Desktop use browser tools over stdio. Your AI will be able to launch a browser, open URLs, get wireframes, click, type, scroll, screenshot, and more.

Run the MCP server (for testing):

npx @agent-browser-io/browser mcp

Add to Cursor

Open Cursor settings → MCP (or edit your MCP config file, e.g. ~/.cursor/mcp.json or project .cursor/mcp.json).
Add a server entry:

{
  "mcpServers": {
    "agent-browser": {
      "command": "npx",
      "args": ["-y", "@agent-browser-io/browser", "mcp"]
    }
  }
}

Restart Cursor or reload MCP so it picks up the new server. The agent-browser tools will appear for the AI to use.
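
If your MCP config file already lists other servers, the new entry has to be merged in rather than pasted over them. A minimal sketch of that merge follows; the `McpConfig` type, the `addAgentBrowser` helper, and the `other` server are illustrative assumptions, not part of this package.

```typescript
// Shape of an MCP client config file such as ~/.cursor/mcp.json.
interface McpServerEntry {
  command: string;
  args: string[];
}

interface McpConfig {
  mcpServers?: Record<string, McpServerEntry>;
}

// Hypothetical helper: returns a new config with the agent-browser entry
// added, leaving any existing servers (and the input object) untouched.
function addAgentBrowser(config: McpConfig): McpConfig {
  return {
    ...config,
    mcpServers: {
      ...(config.mcpServers ?? {}),
      "agent-browser": {
        command: "npx",
        args: ["-y", "@agent-browser-io/browser", "mcp"],
      },
    },
  };
}

// Example: a config that already has one (made-up) server.
const existing: McpConfig = {
  mcpServers: { other: { command: "node", args: ["server.js"] } },
};
const merged = addAgentBrowser(existing);

// Text you could write back to the config file.
const mergedJson = JSON.stringify(merged, null, 2);
```

Reading and writing the file itself is left out so the sketch stays independent of where your client keeps its config.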

Other MCP clients (e.g. Claude Desktop)

Use the same stdio command in your client's config:

Command: npx (or full path to node)
Args: ["-y", "@agent-browser-io/browser", "mcp"] (or ["path/to/bin/index.cjs", "mcp"])

The server speaks JSON-RPC over stdin/stdout; no extra env vars are required.
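
To make the wire format concrete, here is a rough sketch of how a client might frame JSON-RPC 2.0 requests for an MCP stdio server, one JSON message per line. `tools/list` and `tools/call` are standard MCP methods; the `frameRequest` helper and the `url` argument of `navigate` are assumptions for illustration. A real client also has to perform the MCP `initialize` handshake first, which MCP clients such as Cursor handle for you.

```typescript
// Minimal JSON-RPC 2.0 request shape, as used by MCP.
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: Record<string, unknown>;
}

let nextId = 1;

// Hypothetical helper: serialize one request as a single line.
// JSON.stringify escapes newlines inside strings, so the only raw
// newline in the output is the trailing message delimiter.
function frameRequest(
  method: string,
  params?: Record<string, unknown>,
): string {
  const request: JsonRpcRequest = { jsonrpc: "2.0", id: nextId++, method };
  if (params !== undefined) {
    request.params = params;
  }
  return JSON.stringify(request) + "\n";
}

// Ask the server which tools it exposes.
const listTools = frameRequest("tools/list");

// ASSUMPTION: the navigate tool accepts a `url` argument; check the
// input schema returned by tools/list before relying on this.
const callNavigate = frameRequest("tools/call", {
  name: "navigate",
  arguments: { url: "https://example.com" },
});
```

Each framed string would be written to the server process's stdin, and responses read line by line from its stdout.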

Vercel AI SDK

You can use the same browser automation as tools with the Vercel AI SDK and generateText. The package exposes createBrowserTools(browser), which returns an object of tools you can pass to generateText({ tools, ... }). The ai package is included as a dependency.

Tools: launch, navigate, getWireframe, click, type, fill, dblclick, hover, press, select, check, uncheck, scroll, screenshot, close. This is the same toolset the MCP server exposes, so behavior is consistent across both.

Important: Have the model call the launch tool before any other action (navigate, getWireframe, click, etc.).

Example:

import { createBrowserTools, AgentBrowser, DefaultBrowserBackend } from '@agent-browser-io/browser';
import { generateText } from 'ai';
import { openai } from '@ai-sdk/openai';

const browser = new AgentBrowser(new DefaultBrowserBackend());
const tools = createBrowserTools(browser);

const { text } = await generateText({
  model: openai('gpt-4o'),
  tools,
  prompt: 'Go to Hacker News, visit the top 3 stories, and summarize their content.',
});
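// The model should call launch, then navigate, then getWireframe, etc.
// Chaining several tool calls in one generateText run may require enabling
// multi-step execution; the option for this depends on your AI SDK version.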
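
One way to check the launch-first rule is to inspect the sequence of tool names the model called. The sketch below is a hypothetical helper, not part of this package; it also assumes that after close the browser must be launched again before further actions, which the README does not state.

```typescript
// Hypothetical helper: returns the index of the first tool call made
// while no browser is launched, or -1 if the order is valid.
// ASSUMPTION: a "close" call ends the session, so a new "launch" is
// needed before any later action.
function firstOrderViolation(calls: string[]): number {
  let launched = false;
  for (let i = 0; i < calls.length; i++) {
    const name = calls[i];
    if (name === "launch") {
      launched = true;
    } else if (!launched) {
      return i; // action attempted without a running browser
    } else if (name === "close") {
      launched = false;
    }
  }
  return -1;
}

const validRun = firstOrderViolation(["launch", "navigate", "getWireframe"]);
const missingLaunch = firstOrderViolation(["navigate", "launch"]);
const afterClose = firstOrderViolation(["launch", "close", "click"]);
```

Here `validRun` is -1, `missingLaunch` is 0, and `afterClose` is 2. A check like this could run over recorded tool names in a test, or feed a reminder back into the prompt when the model skips launch.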

Development

Requires Node 18+. Browser automation uses Playwright (included as a dev dependency).

npm install
npm run build

Builds to dist/cjs (CommonJS) and dist/esm (ESM).
