@zabarich/agentbrowser
v0.1.0
A TypeScript-native autonomous browser agent for Node.js, inspired by browser-use. Give an LLM a task and a headless browser — it observes, decides, and acts autonomously.
agentbrowser
A TypeScript-native autonomous browser agent for Node.js, inspired by browser-use.
Give an LLM a task and a headless browser. The agent observes the page, decides what to do, executes actions, and repeats — navigating, clicking, filling forms, and extracting data autonomously.
Early but functional. Built on Playwright + Anthropic/OpenAI-compatible LLMs.
Status: v0.1.0 — core agent loop works, tested against real sites and local models. Not yet battle-tested on hostile pages, auth flows, or complex SPAs. Contributions welcome.
Install
npm install agentbrowser
npx playwright install chromium
Quick Start
import { Agent, BrowserSession } from "agentbrowser";
const session = new BrowserSession({ headless: true });
const agent = new Agent({
task: "Find the latest version of the anthropic SDK on PyPI",
browser: session,
llm: {
provider: "anthropic",
model: "claude-sonnet-4-20250514",
apiKey: process.env.ANTHROPIC_API_KEY,
},
maxSteps: 10,
maxFailures: 3,
});
const result = await agent.run();
console.log(result.success); // true
console.log(result.finalResult); // "The latest version is 0.39.0..."
console.log(result.visitedUrls); // ["https://pypi.org/project/anthropic/"]
console.log(result.history); // Step-by-step actions taken
await session.close();
API
BrowserSession
Manages a Playwright browser instance.
const session = new BrowserSession({
headless: true, // Run browser without GUI (default: true)
});
await session.start(); // Launches browser (called automatically by Agent)
await session.getScreenshot(); // Returns base64-encoded PNG of current page
await session.getPageInfo(); // Returns viewport/scroll dimensions
await session.close(); // Cleanup
Agent
The autonomous browser agent. Supports multiple LLM providers.
// Anthropic Claude
const agent = new Agent({
task: "Your task description",
browser: session,
llm: {
provider: "anthropic",
model: "claude-sonnet-4-20250514",
apiKey: process.env.ANTHROPIC_API_KEY,
},
maxSteps: 10,
maxFailures: 3,
});
// OpenAI
const agent = new Agent({
task: "Your task description",
browser: session,
llm: {
provider: "openai",
model: "gpt-4o",
apiKey: process.env.OPENAI_API_KEY,
},
});
// Local model (llama.cpp, Ollama, vLLM, LM Studio — any OpenAI-compatible server)
const agent = new Agent({
task: "Your task description",
browser: session,
llm: {
provider: "openai",
model: "qwen",
apiKey: "not-needed",
baseUrl: "http://localhost:8080/v1",
},
});
Options:
- maxSteps — Maximum steps before stopping (default: 10)
- maxFailures — Consecutive failures before stopping (default: 3)
- maxActionsPerStep — Max actions per LLM response (default: 5)
- useVision — Send screenshots to LLM (default: true, auto-disabled for local models with baseUrl)
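The defaults listed above could be resolved along these lines. This is only a sketch: `LlmConfig`, `LoopOptions`, and `resolveOptions` are hypothetical names, not part of the package's API, and the README does not say whether an explicit `useVision` value overrides the `baseUrl` check, so that precedence is an assumption.

```typescript
// Hypothetical shapes for illustration; not the package's actual types.
interface LlmConfig {
  provider: "anthropic" | "openai";
  model: string;
  apiKey?: string;
  baseUrl?: string;
}

interface LoopOptions {
  maxSteps?: number;
  maxFailures?: number;
  maxActionsPerStep?: number;
  useVision?: boolean;
}

// Fill in the documented defaults for any option the caller left out.
function resolveOptions(
  llm: LlmConfig,
  opts: LoopOptions = {},
): Required<LoopOptions> {
  return {
    maxSteps: opts.maxSteps ?? 10,
    maxFailures: opts.maxFailures ?? 3,
    maxActionsPerStep: opts.maxActionsPerStep ?? 5,
    // Assumption: an explicit value wins; otherwise vision is on
    // unless a custom baseUrl marks the model as local.
    useVision: opts.useVision ?? (llm.baseUrl === undefined),
  };
}
```

Under these assumptions, a config with `baseUrl: "http://localhost:8080/v1"` and no explicit `useVision` resolves to `useVision: false`, while a hosted Anthropic or OpenAI config resolves to `true`.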
AgentResult
interface AgentResult {
success: boolean;
finalResult: string | null;
history: AgentStep[];
visitedUrls: string[];
screenshots?: Buffer[]; // One per step (when useVision is true)
error?: string;
stepsUsed: number; // Observe-think-act cycles completed
consecutiveFailures: number; // Failure streak at time of stop
totalFailures: number; // Lifetime failure count (never resets)
duration: number; // ms
}
interface AgentStep {
index: number;
url: string;
action: AgentAction;
result: string;
thinking?: string; // LLM's reasoning for this step
screenshot?: Buffer;
timestamp: number; // Unix ms
}
How It Works
- Observe — Extracts the current page DOM into a compact indexed format that the LLM can reason about
- Think — Sends the page state + task + history to the configured LLM, which decides what action(s) to take
- Act — Executes the action(s) via Playwright (click, type, navigate, scroll, etc.)
- Repeat — Until the task is complete, max steps reached, or too many failures
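The cycle above can be sketched as a small loop. This is an illustration under stated assumptions, not the package's internals: `Action`, `LoopDeps`, `LoopOutcome`, and `runLoop` are hypothetical names, the observe/think/act steps are injected as stubs, and it assumes that a failed step still counts toward `stepsUsed`.

```typescript
// Hypothetical types for illustration; not the package's internals.
type Action = {
  type: "click" | "type" | "navigate" | "scroll" | "done";
  detail?: string;
};

interface LoopDeps {
  observe: () => Promise<string>; // compact page state
  think: (state: string, task: string, history: string[]) => Promise<Action[]>;
  act: (action: Action) => Promise<string>; // rejects on failure
}

interface LoopOutcome {
  success: boolean;
  stepsUsed: number;
  consecutiveFailures: number;
  totalFailures: number;
  history: string[];
}

async function runLoop(
  task: string,
  deps: LoopDeps,
  maxSteps = 10,
  maxFailures = 3,
): Promise<LoopOutcome> {
  const history: string[] = [];
  let stepsUsed = 0;
  let consecutiveFailures = 0;
  let totalFailures = 0;

  // Stop when the step budget is spent or the failure streak is too long.
  while (stepsUsed < maxSteps && consecutiveFailures < maxFailures) {
    stepsUsed++; // Assumption: failed steps also count as used steps
    try {
      const state = await deps.observe();
      const actions = await deps.think(state, task, history);
      for (const action of actions) {
        if (action.type === "done") {
          return { success: true, stepsUsed, consecutiveFailures: 0, totalFailures, history };
        }
        history.push(await deps.act(action));
      }
      consecutiveFailures = 0; // a clean step resets the streak
    } catch (err) {
      consecutiveFailures++;
      totalFailures++; // lifetime count, never reset
      history.push(`error: ${err instanceof Error ? err.message : String(err)}`);
    }
  }
  return { success: false, stepsUsed, consecutiveFailures, totalFailures, history };
}
```

Keeping the two failure counters separate mirrors the `consecutiveFailures` and `totalFailures` fields of `AgentResult`: the streak decides when to stop, while the lifetime count records how bumpy the run was.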
Examples
See the examples/ directory:
- basic-navigation.ts — Navigate to a URL and extract data
- search-and-extract.ts — Search engine query and extract results
- form-filling.ts — Fill and submit a web form
- local-model.ts — Use a local OpenAI-compatible server (llama.cpp, Ollama, etc.)
Requirements
- Node.js >= 18
- "type": "module" in your package.json (this is an ESM package)
- An LLM API key (Anthropic, OpenAI, or a local server)
Developing from source
When running examples or tests from this repo, imports use ../src/index.js.
When consuming the published npm package, use agentbrowser:
import { Agent, BrowserSession } from "agentbrowser";