agentos-browser
v0.1.0
Token-efficient web browsing for AI agents. Your agent needs 500 tokens of content, not 50,000 tokens of HTML.
npm install agentos-browser
The Problem
When an AI agent reads a web page, the naive approach dumps raw HTML into context. This is catastrophically wasteful:
| Approach | Typical Token Count | Cost at $3/1M tokens |
|---|---|---|
| Raw HTML (Reddit front page) | ~50,000 tokens | $0.15 per page |
| After Readability extraction | ~5,000 tokens | $0.015 per page |
| agentos-browser (with budget) | ~500 tokens | $0.0015 per page |
Raw HTML is mostly noise, up to 99% on heavy pages: nav bars, footers, scripts, ads, inline styles, ARIA attributes, tracking pixels. None of it helps your agent answer the question.
Real-World Benchmarks
| Site | Raw HTML | After extraction | Reduction |
|---|---|---|---|
| Reddit front page | ~50,000 tokens | 514 tokens | 99% |
| Hacker News top stories | ~45,000 tokens | 1,847 tokens structured JSON | 96% |
| MLB stats page | ~2,106 tokens | 965 tokens | 54% |
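The per-page costs and reduction percentages in the tables above follow from simple arithmetic. The sketch below reproduces them; the function names are illustrative and not part of the agentos-browser API.

```typescript
// Price per one million tokens in USD (the $3 figure used in the cost table).
const USD_PER_MILLION_TOKENS = 3;

// Cost of placing `tokens` tokens into context at the given price.
function costPerPage(tokens: number, usdPerMillion: number = USD_PER_MILLION_TOKENS): number {
  return (tokens / 1_000_000) * usdPerMillion;
}

// Percentage of tokens removed by extraction.
function reductionPercent(rawTokens: number, extractedTokens: number): number {
  return (1 - extractedTokens / rawTokens) * 100;
}

console.log(costPerPage(50_000).toFixed(4)); // "0.1500" (raw HTML)
console.log(costPerPage(500).toFixed(4)); // "0.0015" (with budget)
console.log(Math.round(reductionPercent(50_000, 514))); // 99 (Reddit front page)
console.log(Math.round(reductionPercent(2_106, 965))); // 54 (MLB stats page)
```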
Solution
agentos-browser is a context engine for the web. It:
- Fetches the page (auto-detects JS-rendered SPAs, uses Playwright when needed)
- Extracts the real content (Mozilla Readability strips all the noise)
- Converts to clean markdown (human-readable, token-efficient)
- Budgets the output to your token limit (query-aware prioritization)
- Caches results (avoid refetching within 1 hour)
- Adapts to structured sites via the Browser API (get clean JSON from any site)
The result: your agent gets clean, relevant content that fits in its context window.
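To make the budgeting step concrete, here is a minimal sketch of query-aware trimming by keyword overlap. It is not the library's budgeter: the Section shape, the four-characters-per-token estimate, and every function name are assumptions made for illustration.

```typescript
interface Section {
  heading: string;
  text: string;
}

// Rough estimate: about 4 characters per token (an assumption, not the library's estimator).
function roughTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

function tokenize(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter((w) => w.length > 0);
}

// Number of distinct query words that also appear in the section.
function keywordOverlap(section: Section, query: string): number {
  const sectionWords = tokenize(section.heading + ' ' + section.text);
  const queryWords = tokenize(query);
  return queryWords.filter(
    (w, i) => queryWords.indexOf(w) === i && sectionWords.indexOf(w) !== -1
  ).length;
}

// Keep the highest-scoring sections that fit the budget, then restore page order.
function budgetSections(sections: Section[], query: string, maxTokens: number): Section[] {
  const ranked = sections
    .map((section, index) => ({ section, index, score: keywordOverlap(section, query) }))
    .sort((a, b) => b.score - a.score || a.index - b.index);

  const kept: typeof ranked = [];
  let used = 0;
  for (const candidate of ranked) {
    const cost = roughTokens(candidate.section.heading) + roughTokens(candidate.section.text);
    if (used + cost > maxTokens) continue; // skip sections that would overflow the budget
    kept.push(candidate);
    used += cost;
  }
  return kept.sort((a, b) => a.index - b.index).map((k) => k.section);
}

const demo = budgetSections(
  [
    { heading: 'About', text: 'We are a company founded in 2010.' },
    { heading: 'Pricing', text: 'Our pricing plans start at $10 per month.' },
  ],
  'pricing plans',
  15
);
console.log(demo.map((s) => s.heading));
```

Restoring the original order after selection keeps the trimmed output readable as a page rather than as a ranked list.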
Install
npm install agentos-browser
# Install Playwright browser (first time only, required for JS-heavy pages)
npx playwright install chromium
Quick Start
CLI
# Basic fetch — clean markdown
agentos-browser fetch https://example.com
# With a token budget
agentos-browser fetch https://example.com --budget 2000
# Query-aware: prioritize sections relevant to your question
agentos-browser fetch https://example.com --query "pricing plans"
# Multiple URLs in parallel
agentos-browser fetch https://a.com https://b.com https://c.com --budget 1000
# Output as JSON (includes metadata, sections, tables)
agentos-browser fetch https://example.com --format json
# Site adapter: get Hacker News top stories as clean JSON
agentos-browser api hackernews stories.top --format pretty
# Site adapter: get a subreddit
agentos-browser api reddit posts.hot --param subreddit=programming --format pretty
# LLM extraction: extract anything from any page
agentos-browser api:extract https://example.com --query "product prices and names"
Node.js API
import { browse, browseMany } from 'agentos-browser';
// Fetch and clean a page
const { text } = await browse('https://example.com', {
  maxTokens: 2000,
  query: 'pricing plans',
});
console.log(text); // clean, budget-trimmed markdown
// Multiple URLs in parallel
const results = await browseMany([
  'https://a.com',
  'https://b.com',
], { maxTokens: 1000, concurrency: 3 });
for (const r of results) {
  if (r.error) console.error(r.url, r.error);
  else console.log(r.content?.markdown);
}
Browser API
The Browser API lets agents interact with websites as if they were REST APIs — navigate to structured endpoints, extract clean JSON, fill forms, and maintain persistent sessions.
Concept
Instead of scraping raw HTML and hoping your LLM figures out the structure, you define an adapter that knows exactly where the data lives on a site. The adapter uses CSS selectors to target specific elements, and agentos-browser does the extraction.
import { BrowserAPI } from 'agentos-browser';
const api = new BrowserAPI();
// Call a built-in adapter endpoint
const result = await api.call('hackernews', 'stories.top');
console.log(result.data); // Array of {rank, title, url, score, author, comments}
// With URL params
const userResult = await api.call('hackernews', 'user.profile', { username: 'pg' });
await api.close();
CLI Commands
# Call a site adapter endpoint
agentos-browser api <site> <endpoint> [--param key=value...]
# Examples:
agentos-browser api hackernews stories.top
agentos-browser api hackernews user.profile --param username=dang
agentos-browser api reddit posts.hot --param subreddit=typescript
agentos-browser api reddit post.comments --param subreddit=typescript --param id=abc123
# LLM-powered extraction for sites without adapters
agentos-browser api:extract <url> --query <text> [--schema <json>]
agentos-browser api:extract https://news.ycombinator.com --query "top story titles and scores"
# Login to a site (persists session for future calls)
agentos-browser api:login <site> [--credential KEY=VALUE...]
agentos-browser api:login mysite --credential USERNAME=joe PASSWORD=secret
# List all available adapters
agentos-browser api:list
agentos-browser api:list --adapters-dir ./my-adapters
# Show adapters with their endpoints
agentos-browser api:adapters
agentos-browser api:adapters --format json
Built-in Adapters
| Adapter | Site | Endpoints |
|---|---|---|
| hackernews | news.ycombinator.com | stories.top, stories.newest, stories.ask, stories.show, stories.jobs, user.profile, item.detail |
| reddit | old.reddit.com | posts.hot, posts.new, posts.top, posts.rising, front.hot, front.top, post.comments, subreddit.about |
Writing Your Own Adapter
Adapters are simple TypeScript (or JavaScript) objects. Here's the full structure:
import type { SiteAdapter } from 'agentos-browser';
const myAdapter: SiteAdapter = {
  name: 'mysite',
  baseUrl: 'https://mysite.com',
  // Optional: login flow for sites that require authentication
  loginFlow: {
    url: 'https://mysite.com/login',
    steps: [
      {
        fill: {
          '#username': '{{env.MYSITE_USERNAME}}', // reads from env var
          '#password': '{{env.MYSITE_PASSWORD}}',
        },
        click: '#login-button',
        waitFor: '.dashboard',
      },
    ],
    successIndicator: '.user-avatar',
  },
  endpoints: {
    'items.list': {
      method: 'read',
      navigate: '/items', // relative to baseUrl
      waitFor: '.item-row', // wait for this selector before extracting
      extract: {
        selector: '.item-row', // matches each item/row
        fields: {
          // Simple: CSS sub-selector → textContent
          title: '.item-title',
          // Full control: selector + attribute + transform
          price: { selector: '.price', transform: 'number' },
          link: { selector: 'a', attribute: 'href' },
          date: { selector: '.date', transform: 'date' },
        },
      },
    },
    // Endpoint with URL params: navigate: '/items/:id'
    'items.detail': {
      method: 'read',
      navigate: '/items/:id', // :id is substituted from call params
      waitFor: '.item-detail',
      extract: {
        selector: '.detail-row',
        fields: { label: 'th', value: 'td' },
      },
    },
    // Write endpoint: fills a form and submits
    'items.create': {
      method: 'write',
      navigate: '/items/new',
      fill: {
        name: '#item-name', // param name → CSS selector
        description: '#item-desc',
      },
      submit: '#submit-button',
    },
  },
};
export default myAdapter;
Load custom adapters at runtime:
const api = new BrowserAPI({ adaptersDir: './my-adapters' });
// or
await api.loadAdapters('./my-adapters');
LLM Extraction Fallback
For sites without adapters, use api:extract or api.extract() to navigate to any URL and extract structured data using an LLM:
const api = new BrowserAPI();
// Requires OPENAI_API_KEY or ANTHROPIC_API_KEY in environment
const result = await api.extract(
  'https://stripe.com/pricing',
  'List all plan names and monthly prices',
  { type: 'array', items: { properties: { plan: {}, price: {} } } } // optional schema
);
console.log(result.data); // structured JSON
console.log(result.source); // 'llm-extraction'
If no LLM key is configured, the call falls back to returning raw markdown.
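The key-based fallback can be pictured as a small decision function. This is a hedged sketch only: the mode names and the order in which the two keys are checked are assumptions, since the README does not say which key wins when both are set.

```typescript
// Hypothetical mode names; the library's actual values may differ.
type ExtractionMode = 'openai' | 'anthropic' | 'raw-markdown';

// Assumption: OPENAI_API_KEY is checked before ANTHROPIC_API_KEY.
function chooseExtractionMode(env: Record<string, string | undefined>): ExtractionMode {
  if (env['OPENAI_API_KEY']) return 'openai';
  if (env['ANTHROPIC_API_KEY']) return 'anthropic';
  return 'raw-markdown'; // no key configured: return the cleaned markdown as-is
}

console.log(chooseExtractionMode({})); // "raw-markdown"
console.log(chooseExtractionMode({ ANTHROPIC_API_KEY: 'example-key' })); // "anthropic"
```

In a Node.js process the argument would typically be process.env.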
Persistent Auth
BrowserAPI stores browser profiles (cookies, localStorage) per site in ~/.agent-browser/profiles/<site-name>/. Once you log in, subsequent calls reuse the same session:
# Log in once (opens visible browser by default so you can handle 2FA)
agentos-browser api:login mysite --credential USERNAME=joe
# All future calls use the saved session
agentos-browser api mysite dashboard
Full CLI Reference
# Fetch a page
agentos-browser fetch <url> [options]
  --budget <tokens>     Max tokens to return (default: unlimited)
  --query <text>        Focus extraction on this query
  --format <type>       markdown | json | text (default: markdown)
  --no-cache            Skip cache (always fetch fresh)
  --force-playwright    Always use Playwright (headless browser)
  --no-playwright       Never use Playwright (faster, static only)
  --session <name>      Use named session for cookie persistence
  --timeout <ms>        Request timeout in milliseconds
  --concurrency <n>     Max parallel fetches (multi-URL)
  --show-links          Include links in JSON output
  --show-tables         Include tables in JSON output
  --verbose             Print fetch metadata to stderr
# Manage cache
agentos-browser cache stats
agentos-browser cache clear
# Browser API
agentos-browser api <site> <endpoint> [--param key=value...]
agentos-browser api:extract <url> --query <text> [--schema <json>]
agentos-browser api:login <site> [--credential KEY=VALUE...]
agentos-browser api:list [--adapters-dir <dir>]
agentos-browser api:adapters [--format table|json]
Node.js API Reference
import {
  browse, browseMany, // High-level convenience
  fetchPage, extractContent, // Low-level building blocks
  budgetContent, estimateTokens,
  ContentCache, SessionManager,
  BrowserAPI, createBrowserAPI,
  AdapterRegistry,
} from 'agentos-browser';
// High-level: fetch + extract + budget in one call
const { content, text } = await browse(url, {
  maxTokens: 2000, // token budget
  query: 'pricing', // relevance hint
  noCache: false, // skip cache
  session: 'mysite', // cookie session name
  timeout: 15_000,
  forcePlaywright: false,
  noPlaywright: false,
});
// content.title, content.markdown, content.sections, content.tables
// content.links, content.metadata, content.tokenEstimate
// Multiple URLs
const results = await browseMany(urls, { maxTokens: 1000, concurrency: 3 });
// Low-level
const page = await fetchPage(url, { noPlaywright: true });
// Named extractedContent so it does not redeclare `content` from the browse() call above
const extractedContent = await extractContent(page.html, { url: page.url, query: 'install' });
const trimmed = budgetContent(extractedContent, 1000);
const tokens = estimateTokens(trimmed);
// Cache
const cache = new ContentCache();
cache.set(url, content);
const cached = cache.get(url);
// Browser API
const api = new BrowserAPI({ headless: true, profileDir: '~/.agent-browser/profiles' });
const result = await api.call('hackernews', 'stories.top');
const extracted = await api.extract(url, 'article title and author');
await api.login('mysite', { USERNAME: 'joe', PASSWORD: 'secret' });
await api.close();
Architecture
agentos-browser/
├── src/
│ ├── fetcher.ts — fetch pages (auto Playwright vs HTTP)
│ ├── extractor.ts — Readability + Turndown + section parsing
│ │ stripCssNoise() removes inline CSS garbage
│ ├── budgeter.ts — token budget enforcement + query-aware sorting
│ ├── cache.ts — file-based LRU cache (~/.agent-browser/cache/)
│ ├── parallel.ts — concurrent multi-URL fetching
│ ├── session.ts — cookie persistence per session name
│ ├── browser-api.ts — BrowserAPI class + adapter execution engine
│ ├── cli.ts — Commander CLI (fetch, cache, api:*)
│ └── index.ts — public API exports
├── src/adapters/
│ ├── types.ts — SiteAdapter, Endpoint, ExtractRule types
│ ├── index.ts — AdapterRegistry + built-in adapter list
│ ├── hackernews.ts — Hacker News adapter
│ └── reddit.ts — Reddit adapter
├── bin/
│ ├── agentos-browser.js — CLI entry point
│ └── agent-browser.js — Backward-compatible alias
└── test/
    └── core.test.mjs — Unit tests (node:test)
Key Design Decisions
- Playwright auto-detection: simple fetch() first, Playwright only if the page looks like a SPA (detects React, Next.js, Vue, Angular, Nuxt, Svelte signatures)
- Readability first: Mozilla's battle-tested article extraction algorithm, same as Firefox Reader Mode
- CSS noise stripping: Reddit, Twitter, and other SPAs inline their entire CSS framework. stripCssNoise() removes this before Readability runs, preventing garbage output
- Query-aware budgeting: when you provide --query, sections are scored by keyword overlap and highest-relevance sections are prioritized within the token budget
- File-based cache: no Redis, no database, just JSON files in ~/.agent-browser/cache/. 100MB LRU eviction, 1-hour TTL
- Adapter system: declarative CSS-selector mappings let you extract structured data from any site without writing scraper code
- Persistent browser profiles: Playwright's launchPersistentContext() saves cookies and localStorage per site, enabling one-time login
- LLM fallback: when no adapter exists, pipe clean markdown to GPT-4o-mini or Claude Haiku for zero-shot extraction
- ESM + TypeScript strict: modern, type-safe, no CommonJS legacy
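The fetch-first strategy in the first bullet depends on deciding, from static HTML alone, whether a page needs a real browser. The sketch below shows one plausible heuristic. The signature patterns and the 200-character threshold are illustrative assumptions, not the checks agentos-browser actually performs.

```typescript
// Hypothetical markers that the named frameworks commonly leave in served HTML.
const SPA_SIGNATURES: RegExp[] = [
  /<div[^>]+id=["'](root|app|__next|__nuxt)["']/i, // common mount points
  /__NEXT_DATA__/, // Next.js
  /window\.__NUXT__/, // Nuxt
  /data-reactroot/i, // older React
  /ng-version=/i, // Angular
  /data-sveltekit-/i, // SvelteKit
];

// Length of the text a reader would see, ignoring scripts, styles, and tags.
function visibleTextLength(html: string): number {
  return html
    .replace(/<script[\s\S]*?<\/script>/gi, ' ')
    .replace(/<style[\s\S]*?<\/style>/gi, ' ')
    .replace(/<[^>]+>/g, ' ')
    .replace(/\s+/g, ' ')
    .trim().length;
}

// A page "looks like a SPA" when it carries a framework signature but almost no text.
function looksLikeSpa(html: string, minTextLength: number = 200): boolean {
  const hasSignature = SPA_SIGNATURES.some((re) => re.test(html));
  return hasSignature && visibleTextLength(html) < minTextLength;
}

const shell = '<html><body><div id="root"></div><script src="/bundle.js"></script></body></html>';
console.log(looksLikeSpa(shell)); // true: empty mount point, so a headless browser is needed
```

Requiring both a signature and a near-empty body keeps server-rendered pages from these frameworks on the fast HTTP path, since their content is already in the HTML.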
Comparison
| Tool | Token-aware | Local-first | Adapter system | Free |
|---|---|---|---|---|
| agentos-browser | ✅ built-in | ✅ runs locally | ✅ declarative adapters | ✅ MIT |
| Firecrawl | ❌ returns full markdown | ❌ cloud API | ❌ | ❌ paid |
| Browse AI | ❌ screenshot-based | ❌ cloud only | ⚠️ proprietary | ❌ paid |
| Browser Use | ❌ LLM-driven (expensive) | ✅ | ❌ | ✅ MIT |
| Stagehand | ❌ LLM-driven | ✅ | ❌ | ✅ MIT |
Of the tools compared above, agentos-browser is the only one built specifically for token efficiency. It is not trying to automate a browser; it is trying to get the right content into your agent's context as cheaply as possible.
Philosophy
AI agents need context, not data. The difference matters enormously when you're paying per token or working within a context window.
The typical web fetch pipeline looks like:
URL → raw HTML → agent context
That's like asking someone to read a book by staring at the printer's raw press plates. The information is technically there, buried under formatting directives, whitespace, metadata, and mechanical noise.
agentos-browser interposes a context layer:
URL → raw HTML → CSS stripped → Readability → Markdown → Budget → agent context
Each stage removes noise. The output is semantically dense: every token carries meaning.
This is the philosophy behind the Context Engine — the idea that what AI agents need isn't access to raw data, but intelligently pre-processed, relevance-ranked, budget-aware context packets. agentos-browser applies that philosophy to the open web.
The adapter system extends this further: instead of asking the LLM to make sense of a scraped page, you tell it exactly where the data is. The result is deterministic, structured, and cheap.
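Two small mechanics from the adapter format show why the result is deterministic: substituting :id style params into a navigate path, and applying a field transform to extracted text. The helpers below are hypothetical and only sketch how such rules could be evaluated; they are not the library's internals.

```typescript
type Params = Record<string, string | undefined>;

// Replace ":name" placeholders in a navigate path with URL-encoded call params.
function resolvePath(template: string, params: Params): string {
  return template.replace(/:([A-Za-z_][A-Za-z0-9_]*)/g, (_match, name: string) => {
    const value = params[name];
    if (value === undefined) throw new Error('Missing param: ' + name);
    return encodeURIComponent(value);
  });
}

type Transform = 'number' | 'date';

// Turn raw element text into a typed value; null signals an unparseable field.
function applyTransform(raw: string, transform?: Transform): string | number | null {
  const text = raw.trim();
  if (transform === 'number') {
    const n = parseFloat(text.replace(/[^0-9.\-]/g, '')); // drop currency symbols and commas
    return isNaN(n) ? null : n;
  }
  if (transform === 'date') {
    const d = new Date(text);
    return isNaN(d.getTime()) ? null : d.toISOString();
  }
  return text; // no transform: plain textContent
}

console.log(resolvePath('/items/:id', { id: 'abc 123' })); // "/items/abc%20123"
console.log(applyTransform('$1,299.50', 'number')); // 1299.5
console.log(applyTransform('2024-01-15', 'date')); // "2024-01-15T00:00:00.000Z"
```

Because every step is a pure function of the selector output and the call params, the same page yields the same JSON on every call, with no LLM involved.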
Requirements
- Node.js 20+
- For JS-rendered pages: npx playwright install chromium
- For LLM extraction (api:extract): OPENAI_API_KEY or ANTHROPIC_API_KEY in environment
Contributing
Contributions welcome! The most valuable contributions right now:
- New adapters: add a file to src/adapters/ and export it from src/adapters/index.ts
- Extractor improvements: better handling of SPAs, paywalls, dynamic content
- Test coverage: integration tests, more edge cases in the budget logic
- Documentation: more examples, adapter authoring guide
Please open an issue before submitting large PRs.
License
MIT — Copyright (c) 2026 Joe McLaughlin
See LICENSE for full text.
