agentos-browser

v0.1.0

Published

3 months ago

Token-efficient web browsing for AI agents. Your agent needs 500 tokens of content, not 50,000 tokens of HTML.

npm install agentos-browser

The Problem

When an AI agent reads a web page, the naive approach dumps raw HTML into context. This is catastrophically wasteful:

| Approach | Typical Token Count | Cost at $3/1M tokens |
|---|---|---|
| Raw HTML (Reddit front page) | ~50,000 tokens | $0.15 per page |
| After Readability extraction | ~5,000 tokens | $0.015 per page |
| agentos-browser (with budget) | ~500 tokens | $0.0015 per page |

Raw HTML is mostly noise (up to 99% on the pages benchmarked below): nav bars, footers, scripts, ads, inline styles, ARIA attributes, tracking pixels. None of it helps your agent answer the question.
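
The per-page costs in the table follow from simple arithmetic. Here is a minimal sketch, assuming a flat rate of $3 per 1M tokens; `costPerPage` is a hypothetical helper written for illustration and is not part of agentos-browser:

```typescript
// Hypothetical helper: converts a token count into USD at a flat per-million rate.
function costPerPage(tokens: number, usdPerMillionTokens: number): number {
  return (tokens * usdPerMillionTokens) / 1_000_000;
}

const rate = 3; // $3 per 1M tokens, the rate assumed in the table
const rows: Array<[string, number]> = [
  ["Raw HTML", 50_000],
  ["After Readability extraction", 5_000],
  ["agentos-browser (with budget)", 500],
];

for (const [label, tokens] of rows) {
  console.log(`${label}: $${costPerPage(tokens, rate).toFixed(4)} per page`);
}
```

The saving compounds quickly: an agent that reads many pages per task pays the per-page difference on every fetch.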

Real-World Benchmarks

| Site | Raw HTML | After extraction | Reduction |
|---|---|---|---|
| Reddit front page | ~50,000 tokens | 514 tokens | 99% |
| Hacker News top stories | ~45,000 tokens | 1,847 tokens structured JSON | 96% |
| MLB stats page | ~2,106 tokens | 965 tokens | 54% |
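
The Reduction column can be reproduced from the two token counts in each row. The sketch below assumes the percentage is rounded to the nearest whole number; `reductionPercent` is a hypothetical name used only for this example:

```typescript
// Hypothetical helper: percentage of tokens removed, rounded to a whole number.
function reductionPercent(rawTokens: number, extractedTokens: number): number {
  if (rawTokens <= 0) throw new RangeError("rawTokens must be positive");
  return Math.round((1 - extractedTokens / rawTokens) * 100);
}

console.log(reductionPercent(50_000, 514)); // Reddit front page
console.log(reductionPercent(45_000, 1_847)); // Hacker News top stories
console.log(reductionPercent(2_106, 965)); // MLB stats page
```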

Solution

agentos-browser is a context engine for the web. It:

1. Fetches the page (auto-detects JS-rendered SPAs, uses Playwright when needed)
2. Extracts the real content (Mozilla Readability strips all the noise)
3. Converts to clean markdown (human-readable, token-efficient)
4. Budgets the output to your token limit (query-aware prioritization)
5. Caches results (avoid refetching within 1 hour)
6. Adapts to structured sites via the Browser API (get clean JSON from any site)

The result: your agent gets clean, relevant content that fits in its context window.
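
To make the budgeting step concrete, here is a minimal sketch of query-aware trimming. It is not the library's implementation: the `Section` shape, the 4-characters-per-token estimate, and the greedy packing strategy are all assumptions chosen for illustration.

```typescript
interface Section {
  heading: string;
  body: string;
}

// Assumption: roughly 4 characters per token for English text.
function estimateTokensApprox(text: string): number {
  return Math.ceil(text.length / 4);
}

function tokenize(text: string): string[] {
  return text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
}

// Score = number of distinct query words that also occur in the section.
function relevance(section: Section, query: string): number {
  const words = tokenize(section.heading + " " + section.body);
  const queryWords = tokenize(query).filter((w, i, all) => all.indexOf(w) === i);
  return queryWords.filter((w) => words.indexOf(w) !== -1).length;
}

// Greedily keep the highest-scoring sections that fit within maxTokens.
function budgetSections(sections: Section[], query: string, maxTokens: number): Section[] {
  const ranked = sections
    .map((section, index) => ({ section, index, score: relevance(section, query) }))
    .sort((a, b) => b.score - a.score || a.index - b.index);

  const kept: { section: Section; index: number }[] = [];
  let used = 0;
  for (const item of ranked) {
    const cost = estimateTokensApprox(item.section.heading + "\n" + item.section.body);
    if (used + cost > maxTokens) continue; // skip sections that do not fit
    used += cost;
    kept.push(item);
  }
  // Restore document order so the output still reads top to bottom.
  return kept.sort((a, b) => a.index - b.index).map((k) => k.section);
}
```

Restoring document order after selection keeps the trimmed page readable, at the cost of a second sort.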

Install

npm install agentos-browser

# Install Playwright browser (first time only, required for JS-heavy pages)
npx playwright install chromium

Quick Start

CLI

# Basic fetch — clean markdown
agentos-browser fetch https://example.com

# With a token budget
agentos-browser fetch https://example.com --budget 2000

# Query-aware: prioritize sections relevant to your question
agentos-browser fetch https://example.com --query "pricing plans"

# Multiple URLs in parallel
agentos-browser fetch https://a.com https://b.com https://c.com --budget 1000

# Output as JSON (includes metadata, sections, tables)
agentos-browser fetch https://example.com --format json

# Site adapter: get Hacker News top stories as clean JSON
agentos-browser api hackernews stories.top --format pretty

# Site adapter: get a subreddit
agentos-browser api reddit posts.hot --param subreddit=programming --format pretty

# LLM extraction: extract anything from any page
agentos-browser api:extract https://example.com --query "product prices and names"

Node.js API

import { browse, browseMany } from 'agentos-browser';

// Fetch and clean a page
const { text } = await browse('https://example.com', {
  maxTokens: 2000,
  query: 'pricing plans',
});
console.log(text); // clean, budget-trimmed markdown

// Multiple URLs in parallel
const results = await browseMany([
  'https://a.com',
  'https://b.com',
], { maxTokens: 1000, concurrency: 3 });

for (const r of results) {
  if (r.error) console.error(r.url, r.error);
  else console.log(r.content?.markdown);
}
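
The `concurrency` option caps how many fetches run at once. One common way to build such a cap is a small worker pool, sketched below. This is an illustration of the general technique, not the code in agentos-browser; `mapWithConcurrency`, `fakeFetch`, and `fetchAll` are hypothetical names, and `fakeFetch` stands in for a real network call.

```typescript
// Run `worker` over `items` with at most `limit` calls in flight.
async function mapWithConcurrency<T, R>(
  items: T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;

  async function run(): Promise<void> {
    while (next < items.length) {
      const index = next++; // claim the next unprocessed index
      results[index] = await worker(items[index], index);
    }
  }

  const runners: Promise<void>[] = [];
  const poolSize = Math.min(Math.max(1, limit), items.length);
  for (let i = 0; i < poolSize; i++) runners.push(run());
  await Promise.all(runners);
  return results;
}

// Stand-in for a real fetch so the sketch runs offline.
const fakeFetch = async (url: string): Promise<string> => {
  if (url.indexOf("bad") !== -1) throw new Error("unreachable");
  return "content of " + url;
};

type Outcome = { url: string; text?: string; error?: string };

// Capture failures per URL so one bad page does not reject the whole batch.
function fetchAll(urls: string[], concurrency: number): Promise<Outcome[]> {
  return mapWithConcurrency(urls, concurrency, async (url): Promise<Outcome> => {
    try {
      return { url, text: await fakeFetch(url) };
    } catch (err) {
      return { url, error: err instanceof Error ? err.message : String(err) };
    }
  });
}
```

Results come back in input order regardless of which fetch finishes first, which matches the per-result `error` check in the loop above.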

Browser API

The Browser API lets agents interact with websites as if they were REST APIs — navigate to structured endpoints, extract clean JSON, fill forms, and maintain persistent sessions.

Concept

Instead of scraping raw HTML and hoping your LLM figures out the structure, you define an adapter that knows exactly where the data lives on a site. The adapter uses CSS selectors to target specific elements, and agentos-browser does the extraction.

import { BrowserAPI } from 'agentos-browser';

const api = new BrowserAPI();

// Call a built-in adapter endpoint
const result = await api.call('hackernews', 'stories.top');
console.log(result.data); // Array of {rank, title, url, score, author, comments}

// With URL params
const userResult = await api.call('hackernews', 'user.profile', { username: 'pg' });

await api.close();
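
Once an adapter returns structured rows, they can be rendered into a compact form for the agent's prompt. The sketch below assumes the field types of the `stories.top` result (numbers for `rank`, `score`, and `comments`, strings for the rest); the `Story` interface and `toPromptLines` are hypothetical and not exported by agentos-browser.

```typescript
// Assumed shape, based on the fields listed in the comment above.
interface Story {
  rank: number;
  title: string;
  url: string;
  score: number;
  author: string;
  comments: number;
}

// Render at most `limit` stories as one short line each.
function toPromptLines(stories: Story[], limit: number): string {
  return stories
    .slice(0, limit)
    .map((s) => `${s.rank}. ${s.title} (${s.score} points, ${s.comments} comments) ${s.url}`)
    .join("\n");
}
```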

CLI Commands

# Call a site adapter endpoint
agentos-browser api <site> <endpoint> [--param key=value...]

# Examples:
agentos-browser api hackernews stories.top
agentos-browser api hackernews user.profile --param username=dang
agentos-browser api reddit posts.hot --param subreddit=typescript
agentos-browser api reddit post.comments --param subreddit=typescript --param id=abc123

# LLM-powered extraction for sites without adapters
agentos-browser api:extract <url> --query <text> [--schema <json>]
agentos-browser api:extract https://news.ycombinator.com --query "top story titles and scores"

# Login to a site (persists session for future calls)
agentos-browser api:login <site> [--credential KEY=VALUE...]
agentos-browser api:login mysite --credential USERNAME=joe PASSWORD=secret

# List all available adapters
agentos-browser api:list
agentos-browser api:list --adapters-dir ./my-adapters

# Show adapters with their endpoints
agentos-browser api:adapters
agentos-browser api:adapters --format json
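
Repeated `--param key=value` flags end up as a plain params object. A minimal sketch of that parsing, assuming values may themselves contain `=`; `parseParams` is a hypothetical helper and does not claim to match the CLI's internals:

```typescript
// Turn ["subreddit=typescript", "id=abc123"] into { subreddit: "typescript", id: "abc123" }.
function parseParams(pairs: string[]): Record<string, string> {
  const params: Record<string, string> = {};
  for (const pair of pairs) {
    const eq = pair.indexOf("=");
    if (eq <= 0) throw new Error(`Expected key=value, got "${pair}"`);
    // Split on the first "=" only, so the value may contain "=" itself.
    params[pair.slice(0, eq)] = pair.slice(eq + 1);
  }
  return params;
}
```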

Built-in Adapters

| Adapter | Site | Endpoints |
|---|---|---|
| hackernews | news.ycombinator.com | stories.top, stories.newest, stories.ask, stories.show, stories.jobs, user.profile, item.detail |
| reddit | old.reddit.com | posts.hot, posts.new, posts.top, posts.rising, front.hot, front.top, post.comments, subreddit.about |

Writing Your Own Adapter

Adapters are simple TypeScript (or JavaScript) objects. Here's the full structure:

import type { SiteAdapter } from 'agentos-browser';

const myAdapter: SiteAdapter = {
  name: 'mysite',
  baseUrl: 'https://mysite.com',

  // Optional: login flow for sites that require authentication
  loginFlow: {
    url: 'https://mysite.com/login',
    steps: [
      {
        fill: {
          '#username': '{{env.MYSITE_USERNAME}}',  // reads from env var
          '#password': '{{env.MYSITE_PASSWORD}}',
        },
        click: '#login-button',
        waitFor: '.dashboard',
      },
    ],
    successIndicator: '.user-avatar',
  },

  endpoints: {
    'items.list': {
      method: 'read',
      navigate: '/items',        // relative to baseUrl
      waitFor: '.item-row',      // wait for this selector before extracting
      extract: {
        selector: '.item-row',   // matches each item/row
        fields: {
          // Simple: CSS sub-selector → textContent
          title: '.item-title',

          // Full control: selector + attribute + transform
          price: { selector: '.price', transform: 'number' },
          link: { selector: 'a', attribute: 'href' },
          date: { selector: '.date', transform: 'date' },
        },
      },
    },

    // Endpoint with URL params: navigate: '/items/:id'
    'items.detail': {
      method: 'read',
      navigate: '/items/:id',    // :id is substituted from call params
      waitFor: '.item-detail',
      extract: {
        selector: '.detail-row',
        fields: { label: 'th', value: 'td' },
      },
    },

    // Write endpoint: fills a form and submits
    'items.create': {
      method: 'write',
      navigate: '/items/new',
      fill: {
        name: '#item-name',      // param name → CSS selector
        description: '#item-desc',
      },
      submit: '#submit-button',
    },
  },
};

export default myAdapter;
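
Three pieces of the adapter format are easy to picture in code: `:id` path params, `{{env.NAME}}` placeholders, and the `number` and `date` transforms. The sketch below shows one plausible way each could be resolved. All three functions are hypothetical and written for this example; the actual behavior in agentos-browser (URL encoding, error handling, date output format) may differ.

```typescript
type Params = Record<string, string>;

// Replace ":name" path segments with URL-encoded values from params.
function resolvePath(template: string, params: Params): string {
  return template.replace(/:([A-Za-z_][A-Za-z0-9_]*)/g, (_match, name: string) => {
    const value = params[name];
    if (value === undefined) throw new Error(`Missing param "${name}"`);
    return encodeURIComponent(value);
  });
}

// Replace "{{env.NAME}}" placeholders using the supplied environment map.
function resolveEnvTemplate(template: string, env: Record<string, string | undefined>): string {
  return template.replace(/\{\{env\.([A-Za-z_][A-Za-z0-9_]*)\}\}/g, (_match, name: string) => {
    const value = env[name];
    if (value === undefined) throw new Error(`Environment variable ${name} is not set`);
    return value;
  });
}

// Assumed semantics: 'number' strips currency symbols and separators,
// 'date' normalizes to an ISO 8601 string; unparseable input becomes null.
function applyTransform(raw: string, transform?: "number" | "date"): string | number | null {
  const text = raw.trim();
  if (transform === "number") {
    const n = parseFloat(text.replace(/[^0-9.\-]/g, ""));
    return isNaN(n) ? null : n;
  }
  if (transform === "date") {
    const t = Date.parse(text);
    return isNaN(t) ? null : new Date(t).toISOString();
  }
  return text;
}
```

Passing the environment in as an argument, rather than reading it globally, keeps the functions easy to test and makes it explicit which secrets an adapter can see.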

Load custom adapters at runtime:

const api = new BrowserAPI({ adaptersDir: './my-adapters' });
// or
await api.loadAdapters('./my-adapters');

LLM Extraction Fallback

For sites without adapters, use api:extract or api.extract() to navigate to any URL and extract structured data using an LLM:

const api = new BrowserAPI();

// Requires OPENAI_API_KEY or ANTHROPIC_API_KEY in environment
const result = await api.extract(
  'https://stripe.com/pricing',
  'List all plan names and monthly prices',
  { type: 'array', items: { properties: { plan: {}, price: {} } } } // optional schema
);

console.log(result.data); // structured JSON
console.log(result.source); // 'llm-extraction'

If no LLM key is configured, the call falls back to returning raw markdown.
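
The fallback rule can be sketched as a small decision function. The `'llm-extraction'` source label comes from the example above; the `'markdown'` label, the preference for OpenAI when both keys are set, and the name `chooseExtractionMode` are assumptions made for this illustration.

```typescript
type ExtractionMode =
  | { source: "llm-extraction"; provider: "openai" | "anthropic" }
  | { source: "markdown" };

// Decide how to extract based on which API keys are present.
// Assumption: OpenAI is preferred when both keys are set.
function chooseExtractionMode(env: Record<string, string | undefined>): ExtractionMode {
  if (env["OPENAI_API_KEY"]) return { source: "llm-extraction", provider: "openai" };
  if (env["ANTHROPIC_API_KEY"]) return { source: "llm-extraction", provider: "anthropic" };
  return { source: "markdown" }; // no key: hand back the cleaned page as-is
}
```

Callers can branch on `source` to tell structured JSON apart from the markdown fallback.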

Persistent Auth

BrowserAPI stores browser profiles (cookies, localStorage) per site in ~/.agent-browser/profiles/<site-name>/. Once you log in, subsequent calls reuse the same session:

# Log in once (opens visible browser by default so you can handle 2FA)
agentos-browser api:login mysite --credential USERNAME=joe

# All future calls use the saved session
agentos-browser api mysite dashboard

Full CLI Reference

# Fetch a page
agentos-browser fetch <url> [options]
  --budget <tokens>    Max tokens to return (default: unlimited)
  --query <text>       Focus extraction on this query
  --format <type>      markdown | json | text (default: markdown)
  --no-cache           Skip cache (always fetch fresh)
  --force-playwright   Always use Playwright (headless browser)
  --no-playwright      Never use Playwright (faster, static only)
  --session <name>     Use named session for cookie persistence
  --timeout <ms>       Request timeout in milliseconds
  --concurrency <n>    Max parallel fetches (multi-URL)
  --show-links         Include links in JSON output
  --show-tables        Include tables in JSON output
  --verbose            Print fetch metadata to stderr

# Manage cache
agentos-browser cache stats
agentos-browser cache clear

# Browser API
agentos-browser api <site> <endpoint> [--param key=value...]
agentos-browser api:extract <url> --query <text> [--schema <json>]
agentos-browser api:login <site> [--credential KEY=VALUE...]
agentos-browser api:list [--adapters-dir <dir>]
agentos-browser api:adapters [--format table|json]

Node.js API Reference

import {
  browse, browseMany,        // High-level convenience
  fetchPage, extractContent, // Low-level building blocks
  budgetContent, estimateTokens,
  ContentCache, SessionManager,
  BrowserAPI, createBrowserAPI,
  AdapterRegistry,
} from 'agentos-browser';

// High-level: fetch + extract + budget in one call
const { content, text } = await browse(url, {
  maxTokens: 2000,    // token budget
  query: 'pricing',   // relevance hint
  noCache: false,     // set to true to bypass the cache
  session: 'mysite',  // cookie session name
  timeout: 15_000,
  forcePlaywright: false,
  noPlaywright: false,
});

// content.title, content.markdown, content.sections, content.tables
// content.links, content.metadata, content.tokenEstimate

// Multiple URLs
const results = await browseMany(urls, { maxTokens: 1000, concurrency: 3 });

// Low-level (a separate name avoids redeclaring `content` from browse() above)
const page = await fetchPage(url, { noPlaywright: true });
const extractedContent = await extractContent(page.html, { url: page.url, query: 'install' });
const trimmed = budgetContent(extractedContent, 1000);
const tokens = estimateTokens(trimmed);

// Cache
const cache = new ContentCache();
cache.set(url, content);
const cached = cache.get(url);

// Browser API
const api = new BrowserAPI({ headless: true, profileDir: '~/.agent-browser/profiles' });
const result = await api.call('hackernews', 'stories.top');
const extracted = await api.extract(url, 'article title and author');
await api.login('mysite', { USERNAME: 'joe', PASSWORD: 'secret' });
await api.close();

Architecture

agentos-browser/
├── src/
│   ├── fetcher.ts          — fetch pages (auto Playwright vs HTTP)
│   ├── extractor.ts        — Readability + Turndown + section parsing
│   │                         stripCssNoise() removes inline CSS garbage
│   ├── budgeter.ts         — token budget enforcement + query-aware sorting
│   ├── cache.ts            — file-based LRU cache (~/.agent-browser/cache/)
│   ├── parallel.ts         — concurrent multi-URL fetching
│   ├── session.ts          — cookie persistence per session name
│   ├── browser-api.ts      — BrowserAPI class + adapter execution engine
│   ├── cli.ts              — Commander CLI (fetch, cache, api:*)
│   └── index.ts            — public API exports
├── src/adapters/
│   ├── types.ts            — SiteAdapter, Endpoint, ExtractRule types
│   ├── index.ts            — AdapterRegistry + built-in adapter list
│   ├── hackernews.ts       — Hacker News adapter
│   └── reddit.ts           — Reddit adapter
├── bin/
│   ├── agentos-browser.js  — CLI entry point
│   └── agent-browser.js    — Backward-compatible alias
└── test/
    └── core.test.mjs       — Unit tests (node:test)
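
The cache is described as file-based with a 1-hour TTL. The expiry check itself is independent of storage, so it can be sketched in memory. `TtlCache` below is a hypothetical class for illustration only; it does not write files, does not implement LRU eviction, and is not the library's `ContentCache`.

```typescript
interface CacheEntry<V> {
  value: V;
  storedAt: number; // milliseconds since epoch
}

class TtlCache<V> {
  private entries = new Map<string, CacheEntry<V>>();
  private ttlMs: number;
  private now: () => number;

  // The clock is injectable so expiry can be tested without waiting.
  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, storedAt: this.now() });
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() - entry.storedAt > this.ttlMs) {
      this.entries.delete(key); // expired: drop it so the caller refetches
      return undefined;
    }
    return entry.value;
  }
}

const ONE_HOUR_MS = 60 * 60 * 1000;
const pageCache = new TtlCache<string>(ONE_HOUR_MS);
pageCache.set("https://example.com", "# Example Domain");
```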

Key Design Decisions

- Playwright auto-detection: simple fetch() first, Playwright only if the page looks like a SPA (detects React, Next.js, Vue, Angular, Nuxt, Svelte signatures)
- Readability first: Mozilla's battle-tested article extraction algorithm, same as Firefox Reader Mode
- CSS noise stripping: Reddit, Twitter, and other SPAs inline their entire CSS framework. stripCssNoise() removes this before Readability runs, preventing garbage output
- Query-aware budgeting: when you provide --query, sections are scored by keyword overlap and highest-relevance sections are prioritized within the token budget
- File-based cache: no Redis, no database, just JSON files in ~/.agent-browser/cache/. 100MB LRU eviction, 1-hour TTL
- Adapter system: declarative CSS-selector mappings let you extract structured data from any site without writing scraper code
- Persistent browser profiles: Playwright's launchPersistentContext() saves cookies and localStorage per site, enabling one-time login
- LLM fallback: when no adapter exists, pipe clean markdown to GPT-4o-mini or Claude Haiku for zero-shot extraction
- ESM + TypeScript strict: modern, type-safe, no CommonJS legacy
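
As an illustration of the Playwright auto-detection idea, the sketch below checks statically fetched HTML for a few well-known framework markers. The specific patterns and the name `detectSpa` are assumptions chosen for this example; the signatures agentos-browser actually looks for are not documented here and may differ.

```typescript
// Illustrative markers only; real detection would likely combine more signals.
const SPA_SIGNATURES: Array<{ framework: string; pattern: RegExp }> = [
  { framework: "Next.js", pattern: /id="__next"|__NEXT_DATA__/ },
  { framework: "Nuxt", pattern: /id="__nuxt"|window\.__NUXT__/ },
  { framework: "React", pattern: /data-reactroot|id="root"><\/div>/ },
  { framework: "Vue", pattern: /data-v-app|id="app"><\/div>/ },
  { framework: "Angular", pattern: /ng-version=|<app-root/ },
  { framework: "Svelte", pattern: /class="[^"]*\bsvelte-[a-z0-9]+/ },
];

// Returns the first framework whose marker appears in the HTML, or null.
function detectSpa(html: string): string | null {
  for (const { framework, pattern } of SPA_SIGNATURES) {
    if (pattern.test(html)) return framework;
  }
  return null;
}

// A caller would fall back to a headless browser only when a marker is found.
function needsBrowser(html: string): boolean {
  return detectSpa(html) !== null;
}
```

A marker match is only a hint: server-rendered framework pages may already contain their content, so a production heuristic would probably also check how much readable text the static HTML holds.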

Comparison

| Tool | Token-aware | Local-first | Adapter system | Free |
|---|---|---|---|---|
| agentos-browser | ✅ built-in | ✅ runs locally | ✅ declarative adapters | ✅ MIT |
| Firecrawl | ❌ returns full markdown | ❌ cloud API | ❌ | ❌ paid |
| Browse AI | ❌ screenshot-based | ❌ cloud only | ⚠️ proprietary | ❌ paid |
| Browser Use | ❌ LLM-driven (expensive) | ✅ | ❌ | ✅ MIT |
| Stagehand | ❌ LLM-driven | ✅ | ❌ | ✅ MIT |

Among the tools compared above, agentos-browser is the only one built specifically for token efficiency. It is not trying to automate a browser; it is trying to get the right content into your agent's context as cheaply as possible.

Philosophy

AI agents need context, not data. The difference matters enormously when you're paying per token or working within a context window.

The typical web fetch pipeline looks like:

URL → raw HTML → agent context

That's like asking someone to read a book by staring at the printer's raw press plates. The information is technically there — buried under formatting directives, whitespace, metadata, and mechanical noise.

agentos-browser interposes a context layer:

URL → raw HTML → CSS stripped → Readability → Markdown → Budget → agent context

Each stage removes noise. The output is semantically dense: every token carries meaning.
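
The "CSS stripped" stage can be pictured as a pre-pass that deletes style blocks and inline style attributes before content extraction. The sketch below is a regex approximation written for illustration; it is not the library's `stripCssNoise()`, and a robust version would use an HTML parser rather than regular expressions.

```typescript
// Hypothetical pre-pass: remove <style> blocks and inline style="..." attributes.
function stripInlineCss(html: string): string {
  return html
    .replace(/<style\b[^>]*>[\s\S]*?<\/style\s*>/gi, "")
    .replace(/\sstyle\s*=\s*("[^"]*"|'[^']*')/gi, "");
}

const noisy = '<style>.a{color:red}</style><p style="margin:0">Hi</p>';
console.log(stripInlineCss(noisy));
```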

This is the philosophy behind the Context Engine — the idea that what AI agents need isn't access to raw data, but intelligently pre-processed, relevance-ranked, budget-aware context packets. agentos-browser applies that philosophy to the open web.

The adapter system extends this further: instead of asking the LLM to make sense of a scraped page, you tell it exactly where the data is. The result is deterministic, structured, and cheap.

Requirements

- Node.js 20+
- For JS-rendered pages: npx playwright install chromium
- For LLM extraction (api:extract): OPENAI_API_KEY or ANTHROPIC_API_KEY in environment

Contributing

Contributions welcome! The most valuable contributions right now:

- New adapters: add a file to src/adapters/ and export it from src/adapters/index.ts
- Extractor improvements: better handling of SPAs, paywalls, dynamic content
- Test coverage: integration tests, more edge cases in the budget logic
- Documentation: more examples, adapter authoring guide

Please open an issue before submitting large PRs.

License

See LICENSE for full text.
