@unbrowser/local

v0.5.3

Published

2 months ago

Full local browser engine for AI agents. Includes Playwright for complete page rendering, MCP server, and learning capabilities.

unbrowser

This is NOT an AI-enhanced browser for humans. This is a web browser where the user is an LLM.

A browser that AI agents control directly via MCP. It learns from every interaction, discovers APIs automatically, and progressively optimizes to bypass rendering entirely. Machine-first, not human-first.

What This Actually Does

When an LLM browses with unbrowser:

First visit: Uses tiered rendering (fastest method that works)
Learning: Discovers APIs, learns selectors, builds reusable skills
Future visits: Often skips browser rendering entirely, making access at least 10x faster (~200ms instead of ~2-5s)

First visit:  LLM -> smart_browse -> Full render (~2-5s) -> Content + learned patterns
Next visit:   LLM -> smart_browse -> API call (~200ms)   -> Same content, much faster
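The render-then-shortcut flow above can be sketched as a small cache in front of a renderer. This is only an illustration under stated assumptions: `LearningFetcher`, `fullRender`, and `callApi` are hypothetical names, not part of the unbrowser API, and the real learning logic is certainly richer than a per-URL map.

```typescript
// Hypothetical sketch: none of these names come from the unbrowser package.
type RenderResult = { content: string; apiUrl?: string };

class LearningFetcher {
  // Maps a page URL to an API endpoint discovered during an earlier render.
  private learned = new Map<string, string>();
  private fullRender: (url: string) => Promise<RenderResult>;
  private callApi: (apiUrl: string) => Promise<string>;

  constructor(
    fullRender: (url: string) => Promise<RenderResult>,
    callApi: (apiUrl: string) => Promise<string>,
  ) {
    this.fullRender = fullRender;
    this.callApi = callApi;
  }

  async browse(url: string): Promise<{ content: string; via: "render" | "api" }> {
    const apiUrl = this.learned.get(url);
    if (apiUrl !== undefined) {
      try {
        // Fast path: reuse the endpoint learned on a previous visit.
        return { content: await this.callApi(apiUrl), via: "api" };
      } catch {
        // The learned pattern went stale: forget it and render again.
        this.learned.delete(url);
      }
    }
    // Slow path: full render, remembering any API seen along the way.
    const rendered = await this.fullRender(url);
    if (rendered.apiUrl !== undefined) this.learned.set(url, rendered.apiUrl);
    return { content: rendered.content, via: "render" };
  }
}
```

The fallback on a failed API call matters: a learned shortcut is only an optimization, so a stale endpoint should cost one extra render rather than an error.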

What This Does NOT Do

Not a visual browser - No screenshots, no visual rendering for humans
Not magic - Complex JS-heavy sites still need the browser
Not stealth - Sites with aggressive bot detection may block it
No code generation - LLMs use MCP tools directly, no Puppeteer scripts needed

Installation

npm install unbrowser

If cloning from source: Run npm run build before using. The package exports point to compiled code in dist/ which isn't checked into git:

git clone https://github.com/rabbit-found/unbrowser
cd unbrowser
npm install
npm run build  # Required! Compiles src/ -> dist/

Optional Dependencies

Both of these are optional and the package works without them:

# For full browser rendering (recommended for best compatibility)
npm install playwright
npx playwright install chromium

# For neural embeddings (better cross-domain skill transfer)
npm install @xenova/transformers

Without Playwright, the browser uses Content Intelligence and Lightweight rendering tiers only. Without transformers, it falls back to hash-based embeddings.
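One common way to implement this kind of graceful fallback is to probe for the optional module with a dynamic `import()` and catch the failure. The sketch below assumes that approach; `isInstalled` and `describeCapabilities` are hypothetical helpers, and the tier names are taken from the quick_fetch defaults described later in this README.

```typescript
// Hypothetical helper: true when the module can be loaded, false otherwise.
async function isInstalled(moduleName: string): Promise<boolean> {
  try {
    await import(moduleName);
    return true;
  } catch {
    return false;
  }
}

interface Capabilities {
  tiers: string[];
  embeddings: "neural" | "hash";
}

// The probe is injectable so the logic can be exercised without installing anything.
async function describeCapabilities(
  probe: (moduleName: string) => Promise<boolean> = isInstalled,
): Promise<Capabilities> {
  const [hasPlaywright, hasTransformers] = await Promise.all([
    probe("playwright"),
    probe("@xenova/transformers"),
  ]);
  return {
    tiers: hasPlaywright
      ? ["intelligence", "lightweight", "playwright"]
      : ["intelligence", "lightweight"],
    embeddings: hasTransformers ? "neural" : "hash",
  };
}
```

Calling `describeCapabilities()` with no arguments reports what the current environment would support.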

Quick Setup (Any MCP Client)

The easiest way to get started:

npx unbrowser init

This interactive command will:

Let you choose a profile (intelligence/core/api/full)
Configure your MCP client (Claude Desktop, Cline, Zed, etc.)
Set up lazy loading to save context tokens

Profiles

Unbrowser offers 4 profiles to optimize token usage:

| Profile | Tools | Tokens | Use Case |
|---------|-------|--------|----------|
| intelligence | 2 | ~800 | Work alongside browser automation tools |
| core | 4 | ~2,000 | Essential browsing and content extraction |
| api | 6 | ~3,500 | API discovery and integration |
| full | 11 | ~7,000 | All features including debugging |
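To make the trade-off concrete, here is a small sketch that picks the largest profile fitting a given context budget. The numbers are copied from the table above; the helper `largestProfileWithin` is hypothetical and not something the CLI provides.

```typescript
interface ProfileInfo {
  name: string;
  tools: number;
  approxTokens: number;
}

// Values copied from the profile table, sorted by ascending token cost.
const PROFILES: ProfileInfo[] = [
  { name: "intelligence", tools: 2, approxTokens: 800 },
  { name: "core", tools: 4, approxTokens: 2000 },
  { name: "api", tools: 6, approxTokens: 3500 },
  { name: "full", tools: 11, approxTokens: 7000 },
];

// Hypothetical helper: the most capable profile whose cost fits the budget.
function largestProfileWithin(tokenBudget: number): ProfileInfo | undefined {
  let best: ProfileInfo | undefined;
  for (const profile of PROFILES) {
    // PROFILES is ascending, so the last match is the largest that fits.
    if (profile.approxTokens <= tokenBudget) best = profile;
  }
  return best;
}
```

For example, a budget of 4,000 tokens selects `api`, while a budget below 800 selects nothing.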

View detailed profile information:

npx unbrowser profiles

Manual Configuration

Claude Desktop

Add to ~/Library/Application Support/Claude/claude_desktop_config.json:

{
  "mcpServers": {
    "unbrowser-core": {
      "command": "npx",
      "args": ["unbrowser", "--profile=core"]
    }
  }
}

Cline (VS Code)

Add via VS Code Settings → MCP Servers:

{
  "unbrowser-core": {
    "command": "npx",
    "args": ["unbrowser", "--profile=core"]
  }
}

Zed

Add to ~/.config/zed/settings.json:

{
  "mcpServers": {
    "unbrowser-core": {
      "command": "npx",
      "args": ["unbrowser", "--profile=core"]
    }
  }
}

Other MCP Clients

Use the command directly:

npx unbrowser --profile=core

Replace core with intelligence, api, or full as needed.

Then restart your MCP client. The browser tools will be available.
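If you manage several machines or profiles, the JSON entries above can be generated rather than hand-edited. The sketch below produces the `mcpServers` shape shown for Claude Desktop and Zed. The helper name and the `unbrowser-<profile>` server key for profiles other than core are assumptions extrapolated from the `unbrowser-core` example.

```typescript
type ProfileName = "intelligence" | "core" | "api" | "full";

interface McpServerEntry {
  command: string;
  args: string[];
}

// Hypothetical helper: builds the config fragment for one profile.
function buildMcpServersConfig(profile: ProfileName): {
  mcpServers: Record<string, McpServerEntry>;
} {
  return {
    mcpServers: {
      // Key naming follows the "unbrowser-core" example (assumed for other profiles).
      [`unbrowser-${profile}`]: {
        command: "npx",
        args: ["unbrowser", `--profile=${profile}`],
      },
    },
  };
}

// Print a fragment ready to merge into the client's config file.
console.log(JSON.stringify(buildMcpServersConfig("core"), null, 2));
```

Cline expects the inner object without the `mcpServers` wrapper, so use `buildMcpServersConfig(profile).mcpServers` there.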

Programmatic Usage

import { createLLMBrowser } from 'unbrowser/sdk';

const browser = await createLLMBrowser();

// Browse a page (learns from the interaction)
const result = await browser.browse('https://example.com');
console.log(result.content.markdown);
console.log(result.discoveredApis);

// On subsequent visits, may use learned APIs directly
const result2 = await browser.browse('https://example.com/page2');

await browser.cleanup();
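In the snippet above, `cleanup()` runs only if every `browse` call succeeds. A `try`/`finally` wrapper is one way to guarantee it runs on errors too. The helper below is a generic sketch: `withBrowser` and `Closable` are hypothetical names, and the only thing assumed about the SDK is the `cleanup()` method shown above.

```typescript
// Anything with an async cleanup() method, such as the browser object above.
interface Closable {
  cleanup(): Promise<void>;
}

// Hypothetical helper: create a resource, use it, and always clean it up.
async function withBrowser<B extends Closable, R>(
  create: () => Promise<B>,
  use: (browser: B) => Promise<R>,
): Promise<R> {
  const browser = await create();
  try {
    return await use(browser);
  } finally {
    // Runs whether use() resolved or threw.
    await browser.cleanup();
  }
}

// Intended usage with the SDK (not executed here):
//   const result = await withBrowser(createLLMBrowser, (b) => b.browse('https://example.com'));
```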

Unbrowser Connect (B2B SaaS SDK)

For SaaS applications that need to fetch content through their users' browsers, Unbrowser Connect provides a client-side SDK that bypasses bot detection by executing requests in the user's actual browser session.

npm install @unbrowser/connect

import { createConnect } from '@unbrowser/connect';

const connect = createConnect({
  appId: 'your-app-id',
  apiKey: 'ub_live_xxx'
});

// Background fetch (hidden iframe, public content)
const result = await connect.fetch({
  url: 'https://example.com',
  mode: 'background',
  extract: { markdown: true }
});

// Popup fetch (OAuth-style, auth-required content)
const authResult = await connect.fetch({
  url: 'https://reddit.com/r/topic',
  mode: 'popup',
  requiresAuth: true
});

Key features:

Runs in user's browser (bypasses server-side bot detection)
Patterns sync from Unbrowser cloud for intelligent extraction
Background mode (iframe) for public content
Popup mode for authenticated content
Data goes to your app, intelligence stays in Unbrowser cloud

How It Works

Tiered Rendering

The browser tries the fastest approach first, falling back only when needed:

| Tier | Speed | What It Does | When It's Used |
|------|-------|--------------|----------------|
| Content Intelligence | ~50-200ms | Static HTML + framework extraction | Sites with server-rendered content |
| Lightweight | ~200-500ms | linkedom + Node VM | Sites needing basic JavaScript |
| Playwright | ~2-5s | Full browser | Sites requiring complex JS or interactions |

The system remembers which tier works for each domain and uses it next time.
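The fastest-first fallback with per-domain memory can be sketched as follows. This is a simplified model, not the package's implementation: `TieredRenderer` and the convention that a tier returns `null` when it cannot extract content are assumptions made for the example.

```typescript
type TierName = "intelligence" | "lightweight" | "playwright";
// A tier resolves to content, or to null when it could not extract anything.
type TierFn = (url: string) => Promise<string | null>;

const TIER_ORDER: TierName[] = ["intelligence", "lightweight", "playwright"];

class TieredRenderer {
  // Remembers which tier last worked for each domain.
  private preferred = new Map<string, TierName>();
  private tiers: Record<TierName, TierFn>;

  constructor(tiers: Record<TierName, TierFn>) {
    this.tiers = tiers;
  }

  async render(url: string): Promise<{ tier: TierName; content: string }> {
    const domain = new URL(url).hostname;
    const remembered = this.preferred.get(domain);
    // Try the remembered tier first, then the rest in fastest-first order.
    const order = remembered
      ? [remembered, ...TIER_ORDER.filter((t) => t !== remembered)]
      : TIER_ORDER;
    for (const tier of order) {
      let content: string | null;
      try {
        content = await this.tiers[tier](url);
      } catch {
        content = null; // treat a thrown error like "could not extract"
      }
      if (content !== null) {
        this.preferred.set(domain, tier);
        return { tier, content };
      }
    }
    throw new Error(`No tier could render ${url}`);
  }
}
```

Because the remembered tier is tried first, a domain that needs the Lightweight tier pays for the failed Content Intelligence attempt only once.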

Learning System

Every browse operation teaches the system:

Selector patterns: Which CSS selectors reliably extract content
API endpoints: Discovered APIs that can bypass rendering
Validation rules: What valid content looks like (to detect errors)
Browsing skills: Reusable action sequences (click, fill, extract)
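As an illustration of the first item, selector learning can be modeled as tracking how often each CSS selector succeeds and preferring the most reliable one. `SelectorMemory` is a hypothetical class written for this example. The README does not describe the actual scoring, so plain success rate is an assumption.

```typescript
interface SelectorStats {
  selector: string;
  successes: number;
  attempts: number;
}

// Hypothetical sketch of per-selector reliability tracking.
class SelectorMemory {
  private stats = new Map<string, SelectorStats>();

  // Record one extraction attempt and whether it yielded usable content.
  record(selector: string, succeeded: boolean): void {
    const entry = this.stats.get(selector) ?? { selector, successes: 0, attempts: 0 };
    entry.attempts += 1;
    if (succeeded) entry.successes += 1;
    this.stats.set(selector, entry);
  }

  // Selectors ordered by observed success rate, most reliable first.
  ranked(): string[] {
    return Array.from(this.stats.values())
      .sort((a, b) => b.successes / b.attempts - a.successes / a.attempts)
      .map((entry) => entry.selector);
  }
}
```

An extractor could then try `ranked()` in order, which gives the selector fallback behavior listed in the comparison table.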

Semantic Embeddings

Skills are matched using neural embeddings (when @xenova/transformers is installed) or hash-based embeddings (fallback). This enables:

Skills learned on one site can apply to similar sites
Automatic domain similarity detection
Cross-domain pattern transfer
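To show what a hash-based fallback can look like, here is a minimal feature-hashing embedding compared with cosine similarity. It is a sketch of the general technique only: the function names, the 64-dimension default, and the FNV-1a hash are choices made for this example, and the package's own fallback may work differently.

```typescript
// Feature hashing: each token increments one bucket chosen by a string hash.
function hashEmbedding(text: string, dims: number = 64): number[] {
  const vec: number[] = new Array(dims).fill(0);
  const tokens = text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
  for (const token of tokens) {
    let h = 2166136261; // FNV-1a 32-bit offset basis
    for (let i = 0; i < token.length; i++) {
      h ^= token.charCodeAt(i);
      h = Math.imul(h, 16777619); // FNV prime, 32-bit multiply
    }
    vec[(h >>> 0) % dims] += 1;
  }
  return vec;
}

// Cosine similarity; defined as 0 when either vector is all zeros.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}
```

Hashing captures only token overlap, so two descriptions match when they share words. Neural embeddings can also match paraphrases, which is why they transfer skills across domains better.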

New in v0.6: Enhanced Features

Playwright Debug Mode - Visual debugging for teaching and troubleshooting:

const result = await browser.browse('https://example.com', {
  debug: {
    visible: true,         // Show browser window
    slowMotion: 150,       // 150ms delay between actions
    screenshots: true,     // Capture screenshots
    consoleLogs: true,     // Collect console output
  }
});
// Access debug data: result.debug.screenshots, result.debug.consoleLogs

API Fuzzing Discovery - Proactively discover API endpoints:

import { ApiDiscoveryOrchestrator } from 'unbrowser/discovery';

const orchestrator = new ApiDiscoveryOrchestrator(learningEngine);
const result = await orchestrator.discoverViaFuzzing('https://api.example.com', {
  methods: ['GET', 'POST'],
  learnPatterns: true,  // Cache discovered patterns
});

// Future browse() calls use discovered APIs directly (~10x faster)

Example Workflows - See examples/ directory for complete demonstrations:

Article extraction with metadata
GitHub repository intelligence
E-commerce product monitoring
Multi-page company research
Visual debugging with Playwright
API discovery strategies

MCP Tools

Unbrowser exposes 7 core tools by default, designed to minimize cognitive load:

Core Tools

| Tool | Purpose |
|------|---------|
| smart_browse | Intelligent browsing with automatic learning and optimization |
| quick_fetch | Simplified browsing with minimal parameters (just URL required) |
| research | Search-first research mode - discovers sources via web search |
| batch_browse | Browse multiple URLs in a single call with controlled concurrency |
| execute_api_call | Direct API calls using discovered patterns (bypasses browser) |
| session_management | Manage sessions for authenticated access (save, list, health) |
| api_auth | Configure API authentication (API keys, OAuth, bearer tokens, etc.) |

smart_browse (Primary Tool)

The main tool that automatically applies all learned intelligence.

Parameters:
- url (required): URL to browse
- contentType: Hint for extraction ('main_content', 'table', 'form', etc.)
- followPagination: Follow detected pagination
- waitForSelector: CSS selector to wait for (SPAs)
- scrollToLoad: Scroll to trigger lazy content
- sessionProfile: Use saved authentication session
- maxChars: Truncate content to this length (for large pages)
- includeInsights: Include domain knowledge summary (default: true)
- checkForChanges: Check if content changed since last visit
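MCP clients invoke tools with a JSON-RPC `tools/call` request, so a smart_browse call on the wire might look like the object built below. The parameter names come from the list above, but their TypeScript types are inferred from the descriptions and should be treated as assumptions. `smartBrowseRequest` is a hypothetical helper.

```typescript
// Types inferred from the parameter descriptions (assumed, not from the package).
interface SmartBrowseArgs {
  url: string;
  contentType?: string;
  followPagination?: boolean;
  waitForSelector?: string;
  scrollToLoad?: boolean;
  sessionProfile?: string;
  maxChars?: number;
  includeInsights?: boolean;
  checkForChanges?: boolean;
}

// Hypothetical helper: wraps the arguments in an MCP tools/call request.
function smartBrowseRequest(id: number, args: SmartBrowseArgs) {
  return {
    jsonrpc: "2.0" as const,
    id,
    method: "tools/call" as const,
    params: { name: "smart_browse", arguments: args },
  };
}

const request = smartBrowseRequest(1, {
  url: "https://example.com/pricing",
  contentType: "table",
  maxChars: 4000,
  includeInsights: false,
});
console.log(JSON.stringify(request, null, 2));
```

In practice the MCP client builds this request itself. The sketch is mainly useful for seeing which arguments an LLM would supply.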

batch_browse

Browse multiple URLs efficiently with controlled concurrency.

Parameters:
- urls (required): Array of URLs to browse
- concurrency: Max parallel requests (default: 3)
- stopOnError: Stop on first error (default: false)
- All smart_browse options apply to each URL
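The `concurrency` and `stopOnError` semantics can be illustrated with a small worker-pool sketch. This shows one plausible reading of those options, with hypothetical names (`batchBrowse`, `BatchResult`). The tool's real result shape is not documented here.

```typescript
type BatchResult<T> =
  | { url: string; ok: true; value: T }
  | { url: string; ok: false; error: string };

// Hypothetical sketch: run browseOne over urls with at most `concurrency` in flight.
async function batchBrowse<T>(
  urls: string[],
  browseOne: (url: string) => Promise<T>,
  options: { concurrency?: number; stopOnError?: boolean } = {},
): Promise<BatchResult<T>[]> {
  const { concurrency = 3, stopOnError = false } = options;
  const results: BatchResult<T>[] = [];
  let next = 0;
  let stopped = false;

  // Each worker pulls the next unclaimed URL until none remain or a stop is requested.
  async function worker(): Promise<void> {
    while (!stopped && next < urls.length) {
      const index = next++;
      const url = urls[index];
      try {
        results[index] = { url, ok: true, value: await browseOne(url) };
      } catch (err) {
        const error = err instanceof Error ? err.message : String(err);
        results[index] = { url, ok: false, error };
        if (stopOnError) stopped = true;
      }
    }
  }

  const workerCount = Math.max(1, Math.min(concurrency, urls.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  // Drop slots that were never started because of an early stop.
  return results.filter((r) => r !== undefined);
}
```

Results keep the input order, and with `stopOnError` requests already in flight are allowed to finish while no new ones start.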

quick_fetch

Simplified browsing with sensible defaults - minimal cognitive overhead.

Parameters:
- url (required): The URL to fetch
- maxChars: Maximum characters for markdown content (optional)

Defaults:
- Auto-detects content type
- Uses all tiers (intelligence → lightweight → playwright)
- Enables learning and caching
- Returns clean markdown with tables
- No insights section (cleaner output)

Use when: You just need "get this page" behavior without configuration

research

Search-first research mode that discovers sources via web search.

Parameters:
- scope (required): Natural language research scope (e.g., "Portugal D7 visa requirements")
- strategy: 'authoritative', 'comprehensive', or 'quick' (default: authoritative)
- maxSources: Maximum sources to consult (default: 5 for authoritative)
- language: Language hint (e.g., "es", "en")
- preferredDomains: Array of preferred domains (e.g., ["gov.pt", "irs.gov"])
- extractSnippets: Return snippets instead of full content (default: true)
- snippetLength: Maximum snippet length in chars (default: 500)

Returns:
- Citation-ready results with source URLs and titles
- Confidence scoring with human-readable explanation
- Source quality classification (government, official, reference, news)
- Authority scores per source

Requirements:
- Set BRAVE_SEARCH_API_KEY environment variable

Use when: You need a "search → scan results → extract facts" workflow
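As a rough illustration of how `preferredDomains` and `maxSources` might interact, the sketch below moves URLs on preferred domains to the front and then caps the list. The matching rule (exact host or subdomain) and both function names are assumptions. The README does not specify how the tool ranks sources.

```typescript
// Hypothetical helper: true when the URL's host is the domain or one of its subdomains.
function matchesDomain(url: string, domain: string): boolean {
  const host = new URL(url).hostname.toLowerCase();
  const target = domain.toLowerCase();
  return host === target || host.endsWith("." + target);
}

// Hypothetical helper: preferred sources first, original order otherwise, capped at maxSources.
function prioritizeSources(
  urls: string[],
  preferredDomains: string[],
  maxSources: number = 5,
): string[] {
  const isPreferred = (url: string) =>
    preferredDomains.some((domain) => matchesDomain(url, domain));
  const preferred = urls.filter(isPreferred);
  const others = urls.filter((url) => !isPreferred(url));
  return [...preferred, ...others].slice(0, maxSources);
}
```

Matching on a leading dot keeps a host like `notirs.gov.evil.com` from counting as `irs.gov`.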

Advanced Tools (Hidden by Default)

Additional tools are available for debugging and administration:

Debug tools (set UNBROWSER_DEBUG_MODE=1):
- capture_screenshot - Visual debugging
- export_har - Network traffic analysis
- debug_traces - Failure analysis and replay

Admin tools (set UNBROWSER_ADMIN_MODE=1):
- Performance metrics, usage analytics, tier management
- Deprecated tools for backward compatibility

Configuration

Environment variables:

LOG_LEVEL=info          # debug, info, warn, error, silent
LOG_PRETTY=true         # Pretty print logs (dev mode)

# Research mode (optional)
BRAVE_SEARCH_API_KEY=   # Required for 'research' tool - web search integration
                        # Get your API key at https://brave.com/search/api/
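A wrapper script that launches the server may want to validate these variables up front. The sketch below parses them from an environment-like object. `readConfig` is a hypothetical helper, and falling back to `info` for a missing or unrecognized LOG_LEVEL is an assumption, not documented behavior.

```typescript
const LOG_LEVELS = ["debug", "info", "warn", "error", "silent"] as const;
type LogLevel = (typeof LOG_LEVELS)[number];

interface RuntimeConfig {
  logLevel: LogLevel;
  logPretty: boolean;
  researchEnabled: boolean; // true only when a Brave Search key is present
}

// Hypothetical helper: takes the environment as a parameter so it is easy to test.
function readConfig(env: Record<string, string | undefined>): RuntimeConfig {
  const raw = (env.LOG_LEVEL ?? "info").toLowerCase();
  // Assumed fallback: unknown levels are treated as "info".
  const logLevel = (LOG_LEVELS as readonly string[]).includes(raw)
    ? (raw as LogLevel)
    : "info";
  const braveKey = (env.BRAVE_SEARCH_API_KEY ?? "").trim();
  return {
    logLevel,
    logPretty: env.LOG_PRETTY === "true",
    researchEnabled: braveKey.length > 0,
  };
}
```

In a Node script this would be called as `readConfig(process.env)`. A `researchEnabled` value of false is a useful early warning that the research tool will not work.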

Storage

The browser stores learned patterns in the current directory:

./sessions/ - Saved authentication sessions
./enhanced-knowledge-base.json - Learned patterns and validators
./procedural-memory.json - Browsing skills and workflows
./embedding-cache.json - Cached embeddings (when using transformers)

Comparison with Alternatives

| Feature | Jina/Firecrawl | Puppeteer | unbrowser |
|---------|---------------|-----------|-------------|
| Clean content extraction | Yes | No | Yes |
| API discovery | No | No | Yes |
| Learning over time | No | No | Yes |
| Selector fallbacks | No | No | Yes |
| MCP integration | No | No | Yes |
| Works without browser | No | No | Yes (partial) |
| Progressive optimization | No | No | Yes |

Limitations

An honest account of what this can and can't do:

Works well for:

E2E API testing - Auto-discovers APIs, validates responses, detects changes
Content validation - Built-in verification engine with assertions and confidence scores
QA automation - Record workflows, replay with validation, regression testing
Government websites, documentation sites
E-commerce product listings
News and content sites
Sites with discoverable APIs

Testing & QA: See docs/QA_TESTING_GUIDE.md for using Unbrowser as a testing tool.

May struggle with:

Heavy SPAs that require complex interaction flows
Sites with aggressive bot detection (Cloudflare challenges)
Sites requiring visual verification (CAPTCHAs)
Real-time applications (chat, streaming)

Development

git clone https://github.com/rabbit-found/unbrowser
cd unbrowser
npm install
npm run build
npm test

License

MIT