@tangle-network/browser-agent-driver

v0.32.0

LLM-driven browser agent for UI automation, testing, and evaluation

General-purpose agentic browser automation. Completes real user outcomes on any website — search, extract, fill forms, compare prices, navigate complex UIs.

91.3% on WebVoyager (590 tasks, 15 sites) at $0.09/task. 100% on held-out competitive bench. 95.7% on WebbBench-50 (excl. DataDome sites). Default model: gpt-5.4.

Install

CLI (recommended)

curl -fsSL https://raw.githubusercontent.com/tangle-network/browser-agent-driver/main/scripts/install.sh | sh

Installs the bad command, downloads Chromium, and updates your PATH. Requires Node.js 20+.

Or via npm:

npm i -g @tangle-network/browser-agent-driver
npx playwright install chromium

As a library

pnpm add @tangle-network/browser-agent-driver
pnpm add -D playwright

Quick Start

CLI

# Run a task
bad run --goal "Find the cheapest flight from NYC to London on Jan 15" \
  --url https://www.google.com/travel/flights

# With vision (screenshot-based decisions)
bad run --goal "Compare MacBook Air prices" --url https://apple.com \
  --observation-mode hybrid

# Test suite from case file
bad run --cases ./my-tests.json --concurrency 4

# Authenticated session
bad run --goal "Check account settings" --url https://app.example.com \
  --storage-state .auth/session.json

# With proxy (residential or SOCKS5)
bad run --goal "Search hotels in Tokyo" --url https://booking.com \
  --proxy http://user:pass@proxy:port

SDK

import { chromium } from 'playwright'
import { PlaywrightDriver, BrowserAgent } from '@tangle-network/browser-agent-driver'

const browser = await chromium.launch({ channel: 'chrome' })
const page = await browser.newPage()
const driver = new PlaywrightDriver(page)

const agent = new BrowserAgent({
  driver,
  config: {
    model: 'gpt-5.4',
    observationMode: 'hybrid',    // vision + DOM
    plannerEnabled: true,          // plan-then-execute
  },
})

const result = await agent.run({
  goal: 'Find the top 3 trending repositories on GitHub',
  startUrl: 'https://github.com/trending',
})

console.log(result.success, result.reason)
await browser.close()

Config File

Create bad.config.ts (or .js, .mjs) in your project root:

import { defineConfig } from '@tangle-network/browser-agent-driver'

export default defineConfig({
  provider: 'openai',
  model: 'gpt-5.4',
  observationMode: 'hybrid',
  plannerEnabled: true,
  headless: true,
  concurrency: 4,
  maxTurns: 30,
  outputDir: './test-results',

  // Per-role model routing (Gen 28)
  models: {
    planner:    { model: 'claude-opus-4-6', provider: 'anthropic' },
    executor:   { model: 'gpt-4.1-mini' },
    verifier:   { model: 'gpt-4.1-mini' },
    supervisor: { model: 'gpt-5.4' },
  },

  // Parallel tabs for compound goals (Gen 21)
  parallelTabs: { enabled: true, maxTabs: 3 },

  // Proxy for anti-bot bypass
  proxy: 'http://user:pass@proxy:port',  // or set BAD_PROXY_URL env

  // Supervisor for stuck detection
  supervisor: { enabled: true, useVision: true },
})

Auto-detected by CLI and SDK. CLI flags override config values.
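
The precedence described above (defaults, then the config file, then CLI flags) can be pictured as a layered merge. This is a minimal sketch and not the package's actual resolution code: RunOptions, DEFAULTS, defined, and resolveOptions are hypothetical names, and the default values are copied from the Configuration table in this README.

```typescript
// Hypothetical sketch of "CLI flags override config values".
interface RunOptions {
  provider: string
  model: string
  observationMode: 'dom' | 'vision' | 'hybrid'
  headless: boolean
  maxTurns: number
  concurrency: number
}

// Defaults copied from the Configuration table in this README.
const DEFAULTS: RunOptions = {
  provider: 'openai',
  model: 'gpt-5.4',
  observationMode: 'dom',
  headless: true,
  maxTurns: 30,
  concurrency: 1,
}

// Drop keys whose value is undefined so an unset flag never masks a configured value.
function defined<T extends object>(obj: T): Partial<T> {
  return Object.fromEntries(
    Object.entries(obj).filter(([, value]) => value !== undefined),
  ) as Partial<T>
}

// Later spreads win: defaults < config file < CLI flags.
function resolveOptions(
  fileConfig: Partial<RunOptions>,
  cliFlags: Partial<RunOptions>,
): RunOptions {
  return { ...DEFAULTS, ...defined(fileConfig), ...defined(cliFlags) }
}

const resolved = resolveOptions(
  { observationMode: 'hybrid', concurrency: 4 }, // e.g. from bad.config.ts
  { concurrency: 2, model: undefined },          // e.g. from --concurrency 2
)
console.log(resolved)
```

Here concurrency comes from the flag, observationMode from the config file, and everything else from the defaults.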

How It Works

Goal → DOM Planner (1 LLM call with screenshot)
  → N deterministic steps (click, type, fill, navigate)
  → If deviation: vision per-action loop
    → Observe (a11y tree + screenshot + SoM labels)
    → LLM decides action (ref-based, coordinate, or label)
    → Execute + verify expected effect
    → Recovery on failure (strategy shift, form reset retry, search fallback)
  → Goal verification (LLM or fast-path with script evidence)
  → Complete with structured result

Recovery is automatic: cookie consent, modal blockers, A-B-A-B oscillation loops, form field resets, date picker stalls, and CAPTCHA challenges are handled before the agent continues.
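
To make the A-B-A-B oscillation case concrete, here is a minimal sketch of how such a loop could be detected from the action history. It is an illustration under assumptions, not the agent's real detector: isOscillating and the string action signatures are hypothetical.

```typescript
// Hypothetical sketch: each action is reduced to a string signature,
// e.g. "click:@ref1". The real agent may track richer state.

// Returns true when the last four signatures form an A-B-A-B pattern.
function isOscillating(history: string[]): boolean {
  if (history.length < 4) return false
  const [a1, b1, a2, b2] = history.slice(-4)
  // A and B must differ; A-A-A-A is a plain repeat, not an oscillation.
  return a1 === a2 && b1 === b2 && a1 !== b1
}

const history = ['navigate:/search', 'click:@ref1', 'click:@ref2', 'click:@ref1', 'click:@ref2']
console.log(isOscillating(history))
```

A detector like this would typically trigger a strategy shift instead of letting the agent repeat the same two actions.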

Features

Vision + DOM Hybrid

The agent sees BOTH a screenshot and the accessibility tree. Screenshots show visual layout, SoM labels mark interactive elements with numbered badges, and the a11y tree provides precise @ref IDs for targeting.

Three observation modes:

  • dom — a11y tree only (fastest, cheapest)
  • vision — screenshot + coordinates only
  • hybrid — both (default for benchmarks, most reliable)

Stealth & Anti-Bot

Built-in evasion for Cloudflare, Akamai, and most WAF systems:

  • System Chrome (channel: 'chrome') — real TLS/JA3/HTTP2 fingerprint, not bundled Chromium
  • Patchright — Playwright fork that patches CDP protocol leaks
  • Mouse humanization — Bezier curve movement (8-15 points), gaussian click offset
  • Browser fingerprint — navigator.webdriver, plugins, languages, WebGL, canvas noise, screenX/Y fix
  • GPU rendering (--use-gl=desktop) for real WebGL fingerprint
  • Resource blocking — 99+ analytics/tracking domains blocked

Unblocks 9/13 previously-blocked sites on WebbBench-50 with zero configuration.
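
The mouse humanization item above can be illustrated with a short sketch: sample 8 to 15 points along a cubic Bezier curve, then offset the final click with gaussian noise. This is not the package's implementation. All names (cubicBezier, gaussian, mousePath, clickPoint) and the jitter and sigma values are assumptions chosen for illustration.

```typescript
interface Point { x: number; y: number }

// Evaluate a cubic Bezier curve at parameter t in [0, 1].
function cubicBezier(p0: Point, p1: Point, p2: Point, p3: Point, t: number): Point {
  const u = 1 - t
  const w0 = u * u * u
  const w1 = 3 * u * u * t
  const w2 = 3 * u * t * t
  const w3 = t * t * t
  return {
    x: w0 * p0.x + w1 * p1.x + w2 * p2.x + w3 * p3.x,
    y: w0 * p0.y + w1 * p1.y + w2 * p2.y + w3 * p3.y,
  }
}

// Box-Muller transform: turns two uniform samples into one standard-normal sample.
function gaussian(rng: () => number): number {
  const u1 = 1 - rng() // shift [0, 1) to (0, 1] so Math.log never sees 0
  const u2 = rng()
  return Math.sqrt(-2 * Math.log(u1)) * Math.cos(2 * Math.PI * u2)
}

// Sample 8 to 15 points along a randomly bowed curve from `from` to `to`.
function mousePath(from: Point, to: Point, rng: () => number = Math.random): Point[] {
  const count = 8 + Math.floor(rng() * 8)
  const bow = () => (rng() - 0.5) * 80 // hypothetical control-point jitter, in pixels
  const c1 = { x: from.x + (to.x - from.x) / 3 + bow(), y: from.y + (to.y - from.y) / 3 + bow() }
  const c2 = { x: from.x + ((to.x - from.x) * 2) / 3 + bow(), y: from.y + ((to.y - from.y) * 2) / 3 + bow() }
  return Array.from({ length: count }, (_, i) => cubicBezier(from, c1, c2, to, i / (count - 1)))
}

// Offset the click from the element center by a gaussian amount (sigma in pixels).
function clickPoint(center: Point, sigma: number, rng: () => number = Math.random): Point {
  return { x: center.x + gaussian(rng) * sigma, y: center.y + gaussian(rng) * sigma }
}

const path = mousePath({ x: 0, y: 0 }, { x: 400, y: 250 })
console.log(path.length, clickPoint({ x: 400, y: 250 }, 3))
```

The path always starts at the origin point and ends at the target; only the intermediate points and the final click offset vary between runs.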

CAPTCHA Solving

Automatic CAPTCHA detection and solving during runs:

  • reCAPTCHA v2 — checkbox click + image grid solver (LLM vision-based)
  • Cloudflare Turnstile — checkbox behavioral click
  • Google "unusual traffic" — detected and solver attempted

// Enabled by default. To configure:
{ captcha: { enabled: true, maxAttempts: 5 } }

Parallel Tab Execution

For compound goals ("compare X vs Y", "find 5 items matching criteria"), the agent decomposes the goal and runs sub-tasks in parallel browser tabs.

{
  parallelTabs: { enabled: true, maxTabs: 3 },
}

The GoalDecomposer (1 cheap LLM call) classifies goals as simple or compound. Simple goals run as before. Compound goals get split into sub-goals, each running in its own tab via Promise.all, with results merged by the EvidenceMerger.
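
The fan-out and merge step can be sketched as follows. This is a simplified illustration, not the GoalDecomposer or EvidenceMerger themselves: SubResult, RunSubGoal, and runCompound are hypothetical, and batching by maxTabs is an assumption about how the tab limit might be enforced.

```typescript
// Hypothetical result shape for one sub-goal run in its own tab.
interface SubResult { goal: string; success: boolean; evidence: string }

// Stand-in for "run this sub-goal in a fresh tab".
type RunSubGoal = (goal: string) => Promise<SubResult>

// Run sub-goals in batches of at most maxTabs, each batch via Promise.all,
// then merge the per-tab results into one summary.
async function runCompound(
  subGoals: string[],
  maxTabs: number,
  runInTab: RunSubGoal,
): Promise<{ success: boolean; evidence: string[] }> {
  const results: SubResult[] = []
  for (let i = 0; i < subGoals.length; i += maxTabs) {
    const batch = subGoals.slice(i, i + maxTabs)
    results.push(...(await Promise.all(batch.map((goal) => runInTab(goal)))))
  }
  return {
    success: results.every((r) => r.success),
    evidence: results.map((r) => `${r.goal}: ${r.evidence}`),
  }
}

// Usage with a stub runner in place of a real browser tab.
const stubRunner: RunSubGoal = async (goal) => ({ goal, success: true, evidence: 'ok' })
runCompound(['price of X', 'price of Y'], 3, stubRunner).then((merged) => console.log(merged))
```

Promise.all preserves input order, so merged evidence lines up with the order of the sub-goals.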

Multi-Model Orchestration

Different agent roles can use different models for optimal cost/quality:

{
  models: {
    planner:    { model: 'claude-opus-4-6', provider: 'anthropic' },  // best reasoning
    executor:   { model: 'gpt-4.1-mini' },                            // cheap, follows plans
    verifier:   { model: 'gpt-4.1-mini' },                            // structured yes/no
    supervisor: { model: 'gpt-5.4' },                                 // strategic recovery
  },
}

Each role falls back to the main model when not configured. The planner needs top-tier reasoning; the executor just follows instructions.
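
The fallback rule can be sketched as a small lookup. This is an illustration only: Role, ModelRef, RoutingConfig, and modelFor are hypothetical names, and inheriting the top-level provider when a role override omits one is an assumption, not documented behavior.

```typescript
type Role = 'planner' | 'executor' | 'verifier' | 'supervisor'

interface ModelRef { model: string; provider?: string }

interface RoutingConfig {
  provider: string
  model: string
  models?: Partial<Record<Role, ModelRef>>
}

// Resolve the model for a role, falling back to the main model when the
// role has no override. Assumption: a missing provider on the override
// also falls back to the top-level provider.
function modelFor(role: Role, config: RoutingConfig): Required<ModelRef> {
  const override = config.models?.[role]
  return {
    model: override?.model ?? config.model,
    provider: override?.provider ?? config.provider,
  }
}

const routing: RoutingConfig = {
  provider: 'openai',
  model: 'gpt-5.4',
  models: {
    planner: { model: 'claude-opus-4-6', provider: 'anthropic' },
    executor: { model: 'gpt-4.1-mini' },
  },
}

console.log(modelFor('planner', routing), modelFor('verifier', routing))
```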

Design Audit

Vision-powered design quality analysis with closed-loop improvement:

# Audit any URL
bad design-audit --url https://your-app.com

# Multi-page crawl with cross-page systemic detection
bad design-audit --url https://your-app.com --pages 10

# Auto-fix: dispatch findings to a coding agent
bad design-audit --url http://localhost:3000 --evolve claude-code --project-dir ~/my-app

Reports rank Top Fixes by ROI — the highest-leverage changes scored by (impact * blast / effort). Multi-page systemic findings collapse automatically.
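
The ROI formula above can be worked through in a few lines. This is a sketch with made-up findings: the Finding shape, the field scales, and topFixes are assumptions, and it assumes effort is always greater than zero.

```typescript
// Hypothetical finding shape; the real report fields may differ.
interface Finding { title: string; impact: number; blast: number; effort: number }

// ROI as stated in this README: impact * blast / effort.
function roi(f: Finding): number {
  return (f.impact * f.blast) / f.effort
}

// Sort a copy by descending ROI and keep the first `limit` entries.
function topFixes(findings: Finding[], limit = 3): Finding[] {
  return [...findings].sort((a, b) => roi(b) - roi(a)).slice(0, limit)
}

const findings: Finding[] = [
  { title: 'Low-contrast body text', impact: 8, blast: 5, effort: 2 },  // ROI 20
  { title: 'Misaligned hero image', impact: 9, blast: 1, effort: 3 },   // ROI 3
  { title: 'Inconsistent button radius', impact: 4, blast: 10, effort: 1 }, // ROI 40
]

console.log(topFixes(findings).map((f) => f.title))
```

A cheap fix that touches many pages (high blast, low effort) outranks a high-impact fix confined to one page.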

Wallet & DeFi Testing

Built-in MetaMask integration for DeFi app testing:

pnpm wallet:setup      # download MetaMask
pnpm wallet:onboard    # automate first-run wizard
pnpm wallet:anvil      # start Anvil mainnet fork (100 ETH + 10 WETH + 10k USDC)
pnpm wallet:validate   # run wallet test suite

Supports connect, swap, and supply workflows across Uniswap, Aave, SushiSwap, and 1inch.

Configuration

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| provider | string | 'openai' | LLM provider |
| model | string | 'gpt-5.4' | LLM model |
| observationMode | string | 'dom' | 'dom', 'vision', or 'hybrid' |
| plannerEnabled | boolean | false | Plan-then-execute mode |
| headless | boolean | true | Run browser headless |
| maxTurns | number | 30 | Max turns per task |
| proxy | string | — | Proxy URL (also BAD_PROXY_URL env) |
| models | object | — | Per-role model overrides |
| parallelTabs | object | — | { enabled, maxTabs } |
| supervisor | object | — | { enabled, useVision, model } |
| captcha | object | — | { enabled, maxAttempts } |
| goalVerification | boolean | true | Verify goal before accepting |
| vision | boolean | true | Send screenshots to LLM |
| concurrency | number | 1 | Parallel test cases |

See Configuration Reference for all options.

CLI Reference

bad run [options]              # Run tasks
bad snapshot --url <url>       # Headless accessibility-tree dump (no LLM)
bad design-audit [options]     # Design quality analysis
bad view <run-dir>             # Open session viewer
bad competitive [options]      # Head-to-head framework comparison

bad snapshot

Deterministic, no-LLM DOM dump. Loads a URL, dismisses consent dialogs, waits for the chosen network state, emits the accessibility-tree snapshot. Intended for CI and downstream quality pipelines where the agentic loop is overkill.

bad snapshot --url https://example.com --json --out snap.json
bad snapshot --url http://localhost:3000 --wait domcontentloaded

| Flag | Description |
|------|-------------|
| --url URL | URL to snapshot (required) |
| --json | Emit structured JSON instead of human-readable text |
| --out FILE | Write to file instead of stdout |
| --wait load\|domcontentloaded\|networkidle\|commit | Playwright waitUntil. Default networkidle |
| --timeout MS | Per-action timeout. Default 15000 |
| --no-dismiss-modals | Skip consent/cookie dialog dismissal |
| --headed | Show the browser window (default headless) |

JSON shape: { schemaVersion, url, finalUrl, title, snapshot, timing, dismissed, errors }. Exits non-zero on navigation failure or aria-snapshot error.

Report schema

<sink>/report.json carries a top-level schemaVersion field. The current version is "1". Consumers should verify this before destructuring. Adding an optional field is non-breaking; the version bumps only on removed fields, renamed fields, or changed value semantics.
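
A consumer-side version check might look like the sketch below. It is a minimal example, not shipped code: parseReport and SUPPORTED_SCHEMA_VERSION are hypothetical names, and the only fact taken from this README is that schemaVersion is a top-level field whose current value is "1".

```typescript
// The version this consumer was written against (current value per this README).
const SUPPORTED_SCHEMA_VERSION = '1'

// Parse report.json text and refuse to continue on an unknown schema version.
function parseReport(json: string): Record<string, unknown> {
  const report: unknown = JSON.parse(json)
  if (typeof report !== 'object' || report === null || Array.isArray(report)) {
    throw new Error('report.json must contain a JSON object')
  }
  const fields = report as Record<string, unknown>
  if (fields.schemaVersion !== SUPPORTED_SCHEMA_VERSION) {
    throw new Error(`Unsupported report schemaVersion: ${String(fields.schemaVersion)}`)
  }
  // Unknown optional fields pass through untouched, since additions are non-breaking.
  return fields
}

const report = parseReport('{"schemaVersion":"1","someNewOptionalField":true}')
console.log(report.schemaVersion)
```

In practice the JSON text would come from reading <sink>/report.json; a string literal is used here to keep the sketch self-contained.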

Run options

| Flag | Description |
|------|-------------|
| --goal "..." | Natural language goal |
| --url URL | Starting URL |
| --cases file.json | Test case file |
| --mode fast-explore\|full-evidence | Run mode |
| --model MODEL | LLM model |
| --provider openai\|anthropic\|google | LLM provider |
| --observation-mode dom\|vision\|hybrid | Observation mode |
| --headless | Run headless (default) |
| --proxy URL | Residential/SOCKS5/HTTP proxy |
| --profile default\|stealth\|benchmark-webvoyager | Launch profile |
| --concurrency N | Parallel cases |
| --show-cursor | Animated cursor overlay in screenshots |
| --live | Real-time SSE viewer |

SDK / Library Usage

import {
  // Core
  BrowserAgent,
  PlaywrightDriver,
  SteelDriver,
  TestRunner,

  // Config
  defineConfig,

  // Types
  type AgentConfig,
  type Scenario,
  type AgentResult,
  type Turn,

  // Multi-actor (parallel users)
  MultiActorSession,

  // Design audit
  runDesignAudit,

  // CAPTCHA
  detectCaptcha,
  solveCaptcha,
} from '@tangle-network/browser-agent-driver'

Multi-Actor Sessions

import { MultiActorSession } from '@tangle-network/browser-agent-driver'

const session = await MultiActorSession.create(browser, {
  actors: {
    admin:   { storageState: '.auth/admin.json' },
    user:    {},
  },
  agentConfig: { model: 'gpt-5.4', observationMode: 'hybrid' },
})

// Sequential
await session.actor('admin').run({ goal: 'Create project', startUrl: '/admin' })

// Parallel
await session.parallel(
  ['admin', { goal: 'Monitor dashboard', startUrl: '/admin' }],
  ['user',  { goal: 'Submit form', startUrl: '/app' }],
)

await session.close()

Test Suites

import { TestRunner } from '@tangle-network/browser-agent-driver'

const runner = new TestRunner({
  driver,
  config: { model: 'gpt-5.4', observationMode: 'hybrid' },
  concurrency: 4,
})

const results = await runner.runSuite([
  {
    id: 'login',
    goal: 'Log in with [email protected] / password123',
    startUrl: 'https://app.example.com/login',
  },
  {
    id: 'create-project',
    goal: 'Create a new project called "Test Project"',
    startUrl: 'https://app.example.com/projects',
  },
])

Drivers

The agent loop is decoupled from the browser via the Driver interface:

// Local Playwright (default)
const localDriver = new PlaywrightDriver(page)

// Steel cloud browser (anti-bot, residential proxies, CAPTCHA)
const steelDriver = await SteelDriver.create({
  apiKey: process.env.STEEL_API_KEY,
  sessionOptions: { useProxy: true, solveCaptcha: true },
})

// Any Driver implementation works; pass whichever one you created
const agent = new BrowserAgent({ driver: localDriver, config })

Benchmarks

| Benchmark | Score | Cost | Notes |
|-----------|-------|------|-------|
| WebVoyager (590 tasks, 15 sites) | 91.3% | $0.09/task | Full run, Gen 25 |
| Competitive (10 real-web sites) | 100% | $0.03/task | Held-out, never optimized |
| WebbBench-50 (50 diverse sites) | 88% raw, 95.7% excl. DataDome | — | Held-out generalization |

Competitive position

| Agent | WebVoyager | Cost/task |
|-------|-----------|-----------|
| Surfer-2 | 97.1% | $1-5 |
| Magnitude | 93.9% | ~$0.10 |
| bad | 91.3% | $0.09 |
| OpenAI Operator | 87% | ChatGPT Pro |

Running benchmarks

# WebVoyager full 590
node scripts/run-scenario-track.mjs \
  --cases bench/external/webvoyager/cases.json \
  --config bench/scenarios/configs/vision-hybrid.mjs \
  --model gpt-5.4 --modes fast-explore --concurrency 5 \
  --out agent-results/webvoyager-full

# Multi-rep validation (≥3 reps required for any claim)
node scripts/run-multi-rep.mjs \
  --cases bench/scenarios/cases/my-cases.json \
  --config bench/scenarios/configs/vision-hybrid.mjs \
  --reps 3 --out agent-results/my-validation

GitHub Action

- uses: tangle-network/browser-agent-driver/.github/actions/design-audit@main
  with:
    url: ${{ steps.deploy.outputs.preview_url }}
    pages: 5
    fail-on-score-below: '6.5'
    evolve: claude-code
    openai-api-key: ${{ secrets.OPENAI_API_KEY }}

Posts Top Fixes as a PR comment, uploads full report as artifact, optionally fails on score regressions.

Session Viewer

bad view audit-results/stripe.com-1775502457141

Web UI with per-turn screenshots, action JSON, reasoning, element highlights. Pair with --show-cursor for animated cursor recordings.

Development

pnpm build              # TypeScript → dist/
pnpm test               # 993 tests
pnpm lint               # type-check
pnpm check:boundaries   # architecture boundaries

Publishing

Automated via Changesets + OIDC trusted publishing. Add a changeset with pnpm changeset, merge the auto-generated release PR, and npm publish fires automatically with provenance attestation.

License

Dual-licensed under MIT and Apache 2.0. See LICENSE.