
@mylesiyabor/betterbrowse v0.7.0

betterbrowse

Zero-dependency browser automation via Chrome DevTools Protocol with ARIA accessibility snapshots — 10-100x cheaper than vision-based approaches.

Why?

Most browser automation agents use screenshots + vision models. That's expensive and slow. betterbrowse uses ARIA accessibility snapshots instead — a text representation of the page that any LLM can understand. This means:

  • 10-100x cheaper — text tokens vs image tokens
  • Works with any text model — no vision model required
  • Faster — no image encoding/decoding overhead
  • More reliable — structured data vs pixel interpretation
  • Plus video recording — record browser sessions as MP4 (via CDP screencast + ffmpeg)

Install

Project (library):

npm install @mylesiyabor/betterbrowse

Global (CLI — easy for agents):

npm install -g @mylesiyabor/betterbrowse

Then use the CLI from any terminal or agent (see below).

Requires Node.js >= 20.10.0 and Chrome/Chromium installed locally.

CLI (easy for agents)

The simplest way for agents to use betterbrowse is the CLI. Install globally, then:

Snapshot only (no API key) — get the ARIA snapshot of a page on stdout:

betterbrowse https://example.com

Agent mode (uses OpenAI; set OPENAI_API_KEY) — complete a task and print the result to stdout:

betterbrowse https://news.ycombinator.com "What is the top story title?"
betterbrowse https://example.com "Click the first link" --no-headless

| Option | Description |
|--------|-------------|
| betterbrowse <url> | Print ARIA snapshot of the page |
| betterbrowse <url> "<task>" | Run browser agent; result to stdout |
| betterbrowse search "<query>" | Search the web (multi-provider, free) |
| --model <name> | OpenAI model (default: gpt-4o-mini) |
| --no-headless | Show browser window |
| --record | Record the session as video (MP4 if ffmpeg installed) |
| --record-dir <dir> | Directory for recording output (default: cwd or temp) |
| --json | Output search results as JSON |
| --deep | Visit top results and extract page content |
| --max <n> | Max search results (default: 5) |
| -v, --version | Print version |
| -h, --help | Show help |

Agents can capture stdout for the snapshot or the task result. No extra dependencies — agent mode calls the OpenAI API with fetch.
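For an agent script that shells out to the CLI, capturing stdout is a one-liner around Node's child_process. This is a sketch assuming a global install; runCli is a hypothetical helper, not part of the package.

```javascript
// Sketch: capture the CLI's stdout from an agent script.
// Assumes `betterbrowse` is on PATH (global install); runCli is our own helper.
import { execFile } from 'node:child_process';

function runCli(cmd, args) {
  return new Promise((resolve, reject) => {
    // Generous maxBuffer: snapshots of large pages can be long.
    execFile(cmd, args, { maxBuffer: 16 * 1024 * 1024 }, (err, stdout) => {
      if (err) reject(err);
      else resolve(stdout.trim());
    });
  });
}

// Example (snapshot mode, no API key needed):
// const snapshot = await runCli('betterbrowse', ['https://example.com']);
```

Because the recording path and other diagnostics go to stderr, stdout contains only the snapshot or task result.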

Video recording: Use --record (and optionally --record-dir ./out). The browser session is captured via CDP screencast; if ffmpeg is installed, frames are stitched into recording.mp4. The output path is printed to stderr so stdout stays clean for the result.

Quick Start (library)

Browser Class (Tool Harness)

import { Browser } from '@mylesiyabor/betterbrowse';

const browser = new Browser({ headless: true });
await browser.launch();
await browser.navigate('https://example.com');

// Get ARIA snapshot — structured text representation of the page
const snapshot = await browser.getSnapshot();
console.log(snapshot);
// - heading "Example Domain" [ref=e1]
// - text "This domain is for use in illustrative examples..."
// - link "More information..." [ref=e2]

// Interact using refs from the snapshot
await browser.clickRef('e2');

// Take a screenshot
const png = await browser.screenshot(); // base64

await browser.close();

Agent (LLM-Driven Loop)

import { browseWeb } from '@mylesiyabor/betterbrowse';

const result = await browseWeb('https://news.ycombinator.com', 'Find the top story title', {
  chat: async (messages, { tools, maxTokens }) => {
    // Wire up your LLM here — OpenAI, Anthropic, Google, etc.
    const response = await yourLLM.chat(messages, { tools, maxTokens });
    return {
      content: response.text,
      toolCalls: response.toolCalls, // [{ name, arguments, id }]
      usage: { input: response.inputTokens, output: response.outputTokens },
    };
  },
});

console.log(result.result);  // "The top story is: ..."
console.log(result.usage);   // { inputTokens, outputTokens, modelCalls }
console.log(result.steps);   // [{ step, action, ref, text, result }, ...]
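As a concrete example of the chat option, here is a sketch of an adapter for OpenAI's Chat Completions API. The endpoint and response field names (choices, tool_calls, usage) come from OpenAI's public API; mapOpenAIResponse is a hypothetical helper, and whether arguments should be a parsed object or a JSON string is an assumption — this sketch parses it, matching the { name, arguments, id } shape shown above.

```javascript
// Sketch: adapt an OpenAI chat-completions response to the shape browseWeb expects.
function mapOpenAIResponse(body) {
  const choice = body.choices[0];
  return {
    content: choice.message.content ?? '',
    toolCalls: (choice.message.tool_calls ?? []).map((tc) => ({
      name: tc.function.name,
      arguments: JSON.parse(tc.function.arguments), // assumption: parsed object
      id: tc.id,
    })),
    usage: {
      input: body.usage?.prompt_tokens ?? 0,
      output: body.usage?.completion_tokens ?? 0,
    },
  };
}

// The chat option itself: POST to OpenAI, then normalize the response.
// Assumes the tools array can be passed through in OpenAI's tool format.
async function openaiChat(messages, { tools, maxTokens }) {
  const res = await fetch('https://api.openai.com/v1/chat/completions', {
    method: 'POST',
    headers: {
      'content-type': 'application/json',
      authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
    },
    body: JSON.stringify({ model: 'gpt-4o-mini', messages, tools, max_tokens: maxTokens }),
  });
  if (!res.ok) throw new Error(`OpenAI API error: ${res.status}`);
  return mapOpenAIResponse(await res.json());
}
```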

API

Browser

new Browser({ headless?: boolean, useProfile?: boolean, port?: number })

Extends EventEmitter. Events: launch, navigate, action, snapshot, close, error.

| Method | Description |
|---|---|
| launch() | Start Chrome and connect via CDP |
| navigate(url) | Navigate to a URL |
| getSnapshot() | Get optimized ARIA snapshot |
| getRawSnapshot() | Get raw snapshot + refMap |
| clickRef(ref) | Click element by ref (e.g. "e5") |
| fillRef(ref, text) | Type into input by ref |
| hover(ref) | Mouse hover by ref |
| selectOption(ref, value) | Select dropdown option by ref |
| waitForSelector(selector, timeout?) | Wait for CSS selector |
| screenshot() | Capture PNG (base64) |
| extractText() | Get all visible text |
| evaluate(expr) | Run JS in page |
| close() | Close browser |

browseWeb(url, task, opts)

LLM-driven browser agent. Returns { result, usage, steps, recording }.

Required option: chat — async function matching:

(messages, { tools, maxTokens }) => Promise<{ content, toolCalls?, usage? }>

Optional: record: true — record the session; recordDir: string — output directory. When recording, the returned object includes recording: { video, frameDir, frameCount, frames } (MP4 path in video if ffmpeg is installed).
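Because chat is just an async function, agent wiring can be exercised offline with a scripted stub that matches the signature above. makeScriptedChat is a hypothetical helper for illustration; the tool-call shape follows the Agent example earlier in this README.

```javascript
// Sketch: a scripted chat stub for testing agent wiring without a real model.
// Each call returns the next canned turn; the last turn repeats thereafter.
function makeScriptedChat(turns) {
  let i = 0;
  return async (messages, { tools, maxTokens }) => {
    const turn = turns[Math.min(i++, turns.length - 1)];
    return {
      content: turn.content ?? '',
      toolCalls: turn.toolCalls ?? [],
      usage: { input: 0, output: 0 },
    };
  };
}

// Example script: one tool call, then a final answer.
const chat = makeScriptedChat([
  { toolCalls: [{ name: 'clickRef', arguments: { ref: 'e2' }, id: 'call_1' }] },
  { content: 'Done: the top story is ...' },
]);
```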

Snapshot Utilities

import { optimizeAll, computeDiff, analyzeWaste } from '@mylesiyabor/betterbrowse';

// Optimize a raw ARIA snapshot
const optimized = optimizeAll(rawSnapshot, { maxItems: 10 });

// Compute diff between two snapshots
const diff = computeDiff(prevSnapshot, currSnapshot, prevUrl, currUrl);

// Analyze snapshot waste
const report = analyzeWaste(rawSnapshot);

How ARIA Snapshots Work

Instead of screenshots, we fetch the browser's accessibility tree via CDP and convert it to a compact text format:

- heading "Search Results" [ref=e1]
- textbox "Search query" [ref=e2]
- button "Search" [ref=e3]
- list
  - listitem
    - link "First Result" [ref=e4]
  - listitem
    - link "Second Result" [ref=e5]

Interactive elements get [ref=eXX] tags. The agent uses these refs to click, fill, hover, and select — no pixel coordinates needed.
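Since refs follow a fixed textual convention, listing the interactive elements on a page is plain string processing. This sketch is illustrative only — it operates on the [ref=eXX] convention shown above and is not part of the library's API.

```javascript
// Sketch: pull all interactive refs out of a snapshot string.
function extractRefs(snapshot) {
  const refs = [];
  const re = /\[ref=(e\d+)\]/g;
  let m;
  while ((m = re.exec(snapshot)) !== null) refs.push(m[1]);
  return refs;
}

// Example: extractRefs('- button "Search" [ref=e3]') yields ['e3'].
```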

The snapshot optimizer pipeline strips chrome (headers/footers), deduplicates links, compresses long names, and truncates lists — reducing token count by 60-90%.
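To give a feel for two of those passes, here is a sketch of link deduplication and list truncation as plain text transforms. This is illustrative only; the real optimizeAll pipeline is more involved.

```javascript
// Sketch: drop repeated links (same accessible name) from snapshot lines.
function dedupeLinks(lines) {
  const seen = new Set();
  return lines.filter((line) => {
    const m = line.match(/- link "([^"]+)"/);
    if (!m) return true;          // not a link line: keep
    if (seen.has(m[1])) return false; // duplicate link name: drop
    seen.add(m[1]);
    return true;
  });
}

// Sketch: cap a list at maxItems, noting how many lines were cut.
function truncateList(lines, maxItems) {
  if (lines.length <= maxItems) return lines;
  return [...lines.slice(0, maxItems), `  … ${lines.length - maxItems} more items`];
}
```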

Video recording

You can record browser sessions as video (CLI or library).

  • CLI: betterbrowse <url> "<task>" --record or betterbrowse <url> --record. Use --record-dir <dir> to choose where the file is saved. The session is captured via Chrome DevTools screencast; if ffmpeg is on your PATH, frames are stitched into recording.mp4 in that directory. The path is printed to stderr.
  • Library: Pass record: true and optionally recordDir: './recordings' to browseWeb(). The return value includes recording: { video, frameDir, frameCount, frames } (or recording: null if not recording). Frames are always saved; video is set only when ffmpeg is available.

Use in agents (global install)

Install betterbrowse globally so any agent (Cursor, MCP, scripts) can run it as a CLI:

npm install -g @mylesiyabor/betterbrowse

  • CLI (recommended): Run betterbrowse <url> or betterbrowse <url> "<task>". Result/snapshot goes to stdout — easy for agents to capture. For task mode set OPENAI_API_KEY.
  • As a library: In your agent code, import { Browser, browseWeb } from '@mylesiyabor/betterbrowse' and call the API (e.g. with your own chat function).
  • Project-local: Run npm install @mylesiyabor/betterbrowse in your project and invoke the CLI via npx betterbrowse or import the module.

License

MIT