
@lightfeed/browser-agent

v0.2.3

Serverless browser agent

Overview

@lightfeed/browser-agent is a TypeScript browser-agent library built to cut LLM token use on every rerun.

Most browser-agent work has two parts:

  • Navigation — the many clicks, types, and scrolls needed to reach a target page. These account for most of the steps and most of the tokens, and they are usually identical on every run when the page structure is stable. Today's agents pay for those tokens every single time.
  • Extraction — pulling typed data out of whatever is on screen. This must re-run the AI each time because the content is live.

This library lets you run navigation once with AI, save it as a plan, and replay it with zero LLM calls — no screenshots, no DOM map, no tokens. Then run a cheap .extract() on the result page for the dynamic tail. If the DOM drifts, optional aiFallback re-plans only the broken step, so you still pay tokens for a fraction of the flow instead of all of it.

Runs anywhere your browser lives — the same BrowserAgent API drives a local Chromium for dev, a serverless Chromium (AWS Lambda via @sparticuz/chromium) for scheduled jobs, or a remote CDP endpoint (Brightdata Scraping Browser, any browser farm, or your own). Swap backends by changing one config field; prompts, plans, and .extract() calls stay identical.

Install

npm install @lightfeed/browser-agent

Example

Go to the Hacker News Show section, click through to the next page, and grab the top 3 posts. Navigation (open Show, paginate) is the expensive-but-stable part; extraction is the live-data part.

import { BrowserAgent } from "@lightfeed/browser-agent";
import { ChatGoogleGenerativeAI } from "@langchain/google-genai";
import { z } from "zod";

const agent = new BrowserAgent({
  browserProvider: "Local",
  llm: new ChatGoogleGenerativeAI({ model: "gemini-2.5-flash" }),
});

const page = await agent.newPage();

// 1. AI navigation — recordable, replayable.
const nav = await page.ai(
  "Go to Hacker News show section, go to next page"
);
await agent.savePlan("hn show page 2", nav, "./hn.plan.json");

// 2. AI extraction — typed by a Zod schema, runs AI every call.
const { articles } = await page.extract(
  "The top 3 articles on this page",
  z.object({
    articles: z
      .array(
        z.object({
          title: z.string(),
          url: z.string(),
          points: z.number(),
          commentsUrl: z.string(),
        })
      )
      .max(3),
  })
);

On every subsequent run, navigation is free:

await agent.replay("./hn.plan.json", { page });   // zero tokens
const { articles } = await page.extract(/* ... */); // tokens only here

CLI

Everything above is available without writing code:

# Record while running
browser-agent-cli run --save-plan ./hn.plan.json \
  -c "Go to Hacker News show section, go to next page and find top 3 articles"

# Replay: deterministic navigation (no LLM), then one fresh AI pass on the
# result page to produce an up-to-date final response. The navigation part
# is free; only the final pass spends tokens.
browser-agent-cli replay ./hn.plan.json

# Pure replay — skip the final AI pass and just get the browser onto the
# result page (zero LLM calls end-to-end).
browser-agent-cli replay ./hn.plan.json --no-ai-finish

# Use a different finishing task (e.g. ask for a custom summary of the
# current page instead of re-running the recorded task).
browser-agent-cli replay ./hn.plan.json \
  --finish-task "Return the titles of the first 3 posts as a bullet list"

# Self-heal drifted steps during replay (independent of the finish pass).
browser-agent-cli replay ./hn.plan.json --ai-fallback

The LLM is auto-detected from GOOGLE_API_KEY / GEMINI_API_KEY, OPENAI_API_KEY, or ANTHROPIC_API_KEY. Override the model with --llm-model or GEMINI_MODEL / OPENAI_MODEL / ANTHROPIC_MODEL. replay only needs an LLM when --ai-fallback is set. Interactive: ctrl+p to pause, ctrl+r to resume.

Browser providers

The same BrowserAgent API works against three backends.

// Local Chromium for development
const localAgent = new BrowserAgent({ browserProvider: "Local" });

// Remote CDP endpoint (scraping-browser service, browser farm, or your own)
const remoteAgent = new BrowserAgent({
  browserProvider: "Remote",
  remoteConfig: {
    browserWSEndpoint: "ws://your-remote-browser:9222/devtools/browser/ws",
  },
});

Version pinning: This project uses Playwright, which ships with a specific version of Chromium. You need a matching @sparticuz/chromium. We're on Playwright 1.49 (Chromium 133), so install @sparticuz/chromium@133. For AWS Lambda, ARM64 is preferred; you also need the canvas native dependencies — see lambda-layer-build.sh.

import { BrowserAgent } from "@lightfeed/browser-agent";
import chromium from "@sparticuz/chromium";

// Serverless Chromium (e.g. AWS Lambda)
const agent = new BrowserAgent({
  browserProvider: "Serverless",
  serverlessConfig: {
    executablePath: await chromium.executablePath(),
    options: { args: chromium.args },
  },
});

page.ai vs agent.executeTask vs agent.executeTaskAsync

All three drive the browser with AI, return the same TaskOutput, and can be recorded + replayed.

| API | Use when |
| --- | --- |
| page.ai(task) | You already have a page and want to mix Playwright calls (page.goto, page.clickElement) with AI steps on the same tab. Resolves when done. |
| agent.executeTask(task) | "Here's a goal, figure it out." The agent owns the page; include URLs in the prompt and it navigates itself. Resolves when done. |
| agent.executeTaskAsync(task) | Same as executeTask but returns a Task control handle immediately — task.pause(), task.resume(), task.cancel(), and per-step event callbacks. For long-running flows, CLIs, or anything a user can interrupt. |
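The difference between executeTask and executeTaskAsync is essentially synchronous completion versus a control handle. The handle pattern can be sketched as a toy model — the class and names below are illustrative only, not the library's implementation:

```typescript
// Toy model of a pause/resume/cancel control handle. A real Task would drive
// the browser; here each "step" just records its index.
type StepCallback = (stepIndex: number) => void;

class ToyTask {
  private paused = false;
  private cancelled = false;
  /** Resolves with the indices of the steps that actually ran. */
  done: Promise<number[]>;

  constructor(totalSteps: number, onStep?: StepCallback) {
    this.done = (async () => {
      const executed: number[] = [];
      for (let i = 0; i < totalSteps; i++) {
        // Yield to the event loop so pause()/cancel() calls can take effect.
        await new Promise((r) => setTimeout(r, 0));
        while (this.paused) await new Promise((r) => setTimeout(r, 10));
        if (this.cancelled) break;
        executed.push(i); // a real task would perform a browser action here
        onStep?.(i);
      }
      return executed;
    })();
  }

  pause(): void { this.paused = true; }
  resume(): void { this.paused = false; }
  cancel(): void { this.cancelled = true; }
}
```

The point is only that the handle returns immediately while work continues in the background, so a CLI or UI can interrupt it mid-flow.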

Record & replay

  • agent.savePlan(task, result, path) writes a JSON plan with the action sequence and a stable xpath + cssPath for each clicked / typed element.
  • agent.replay(path, { page }) re-runs those actions with no LLM calls, no screenshots, no DOM map.
  • aiFallback: true re-plans only a drifted step with the LLM; the rest stays free.
  • startingUrl (option, or --url on the CLI) retargets a plan at a different URL — useful for staging / preview deploys / different queries.
  • Plans are human-readable and hand-editable (tweak an inputText value, reorder or delete steps).
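The aiFallback economics described above can be sketched as a toy replay loop, where locate and replan are illustrative stand-ins (not library APIs) for the DOM lookup and the LLM re-planning call:

```typescript
// Toy model of replay with self-healing: deterministic steps are free; only a
// step whose selector no longer resolves triggers an LLM call.
interface Step {
  selector: string;
  action: "click" | "type";
  value?: string;
}

function replayPlan(
  steps: Step[],
  locate: (selector: string) => boolean, // stand-in for the DOM lookup
  replan: (broken: Step) => Step,        // stand-in for the LLM re-planner
): { executed: Step[]; llmCalls: number } {
  let llmCalls = 0;
  const executed: Step[] = [];
  for (const step of steps) {
    let current = step;
    if (!locate(current.selector)) {
      current = replan(current); // only the drifted step spends tokens
      llmCalls++;
    }
    executed.push(current); // a real replay would drive the browser here
  }
  return { executed, llmCalls };
}
```

If one step out of ten drifts, you pay for one re-planning call instead of a full AI-driven run — that is the whole pitch of record-and-replay with fallback.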

The output string the model produced while recording is frozen in the plan — the programmatic agent.replay() does not regenerate it. If you're wiring this up programmatically, run your own .extract() / .ai() on the page after agent.replay() instead of relying on the recorded output.

The CLI's replay command, by default, runs one fresh AI pass (page.ai(plan.task, { maxSteps: 3 })) on the result page after navigation, so every CLI run ends with an up-to-date response; pass --no-ai-finish to get pure token-free replay and fall back to the recorded output.

License

MIT. Forked from HyperAgent (b49afe). Serverless browser support by @sparticuz/chromium.