@possumtech/rummy.web

v2.5.5

Web search and fetch plugin for rummy — Playwright browser automation and SearXNG search integration

Web search and fetch plugin for rummy. Adds <search> and URL-aware <get> tools powered by Playwright, Mozilla Readability, and a SearXNG backend (which itself federates upstream engines).

Install

npm install @possumtech/rummy.web
npx playwright install chromium

Setup

Load via environment variable:

RUMMY_PLUGIN_WEB=@possumtech/rummy.web

Search Backend

SearXNG is the only supported backend. Point the plugin at a running instance:

RUMMY_WEB_SEARXNG_URL=http://127.0.0.1:8888

SearXNG can itself federate to Brave, DuckDuckGo, Wikipedia, and many other engines (configured upstream in your SearXNG instance); we take its normalized output as gospel.

Environment Variables

| Variable | Required | Default | Description |
|---|---|---|---|
| RUMMY_PLUGIN_WEB | Yes | none | Set to @possumtech/rummy.web to load |
| RUMMY_WEB_SEARXNG_URL | Yes | none | SearXNG base URL |
| RUMMY_WEB_FETCH_TIMEOUT | Yes | none | Timeout in ms for page loads and search requests |
| RUMMY_WEB_SEARCH_MAX | Yes | none | Max <search> commands per turn |
| RUMMY_WEB_SEARCH_RESULTS | Yes | none | Default candidate count when the model doesn't set results=N |
| RUMMY_TOKEN_DIVISOR | Yes | none | Page chars-per-token ratio for the listing's (N tokens) badge |
| RUMMY_WEB_PLAYWRIGHT_WS | No | none | CDP ws:// endpoint to connect to a shared chromium instead of launching locally |
| RUMMY_WEB_NO_SANDBOX | No | none | 1 to drop chromium's user-namespace sandbox (docker-friendly; mild security tradeoff) |
| RUMMY_WEB_CHROMIUM_HEAP_MB | No | none | Cap chromium's V8 old-space heap (MB) |

Variables marked Yes have no in-code defaults: if one is unset, Number(undefined) evaluates to NaN and downstream behavior degrades silently instead of failing loudly. The package ships .env.example with sensible values, and the test runner loads it before .env.test. In production, the parent rummy app's env file is the source of truth.
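Because a missing value turns into NaN without any error, it can help to validate the numeric variables once at startup. The following is a minimal sketch, not part of the package: `requireNumber` and `readWebConfig` are hypothetical helper names, and the only thing taken from the table above is the list of variable names.

```javascript
// Hypothetical startup check; not an API of @possumtech/rummy.web.
const REQUIRED_NUMERIC = [
  "RUMMY_WEB_FETCH_TIMEOUT",
  "RUMMY_WEB_SEARCH_MAX",
  "RUMMY_WEB_SEARCH_RESULTS",
  "RUMMY_TOKEN_DIVISOR",
];

function requireNumber(env, name) {
  const raw = env[name];
  // Number(undefined) is NaN and Number("") is 0, so reject both explicitly.
  if (raw === undefined || raw.trim() === "" || !Number.isFinite(Number(raw))) {
    throw new Error(`${name} must be set to a number, got ${JSON.stringify(raw)}`);
  }
  return Number(raw);
}

// Returns an object keyed by variable name, or throws on the first bad value.
function readWebConfig(env) {
  return Object.fromEntries(
    REQUIRED_NUMERIC.map((name) => [name, requireNumber(env, name)]),
  );
}
```

Calling `readWebConfig(process.env)` before loading the plugin would surface a missing variable as an exception rather than as a silent NaN.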

Tools

<search> — Web Search

Queries the configured search backend, fetches each result in parallel, archives the bodies as <https> run entries (visibility: "archived"), and returns the surviving candidates with their token costs as a single log entry. A subsequent <get> on any listed URL is a pure visibility flip — no second round trip.

<search>node.js streams backpressure</search>
<search results="5">SQLite WAL mode</search>
  • Results default to 12; set the results attribute to override the count for a single search.
  • Every candidate URL is fetched in parallel (10s timeout) — to validate reachability, measure token cost, and archive the body for a zero-network <get>. Candidates already archived within the last 10 minutes are served from the existing entry; only stale or new URLs hit the network.
  • Unreachable results (404, timeout, network error) are dropped from the listing. The header reports N of M results (M-N unreachable) so the model knows some were filtered.
  • The search log entry's body is a markdown bullet list. Each result renders as * URL — title (N tokens) followed (when populated) by an indented date line (YYYY-MM-DD) and the page's description. The leading * is load-bearing: it marks the body as rendered output the model has no training prior for emitting as a tool. Token count is the signal for the model's "which one is worth promoting" decision.
  • Each successfully-fetched URL lands as an archived <https> entry (state: "resolved", visibility: "archived") with the body and {title, content, publishedDate, engine, excerpt, byline, siteName, fetched_at} attributes. <get> on a listed URL becomes a pure visibility flip; no re-fetch.
  • Hard-capped at RUMMY_WEB_SEARCH_MAX searches per turn; further searches are refused (error logged with status 429).
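The listing format described in the bullets above can be sketched roughly as follows. This is an illustration under assumptions, not the plugin's code: `renderListing` is a hypothetical name, the two-space indent and the `Math.ceil` rounding are guesses, and failed candidates are assumed to carry an `error` field as in the fetch response shape.

```javascript
// Sketch of the search log body. Rounding and indent width are assumptions.
function renderListing(candidates, tokenDivisor) {
  const reachable = candidates.filter((c) => !c.error);
  const dropped = candidates.length - reachable.length;
  // Header: "N of M results (M-N unreachable)".
  const header = `${reachable.length} of ${candidates.length} results (${dropped} unreachable)`;

  const lines = reachable.flatMap((c) => {
    // Token cost is approximated from character count and the divisor.
    const tokens = Math.ceil(c.content.length / tokenDivisor);
    // \u2014 is the dash separator used between URL and title.
    const out = [`* ${c.url} \u2014 ${c.title} (${tokens} tokens)`];
    if (c.publishedDate) out.push(`  ${c.publishedDate}`);
    if (c.excerpt) out.push(`  ${c.excerpt}`);
    return out;
  });

  return [header, ...lines].join("\n");
}
```

The date and description lines are only emitted when populated, which matches the "when populated" wording above.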

<get> — URL Fetch

When <get> targets an http:// or https:// URL, this plugin intercepts at priority 5 (before the core get handler at 10).

<get>https://en.wikipedia.org/wiki/Mitch_Hedberg</get>
  • If the URL is already a known entry and was fetched within the last 10 minutes, this handler skips the network call and lets the core get handler promote the existing entry. Stale entries (older than 10 min) fall through to a fresh fetch and overwrite the archive.
  • Otherwise, fetches the page with headless Chromium. HTML responses go through Readability + Turndown for clean markdown; non-HTML responses (text/plain, application/json, source files, …) are read as raw text.
  • Wikipedia URLs are redirected to the mobile-html API for cleaner content. GitHub blob/ URLs are redirected to raw.githubusercontent.com so source files come through as their raw bytes instead of the JS-rendered SPA (whose CSP forbids Readability injection anyway).
  • All fetches use mobile device emulation (Pixel 5) for lighter page responses.
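Two of the rules above, the 10-minute freshness window and the GitHub blob redirect, are simple enough to sketch. The helper names `isFresh` and `rewriteGithubBlob` are hypothetical, the representation of `fetched_at` (anything `new Date()` accepts) is an assumption, and the plugin's actual matching logic may differ.

```javascript
// Assumed freshness window taken from the "last 10 minutes" rule above.
const FRESH_MS = 10 * 60 * 1000;

// True when an archived entry is recent enough to skip the network call.
function isFresh(fetchedAt, now = Date.now()) {
  return now - new Date(fetchedAt).getTime() < FRESH_MS;
}

// Maps github.com/<owner>/<repo>/blob/<ref>/<path> to the raw host.
// Any other URL is returned unchanged.
function rewriteGithubBlob(raw) {
  const u = new URL(raw);
  const m = u.pathname.match(/^\/([^/]+)\/([^/]+)\/blob\/(.+)$/);
  if (u.hostname !== "github.com" || !m) return raw;
  const [, owner, repo, rest] = m;
  return `https://raw.githubusercontent.com/${owner}/${repo}/${rest}`;
}
```

For example, a blob URL for `README.md` on a repository's `main` branch maps to the same path under raw.githubusercontent.com, with the `blob/` segment removed.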

This makes <get> the universal URL-fetching verb regardless of where the URL came from — search results, page prose, or operator input.

Programmatic Access

WebFetcher is available as a standalone export:

import WebFetcher from "@possumtech/rummy.web/fetcher";

const fetcher = new WebFetcher();
const runId = 1; // required — scopes the BrowserContext (cookies, cache, etc.)

const page = await fetcher.fetch("https://example.com", { runId });
console.log(page.title, page.content);

const pages = await fetcher.fetchAll(["https://a.com", "https://b.com"], { runId, timeout: 10000 });

const results = await fetcher.search("query");

fetcher.closeContext(runId); // free the run's context when done
await fetcher.close();         // tear down the browser

runId is load-bearing: each run gets its own BrowserContext (isolated cookies, localStorage, cache) on a shared chromium. Calling fetch/fetchAll without one throws.
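Since each run holds a BrowserContext until closeContext is called, one way to avoid leaking contexts is to wrap the work in try/finally. This is a sketch only: `withRun` is a hypothetical helper, and it relies on nothing beyond the `closeContext(runId)` call shown above.

```javascript
// Hypothetical wrapper: runs `work` and always releases the run's context,
// even if `work` throws.
async function withRun(fetcher, runId, work) {
  try {
    return await work(runId);
  } finally {
    fetcher.closeContext(runId);
  }
}

// Usage sketch (assumes `fetcher` is a WebFetcher instance):
//   const page = await withRun(fetcher, 1, (runId) =>
//     fetcher.fetch("https://example.com", { runId }));
```

The browser itself is still shared, so `fetcher.close()` remains a separate step at process shutdown.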

WebFetcher.fetch(url, opts?) Response

{ url, title, content, excerpt, byline, siteName }
// or on failure:
{ url, title: null, content: null, error: "message" }

WebFetcher.fetchAll(urls, opts?) Response

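Returns the result of Promise.allSettled: an array of { status, value } objects, with each value matching the fetch response shape. Under standard Promise.allSettled semantics, a promise that rejects yields { status: "rejected", reason } instead.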
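A caller typically wants to separate usable pages from failures. The sketch below assumes the shapes documented above (a fulfilled entry whose value may carry an `error` field) plus the standard rejected-entry shape; `partitionSettled` is a hypothetical helper, not a package export.

```javascript
// Splits Promise.allSettled output into usable pages and failures.
function partitionSettled(settled) {
  const pages = [];
  const failures = [];
  for (const entry of settled) {
    if (entry.status === "rejected") {
      // Standard allSettled shape for a rejected promise; URL is unknown here.
      failures.push({ url: null, error: String(entry.reason) });
    } else if (entry.value.error) {
      // Fulfilled, but the fetch reported a failure in its response object.
      failures.push({ url: entry.value.url, error: entry.value.error });
    } else {
      pages.push(entry.value);
    }
  }
  return { pages, failures };
}
```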

WebFetcher.search(query, opts?) Response

Returns SearXNG's native per-result shape verbatim, sliced to limit:

[{
  url, title, content,
  engine, engines,
  publishedDate, score, category,
  thumbnail, // … plus any other fields SearXNG emits
}]

content is the upstream snippet (already lxml-extracted by SearXNG — no entity-decode or highlight-strip needed on our side). score is SearXNG's per-result quality signal; engines is the set of upstream engines that returned this URL (a useful cross-backend agreement signal when SearXNG is configured with multiple engines).
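As an illustration of using engines as an agreement signal, results could be ordered by how many upstream engines returned them, with score as a tie-breaker. This is one possible heuristic, not something the package does: `rankByAgreement` is a hypothetical name, and engines is assumed to be an array.

```javascript
// Orders results by engine agreement (descending), then by score (descending).
// Missing fields are treated as zero. The input array is not mutated.
function rankByAgreement(results) {
  return [...results].sort((a, b) => {
    const byEngines = (b.engines?.length ?? 0) - (a.engines?.length ?? 0);
    return byEngines !== 0 ? byEngines : (b.score ?? 0) - (a.score ?? 0);
  });
}
```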

WebFetcher.cleanUrl(raw)

Static. Strips query params, hash fragments, and trailing slashes.
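The described behavior can be approximated with the standard URL class. This is a sketch of the idea rather than the package's implementation: `cleanUrlSketch` is a hypothetical name, and edge cases (invalid input, which makes `new URL` throw) may be handled differently by the real method.

```javascript
// Approximates the documented behavior: drop query string, hash fragment,
// and trailing slashes. Throws a TypeError on input that is not a valid URL.
function cleanUrlSketch(raw) {
  const u = new URL(raw);
  u.search = "";
  u.hash = "";
  return u.href.replace(/\/+$/, "");
}
```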

License

MIT