@kevinjosethomas/rlm

v0.1.1

Recursive Language Models in TypeScript — browser-native RLM

rlm-ts

A TypeScript re-implementation of Recursive Language Models that runs entirely in the browser. Based on Prime Intellect's RLM paper and reference implementation.

What is an RLM?

Standard LLMs degrade on long contexts — performance drops and cost scales linearly. Recursive Language Models flip this: instead of dumping all data into one prompt, the model gets a persistent JavaScript REPL and can spawn sub-LLM calls to process data in parallel. The main model's context stays small; heavy data processing is delegated.

Three primitives:

  1. Persistent REPL — the model writes JavaScript in ```repl``` blocks. Variables declared with var persist across executions, building up state incrementally.
  2. llm_query(prompt) — spawns a sub-LLM call for summarization, extraction, or interpretation. The sub-LLM handles the data; the main model only sees the summary.
  3. llm_batch(prompts) — runs multiple sub-LLM calls concurrently for parallel analysis.

The model iterates — writing code, inspecting results, delegating to sub-LLMs, refining its analysis — until it produces a final answer.
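
To make that loop concrete, here is a sketch of the kind of code a model might emit inside a ```repl``` block. The `docs` variable and the `llm_batch` stub are assumptions for this sketch only — in the real sandbox, `docs` would be an injected tool and `llm_batch` would dispatch parallel sub-LLM calls through your API route:

```typescript
// Stand-ins for sandbox globals (assumptions for this sketch; in rlm-ts these
// come from your tools config and your server-side API route).
const docs: string[] = ["first document", "second document", "third document"];
function llm_batch(prompts: string[]): string[] {
  return prompts.map((p) => `summary: ${p}`); // stub; really parallel sub-LLM calls
}

// The kind of code the model itself might write in a ```repl``` block:
var summaries = llm_batch(docs.map((d) => "Summarize: " + d));
console.log(summaries.join("\n"));
```

The main model never sees the documents themselves — only the short summaries that come back.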

How it works

Browser                                    Your Server
┌─────────────────────────────────┐        ┌──────────────────────┐
│                                 │        │                      │
│  RLM Core (main thread)         │        │  /api/rlm            │
│  ├── Iteration loop             │        │  ├── Main calls      │
│  ├── Parse ```repl``` blocks    ├─fetch──┤  │   (Sonnet)        │
│  ├── Build message history      │        │  ├── Sub-LLM calls   │
│  └── Detect FINAL answer        │        │  │   (Haiku)         │
│       │                         │        │  └── Batch calls     │
│       ▼                         │        │                      │
│  Web Worker (sandbox)           │        └──────────┬───────────┘
│  ├── eval() code execution      │                   │
│  ├── Persistent var state       │                   ▼
│  ├── llm_query() via Atomics    │           LLM Provider API
│  └── Console capture            │
│                                 │
└─────────────────────────────────┘

The sandbox runs in a Web Worker — an isolated thread with no DOM, no network access, and no main-thread scope. When the model calls llm_query() inside the sandbox, it blocks synchronously via SharedArrayBuffer + Atomics.wait() while the main thread fetches from your API route and writes the response back to shared memory.

This means the model can write natural synchronous code:

var data = sleep_data.slice(0, 30)
var analysis = llm_query("Analyze these sleep records: " + JSON.stringify(data))
console.log(analysis)

Even though the LLM call is async under the hood.
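
A minimal sketch of that bridge, with a buffer layout assumed for illustration (the real rlm-ts layout may differ). It is shown single-threaded here for clarity; in rlm-ts the blocking read runs inside the Worker:

```typescript
// Assumed layout: ctrl[0] = signal (0 pending, 1 ready), ctrl[1] = byte length.
const sab = new SharedArrayBuffer(8 + 1024);
const ctrl = new Int32Array(sab, 0, 2);
const data = new Uint8Array(sab, 8);

// Main-thread side: after the fetch resolves, write the response and wake the worker.
function writeResponse(text: string): void {
  const bytes = new TextEncoder().encode(text);
  data.set(bytes);
  Atomics.store(ctrl, 1, bytes.length);
  Atomics.store(ctrl, 0, 1);
  Atomics.notify(ctrl, 0); // wakes any Atomics.wait() parked on ctrl[0]
}

// Worker side: block until the signal flips, then decode the shared bytes.
function readResponseBlocking(): string {
  Atomics.wait(ctrl, 0, 0); // returns immediately ("not-equal") if ctrl[0] is already 1
  const len = Atomics.load(ctrl, 1);
  return new TextDecoder().decode(data.slice(0, len)); // slice() copies out of the SAB
}

writeResponse("sub-LLM answer");
console.log(readResponseBlocking()); // → sub-LLM answer
```

The `slice()` copy matters: `TextDecoder` cannot decode views backed by a `SharedArrayBuffer` directly.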

Install

npm install rlm-ts

Usage

1. Create an API route

The RLM needs a server-side proxy to keep API keys off the client. Here's a Next.js example using the Vercel AI SDK:

// app/api/rlm/route.ts
import { generateText } from "ai";
import { gateway } from "@ai-sdk/gateway";

export async function POST(req: Request) {
  const body = await req.json();

  if (body.type === "batch") {
    const results = await Promise.all(
      body.prompts.map((prompt: string) =>
        generateText({
          model: gateway(body.model || "anthropic/claude-haiku-4.5"),
          prompt,
        }).then((r) => r.text)
      )
    );
    return Response.json({ results });
  }

  if (body.messages) {
    const { text } = await generateText({
      model: gateway(body.model || "anthropic/claude-sonnet-4-5"),
      messages: body.messages,
    });
    return Response.json({ text });
  }

  const { text } = await generateText({
    model: gateway(body.model || "anthropic/claude-haiku-4.5"),
    prompt: body.prompt,
  });
  return Response.json({ text });
}

The route handles three request shapes:

  • Messages array — main RLM loop calls (uses the strong model)
  • Prompt string — sub-LLM calls from the sandbox (uses the cheap model)
  • Batch — parallel sub-LLM calls
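
These shapes can be written down as a TypeScript union. The field names are taken from the route above; the `RlmRequest` name and `classify` helper are illustrative, not part of rlm-ts:

```typescript
// Discriminating the three request shapes the route accepts (sketch).
type RlmRequest =
  | { type: "batch"; prompts: string[]; model?: string }              // parallel sub-LLM calls
  | { messages: { role: string; content: string }[]; model?: string } // main RLM loop call
  | { prompt: string; model?: string };                               // single sub-LLM call

function classify(body: RlmRequest): "batch" | "main" | "sub" {
  if ("type" in body && body.type === "batch") return "batch";
  if ("messages" in body) return "main";
  return "sub";
}

console.log(classify({ prompt: "Summarize this chunk" })); // → sub
```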

2. Add required headers

SharedArrayBuffer requires these HTTP headers:

// next.config.mjs
const nextConfig = {
  headers: async () => [
    {
      source: "/(.*)",
      headers: [
        { key: "Cross-Origin-Opener-Policy", value: "same-origin" },
        { key: "Cross-Origin-Embedder-Policy", value: "require-corp" },
      ],
    },
  ],
};

export default nextConfig;
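
You can verify the headers took effect at runtime: in a browser, `crossOriginIsolated` is `true` only when both headers are present. The `sandboxSupported` helper below is illustrative, not part of rlm-ts:

```typescript
// True only in a cross-origin-isolated context where SharedArrayBuffer
// is actually constructible.
function sandboxSupported(env: {
  crossOriginIsolated?: boolean;
  SharedArrayBuffer?: unknown;
}): boolean {
  return env.crossOriginIsolated === true && typeof env.SharedArrayBuffer === "function";
}

// In the browser: sandboxSupported(globalThis) — false means the COOP/COEP
// headers above are missing and the sync bridge cannot work.
```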

3. Run it

import { RLM } from "rlm-ts";

const rlm = new RLM({
  endpoint: "/api/rlm",
  tools: {
    users: {
      tool: [{ name: "Alice", age: 30 }, { name: "Bob", age: 25 }],
      description: "User records: {name, age}",
    },
  },
  onIteration: (iteration) => {
    console.log(`Iteration ${iteration.index + 1}:`, iteration.response);
  },
});

const result = await rlm.run("Who is the oldest user?");
console.log(result.answer);

API

new RLM(options)

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| endpoint | string | required | URL of your LLM proxy route |
| model | string | "anthropic/claude-sonnet-4-5" | Model for the main RLM loop |
| subModel | string | "anthropic/claude-haiku-4.5" | Model for sub-LLM calls |
| maxIterations | number | 20 | Max loop iterations before forcing a final answer |
| maxOutputChars | number | 20000 | Truncate execution output beyond this length |
| maxTimeout | number | — | Max total run time in ms |
| maxErrors | number | 5 | Stop after this many consecutive sandbox errors |
| systemPrompt | string | Built-in prompt | Override the system prompt |
| tools | Record<string, ToolDefinition> | {} | Data injected into the sandbox |
| onIteration | (iteration: RLMIteration) => void | — | Callback after each iteration |
| onSubLLMCall | (req: { prompt, model? }) => void | — | Callback when a sub-LLM call is made |

rlm.run(prompt): Promise<RLMResult>

Returns:

interface RLMResult {
  answer: string;             // The final answer
  iterations: RLMIteration[]; // Full iteration history
  totalTime: number;          // Total execution time in ms
}
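
For example, a small consumer of the result (the `RLMIteration` fields here are assumed from the `onIteration` example above — `index` and `response`; the real interface may carry more, and `summarizeRun` is a hypothetical helper):

```typescript
interface RLMIteration { index: number; response: string }
interface RLMResult {
  answer: string;
  iterations: RLMIteration[];
  totalTime: number;
}

// Hypothetical helper: one-line summary of a finished run.
function summarizeRun(result: RLMResult): string {
  return `${result.iterations.length} iteration(s), ${result.totalTime}ms: ${result.answer}`;
}

const demo: RLMResult = {
  answer: "Alice",
  iterations: [{ index: 0, response: "FINAL: Alice" }],
  totalTime: 1234,
};
console.log(summarizeRun(demo)); // → 1 iteration(s), 1234ms: Alice
```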

Tools

Tools are data values injected into the sandbox as global variables. The model can access them by name in its code.

tools: {
  // With description (shown in system prompt)
  my_data: {
    tool: [1, 2, 3],
    description: "An array of numbers",
  },
  // Plain value
  config: { threshold: 0.5 },
}

Tools must be JSON-serializable (no functions — they can't be sent to a Web Worker via postMessage).
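
One way to catch a bad tool early is `structuredClone`, which uses the same structured-clone algorithm as `postMessage`. This helper is an assumption, not part of rlm-ts, and it is slightly more permissive than strict JSON (it also accepts Date, Map, and similar cloneable types):

```typescript
// Throws early, with the tool's name, instead of failing inside postMessage.
function assertCloneable(name: string, value: unknown): void {
  try {
    structuredClone(value); // same algorithm postMessage applies to tool data
  } catch {
    throw new Error(`Tool "${name}" cannot be sent to the Worker sandbox`);
  }
}

assertCloneable("users", [{ name: "Alice", age: 30 }]); // ok
// assertCloneable("bad", { fn: () => {} });            // would throw: functions don't clone
```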

Architecture

Why eval() in a Web Worker?

The Python RLM uses exec() with restricted builtins. We use eval() in a Web Worker, which gives us:

  • Variable persistence — var declarations in eval() become Worker globals and survive across executeCode() calls. This is critical for the RLM loop, where the model builds up state incrementally.
  • Natural sandboxing — Web Workers have no DOM, no document, no window, no localStorage, no cookies. Network access (fetch, XMLHttpRequest) is explicitly blocked.
  • No infrastructure — the Python version needs Docker containers or Modal sandboxes. The Worker IS the sandbox.

Why SharedArrayBuffer + Atomics?

The model writes synchronous code (var result = llm_query("...")), but LLM calls require async fetch(). Atomics.wait() blocks the Worker thread while the main thread handles the fetch and writes the response to shared memory. This lets the model write natural code without async/await.

This requires Cross-Origin-Opener-Policy: same-origin and Cross-Origin-Embedder-Policy: require-corp headers.

Differences from the Python implementation

| | Python RLM | rlm-ts |
|---|---|---|
| Sandbox | exec() + restricted builtins, Docker, Modal | Web Worker + eval() |
| IPC | TCP sockets (LMHandler) | postMessage + SharedArrayBuffer |
| State persistence | dill serialization, self.locals dict | Worker globals via eval() |
| Sub-LLM sync | Blocking TCP request | Atomics.wait() |
| Recursion | rlm_query() spawns child RLM with own REPL | Not yet (v0.1) |
| Filesystem | tempfile, open(), os | None (browser) |

Development

npm run build          # Build worker + main package
npm run sync           # Build and copy to ../dashboard/node_modules/rlm-ts
npm test               # Run tests

License

MIT