rlm-cli

v0.6.1

Standalone CLI for Recursive Language Models (RLMs) — implements Algorithm 1 from arXiv:2512.24601

CLI for Recursive Language Models — based on the RLM paper.

Instead of dumping a huge context into a single LLM call, RLM lets the model write Python code to process it — slicing, chunking, running sub-queries on pieces, and building up an answer across multiple iterations.

Quickstart

npm install -g rlm-cli
rlm                                          # interactive — first run sets up your provider + key
rlm run --file big.log "which errors repeat most, and when?"

Works with Anthropic, OpenAI, Google, OpenRouter, or local Ollama models. The model-generated Python runs in an OS-level sandbox by default (see Security).

Security

rlm runs Python that the LLM writes, and a prompt-injected context document (a file or fetched URL) can steer that code — so the subprocess is treated as untrusted.

  • Sandboxed by default. The Python subprocess runs inside an OS-level sandbox (macOS sandbox-exec/Seatbelt, Linux bwrap/bubblewrap) that blocks all network and hides ~/.rlm, so injected code can't exfiltrate your API keys. All LLM calls are proxied through the parent process, so the child needs neither network nor credential access — nothing legitimate breaks. The sandbox is probe-tested before use.
  • Graceful fallback. Where no sandbox is available (Windows, or bwrap not installed), rlm prints a warning and runs unsandboxed — the code then has full access as your user, so only point it at content you trust or run inside a container/VM.
  • Opt out with RLM_NO_SANDBOX=1 when model code legitimately needs network or local-file access and you trust the input.

The sandbox confines the primary exfiltration paths (network + credential reads). It does not yet fully confine arbitrary local file writes — treat untrusted context with care regardless.
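
The selection rules above can be sketched in a few lines of Python. This is only an illustration of the documented behavior, not the code in sandbox.ts: the function name `pick_sandbox` and its parameters are hypothetical, and the probe test that rlm performs before trusting a sandbox is left out.

```python
import os
import platform
import shutil
from typing import Callable, Mapping, Optional


def pick_sandbox(
    system: Optional[str] = None,
    which: Callable[[str], Optional[str]] = shutil.which,
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Return the sandbox tool to use, or None to run unsandboxed.

    Hypothetical helper: it mirrors the rules described above, nothing more.
    """
    env = os.environ if env is None else env
    if env.get("RLM_NO_SANDBOX") == "1":
        return None  # explicit opt-out
    system = system or platform.system()
    if system == "Darwin" and which("sandbox-exec"):
        return "sandbox-exec"  # Seatbelt
    if system == "Linux" and which("bwrap"):
        return "bwrap"  # bubblewrap
    return None  # Windows, or bwrap not installed: warn and run unsandboxed


if __name__ == "__main__":
    tool = pick_sandbox()
    print(tool or "no sandbox available: only use trusted context")
```

A `None` result corresponds to the fallback case, where the generated code has full access as your user.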

What's New in v0.6.0

  • Sandboxed execution — model-generated Python runs in an OS-level sandbox by default (no network, no access to ~/.rlm), so prompt-injected code can't exfiltrate your keys — see Security
  • Ollama support — use any locally-installed model (llama3, mistral, qwen, etc.) with zero API key setup
  • Mixed-model mode: sub_model in config lets you use a cheap/fast model for sub-queries and a powerful model for the root loop (mirrors the paper's GPT-5 + GPT-5-mini setup)
  • Paper-aligned system prompt — per-iteration budget awareness, sub-query strategy guidance, parallel async patterns from arXiv:2512.24601
  • Session-based trajectories — runs grouped into ~/.rlm/sessions/<session-id>/ instead of a flat directory
  • Refreshed terminal UI — Electric Amber RGB palette, two-column welcome panel with version in border, silent operation (no noise between queries)
  • Honest runtime limits: max_depth is pinned to 1 because the current runtime implements flat paper-style sub-calls, not nested recursive RLM agents

Install

npm install -g rlm-cli

Requires Node.js >= 20 and Python 3.

Run rlm to start. First launch will prompt for a provider + API key (saved to ~/.rlm/credentials).


Supported Providers

| Provider | Env Variable | Default Model |
|----------|-------------|---------------|
| Anthropic | ANTHROPIC_API_KEY | claude-sonnet-4-6 |
| OpenAI | OPENAI_API_KEY | gpt-4o |
| Google | GEMINI_API_KEY | gemini-2.5-flash |
| OpenRouter | OPENROUTER_API_KEY | auto |
| Ollama | (no key needed) | any installed model |

Ollama (local models)

If Ollama is running, rlm-cli auto-detects it at startup — no config needed.

ollama pull llama3.1:8b
rlm
# → /model llama3.1:8b   or   /provider → choose Ollama

Set a custom daemon URL with OLLAMA_BASE_URL=http://....

Keys are loaded from (highest priority wins):

  1. Shell environment variables
  2. .env file in the current working directory (falls back to the package root)
  3. ~/.rlm/credentials
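
The lookup order can be pictured as a short Python sketch. It assumes the list runs from highest to lowest priority and that both `.env` and `~/.rlm/credentials` hold simple `KEY=VALUE` lines; the credentials format is a guess, `resolve_key` and `parse_key_file` are hypothetical names, and the package-root fallback for `.env` is omitted.

```python
import os
from pathlib import Path
from typing import Dict, Mapping, Optional


def parse_key_file(path: Path) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blanks and # comments (assumed format)."""
    values: Dict[str, str] = {}
    if not path.is_file():
        return values
    for line in path.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        values[name.strip()] = value.strip().strip('"').strip("'")
    return values


def resolve_key(
    name: str,
    env: Optional[Mapping[str, str]] = None,
    cwd: Optional[Path] = None,
    home: Optional[Path] = None,
) -> Optional[str]:
    """Return the first value found: shell env, then ./.env, then ~/.rlm/credentials."""
    env = os.environ if env is None else env
    cwd = Path.cwd() if cwd is None else Path(cwd)
    home = Path.home() if home is None else Path(home)
    if env.get(name):
        return env[name]
    dotenv = parse_key_file(cwd / ".env")
    if dotenv.get(name):
        return dotenv[name]
    return parse_key_file(home / ".rlm" / "credentials").get(name)


if __name__ == "__main__":
    found = resolve_key("ANTHROPIC_API_KEY")
    print("key found" if found else "no key configured")
```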

From Source

git clone https://github.com/viplismism/rlm-cli.git
cd rlm-cli
npm install
npm run build
npm link

Usage

Interactive Terminal

rlm

Persistent session with a two-column welcome panel showing your model, provider, context, and quick-ref slash commands. Everything auto-saves to a session folder.

Slash commands:

| Command | What it does |
|---------|-------------|
| /file <path> | Load file, directory, or glob as context |
| /url <url> | Fetch URL as context |
| @file <query> | Load file + run query in one step |
| /model [id] | List or switch model by ID (shows Ollama models too) |
| /provider | Switch provider (includes Ollama if running) |
| /trace | Open the live RLM trace window |
| /trajectories | Browse saved sessions |
| /clear | Clear the transcript |
| /help | Full command reference |
| /quit | Exit |

Tips:

  • Just type a question — no context needed for general queries
  • Paste a URL directly to fetch it as context
  • Ctrl+C stops a running query, Ctrl+C twice exits

Single-Shot Mode

rlm run "Explain recursive language models"
rlm run --file large-file.txt "List all classes and their methods"
rlm run --url https://example.com/data.txt "Summarize this"
cat data.txt | rlm run --stdin "Count the errors"
rlm run --model gpt-4o --file code.py "Find bugs"

Answer goes to stdout, progress to stderr — pipe-friendly.
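
Because the answer and the progress output use separate streams, a script can capture one and leave the other on the terminal. The wrapper below is a minimal sketch that uses only the flags shown above; `build_rlm_command` and `run_rlm` are hypothetical helper names, and `check=True` assumes rlm exits non-zero on failure.

```python
import subprocess
from typing import List, Optional


def build_rlm_command(
    query: str,
    file: Optional[str] = None,
    url: Optional[str] = None,
    model: Optional[str] = None,
) -> List[str]:
    """Build the argv for `rlm run` from the documented flags."""
    cmd = ["rlm", "run"]
    if model:
        cmd += ["--model", model]
    if file:
        cmd += ["--file", file]
    if url:
        cmd += ["--url", url]
    cmd.append(query)
    return cmd


def run_rlm(query: str, **options: Optional[str]) -> str:
    """Run rlm and return the answer; stderr is not captured, so progress stays visible."""
    result = subprocess.run(
        build_rlm_command(query, **options),
        stdout=subprocess.PIPE,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

With rlm installed, `run_rlm("Find bugs", file="code.py", model="gpt-4o")` would run the same command as the last example above and return its stdout as a string.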

Trajectory Viewer

rlm viewer

Browse saved runs in a TUI. Navigate iterations, inspect code and output at each step, drill into sub-queries. Sessions are saved to ~/.rlm/sessions/.


Benchmarks

Compare direct LLM vs RLM on the same query from standard long-context datasets.

| Benchmark | Dataset | What it tests |
|-----------|---------|---------------|
| oolong | Oolong Synth | Synthetic long-context: timeline ordering, user tracking, counting |
| longbench | LongBench NarrativeQA | Reading comprehension over long narratives |

rlm benchmark oolong          # default: index 4743
rlm benchmark longbench       # default: index 182
rlm benchmark oolong --idx 10

Python dependencies are auto-installed into a .venv on first run.

Note: rlm benchmark requires a source checkout of the repo (see From Source) — it is not available in npm installs.


How It Works

  1. Your full context is loaded into a persistent Python REPL as a context variable
  2. The LLM gets metadata about the context (size, preview) plus your query
  3. It writes Python code that can slice context, call llm_query(chunk, instruction) for sub-questions, and call FINAL(answer) when done
  4. Code runs, output is captured and fed back for the next iteration
  5. Loop continues until FINAL() is called or max iterations are reached
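
The five steps can be condensed into a toy loop. This is a simplified sketch, not the code in rlm.ts or runtime.py: it runs the generated code in-process with `exec` (the real tool uses a sandboxed subprocess), `root_llm` and `sub_llm` are hypothetical callables that take a prompt and return text, and raising an exception from `FINAL()` is just one way to stop the loop.

```python
import contextlib
import io
from typing import Callable, Optional


class Final(Exception):
    """Raised by FINAL() to end the loop (a stand-in for the real mechanism)."""

    def __init__(self, answer: str):
        super().__init__(answer)
        self.answer = answer


def rlm_loop(
    context: str,
    query: str,
    root_llm: Callable[[str], str],
    sub_llm: Callable[[str], str],
    max_iterations: int = 20,
    truncate_len: int = 5000,
) -> Optional[str]:
    def FINAL(answer: str) -> None:
        raise Final(answer)

    def llm_query(chunk: str, instruction: str) -> str:
        return sub_llm(f"{instruction}\n\n{chunk}")

    # Step 1: the context lives in a namespace that persists across iterations.
    namespace = {"context": context, "llm_query": llm_query, "FINAL": FINAL}
    # Step 2: the model sees metadata and the query, not the full context.
    history = [f"Query: {query}\nContext: {len(context)} chars, preview: {context[:80]!r}"]
    for _ in range(max_iterations):
        code = root_llm("\n".join(history))  # Step 3: the model writes Python
        buffer = io.StringIO()
        try:
            with contextlib.redirect_stdout(buffer):
                exec(code, namespace)  # unsafe for untrusted code; illustration only
        except Final as done:
            return done.answer  # Step 5: FINAL() was called
        except Exception as exc:
            buffer.write(f"{type(exc).__name__}: {exc}")  # errors are fed back too
        # Step 4: captured output, truncated, goes into the next prompt.
        history.append(f"Code:\n{code}\nOutput:\n{buffer.getvalue()[:truncate_len]}")
    return None  # Step 5: max iterations reached without an answer
```

Variables the model defines in one iteration stay in `namespace`, so a later iteration can build on them.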

For large documents, the model chunks the text and runs parallel sub-queries with async_llm_query() + asyncio.gather(), then aggregates the results.
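
The kind of code the model might write for this looks roughly like the sketch below. `async_llm_query` is stubbed here so the example runs on its own; its signature is assumed to match `llm_query(chunk, instruction)`, and the stub counts matching lines locally instead of calling a model. `chunk_lines` and `count_errors` are hypothetical names.

```python
import asyncio
from typing import List


def chunk_lines(text: str, lines_per_chunk: int) -> List[str]:
    """Split text into chunks of whole lines so no line is cut in half."""
    lines = text.splitlines()
    return [
        "\n".join(lines[i:i + lines_per_chunk])
        for i in range(0, len(lines), lines_per_chunk)
    ]


async def async_llm_query(chunk: str, instruction: str) -> str:
    """Stub for the runtime helper: counts ERROR lines instead of calling an LLM."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return str(sum(1 for line in chunk.splitlines() if "ERROR" in line))


async def count_errors(context: str, lines_per_chunk: int) -> int:
    chunks = chunk_lines(context, lines_per_chunk)
    # Run one sub-query per chunk concurrently; gather keeps the chunk order.
    partials = await asyncio.gather(
        *(async_llm_query(chunk, "Count the ERROR lines") for chunk in chunks)
    )
    # Aggregate the per-chunk results into a single answer.
    return sum(int(partial) for partial in partials)


if __name__ == "__main__":
    log = "\n".join(["ok", "ERROR a", "ok", "ERROR b", "ERROR c"])
    print(asyncio.run(count_errors(log, 2)))
```

Each chunk counts toward the sub-query budget (max_sub_queries in the config), so chunk size trades cost against how much text each sub-call has to read.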


Configuration

Create rlm_config.yaml in your working directory:

max_iterations: 20       # Max iterations before giving up (1-100)
max_depth: 1             # Fixed at 1 in the current runtime
max_sub_queries: 50      # Max total sub-queries (1-500)
truncate_len: 5000       # Truncate REPL output beyond this (500-50000)
metadata_preview_lines: 20

# Use a cheaper/faster model for sub-queries (paper: GPT-5-mini for sub-calls)
# sub_model: gpt-4o-mini
# sub_model: claude-haiku-3-5
# sub_model: llama3.1:8b        # Ollama model for free sub-queries!

The sub_model option is the key cost-saving trick from the paper — a fast cheap model handles the chunking work while the root model synthesizes the final answer.
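
As a rough picture of how these limits fit together, the sketch below normalizes an already-parsed config dict. It is not the logic in config.ts: it treats the sample values as defaults and clamps out-of-range numbers, both of which are assumptions (the real loader may reject them instead), and `normalize_config` is a hypothetical name.

```python
from typing import Any, Dict, Mapping

# Sample values from the config above, treated here as defaults (an assumption).
DEFAULTS: Dict[str, Any] = {
    "max_iterations": 20,
    "max_depth": 1,
    "max_sub_queries": 50,
    "truncate_len": 5000,
    "metadata_preview_lines": 20,
}

# Documented ranges for the bounded options.
RANGES = {
    "max_iterations": (1, 100),
    "max_sub_queries": (1, 500),
    "truncate_len": (500, 50000),
}


def normalize_config(raw: Mapping[str, Any]) -> Dict[str, Any]:
    """Merge user values over the defaults and clamp them into range."""
    config = {**DEFAULTS, **raw}
    for key, (low, high) in RANGES.items():
        config[key] = max(low, min(high, int(config[key])))
    config["max_depth"] = 1  # fixed at 1 in the current runtime
    config.setdefault("sub_model", None)  # None: sub-queries use the root model
    return config


if __name__ == "__main__":
    print(normalize_config({"max_iterations": 500, "sub_model": "gpt-4o-mini"}))
```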


Project Structure

src/
  main.ts          CLI entry point and command router
  interactive.ts   Interactive terminal REPL
  rlm.ts           Core RLM loop (Algorithm 1 from paper)
  repl.ts          Python REPL subprocess manager
  sandbox.ts       OS-level sandbox for the Python subprocess (Seatbelt/bubblewrap)
  runtime.py       Python runtime (FINAL, FINAL_VAR, llm_query, async_llm_query)
  cli.ts           Single-shot CLI mode
  viewer.ts        Trajectory viewer TUI
  colors.ts        Terminal color palette (Electric Amber RGB)
  ollama.ts        Ollama local model integration
  config.ts        Config loader
  env.ts           Environment variable loader
benchmarks/
  oolong_synth.ts
  longbench_narrativeqa.ts
  requirements.txt
bin/
  rlm.mjs          Global CLI shim

License

MIT