matryoshka-rlm

v0.2.6

Recursive Language Model - Process documents larger than LLM context windows

Recursive Language Model (RLM)

Process documents 100x larger than your LLM's context window—without vector databases or chunking heuristics.

The Problem

LLMs have fixed context windows. Traditional solutions (RAG, chunking) lose information or miss connections across chunks. RLM takes a different approach: the model reasons about your query and outputs symbolic commands that a logic engine executes against the document.

Based on the Recursive Language Models paper.

How It Works

Unlike traditional approaches where an LLM writes arbitrary code, RLM uses Nucleus—a constrained symbolic language based on S-expressions. The LLM outputs Nucleus commands, which are parsed, type-checked, and executed by Lattice, our logic engine.

┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│   User Query    │────▶│   LLM Reasons   │────▶│ Nucleus Command │
│ "total sales?"  │     │  about intent   │     │  (sum RESULTS)  │
└─────────────────┘     └─────────────────┘     └────────┬────────┘
                                                         │
┌─────────────────┐     ┌─────────────────┐     ┌────────▼────────┐
│  Final Answer   │◀────│ Lattice Engine  │◀────│     Parser      │
│   13,000,000    │     │    Executes     │     │    Validates    │
└─────────────────┘     └─────────────────┘     └─────────────────┘

Why this works better than code generation:

  1. Reduced entropy - Nucleus has a rigid grammar with fewer valid outputs than JavaScript
  2. Fail-fast validation - Parser rejects malformed commands before execution
  3. Safe execution - Lattice only executes known operations, no arbitrary code
  4. Small model friendly - 7B models handle symbolic grammars better than freeform code

Architecture

The Nucleus DSL

The LLM outputs commands in the Nucleus DSL—an S-expression language designed for document analysis:

; Search for patterns
(grep "SALES_DATA")

; Filter results
(filter RESULTS (lambda x (match x "NORTH" 0)))

; Aggregate
(sum RESULTS)    ; Auto-extracts numbers like "$2,340,000" from lines
(count RESULTS)  ; Count matching items

; Final answer
<<<FINAL>>>13000000<<<END>>>

The Lattice Engine

The Lattice engine (src/logic/) processes Nucleus commands:

  1. Parser (lc-parser.ts) - Parses S-expressions into an AST
  2. Type Inference (type-inference.ts) - Validates types before execution
  3. Constraint Resolver (constraint-resolver.ts) - Handles symbolic constraints like [Σ⚡μ]
  4. Solver (lc-solver.ts) - Executes commands against the document

Lattice uses miniKanren (a relational programming engine) for pattern classification and filtering operations.
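The pipeline above can be sketched in a few dozen lines. This is an illustrative toy, not the actual lc-parser.ts/lc-solver.ts implementation (which adds lambdas, type inference, and miniKanren-backed filtering); it only shows the parse → validate → execute shape with made-up helper names:

```typescript
// Toy sketch of the Lattice pipeline: tokenize -> parse -> validate -> execute.
type SExpr = string | SExpr[];

function tokenize(src: string): string[] {
  // Quoted strings, parens, and bare atoms.
  return src.match(/"[^"]*"|[()]|[^\s()"]+/g) ?? [];
}

function parse(tokens: string[]): SExpr {
  const tok = tokens.shift();
  if (tok === undefined) throw new Error("unexpected end of input");
  if (tok === "(") {
    const list: SExpr[] = [];
    while (tokens[0] !== ")") {
      if (tokens.length === 0) throw new Error("missing )");
      list.push(parse(tokens));
    }
    tokens.shift(); // consume ")"
    return list;
  }
  if (tok === ")") throw new Error("unexpected )");
  return tok;
}

// Fail-fast validation: only known operations are ever executed.
const KNOWN_OPS = new Set(["grep", "sum", "count"]);

function execute(expr: SExpr, doc: string[], results: string[]): unknown {
  if (!Array.isArray(expr)) return expr;
  const [op, ...args] = expr;
  if (typeof op !== "string" || !KNOWN_OPS.has(op)) {
    throw new Error(`unknown operation: ${String(op)}`);
  }
  switch (op) {
    case "grep": {
      const pattern = String(args[0]).replace(/^"|"$/g, "");
      return doc.filter((line) => new RegExp(pattern).test(line));
    }
    case "count":
      return results.length;
    case "sum":
      // Auto-extract numbers like "$2,340,000" from each line.
      return results.reduce((acc, line) => {
        const m = line.match(/-?[\d,]+(?:\.\d+)?/);
        return acc + (m ? parseFloat(m[0].replace(/,/g, "")) : 0);
      }, 0);
    default:
      throw new Error(`unimplemented: ${op}`);
  }
}

const doc = ["SALES_DATA NORTH $2,340,000", "SALES_DATA SOUTH $1,100,000"];
const results = execute(parse(tokenize('(grep "NORTH")')), doc, []) as string[];
console.log(execute(parse(tokenize("(sum RESULTS)")), doc, results)); // 2340000
```

A malformed command (unbalanced parens, unknown operation) fails at parse or validation time, before anything touches the document: this is the fail-fast property described above.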

SQLite Handle-Based Persistence

For large result sets, RLM uses a handle-based architecture (src/persistence/) that achieves 97%+ token savings:

Traditional:  LLM sees full array    [15,000 tokens for 1000 results]
Handle-based: LLM sees stub          [50 tokens: "$res1: Array(1000) [preview...]"]

How it works:

  1. Results are stored in SQLite with FTS5 full-text indexing
  2. LLM receives only handle references ($res1, $res2, etc.)
  3. Operations execute server-side, returning new handles
  4. Full data is only materialized when needed

Components:

  • SessionDB - In-memory SQLite with FTS5 for fast full-text search
  • HandleRegistry - Stores arrays, returns compact handle references
  • HandleOps - Server-side filter/map/count/sum on handles
  • FTS5Search - Phrase queries, boolean operators, relevance ranking
  • CheckpointManager - Save/restore session state
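The handle idea can be sketched as follows. This is a hypothetical simplification of handle-registry.ts (the real implementation is backed by SQLite with FTS5, not an in-process Map); the names here are illustrative only:

```typescript
// Sketch: store large arrays server-side; the LLM only ever sees a stub.
class HandleRegistry {
  private store = new Map<string, unknown[]>();
  private next = 1;

  register(values: unknown[]): string {
    const id = `$res${this.next++}`;
    this.store.set(id, values);
    return id;
  }

  // What the LLM sees: a ~50-token stub instead of the full array.
  stub(id: string, previewLen = 3): string {
    const values = this.store.get(id) ?? [];
    const preview = values.slice(0, previewLen).map(String).join(", ");
    return `${id}: Array(${values.length}) [${preview}...]`;
  }

  // Server-side operation: the raw data never enters the prompt.
  count(id: string): number {
    return (this.store.get(id) ?? []).length;
  }

  // Full data is only materialized when actually needed.
  materialize(id: string): unknown[] {
    return this.store.get(id) ?? [];
  }
}

const reg = new HandleRegistry();
const h = reg.register(Array.from({ length: 1000 }, (_, i) => `row ${i}`));
console.log(reg.stub(h));  // "$res1: Array(1000) [row 0, row 1, row 2...]"
console.log(reg.count(h)); // 1000
```

The token savings come from the asymmetry: `register` accepts arbitrarily large results, while every value returned to the model (`stub`, `count`) is constant-size.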

Pre-Search Optimization

Before calling the LLM, the system extracts keywords from your query and pre-runs grep:

Query: "What is the total of all north sales data values?"
                    │
                    ▼
┌─────────────────────────────────────────────────────┐
│ Pre-search extracts: "north", "sales", "data"       │
│ Tries compound patterns: SALES.*NORTH, NORTH.*SALES │
│ Pre-populates RESULTS before LLM is called          │
└─────────────────────────────────────────────────────┘
                    │
                    ▼
┌─────────────────────────────────────────────────────┐
│ LLM receives: "RESULTS has 1 match"                 │
│ LLM outputs: (sum RESULTS)  ← skips search step!   │
└─────────────────────────────────────────────────────┘

This saves turns by pre-populating RESULTS so the model can immediately aggregate.
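A minimal sketch of this step, under assumed logic (the actual keyword extraction and pattern ordering may differ):

```typescript
// Sketch: extract keywords, try selective compound patterns first,
// and pre-populate RESULTS before the LLM is ever called.
const STOPWORDS = new Set(["what", "is", "the", "total", "of", "all", "values"]);

function extractKeywords(query: string): string[] {
  return (query.toLowerCase().match(/[a-z]+/g) ?? [])
    .filter((w) => !STOPWORDS.has(w));
}

function preSearch(query: string, doc: string[]): string[] {
  const kws = extractKeywords(query).map((k) => k.toUpperCase());
  // Compound patterns (e.g. SALES.*NORTH) are more selective than single
  // keywords, so try every ordered pair before falling back.
  const patterns: string[] = [];
  for (const a of kws) for (const b of kws) if (a !== b) patterns.push(`${a}.*${b}`);
  patterns.push(...kws);
  for (const p of patterns) {
    const hits = doc.filter((line) => new RegExp(p).test(line));
    if (hits.length > 0) return hits; // becomes the pre-populated RESULTS
  }
  return [];
}

const doc = ["HEADER", "SALES_DATA NORTH $2,340,000", "SALES_DATA SOUTH $1,100,000"];
console.log(preSearch("What is the total of all north sales data values?", doc));
// → ["SALES_DATA NORTH $2,340,000"]
```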

The Role of the LLM

The LLM does reasoning, not code generation:

  1. Understands intent - Interprets "total of north sales" as needing grep + filter + sum
  2. Chooses operations - Decides which Nucleus commands achieve the goal
  3. Verifies results - Checks if the current results answer the query
  4. Iterates - Refines search if results are too broad or narrow

The LLM never writes JavaScript. It outputs Nucleus commands that Lattice executes safely.

Components Summary

| Component | Purpose |
|-----------|---------|
| Nucleus Adapter | Prompts LLM to output Nucleus commands |
| Lattice Parser | Parses S-expressions to AST |
| Lattice Solver | Executes commands against document |
| SQLite Persistence | Handle-based storage with FTS5 (97% token savings) |
| miniKanren | Relational engine for classification |
| Pre-Search | Extracts keywords and pre-runs grep |
| RAG Hints | Few-shot examples from past successes |

Installation

Install from npm:

npm install -g matryoshka-rlm

Or run without installing:

npx matryoshka-rlm "What is the total of all sales values?" ./report.txt

Included Tools

The package provides several CLI tools:

| Command | Description |
|---------|-------------|
| rlm | Main CLI for document analysis with LLM reasoning |
| rlm-mcp | MCP server exposing the analyze_document tool |
| lattice-mcp | MCP server exposing direct Nucleus commands (no LLM required) |
| lattice-repl | Interactive REPL for Nucleus commands |
| lattice-http | HTTP server for Nucleus queries |
| lattice-pipe | Pipe adapter for programmatic access |
| lattice-setup | Setup script for Claude Code integration |

From Source

git clone https://github.com/yogthos/Matryoshka.git
cd Matryoshka
npm install
npm run build

Configuration

Copy config.example.json to config.json and configure your LLM provider:

{
  "llm": {
    "provider": "ollama"
  },
  "providers": {
    "ollama": {
      "baseUrl": "http://localhost:11434",
      "model": "qwen2.5-coder:7b",
      "options": { "temperature": 0.2, "num_ctx": 8192 }
    },
    "deepseek": {
      "baseUrl": "https://api.deepseek.com",
      "apiKey": "${DEEPSEEK_API_KEY}",
      "model": "deepseek-chat",
      "options": { "temperature": 0.2 }
    }
  }
}

Usage

CLI

# Basic usage
rlm "What is the total of all sales values?" ./report.txt

# With options
rlm "Count all ERROR entries" ./logs.txt --max-turns 15 --verbose

# See all options
rlm --help

MCP Integration

RLM includes an MCP (Model Context Protocol) server that exposes the analyze_document tool. This allows coding agents to analyze documents that exceed their context window.

MCP Tool: analyze_document

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| query | string | Yes | The question or task to perform on the document |
| filePath | string | Yes | Absolute path to the document file |
| maxTurns | number | No | Maximum exploration turns (default: 10) |
| timeoutMs | number | No | Timeout per turn in milliseconds (default: 30000) |

Example MCP config

{
  "mcp": {
    "rlm": {
      "type": "stdio",
      "command": "rlm-mcp"
    }
  }
}

Testing the MCP Server

rlm-mcp --test
# Output: MCP server ready
# Output: Available tools: analyze_document

Lattice MCP Server

For direct access to the Nucleus engine without LLM orchestration, use lattice-mcp. This is useful when you want to run precise, programmatic queries with 80%+ token savings compared to reading files directly.

Lattice MCP Tools

| Tool | Description |
|------|-------------|
| lattice_load | Load a document for analysis |
| lattice_query | Execute Nucleus commands on the loaded document |
| lattice_close | Close the session and free memory |
| lattice_status | Get session status and document info |
| lattice_bindings | Show current variable bindings |
| lattice_reset | Reset bindings but keep document loaded |
| lattice_help | Get Nucleus command reference |

Example Lattice MCP config

{
  "mcp": {
    "lattice": {
      "type": "stdio",
      "command": "lattice-mcp"
    }
  }
}

Efficient Usage Pattern

1. lattice_load("/path/to/large-file.txt")   # Load document (use for >500 lines)
2. lattice_query('(grep "ERROR")')           # Search - shows preview of first 20
3. lattice_query('(filter RESULTS ...)')     # Narrow down - updates RESULTS
4. lattice_query('(count RESULTS)')          # Get count without listing all
5. lattice_close()                           # Free memory when done

Token efficiency tips:

  • Use (count RESULTS) instead of viewing all results
  • Chain grep → filter → count/sum to refine progressively
  • Results show a preview (first 20); use filter to narrow down
  • Previous results available as _1, _2, etc.

Programmatic

import { runRLM } from "matryoshka-rlm/rlm";
import { createLLMClient } from "matryoshka-rlm";

const llmClient = createLLMClient("ollama", {
  baseUrl: "http://localhost:11434",
  model: "qwen2.5-coder:7b",
  options: { temperature: 0.2 }
});

const result = await runRLM("What is the total of all sales values?", "./report.txt", {
  llmClient,
  maxTurns: 10,
  turnTimeoutMs: 30000,
});

Example Session

$ rlm "What is the total of all north sales data values?" ./report.txt --verbose

[Pre-search] Found 1 data matches for "SALES.*NORTH"
[Pre-search] RESULTS pre-populated with 1 matches

──────────────────────────────────────────────────
[Turn 1/10] Querying LLM...
[Turn 1] Term: (sum RESULTS)
[Turn 1] Console output:
  [Lattice] Summing 1 values
  [Lattice] Sum = 2340000
[Turn 1] Result: 2340000

──────────────────────────────────────────────────
[Turn 2/10] Querying LLM...
[Turn 2] Final answer received

2340000

The model:

  1. Received pre-populated RESULTS (pre-search found the data)
  2. Immediately summed the results (no grep needed)
  3. Output the final answer

Nucleus DSL Reference

Search Commands

(grep "pattern")              ; Regex search, returns matches with line numbers
(fuzzy_search "query" 10)     ; Fuzzy search, returns top N matches with scores
(text_stats)                  ; Document metadata (length, line count, samples)

Collection Operations

(filter RESULTS (lambda x (match x "pattern" 0)))  ; Filter by regex
(map RESULTS (lambda x (match x "(\\d+)" 1)))      ; Extract from each
(sum RESULTS)                                       ; Sum numbers in results
(count RESULTS)                                     ; Count items

String Operations

(match str "pattern" 0)       ; Regex match, return group N
(replace str "from" "to")     ; String replacement
(split str "," 0)             ; Split and get index
(parseInt str)                ; Parse integer
(parseFloat str)              ; Parse float

Type Coercion

When the model sees data that needs parsing, it can use declarative type coercion:

; Date parsing (returns ISO format YYYY-MM-DD)
(parseDate "Jan 15, 2024")           ; -> "2024-01-15"
(parseDate "01/15/2024" "US")        ; -> "2024-01-15" (MM/DD/YYYY)
(parseDate "15/01/2024" "EU")        ; -> "2024-01-15" (DD/MM/YYYY)

; Currency parsing (handles $, €, commas, etc.)
(parseCurrency "$1,234.56")          ; -> 1234.56
(parseCurrency "€1.234,56")          ; -> 1234.56 (EU format)

; Number parsing
(parseNumber "1,234,567")            ; -> 1234567
(parseNumber "50%")                  ; -> 0.5

; General coercion
(coerce value "date")                ; Coerce to date
(coerce value "currency")            ; Coerce to currency
(coerce value "number")              ; Coerce to number

; Extract and coerce in one step
(extract str "\\$[\\d,]+" 0 "currency")  ; Extract and parse as currency

Use in map for batch transformations:

; Parse all dates in results
(map RESULTS (lambda x (parseDate (match x "[A-Za-z]+ \\d+, \\d+" 0))))

; Extract and sum currencies
(map RESULTS (lambda x (parseCurrency (match x "\\$[\\d,]+" 0))))
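One plausible way to implement the locale-aware parsing behind (parseCurrency ...) is sketched below. This is an assumption about the logic, not the actual Nucleus built-in; the key trick is deciding which of "." and "," is the decimal separator:

```typescript
// Assumed sketch of locale-aware currency parsing ("$1,234.56" vs "€1.234,56").
function parseCurrency(raw: string): number {
  const s = raw.replace(/[^\d.,-]/g, ""); // strip $, €, spaces, etc.
  const lastDot = s.lastIndexOf(".");
  const lastComma = s.lastIndexOf(",");
  // A comma is the decimal separator only when it comes after any dot
  // and has at most two trailing digits ("1.234,56", not "1,234,567").
  if (lastComma > lastDot && s.length - lastComma - 1 <= 2) {
    return parseFloat(s.replace(/\./g, "").replace(",", "."));
  }
  return parseFloat(s.replace(/,/g, "")); // commas are thousands separators
}

console.log(parseCurrency("$1,234.56")); // 1234.56
console.log(parseCurrency("€1.234,56")); // 1234.56
```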

Program Synthesis

For complex transformations, the model can synthesize functions from examples:

; Synthesize from input/output pairs
(synthesize
  ("$100" 100)
  ("$1,234" 1234)
  ("$50,000" 50000))
; -> Returns a function that extracts numbers from currency strings

This uses Barliman-style relational synthesis with miniKanren to automatically build extraction functions.
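The search can be illustrated with a deliberately tiny stand-in. Instead of miniKanren's relational search, this sketch brute-forces a small candidate space of extractors and keeps the first one consistent with every input/output pair; the candidate names are invented for illustration:

```typescript
// Toy example-driven synthesis: pick the extractor that satisfies all pairs.
type Extractor = (s: string) => number;

const CANDIDATES: Array<[string, Extractor]> = [
  ["digits-with-commas",
    (s) => parseInt((s.match(/[\d,]+/) ?? ["0"])[0].replace(/,/g, ""), 10)],
  ["first-digit-run",
    (s) => parseInt((s.match(/\d+/) ?? ["0"])[0], 10)],
];

function synthesize(pairs: Array<[string, number]>): Extractor | null {
  for (const [name, fn] of CANDIDATES) {
    // Keep the first candidate consistent with every input/output pair.
    if (pairs.every(([input, expected]) => fn(input) === expected)) {
      console.log(`selected candidate: ${name}`);
      return fn;
    }
  }
  return null; // no candidate explains the examples
}

const fn = synthesize([["$100", 100], ["$1,234", 1234], ["$50,000", 50000]]);
console.log(fn?.("$2,340,000")); // 2340000
```

The real synthesis explores a much richer space relationally, but the contract is the same: examples in, consistent extraction function out.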

Cross-Turn State

Results from previous turns are available:

  • RESULTS - Latest array result (updated by grep, filter)
  • _0, _1, _2, ... - Results from specific turns

Final Answer

<<<FINAL>>>your answer here<<<END>>>

Troubleshooting

Model Answers Without Exploring

Symptom: The model provides an answer immediately with hallucinated data.

Solutions:

  1. Use a more capable model (7B+ recommended)
  2. Be specific in your query: "Find lines containing SALES_DATA and sum the dollar amounts"

Max Turns Reached

Symptom: "Max turns (N) reached without final answer"

Solutions:

  1. Increase --max-turns for complex documents
  2. Check --verbose output for repeated patterns (model stuck in loop)
  3. Simplify the query

Parse Errors

Symptom: "Parse error: no valid command"

Cause: The model emitted a malformed S-expression.

Solutions:

  1. The system auto-converts JSON to S-expressions as fallback
  2. Use --verbose to see what the model is generating
  3. Try a different model tuned for code/symbolic output

Development

npm test                              # Run tests
npm test -- --coverage                # With coverage
RUN_E2E=1 npm test -- tests/e2e.test.ts  # E2E tests (requires Ollama)
npm run build                         # Build
npm run typecheck                     # Type check

Project Structure

src/
├── adapters/           # Model-specific prompting
│   ├── nucleus.ts      # Nucleus DSL adapter
│   └── types.ts        # Adapter interface
├── logic/              # Lattice engine
│   ├── lc-parser.ts    # Nucleus parser
│   ├── lc-solver.ts    # Command executor (uses miniKanren)
│   ├── type-inference.ts
│   └── constraint-resolver.ts
├── persistence/        # SQLite handle-based storage (97% token savings)
│   ├── session-db.ts   # In-memory SQLite with FTS5
│   ├── handle-registry.ts  # Handle creation and stubs
│   ├── handle-ops.ts   # Server-side operations
│   ├── fts5-search.ts  # Full-text search
│   └── checkpoint.ts   # Session persistence
├── engine/             # Nucleus execution engine
│   └── nucleus-engine.ts
├── minikanren/         # Relational programming engine
├── synthesis/          # Program synthesis (Barliman-style)
│   └── evalo/          # Extractor DSL
├── rag/                # Few-shot hint retrieval
└── rlm.ts              # Main execution loop

Acknowledgements

This project incorporates ideas and code from:

  • Nucleus - A symbolic S-expression language by Michael Whitford. RLM uses Nucleus syntax for the constrained DSL that the LLM outputs, providing a rigid grammar that reduces model errors.
  • ramo - A miniKanren implementation in TypeScript by Will Lewis. Used for constraint-based program synthesis.
  • Barliman - A prototype smart editor by William Byrd and Greg Rosenblatt that uses program synthesis to assist programmers. The Barliman-style approach of providing input/output constraints instead of code inspired the synthesis workflow.

License

MIT
