Kado RLM - Recursive Language Model Library

A production-ready Node.js/TypeScript library implementing Recursive Language Models (RLMs) for handling arbitrarily long contexts. Based on the RLM research paper, this library enables LLMs to process inputs up to two orders of magnitude beyond their native context windows.

Features

  • RLM Orchestration: Treats long prompts as external environment data, allowing LLMs to programmatically examine, decompose, and recursively call themselves over context snippets
  • Pluggable Tool System: Register any RAG, knowledge base, database, or API as callable functions
  • Multi-Provider Support: OpenAI, Anthropic, and Google AI out of the box
  • Secure Sandbox: V8 isolates for safe execution of LLM-generated code
  • Full Observability: Prometheus metrics, Loki logging, and Tempo tracing via Grafana stack
  • Built-in Benchmarking: Compare RLM performance against base LLM calls
  • Production Ready: Circuit breakers, retry logic, rate limiting, and health checks

Installation

As a Library (Recommended)

npm install kado-rlm
# or
pnpm add kado-rlm
# or
yarn add kado-rlm

From Source

git clone https://github.com/your-org/kado-rlm.git
cd kado-rlm
pnpm install
pnpm build

Quick Start

Library Usage

import { 
  RLMOrchestrator, 
  ContextManager, 
  createLLMClient, 
  defineTools 
} from 'kado-rlm';

// 1. Create an LLM client
const llmClient = createLLMClient('openai', { model: 'gpt-4o' });

// 2. Create a context manager with your long content
const contextManager = new ContextManager(yourLongDocument);

// 3. (Optional) Define custom tools for RAG, databases, etc.
const tools = defineTools([
  {
    name: 'search_docs',
    description: 'Search the knowledge base for relevant information',
    parameters: [
      { name: 'query', type: 'string', description: 'Search query', required: true },
      { name: 'limit', type: 'number', description: 'Max results', default: 5 },
    ],
    handler: async (query: string, limit = 5) => {
      return await yourVectorDB.search(query, { topK: limit });
    },
  },
]);

// 4. Create the orchestrator
const orchestrator = new RLMOrchestrator({
  llmClient,
  contextManager,
  customTools: tools,
  maxIterations: 20,
  maxDepth: 2,
});

// 5. Run!
const result = await orchestrator.run('What are the key findings in this document?');

console.log(result.answer);
console.log(`Completed in ${result.usage.iterations} iterations`);
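
In practice the long content usually comes from a file or another store. Here is a minimal sketch of the same flow with a document loaded from disk (the path is a placeholder; everything else uses only the API shown above):

import { readFile } from 'node:fs/promises';
import { RLMOrchestrator, ContextManager, createLLMClient } from 'kado-rlm';

// Load a document that is far too large for a single context window.
const longDocument = await readFile('./reports/annual-report.txt', 'utf8');

const orchestrator = new RLMOrchestrator({
  llmClient: createLLMClient('openai', { model: 'gpt-4o' }),
  contextManager: new ContextManager(longDocument),
  maxIterations: 20,
  maxDepth: 2,
});

const result = await orchestrator.run('Summarize the key risks mentioned in this report.');
console.log(result.answer);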

Running as a Service

# Set up environment
cp env.example .env
# Edit .env with your API keys

# Start development server
pnpm dev

# Or production
pnpm build
pnpm start

Then make HTTP requests:

curl -X POST http://localhost:3000/v1/completion \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "What is the secret code mentioned in the text?",
    "context": "... your long context here ...",
    "provider": "openai",
    "model": "gpt-4o"
  }'
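
The same request can be made from Node/TypeScript. This sketch assumes Node 18+ (global fetch) and reuses the request fields from the curl example; the response shape is not documented here, so treat it as an assumption and adjust to what your deployment returns:

// Call the completion endpoint from TypeScript.
const res = await fetch('http://localhost:3000/v1/completion', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    prompt: 'What is the secret code mentioned in the text?',
    context: '... your long context here ...',
    provider: 'openai',
    model: 'gpt-4o',
  }),
});

if (!res.ok) {
  throw new Error(`RLM service returned ${res.status}`);
}

const body = await res.json(); // response shape depends on the deployed version
console.log(body);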

Custom Tools

The pluggable tool system lets you register any external service as a function the LLM can call during reasoning.

Defining Tools

import { defineTools } from 'kado-rlm';

const tools = defineTools([
  // RAG / Vector Search
  {
    name: 'rag_search',
    description: 'Search the vector database for semantically similar documents',
    parameters: [
      { name: 'query', type: 'string', description: 'Natural language search query', required: true },
      { name: 'topK', type: 'number', description: 'Number of results to return', default: 10 },
    ],
    returns: 'Array of { content, score, metadata }',
    handler: async (query: string, topK = 10) => {
      const embedding = await embeddings.embed(query);
      return await pinecone.query({ vector: embedding, topK });
    },
  },

  // Database Lookup
  {
    name: 'get_customer',
    description: 'Fetch customer details from the database',
    parameters: [
      { name: 'customerId', type: 'string', description: 'Customer ID', required: true },
    ],
    handler: async (customerId: string) => {
      return await db.customers.findById(customerId);
    },
  },

  // External API
  {
    name: 'check_weather',
    description: 'Get current weather for a location',
    parameters: [
      { name: 'city', type: 'string', description: 'City name', required: true },
    ],
    handler: async (city: string) => {
      const response = await fetch(`https://api.weather.com/v1/current?city=${encodeURIComponent(city)}`);
      return response.json();
    },
  },
]);

The LLM can then use these tools in its generated code:

// LLM-generated sandbox code
const docs = await rag_search("authentication flow", 5);
const customer = await get_customer("cust_12345");

for (const doc of docs) {
  print(`Found: ${doc.content.slice(0, 100)}...`);
}

giveFinalAnswer({
  message: "Based on the documentation and customer data...",
  data: { sources: docs.map(d => d.metadata.source) }
});

See the Tools Guide for detailed documentation on registering tools, and the RAG Integration Guide for RAG-specific patterns.

API Endpoints

| Method | Path               | Description                           |
|--------|--------------------|---------------------------------------|
| POST   | /v1/completion     | Run RLM completion with context       |
| POST   | /v1/chat           | Direct LLM call (baseline comparison) |
| POST   | /v1/benchmark      | Start benchmark run                   |
| GET    | /v1/benchmark/:id  | Get benchmark results                 |
| GET    | /v1/models         | List available models                 |
| GET    | /health            | Liveness probe                        |
| GET    | /ready             | Readiness probe                       |
| GET    | /metrics           | Prometheus metrics                    |
| GET    | /docs              | Swagger UI documentation              |

Benchmarking

The built-in benchmark system compares RLM performance against direct LLM calls.

Via API

curl -X POST http://localhost:3000/v1/benchmark \
  -H "Content-Type: application/json" \
  -d '{
    "tasks": ["sniah", "multi-niah", "aggregation"],
    "sizes": [8000, 16000, 32000, 64000],
    "provider": "openai",
    "model": "gpt-4o",
    "runs": 3
  }'

Via CLI

# Run benchmark suite
pnpm benchmark --tasks sniah,aggregation --sizes 8000,16000,32000 --provider openai --model gpt-4o

# Output as JSON
pnpm benchmark --output json > results.json

Task Types

| Task        | Description               | Complexity |
|-------------|---------------------------|------------|
| sniah       | Single needle-in-haystack | Constant   |
| multi-niah  | Multiple needles          | Linear     |
| aggregation | Count/sum across context  | Linear     |
| pairwise    | Find matching pairs       | Quadratic  |

Observability

Local Development with Grafana Stack

cd docker
docker-compose up -d

# Access services:
# - Kado RLM: http://localhost:3000
# - Grafana: http://localhost:3001 (admin/admin)
# - Prometheus: http://localhost:9090

Metrics

Key metrics exposed at /metrics:

  • rlm_request_duration_seconds - Request latency histogram
  • rlm_iterations_total - RLM iteration count
  • rlm_recursion_depth - Recursion depth distribution
  • rlm_tokens_total - Token usage by type
  • rlm_errors_total - Error counts by type
  • rlm_circuit_breaker_state - Circuit breaker status

Logging

Structured JSON logs with correlation IDs. In development, pretty-printed via pino-pretty.

Tracing

OpenTelemetry traces exported to Tempo:

  • Span per API request
  • Child spans for LLM calls, sandbox executions, recursive sub-calls
  • Automatic trace ID propagation

Configuration

Configure via environment variables (see env.example):

# Server
PORT=3000
NODE_ENV=development

# LLM Providers
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
GOOGLE_API_KEY=...
DEFAULT_PROVIDER=openai
DEFAULT_MODEL=gpt-4o

# RLM Settings
MAX_ITERATIONS=20
MAX_RECURSION_DEPTH=3
SANDBOX_TIMEOUT_MS=5000
SANDBOX_MEMORY_MB=128

# Observability
METRICS_ENABLED=true
LOKI_ENABLED=false
TRACING_ENABLED=false
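
When you embed the library instead of running the service, the RLM settings above correspond to the orchestrator options shown in Quick Start. A minimal sketch of that mapping (only maxIterations and maxDepth are options documented in this README; yourLongDocument is the same placeholder used in Quick Start):

import { RLMOrchestrator, ContextManager, createLLMClient } from 'kado-rlm';

// Pull the RLM settings from the environment, falling back to the documented defaults.
const orchestrator = new RLMOrchestrator({
  llmClient: createLLMClient('openai', { model: process.env.DEFAULT_MODEL ?? 'gpt-4o' }),
  contextManager: new ContextManager(yourLongDocument),
  maxIterations: Number(process.env.MAX_ITERATIONS ?? 20),
  maxDepth: Number(process.env.MAX_RECURSION_DEPTH ?? 3),
});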

Architecture

┌─────────────────────────────────────────────────────────┐
│                      API Layer                          │
│  (Fastify + Rate Limiting + Auth + Swagger)            │
└─────────────────────────────────────────────────────────┘
                           │
                           ▼
┌─────────────────────────────────────────────────────────┐
│                   RLM Orchestrator                       │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐     │
│  │   Context   │  │   Sandbox   │  │   Custom    │     │
│  │   Manager   │  │ (V8 Isolate)│  │   Tools     │     │
│  └─────────────┘  └─────────────┘  └─────────────┘     │
└─────────────────────────────────────────────────────────┘
                           │
                           ▼
┌─────────────────────────────────────────────────────────┐
│                    LLM Providers                         │
│  ┌────────┐  ┌───────────┐  ┌────────┐                 │
│  │ OpenAI │  │ Anthropic │  │ Google │                 │
│  └────────┘  └───────────┘  └────────┘                 │
└─────────────────────────────────────────────────────────┘

Development

# Install dependencies
pnpm install

# Run in development mode
pnpm dev

# Type check
pnpm typecheck

# Run tests
pnpm test

# Run tests with coverage
pnpm test:coverage

# Build for production
pnpm build

# Start production server
pnpm start

Publishing to npm

First-Time Setup

  1. Create an npm account at npmjs.com

  2. Log in to npm:

    npm login

  3. Update package.json:

    • Change name if kado-rlm is taken (e.g., @your-org/kado-rlm)
    • Update repository.url to your actual repo
    • Set author field

Publishing

# 1. Make sure tests pass
pnpm test:run

# 2. Build the package
pnpm build

# 3. Verify what will be published
npm pack --dry-run

# 4. Publish (first time)
npm publish

# 5. For scoped packages (@your-org/kado-rlm)
npm publish --access public

Versioning

# Patch release (bug fixes): 0.1.0 → 0.1.1
npm version patch

# Minor release (new features): 0.1.0 → 0.2.0
npm version minor

# Major release (breaking changes): 0.1.0 → 1.0.0
npm version major

# Then publish
npm publish

Automated Publishing (GitHub Actions)

Create .github/workflows/publish.yml:

name: Publish to npm

on:
  release:
    types: [created]

jobs:
  publish:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: pnpm/action-setup@v2
        with:
          version: 8
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
          registry-url: 'https://registry.npmjs.org'
      - run: pnpm install
      - run: pnpm test:run
      - run: pnpm build
      - run: npm publish
        env:
          NODE_AUTH_TOKEN: ${{ secrets.NPM_TOKEN }}

Add your npm token as a GitHub secret named NPM_TOKEN.

Production Deployment

Docker

# Build image
docker build -f docker/Dockerfile -t kado-rlm .

# Run container
docker run -p 3000:3000 \
  -e OPENAI_API_KEY=sk-... \
  -e NODE_ENV=production \
  kado-rlm

Health Checks

  • /health - Basic liveness (process running)
  • /ready - Readiness (providers configured, memory OK)

Configure Kubernetes probes:

livenessProbe:
  httpGet:
    path: /health
    port: 3000
  initialDelaySeconds: 5
  periodSeconds: 10

readinessProbe:
  httpGet:
    path: /ready
    port: 3000
  initialDelaySeconds: 10
  periodSeconds: 5

Stability Features

Circuit Breaker

Automatic circuit breaking for LLM provider failures:

  • Opens after 5 consecutive failures
  • Half-open after 30s cooldown
  • Per-provider tracking
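
The behavior described above is standard circuit breaking. As a conceptual sketch of that policy (not the library's internal implementation):

// Conceptual sketch of the described policy: open after 5 consecutive failures,
// move to half-open after a 30s cooldown, track state per provider.
type BreakerState = 'closed' | 'open' | 'half-open';

class ProviderBreaker {
  private state: BreakerState = 'closed';
  private failures = 0;
  private openedAt = 0;

  canRequest(now = Date.now()): boolean {
    if (this.state === 'open' && now - this.openedAt >= 30_000) {
      this.state = 'half-open'; // allow a single trial request after the cooldown
    }
    return this.state !== 'open';
  }

  recordSuccess(): void {
    this.failures = 0;
    this.state = 'closed';
  }

  recordFailure(now = Date.now()): void {
    this.failures += 1;
    if (this.state === 'half-open' || this.failures >= 5) {
      this.state = 'open';
      this.openedAt = now;
    }
  }
}

// One breaker per provider.
const breakers = new Map<string, ProviderBreaker>([
  ['openai', new ProviderBreaker()],
  ['anthropic', new ProviderBreaker()],
  ['google', new ProviderBreaker()],
]);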

Retry Logic

Exponential backoff with jitter for transient errors:

  • 3 retries by default
  • Handles rate limits, timeouts, 5xx errors
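
Again as a rough sketch of the described policy (three retries, exponential backoff with jitter), rather than the library's actual code:

// Conceptual sketch: retry transient errors with exponential backoff plus jitter.
// In practice only rate limits, timeouts, and 5xx responses should be retried.
async function withRetry<T>(fn: () => Promise<T>, retries = 3): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt >= retries) throw err;    // out of retries, surface the error
      const base = 500 * 2 ** attempt;      // 500ms, 1s, 2s, ...
      const jitter = Math.random() * base;  // spread retries out to avoid thundering herd
      await new Promise((resolve) => setTimeout(resolve, base + jitter));
    }
  }
}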

Resource Limits

| Resource            | Default | Configurable |
|---------------------|---------|--------------|
| Max iterations      | 20      | Yes          |
| Max recursion depth | 3       | Yes          |
| Sandbox CPU time    | 5s      | Yes          |
| Sandbox memory      | 128MB   | Yes          |
| Request timeout     | 300s    | Yes          |
| Max context size    | 10MB    | Yes          |

License

MIT