Easy RLM
Transform Claude Code into a reliable software engineering partner.
"The key insight is that long prompts should not be fed into the neural network directly but should instead be treated as part of the environment that the LLM can symbolically interact with." — Recursive Language Models (MIT CSAIL, 2025)
The Problem
Every developer using AI coding assistants has experienced this:
| Problem | What Happens |
|---------|--------------|
| Context Rot | Model quality degrades as conversation grows. GPT-5 drops from 100% to 20% accuracy on complex tasks as context increases. |
| Session Amnesia | Start a new session? The AI forgot everything about your project. |
| Hallucination Spiral | AI confidently generates code that doesn't match your codebase patterns. |
| Compaction Loss | When context is "summarized", critical details vanish forever. |
| Inconsistent Quality | Sometimes brilliant, sometimes useless. No reliability. |
The research is clear: Base LLMs fail catastrophically on long-context, information-dense tasks. On the OOLONG-Pairs benchmark, GPT-5 and Qwen3-Coder score <0.1% — essentially random.
The Solution: Recursive Language Models
MIT CSAIL's RLM paper demonstrates a paradigm shift:
```
Traditional LLM:  [Entire Context] → Neural Network → Response
                  (context rot, hallucinations, lost details)

RLM Approach:     [Context as Environment] → Symbolic Interaction → Focused Queries
                  (maintains accuracy at 10M+ tokens)
```

Results from the Paper
| Benchmark | Base GPT-5 | RLM (GPT-5) | Improvement |
|-----------|------------|-------------|-------------|
| BrowseComp+ (6-11M tokens) | 0%* | 91.33% | ∞ (couldn't even run) |
| OOLONG (131K tokens) | 44% | 56.50% | +28.4% |
| OOLONG-Pairs (32K tokens) | 0.04% | 58.00% | +1,450x |
| CodeQA (23K-4.2M tokens) | 24%* | 62.00% | +158% |
* Base model hit context limits or failed completely
Key insight: RLMs maintain strong performance even at 10M+ tokens while base models collapse.
Easy RLM: RLM Principles for Claude Code
Easy RLM implements RLM principles as a practical workflow system:
1. Context as External Environment
Instead of stuffing everything into the prompt, Easy RLM uses Kakaroto Fields — a 5-tier persistent context system:
```
.claude/context/
├── project.md        # Always loaded (project identity)
├── architecture.md   # Loaded for structural tasks
├── patterns.md       # Loaded when creating code
├── knowledge.md      # Searchable by keywords
└── current.md        # Session state (survives restarts)
```

Why it works: The AI loads only what's needed, keeping the working context small and focused. Like an RLM's REPL environment, context is accessed programmatically, not dumped wholesale.
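To make the tiering concrete, here is a hypothetical loader sketch, not the actual Easy RLM implementation: it assumes a simple mapping from task type to tiers and concatenates only those files into the working context.

```js
// Hypothetical illustration: selectively load Kakaroto Field tiers by task type.
// The file names match the tree above; the loading rules are assumptions.
const fs = require("fs");
const path = require("path");

const CONTEXT_DIR = ".claude/context";

// Which tiers a given task type pulls in (project.md is always loaded).
const TIER_RULES = {
  feature: ["architecture.md", "patterns.md", "current.md"],
  debug: ["architecture.md", "knowledge.md", "current.md"],
  review: ["patterns.md", "knowledge.md"],
};

function loadContext(taskType) {
  const tiers = ["project.md", ...(TIER_RULES[taskType] || [])];
  return tiers
    .map((file) => path.join(CONTEXT_DIR, file))
    .filter((p) => fs.existsSync(p))          // skip tiers that don't exist yet
    .map((p) => fs.readFileSync(p, "utf8"))
    .join("\n\n---\n\n");                     // concatenate only what's needed
}

// Feed only the relevant tiers into the prompt, not the whole directory.
console.log(loadContext("debug").length, "characters of context loaded");
```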
2. Recursive Decomposition
Complex tasks are broken into phases with specialized sub-agents:
```
/feature "add user authentication"
│
├── Phase 1: Understand (interview, requirements)
├── Phase 2: Analyze (explore codebase)
├── Phase 3: Strategy (test approach)
├── Phase 4: RED (write failing tests)
├── Phase 5: GREEN (implement)
├── Phase 6: Quality (refactor + validate)
├── Phase 7: E2E Validation
├── Phase 8: Delivery (commit + sync)
└── Phase 9: Self-Evaluation (improve system)
```

Each phase has focused context. No context rot.
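A minimal sketch of that decomposition loop, assuming a callAgent() runner (not an Easy RLM API) that starts each phase as a fresh sub-agent and passes only a short hand-off between phases:

```js
// Hypothetical sketch: each phase runs with its own small, focused context
// instead of one ever-growing conversation.
const PHASES = [
  "Understand", "Analyze", "Strategy", "RED", "GREEN",
  "Quality", "E2E Validation", "Delivery", "Self-Evaluation",
];

async function runFeature(task, callAgent) {
  let handoff = { task, notes: [] };               // only summaries cross phase boundaries
  for (const phase of PHASES) {
    // callAgent is assumed to spin up a sub-agent with a clean context window.
    const result = await callAgent({ phase, handoff });
    handoff = { ...handoff, notes: [...handoff.notes, result.summary] };
  }
  return handoff;
}
```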
3. Symbolic Interaction with Codebase
Instead of the AI guessing about your code, Easy RLM workflows follow a simple discipline (sketched in code after this list):
- Search before assuming
- Read before modifying
- Validate before committing
- Test before declaring done
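As code, that discipline is a guarded edit flow. A hypothetical sketch; none of these helpers are Easy RLM APIs, they stand in for the workflow's tool calls:

```js
// Hypothetical guarded edit flow; every helper is a placeholder for a tool call.
async function safeEdit(tools, change) {
  const hits = await tools.search(change.symbol);        // search before assuming
  if (hits.length === 0) {
    throw new Error(`No definition found for ${change.symbol}`);
  }
  const source = await tools.read(hits[0].file);         // read before modifying
  await tools.applyPatch(source, change);
  if (!(await tools.runTests()).passed) {                // test before declaring done
    throw new Error("Tests failed; refusing to commit"); // validate before committing
  }
  return tools.commit(change.message);
}
```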
4. Persistent Memory Across Sessions
```
# Session 1: Implement feature
> /feature "add payment processing"
# ... AI learns about your Stripe integration, error patterns, etc.

# Session 2 (days later): Debug issue
> /debug "payment failing for international cards"
# AI already knows your payment architecture from context/knowledge.md
```

No more re-explaining your project every session.
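The memory itself is nothing exotic: appending to a Git-tracked markdown file is enough. A hypothetical sketch (recordKnowledge is illustrative, not an Easy RLM API):

```js
// Hypothetical sketch: persist a discovery so the next session starts with it.
const fs = require("fs");

function recordKnowledge(topic, note) {
  const entry = `\n## ${topic}\n- ${note} (recorded ${new Date().toISOString().slice(0, 10)})\n`;
  fs.appendFileSync(".claude/context/knowledge.md", entry, "utf8");
}

recordKnowledge(
  "Payments",
  "Stripe webhooks are retried; international cards require 3DS handling"
);
```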
Installation
```
npx easyrlm
```

That's it. One command that handles every case:
- New project? Full installation
- Existing setup? Smart update (preserves your customizations)
- Broken files? Auto-repair
- Already configured? Status report
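Under the hood, those four cases amount to a dispatch on detected project state. A hypothetical sketch; the function names are placeholders, not the actual exports of lib/detector.js and friends:

```js
// Hypothetical dispatch for a single `npx easyrlm` run.
// detectState/install/update/repair/report are assumed, injected helpers.
async function run({ detectState, install, update, repair, report }) {
  const state = await detectState(process.cwd()); // e.g. "fresh" | "outdated" | "broken" | "ok"
  switch (state) {
    case "fresh":    return install();  // new project: full installation
    case "outdated": return update();   // existing setup: smart update, keep customizations
    case "broken":   return repair();   // damaged files: auto-repair
    default:         return report();   // already configured: status report
  }
}
```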
What Gets Installed
```
.claude/
├── CLAUDE.md          # Your rules (preserved on updates)
├── ARCHITECTURE.md    # System overview
├── commands/
│   ├── feature.md     # 9-phase feature workflow
│   ├── debug.md       # 6-phase debug workflow
│   ├── gate.md        # Quality validation
│   └── [playbooks]    # Category-specific guides
├── agents/            # 8 specialized sub-agents
├── context/           # Kakaroto Fields (your project memory)
└── templates/         # Reusable templates
```

64 files, 15,000+ lines of battle-tested workflow logic.
Usage
Implementing Features
```
claude
> /feature "add dark mode toggle"
```

The AI will:
- Interview you about requirements
- Analyze your existing codebase
- Propose a test strategy (TDD)
- Write failing tests first
- Implement minimal code to pass
- Refactor and validate quality
- Run E2E validation
- Commit with proper message
- Self-evaluate and improve
No more "here's some code, good luck" responses.
Debugging Issues
```
claude
> /debug "users can't log in after password reset"
```

The AI will:
- Reproduce the bug (with evidence)
- Investigate root cause (not symptoms)
- Implement surgical fix
- Verify fix + create regression test
- Commit and document
- Evaluate debugging process
No more "try adding a console.log" suggestions.
Quality Gate
```
claude
> /gate
```

Runs comprehensive validation:
- All tests pass
- TypeScript compiles
- Build succeeds
- Code review (security, types, quality)
- Visual validation (if UI changes)
- Environment consistency (if config changes)
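A gate like this is just a fail-fast loop over checks. A hypothetical sketch, assuming a typical npm setup; the commands are common defaults, not ones Easy RLM is guaranteed to run:

```js
// Hypothetical quality gate: run each check, stop on the first failure.
const { execSync } = require("child_process");

const CHECKS = [
  ["Tests", "npm test"],
  ["TypeScript", "npx tsc --noEmit"],
  ["Build", "npm run build"],
];

for (const [name, cmd] of CHECKS) {
  try {
    execSync(cmd, { stdio: "inherit" });
    console.log(`✔ ${name} passed`);
  } catch (err) {
    console.error(`✘ ${name} failed; gate blocked`);
    process.exit(1);
  }
}
```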
Why Easy RLM Works
The Science
From the RLM paper:
"RLMs demonstrate extremely strong performance even at the 10M+ token scale, and dramatically outperform all other approaches at long-context processing, in most cases by double-digit percentage gains while maintaining a comparable or lower cost."
Easy RLM applies these principles:
| RLM Principle | Easy RLM Implementation |
|---------------|-------------------------|
| Context as environment | Kakaroto Fields (.claude/context/) |
| Recursive sub-calls | Specialized agents (code-reviewer, test-fixer, etc.) |
| Symbolic manipulation | Workflow phases with focused queries |
| REPL persistence | Git-versioned markdown files |
The Guarantee
Unlike MCP Memory servers that:
- Require running servers
- Can timeout or crash
- Lose data on compaction
- Only work locally
Easy RLM uses files + Git:
- No servers required
- No timeouts possible
- Git preserves everything
- Clone = full memory transfer
The Vision
We believe AI-assisted development should be:
- Reliable — Same quality every time, not random brilliance
- Persistent — Knowledge accumulates, never resets
- Structured — Clear workflows, not chaos
- Trustworthy — Test-driven, validated, verified
Easy RLM is our contribution to making this real.
This is an open-source project for the community. We want every developer using Claude Code to experience what reliable AI assistance feels like.
Comparison
| Aspect | Vanilla Claude Code | Easy RLM |
|--------|---------------------|----------|
| Context management | Compaction (lossy) | Kakaroto Fields (persistent) |
| Session continuity | Lost on restart | Preserved in context/ |
| Code quality | Variable | Enforced (TDD, quality gates) |
| Debugging | Ad-hoc suggestions | 6-phase systematic workflow |
| Features | "Here's code" | 9-phase TDD workflow |
| Learning | Starts fresh | Accumulates in knowledge.md |
New in v1.1: Closer to Paper RLM
Version 1.1 adds four key improvements based on the RLM paper recommendations:
1. Benchmark Suite
Compare Easy RLM vs base model performance:
```
npm run benchmark            # Run all benchmarks
npm run benchmark:compare    # Side-by-side comparison
node benchmark/bin/benchmark.js --help
```

Includes implementations of:
- S-NIAH — Needle-in-a-haystack (O(1) retrieval)
- OOLONG — Semantic aggregation (O(N) complexity)
- OOLONG-Pairs — Pairwise comparison (O(N²) complexity)
- CodeQA — Code understanding
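For a feel of what these measure, here is a hypothetical S-NIAH-style case generator; the bundled suite's actual case format may differ:

```js
// Hypothetical S-NIAH-style case: plant one "needle" fact in a large synthetic
// context and check whether the answer retrieves it exactly.
function makeNeedleCase({ haystackTokens = 100_000 } = {}) {
  const needle = "The deploy password for staging is korin-7421.";
  const filler = "Lorem ipsum dolor sit amet. ".repeat(Math.floor(haystackTokens / 6));
  const insertAt = Math.floor(Math.random() * filler.length);
  return {
    context: filler.slice(0, insertAt) + needle + filler.slice(insertAt),
    question: "What is the deploy password for staging?",
    score: (answer) => (answer.includes("korin-7421") ? 1 : 0), // O(1) retrieval check
  };
}
```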
2. Auto-Decomposition
The context manager now auto-triggers RLM decomposition for large codebases:
| Project Size | Trigger | Strategy |
|--------------|---------|----------|
| < 100 files | — | Direct queries |
| 100-200 files | Automatic | Chunked decomposition |
| > 200 files | Automatic | Recursive sub-agents |
No manual configuration needed.
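As a hypothetical decision function (not the actual lib/decomposer.js logic), the thresholds in the table reduce to:

```js
// Size-based trigger matching the table above; the function itself is illustrative.
function chooseStrategy(fileCount) {
  if (fileCount < 100) return "direct";     // query the codebase directly
  if (fileCount <= 200) return "chunked";   // split into chunks, query each
  return "recursive";                       // delegate chunks to sub-agents
}
```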
3. Dynamic Context Refresh
Workflows now auto-refresh context when:
- Files are not found where expected
- Functions are undefined
- Types don't match
- Working across 3+ modules
The system re-queries the codebase and updates knowledge.md with discoveries.
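A hypothetical sketch of that trigger, with all helpers as placeholders:

```js
// Hypothetical refresh trigger: when a lookup misses, re-query the codebase and
// record the discovery instead of guessing.
async function resolveFile(expectedPath, { exists, rescanCodebase, recordKnowledge }) {
  if (await exists(expectedPath)) return expectedPath;
  const actual = await rescanCodebase(expectedPath);   // re-query the codebase
  await recordKnowledge("File moves", `${expectedPath} now lives at ${actual}`);
  return actual;                                       // updated path flows back into the workflow
}
```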
4. Recursive Sub-Agents
Agents can now invoke other agents when needed:
```
code-reviewer
├── can invoke → test-fixer (when fixes break tests)
└── can invoke → Explore (when reviewing unfamiliar code)

test-fixer
└── can invoke → Explore (when tests reference unknown modules)
```

Maximum recursion depth: 2-3 levels (prevents infinite loops).
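Enforcing the cap can be as simple as threading a depth counter through delegation. A hypothetical sketch (invoke() stands in for the agent runner):

```js
// Hypothetical depth-limited delegation; MAX_DEPTH mirrors the 2-3 level limit above.
const MAX_DEPTH = 3;

async function callAgent(name, task, invoke, depth = 0) {
  if (depth >= MAX_DEPTH) {
    throw new Error(`Recursion limit reached while invoking ${name}`);
  }
  return invoke(name, task, {
    // Sub-agents get a delegate bound to depth + 1, so invocation chains stay bounded.
    delegate: (subName, subTask) => callAgent(subName, subTask, invoke, depth + 1),
  });
}
```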
Project Structure
```
easyrlm/
├── bin/cli.js                # CLI entry point
├── lib/
│   ├── detector.js           # Detect project state
│   ├── installer.js          # Full installation
│   ├── updater.js            # Smart updates
│   ├── fixer.js              # Repair broken files
│   ├── merger.js             # Merge CLAUDE.md intelligently
│   ├── validator.js          # Validate installation
│   ├── decomposer.js         # Auto-decomposition logic
│   ├── context-refresh.js    # Dynamic refresh system
│   └── recursive-agents.js   # Agent coordination
├── benchmark/
│   ├── bin/benchmark.js      # Benchmark CLI
│   ├── lib/                  # Runner, evaluators, reporter
│   ├── adapters/             # Claude base vs RLM adapters
│   └── benchmarks/           # S-NIAH, OOLONG, CodeQA
├── templates/.claude/        # All workflow files
└── package.json
```

Contributing
We welcome contributions! Areas of interest:
- New playbooks — Domain-specific workflows (mobile, ML, etc.)
- Agent improvements — Smarter sub-agents
- Translations — Help translate remaining files to English
- Documentation — Tutorials, examples, case studies
Research Foundation
Easy RLM is inspired by and builds upon:
- Recursive Language Models (Zhang, Kraska, Khattab — MIT CSAIL, 2025)
- Context Rot Research (Hong et al., 2025)
- OOLONG Benchmark (Bertsch et al., 2025)
License
MIT — Use freely, contribute back.
