ralph-cursor
v0.1.0
CLI for Ralph Wiggum autonomous development loop for Cursor
Ralph Wiggum for Cursor
An implementation of Geoffrey Huntley's Ralph Wiggum technique for Cursor, enabling autonomous AI development with deliberate context management.
"That's the beauty of Ralph - the technique is deterministically bad in an undeterministic world."
What is Ralph?
Ralph is a technique for autonomous AI development that treats LLM context like memory:
while :; do cat PROMPT.md | agent ; done
The same prompt is fed repeatedly to an AI agent. Progress persists in files and git, not in the LLM's context window. When context fills up, you get a fresh agent with fresh context.
The malloc/free Problem
In traditional programming:
- malloc() allocates memory
- free() releases memory
In LLM context:
- Reading files, tool outputs, conversation = malloc()
- There is no free() - context cannot be selectively released
- Only way to free: start a new conversation
This creates two problems:
- Context pollution - Failed attempts, unrelated code, and mixed concerns accumulate and confuse the model
- The gutter - Once polluted, the model keeps referencing bad context. Like a bowling ball in the gutter, there's no saving it.
Ralph's solution: Deliberately rotate to fresh context before pollution builds up. State lives in files and git, not in the LLM's memory.
Architecture
┌─────────────────────────────────────────────────────────────┐
│ ralph-cursor run / ralph-cursor loop │
│ │ │
│ ┌────────────┴────────────┐ │
│ ▼ ▼ │
│ [interactive] [fallback] │
│ Model selection Simple prompts │
│ Max iterations │
│ Options (branch, PR) │
│ │ │ │
│ └────────────┬────────────┘ │
│ ▼ │
│ cursor-agent -p --force --output-format stream-json │
│ │ │
│ ▼ │
│ stream parser (Node) │
│ │ │ │
│ ┌────────────────┴────────┴────────────────┐ │
│ ▼ ▼ │
│ .ralph/ Signals │
│ ├── activity.log (tool calls) ├── WARN at 70k │
│ ├── errors.log (failures) ├── ROTATE at 80k│
│ ├── progress.md (agent writes) ├── COMPLETE │
│ ├── guardrails.md (lessons learned) ├── GUTTER │
│ └── tasks.yaml (cached task state) └── DEFER │
│ │
│ When ROTATE → fresh context, continue from git │
│ When DEFER → exponential backoff, retry same task │
└─────────────────────────────────────────────────────────────┘
Key features:
- Interactive setup - Prompts for model selection and options (ralph-cursor run)
- Accurate token tracking - Parser counts actual bytes from every file read/write
- Gutter detection - Detects when agent is stuck (same command failed 3x, file thrashing)
- Rate limit handling - Detects rate limits/network errors, waits with exponential backoff
- Task caching - YAML backend with mtime invalidation for efficient task parsing
- Learning from failures - Agent updates .ralph/guardrails.md with lessons
- State in git - Commits frequently so next agent picks up from git history
- Branch/PR workflow - Optionally work on a branch and open PR when complete
Prerequisites
| Requirement | Check | How to Set Up |
|-------------|-------|---------------|
| Git repo | git status works | git init |
| cursor-agent CLI | which cursor-agent | curl https://cursor.com/install -fsS \| bash |
| Node.js 18+ (for CLI) | node -v | nodejs.org |
Quick Start (Node CLI)
Option A - npx (no install):
cd your-project
npx ralph-cursor init # create .ralph/ and RALPH_TASK.md template
# Edit RALPH_TASK.md with your task and success criteria
npx ralph-cursor run # interactive run
Option B - install globally
From npm (if the package is published):
npm install -g ralph-cursor
From the repo (development or before publishing):
cd /path/to/ralph-cursor
bun run build
npm install -g .
Or use npm link from the repo, then run ralph-cursor from any project. If ralph-cursor --help prints nothing, another binary may come first in your PATH; on macOS/Linux, run $(npm prefix -g)/bin/ralph-cursor --help to use the linked CLI (npm links global executables into the prefix's bin directory).
Option C - from repo (dev):
cd ralph-cursor
bun install
bun run build
# Run locally from repo:
bun run ralph-cursor init
bun run ralph-cursor run # or: node dist/cli.js run
bun run ralph-cursor loop -y
# Or link globally: npm link, then in any project run ralph-cursor init, ralph-cursor run.
# If `ralph-cursor` does nothing (another binary in PATH), run from repo: node dist/cli.js --help
npm link
cd /path/to/your-project
ralph-cursor init
ralph-cursor run
From another project without linking: node /path/to/ralph-cursor/dist/cli.js init (and run, loop, etc.).
Publishing: bun run build && bun test then npm version patch (or minor/major) and npm publish. prepublishOnly runs build automatically.
CLI commands:
| Command | Description |
|---------|-------------|
| ralph-cursor run | Interactive run (prompts for model/iterations, optional single iteration first, then loop). Options: --branch, --pr, -y. |
| ralph-cursor once | Single iteration (no loop). |
| ralph-cursor loop | Non-interactive loop with flags (-n, -m, --task, --branch, --pr, --parallel, etc.). |
| ralph-cursor task list | List all tasks (id, status, description). |
| ralph-cursor task next | Print next incomplete task (id\|status\|description). |
| ralph-cursor task complete <id> | Mark task complete by id (e.g. line_5). |
| ralph-cursor task incomplete <id> | Mark task incomplete by id. |
| ralph-cursor task parse | Parse task file and update .ralph/tasks.yaml. |
| ralph-cursor task export | Print task cache as YAML (same as .ralph/tasks.yaml). |
| ralph-cursor status | Print task summary (done/total, last iteration). |
| ralph-cursor logs | Tail activity.log (or --errors for errors.log); -n N, --no-follow. Cross-platform (no tail). |
| ralph-cursor init | Create .ralph/ and optional RALPH_TASK.md template. |
| ralph-cursor doctor | Check environment (cursor-agent, git, task file, .ralph writable). |
Global flags: -V, --version (print version), -v, --verbose (log each stream-json line to stderr with [ralph] prefix for debugging).
Exit codes: 0 = success; 1 = failure or prerequisite check failed (no task file, not git, cursor-agent missing, gutter, max iterations); 2 = usage error (e.g. invalid args). See table below for per-command behaviour.
| Command | 0 | 1 | 2 |
|---------|---|---|---|
| run, loop, once | Task complete or ran successfully | Prereq failed (no task file, not git, no cursor-agent), gutter, max iterations, or loop ended without completion | Usage (e.g. --pr without --branch) |
| task list/next/complete/incomplete | OK | Invalid task id / file error | No task file / workspace not found |
| status, logs | OK | — | No task file / .ralph not initialized |
| init | .ralph created | — | — |
| doctor | All checks passed | — | cursor-agent missing, not git, task file missing, or .ralph not writable |
Config: Optional .ralph/config.json (and env overrides):
{
"warn_threshold": 70000,
"rotate_threshold": 80000,
"default_model": "opus-4.5-thinking",
"max_iterations": 20
}
Env (override config): RALPH_WARN_THRESHOLD, RALPH_ROTATE_THRESHOLD, RALPH_MODEL, MAX_ITERATIONS. WARN_THRESHOLD and ROTATE_THRESHOLD are also supported. CLI flags override config and env.
Environment variables (reference):
| Variable | Purpose |
|----------|---------|
| RALPH_WARN_THRESHOLD | Token count at which to emit WARN (default 70000). |
| RALPH_ROTATE_THRESHOLD | Token count at which to rotate context (default 80000). |
| RALPH_MODEL | Default model for cursor-agent (e.g. opus-4.5-thinking). |
| MAX_ITERATIONS | Default max loop iterations. |
| RALPH_TASK_FILE | Path to task file (default RALPH_TASK.md). |
| RALPH_VERBOSE | Set to 1 to log each stream-json line to stderr (same as -v). |
| WARN_THRESHOLD, ROTATE_THRESHOLD | Same as RALPH_*. |
| DEFAULT_GROUP, RALPH_DEFAULT_GROUP | Default parallel group for tasks without <!-- group: N --> (default 999999). |
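The precedence described above (defaults, then .ralph/config.json, then environment variables, then CLI flags) can be sketched as a small merge function. This is an illustrative sketch, not the CLI's actual code: `resolveConfig`, `LoopFlags`, `envInt`, `envText`, and `pick` are hypothetical names. Only the config keys, variable names, and default values come from this README; giving `RALPH_*` priority over the unprefixed variables when both are set is an assumption.

```typescript
// Hypothetical sketch of config resolution; not the CLI's real implementation.
interface RalphConfig {
  warn_threshold: number;
  rotate_threshold: number;
  default_model: string;
  max_iterations: number;
}

// Only the documented loop flags (-m, -n) are modelled here.
interface LoopFlags {
  model?: string;
  iterations?: number;
}

type Env = { [name: string]: string | undefined };

// Defaults as documented in this README.
const DEFAULTS: RalphConfig = {
  warn_threshold: 70000,
  rotate_threshold: 80000,
  default_model: "opus-4.5-thinking",
  max_iterations: 20,
};

// Parse a positive integer from an env value; undefined if unset or invalid.
function envInt(value: string | undefined): number | undefined {
  if (value === undefined || value.trim() === "") return undefined;
  const n = Number(value);
  return isFinite(n) && n > 0 ? Math.floor(n) : undefined;
}

// Return a trimmed, non-empty env value, or undefined.
function envText(value: string | undefined): string | undefined {
  return value !== undefined && value.trim() !== "" ? value.trim() : undefined;
}

// Return the first defined candidate (highest priority first), else the fallback.
function pick<T>(fallback: T, ...candidates: Array<T | undefined>): T {
  for (const c of candidates) {
    if (c !== undefined) return c;
  }
  return fallback;
}

function resolveConfig(file: Partial<RalphConfig>, env: Env, flags: LoopFlags): RalphConfig {
  return {
    warn_threshold: pick(
      DEFAULTS.warn_threshold,
      envInt(env["RALPH_WARN_THRESHOLD"]),
      envInt(env["WARN_THRESHOLD"]),
      file.warn_threshold
    ),
    rotate_threshold: pick(
      DEFAULTS.rotate_threshold,
      envInt(env["RALPH_ROTATE_THRESHOLD"]),
      envInt(env["ROTATE_THRESHOLD"]),
      file.rotate_threshold
    ),
    default_model: pick(
      DEFAULTS.default_model,
      flags.model,
      envText(env["RALPH_MODEL"]),
      file.default_model
    ),
    max_iterations: pick(
      DEFAULTS.max_iterations,
      flags.iterations,
      envInt(env["MAX_ITERATIONS"]),
      file.max_iterations
    ),
  };
}
```

For example, with `max_iterations: 30` in the config file, `MAX_ITERATIONS=40` in the environment, and `-n 50` on the command line, this sketch resolves to 50.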
Iteration: The CLI resumes from .ralph/.iteration (incremented each loop run). To force a fresh start, delete .ralph/.iteration or set it to 0.
Parallel mode: ralph-cursor loop --parallel runs multiple agents in git worktrees (one per task or batch). Uses .ralph-worktrees/, lock in .ralph/locks/parallel.lock, merge phase after each group. Options: --max-parallel N, --branch (integration branch), --pr (create PR when done), --no-merge (skip auto-merge). Requires RALPH_TASK.md committed and tasks with <!-- group: N --> for grouping.
Troubleshooting
| Issue | What to do |
|-------|------------|
| cursor-agent not found | Install: curl https://cursor.com/install -fsS \| bash (or see Cursor docs). Run ralph-cursor doctor to verify. |
| No task file / exit 2 | Run ralph-cursor init in project root, then edit RALPH_TASK.md with your task and success criteria. |
| Agent seems stuck (same error repeatedly) | Ralph may emit GUTTER and stop. Check .ralph/errors.log and .ralph/guardrails.md; refine task or guardrails and run again. |
| Rate limits / 429 / DEFER | Ralph backs off automatically. If it persists, reduce concurrency or switch model; set thresholds in .ralph/config.json if needed. |
| Context too large (WARN/ROTATE) | Default 70k/80k token thresholds trigger rotation. Adjust warn_threshold / rotate_threshold in config or env. |
Quick Start
In your project root:
npx ralph-cursor init
This creates .ralph/ (progress.md, guardrails.md, activity.log, errors.log) and RALPH_TASK.md if missing. Then:
1. Define Your Task
Edit RALPH_TASK.md:
---
task: Build a REST API
test_command: "pnpm test"
---
# Task: REST API
Build a REST API with user management.
## Success Criteria
1. [ ] GET /health returns 200
2. [ ] POST /users creates a user
3. [ ] GET /users/:id returns user
4. [ ] All tests pass
## Context
- Use Express.js
- Store users in memory (no database needed)
Important: Use [ ] checkboxes. Ralph tracks completion by counting unchecked boxes.
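To make the checkbox convention concrete, here is a minimal sketch of how criteria could be extracted from a task file and counted. It is not the CLI's parser: `parseCriteria`, `countUnchecked`, and the `Criterion` shape are hypothetical, and deriving ids such as `line_5` from the 1-based line number is an assumption based on the id example in the command table.

```typescript
// Hypothetical sketch: extract checkbox criteria from task-file markdown.
interface Criterion {
  id: string; // assumed form: "line_<1-based line number>"
  done: boolean;
  text: string;
}

function parseCriteria(markdown: string): Criterion[] {
  // Matches "- [ ] text", "* [x] text", and "1. [ ] text".
  const pattern = /^\s*(?:[-*]|\d+\.)\s+\[([ xX])\]\s+(.*)$/;
  const result: Criterion[] = [];
  markdown.split(/\r?\n/).forEach((line, index) => {
    const match = pattern.exec(line);
    if (match) {
      result.push({
        id: "line_" + (index + 1),
        done: match[1] !== " ",
        text: match[2].trim(),
      });
    }
  });
  return result;
}

// Number of criteria that are still open.
function countUnchecked(markdown: string): number {
  return parseCriteria(markdown).filter((c) => !c.done).length;
}
```

A task file is finished, in this sketch, when `countUnchecked` returns 0.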
2. Start the Loop
npx ralph-cursor run
# or: ralph-cursor loop -y (non-interactive)
Ralph will:
- Show interactive prompts for model and options (or use ralph-cursor loop with flags)
- Run cursor-agent with your task
- Parse output in real-time, tracking token usage
- At 70k tokens: warn agent to wrap up current work
- At 80k tokens: rotate to fresh context
- Repeat until all [ ] are [x] (or max iterations reached)
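The two thresholds in the steps above amount to a simple mapping from a running token count to a signal. The sketch below illustrates that mapping only; `contextSignal` is a hypothetical name, and treating a count exactly equal to a threshold as having reached it is an assumption.

```typescript
// Hypothetical sketch: map a running token count to a context signal.
type ContextSignal = "OK" | "WARN" | "ROTATE";

function contextSignal(
  tokens: number,
  warnThreshold: number = 70000, // documented default
  rotateThreshold: number = 80000 // documented default
): ContextSignal {
  if (tokens >= rotateThreshold) return "ROTATE"; // start a fresh context
  if (tokens >= warnThreshold) return "WARN"; // ask the agent to wrap up
  return "OK";
}
```

With the defaults, a count of 45,230 is still "OK", while anything from 70,000 up to 79,999 yields "WARN".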
3. Monitor Progress
ralph-cursor logs
# or: tail -f .ralph/activity.log
# Example output:
# [12:34:56] 🟢 READ src/index.ts (245 lines, ~24.5KB)
# [12:34:58] 🟢 WRITE src/routes/users.ts (50 lines, 2.1KB)
# [12:35:01] 🟢 SHELL pnpm test → exit 0
# [12:35:10] 🟢 TOKENS: 45,230 / 80,000 (56%) [read:30KB write:5KB assist:10KB shell:0KB]
# Check for failures
cat .ralph/errors.log
Commands (Node CLI)
| Command | Description |
|---------|-------------|
| ralph-cursor run | Primary - Interactive setup + run loop |
| ralph-cursor once | Test single iteration before going AFK |
| ralph-cursor loop | CLI mode for scripting (see flags below) |
| ralph-cursor init | Create .ralph/ and RALPH_TASK.md template |
ralph-cursor loop flags (scripting/CI)
ralph-cursor loop [options] [workspace]
Options:
-n, --iterations N Max iterations (default: 20)
-m, --model MODEL Model to use (default: opus-4.5-thinking)
--branch NAME Sequential: create/work on branch; Parallel: integration branch name
--pr Sequential: open PR (requires --branch); Parallel: open ONE integration PR (branch optional)
--parallel Run tasks in parallel with worktrees
--max-parallel N Max parallel agents (default: 3)
--no-merge Skip auto-merge in parallel mode
-y, --yes Skip confirmation prompt
Examples:
# Scripted PR workflow
ralph-cursor loop --branch feature/api --pr -y
# Use a different model with more iterations
ralph-cursor loop -n 50 -m gpt-5.2-high
# Run 4 agents in parallel
ralph-cursor loop --parallel --max-parallel 4
# Parallel: keep branches separate
ralph-cursor loop --parallel --no-merge
# Parallel: merge into an integration branch + open ONE PR
ralph-cursor loop --parallel --max-parallel 5 --branch feature/multi-task --pr
# Parallel: open ONE PR using an auto-named integration branch
ralph-cursor loop --parallel --max-parallel 5 --pr
Parallel Execution
Ralph can run multiple agents concurrently, each in an isolated git worktree.
Starting Parallel Mode
Via interactive run:
ralph-cursor run
# Choose "Run in parallel mode?" and set max parallel agents when prompted
Via CLI (scripting/CI):
# Run 3 agents in parallel (default)
ralph-cursor loop --parallel
# Run 10 agents in parallel (no hard cap)
ralph-cursor loop --parallel --max-parallel 10
# Keep branches separate (no auto-merge)
ralph-cursor loop --parallel --no-merge
# Merge into an integration branch (no PR)
ralph-cursor loop --parallel --max-parallel 5 --branch feature/multi-task
# Merge into an integration branch and open ONE PR
ralph-cursor loop --parallel --max-parallel 5 --branch feature/multi-task --pr
# Open ONE PR using an auto-named integration branch
ralph-cursor loop --parallel --max-parallel 5 --pr
Note: There's no hard limit on --max-parallel. The practical limit depends on your machine's resources and API rate limits.
Integration branch + single PR
Parallel --pr creates one integration branch (either your --branch NAME or an auto-named ralph/parallel-<run_id>), merges all successful agent branches into it, then opens one PR back to the base branch.
This avoids “one PR per task” spam while keeping agents isolated.
How Parallel Mode Works
┌────────────────────────────────────────────────────────────────┐
│ Parallel Execution Flow │
├────────────────────────────────────────────────────────────────┤
│ │
│ RALPH_TASK.md │
│ - [ ] Task A │
│ - [ ] Task B ┌──────────────────────────┐ │
│ - [ ] Task C ───▶ │ Create Worktrees │ │
│ └──────────────────────────┘ │
│ │ │
│ ┌───────────────┼───────────────┐ │
│ ▼ ▼ ▼ │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │ Agent 1 │ │ Agent 2 │ │ Agent 3 │ │
│ │ worktree │ │ worktree │ │ worktree │ │
│ │ Task A │ │ Task B │ │ Task C │ │
│ └────┬─────┘ └────┬─────┘ └────┬─────┘ │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ branch-a branch-b branch-c │
│ │ │ │ │
│ └──────────────┼──────────────┘ │
│ ▼ │
│ ┌──────────────┐ │
│ │ Auto-Merge │ │
│ │ to base │ │
│ └──────────────┘ │
└────────────────────────────────────────────────────────────────┘
Key benefits:
- Each agent works in complete isolation (separate git worktree)
- No interference between agents working on different tasks
- Branches auto-merge after completion (or keep separate with --no-merge)
- Conflict detection and reporting
- Tasks are processed in batches (e.g., 5 agents = 5 tasks per batch)
- In parallel mode, agents do not update .ralph/progress.md (they write per-agent reports instead)
When to use parallel mode:
- Multiple independent tasks that don't conflict
- Large task lists you want completed faster
- CI/CD pipelines with parallelization budget
When to use sequential mode:
- Tasks that depend on each other
- Single complex task that needs focused attention
- Limited API rate limits
Recommended Workflow: Parallel + Integration Pass
For best results, structure your work in two phases:
Phase 1: Parallel execution (isolated, independent tasks)
# Tasks
- [ ] Add user authentication to /api/auth
- [ ] Create dashboard component
- [ ] Implement data export feature
- [ ] Add unit tests for utils/
Phase 2: Integration pass (one sequential agent, repo-wide polish)
# Tasks
- [ ] Update README with new features
- [ ] Bump version in package.json
- [ ] Update CHANGELOG
- [ ] Fix any integration issues from parallel work
This pattern maximizes parallelism while avoiding merge conflicts on shared files. The integration pass runs after parallel agents finish and handles all "touch everything" work.
Task Groups (Phased Execution)
Control execution order with <!-- group: N --> annotations:
# Tasks
- [ ] Create database schema <!-- group: 1 -->
- [ ] Create User model <!-- group: 1 -->
- [ ] Create Post model <!-- group: 1 -->
- [ ] Add relationships between models <!-- group: 2 -->
- [ ] Build API endpoints <!-- group: 3 -->
- [ ] Update README # no annotation = runs LAST
Execution order:
- Group 1 - runs first (all tasks in parallel, up to --max-parallel)
- Group 2 - runs after group 1 merges complete
- Group 3 - runs after group 2 merges complete
- Unannotated tasks - run LAST (after all annotated groups)
Why unannotated = last?
- Safer default: forgetting to annotate doesn't jump the queue
- Integration/polish tasks naturally go last
- Override with DEFAULT_GROUP=0 env var if you prefer unannotated first
Within each group:
- Tasks run in parallel (up to --max-parallel)
- All merges complete before next group starts
- RALPH_TASK.md checkboxes updated per group
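The ordering rules above (ascending group number, unannotated tasks last, at most --max-parallel tasks at once) can be sketched as a planning function. This is an illustration under stated assumptions rather than the CLI's scheduler: `parseTasks`, `planBatches`, and the `Task` shape are hypothetical names. The `<!-- group: N -->` syntax and the 999999 default come from this README.

```typescript
// Hypothetical sketch: order unchecked tasks into groups, then into batches.
interface Task {
  text: string;
  group: number;
}

const DEFAULT_GROUP = 999999; // documented default for unannotated tasks

function parseTasks(markdown: string, defaultGroup: number = DEFAULT_GROUP): Task[] {
  const unchecked = /^\s*(?:[-*]|\d+\.)\s+\[ \]\s+(.*)$/;
  const annotation = /<!--\s*group:\s*(\d+)\s*-->/;
  const tasks: Task[] = [];
  markdown.split(/\r?\n/).forEach((line) => {
    const match = unchecked.exec(line);
    if (!match) return;
    const body = match[1];
    const group = annotation.exec(body);
    tasks.push({
      text: body.replace(annotation, "").trim(),
      group: group ? parseInt(group[1], 10) : defaultGroup,
    });
  });
  return tasks;
}

// Result shape: groups (ascending) -> batches -> tasks.
function planBatches(tasks: Task[], maxParallel: number): Task[][][] {
  const size = Math.max(1, Math.floor(maxParallel)); // guard against 0 or fractions
  const groupNumbers: number[] = [];
  tasks.forEach((t) => {
    if (groupNumbers.indexOf(t.group) === -1) groupNumbers.push(t.group);
  });
  groupNumbers.sort((a, b) => a - b);
  return groupNumbers.map((g) => {
    const inGroup = tasks.filter((t) => t.group === g);
    const batches: Task[][] = [];
    for (let i = 0; i < inGroup.length; i += size) {
      batches.push(inGroup.slice(i, i + size));
    }
    return batches;
  });
}
```

Passing `0` as `defaultGroup` mirrors the `DEFAULT_GROUP=0` override: unannotated tasks then sort ahead of group 1.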
Worktree structure:
project/
├── .ralph-worktrees/ # Temporary worktrees (auto-cleaned)
│ ├── <run_id>-job1/ # Agent worktree (isolated)
│ ├── <run_id>-job2/ # Agent worktree (isolated)
│ └── <run_id>-job3/ # Agent worktree (isolated)
└── (original project files)
Worktrees are automatically cleaned up after agents complete. Failed agents preserve their worktree for manual inspection.
Parallel logs & per-agent reports
Each parallel run creates a run directory:
.ralph/parallel/<run_id>/
├── manifest.tsv # job_id -> task_id -> branch -> status -> log
└── jobN.log # full cursor-agent output for that job
Agents are instructed to write a committed per-agent report (to avoid .ralph/progress.md merge conflicts):
.ralph/parallel/<run_id>/agent-jobN.md
How It Works
The Loop
Iteration 1 Iteration 2 Iteration N
┌──────────────────┐ ┌──────────────────┐ ┌──────────────────┐
│ Fresh context │ │ Fresh context │ │ Fresh context │
│ │ │ │ │ │ │ │ │
│ ▼ │ │ ▼ │ │ ▼ │
│ Read RALPH_TASK │ │ Read RALPH_TASK │ │ Read RALPH_TASK │
│ Read guardrails │──────────│ Read guardrails │──────────│ Read guardrails │
│ Read progress │ (state │ Read progress │ (state │ Read progress │
│ │ │ in git) │ │ │ in git) │ │ │
│ ▼ │ │ ▼ │ │ ▼ │
│ Work on criteria │ │ Work on criteria │ │ Work on criteria │
│ Commit to git │ │ Commit to git │ │ Commit to git │
│ │ │ │ │ │ │ │ │
│ ▼ │ │ ▼ │ │ ▼ │
│ 80k tokens │ │ 80k tokens │ │ All [x] done! │
│ ROTATE ──────────┼──────────┼──────────────────┼──────────┼──► COMPLETE │
└──────────────────┘ └──────────────────┘ └──────────────────┘
Each iteration:
- Reads task and state from files (not from previous context)
- Works on unchecked criteria
- Commits progress to git
- Updates .ralph/progress.md and .ralph/guardrails.md
- Rotates when context is full
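The control flow of the loop can be summarized as a small state machine driven by the signals named in this README (ROTATE, COMPLETE, GUTTER, DEFER). The sketch below is a simplified model, not the CLI's code: `step`, `simulate`, and `LoopState` are hypothetical names, and stopping on the ROTATE that would exceed the iteration cap is an assumption about how the cap is applied.

```typescript
// Hypothetical model of the outer loop as a state machine.
type Signal = "ROTATE" | "COMPLETE" | "GUTTER" | "DEFER";
type Status = "running" | "complete" | "gutter" | "max_iterations";

interface LoopState {
  iteration: number; // 1-based iteration counter
  deferAttempts: number; // consecutive DEFER retries for the current task
  status: Status;
}

function step(state: LoopState, signal: Signal, maxIterations: number): LoopState {
  if (signal === "COMPLETE") {
    return { iteration: state.iteration, deferAttempts: 0, status: "complete" };
  }
  if (signal === "GUTTER") {
    return { iteration: state.iteration, deferAttempts: 0, status: "gutter" };
  }
  if (signal === "DEFER") {
    // Retry the same task: the iteration counter does not advance.
    return {
      iteration: state.iteration,
      deferAttempts: state.deferAttempts + 1,
      status: "running",
    };
  }
  // ROTATE: move to a fresh context unless the iteration cap is reached.
  if (state.iteration >= maxIterations) {
    return { iteration: state.iteration, deferAttempts: 0, status: "max_iterations" };
  }
  return { iteration: state.iteration + 1, deferAttempts: 0, status: "running" };
}

// Replay a recorded sequence of signals, stopping at the first terminal state.
function simulate(signals: Signal[], maxIterations: number): LoopState {
  let state: LoopState = { iteration: 1, deferAttempts: 0, status: "running" };
  for (const signal of signals) {
    if (state.status !== "running") break;
    state = step(state, signal, maxIterations);
  }
  return state;
}
```

Because all durable state lives in files and git, the only thing this model needs to carry between iterations is the counter itself.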
Git Protocol
The agent is instructed to commit frequently:
# After each criterion
git add -A && git commit -m 'ralph: [criterion] - description'
# Push periodically
git push
Commits are the agent's memory. The next iteration picks up from git history.
The Learning Loop (Signs)
When something fails, the agent adds a "Sign" to .ralph/guardrails.md:
### Sign: Check imports before adding
- **Trigger**: Adding a new import statement
- **Instruction**: First check if import already exists in file
- **Added after**: Iteration 3 - duplicate import caused build failure
Future iterations read guardrails first and follow them, preventing repeated mistakes.
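In normal operation the agent writes Signs itself. If you want to seed guardrails from a script, the Sign layout shown above is easy to generate. The helper below is a hypothetical sketch: `formatSign`, `appendSign`, and the rule of skipping a Sign whose heading already exists are illustrative assumptions, not part of ralph-cursor.

```typescript
// Hypothetical helper for producing Sign entries in the layout shown above.
interface Sign {
  title: string;
  trigger: string;
  instruction: string;
  addedAfter: string;
}

function formatSign(sign: Sign): string {
  return [
    "### Sign: " + sign.title,
    "- **Trigger**: " + sign.trigger,
    "- **Instruction**: " + sign.instruction,
    "- **Added after**: " + sign.addedAfter,
  ].join("\n");
}

// Return guardrails text with the Sign appended, unless its heading already exists.
function appendSign(guardrails: string, sign: Sign): string {
  const heading = "### Sign: " + sign.title;
  const exists = guardrails.split(/\r?\n/).some((line) => line.trim() === heading);
  if (exists) return guardrails;
  const base = guardrails.replace(/\s+$/, "");
  return (base === "" ? "" : base + "\n\n") + formatSign(sign) + "\n";
}
```

Keeping the function pure (text in, text out) leaves reading and writing .ralph/guardrails.md to the caller.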
Error occurs → errors.log → Agent analyzes → Updates guardrails.md → Future agents follow
Context Health Indicators
The activity log shows context health with emoji:
| Emoji | Status | Token % | Meaning |
|-------|--------|---------|---------|
| 🟢 | Healthy | < 60% | Plenty of room |
| 🟡 | Warning | 60-80% | Approaching limit |
| 🔴 | Critical | > 80% | Rotation imminent |
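The table can be read as a function of the token count relative to the rotation threshold. The sketch below shows one way to compute the indicator and a log line in the style of the activity log. `healthEmoji`, `withCommas`, and `formatTokenLine` are hypothetical names, and rounding the percentage down is an assumption that happens to match the sample log lines in this README.

```typescript
// Hypothetical sketch: derive the health indicator from token usage.
function healthEmoji(tokens: number, rotateThreshold: number): string {
  const percent = (tokens * 100) / rotateThreshold;
  if (percent > 80) return "🔴"; // Critical: rotation imminent
  if (percent >= 60) return "🟡"; // Warning: approaching limit
  return "🟢"; // Healthy: plenty of room
}

// Insert thousands separators into a non-negative integer.
function withCommas(n: number): string {
  return String(Math.floor(n)).replace(/\B(?=(\d{3})+(?!\d))/g, ",");
}

function formatTokenLine(tokens: number, rotateThreshold: number): string {
  const percent = Math.floor((tokens * 100) / rotateThreshold);
  return (
    healthEmoji(tokens, rotateThreshold) +
    " TOKENS: " + withCommas(tokens) +
    " / " + withCommas(rotateThreshold) +
    " (" + percent + "%)"
  );
}
```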
Example:
[12:34:56] 🟢 READ src/index.ts (245 lines, ~24.5KB)
[12:40:22] 🟡 TOKENS: 58,000 / 80,000 (72%) - approaching limit [read:40KB write:8KB assist:10KB shell:0KB]
[12:45:33] 🔴 TOKENS: 72,500 / 80,000 (90%) - rotation imminent
Gutter Detection
The parser detects when the agent is stuck:
| Pattern | Trigger | What Happens |
|---------|---------|--------------|
| Repeated failure | Same command failed 3x | GUTTER signal |
| File thrashing | Same file written 5x in 10 min | GUTTER signal |
| Agent signals | Agent outputs <ralph>GUTTER</ralph> | GUTTER signal |
When gutter is detected:
- Check .ralph/errors.log for the pattern
- Fix the issue manually or add a guardrail
- Re-run the loop
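The first two detection rules in the table above (same command failed 3x, same file written 5x in 10 minutes) can be sketched as a small stateful detector. This is not the parser's actual code: `GutterDetector` and the event shapes are hypothetical, and counting failures cumulatively (without resetting on a later success) is an assumption.

```typescript
// Hypothetical sketch of the two counter-based gutter rules.
interface ShellEvent {
  kind: "shell";
  command: string;
  exitCode: number;
}

interface WriteEvent {
  kind: "write";
  path: string;
  timeMs: number; // event timestamp in milliseconds
}

type ToolEvent = ShellEvent | WriteEvent;

class GutterDetector {
  // Prototype-free objects so arbitrary commands and paths are safe keys.
  private failures: { [command: string]: number } = Object.create(null);
  private writes: { [path: string]: number[] } = Object.create(null);

  // Returns a reason when a gutter pattern is detected, otherwise null.
  observe(event: ToolEvent): string | null {
    if (event.kind === "shell") {
      if (event.exitCode === 0) return null;
      const count = (this.failures[event.command] || 0) + 1;
      this.failures[event.command] = count;
      return count >= 3 ? "same command failed 3x: " + event.command : null;
    }
    const windowMs = 10 * 60 * 1000;
    // Keep only writes inside the 10-minute window ending at this event.
    const recent = (this.writes[event.path] || []).filter(
      (t) => event.timeMs - t < windowMs
    );
    recent.push(event.timeMs);
    this.writes[event.path] = recent;
    return recent.length >= 5 ? "file thrashing: " + event.path : null;
  }
}
```

The third rule in the table, an explicit `<ralph>GUTTER</ralph>` from the agent, needs no counters and is left out of this sketch.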
Rate Limit & Transient Error Handling
The parser detects retryable API errors and handles them gracefully:
| Error Type | Examples | What Happens |
|------------|----------|--------------|
| Rate limits | 429, "rate limit exceeded", "quota" | DEFER signal |
| Network errors | timeout, connection reset, ECONNRESET | DEFER signal |
| Server errors | 502, 503, 504, "service unavailable" | DEFER signal |
When DEFER is triggered:
- Agent stops current iteration
- Waits with exponential backoff (15s base, doubles each retry, max 120s)
- Adds jitter (0-25%) to prevent thundering herd
- Retries the same task (does not increment iteration)
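The backoff schedule described above (15s base, doubling per retry, 120s cap, 0-25% jitter) works out as follows. This is a sketch of the arithmetic, not the CLI's implementation: `backoffDelayMs` and `isRetryable` are hypothetical names, the patterns are taken from the examples in the table, and applying jitter after the cap (so a capped wait can exceed 120s slightly) is an assumption.

```typescript
// Hypothetical sketch of DEFER handling: error classification plus backoff.
const RETRYABLE_PATTERNS: RegExp[] = [
  /\b429\b/,
  /rate limit/i,
  /quota/i,
  /timeout/i,
  /connection reset/i,
  /ECONNRESET/,
  /\b50[234]\b/,
  /service unavailable/i,
];

function isRetryable(message: string): boolean {
  return RETRYABLE_PATTERNS.some((pattern) => pattern.test(message));
}

// attempt is 1-based; random is injectable so the result can be tested.
function backoffDelayMs(attempt: number, random: () => number = Math.random): number {
  const baseMs = 15000;
  const maxMs = 120000;
  const exponential = Math.min(maxMs, baseMs * Math.pow(2, attempt - 1));
  const jitter = exponential * 0.25 * random(); // 0-25% extra
  return Math.round(exponential + jitter);
}
```

Without jitter the waits are 15s, 30s, 60s, then 120s for every later attempt. With jitter, attempt 2 falls between 30s and 37.5s.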
Example log:
⏸️ Rate limit or transient error detected.
Waiting 32s before retrying (attempt 2)...
Resuming...
Completion Detection
Ralph detects completion in two ways:
- Checkbox check: All [ ] in RALPH_TASK.md changed to [x]
- Agent sigil: Agent outputs <ralph>COMPLETE</ralph>
Both are verified before declaring success.
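One way to read the rule above is that success requires both conditions: no unchecked boxes remain and the agent has emitted the COMPLETE sigil. The sketch below implements that reading. `extractSigils` and `isComplete` are hypothetical names, and requiring both signals together is an interpretation of this README, not confirmed behaviour.

```typescript
// Hypothetical sketch: combine the checkbox check with the agent sigil.
function extractSigils(output: string): string[] {
  const pattern = /<ralph>([A-Z]+)<\/ralph>/g;
  const found: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(output)) !== null) {
    found.push(match[1]);
  }
  return found;
}

function isComplete(taskMarkdown: string, agentOutput: string): boolean {
  // Any remaining "- [ ]" or "1. [ ]" line means work is left.
  const hasUnchecked = /^\s*(?:[-*]|\d+\.)\s+\[ \]/m.test(taskMarkdown);
  const saidComplete = extractSigils(agentOutput).indexOf("COMPLETE") !== -1;
  return !hasUnchecked && saidComplete;
}
```

Requiring both guards against an agent that announces completion while criteria are still open.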
File Reference
| File | Purpose | Who Uses It |
|------|---------|-------------|
| RALPH_TASK.md | Task definition + success criteria | You define, agent reads |
| .ralph/progress.md | What's been accomplished | Agent writes after work |
| .ralph/guardrails.md | Lessons learned (Signs) | Agent reads first, writes after failures |
| .ralph/activity.log | Tool call log with token counts | Parser writes, you monitor |
| .ralph/errors.log | Failures + gutter detection | Parser writes, agent reads |
| .ralph/tasks.yaml | Cached task state (auto-generated) | Task parser writes/reads |
| .ralph/tasks.mtime | Task file modification time | Cache invalidation |
| .ralph/.iteration | Current iteration number | Parser reads/writes |
Configuration
Configuration is set via command-line flags or environment variables (or the optional .ralph/config.json):
# Via flags (recommended)
ralph-cursor loop -n 50 -m gpt-5.2-high
# Via environment
RALPH_MODEL=gpt-5.2-high MAX_ITERATIONS=50 ralph-cursor loop
Default thresholds (config or env):
MAX_ITERATIONS=20 # Max rotations before giving up
WARN_THRESHOLD=70000 # Tokens: send wrapup warning
ROTATE_THRESHOLD=80000 # Tokens: force rotation
Troubleshooting
"cursor-agent CLI not found"
curl https://cursor.com/install -fsS | bash
Agent keeps failing on same thing
Check .ralph/errors.log for the pattern. Either:
- Fix the underlying issue manually
- Add a guardrail to .ralph/guardrails.md explaining what to do differently
Context rotates too frequently
The agent might be reading too many large files. Check activity.log for large READs and consider:
- Adding a guardrail: "Don't read the entire file, use grep to find relevant sections"
- Breaking the task into smaller pieces
Task never completes
Check if criteria are too vague. Each criterion should be:
- Specific and testable
- Achievable in a single iteration
- Not dependent on manual steps
Workflows
Basic (default)
ralph-cursor run # Interactive setup → runs loop → done
Human-in-the-loop (recommended for new tasks)
ralph-cursor once # Run ONE iteration
# Review changes...
ralph-cursor run # Continue with full loop
Scripted/CI
ralph-cursor loop --branch feature/foo --pr -y
Learn More
- Original Ralph technique - Geoffrey Huntley
- Context as memory - The malloc/free metaphor
- Cursor CLI docs
Credits
- Original technique: Geoffrey Huntley - the Ralph Wiggum methodology
- Cursor port: Agrim Singh - this implementation
License
MIT
