@ralph-e-cli/ralph-e
v5.1.0
Autonomous AI development loop with integrated Playwright UI/UX validation, visual regression testing, and self-healing builds.
Ralph-E
Autonomous AI coding loop. Runs AI agents on tasks until done.
Install
npm install -g @ralph-e-cli/ralph-e
Quick Start
# Single task
ralph-e "add login button"
# Work through a task list
ralph-e --prd PRD.md
Two Modes
Single task - just tell it what to do:
ralph-e "add dark mode"
ralph-e "fix the auth bug"
Task list - work through a PRD:
ralph-e # uses PRD.md
ralph-e --prd tasks.md
Project Config
Optional. Stores rules the AI must follow.
ralph-e --init # auto-detects project settings
ralph-e --config # view config
ralph-e --add-rule "use TypeScript strict mode"
Creates .ralph-e/config.yaml:
project:
  name: "my-app"
  language: "TypeScript"
  framework: "Next.js"
commands:
  test: "npm test"
  lint: "npm run lint"
  build: "npm run build"
rules:
  - "use server actions not API routes"
  - "follow error pattern in src/utils/errors.ts"
boundaries:
  never_touch:
    - "src/legacy/**"
    - "*.lock"
AI Engines
ralph-e # Claude Code (default)
ralph-e --opencode # OpenCode
ralph-e --cursor # Cursor
ralph-e --codex # Codex
ralph-e --qwen # Qwen-Code
ralph-e --droid # Factory Droid
ralph-e --copilot # GitHub Copilot
Model Override
ralph-e --model sonnet "add feature" # use sonnet with Claude
ralph-e --sonnet "add feature" # shortcut for above
ralph-e --opencode --model opencode/glm-4.7-free "task"
Engine-Specific Arguments
Pass additional arguments to the underlying engine CLI after a -- separator:
ralph-e --copilot "add feature" -- --allow-all-tools --stream on
ralph-e --claude "fix bug" -- --no-permissions-prompt
Task Sources
Markdown file (default):
ralph-e --prd PRD.md
Markdown folder (for large projects):
ralph-e --prd ./prd/
Reads all .md files in the folder and aggregates tasks.
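As a rough illustration of what aggregating tasks from a folder could look like, here is a minimal Python sketch. It is not Ralph-E's implementation. The task syntax (unchecked Markdown checkboxes such as `- [ ] add login`), the file ordering, and the function names `extract_tasks` and `aggregate_tasks` are all assumptions made for this example.

```python
import re
from pathlib import Path

# Assumed task syntax: unchecked Markdown checkboxes, e.g. "- [ ] add login".
TASK_PATTERN = re.compile(r"^\s*[-*]\s+\[ \]\s+(.+?)\s*$")


def extract_tasks(markdown_text: str) -> list[str]:
    """Return the text of every unchecked checkbox item, in document order."""
    tasks = []
    for line in markdown_text.splitlines():
        match = TASK_PATTERN.match(line)
        if match:
            tasks.append(match.group(1))
    return tasks


def aggregate_tasks(folder: str) -> list[tuple[str, str]]:
    """Collect (file name, task) pairs from each .md file directly inside
    the folder, visiting files in sorted name order (an assumed ordering)."""
    collected = []
    for path in sorted(Path(folder).glob("*.md")):
        for task in extract_tasks(path.read_text(encoding="utf-8")):
            collected.append((path.name, task))
    return collected
```

Calling `aggregate_tasks("./prd")` would return one entry per open task, tagged with the file it came from.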
YAML:
ralph-e --yaml tasks.yaml
GitHub Issues:
ralph-e --github owner/repo
ralph-e --github owner/repo --github-label "ready"
Parallel Execution
ralph-e --parallel # 3 agents default
ralph-e --parallel --max-parallel 5 # 5 agents
Each agent gets an isolated worktree and branch. Without --create-pr, branches are auto-merged back with AI conflict resolution. With --create-pr, branches are kept and PRs are created. With --no-merge, branches are kept without merging.
Smart Scheduling
Use AI to predict which files each task will modify, then automatically group non-conflicting tasks for safe parallel execution:
ralph-e --parallel --smart-schedule
ralph-e --parallel --smart-schedule --planning-model haiku # use cheaper model for planning
Smart scheduling uses the DSatur graph coloring algorithm to:
- Predict file modifications for each task using AI
- Build a conflict graph where edges represent file overlaps
- Color the graph to group non-conflicting tasks
- Execute each color group in parallel
This allows more tasks to run simultaneously while avoiding merge conflicts.
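The steps above can be sketched in Python as follows. This is a textbook DSatur coloring under stated assumptions, not Ralph-E's actual code: the AI prediction step is replaced by a hand-written `predicted_files` mapping, and the task names, file paths, and tie-breaking rule (degree, then task name) are hypothetical.

```python
from itertools import combinations


def build_conflict_graph(predicted_files: dict[str, set[str]]) -> dict[str, set[str]]:
    """Connect two tasks with an edge when their predicted file sets overlap."""
    graph = {task: set() for task in predicted_files}
    for a, b in combinations(predicted_files, 2):
        if predicted_files[a] & predicted_files[b]:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def dsatur_coloring(graph: dict[str, set[str]]) -> dict[str, int]:
    """Repeatedly color the uncolored task whose neighbors already use the
    most distinct colors (highest saturation), giving it the lowest free color."""
    colors: dict[str, int] = {}

    def neighbor_colors(task: str) -> set[int]:
        return {colors[n] for n in graph[task] if n in colors}

    while len(colors) < len(graph):
        uncolored = [t for t in graph if t not in colors]
        # Ties are broken by degree, then by name, to keep the result deterministic.
        task = min(uncolored, key=lambda t: (-len(neighbor_colors(t)), -len(graph[t]), t))
        used = neighbor_colors(task)
        color = 0
        while color in used:
            color += 1
        colors[task] = color
    return colors


def group_by_color(colors: dict[str, int]) -> list[list[str]]:
    """Turn the coloring into groups; tasks within a group share no predicted files."""
    groups: dict[int, list[str]] = {}
    for task, color in colors.items():
        groups.setdefault(color, []).append(task)
    return [sorted(groups[c]) for c in sorted(groups)]


# Hypothetical predictions standing in for the AI step.
predicted = {
    "auth": {"src/auth.ts", "src/app.ts"},
    "dark-mode": {"src/theme.ts", "src/app.ts"},
    "docs": {"README.md"},
    "theme-tests": {"src/theme.ts"},
}
groups = group_by_color(dsatur_coloring(build_conflict_graph(predicted)))
```

Here `dark-mode` conflicts with both `auth` and `theme-tests`, so it lands in a different group from each of them, while `docs` conflicts with nothing and can run alongside any task.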
Sandbox Mode and Parallel Reliability
For large repos with big node_modules or dependency directories, use sandbox mode instead of git worktrees:
ralph-e --parallel --sandbox
Sandboxes are faster because they:
- Symlink read-only dependencies (node_modules, .git, vendor, .venv, etc.)
- Copy only source files that agents might modify
This avoids duplicating gigabytes of dependencies across worktrees. Changes are synced back to the original directory after each task completes.
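To make the symlink-versus-copy idea concrete, here is a minimal Python sketch. It assumes the sandbox lives outside the project directory and only inspects top-level entries; the function name `create_sandbox` and the exact directory set are illustrative, and the real tool may decide differently.

```python
import os
import shutil
from pathlib import Path

# Dependency directories named in the list above; treated as read-only here.
SYMLINKED_DIRS = {"node_modules", ".git", "vendor", ".venv"}


def create_sandbox(project_dir: str, sandbox_dir: str) -> None:
    """Build a sandbox by symlinking dependency directories and copying the rest.

    Assumes sandbox_dir is NOT inside project_dir, otherwise the copy step
    would try to copy the sandbox into itself.
    """
    project = Path(project_dir).resolve()
    sandbox = Path(sandbox_dir)
    sandbox.mkdir(parents=True, exist_ok=True)
    for entry in project.iterdir():
        target = sandbox / entry.name
        if entry.is_dir() and entry.name in SYMLINKED_DIRS:
            # Link instead of copying so large dependency trees are shared.
            os.symlink(entry, target, target_is_directory=True)
        elif entry.is_dir():
            shutil.copytree(entry, target)
        else:
            shutil.copy2(entry, target)
```

Syncing changes back to the original directory is left out of this sketch.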
Parallel execution reliability:
- If worktree operations fail (e.g., nested worktree repos), ralph-e falls back to sandbox mode automatically
- Retryable rate-limit or quota errors are detected and deferred for later retry
- Local changes are stashed before the merge phase and restored after
- Agents should not modify PRD files, .ralph-e/progress.txt, .ralph-e-worktrees, or .ralph-e-sandboxes
Branch Workflow
ralph-e --branch-per-task # branch per task
ralph-e --branch-per-task --create-pr # + create PRs
ralph-e --branch-per-task --draft-pr # + draft PRs
Browser Automation
Ralph-E supports two approaches for browser-based testing:
Option 1: Playwright MCP (Test While Coding)
Give the AI agent direct browser access so it can test as it builds. This enables a code → test → fix loop during development.
Setup for Claude Code:
Add to your ~/.claude/mcp.json (or project .claude/mcp.json):
{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": ["@anthropics/mcp-playwright"]
    }
  }
}
The AI agent gets tools such as browser_navigate, browser_click, browser_screenshot, and browser_fill, and can test UI changes immediately after writing code.
Setup for other agents:
Use agent-browser instead:
ralph-e "add login form" --browser # enable browser automation
ralph-e "fix checkout" --no-browser # disable browser automation
Option 2: Ralph-E Validation (Test After Task)
Ralph-E's built-in Playwright validation runs after each task completes, providing a final quality gate before marking the task done. See Playwright Visual Testing below.
Recommended: Use Both
For maximum coverage, use both approaches together:
- Playwright MCP / agent-browser - AI tests while coding (catches issues early)
- Ralph-E validation - Final gate before commit (visual regression, accessibility, performance)
# .ralph-e/config.yaml
playwright:
  enabled: true
  validateAfterTask: true
  onFailure: "block" # Don't mark task complete if validation fails
This gives you AI-driven testing during development plus automated validation before commit.
Playwright Visual Testing
Ralph-E includes integrated Playwright validation for UI/UX testing with visual regression, accessibility, and performance monitoring.
ralph-e "add feature" --playwright # enable Playwright validation
ralph-e "fix ui" --playwright --visual-regression # with visual regression testing
ralph-e "update form" --playwright --accessibility # with accessibility testing
ralph-e --playwright-url http://localhost:5173 # custom base URL
Features
- Visual Regression Testing: Compare screenshots against baselines to catch unintended UI changes
- Accessibility Testing: WCAG compliance checking (wcag2a, wcag2aa, wcag2aaa)
- Console Error Detection: Catch JavaScript errors during page load
- Network Failure Monitoring: Detect failed API requests
- Performance Budgets: Monitor Core Web Vitals (LCP, CLS, TTFB)
- Self-Healing Tests: Learn from repeated failures and avoid making the same mistakes
- Rollback on Failure: Automatically revert changes when validation fails
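As an illustration of the Performance Budgets feature, the sketch below compares measured metrics against limits. It assumes the budget uses the same keys as the performanceBudget config entry (lcp, cls, ttfb) and that a metric fails only when it strictly exceeds its limit; the function name and message format are hypothetical.

```python
def check_performance_budget(
    metrics: dict[str, float], budget: dict[str, float]
) -> list[str]:
    """Return one message per metric that exceeds its budget.

    Metrics missing from the measurement are skipped rather than failed
    (an assumption made for this sketch).
    """
    violations = []
    for name, limit in budget.items():
        value = metrics.get(name)
        if value is not None and value > limit:
            violations.append(f"{name}: {value} exceeds budget {limit}")
    return violations


budget = {"lcp": 2500, "cls": 0.1, "ttfb": 800}
measured = {"lcp": 3100, "cls": 0.05, "ttfb": 800}
violations = check_performance_budget(measured, budget)
```

With these sample numbers only LCP is over budget, so `violations` holds a single message.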
Configuration
Add to .ralph-e/config.yaml:
playwright:
  enabled: true
  baseUrl: "http://localhost:3000"
  # Visual regression settings
  visualRegression: true
  baselineDir: ".ralph-e/baselines"
  diffDir: ".ralph-e/diffs"
  pixelThreshold: 0.1 # 0-1, how much difference is acceptable
  # Accessibility testing
  accessibilityCheck: true
  accessibilityStandard: "wcag2aa" # wcag2a, wcag2aa, wcag2aaa
  # When to validate
  validateAfterTask: true
  validateBeforeCommit: true
  # What to do on failure
  onFailure: "warn" # warn, block, or rollback
  # Routes to test
  routes:
    - "/"
    - "/dashboard"
    - "/settings"
  # Viewports to test
  viewports:
    - name: "desktop"
      width: 1280
      height: 720
    - name: "mobile"
      width: 375
      height: 667
  # Performance budgets (optional)
  performanceBudget:
    lcp: 2500 # Largest Contentful Paint (ms)
    cls: 0.1 # Cumulative Layout Shift
    ttfb: 800 # Time to First Byte (ms)
  # Dev server settings
  devServerCommand: "npm run dev"
  devServerReadyPattern: "ready|started|listening"
  devServerTimeout: 60000
Self-Healing & Guardrails
When validation fails repeatedly (2+ times within 24 hours), Ralph-E automatically creates guardrails in .ralph-e/guardrails.md. These rules are injected into the AI agent's prompt to prevent repeating the same mistakes.
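The "2+ times within 24 hours" rule can be pictured as a sliding-window count. The Python sketch below is one way to express it, assuming failures are tracked as timestamps for a single issue; how Ralph-E actually keys and stores failures is not described here, and the names are hypothetical.

```python
from datetime import datetime, timedelta

FAILURE_THRESHOLD = 2          # "2+ times"
WINDOW = timedelta(hours=24)   # "within 24 hours"


def should_create_guardrail(failure_times: list[datetime], now: datetime) -> bool:
    """Return True when enough failures of the same issue fall inside the window."""
    recent = [t for t in failure_times if timedelta(0) <= now - t <= WINDOW]
    return len(recent) >= FAILURE_THRESHOLD
```

A caller would keep one list of timestamps per issue and check it after each failed validation.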
Example guardrail:
## Accessibility Issue: /dashboard
- Route: `/dashboard`
- Issue: Button "Submit" missing aria-label
- **Rule**: Ensure all interactive elements have proper ARIA labels
- Suggestion: Add aria-label or aria-labelledby to interactive elements
Rollback on Failure
When onFailure: "rollback" is set, Ralph-E creates a git checkpoint before validation. If validation fails, changes are automatically rolled back:
playwright:
  enabled: true
  onFailure: "rollback" # automatically revert on validation failure
The rollback creates a git tag checkpoint, stashes any uncommitted changes, and resets to the pre-task state.
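For readers who want to picture the sequence, the sketch below builds the git commands such a rollback might run. The git subcommands themselves (`git tag`, `git stash push`, `git reset --hard`) are standard, but the tag name, stash message, and the exact invocations Ralph-E uses are assumptions. The functions only build argument lists and do not execute anything.

```python
def checkpoint_command(tag: str) -> list[str]:
    """Command to mark the current commit before the task's changes are validated."""
    return ["git", "tag", tag]


def rollback_commands(tag: str) -> list[list[str]]:
    """Commands to set aside uncommitted work and return to the checkpoint."""
    return [
        # Keep uncommitted (including untracked) changes recoverable in the stash.
        ["git", "stash", "push", "--include-untracked", "-m", f"rollback to {tag}"],
        # Move the branch and working tree back to the tagged commit.
        ["git", "reset", "--hard", tag],
    ]
```

Each list can be passed to `subprocess.run(command, check=True)` to execute it. Building the commands separately from running them keeps the plan easy to inspect, for example in a dry run.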
Options
| Flag | What it does |
|------|--------------|
| --prd PATH | task file or folder (auto-detected, default: PRD.md) |
| --yaml FILE | YAML task file |
| --github REPO | use GitHub issues |
| --github-label TAG | filter issues by label |
| --model NAME | override model for any engine |
| --sonnet | shortcut for --claude --model sonnet |
| --parallel | run parallel |
| --max-parallel N | max agents (default: 3) |
| --sandbox | use lightweight sandboxes instead of git worktrees |
| --smart-schedule | use AI to predict file conflicts and optimize parallel grouping |
| --planning-model MODEL | model for smart scheduling predictions (default: same as main) |
| --no-merge | skip auto-merge in parallel mode |
| --branch-per-task | branch per task |
| --base-branch BRANCH | base branch for PRs |
| --create-pr | create PRs |
| --draft-pr | draft PRs |
| --no-tests | skip tests |
| --no-lint | skip lint |
| --fast | skip tests + lint |
| --no-commit | don't auto-commit |
| --browser | enable browser automation |
| --no-browser | disable browser automation |
| --playwright | enable Playwright UI/UX validation |
| --no-playwright | disable Playwright validation |
| --playwright-url URL | base URL for Playwright (default: http://localhost:3000) |
| --visual-regression | enable visual regression testing (requires --playwright) |
| --accessibility | enable accessibility testing (requires --playwright) |
| --max-iterations N | stop after N tasks |
| --max-retries N | retries per task (default: 3) |
| --retry-delay N | delay between retries in seconds (default: 5) |
| --dry-run | preview only |
| -v, --verbose | debug output |
| --init | setup .ralph-e/ config |
| --config | show config |
| --add-rule "rule" | add rule to config |
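The interaction of --max-retries and --retry-delay can be sketched as a simple loop. This Python example assumes "retries" means extra attempts after the first one and that the delay is applied between attempts only; the table does not spell out either point, and `run_with_retries` is a hypothetical name.

```python
import time
from typing import Callable


def run_with_retries(
    task: Callable[[], bool],
    max_retries: int = 3,      # mirrors the --max-retries default
    retry_delay: float = 5.0,  # mirrors the --retry-delay default (seconds)
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Run the task once, then retry up to max_retries times until it succeeds."""
    attempts = 1 + max_retries
    for attempt in range(attempts):
        if task():
            return True
        if attempt < attempts - 1:
            # Wait before the next attempt, but not after the final one.
            sleep(retry_delay)
    return False
```

The `sleep` parameter is injectable so the loop can be exercised without real waiting.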
Webhook Notifications
Get notified when sessions complete via Discord, Slack, or custom webhooks.
Configure in .ralph-e/config.yaml:
notifications:
  discord_webhook: "https://discord.com/api/webhooks/..."
  slack_webhook: "https://hooks.slack.com/services/..."
  custom_webhook: "https://your-api.com/webhook"
Telemetry (Opt-in)
Collect session data for building AI agent evaluation datasets:
# .ralph-e/config.yaml
telemetry:
  enabled: true
  privacyLevel: "anonymous" # or "full" for prompts/responses
  format: "jsonl" # or "deepeval" or "openai-evals"
  outputDir: ".ralph-e/telemetry"
Export formats:
- jsonl: Raw session data, one session per line
- deepeval: DeepEval compatible format for LLM evaluation
- openai-evals: OpenAI Evals compatible format
Privacy levels:
- anonymous: Only aggregate metrics (token counts, durations, success rates)
- full: Full session data including prompts and responses
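To show how the jsonl format and the two privacy levels might fit together, here is a small Python sketch. The record fields (`tokens`, `duration_s`, `success`, `prompt`, `response`) are invented for the example and do not reflect Ralph-E's real schema.

```python
import json

# Hypothetical names for the fields that carry prompt and response text.
SENSITIVE_FIELDS = {"prompt", "response"}


def to_jsonl_line(session: dict, privacy_level: str = "anonymous") -> str:
    """Serialize one session as a single JSON line, dropping prompt and
    response text unless the privacy level is "full"."""
    if privacy_level == "anonymous":
        session = {k: v for k, v in session.items() if k not in SENSITIVE_FIELDS}
    elif privacy_level != "full":
        raise ValueError(f"unknown privacy level: {privacy_level}")
    return json.dumps(session, sort_keys=True)


session = {
    "tokens": 1200,
    "duration_s": 42.5,
    "success": True,
    "prompt": "add login",
    "response": "done",
}
anonymous_line = to_jsonl_line(session)
full_line = to_jsonl_line(session, "full")
```

Appending each returned line plus a newline to a file gives the one-session-per-line layout described above.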
Requirements
- Node.js 18+ or Bun
- AI CLI: Claude Code, OpenCode, Cursor, Codex, Qwen-Code, Factory Droid, or GitHub Copilot
- gh (optional, for GitHub issues / --create-pr)
- playwright (optional, for visual regression testing; installed automatically as an optional dependency)
- @anthropics/mcp-playwright (optional, for AI-driven browser testing during development)
Credits
Ralph-E is based on ralphy by Michael Shimeles, which itself builds on the open-source Ralph project. Ralph-E extends it with integrated Playwright UI/UX validation, visual regression testing, accessibility checks, self-healing tests, and rollback capabilities.
License
MIT
