milhouse-cli
v1.0.0
Correctness-first AI coding orchestrator - Evidence-based diagnostics, WBS planning, and iterative execution
Milhouse CLI
AI coding orchestrator. Diagnoses, plans, and executes correct work with evidence-based verification. Milhouse is neither Bart (Auto Vibe Coder) nor Ralph (Auto Loop Coder): he is a correctness-only QA, planner, and problem solver. Milhouse doesn’t invent new architecture like Bart, and he doesn’t just execute and move on like Ralph. Milhouse verifies with evidence, aligns code with the real environment, and turns issues into safe, one-commit tasks with a clear definition of done (DoD) and explicit dependencies.
Installation
Prerequisites
- Node.js >= 18.0.0
- pnpm >= 9.0.0 (for development)
- Bun (for building binaries)
Development Setup
# Install dependencies
pnpm install
# Run in development mode
pnpm dev
# Run tests
pnpm test
# Build binaries
pnpm build
Package Manager
This project uses pnpm for package management and Bun for:
- Running TypeScript directly in development
- Building cross-platform binaries
- Running tests (bun:test)
Building Binaries
# Build all platforms
pnpm build
# Build specific platform
pnpm build:linux
pnpm build:mac-arm
pnpm build:mac-x64
pnpm build:windows
Global Installation
npm install -g milhouse-cli
Three Modes
1. Single Task
Just tell it what to do:
milhouse "add dark mode"
milhouse "fix the auth bug"
2. Task List
Work through a PRD:
milhouse # uses PRD.md
milhouse --prd tasks.md
3. Investigation Pipeline ⭐ NEW
Multi-agent investigation and execution:
milhouse --scan --scope "frontend zustand" # Creates isolated run
milhouse --validate # Validate issues
milhouse --plan # Generate tasks
milhouse --consolidate # Merge plans
milhouse --exec --exec-by-issue # Execute grouped by issue (recommended!)
milhouse --verify # Verify results
# Or run full pipeline (uses --exec-by-issue automatically)
milhouse --run
Investigation Pipeline
6-phase pipeline with specialized AI agents:
| Phase | Agent | Description |
|-------|-------|-------------|
| scan | LI (Lead Investigator) | Scans codebase, identifies issues |
| validate | IV (Issue Validators) | Validates with probes |
| plan | PL (Planners) | Generates WBS per issue |
| consolidate | CO (Consolidator) | Merges into unified plan |
| exec | EX (Executors) | Executes tasks |
| verify | VE (Verifiers) | Runs verification gates |
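The phase order in the table can be sketched as a small data structure. This is only an illustration: the phase names and agent codes come from the table, while `PHASES`, `AGENTS`, and `remainingPhases` are hypothetical names, and the README does not say how `--resume` actually determines the last completed phase.

```typescript
// Phase names and agent codes are taken from the table above.
// All identifiers here are hypothetical, not Milhouse's internal API.
const PHASES = ["scan", "validate", "plan", "consolidate", "exec", "verify"] as const;
type Phase = (typeof PHASES)[number];

const AGENTS: Record<Phase, string> = {
  scan: "LI",
  validate: "IV",
  plan: "PL",
  consolidate: "CO",
  exec: "EX",
  verify: "VE",
};

// Returns the phases still to run after the last completed one.
// Passing null means nothing has run yet, so every phase remains.
// Assumption: a resume continues with the phase after the last completed one.
function remainingPhases(lastCompleted: Phase | null): Phase[] {
  if (lastCompleted === null) return [...PHASES];
  return PHASES.slice(PHASES.indexOf(lastCompleted) + 1);
}
```

Under these assumptions, `remainingPhases("plan")` yields `consolidate`, `exec`, and `verify`, which is the order a resumed run would follow if it stopped after planning.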
Pipeline Runs
Each scan creates isolated state:
milhouse --scan --scope "frontend" # Creates run-abc
milhouse --scan --scope "backend" # Creates run-def
milhouse runs list # List all runs
milhouse runs switch run-abc # Switch active run
milhouse runs info # Show current run
milhouse runs delete run-def # Delete a run
Project Config
Optional. Stores rules the AI must follow.
milhouse --init # auto-detects project settings
milhouse --config # view config
milhouse --add-rule "use TypeScript strict mode"
Creates .milhouse/config.yaml:
project:
  name: "my-app"
  language: "TypeScript"
  framework: "Next.js"
commands:
  test: "npm test"
  lint: "npm run lint"
  build: "npm run build"
rules:
  - "use server actions not API routes"
  - "follow error pattern in src/utils/errors.ts"
boundaries:
  never_touch:
    - "src/legacy/**"
    - "*.lock"
AI Engines
milhouse # Claude Code (default)
milhouse --opencode # OpenCode
milhouse --cursor # Cursor
milhouse --codex # Codex
milhouse --qwen # Qwen-Code
milhouse --droid # Factory Droid
Model Override
milhouse --model sonnet "add feature" # use sonnet with Claude
milhouse --sonnet "add feature" # shortcut for above
milhouse --opencode --model opencode/glm-4.7-free "task"
Task Sources
Markdown file (default):
milhouse --prd PRD.md
Markdown folder (for large projects):
milhouse --prd ./prd/
Reads all .md files in the folder and aggregates tasks.
YAML:
milhouse --yaml tasks.yaml
GitHub Issues:
milhouse --github owner/repo
milhouse --github owner/repo --github-label "ready"
Parallel Execution
Issue-Based Execution (Recommended)
milhouse --exec --exec-by-issue # Each issue in its own worktree
milhouse --exec --exec-by-issue --max-parallel 3 # 3 issues in parallel
How it works:
- Groups all tasks by their parent issue
- Each issue runs in an isolated worktree with a dedicated Claude agent
- Agent receives: issue details + validation report + WBS plan + all tasks
- Agent completes ALL tasks for that issue in one session
- Branches auto-merge back after completion
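The grouping step described above can be sketched as follows. This is a minimal illustration, not Milhouse's implementation: the `Task` shape, `groupByIssue`, and `toBatches` are hypothetical, and fixed-size batching is a simplifying assumption about how `--max-parallel` might limit concurrency.

```typescript
// Hypothetical task shape: the README does not document Milhouse's internal types.
interface Task {
  id: string;
  issueId: string;
}

// Group tasks by their parent issue, preserving first-seen issue order.
function groupByIssue(tasks: Task[]): Record<string, Task[]> {
  const groups: Record<string, Task[]> = {};
  for (const task of tasks) {
    const existing = groups[task.issueId];
    if (existing) {
      existing.push(task);
    } else {
      groups[task.issueId] = [task];
    }
  }
  return groups;
}

// Split issue IDs into batches of at most maxParallel entries.
// Assumption: a real scheduler may use a worker pool instead of fixed batches.
function toBatches(issueIds: string[], maxParallel: number): string[][] {
  if (maxParallel < 1) throw new RangeError("maxParallel must be at least 1");
  const batches: string[][] = [];
  for (let i = 0; i < issueIds.length; i += maxParallel) {
    batches.push(issueIds.slice(i, i + maxParallel));
  }
  return batches;
}
```

Each group would then be handed to one agent in its own worktree, so related tasks share a single session and a single context.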
Benefits:
- Better context: Agent has full issue context, not just single task
- Fewer context switches: One agent handles related tasks together
- Faster overall: ~5 minutes per issue vs ~5 minutes per task
Task-Based Execution (Legacy)
milhouse --parallel # 3 agents default
milhouse --parallel --max-parallel 5 # 5 agents
Each agent gets an isolated worktree and branch.
- Without --create-pr: auto-merges back with AI conflict resolution.
- With --create-pr: keeps branches, creates PRs.
- With --no-merge: keeps branches without merging.
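The three outcomes described above can be read as a small decision function. This is a sketch under stated assumptions: only the flag names come from this README, `MergeFlags` and `postRunAction` are hypothetical, and the precedence when both `--create-pr` and `--no-merge` are given is a guess.

```typescript
// Hypothetical flag bag; only the flag names come from the README.
interface MergeFlags {
  createPr?: boolean;
  noMerge?: boolean;
}

type PostRunAction = "auto-merge" | "create-prs" | "keep-branches";

// Decide what happens to each agent's branch once its work is done.
// Assumption: --create-pr takes precedence if both flags are set.
function postRunAction(flags: MergeFlags): PostRunAction {
  if (flags.createPr) return "create-prs";
  if (flags.noMerge) return "keep-branches";
  return "auto-merge";
}
```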
Branch Workflow
milhouse --branch-per-task # branch per task
milhouse --branch-per-task --create-pr # + create PRs
milhouse --branch-per-task --draft-pr # + draft PRs
Browser Automation
Milhouse supports browser automation via agent-browser for testing web UIs.
milhouse "add login form" --browser # enable browser automation
milhouse "fix checkout" --no-browser # disable browser automation
When enabled (and agent-browser is installed), the AI can:
- Open URLs and navigate pages
- Click elements and fill forms
- Take screenshots for verification
- Test web UI changes after implementation
Issue Filtering
Milhouse supports filtering issues by ID and severity level at any pipeline stage.
Filter by Issue IDs
# Process only specific issues
milhouse --validate --issues P-xxx,P-yyy,P-zzz
# Exclude specific issues
milhouse --plan --exclude-issues P-xxx
Filter by Severity
# Process only CRITICAL and HIGH severity issues
milhouse --validate --severity CRITICAL,HIGH
# Process issues with severity HIGH or above
milhouse --run --min-severity HIGH
Severity Levels
Severity levels in order of priority:
- CRITICAL - Highest priority
- HIGH
- MEDIUM
- LOW - Lowest priority
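The severity ordering and the four filter flags can be sketched as one predicate in which every filter that is set must pass. This is a hedged illustration: the severity names and their order come from the list above, while the `Issue` and `IssueFilter` shapes and the function names are hypothetical and do not describe Milhouse's internals.

```typescript
// Severity names and order come from the list above (most urgent first).
const SEVERITY_ORDER = ["CRITICAL", "HIGH", "MEDIUM", "LOW"] as const;
type Severity = (typeof SEVERITY_ORDER)[number];

// Hypothetical shapes for illustration only.
interface Issue {
  id: string;
  severity: Severity;
}

interface IssueFilter {
  issues?: string[];        // mirrors --issues
  excludeIssues?: string[]; // mirrors --exclude-issues
  severity?: Severity[];    // mirrors --severity
  minSeverity?: Severity;   // mirrors --min-severity
}

// True when `severity` is at least as urgent as `min`.
function meetsMinimum(severity: Severity, min: Severity): boolean {
  return SEVERITY_ORDER.indexOf(severity) <= SEVERITY_ORDER.indexOf(min);
}

// Keep an issue only if it passes every filter that is set.
function filterIssues(all: Issue[], f: IssueFilter): Issue[] {
  return all.filter(
    (issue) =>
      (!f.issues || f.issues.indexOf(issue.id) !== -1) &&
      (!f.excludeIssues || f.excludeIssues.indexOf(issue.id) === -1) &&
      (!f.severity || f.severity.indexOf(issue.severity) !== -1) &&
      (!f.minSeverity || meetsMinimum(issue.severity, f.minSeverity))
  );
}
```

With this model, `--min-severity HIGH` keeps CRITICAL and HIGH issues, and adding `--issues` narrows that set further rather than widening it.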
Combining Filters
Filters can be combined (AND logic):
# Validate specific issues that are also HIGH+ severity
milhouse --validate --issues P-xxx,P-yyy --min-severity HIGH
Options
| Flag | What it does |
|------|--------------|
| Pipeline | |
| --scan | Run Lead Investigator |
| --scope FOCUS | Focus scan on specific area |
| --validate | Validate issues with probes |
| --plan | Generate WBS |
| --consolidate | Merge into execution plan |
| --exec | Execute tasks |
| --verify | Run verification gates |
| --run | Run full pipeline |
| --resume | Resume from last phase |
| Issue Filtering | |
| --issues IDS | Comma-separated issue IDs to process |
| --exclude-issues IDS | Comma-separated issue IDs to exclude |
| --severity LEVELS | Filter by severity (CRITICAL,HIGH,MEDIUM,LOW) |
| --min-severity LEVEL | Minimum severity level to process |
| Tasks | |
| --prd PATH | task file or folder (auto-detected, default: PRD.md) |
| --yaml FILE | YAML task file |
| --github REPO | use GitHub issues |
| --github-label TAG | filter issues by label |
| Engine | |
| --model NAME | override model for any engine |
| --sonnet | shortcut for --claude --model sonnet |
| Execution | |
| --parallel | run tasks in parallel (legacy, per-task) |
| --exec-by-issue | execute tasks grouped by issue (recommended!) |
| --max-parallel N | max parallel agents/issues (default: 3) |
| --no-merge | skip auto-merge in parallel mode |
| --branch-per-task | branch per task |
| --base-branch BRANCH | base branch for PRs |
| --create-pr | create PRs |
| --draft-pr | draft PRs |
| --worktrees | force worktree isolation |
| --exec-fail-fast | stop on first task failure |
| Testing | |
| --no-tests | skip tests |
| --no-lint | skip lint |
| --fast | skip tests + lint |
| --no-commit | don't auto-commit |
| --browser | enable browser automation |
| --no-browser | disable browser automation |
| General | |
| --max-iterations N | stop after N tasks |
| --max-retries N | retries per task (default: 3) |
| --retry-delay N | delay between retries in seconds (default: 5) |
| --dry-run | preview only |
| -v, --verbose | debug output |
| --init | setup .milhouse/ config |
| --config | show config |
| --add-rule "rule" | add rule to config |
Requirements
- Node.js 18+ or Bun
- AI CLI: Claude Code, OpenCode, Cursor, Codex, Qwen-Code, or Factory Droid
- gh (optional, for GitHub issues / --create-pr)
License
MIT
