code-orchestrator

v2.4.1

Published

2 months ago

Autonomous multi-mode developer tool for AI-powered coding agents. Build, fix, audit, test, review, refactor — all with automatic code review and crash recovery.

fedevgonzalez

claude claude-code orchestrator ai automation code-review code-generation developer-tools cli

Code Orchestrator

Give your AI coding agent a spec. Get a reviewed codebase back.

Code Orchestrator turns AI coding agents into an autonomous multi-phase build system. It analyzes your project, generates a phased execution plan, runs each task via claude -p, self-reviews and scores every output, auto-fixes failures, and validates the result -- all with crash recovery and real-time monitoring.

Why This Tool?

Claude Code is powerful, but for large tasks it needs structure: phased plans, automatic validation, crash recovery, and self-review. Code Orchestrator provides that structure.

| | Claude Code (raw) | Code Orchestrator |
|---|---|---|
| Execution | Single prompt | Multi-phase plan with dependencies |
| Review | Manual | Auto-review + scoring (1-10) + auto-fix |
| Crash recovery | None | Checkpoint after every task, auto-restart |
| Validation | Manual | Build, test, E2E, custom validators per phase |
| Monitoring | Terminal output | Real-time WebSocket dashboard |
| Modes | General purpose | 8 specialized modes (build, fix, audit...) |

Key Features

8 execution modes -- build, feature, fix, audit, test, review, refactor, exec
Automatic code review -- every task is reviewed and scored (1-10), auto-fixed if below threshold
Crash recovery -- checkpoint after every task, auto-restart with exponential backoff
Multi-language support -- Node.js, Python, Go, Rust, Java, Ruby, PHP, .NET
Parallel task execution -- independent tasks can run concurrently when configured
Real-time dashboard -- WebSocket-powered monitoring UI with live phase/task progress
Plugin system -- custom validators and lifecycle hooks via project config
Dry-run mode -- preview the generated plan without executing anything

How It Works

For each run, the orchestrator follows this pipeline:

Analyze --> Plan --> Execute --> Review --> Validate

Analyze -- scans your codebase (framework, ORM, auth, styling, structure) and interprets the request
Plan -- generates a multi-phase execution plan with ordered tasks and dependencies
Execute -- runs each task via headless claude -p with session continuity (--resume)
Review -- self-reviews each task output (score 1-10), auto-fixes if score < 7
Validate -- runs build checks, test suites, file existence checks, and custom commands per phase
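
The Review step can be pictured as a bounded loop. The following is a minimal sketch, not the tool's actual code: reviewTask and fixTask are hypothetical callbacks standing in for the Claude calls, and the defaults mirror minTaskScore: 7 and maxReviewCycles: 2 from the Configuration section. Whether the real reviewer attempts a fix after the final cycle is not stated, so this sketch assumes it does not.

```javascript
// Hypothetical review/auto-fix loop. reviewTask and fixTask are illustrative
// callbacks, not part of Code Orchestrator's public API.
async function reviewWithAutoFix(task, reviewTask, fixTask, opts = {}) {
  const minTaskScore = opts.minTaskScore ?? 7;       // score needed to pass
  const maxReviewCycles = opts.maxReviewCycles ?? 2; // review attempts allowed

  let review = null;
  for (let cycle = 1; cycle <= maxReviewCycles; cycle++) {
    review = await reviewTask(task); // assumed shape: { score, issues }
    if (review.score >= minTaskScore) {
      return { approved: true, score: review.score, cycles: cycle };
    }
    // Below threshold: attempt a fix only if another review cycle remains.
    if (cycle < maxReviewCycles) {
      await fixTask(task, review.issues);
    }
  }
  return {
    approved: false,
    score: review ? review.score : null,
    cycles: maxReviewCycles,
  };
}
```

With the defaults, a task that scores 5 and then 8 is fixed once and approved on the second cycle; a task that never reaches 7 is reported as not approved after two reviews.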

Crash recovery saves a checkpoint after every task. On failure, the supervisor auto-restarts from the exact point of interruption with exponential backoff (5s to 60s). The restart counter resets whenever a phase completes successfully.
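
As an illustration of the restart policy described above, here is a small sketch. The doubling factor is an assumption (the README only gives the 5s to 60s range), the class and method names are hypothetical, and the restart budget of 50 is taken from the --max-restarts default.

```javascript
// Hypothetical restart policy: the delay doubles from 5s up to a 60s cap.
// The doubling factor is an assumption; only the range is documented.
class RestartBackoff {
  constructor({ baseMs = 5_000, maxMs = 60_000, maxRestarts = 50 } = {}) {
    this.baseMs = baseMs;
    this.maxMs = maxMs;
    this.maxRestarts = maxRestarts;
    this.restarts = 0;
  }

  // Returns the delay before the next restart, or null once the budget is spent.
  nextDelayMs() {
    if (this.restarts >= this.maxRestarts) return null;
    const delay = Math.min(this.baseMs * 2 ** this.restarts, this.maxMs);
    this.restarts += 1;
    return delay;
  }

  // Call when a phase completes successfully: the counter starts over.
  reset() {
    this.restarts = 0;
  }
}

const backoff = new RestartBackoff();
const delays = [1, 2, 3, 4, 5].map(() => backoff.nextDelayMs());
console.log(delays); // 5s, 10s, 20s, 40s, then capped at 60s
```

Resetting on phase completion means a long run with occasional crashes does not slowly exhaust its restart budget.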

Quick Start

Prerequisites

Node.js 18+
Claude Code CLI installed and authenticated (install guide)
PM2 for background process management (npm i -g pm2)

Install

npm install -g code-orchestrator

This installs three equivalent binaries: code-orch, claude-orch, and claude-orchestrator. Use whichever you prefer.

Note: PM2 must be installed globally (npm i -g pm2). The tool will not start without it.

Or run directly with npx:

npx -p code-orchestrator code-orch feature "add dark mode" --cwd /path/to/project

Usage by Mode

Build -- create a full project from a spec file (24 phases, from scaffolding to deployment):

code-orch build spec.md

Feature -- add a feature to an existing project:

code-orch feature "add Stripe billing with free/pro tiers" --cwd .

Fix -- diagnose and fix a bug (even vague descriptions work):

code-orch fix "users can't reset their password" --cwd .

Audit -- run a code audit (security, performance, quality, accessibility):

code-orch audit --type security --cwd .
code-orch audit --fix --cwd .          # audit + auto-fix

Test -- run tests, generate missing tests, fix failures:

code-orch test --fix --cwd .

Review -- comprehensive code review with detailed report:

code-orch review --cwd .

Refactor -- refactor code with regression checks:

code-orch refactor "extract auth into a standalone service" --cwd .

Exec -- generic prompt (catch-all for anything else):

code-orch exec "update all dependencies and fix breaking changes" --cwd .

Dry Run

Preview the execution plan without running any tasks:

code-orch feature "add notifications" --cwd . --dry-run

Monitor Running Instances

code-orch --status              # list all running instances
code-orch --logs myproject      # view live progress
code-orch --stop myproject      # stop an instance
code-orch --stop-all            # stop everything
code-orch --restart myproject   # restart an instance
code-orch --resume /path/to    # resume from checkpoint

Auto-Recovery (Watchdog)

Install the system watchdog so orchestrator processes automatically restart after a reboot or crash:

code-orch --install-watchdog    # register system-level auto-recovery
code-orch --watchdog-status     # check if watchdog is active
code-orch --uninstall-watchdog  # remove the watchdog

On Windows this creates a scheduled task that runs on logon. On Linux/macOS it adds a cron job (every 10 minutes). The watchdog calls pm2 resurrect to restore any saved orchestrator processes.

Configuration

Project Config File

Create .orchestrator.config.mjs (or .orchestrator.config.js) in your project root to customize behavior:

export default {
  // Build and dev commands
  buildCommand: "pnpm run build",
  devCommand: "pnpm run dev",
  testCommand: "pnpm test",
  devServerPort: 5173,

  // Timeouts
  turnTimeout: 15 * 60_000,       // 15 min per task
  phaseTimeout: 2 * 3600_000,     // 2h per phase
  totalTimeout: 24 * 3600_000,    // 24h max run

  // Review thresholds
  minTaskScore: 7,                // minimum score to pass review
  maxReviewCycles: 2,             // max review iterations per task

  // Rate limiting / parallelism
  maxConcurrentClaude: 1,         // set > 1 for parallel task execution
  claudeMinDelayMs: 1000,         // minimum delay between Claude calls

  // Permissions
  allowUnsafePermissions: true,   // false = Claude will prompt for permission

  // Plugins
  plugins: ["./my-validator.mjs"],
};

Config files are searched in this order: .orchestrator.config.mjs, .orchestrator.config.js, .orchestrator.config.cjs, orchestrator.config.mjs, orchestrator.config.js.
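
The lookup order above amounts to a first-match search. A minimal sketch, assuming the search happens only in the project root; findConfigFile and its injected exists predicate are hypothetical names used to keep the example self-contained:

```javascript
// Candidate file names, in the lookup order listed above.
const CONFIG_CANDIDATES = [
  ".orchestrator.config.mjs",
  ".orchestrator.config.js",
  ".orchestrator.config.cjs",
  "orchestrator.config.mjs",
  "orchestrator.config.js",
];

// Returns the first candidate that exists in projectDir, or null.
// `exists` is injected (for example fs.existsSync) so the sketch has no imports.
function findConfigFile(projectDir, exists) {
  const dir = projectDir.replace(/[\\/]+$/, ""); // drop trailing slashes
  for (const name of CONFIG_CANDIDATES) {
    const candidate = `${dir}/${name}`;
    if (exists(candidate)) return candidate;
  }
  return null;
}
```

If both .orchestrator.config.js and orchestrator.config.mjs are present, the dotted file wins because it appears earlier in the list.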

CLI Reference

Subcommands:

| Command | Description |
|---------|-------------|
| build <spec.md> | Build a full project from a markdown spec |
| feature "<description>" | Add a feature to an existing project |
| fix "<description>" | Diagnose and fix a bug |
| audit | Run a code audit |
| test | Run and generate tests |
| review | Full code review |
| refactor "<description>" | Refactor code |
| exec "<prompt>" | Execute any prompt |

Flags:

| Flag | Description |
|------|-------------|
| --cwd <dir> | Project directory (default: current directory) |
| --dry-run | Generate the plan without executing tasks |
| --no-review | Skip the code review step after each task |
| --fix | Auto-fix issues found during audit or test modes |
| --type <type> | Audit type: security, performance, quality, a11y, full |
| --dev-port <port> | Dev server port for validation (default: auto-assigned) |
| --port <port> | Dashboard HTTP port (default: auto-assigned from 3111) |
| --max-restarts <n> | Maximum auto-restart attempts (default: 50) |
| --verbose | Enable verbose logging |

Management flags:

| Flag | Description |
|------|-------------|
| --status | Show all running orchestrator instances |
| --logs [name] | View logs for an instance |
| --stop [name] | Stop an instance |
| --stop-all | Stop all instances |
| --restart [name] | Restart an instance |
| --resume <project-dir> | Resume from a saved checkpoint |
| --install-watchdog | Register system watchdog for auto-recovery after reboot |
| --uninstall-watchdog | Remove the system watchdog |
| --watchdog-status | Check if watchdog is active |

VS Code / Cursor Extension

Install the extension for a fully integrated editor experience:

# From the repo (dev build)
cd vscode-extension && npm install && npx @vscode/vsce package
# Then install the .vsix in VS Code/Cursor

Features

Right-click any .md file -- smart file analyzer detects bugs, features, specs and recommends the best orchestration mode with a generated prompt
Duplicate run protection -- checks PM2 for existing instances before starting, requires explicit confirmation to replace
Sidebar dashboard -- embedded WebSocket panel showing real-time phase/task progress
Status bar -- live progress indicator with task count, percentage, and cost
Command palette -- all 8 modes available via Ctrl+Shift+P -> "Code Orchestrator"
Run history -- tree view in the sidebar showing past runs with status and duration

Smart File Analysis

When you right-click a .md file, the extension reads the content and automatically:

Detects file type (spec, backlog, status report, bug report, test plan, etc.)
Extracts item IDs (B1, B2, F1, F2, etc.) with types and priorities
Recommends the best mode (build, fix, exec, etc.) with a confidence level
Generates a detailed prompt referencing specific items and file paths
Shows the recommendation with options to accept, customize, or override

+------------------------------------------------------+
| Code Orchestrator -- Smart Analysis                  |
|------------------------------------------------------|
| (check) Build from Backlog (Recommended)  mode: exec |
| Backlog with 23 items (5 bugs, 18 features).         |
| Will implement in priority order.                    |
|------------------------------------------------------|
| (edit) Customize prompt before running               |
|------------------------------------------------------|
| (play) exec    (wrench) fix    (search) audit   ...  |
+------------------------------------------------------+
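
To make the item-extraction step concrete, here is a rough sketch of how IDs such as B1 or F2 could be pulled out of a backlog file. It is not the extension's code (which is TypeScript): the regular expression, the function names, and the reading of B as bug and F as feature are all assumptions inferred from the example summary above.

```javascript
// Assumed prefix meanings: B = bug, F = feature.
const ITEM_TYPES = { B: "bug", F: "feature" };

// Collects unique item IDs (B1, F12, ...) in order of first appearance.
function extractItems(markdown) {
  const seen = new Map();
  for (const match of markdown.matchAll(/\b([BF])(\d+)\b/g)) {
    const id = match[1] + match[2];
    if (!seen.has(id)) seen.set(id, { id, type: ITEM_TYPES[match[1]] });
  }
  return [...seen.values()];
}

// Builds a one-line summary in the style of the dialog above.
function summarize(items) {
  const bugs = items.filter((i) => i.type === "bug").length;
  const features = items.filter((i) => i.type === "feature").length;
  return `Backlog with ${items.length} items (${bugs} bugs, ${features} features).`;
}
```

The word boundaries keep tokens such as B2B from being counted as item B2, and repeated mentions of the same ID are counted once.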

Dashboard

Each orchestrator instance serves a real-time monitoring dashboard over HTTP. The default port is auto-assigned starting from 3111.

http://localhost:3111

The dashboard features:

OKLCH color system with perceptually uniform accents and 4-level depth hierarchy
Task status chips and score badges for instant scannability
Live WebSocket connection with automatic reconnection
Phase timeline with active border indicators
Log viewer with split timestamps and colored message types
Accessibility -- ARIA roles, keyboard navigation, focus-visible, prefers-reduced-motion
Responsive -- auto-fit grid, mobile padding overrides

Dashboard Authentication

Set the ORCHESTRATOR_TOKEN environment variable to require authentication:

export ORCHESTRATOR_TOKEN=my-secret-token

When set, all dashboard and API requests must include the token:

Query parameter: ?token=my-secret-token
Header: Authorization: Bearer my-secret-token
WebSocket: ws://localhost:3111?token=my-secret-token

The /health endpoint is always accessible without authentication.
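
The rules above can be expressed as a single check. This is a sketch under assumptions, not the dashboard's implementation: isAuthorized is a hypothetical helper, and req is a minimal { url, headers } object shaped like a Node http request (header names lowercased).

```javascript
// Hypothetical request check mirroring the documented rules.
function isAuthorized(req, token = process.env.ORCHESTRATOR_TOKEN) {
  if (!token) return true; // no token configured: authentication is off

  const url = new URL(req.url, "http://localhost");
  if (url.pathname === "/health") return true; // always accessible

  // Query parameter form, also used by WebSocket connections.
  if (url.searchParams.get("token") === token) return true;

  // Header form: Authorization: Bearer <token>
  const header = req.headers["authorization"] || "";
  return header === `Bearer ${token}`;
}
```

A production check would typically compare tokens in constant time; plain equality is used here only to keep the sketch short.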

HTTP API

| Endpoint | Method | Description |
|----------|--------|-------------|
| / | GET | Dashboard HTML |
| /health | GET | Instance health and uptime (no auth required) |
| /state | GET | Full orchestrator state, phases, history |
| /logs | GET | Last 200 log entries |
| /restart | POST | Restart the orchestrator |
| /stop | POST | Stop the orchestrator |

WebSocket Events

Connect to ws://localhost:<port> for real-time events:

initial_state -- sent on connection with current state
plan_ready -- execution plan generated
phase_start / phase_done -- phase lifecycle
task_start / task_done -- task lifecycle with review scores
task_reviewed -- code review results (score, approved, issues)
state_update -- full state sync on key events
orchestrator_restarting / orchestrator_completed -- supervisor events
run_complete -- final status with task counts and elapsed time
error -- error details
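
A client could fold these events into a small progress summary. The sketch below assumes each message is JSON text with a type field naming the event and that task_reviewed carries a numeric score; the payload schema is not documented here, so treat both as assumptions. createProgressTracker is a hypothetical helper.

```javascript
// Assumed message shape: JSON text with a `type` field naming the event.
// Payload fields such as `score` are assumptions, not a documented schema.
function createProgressTracker() {
  const progress = {
    tasksStarted: 0,
    tasksDone: 0,
    scores: [],
    errors: [],
    finished: false,
  };

  function handleMessage(raw) {
    let event;
    try {
      event = JSON.parse(raw);
    } catch {
      return progress; // ignore frames that are not JSON
    }
    if (!event || typeof event !== "object") return progress;

    switch (event.type) {
      case "task_start":
        progress.tasksStarted += 1;
        break;
      case "task_done":
        progress.tasksDone += 1;
        break;
      case "task_reviewed":
        if (typeof event.score === "number") progress.scores.push(event.score);
        break;
      case "error":
        progress.errors.push(event);
        break;
      case "run_complete":
        progress.finished = true;
        break;
      default:
        break; // other events are not tracked in this sketch
    }
    return progress;
  }

  return { handleMessage, progress };
}
```

To use it, pass each incoming message's data to handleMessage from the WebSocket client's message handler.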

Plugin System

Plugins add custom validators and lifecycle hooks. Create a plugin file that exports a register function:

// my-validator.mjs
export function register(orch) {
  // Add a custom validator (runs during phase validation)
  orch.addValidator("my-check", async (cwd, config) => {
    // Run your checks here
    return { type: "my-check", ok: true, message: "All checks passed" };
  });

  // Add a lifecycle hook
  orch.addHook("afterTask", (task, phase) => {
    console.log(`Task ${task.id} completed with score ${task.reviewScore}`);
  });
}

// .orchestrator.config.mjs
export default {
  plugins: ["./my-validator.mjs"],
};

Available Hook Events

| Hook | Arguments | Description |
|------|-----------|-------------|
| beforeRun | (orchestrator) | Before orchestration starts |
| afterRun | (orchestrator, status) | After orchestration ends |
| beforePhase | (phase, phaseIdx) | Before a phase starts |
| afterPhase | (phase, phaseIdx) | After a phase completes |
| beforeTask | (task, phase) | Before a task starts |
| afterTask | (task, phase) | After a task completes |
| beforePhaseValidation | (phase, phaseIdx) | Before phase validation runs |
| onValidationFail | (result, phase) | When a validation check fails |
| onReviewComplete | (task, review) | When a task review finishes |
| onEvent | (event) | All orchestrator events |
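
As a further example of combining hooks, the sketch below collects validation failures and prints a summary when the run ends. It relies only on addHook and the hook arguments listed above; the file name, the phase.name field, and the returned array are assumptions made for illustration.

```javascript
// failure-report.mjs -- hypothetical plugin file name.
// In a real plugin module, prefix register with `export`, as in the example above.
function register(orch) {
  const failures = [];

  // Record each failed validation check. `phase.name` is an assumed field.
  orch.addHook("onValidationFail", (result, phase) => {
    failures.push({
      phase: phase?.name ?? "unknown",
      type: result.type,
      message: result.message,
    });
  });

  // Print a short summary once orchestration ends.
  orch.addHook("afterRun", (_orchestrator, status) => {
    console.log(`Run ended (${status}) with ${failures.length} validation failure(s)`);
    for (const f of failures) {
      console.log(`  [${f.phase}] ${f.type}: ${f.message}`);
    }
  });

  return failures; // returned only to make the sketch easy to inspect
}
```

Keeping the collected failures in a closure avoids global state, so the plugin behaves the same when several instances run side by side.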

Environment Variables

| Variable | Description |
|----------|-------------|
| ORCHESTRATOR_TOKEN | Dashboard and API authentication token. When set, all requests (except /health) require this token. |

Multi-Language Support

The analyzer automatically detects your project's ecosystem and adjusts build/test/dev commands accordingly. Supported ecosystems:

| Ecosystem | Detection | Default Build | Default Test |
|-----------|-----------|---------------|--------------|
| Node.js | package.json | npm run build | npm test |
| Python | pyproject.toml, requirements.txt, setup.py, Pipfile | varies | pytest |
| Go | go.mod | go build ./... | go test ./... |
| Rust | Cargo.toml | cargo build | cargo test |
| Java | pom.xml, build.gradle, build.gradle.kts | mvn package / gradle build | mvn test / gradle test |
| Ruby | Gemfile | bundle exec rake | bundle exec rspec |
| PHP | composer.json | composer install | vendor/bin/phpunit |
| .NET | *.csproj, *.sln | dotnet build | dotnet test |

Override any detected command in your .orchestrator.config.mjs.
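
Detection by marker file, with config overrides applied on top, might look roughly like this. The marker files and default commands are transcribed from the table; the function name, the first-match priority when several markers are present, and the split of Java into Maven and Gradle entries are assumptions of this sketch.

```javascript
// Marker files and default commands from the table above.
// A null build command stands for "varies".
const ECOSYSTEMS = [
  { name: "Node.js", markers: ["package.json"], build: "npm run build", test: "npm test" },
  { name: "Python", markers: ["pyproject.toml", "requirements.txt", "setup.py", "Pipfile"], build: null, test: "pytest" },
  { name: "Go", markers: ["go.mod"], build: "go build ./...", test: "go test ./..." },
  { name: "Rust", markers: ["Cargo.toml"], build: "cargo build", test: "cargo test" },
  { name: "Java (Maven)", markers: ["pom.xml"], build: "mvn package", test: "mvn test" },
  { name: "Java (Gradle)", markers: ["build.gradle", "build.gradle.kts"], build: "gradle build", test: "gradle test" },
  { name: "Ruby", markers: ["Gemfile"], build: "bundle exec rake", test: "bundle exec rspec" },
  { name: "PHP", markers: ["composer.json"], build: "composer install", test: "vendor/bin/phpunit" },
  { name: ".NET", markers: ["*.csproj", "*.sln"], build: "dotnet build", test: "dotnet test" },
];

// "*.ext" markers match by suffix; all others match the exact file name.
function markerMatches(marker, file) {
  return marker.startsWith("*") ? file.endsWith(marker.slice(1)) : file === marker;
}

// fileNames: names found in the project root. overrides: values from the
// project config (buildCommand, testCommand), which take precedence.
function detectEcosystem(fileNames, overrides = {}) {
  const found = ECOSYSTEMS.find((eco) =>
    eco.markers.some((m) => fileNames.some((f) => markerMatches(m, f)))
  );
  if (!found) return null;
  return {
    ecosystem: found.name,
    buildCommand: overrides.buildCommand ?? found.build,
    testCommand: overrides.testCommand ?? found.test,
  };
}
```

For example, a root containing go.mod resolves to the Go defaults, while passing { buildCommand: "pnpm run build" } for a Node.js project replaces only the build command and keeps npm test.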

Architecture Overview

code-orchestrator/
├── watcher/                       # Core CLI + engine
│   ├── cli.mjs                    # CLI entry point, subcommand routing, PM2 daemon
│   ├── watcher.mjs                # Supervisor: HTTP/WS server, auto-restart, lifecycle
│   ├── watchdog.mjs               # System watchdog (reboot recovery)
│   ├── dashboard/index.html       # Real-time monitoring dashboard (OKLCH, accessible)
│   ├── package.json
│   └── src/
│       ├── orchestrator.mjs       # Core execution engine (phase/task loop)
│       ├── analyzer.mjs           # Two-phase codebase analyzer (local scan + Claude)
│       ├── claude-cli.mjs         # Headless claude -p adapter with cost tracking
│       ├── reviewer.mjs           # Code review via Claude pipe mode
│       ├── validator.mjs          # Build, test, e2e, custom validation
│       ├── spec.mjs               # Spec-to-plan converter (24-phase build pipeline)
│       ├── checkpoint.mjs         # Atomic checkpoint save/load for crash recovery
│       ├── rate-limiter.mjs       # Rate limiter for Claude API calls
│       ├── config.mjs             # Project config loader and merger
│       ├── plugins.mjs            # Plugin registry (validators + hooks)
│       ├── history.mjs            # Run history tracking and stats
│       ├── planner.mjs            # Mode dispatcher
│       ├── models.mjs             # Constants, enums, factory functions
│       ├── jsonl.mjs              # JSONL transcript writer
│       └── modes/                 # 8 specialized execution modes
├── vscode-extension/              # VS Code / Cursor extension
│   ├── src/
│   │   ├── extension.ts           # Extension entry, commands, webview dashboard
│   │   ├── file-analyzer.ts       # Smart .md file analyzer (mode + prompt generation)
│   │   ├── runner.ts              # CLI runner with binary resolution + duplicate check
│   │   ├── status-bar.ts          # Status bar progress indicator
│   │   └── run-history.ts         # Run history tree view
│   ├── media/sidebar-icon.svg
│   └── package.json
├── spec.example.md
├── ROADMAP.md
├── SECURITY.md
├── CONTRIBUTING.md
├── CHANGELOG.md
├── LICENSE
└── README.md

Writing a Spec File

For build mode, create a markdown spec describing your project:

# My SaaS App

## Overview
A project management tool for small teams.

## Tech Stack
- Framework: Next.js 14 with App Router
- Database: PostgreSQL with Drizzle ORM
- Auth: Better Auth with Google OAuth
- Payments: Stripe

## Entities
- Project: name, description, status, owner
- Task: title, description, priority, assignee, project
- Comment: text, author, task

## Features
- Dashboard with project overview and task board
- Kanban board with drag-and-drop
- Team member invitations via email
- Real-time notifications
- Stripe billing with free/pro/enterprise tiers

See spec.example.md for a complete example.

Platform Notes

Windows: Claude CLI resolved at ~/.claude/local/claude.exe or via PATH
Linux/macOS: Claude CLI resolved via PATH
Each claude -p invocation is a clean subprocess -- no PTY, no zombie processes
PM2 is used for background process management and log persistence

Cost Guidance

Code Orchestrator calls claude -p for each task, review, and fix attempt. Costs depend on the mode and project complexity:

| Mode | Typical Claude Calls | Estimated Cost Range |
|------|---------------------|---------------------|
| fix (simple bug) | 3-8 | $0.10 - $0.50 |
| feature (medium) | 10-25 | $0.50 - $2.00 |
| audit | 5-15 | $0.30 - $1.50 |
| build (full project) | 50-200+ | $5.00 - $30.00+ |

Use --dry-run to preview the plan and estimate calls before executing. The dashboard shows real-time cost tracking during execution.

Roadmap

See ROADMAP.md for planned features and what's coming next.

Security

See SECURITY.md for the security model, permission modes, and responsible disclosure policy.

Contributing

See CONTRIBUTING.md for development setup, code style, and PR guidelines.

License

MIT