@malayvuong/conductor-ai
v2026.3.1
Autonomous execution supervisor for AI coding CLIs
Conductor
Supervisor for AI coding CLIs. Manages sessions, decomposes goals into work packages, runs engines (Claude, Codex) in a loop, and produces structured reports with full execution history.
User
|
v
cdx config set engine claude # one-time setup
cdx doctor # check environment
cdx session start my-project
cdx execute plan.md --until-done # plan mode
cdx execute "fix login bug" --until-done # no-plan mode
cdx status # check progress
cdx inspect # deep dive
cdx inspect --goal 1 --insights # drill into goal
cdx session pause # pause current session
cdx session switch other-project # switch sessions
|
v
┌─────────────── Supervisor Layer ───────────────┐
│  Session → Goal → Work Packages → Snapshots    │
│  Compactor (context between runs)              │
│  Scheduler (WP ordering, retries, blockers)    │
│  Progress detection (evidence-based)           │
│  Closeout summary (per goal)                   │
└────────────────────┬───────────────────────────┘
|
┌────────────────────v───────────────────────────┐
│  Execution Layer                               │
│  Task → Run → Logs → Report → Heartbeat        │
│  Engine adapters (Claude, Codex)               │
│  Stream parser, log interpreter                │
└────────────────────────────────────────────────┘
|
v
SQLite (local)
Architecture
Two-layer design:
- Supervisor Layer — Sessions, goals, work packages, snapshots, execution attempts. Manages the "what to do next" loop: parse plan → create WPs → dispatch engine → evaluate result → snapshot state → repeat or finish.
- Execution Layer — Tasks, runs, logs, reports, heartbeat. Handles the "how to run" mechanics: classify task → build prompt → spawn engine → stream output → generate report.
Session-first UX: Sessions are the primary user-facing surface. Session names are reusable labels (e.g. solo-defender, ispa-cms) — not permanent unique IDs. When a session completes, it is archived automatically, and the label becomes available for a new run. Goals are internal state managed by the supervisor. Users never need to handle goal IDs or invent -v2 suffixes.
Prerequisites
- Node.js 22+
- One of the supported AI CLI tools installed:
  - Claude Code (claude)
  - Codex CLI (codex)
Install
git clone <repo-url> conductor
cd conductor
npm install
npm run build
npm link # makes `cdx` available globally
Quick Start
# First time: set defaults (once)
cdx config set engine claude
cdx config set path /path/to/project
# Check environment
cdx doctor
# Start a session
cdx session start my-project
# Execute a plan file (decomposes into WPs, runs in loop)
cdx execute plan.md --until-done
# Or execute an ad-hoc task (no plan file needed)
cdx execute "fix the login API 500 error" --until-done
# Check progress
cdx status
# Deep inspection (WPs, attempts, snapshots, closeout)
cdx inspect
# Resume after interruption (Ctrl+C pauses, next execute resumes)
cdx execute --until-done
# View session history
cdx session history
cdx session history my-project # all runs for a label
# Session management
cdx session current # which session is active
cdx session pause # pause session + goal
cdx session resume # resume most recent paused
cdx session switch other-project # switch sessions
cdx session close # close + archive session
# Label reuse — after completion, same label starts a fresh run
cdx session start my-project # → creates run #2
# Inspect drill-down
cdx inspect --goal 1 # single goal detail
cdx inspect --goal 1 --attempts # attempt timeline
cdx inspect --goal 1 --snapshots # snapshot chain
cdx inspect --goal 1 --insights # decisions, assumptions, questions
cdx inspect --label my-project # all runs for a label (including archived)
Commands
Session Management
cdx session start <label>
Start a session under a reusable label.
cdx session start solo-defender # uses config defaults
cdx session start solo-defender --engine codex # override engine
cdx session start solo-defender --path /other/project # override path
| Flag | Required | Description |
|------|----------|-------------|
| --engine | No | Override default engine. Uses config/env fallback if omitted. |
| --path | No | Override workspace path. Uses config default or cwd if omitted. |
Behavior depends on existing sessions for that label:
| Existing state | Action |
|----------------|--------|
| No sessions | Create new session (run #1) |
| Only archived runs | Create new session (run #N+1), show prior run count |
| Active/created session | Focus existing session |
| Paused session | Resume existing session |
Labels are reusable — once a session is archived (completed or closed), the same label can be used for a fresh run. No need to invent -v2, -v3 suffixes.
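The decision table above can be read as a small pure function. The following is a minimal sketch, not Conductor's actual code: `SessionRow`, `StartAction`, and `resolveStart` are hypothetical names, and it assumes that at most one non-archived session exists per label.

```typescript
// Hypothetical types for illustration; the real schema may differ.
type SessionStatus = "created" | "active" | "paused" | "archived";

interface SessionRow {
  label: string;
  runIndex: number;
  status: SessionStatus;
}

type StartAction =
  | { kind: "create"; runIndex: number; priorRuns: number }
  | { kind: "focus"; runIndex: number }
  | { kind: "resume"; runIndex: number };

// Decide what `cdx session start <label>` should do, given all known sessions.
function resolveStart(label: string, sessions: SessionRow[]): StartAction {
  const runs = sessions.filter((s) => s.label === label);
  // ASSUMPTION: at most one run per label is not archived.
  const live = runs.find((s) => s.status !== "archived");
  if (live) {
    return live.status === "paused"
      ? { kind: "resume", runIndex: live.runIndex }
      : { kind: "focus", runIndex: live.runIndex };
  }
  // No live run: the next run index is one past the highest archived one.
  const maxIndex = runs.reduce((m, s) => Math.max(m, s.runIndex), 0);
  return { kind: "create", runIndex: maxIndex + 1, priorRuns: runs.length };
}
```

With no sessions for a label this yields run #1; with only archived runs it yields run #N+1 together with the prior run count, which matches the table rows.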
cdx session list
List all sessions with status and goal count.
cdx session list
cdx session list --status active
cdx session current
Show which session is active.
cdx session pause
Pause current session and its active goal.
cdx session resume [name]
Resume a paused session. Without a name, resumes the most recent paused session.
cdx session switch <name>
Switch to another session (pauses current, activates target).
cdx session close
Close and archive the current session. Completed goals stay completed; unfinished goals are abandoned with closeout summaries. The label becomes available for a new session.
cdx status
Show current session status: engine, path, active goal, WP progress, retries. Includes hygiene warnings for stale sessions and too many paused goals.
cdx inspect
Detailed inspection of current session or a label's full history.
cdx inspect # current active session
cdx inspect --label solo-defender # all runs for a label (including archived)
cdx inspect --goal <N> # single goal detail
cdx inspect --goal <N> --attempts # attempt timeline
cdx inspect --goal <N> --snapshots # snapshot chain
cdx inspect --goal <N> --insights # decisions, assumptions, questions, follow-ups, constraints
The --label flag is especially useful after a session has been archived: it lets you inspect completed work without needing an active session.
cdx session history [label]
View session goal history. Without a label, shows the current active session. With a label, shows all runs (including archived) grouped by run index.
cdx session history # current session
cdx session history solo-defender # all runs for label
Execution
cdx execute [source] --until-done
The main execution command. Behavior depends on input:
| Input | Mode | Behavior |
|-------|------|----------|
| cdx execute plan.md --until-done | Plan mode | Parse plan → create WPs → run loop |
| cdx execute "fix bug" --until-done | No-plan mode | Create single WP → run with evidence-based completion |
| cdx execute --until-done | Resume | Continue active unfinished goal |
Resume vs New Goal rules:
- Source provided (file or text) → always creates new goal. If an active unfinished goal exists, it is auto-paused.
- No source → resume only. Continues the active unfinished goal.
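The mode table and the resume rules can be condensed into one decision function. This is a sketch under stated assumptions: `decideExecute` and `ExecuteDecision` are hypothetical names, a source is treated as a plan file when it ends in `.md` (the real CLI may instead check whether the path exists), and throwing when there is nothing to resume is a guess at behavior the README does not describe.

```typescript
type ExecuteMode = "plan" | "no-plan" | "resume";

interface ExecuteDecision {
  mode: ExecuteMode;
  createsNewGoal: boolean;
  pausesActiveGoal: boolean;
}

function decideExecute(
  source: string | undefined,
  hasActiveUnfinishedGoal: boolean,
): ExecuteDecision {
  if (source === undefined) {
    // No source: resume only.
    if (!hasActiveUnfinishedGoal) {
      // ASSUMPTION: the README does not say what happens in this case.
      throw new Error("Nothing to resume: no active unfinished goal");
    }
    return { mode: "resume", createsNewGoal: false, pausesActiveGoal: false };
  }
  // ASSUMPTION: ".md" suffix marks a plan file; anything else is ad-hoc text.
  const mode: ExecuteMode = source.toLowerCase().endsWith(".md") ? "plan" : "no-plan";
  // A provided source always creates a new goal and auto-pauses an active one.
  return { mode, createsNewGoal: true, pausesActiveGoal: hasActiveUnfinishedGoal };
}
```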
Execution Layer (Low-Level)
cdx run
Run a single task directly (bypasses supervisor layer).
cdx run --engine claude --task "fix the login bug"
cdx resume <taskId>
Resume a task with curated context from previous runs.
cdx tasks
List all tasks with status.
cdx logs <runId>
View saved logs for a run.
cdx report <runId>
View the structured report for a run.
cdx runs show <runId>
Inspect run metadata.
Configuration
cdx config set <key> <value>
Set a global config value. Short aliases supported.
cdx config set engine claude # set default engine
cdx config set path /path/to/project # set default workspace path
cdx config set heartbeat 30 # heartbeat interval (seconds)
cdx config set stuck-threshold 120 # stall detection threshold (seconds)
cdx config get <key>
Get a single config value.
cdx config get engine # → claude
cdx config show
Show all config values. On fresh install, shows quick-start guide.
cdx config unset <key>
Remove a config value.
cdx config unset engine
| Key | Alias for | Description |
|-----|-----------|-------------|
| engine | defaultEngine | Default engine: claude or codex |
| path | defaultPath | Default workspace path |
| heartbeat | heartbeatIntervalSec | Heartbeat check interval (seconds, default 15) |
| stuck-threshold | stuckThresholdSec | Stall detection threshold (seconds, default 60) |
Legacy aliases: cdx set-path, cdx get-path, cdx clear-path still work.
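To illustrate how the short aliases in the table map onto stored keys, here is a minimal sketch. The alias pairs come from the table; everything else is an assumption: `canonicalKey` and `setConfig` are hypothetical helpers, unknown keys are passed through unchanged, and the positive-integer check is a guess (the project lists zod for validation, which is not used here).

```typescript
// Alias → canonical key, taken from the table above.
const KEY_ALIASES: Record<string, string> = {
  engine: "defaultEngine",
  path: "defaultPath",
  heartbeat: "heartbeatIntervalSec",
  "stuck-threshold": "stuckThresholdSec",
};

// Keys whose values are stored as numbers (seconds).
const NUMERIC_KEYS = new Set(["heartbeatIntervalSec", "stuckThresholdSec"]);

type Config = Record<string, string | number>;

// Canonical keys and unknown keys pass through unchanged (ASSUMPTION).
function canonicalKey(key: string): string {
  return KEY_ALIASES[key] ?? key;
}

// Returns a new config object; the input is not mutated.
function setConfig(config: Config, key: string, raw: string): Config {
  const canonical = canonicalKey(key);
  if (!NUMERIC_KEYS.has(canonical)) {
    return { ...config, [canonical]: raw };
  }
  const n = Number(raw);
  // ASSUMPTION: intervals must be positive whole seconds.
  if (!Number.isInteger(n) || n <= 0) {
    throw new Error(`${key} must be a positive integer number of seconds`);
  }
  return { ...config, [canonical]: n };
}
```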
cdx doctor
Check environment and configuration. Shows what's set up, what's missing, and what to do next.
cdx doctor
[OK] Config file ~/.conductor/config.json
[OK] Default engine claude
[OK] Engine: claude found in PATH
[--] Engine: codex not found in PATH
[OK] Default path /Users/me/projects/my-app
[OK] Active session my-project [active]
Ready to go.
Configuration
Persistent config at ~/.conductor/config.json:
{
"defaultPath": "/Users/me/projects/my-app",
"defaultEngine": "claude",
"heartbeatIntervalSec": 15,
"stuckThresholdSec": 60
}
Engine Resolution
The engine is resolved through a priority chain, so there is no need to specify --engine every time. Sources are checked in this order:
- --engine flag on the current command
- Engine stored in the session
- defaultEngine in config
- DEFAULT_ENGINE environment variable
- Helpful error with onboarding instructions
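The priority chain can be sketched as a first-match lookup over the four sources. The project tree mentions a `resolveEngine()` in `config/service.ts`, but its real signature is not shown, so the `EngineSources` shape, the parameter passing, and the error wording below are all assumptions. The environment value is passed in as a plain field rather than read from the process, to keep the sketch self-contained.

```typescript
type Engine = "claude" | "codex";

// Hypothetical input shape: one optional value per source, highest priority first.
interface EngineSources {
  flag?: string;          // --engine on the current command
  sessionEngine?: string; // engine stored in the session
  configEngine?: string;  // defaultEngine in config.json
  envEngine?: string;     // DEFAULT_ENGINE environment variable
}

function isEngine(value: string | undefined): value is Engine {
  return value === "claude" || value === "codex";
}

function resolveEngine(sources: EngineSources): Engine {
  const candidates = [
    sources.flag,
    sources.sessionEngine,
    sources.configEngine,
    sources.envEngine,
  ];
  for (const candidate of candidates) {
    if (candidate === undefined || candidate === "") continue; // source not set
    if (!isEngine(candidate)) {
      throw new Error(`Unknown engine "${candidate}". Supported: claude, codex`);
    }
    return candidate; // first source that is set wins
  }
  // Nothing set anywhere: fail with a hint instead of guessing a default.
  throw new Error("No engine configured. Run: cdx config set engine claude");
}
```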
# One-time setup
cdx config set engine claude
# Then just use sessions
cdx session start my-project
Supervisor Loop
The supervisor loop (cdx execute ... --until-done) works as follows:
- Parse input — Plan file → decompose into work packages. Ad-hoc text → single WP.
- Schedule — Pick next WP by seq order, skip completed/blocked.
- Build prompt — Include goal context, WP description, snapshot from previous run (if any), done criteria.
- Dispatch engine — Spawn Claude/Codex with the prompt.
- Evaluate — Parse report, detect progress, check WP completion.
- Snapshot — Capture state (completed items, in-progress, remaining, decisions, files, blockers) for next run's context.
- Advance or retry — Complete WP → next WP. No progress → retry with escalated prompt strategy. Exhausted retries → mark failed.
- Finalize check — After each WP result, immediately check if all WPs are done. If yes, finalize atomically (goal + session + closeout in one transaction). This prevents the SIGINT race where Ctrl+C during the last engine run could leave stale state.
- Loop until all WPs done or hard-blocked.
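The scheduling, retry, and finalize-check steps listed above can be sketched as a compact loop. This is an illustration rather than the contents of `loop.ts`: `WorkPackage`, `pickNext`, and `runLoop` are hypothetical, the engine is injected as a synchronous callback, and plan parsing, prompt building, and snapshots are left out. It also assumes that a failed WP does not stop the remaining WPs from running, which the README does not state.

```typescript
type WpStatus = "pending" | "completed" | "failed" | "blocked";

interface WorkPackage {
  seq: number;
  title: string;
  status: WpStatus;
  retryCount: number;
  retryBudget: number;
}

// Outcome of one engine run, as judged by an evaluate step that is not shown.
type AttemptResult = "completed" | "no_progress";

// Schedule: lowest seq among WPs that are neither done, failed, nor blocked.
function pickNext(wps: WorkPackage[]): WorkPackage | undefined {
  return wps
    .filter((wp) => wp.status === "pending")
    .sort((a, b) => a.seq - b.seq)[0];
}

function runLoop(
  wps: WorkPackage[],
  runEngine: (wp: WorkPackage) => AttemptResult,
): "completed" | "stopped" {
  for (let wp = pickNext(wps); wp !== undefined; wp = pickNext(wps)) {
    const result = runEngine(wp);
    if (result === "completed") {
      wp.status = "completed";
    } else if (wp.retryCount >= wp.retryBudget) {
      wp.status = "failed"; // retries exhausted
    } else {
      wp.retryCount += 1; // same WP is picked again on the next iteration
    }
    // Finalize check immediately after each WP result.
    if (wps.every((w) => w.status === "completed")) return "completed";
  }
  return "stopped"; // nothing runnable left, but not everything completed
}
```

Each iteration either completes a WP, fails it, or consumes one retry, so the loop terminates for any finite retry budget.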
Prompt strategy escalation: normal → focused → surgical → recovery (based on retry count).
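The README gives the escalation order but not the exact retry counts at which each strategy starts. A minimal sketch, assuming one escalation step per retry and staying at `recovery` afterwards (`strategyForRetry` is a hypothetical name):

```typescript
// Strategy order as documented: normal → focused → surgical → recovery.
const STRATEGIES = ["normal", "focused", "surgical", "recovery"] as const;
type PromptStrategy = (typeof STRATEGIES)[number];

// ASSUMPTION: retry 0 → normal, 1 → focused, 2 → surgical, 3 and above → recovery.
function strategyForRetry(retryCount: number): PromptStrategy {
  const clamped = Math.min(
    Math.max(Math.trunc(retryCount), 0),
    STRATEGIES.length - 1,
  );
  // Fallback covers non-finite input such as NaN.
  return STRATEGIES[clamped] ?? "recovery";
}
```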
Evidence-based completion (ad-hoc mode): Completion signal alone is not enough — needs evidence (files_changed, files_inspected, fix_applied, verification, what_implemented, substantial output, or findings).
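As a sketch of the rule "signal plus evidence", the check below accepts a completion signal only when at least one evidence field is non-empty. The field names follow the list above; the `RunReport` shape, the `completionSignal` and `outputChars` fields, and the 500-character threshold for "substantial output" are assumptions made for illustration.

```typescript
// Hypothetical report shape; only the evidence field names come from the README.
interface RunReport {
  completionSignal: boolean;
  files_changed?: string[];
  files_inspected?: string[];
  fix_applied?: string;
  verification?: string;
  what_implemented?: string;
  findings?: string[];
  outputChars?: number; // stand-in for "substantial output"
}

// ASSUMPTION: the real threshold for "substantial output" is not documented.
const SUBSTANTIAL_OUTPUT_CHARS = 500;

function hasEvidence(report: RunReport): boolean {
  const nonEmpty = (value: string | string[] | undefined): boolean =>
    value !== undefined && value.length > 0;
  return (
    nonEmpty(report.files_changed) ||
    nonEmpty(report.files_inspected) ||
    nonEmpty(report.fix_applied) ||
    nonEmpty(report.verification) ||
    nonEmpty(report.what_implemented) ||
    nonEmpty(report.findings) ||
    (report.outputChars ?? 0) >= SUBSTANTIAL_OUTPUT_CHARS
  );
}

// A completion signal without evidence does not count as done.
function isComplete(report: RunReport): boolean {
  return report.completionSignal && hasEvidence(report);
}
```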
Goal lifecycle: created → active → paused/completed/failed/hard_blocked/abandoned.
Session lifecycle: created → active → paused → archived. Completed sessions are automatically archived. The session label (e.g. solo-defender) becomes immediately reusable for a new run.
Closeout summary: Generated at every terminal state with objective, files touched, decisions, blockers, and next recommended action.
State Consistency Guarantees
The supervisor loop ensures that DB state is always consistent with execution reality:
- Transactional finalization — Goal status + session status + closeout summary are written in a single SQLite transaction. No partial writes possible.
- SIGINT-safe completion — After each WP completes, the loop checks for goal completion before checking the interrupted flag. If Ctrl+C fires during the last engine run, the goal still finalizes as completed.
- Interrupt safety net — The interrupt path also checks if all WPs are actually done before marking as paused. If the goal completed, it finalizes correctly regardless of interrupt timing.
- No silent error swallowing — Closeout generation errors propagate rather than being silently logged.
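The ordering of checks and the all-or-nothing finalization can be illustrated without a database. In this sketch an immutable in-memory object stands in for the SQLite rows, and `nextStep`, `finalize`, and `GoalState` are hypothetical names; the real implementation is described as using a single SQLite transaction instead.

```typescript
// Hypothetical in-memory stand-in for the goal and session rows.
interface GoalState {
  goalStatus: "active" | "paused" | "completed";
  sessionStatus: "active" | "paused" | "archived";
  closeout: string | null;
}

type NextStep = "finalize" | "pause" | "continue";

// Completion is checked before the interrupted flag, so a Ctrl+C that
// arrives during the last engine run cannot downgrade a finished goal.
function nextStep(allWpsDone: boolean, interrupted: boolean): NextStep {
  if (allWpsDone) return "finalize";
  if (interrupted) return "pause";
  return "continue";
}

// All-or-nothing update: the closeout is built first and the new state is
// returned only if that succeeds. If buildCloseout throws, the error
// propagates and the previous state object is left untouched.
function finalize(state: GoalState, buildCloseout: () => string): GoalState {
  const closeout = buildCloseout();
  return {
    ...state,
    goalStatus: "completed",
    sessionStatus: "archived",
    closeout,
  };
}
```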
Live Progress During Execution
While the loop runs, periodic heartbeat events show real-time status:
── session: my-project | goal: Implement CMS ──
[WP 1/5] Scan structure — attempt 1 (normal)
[WP 1/5] heartbeat: alive | files: 3 | idle: 2s | last: Read | strategy: normal
[WP 1/5] heartbeat: alive | files: 8 | idle: 0s | last: Edit | strategy: normal
[WP 1/5] completed
[WP 2/5] Build models — attempt 1 (normal)
[WP 2/5] heartbeat: alive | files: 14 | idle: 5s | last: Write | strategy: normal
[WP 2/5] no output for 65s — possible stall
Each heartbeat shows:
- status: alive, idle, suspected_stuck, recovered
- files: unique files touched during current run
- idle: seconds since last engine output
- last: most recent tool used (Read, Edit, Write, etc.)
- strategy: current prompt strategy
Stall detection: When the engine produces no output for longer than the stuck threshold (default 60s), a visible warning is emitted. Heartbeat events are also persisted to the database for post-run analysis.
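The four heartbeat statuses suggest a small state-tracked classifier. The sketch below is one plausible reading, not the logic in `heartbeat/monitor.ts`: the 30-second boundary between `alive` and `idle` is an assumption (the README documents only the stuck threshold), as is the rule that `recovered` is reported once, on the first heartbeat after a suspected stall.

```typescript
type HeartbeatStatus = "alive" | "idle" | "suspected_stuck" | "recovered";

interface HeartbeatThresholds {
  idleAfterSec: number;      // ASSUMPTION: not documented in the README
  stuckThresholdSec: number; // documented default: 60
}

const DEFAULT_THRESHOLDS: HeartbeatThresholds = {
  idleAfterSec: 30,
  stuckThresholdSec: 60,
};

// idleSec: seconds since the last engine output.
// previous: status reported by the previous heartbeat, if any.
function classifyHeartbeat(
  idleSec: number,
  previous: HeartbeatStatus | undefined,
  thresholds: HeartbeatThresholds = DEFAULT_THRESHOLDS,
): HeartbeatStatus {
  if (idleSec > thresholds.stuckThresholdSec) return "suspected_stuck";
  // Output resumed after a suspected stall: report the recovery once.
  if (previous === "suspected_stuck") return "recovered";
  return idleSec >= thresholds.idleAfterSec ? "idle" : "alive";
}
```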
Live Status While Running
cdx status and cdx inspect show live run info when a session is actively executing:
Session: my-project [Running]
Engine: claude
Path: /path/to/project
Goal: Implement CMS [active]
Progress: 2/5 WPs completed
Current: Build models
Run: 3m 42s elapsed | strategy: focused
Heartbeat: idle | idle: 35s
Data Storage
SQLite at data/conductor.db with two layers:
Supervisor tables:
- sessions — name (reusable label), run_index, title, engine, path, status, active_goal_id, working_summary, decisions
- goals — title, description, type, source_type, status, completion_rules, closeout_summary
- work_packages — seq, title, status, retry_count/budget, blocker_type/detail, done_criteria
- snapshots — trigger, summary, completed/in-progress/remaining items, decisions, files, blockers, next_action, assumptions, unresolved_questions, follow_ups
- execution_attempts — attempt_no, status, prompt_strategy, progress_detected, files_changed_count
Execution tables:
- tasks — raw_input, workspace, engine, classification, status
- runs — command, prompt, PID, exit code, timestamps, resumed_from_run_id, cost_usd
- run_logs — stdout/stderr/system lines, sequenced
- heartbeat_events — alive/idle/suspected_stuck/recovered
- run_reports — structured post-run analysis with task-type-specific fields
Schema migrations run automatically on startup.
Project Structure
conductor/
  src/
    cli/
      index.ts # CLI entry + command registration
      commands/
        session.ts # Session management + display helpers
        execute.ts # Supervisor execution (plan + no-plan)
        goal.ts # [Internal] Goal management
        run.ts # Single-run orchestration
        resume.ts # Resume with curated context
        tasks.ts # List tasks
        logs.ts # View run logs
        report.ts # Task-type-specific report display
        runs.ts # Run metadata inspection
        config.ts # cdx config set/get/show/unset + legacy aliases
        doctor.ts # cdx doctor: environment check
    core/
      config/service.ts # Config read/write, resolveEngine(), key aliases
      supervisor/
        loop.ts # Main supervisor loop (transactional finalization, SIGINT-safe)
        scheduler.ts # WP scheduling, status counting
        plan-parser.ts # Markdown plan → WP decomposition
        prompt-builder.ts # Supervisor prompt (plan + ad-hoc)
        progress.ts # Evidence-based progress detection
        compactor.ts # Snapshot builder + decision extraction
        closeout.ts # Goal closeout summary generation
        progress-reporter.ts # Live progress event formatting
        live-tracker.ts # Real-time file/tool tracking during runs
        hygiene.ts # Session health warnings
      storage/
        schema.ts # SQL DDL + migrations
        db.ts # SQLite singleton (WAL mode)
        repository.ts # Execution layer CRUD
        supervisor-repository.ts # Supervisor layer CRUD
      task/normalizer.ts # Keyword-based task classification
      prompt/builder.ts # Template loading + substitution
      engine/
        types.ts # EngineAdapter interface + factory
        claude.ts # Claude CLI adapter (stream-json)
        codex.ts # Codex CLI adapter
        stream-parser.ts # JSON event parser
        log-interpreter.ts # Unified log parsing into typed events
      runner/process.ts # child_process.spawn wrapper
      heartbeat/monitor.ts # State-tracked output monitoring
      report/generator.ts # Task-type-aware report extraction
      resume/
        context.ts # Best-run selection + typed context
        prompt.ts # Structured resume prompt rendering
    types/
      index.ts # Execution layer types
      supervisor.ts # Supervisor layer types
    utils/
      logger.ts # Timestamped console logger
      lookup.ts # Short-ID prefix resolution
  prompts/ # Prompt templates per engine/task type
  data/ # SQLite DB (gitignored)
  tests/ # Vitest test suite (349 tests, 31 files)
Development
npm run dev -- <command> # Run CLI in dev mode (tsx)
npm test # Run all tests (349 tests)
npm run test:watch # Watch mode
npm run build # Compile TypeScript to dist/
npm link # Link cdx command globally
Tech Stack
- Runtime: Node.js 22+, TypeScript
- CLI: commander
- Validation: zod
- Database: better-sqlite3 (WAL mode, foreign keys)
- Process: child_process.spawn with stdin pipe
- Tests: vitest
License
MIT
