@jterrats/open-orchestra
v1.2.0
Local control plane for AI-assisted development orchestration, evidence gates, and agent workflows.
Open Orchestra
You already have Claude Code, Codex, or Cursor. The problem is everything around them: prompts drift, context is assembled by hand, handoffs get lost, and "done" means different things every time.
Open Orchestra is the governance layer that runs on top of the agents you
already use. It sequences PM → PO → Architect → Developer → QA → Release,
injects the right context at each phase, enforces human gates where it
matters, and leaves a verifiable audit trail. The public CLI is orchestra.
Unlike autonomous agents like Devin, Orchestra doesn't replace your tools — it governs them. Unlike agent frameworks like CrewAI, you don't write code to define agents — you run a workflow against your existing AI runtime.
Real result from a governed story — negative means time saved vs. that mode:
Actual: 1.4d | vs Solo: -72% | vs AI-unguided: -53% | Cost: $0.0257
-72% means the task took 72% less time than the solo baseline declared at story start. Orchestra measures this automatically from the run log, with no manual input after the initial estimate.
Package naming: install @jterrats/open-orchestra; run the CLI as
orchestra. The shorter @jterrats/orchestra package is not the canonical
published install target.
Quick Start
Individual Mode: First Value In Minutes
npm install -g @jterrats/open-orchestra
orchestra init
orchestra task add --id DEMO-001 --title "Ship a governed README update" --owner developer --paths "README.md"
orchestra workflow run --task DEMO-001 --gates none
orchestra status
What happens automatically:
- Orchestra scores and selects a workflow template for the task.
- The active agent receives task-scoped context: role, paths, acceptance criteria, risks, evidence expectations, and relevant skills.
- The run creates durable phase artifacts, handoffs, reviews, and evidence in .agent-workflow/.
- Benchmark data can compare solo, AI-unguided, and AI-guided effort after the task is complete.
Use this path when you are one developer working quickly and still want a repeatable trail instead of one-off prompts.
Team Mode: Human Gates And Release Readiness
orchestra task add --id STORY-001 --title "Ship a governed change" --owner product_owner --paths "README.md"
orchestra estimate --task STORY-001 --sizing s --solo-days 1 --ai-unguided-days 0.5 --ai-guided-days 0.25
orchestra decision add --task STORY-001 --owner architect --title "Story sizing" --decision "s 2 points" --context "Production story" --consequences "Developer phase can start"
orchestra workflow run --task STORY-001 --gates phase
orchestra evidence add --task STORY-001 --role developer --type command --summary "Validation passed" --command "npm run precommit" --exit-code 0
orchestra review --task STORY-001 --role qa --result approve --findings "Acceptance criteria covered" --recommendation "Ready for release review"
orchestra release check --json
Use --gates phase when PO, architecture, QA, or release decisions need human
review. Use --gates all for regulated or high-risk work that must pause at
every phase transition.
Prefer a visual workflow? Start the local console with:
orchestra web
Stable installs use npm install -g @jterrats/open-orchestra@latest. Beta
dogfooding uses npm install -g @jterrats/open-orchestra@beta, followed by:
orchestra upgrade --smoke --json
For discovery or audits before adopting project-owned instruction files, start in advisory mode:
orchestra init --advisory
orchestra advisory convert --json
When developing Open Orchestra from this repository, use npm install,
npm run build, and node bin/orchestra.js ... to exercise the local source.
The short path is: initialize workflow files, create a task, run the local workflow, inspect status, and preview release readiness. The production path adds explicit estimates, architecture sizing, human gates, command evidence, QA review, and release checks. For a disposable fake-provider walkthrough, see docs/end-to-end-demo.md.
Core Product Surface
Start with the small command set that supports production delivery:
- Workspace: orchestra init, orchestra health --json, orchestra status.
- Work intake: orchestra task add, orchestra task list, orchestra task show.
- Governed execution: orchestra workflow run, orchestra decision add, orchestra evidence add, orchestra review.
- Tracker and release: orchestra github sync --issue <number>, orchestra release check --json, orchestra upgrade --smoke --json.
- Workflow tooling: orchestra workflow phase-plan --task <id>, orchestra refresh --check --json, orchestra cursor canvas status --json.
Use orchestra -h for the human quickstart and orchestra help commands for
the full command catalog. Use orchestra commands manifest --json for the
complete machine-readable reference. See
docs/core-command-surface.md and
docs/command-contracts.md for the stable/advanced
split and automation contract.
For Claude, Codex, Cursor, VS Code, Windsurf, and generic LLM usage, see
docs/runtime-llm-flow.md and
docs/runtime-adapters.md.
For the system map, see docs/architecture.md.
To start by job-to-be-done, use the
persona workflows guide for PO refinement,
developer execution, QA validation, tech lead oversight, and release
management.
For package naming rules, see docs/package-naming.md.
For tracker fallback behavior when gh is unavailable, see
docs/tracker-adapter-contract.md.
For the local, SaaS, and orchestrator threat model, see
docs/security-saas-orchestrator.md.
For the 1.0.0 documentation path across install, workflow, providers,
trackers, web console, release operations, troubleshooting, and migration, see
docs/adoption-guide.md.
For the public site content workflow, generated site manifest, Mermaid alignment,
and Technical Writer review expectations, see
docs/site-content-workflow.md.
1.0.0 Workflow Tooling
The 1.0.0 workflow surface keeps the CLI as the source of truth while making generated files and runtime views easier to audit:
- orchestra workflow phase-plan --task <id> --json recommends optional UX, docs, release, and risk-owner phases from task scope and project signals.
- Phase playbooks live in .agent-workflow/playbooks/ and are loaded only for the active phase in provider prompts, workflow render, runtime briefs, and runtime delegation packets.
- orchestra refresh --check --json detects stale generated instruction files; orchestra refresh --dry-run previews changes; orchestra refresh --force updates managed blocks without overwriting user-authored content.
- Provider progress is visible in workflow run output and the local web console through /api/workflow/progress.
- orchestra cursor canvas status|sync|clean --json is a Cursor-specific local bridge. It is optional and separate from portable runtime bootstrap files.
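The managed-block behavior of orchestra refresh can be pictured with a small sketch. This is illustrative Python, not Orchestra's implementation: the marker strings and the function names refresh_managed_block and is_stale are assumptions invented for this example. The idea is that only the text between a begin marker and an end marker is regenerated, so anything a user wrote outside the markers survives.

```python
# Hypothetical markers: the delimiters Open Orchestra actually uses may differ.
BEGIN = "<!-- orchestra:managed:begin -->"
END = "<!-- orchestra:managed:end -->"


def refresh_managed_block(text: str, generated: str) -> str:
    """Replace the content between BEGIN and END, keeping all other text."""
    start = text.find(BEGIN)
    end = text.find(END)
    if start == -1 or end == -1 or end < start:
        # No well-formed managed block: leave user-authored content alone.
        return text
    head = text[: start + len(BEGIN)]
    tail = text[end:]
    return f"{head}\n{generated.strip()}\n{tail}"


def is_stale(text: str, generated: str) -> bool:
    """True when a refresh would change the file (a --check style probe)."""
    return refresh_managed_block(text, generated) != text
```

In this sketch a check-only pass calls is_stale and writes nothing, while a forced refresh writes the result of refresh_managed_block back to disk.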
Supported Platforms
Open Orchestra supports current Node.js LTS/runtime releases on macOS, Linux,
and Windows. Repository scripts avoid Unix-only shell requirements for the
core quality and workflow gates; npm run validate:workflow runs through a
Node helper so it works in Windows shells as well as POSIX shells. The CI
dogfood path validates installed-package init and workflow validation on
ubuntu-latest, macos-latest, and windows-latest.
Known limitations: publish/release automation may still use GitHub-hosted Linux shell steps, but local development, validation, and installed CLI dogfooding are expected to be cross-platform.
Release readiness should be judged against the latest pushed commit. A production release is ready only after the local precommit gate and CI installed-package dogfood pass on Ubuntu, macOS, and Windows for the current HEAD.
For the 1.0.0 milestone, orchestra release check --json also reports a
gaReadiness go/no-go section. GA blockers include uncovered acceptance
criteria, blocking reviews, active locks, missing smoke or rollback evidence,
missing documentation or observability evidence, missing security and package
provenance evidence, missing public CLI contract evidence, missing migration
readiness evidence, missing release test matrix evidence, and accepted risks
without a follow-up or expiry.
The release test matrix is available with:
npm run release:matrix -- --json
See docs/release-test-matrix.md.
Autonomous Workflow
The workflow run command executes a full story lifecycle as a governed multi-phase sequence without manual step-by-step commands. Each phase creates a sub-task, generates handoff artifacts, and persists state in an append-only run log.
Use orchestra workflow run when you want Orchestra to drive the complete
delivery lifecycle: PM, PO, Architect, Developer, QA, and Release, including
phase gates and resumable run state. Use orchestra run only when you want to
execute the already computed local task plan; it does not run the full
multi-phase lifecycle and should not be treated as a replacement for workflow
gates or release evidence.
PM → PO [gate] → Architect [sizing gate] → Developer → QA [gate] → Release
# Dry run: inspect the phase graph and gate annotations without persisting state
orchestra workflow run --task FEAT-001 --dry-run --gates phase
# Full autonomous run, no human gates
orchestra workflow run --task FEAT-001 --gates none
# Run with human approval gates at po→architect and qa→release
orchestra workflow run --task FEAT-001 --gates phase
# Resume a paused or clarification-suspended run
orchestra workflow run --task FEAT-001 --resume <run-id>
# List all runs with status and phase trace
orchestra workflow runs
Gate modes:
| Mode | Gates |
|------|-------|
| none | Fully autonomous — no human approval required |
| phase | Pauses at po→architect and qa→release |
| all | Pauses at every phase transition |
The architect sizing gate is always enforced, regardless of the --gates mode. The architect must record a sizing decision (xs/s/m/l/xl) before the developer phase starts. If the decision is missing, the run fails and reports the exact command needed to resolve it.
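The gate rules above can be summarized as a small decision function. The following Python is a sketch of the documented behavior only, not the CLI's source code; the names PHASES, PHASE_GATES, pauses_for_approval, can_start_developer, and gated_transitions are hypothetical and exist only for this example.

```python
from __future__ import annotations

# Lifecycle order as documented: PM -> PO -> Architect -> Developer -> QA -> Release.
PHASES = ["pm", "po", "architect", "developer", "qa", "release"]

# Transitions that pause for human approval in "phase" mode, per the table above.
PHASE_GATES = {("po", "architect"), ("qa", "release")}

VALID_SIZES = {"xs", "s", "m", "l", "xl"}


def pauses_for_approval(mode: str, src: str, dst: str) -> bool:
    """Return True when a human must approve the src -> dst transition."""
    if mode == "none":
        return False
    if mode == "all":
        return True
    if mode == "phase":
        return (src, dst) in PHASE_GATES
    raise ValueError(f"unknown gate mode: {mode}")


def can_start_developer(sizing: str | None) -> bool:
    """The sizing gate applies in every mode: a recorded size is required."""
    return sizing in VALID_SIZES


def gated_transitions(mode: str) -> list[tuple[str, str]]:
    """List the lifecycle transitions that pause under the given mode."""
    pairs = zip(PHASES, PHASES[1:])
    return [(a, b) for a, b in pairs if pauses_for_approval(mode, a, b)]
```

Note that can_start_developer ignores the gate mode entirely, which mirrors the rule that the sizing gate holds even with --gates none.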
Clarification Loop
Developers or QA engineers can surface blocking questions to the PO or architect mid-phase without stopping the workflow or making unvalidated assumptions.
# Developer asks PO a question (suspends the current phase)
orchestra workflow clarify --run <run-id> --from developer --to po \
--question "Should empty input return null or throw?"
# PO answers (resumes the phase)
orchestra workflow clarify-respond --run <run-id> --clarification <id> \
--answer "Return null — downstream code handles it."
# Resume execution after the answer is recorded
orchestra workflow run --task FEAT-001 --resume <run-id>
# Inspect all clarifications for a run
orchestra workflow clarify-list --run <run-id>
Clarifications are persisted in .agent-workflow/clarifications.jsonl and visible in task context.
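Because the clarification file is append-only, the current state has to be derived by replaying its records in order. The Python below sketches that idea under stated assumptions: the record fields id, run, and event, and the event values asked and answered, are hypothetical and are not the documented schema of clarifications.jsonl.

```python
import json


def open_clarifications(jsonl_text: str, run_id: str) -> list:
    """Replay append-only records and return ids still waiting for an answer."""
    pending = {}  # dict keeps insertion order, so questions stay in asked order
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines in the log
        record = json.loads(line)
        if record.get("run") != run_id:
            continue
        if record.get("event") == "asked":
            pending[record["id"]] = None
        elif record.get("event") == "answered":
            pending.pop(record["id"], None)
    return list(pending)
```

Under these assumptions, a run would be safe to resume once open_clarifications returns an empty list for its run id.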
Benchmark & Sprint Burndown
Open Orchestra measures effectiveness across three development modes and generates a sprint burndown from story point estimates.
Estimate (declare baselines once, at story start)
orchestra estimate \
--task FEAT-001 \
--sizing m \
--solo-days 5 \
--ai-unguided-days 3 \
--ai-guided-days 2 \
--confidence high
Benchmark (auto-computed after run completes)
# Per-story report: cycle time, savings %, quality signals
orchestra benchmark --task FEAT-001
# Sprint summary table
orchestra benchmark --summary
Example output:
Benchmark: FEAT-001 [complete]
Sizing: m
Solo: 5d (declared)
AI-unguided: 3d (declared)
AI-guided: 2d (declared)
Actual: 1.4d
vs Solo: -72%
vs AI-U: -53%
vs AI-G: -30%
QA loops: 1
Reviews: 3 (0 blocking)
Evidence: 5 artifacts
Lessons: 2
Tokens: 17500in / 5000out
Cost: $0.0257
Quality signals (reviews, evidence, lessons, gate blocks, token usage, cost) are read automatically from the event log, with no manual input after the initial estimate.
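The percentages in the example output are consistent with a plain relative difference between actual and declared days. The Python sketch below reproduces them; the function name savings_pct and the rounding to a whole percent are assumptions, since the exact formula Orchestra uses is described in docs/benchmark.md rather than here.

```python
def savings_pct(actual_days: float, baseline_days: float) -> int:
    """Signed percent difference of actual vs. baseline; negative = time saved."""
    if baseline_days <= 0:
        raise ValueError("baseline must be positive")
    return round((actual_days - baseline_days) / baseline_days * 100)


# Declared baselines and actual cycle time from the FEAT-001 example output.
actual = 1.4
vs_solo = savings_pct(actual, 5)         # solo baseline: 5d
vs_ai_unguided = savings_pct(actual, 3)  # AI-unguided baseline: 3d
vs_ai_guided = savings_pct(actual, 2)    # AI-guided baseline: 2d
```

With these inputs the three values come out as -72, -53, and -30, matching the vs Solo, vs AI-U, and vs AI-G lines.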
Sprint Burndown
Developer story points take priority over architect sizing; the burndown falls back to the architect's sizing if the developer has not estimated yet.
# ASCII chart + task breakdown
orchestra burndown --sprint FEAT-001,FEAT-002,FEAT-003
# JSON series for dashboards
orchestra burndown --sprint FEAT-001,FEAT-002,FEAT-003 --json
A developer records their own estimate with:
orchestra decision add \
--task FEAT-001 \
--owner developer \
--title "Dev estimate" \
--decision "M / 8 points" \
--context "..." --consequences "..." --status accepted
See docs/benchmark.md for the full reference.
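The estimate priority rule for the burndown can be sketched in a few lines. This Python is a hedged illustration, not the CLI's logic: the function names, the task dictionary keys developer and architect, and the choice to count unestimated tasks as 0 points are all assumptions made for this example.

```python
from __future__ import annotations


def effective_points(developer: int | None, architect: int | None) -> int:
    """Prefer the developer's points; fall back to the architect's sizing."""
    if developer is not None:
        return developer
    if architect is not None:
        return architect
    return 0  # assumption: a task with no estimate contributes nothing


def burndown(tasks: list[dict], completed_order: list[str]) -> list[int]:
    """Remaining points at sprint start and after each completed task."""
    points = {
        t["id"]: effective_points(t.get("developer"), t.get("architect"))
        for t in tasks
    }
    remaining = sum(points.values())
    series = [remaining]
    for task_id in completed_order:
        remaining -= points[task_id]
        series.append(remaining)
    return series


sprint = [
    {"id": "FEAT-001", "developer": 8, "architect": 5},     # developer wins
    {"id": "FEAT-002", "developer": None, "architect": 3},  # architect fallback
    {"id": "FEAT-003"},                                     # not estimated yet
]
series = burndown(sprint, ["FEAT-002", "FEAT-001"])
```

Here the sprint starts at 11 points (8 + 3 + 0), and series is [11, 8, 0] once FEAT-002 and then FEAT-001 complete.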
Role Catalog
Open Orchestra treats roles as capabilities and governance responsibilities, not only human job titles. Projects can keep roles inactive until risk, scope, impact area, or a workflow gate requires them.
Core delivery roles: Product Manager, Product Owner, Business Analyst, Architect, Developer, QA, Security, DevOps, SRE, DBA, Data Engineer, UX/UI Designer, Accessibility Reviewer, Release Manager, Compliance/Privacy, Technical Writer, Tech Lead, SDET, Platform Engineer, Frontend Specialist, Backend Specialist, Mobile Specialist, AI Evaluation Engineer, and Support/Customer Operations.
Orchestration roles for modern multi-agent systems:
- Parent Agent / Orchestrator — sequencing, handoffs, locks, escalation, integration.
- Planner — work breakdown, dependency mapping, role activation rationale.
- Reviewer / Critic — independent review before gates or handoffs.
- Toolsmith / Integration Engineer — tools, MCPs, providers, adapters, automation contracts.
- Context Curator / Memory Manager — decisions, assumptions, stale context, shared memory hygiene.
- Policy / Governance Agent — approvals, budgets, workflow rules, compliance gates.
- Observability / Incident Response — telemetry, alerts, runbooks, incident readiness.
- Data / Privacy Officer — PII, retention, encryption, access, data compliance.
- Domain Expert — project-specific business or industry judgment.
- Performance Engineer — load, latency, scalability, caching, concurrency, graceful degradation.
- Game Designer — gameplay loops, tutorialization, player feedback, balance risk.
Each default role declares activation criteria, expected evidence, and gate participation so a parent agent can select only the roles needed for a task. See docs/dev-team-specialist-role-profiles.md for specialist profiles.
Workflow Files
.agent-workflow/
  config.json
  roles.json
  tasks.json
  locks.json
  events.jsonl
  workflow-runs.jsonl    ← autonomous run state (append-only)
  clarifications.jsonl   ← clarification loop records (append-only)
  estimates.jsonl        ← declared effort baselines (append-only)
  approvals/
  decisions/
  handoffs/
  evidence/
  reviews/
  runs/
Skills and Context Loading
Primary instruction files should stay short. Detailed procedures live in task-scoped skills loaded only when needed. See docs/skill-loading-strategy.md for the manifest, loading flow, and built-in skill candidates.
Prompt Registry
Open Orchestra scaffolds a stack-agnostic .generated-prompts/ registry during orchestra init. Agents use it to preserve prompt intent and generation conventions without bloating main instruction files. Split by artifact type: code, UI, services, tests, CI/CD, docs, diagrams, and evals.
VS Code Control Center
The VS Code extension scaffold lives in extensions/vscode-open-orchestra. It consumes stable CLI JSON contracts for status, validation, graph plan, roles, approvals, evidence, and config inspection. The CLI remains the source of truth.
npm run build
code extensions/vscode-open-orchestra
Compatibility
- Existing .agent-workflow/ data remains valid.
- Existing AGENTS.md, CLAUDE.md, Cursor rules, and generated instruction files remain supported.
- ORCHESTRA.md is the intended future primary guide name and can coexist with current agent instruction files.
See docs/orchestra-mvp.md for the full command reference.
