@chllming/wave-orchestration
v0.8.8
Generic wave-based multi-agent orchestration for repository work.
Wave Orchestration
Wave Orchestration is my framework for "vibe-coding." It keeps the speed of agentic coding, but makes the runtime, coordination, and context model explicit enough to inspect, replay, and improve.
The framework does three things:
- It abstracts the agent runtime away without flattening everything to the lowest common denominator. The same waves, skills, planning, evaluation, proof, and traces can run across Claude, Codex, and OpenCode while still preserving runtime-native features through executor adapters.
- It runs work as a blackboard-style multi-agent system. Agents do not just exchange chat messages; they work against shared state, generated inboxes, explicit ownership, and staged closure, and a wave keeps going until the declared goals, proof, production-live criteria, or eval targets are actually satisfied.
- It compiles context dynamically for the task at hand. Shared memory, generated runtime files, project defaults, skills, Context7, and cached external docs are assembled at runtime so you do not have to hand-maintain separate Claude, Codex, or other context files.
Core Ideas
- One orchestrator, many runtimes. Planning, skills, evals, proof, and traces stay constant while the executor adapter changes.
- A blackboard-style multi-agent system. Wave definitions, the coordination log, the control-plane log, and immutable result envelopes form the machine-trustable authority set; the rolling board, shared summary, inboxes, ledger, and integration views are generated projections over that state.
- Completion is goal-driven and proof-bounded. Waves close only when deliverables, proof artifacts, eval targets, dependencies, and closure stewards agree.
- Context is compiled, not hand-maintained. Wave builds runtime context from repo state, project memory, skills, Context7, and generated overlays.
- The system is inspectable and replayable. Dry-run previews, logs, dashboards, ledgers, traces, and replay make the system debuggable instead of mysterious.
- Telemetry is local-first and proof-oriented. Wave Control records typed run, proof, and benchmark events without making remote delivery part of the scheduler's critical path.
How The Architecture Works
- Define shared docs plus docs/plans/waves/wave-<n>.md files, or generate them with wave draft.
- Run wave launch --dry-run to validate the wave and materialize prompts, shared summaries, inboxes, dashboards, and executor previews before any live execution.
- During live execution, implementation agents write claims, evidence, requests, and decisions into the canonical coordination log instead of relying on ad hoc terminal narration.
- Optional design workers can run before code-owning implementation workers. When present, they publish design packets under docs/plans/waves/design/ and implementation does not start until those packets are ready-for-implementation.
- Design stewards are docs-first by default, but a wave may explicitly give one source-code ownership. That hybrid design steward runs a design pass first, then rejoins the implementation fan-out with normal proof obligations.
- The reducer and derived-state engines materialize blackboard projections from the canonical authority set: rolling board, shared summary, per-agent inboxes, ledger, docs queue, dependency views, and integration summaries. Helper-assignment blocking, retry target selection, and resume planning read from reducer state during live runs.
- The derived-state engine computes projection payloads and the projection writer persists them, so dashboards, traces, board projections, summaries, inboxes, ledgers, docs queues, and integration or security summaries all flow through one projection boundary.
- Live closure is result-envelope-first. Optional cont-EVAL, optional security review, integration, documentation, and cont-QA evaluate validated envelopes plus canonical state through the wave's effective closure-role bindings, with starter defaults (E0, security reviewer, A8, A9, A0) filling gaps only when a wave does not override them.
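The split between a canonical log and generated projections can be illustrated with a small reducer. This is a minimal sketch, not the package's actual reducer: the record shape (seq, kind, from, to, text) and the function names reduceWaveState and renderInbox are assumptions made for illustration only.

```javascript
// Hypothetical coordination-log records. The field names are illustrative
// and are not taken from the Wave source.
const coordinationLog = [
  { seq: 1, kind: "claim", from: "A1", to: null, text: "Owning src/api/" },
  { seq: 2, kind: "request", from: "A1", to: "A2", text: "Need schema for /users" },
  { seq: 3, kind: "evidence", from: "A2", to: "A1", text: "Schema at docs/schema.md" },
  { seq: 4, kind: "decision", from: "A8", to: null, text: "Integrate after A2 proof" },
];

// Pure reducer: the same log always yields the same state, regardless of the
// order in which records arrive. That determinism is what allows replay.
function reduceWaveState(log) {
  const state = { lastSeq: 0, byKind: {}, inboxes: {} };
  for (const record of [...log].sort((a, b) => a.seq - b.seq)) {
    state.lastSeq = record.seq;
    state.byKind[record.kind] = (state.byKind[record.kind] ?? 0) + 1;
    if (record.to !== null) {
      (state.inboxes[record.to] ??= []).push(`${record.from}: ${record.text}`);
    }
  }
  return state;
}

// Projection: a human-facing view derived from state, never edited by hand.
function renderInbox(state, agentId) {
  const items = state.inboxes[agentId] ?? [];
  return items.length === 0 ? "(empty)" : items.map((line) => `- ${line}`).join("\n");
}

const state = reduceWaveState(coordinationLog);
console.log(renderInbox(state, "A1")); // prints "- A2: Schema at docs/schema.md"
```

Because the inbox is recomputed from the log, deleting or corrupting a projection loses nothing: rerunning the reducer over the same records rebuilds it.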
Runtime Modules
- launcher.mjs: Thin orchestrator. Parses args, acquires the launcher lock, and sequences the engines.
- implementation-engine.mjs: Selects the design-first or implementation fan-out for a wave or retry attempt.
- derived-state-engine.mjs: Computes shared summary, inboxes, assignments, dependency views, ledger, docs queue, and integration/security projection payloads from canonical state.
- gate-engine.mjs: Evaluates implementation, component, assignment, dependency, clarification, cont-EVAL, security, integration, documentation, and cont-QA gates.
- retry-engine.mjs: Plans reducer-driven resume and retry targets, reusable work, executor fallback changes, and blocking conditions.
- closure-engine.mjs: Sequences the staged closure sweep from implementation proof through final cont-QA.
- wave-state-reducer.mjs: Rebuilds deterministic wave state from canonical inputs for live queries and replay.
- session-supervisor.mjs: Owns launches, waits, tmux sessions, lock handling, resident orchestrator sessions, and observed wave_run, attempt, and agent_run lifecycle facts.
- projection-writer.mjs: Persists dashboards, traces, summaries, inboxes, board projections, assignment/dependency snapshots, ledgers, docs queues, and integration/security summaries.
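A staged closure sweep like the one closure-engine.mjs is described as sequencing might look roughly like the sketch below. The stage order comes from this README, but the envelope fields (verdict, proof), the stop-at-first-blocker behavior, and the name runClosureSweep are assumptions for illustration, not the module's real interface.

```javascript
// Stage order follows the README: optional cont-EVAL, optional security
// review, then integration, documentation, and cont-QA.
const CLOSURE_STAGES = [
  { name: "cont-EVAL", optional: true },
  { name: "security", optional: true },
  { name: "integration", optional: false },
  { name: "documentation", optional: false },
  { name: "cont-QA", optional: false },
];

// envelopes: hypothetical map of stage name -> { verdict, proof }, where
// verdict is "pass" or "fail" and proof is a list of artifact paths.
function runClosureSweep(envelopes) {
  const passed = [];
  for (const stage of CLOSURE_STAGES) {
    const envelope = envelopes[stage.name];
    if (envelope === undefined) {
      if (stage.optional) continue; // stage not declared for this wave
      return { closed: false, passed, blockedAt: stage.name, reason: "missing envelope" };
    }
    // A narrative PASS without proof artifacts does not count.
    if (envelope.verdict !== "pass" || envelope.proof.length === 0) {
      return { closed: false, passed, blockedAt: stage.name, reason: "verdict or proof not satisfied" };
    }
    passed.push(stage.name);
  }
  return { closed: true, passed, blockedAt: null, reason: null };
}

const result = runClosureSweep({
  integration: { verdict: "pass", proof: ["reports/integration.json"] },
  documentation: { verdict: "pass", proof: [] },
  "cont-QA": { verdict: "pass", proof: ["reports/qa.json"] },
});
console.log(result.closed, result.blockedAt); // prints "false documentation"
```

In this example the documentation stage claims a pass but carries no proof artifact, so the sweep stops there and the wave stays open.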
Architecture Surfaces
- Wave contract: Shared plan docs, wave markdown, deliverables, proof artifacts, and eval targets define the goal.
- Shared state: Decisions come from the canonical authority set; boards, inboxes, dashboards, and other summaries are human-facing or operator-facing projections.
- Runtime abstraction: Executor adapters preserve Codex, Claude, and OpenCode-specific launch features without changing the higher-level wave contract.
- Compiled context: Project profile memory, shared summary, inboxes, skills, Context7, and runtime overlays are generated for the chosen executor.
- Proof and closure: Exit contracts, proof artifacts, eval markers, and closure stewards stop waves from closing on narrative-only PASS.
- Replay and audit: Traces capture the attempt so failures can be inspected and replayed instead of guessed from screenshots.
- Telemetry and control plane: Local-first event spools plus the Railway-hosted Wave Control service keep proof, benchmark validity, and selected artifacts queryable across runs.
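The runtime abstraction surface can be pictured as a table of adapters that turn one shared task into a runtime-specific launch description. The sketch below is only an illustration of that pattern: the task fields, the overlayDir path, the nativeOptions object, and the function names are all hypothetical. The only value borrowed from this README is the Codex sandbox mode passed through --codex-sandbox.

```javascript
// Shared task: identical for every runtime. Field names are hypothetical.
const agentTask = {
  agentId: "A1",
  prompt: "Implement the /users endpoint",
  skills: ["signal-hygiene"],
};

// Parts of the launch description that do not depend on the runtime.
function baseLaunch(executor, task) {
  return {
    executor,
    prompt: task.prompt,
    overlayDir: `overlays/${executor}/${task.agentId}`, // hypothetical path
    skills: [...task.skills],
  };
}

// Each adapter adds only what is specific to its runtime.
const executorAdapters = {
  codex: (task, options) => ({
    ...baseLaunch("codex", task),
    nativeOptions: { sandbox: options.codexSandbox ?? null },
  }),
  claude: (task) => ({ ...baseLaunch("claude", task), nativeOptions: {} }),
  opencode: (task) => ({ ...baseLaunch("opencode", task), nativeOptions: {} }),
};

function resolveLaunch(executor, task, options = {}) {
  if (!Object.hasOwn(executorAdapters, executor)) {
    throw new Error(`Unknown executor: ${executor}`);
  }
  return executorAdapters[executor](task, options);
}

const launch = resolveLaunch("codex", agentTask, { codexSandbox: "danger-full-access" });
console.log(launch.executor, launch.nativeOptions.sandbox); // prints "codex danger-full-access"
```

The design point is that swapping "codex" for "claude" changes only the adapter output, while the task, skills, and everything upstream of the adapter stay the same.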
Example Output
Representative rolling message board output from a real wave run:
Common MAS Failure Cases
Recent multi-agent research keeps returning to the same failure modes:
- Cosmetic board, no canonical state: Agents appear coordinated, but there is no machine-trustable authority set underneath the conversation.
- Hidden evidence never gets pooled: One agent has the critical fact, but it never reaches shared state before closure.
- Communication without global-state reconstruction: Agents exchange information, but nobody reconstructs the correct cross-agent picture.
- Simultaneous coordination collapse: A team that looks fine in serial work falls apart when multiple owners, blockers, or resources must move together.
- Expert signal gets averaged away: The strongest specialist view is diluted into a weaker compromise.
- Contradictions get smoothed over: Conflicts are narrated away instead of being turned into explicit repair work.
- Premature closure: Agents say they are done before proof, evals, or integrated state actually support PASS.
Wave is built to mitigate those failures with a canonical authority set, generated blackboard projections, explicit ownership, goal-driven and proof-bounded closure, replayable traces, and local-first telemetry. For the research framing and the current gaps, see docs/research/coordination-failure-review.md. For the concrete signal map, see docs/reference/proof-metrics.md.
Quick Start
Current release:
- @chllming/wave-orchestration@0.8.8
- Release tag: v0.8.8
- Public install path: npmjs
- Authenticated fallback: GitHub Packages
Highlights in 0.8.8:
- The shipped starter surface now includes skills/signal-hygiene/ plus seeded scripts/wave-status.sh and scripts/wave-watch.sh wrappers for long-running-agent and operator wait loops.
- Long-running agents and resident orchestrators now get prompt-level signal-state and signal-ack paths, so wakeups are edge-triggered by versioned signal changes instead of relying on terminal injection.
- Versioned wave or agent signal snapshots are now a first-class operator surface under .tmp/<lane>-wave-launcher/signals/, with failure treated as terminal in both the runtime and the wrapper exit contract.
- 0.8.5 design-role and hybrid design-steward behavior remains part of the shipped release surface, and the current release line keeps the 0.8.7 capability-specific same-wave helper routing, blocker-severity consistency, and stable per-wave tmux session reuse hardening.
- Release docs, current-state notes, migration guidance, publishing instructions, and the packaged operator recommendations guide now point at the 0.8.8 surface.
Requirements:
- Node.js 22+
- pnpm
- tmux on PATH for dashboarded runs
- at least one executor on PATH: codex, claude, or opencode
- optional: CONTEXT7_API_KEY for launcher-side prefetch
- optional: WAVE_CONTROL_AUTH_TOKEN for remote Wave Control reporting
Install into another repo:
pnpm add -D @chllming/wave-orchestration
pnpm exec wave init
pnpm exec wave doctor
pnpm exec wave launch --lane main --dry-run --no-dashboard
pnpm exec wave coord show --lane main --wave 0 --dry-run --json
If the repo already has Wave config, plans, or waves you want to keep:
pnpm exec wave init --adopt-existing
Fresh init also seeds a starter skills/ library plus docs/evals/benchmark-catalog.json. The launcher projects those skill bundles into Codex, Claude, OpenCode, and local executor overlays after the final runtime for each agent is resolved, and waves that include cont-EVAL can declare ## Eval targets against that catalog.
The starter surface includes:
- docs/agents/wave-design-role.md
- skills/role-design/
- skills/tui-design/ for terminal and operator-surface design work
- skills/signal-hygiene/ for intentionally long-running watcher agents
- scripts/wave-status.sh and scripts/wave-watch.sh for external wait loops
- wave.config.json defaults for roles.designRolePromptPath, skills.byRole.design, and the design-pass executor profile
Interactive wave draft scaffolds the docs-first design-steward path. If you want a hybrid design steward, author that wave explicitly or use an agentic planner payload that gives the same design agent implementation-owned paths plus the normal implementation contract sections.
If a non-resident agent should stay alive and react only to orchestrator-written signal changes, add signal-hygiene explicitly in ### Skills. That bundle uses the prompt-injected signal-state and ack paths instead of inventing a second wakeup surface. For shell automation and the wrapper contract, see docs/guides/signal-wrappers.md.
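An edge-triggered wait loop of the kind described here reacts only when a signal's version advances, not on every poll. The following is a minimal sketch under stated assumptions: the snapshot shape ({ version, signal }), the signal names, and createSignalWatcher are hypothetical and do not reflect the actual files under .tmp/<lane>-wave-launcher/signals/. For the real contract, see docs/guides/signal-wrappers.md.

```javascript
// Hypothetical snapshot shape: { version: number, signal: string }.
// The watcher wakes only when the version moves past the last one it
// acknowledged, so repeated polls of the same snapshot are ignored.
function createSignalWatcher(onWake) {
  let ackedVersion = 0;
  return function observe(snapshot) {
    if (snapshot.version <= ackedVersion) return "idle"; // no new edge
    ackedVersion = snapshot.version; // acknowledge before acting
    if (snapshot.signal === "failed") return "terminal"; // failure ends the loop
    onWake(snapshot);
    return "woke";
  };
}

const wakeups = [];
const observe = createSignalWatcher((snapshot) => wakeups.push(snapshot.signal));

// Four successive polls; the second repeats version 1 and is skipped.
const polls = [
  { version: 1, signal: "waiting" },
  { version: 1, signal: "waiting" },
  { version: 2, signal: "feedback-requested" },
  { version: 3, signal: "failed" },
];
const outcomes = polls.map((snapshot) => observe(snapshot));
console.log(outcomes.join(",")); // prints "woke,idle,woke,terminal"
```

Treating the failure signal as terminal mirrors the README's statement that failure ends both the runtime wait and the wrapper's exit contract; a real loop would stop polling at that point.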
When runtime launch commands detect a newer npmjs release, Wave prints a non-blocking update notice on stderr. The fast path is pnpm exec wave self-update, which updates the dependency, prints the changelog delta, and then records the workspace upgrade report.
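The update check reduces to comparing the installed version with the latest published one. The sketch below assumes plain major.minor.patch versions with no prerelease tags; compareVersions, updateNotice, and the notice wording are illustrative and are not the package's own output.

```javascript
// Compare two "major.minor.patch" strings numerically.
// Returns 1 if a > b, -1 if a < b, and 0 if they are equal.
// Assumption: no prerelease or build suffixes are present.
function compareVersions(a, b) {
  const pa = a.split(".").map(Number);
  const pb = b.split(".").map(Number);
  for (let i = 0; i < 3; i += 1) {
    const diff = (pa[i] ?? 0) - (pb[i] ?? 0);
    if (diff !== 0) return Math.sign(diff);
  }
  return 0;
}

// Returns a notice string when a newer release exists, otherwise null.
// The message text is hypothetical.
function updateNotice(installed, latest) {
  return compareVersions(latest, installed) > 0
    ? `Update available: ${installed} -> ${latest}. Run: pnpm exec wave self-update`
    : null;
}

const notice = updateNotice("0.8.7", "0.8.8");
if (notice !== null) console.error(notice); // non-blocking: stderr only
```

Comparing numerically rather than as strings matters once a component reaches two digits: "0.10.0" is newer than "0.9.9", which a plain string comparison would get wrong.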
Common Commands
# Save project defaults and draft a new wave
pnpm exec wave project setup
pnpm exec wave draft --wave 1 --template implementation
# Run one wave with a real executor
pnpm exec wave launch --lane main --start-wave 0 --end-wave 0 --executor codex --codex-sandbox danger-full-access
# Disable Wave Control reporting for a single launcher run
pnpm exec wave launch --lane main --no-telemetry
# Inspect operator surfaces
pnpm exec wave feedback list --lane main --pending
pnpm exec wave dep show --lane main --wave 0 --json
# Run autonomous mode after the wave set is stable
pnpm exec wave autonomous --lane main --executor codex --codex-sandbox danger-full-access
# Pull the latest published package and record the workspace upgrade
pnpm exec wave self-update
CLI Surfaces
- wave launch and wave autonomous: Live execution, dry-run validation, retry cadence, terminal surfaces, and orchestrator options.
- wave control: Read-only live status plus operator task, rerun, proof, telemetry, and versioned signal surfaces. Seeded helper scripts scripts/wave-status.sh and scripts/wave-watch.sh are thin readers over wave control status --json.
- wave coord and wave dep: Coordination-log and cross-lane dependency utilities. wave control is the preferred operator surface; wave coord remains useful for direct log inspection and rendering.
- wave project, wave draft, and wave adhoc: Planner defaults, authored wave generation, and transient operator-driven runs on the same runtime.
- wave init, wave doctor, wave upgrade, and wave self-update: Workspace setup, validation, adoption, and package lifecycle.
Develop This Package
pnpm install
pnpm test
node scripts/wave.mjs launch --lane main --dry-run --no-dashboard
Railway MCP
This repo includes a repo-local Railway MCP launcher so Codex, Claude, and Cursor can all talk to the same Railway project from the same checkout.
- launcher: .codex-tools/railway-mcp/start.sh
- project MCP config: .mcp.json
- Cursor MCP config: .cursor/.mcp.json
- Claude project settings: .claude/settings.json
- Railway project id: b2427e79-3de9-49c3-aa5a-c86db83123c0
One-time local checks:
railway whoami
railway link --project b2427e79-3de9-49c3-aa5a-c86db83123c0
codex mcp list
Learn More
- docs/README.md: docs map and suggested structure
- docs/concepts/what-is-a-wave.md: wave anatomy, blackboard execution model, and proof-bounded closure
- docs/concepts/runtime-agnostic-orchestration.md: how one orchestration substrate spans Claude, Codex, OpenCode, and local execution
- docs/concepts/context7-vs-skills.md: compiled context, external truth, and repo-owned operating knowledge
- docs/guides/planner.md: wave project and wave draft workflow
- docs/agents/wave-design-role.md: standing prompt for the optional pre-implementation design steward
- docs/guides/terminal-surfaces.md: tmux, VS Code terminal registry, and dry-run surfaces
- docs/guides/signal-wrappers.md: versioned signal snapshots, wrapper scripts, and long-running-agent ack loops
- docs/reference/sample-waves.md: showcase-first authored waves, including a high-fidelity repo-landed rollout example
- docs/plans/examples/wave-example-design-handoff.md: optional design-steward example that hands a validated design packet to downstream implementation owners
- docs/plans/examples/wave-example-rollout-fidelity.md: concrete example of what good wave fidelity looks like for a narrow, closure-ready outcome
- docs/reference/cli-reference.md: complete CLI syntax for all commands and flags
- docs/plans/end-state-architecture.md: canonical runtime architecture, engine boundaries, and artifact ownership
- docs/plans/wave-orchestrator.md: operator runbook
- docs/plans/architecture-hardening-migration.md: historical record of the completed architecture hardening stages
- docs/plans/context7-wave-orchestrator.md: Context7 setup and bundle authoring
- docs/reference/runtime-config/README.md: executor, runtime, and skill-projection configuration
- docs/reference/wave-control.md: local-first telemetry contract and Railway control-plane model
- docs/reference/proof-metrics.md: README failure cases mapped to concrete telemetry and benchmark evidence
- docs/reference/skills.md: skill bundle format, resolution order, and runtime projection
- docs/research/coordination-failure-review.md: MAS failure modes from the research and how Wave responds
- CHANGELOG.md: release history
Research Sources
Canonical source index:
The implementation is based on the following research:
Harness and Runtime Surfaces
- Effective harnesses for long-running agents
- Harness engineering: leveraging Codex in an agent-first world
- Unlocking the Codex harness: how we built the App Server
- Building Effective AI Coding Agents for the Terminal: Scaffolding, Harness, Context Engineering, and Lessons Learned
- VeRO: An Evaluation Harness for Agents to Optimize Agents
- EvoClaw: Evaluating AI Agents on Continuous Software Evolution
- Verified Multi-Agent Orchestration: A Plan-Execute-Verify-Replan Framework for Complex Query Resolution
- Agentic Context Engineering: Evolving Contexts for Self-Improving Language Models
Shared Coordination and Closure
- LLM-Based Multi-Agent Blackboard System for Information Discovery in Data Science
- Exploring Advanced LLM Multi-Agent Systems Based on Blackboard Architecture
- DOVA: Deliberation-First Multi-Agent Orchestration for Autonomous Research Automation
- Why Do Multi-Agent LLM Systems Fail?
- Silo-Bench: A Scalable Environment for Evaluating Distributed Coordination in Multi-Agent LLM Systems
- An Open Agent Architecture
Skills, Repo Context, and Reusable Operating Knowledge
- SoK: Agentic Skills -- Beyond Tool Use in LLM Agents
- Agent Skills for Large Language Models: Architecture, Acquisition, Security, and the Path Forward
- SkillsBench: Benchmarking How Well Agent Skills Work Across Diverse Tasks
- Agent Workflow Memory
- Agent READMEs: An Empirical Study of Context Files for Agentic Coding
- Context Engineering for AI Agents in Open-Source Software
