iterative-delivery-plugin

v0.1.3

Published

a month ago

Codex CLI plugin for durable iterative delivery workflows with guided and autonomous orchestration surfaces.

Downloads

379

0 High
0 Medium
0 Low

wishmri

codex plugin delivery workflow automation

Iterative Delivery Plugin

Iterative Delivery is a Codex CLI plugin that provides a durable delivery control plane under .delivery/ and multiple operator surfaces:

autonomous orchestration through the umbrella $itd:run skill
guided phase control through concise itd:* command and skill aliases

The runtime keeps workflow state, packets, handoffs, summaries, review outputs, verification outputs, workstreams, and lightweight threads on disk so work can survive context exhaustion and session interruption. All user-facing surfaces share that same .delivery/ kernel and record a first-class operating_mode (delivery, plan, team, or ralph) instead of introducing separate state systems.

For day-to-day engineering usage, prefer the intent-level entry commands:

run
plan
team
ralph
feature
bugfix
refactor
resume
status

Those commands are thin wrappers over the same runtime and packet control plane. feature, plan, team, and ralph default conservatively to existing-project delivery; pass --new when you explicitly want a greenfield flow. run, team, and ralph route through autonomous-drive for autonomous execution by default, while explicit guided requests stay on the direct delivery.start path. When feature, bugfix, or refactor is invoked with --autonomous or --control-mode autonomous, the operator entry starts the run and then hands control to autonomous-drive, so the host loop continues without another manual "continue" turn.

One-command install

This repo includes an installer entrypoint at scripts/install.mjs. The installer does two things automatically:

copies the plugin into ~/plugins/itd
creates or updates ~/.agents/plugins/marketplace.json

The plugin namespace exposed in Codex is now itd:*. The primary surfaces are:

itd:run
itd:plan
itd:team
itd:ralph
itd:feature
itd:bugfix
itd:refactor
itd:resume
itd:status
itd:cancel
itd:gc
itd:iterate
itd:pause
itd:review
itd:thread
itd:verify
itd:workstream

From a local checkout:

node scripts/install.mjs

After installation or update, restart Codex or open a fresh Codex session so it reloads the plugin cache and shows the new itd:* namespace.

For development, install as a live link instead of a copied snapshot:

node scripts/install.mjs --link

For remote one-command install, expose the package through a source that npx can resolve:

npx <your-package-or-git-url>

Examples:

npx iterative-delivery-plugin@latest
npx git+https://github.com/<owner>/iterative-delivery-plugin.git

To make the shorter npm form work, publish the package and replace the current placeholder metadata before release.

Publish to npm

This package is now configured for public npm publishing:

package name: iterative-delivery-plugin
publish access: public
publish gate: npm run verify runs automatically via prepublishOnly

Recommended release flow:

npm login
npm run verify
npm publish

After the first public release, operators can install with:

npx iterative-delivery-plugin@latest

Notes:

If you later move to a scoped package such as @your-scope/iterative-delivery-plugin, keep publishConfig.access as public and publish with the scoped name.
This repository currently has no configured git remote, so npm package metadata is publishable but not yet enriched with a canonical source URL.

Included surfaces

plugin manifest: .codex-plugin/plugin.json
slash-compatible command wrappers under commands/
skill wrappers under skills/
lane prompts under prompts/
JSON schemas under schemas/
single runtime entrypoint: scripts/delivery.mjs

Runtime notes

Canonical delivery state lives in .delivery/.
The root .delivery/STATE.json is the active-workstream index and mirror.
The authoritative workstream state lives in .delivery/workstreams/<name>/STATE.json.
The tested autonomous orchestration sequence lives in scripts/lib/autonomous-loop.mjs; the umbrella $itd:run skill should follow that loop instead of inventing its own phase order.
scripts/autonomous-drive.mjs is the recommended host-side entrypoint for running that loop.
scripts/autonomous-drive.mjs now defaults to the built-in scripts/codex-packet-runner.mjs module when no explicit runner override is supplied, so one autonomous command can drive the full packet loop end-to-end.
packet prepare returns additive prompt and consume metadata: prompt_relpath, prompt_path, consume, and consume_command. New hosts should prefer prompt_relpath and the structured consume object instead of parsing shell strings or hardcoding lane-to-prompt mappings.
packet prepare now also stamps a runtime-authored prepare_set_id onto the prepared packet JSON and onto the packet.prepare result. Treat that as the authoritative batch identity for one prepare invocation, especially for multi-lane review packets.
Packet staleness is tracked through an explicit packet_state_revision carried in state, packet snapshots, and active prepare metadata. Control-plane updates such as pause and resume no longer invalidate a prepared batch that is otherwise still current just because updated_at changed.
Legacy packet inputs that predate packet_state_revision are no longer treated as current just because timestamps happen to line up. The runtime now migrates those packets only when the authoritative active_prepare_set explicitly owns the packet ids for the live batch; otherwise they are stale.
The authoritative prepared packet JSON now also carries execution_policy, a machine-readable baseline for lane delegation and dispatch behavior. Hosts should prefer that packet-owned policy for runtime semantics and keep launcher-specific model/profile/sandbox choices as host-local overrides.
Execution packets now also carry iteration_governance and rework_handoff when another bounded implementation pass was opened from review or verification. That keeps the runtime-owned retry budget and the concrete rework reason in authoritative packet input instead of hiding it in host-local orchestration.
status now includes storage statistics for the active workstream, plus reminder-style warnings when .delivery grows, inactive runs pile up, or packet outputs exceed the soft limit.
status now also includes a packet-aware guided_step plus fallback operator_next_steps so guided/manual operators can see whether the current truth is "prepare packets", "run and consume pending lanes", or "finalize the phase" without rereading packet files by hand.
When review reconciliation artifacts exist, status, STATE.md, and memory summaries now surface the unified review view directly, including deduplicated finding counts, top severity, and recorded reviewer conflicts.
The statusline bridge and low-context monitor now also carry that compact reconciliation summary so agents can prefer the unified review artifact over divergent lane outputs when context is constrained.
Packet preparation now also accepts --context-budget-profile <tight|bounded|wide>. Use tight when a lane should stay narrowly scoped to the current slice, bounded for the default balanced packet context, and wide when the lane needs broader project and roadmap context.
Autonomous hosts can override that globally with --context-budget-profile, or per packet type with --execution-context-budget-profile, --review-context-budget-profile, and --verification-context-budget-profile.
Lane outputs should be persisted through the prepared packet's consume metadata or legacy consume_command so the runtime can validate packet identity, packet type, lane, and active prepare batch before writing .delivery/.../packets/*.output.json.
Guided/manual packet status now resolves the current packet set from the latest active prepare_set_id instead of approximating "latest per lane". Review finalization and verification also stay pinned to the intended batch: review uses the latest prepared review batch, while verification reads the last finalized review batch rather than all same-iteration review outputs.
The authoritative workstream state now also persists active_prepare_set, so the current prepare batch is read from state first rather than inferred from packet files. iterate, review, and verify now require authoritative batch metadata instead of scanning same-iteration packet files and guessing which outputs are current.
The authoritative workstream state also persists structured latest_review / latest_verification provenance. Those records are now semantic contracts, not loose blobs: review provenance must carry the finalized prepare_set_id, packet ids, and review lanes; verification provenance must carry the finalized packet id, summary, and completion time.
latest_gate is also now a semantic contract. Persisted gate state must keep the gate status, recommendation, criterion results, check buckets, missing evidence, approvals, accepted risks, and summary mutually consistent instead of storing an arbitrary object shell.
When authoritative provenance is missing, the runtime now fails closed with repair-oriented guidance. The recovery path is explicit: mint a fresh prepare batch with packet prepare <execution|review|verification> or finalize the intended review batch with review; if the workspace was resumed or migrated, repair .delivery/workstreams/<name>/STATE.json so the authoritative active_prepare_set / latest_review fields point at the live batch.
pause and resume now preserve the active prepare batch. If the authoritative state files are missing, the workstream handoff also carries enough packet continuity metadata to rebuild that batch on repaired resume, including the latest verification gate provenance needed to continue a paused verification checkpoint safely.
Workstream handoffs now also persist a full authoritative_state snapshot. Handoff-only resume is therefore exact when that snapshot is present, and only falls back to repaired reconstruction for older handoffs that predate the snapshot field.
Historical run and packet JSON files are authoritative under .delivery/workstreams/<name>/runs and .delivery/workstreams/<name>/packets. Optional root mirrors under .delivery/runs and .delivery/packets are disabled by default and only written when DELIVERY_WRITE_ROOT_HISTORY_MIRRORS=1 is set. Prune older history with node scripts/delivery.mjs gc --keep-runs <n>.
gc always preserves the current active run when one exists, keeps the newest <n> inactive runs for the target workstream, and removes only historical run and packet JSON artifacts. It does not delete current state files, handoff mirrors, or phase markdown/doc artifacts.
Blocked policy and blocker stops must preserve the pre-block workflow state in last_stop.state; once human approvals or blockers are cleared, status should surface resume as the next control-plane action, and resume uses that provenance to restore the correct active phase.
Autonomous rework now always re-enters through iterate. Review and verification do not silently mutate the run into a new implementation iteration on their own; the runtime opens the next iteration centrally so it can enforce retry budgets and no-progress stops consistently.
The runtime now stops with explicit policy gates when the iteration budget is exhausted or the same rework handoff repeats without measurable progress.
If a historical blocked state loses that provenance, the runtime keeps the run blocked and requires manual state repair instead of guessing a restore target.
Guided control surfaces stay on scripts/delivery.mjs; autonomous entry surfaces such as run, team, and ralph route through scripts/autonomous-drive.mjs.
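
The packet staleness rules in the notes above can be sketched roughly as follows. This is an illustration only, not the runtime's code: packet_state_revision and active_prepare_set come from the notes, while the `id` field on the packet and the `packet_ids` field on the prepare set are assumed names.

```javascript
// Hypothetical sketch of the staleness rule described in the runtime notes.
// Assumed names: packet.id and state.active_prepare_set.packet_ids.
function isPacketCurrent(packet, state) {
  const hasRevision = packet.packet_state_revision !== undefined;
  if (hasRevision) {
    // Revisioned packets stay current only while the revision matches state.
    return packet.packet_state_revision === state.packet_state_revision;
  }
  // Legacy packets: timestamps are ignored; only explicit ownership by the
  // authoritative active prepare set makes them current.
  const ownedIds = state.active_prepare_set?.packet_ids ?? [];
  return ownedIds.includes(packet.id);
}
```

Under this reading, a pause or resume that changes only updated_at leaves the revision untouched, so the prepared batch stays current.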

Additional architecture, product, and orchestration notes live in docs/architecture-direction.md, docs/product-mvp.md, docs/product-roadmap.md, docs/autonomous-orchestration.md, docs/packet-runner-contract.md, and docs/host-upgrade-resilience.md. The operating-mode split is documented in docs/operating-modes.md. The intent-command surface is specified in docs/operator-entry-command-spec.md. The higher-order operating-mode contract for plan, team, and ralph is specified in docs/operating-mode-surface.md. Formal autonomous host/result contracts now live in schemas/autonomous-result.schema.json, schemas/packet-command-result.schema.json, and schemas/runner-response.schema.json.

To let a host provide real lane execution, prefer:

node scripts/autonomous-drive.mjs

To start a bounded autonomous run with explicit iteration governance:

node scripts/autonomous-drive.mjs "Build a snake game" --max-iterations 7 --no-progress-limit 3

To force one context budget across the whole autonomous run:

node scripts/autonomous-drive.mjs --context-budget-profile tight --runner-file ./path/to/runner.mjs

To tune execution, review, and verification separately:

node scripts/autonomous-drive.mjs --execution-context-budget-profile bounded --review-context-budget-profile tight --verification-context-budget-profile wide --runner-file ./path/to/runner.mjs

Or start and drive an autonomous run directly through a process runner:

node scripts/autonomous-drive.mjs "Build a snake game" --runner-command node --runner-args ./path/to/runner-process.mjs

The same governance flags are also available on the runtime start surface:

node scripts/delivery.mjs start "Build a snake game" --control-mode autonomous --max-iterations 7 --no-progress-limit 3

Use --max-iterations to cap how many bounded implementation passes the runtime may open for one run, and --no-progress-limit to stop when the same rework handoff repeats without measurable change too many times in a row.
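
As a rough illustration of how those two limits might combine, here is a minimal sketch. The counter names, the reason strings, and the exact boundary comparison are assumptions; the runtime's own policy gates may differ.

```javascript
// Hypothetical sketch of the two governance stops; not the runtime's code.
// iterationsOpened and repeatedHandoffs are illustrative counters.
function governanceStop(progress, limits) {
  if (progress.iterationsOpened >= limits.maxIterations) {
    // No budget left to open another bounded implementation pass.
    return { stop: true, reason: 'iteration_budget_exhausted' };
  }
  if (progress.repeatedHandoffs >= limits.noProgressLimit) {
    // The same rework handoff repeated too many times in a row.
    return { stop: true, reason: 'no_progress' };
  }
  return { stop: false, reason: null };
}
```

With `--max-iterations 7 --no-progress-limit 3`, this sketch would stop after three consecutive identical handoffs even if only four passes had been opened.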

The runner module must export runPacket or a default function. The host runner receives the prepared packet metadata and returns one packet-output JSON object per lane.
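
A runner module might look roughly like the sketch below. Only prompt_relpath is taken from this README; the `lane` field on the packet and both fields of the returned object are assumptions, and the real output contract is whatever schemas/packet-output.schema.json requires.

```javascript
// Hypothetical runner sketch. The packet field `lane` and the returned
// fields are assumptions; consult schemas/packet-output.schema.json for
// the real packet-output contract.
function runPacket(packet) {
  // A real runner would execute the lane prompt here and collect its result.
  return {
    lane: packet.lane,
    summary: `Handled prompt ${packet.prompt_relpath}`,
  };
}

// In an ES module runner file, expose it with: export { runPacket };
```

Keeping the function free of side effects until the final return makes it easier to exercise the runner in isolation before wiring it in with --runner-file.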

If no explicit context budget override is provided, the autonomous loop now uses packet-type defaults that keep execution balanced, review narrow, and verification broad:

execution: bounded
review: tight
verification: wide

When both global and per-packet-type overrides are present, the per-packet-type override wins for that packet type.
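
The precedence described above (per-packet-type override, then global override, then packet-type default) can be sketched as a small lookup. The function and option names are hypothetical; only the profile values and defaults come from this README.

```javascript
// Defaults as documented: execution balanced, review narrow, verification broad.
const PACKET_TYPE_DEFAULTS = {
  execution: 'bounded',
  review: 'tight',
  verification: 'wide',
};

// Hypothetical helper: globalProfile models --context-budget-profile and
// perType models the three --<type>-context-budget-profile flags.
function resolveContextBudgetProfile(packetType, overrides = {}) {
  const { globalProfile, perType = {} } = overrides;
  return perType[packetType] ?? globalProfile ?? PACKET_TYPE_DEFAULTS[packetType];
}
```

For example, with a global override of wide and a review override of bounded, review packets would resolve to bounded while execution and verification packets would resolve to wide.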

When packet consume persists a packet output, it reports the output size in bytes and adds a warning if the persisted JSON is above the packet soft limit. That warning is advisory; it does not fail delivery.
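
A minimal sketch of that advisory check follows. The soft limit value is not stated in this README, so it is passed in as a hypothetical parameter, and the shape of the returned object is likewise an assumption.

```javascript
// Hypothetical sketch: measures the persisted JSON size and attaches an
// advisory warning. softLimitBytes is an assumed parameter.
function describePersistedOutput(output, softLimitBytes) {
  const bytes = Buffer.byteLength(JSON.stringify(output), 'utf8');
  const warnings = [];
  if (bytes > softLimitBytes) {
    warnings.push(`packet output is ${bytes} bytes, above the soft limit of ${softLimitBytes}`);
  }
  // The warning never flips ok to false: it is advisory only.
  return { ok: true, bytes, warnings };
}
```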

For Codex-hosted lane execution, the plugin now ships a reusable runner, and autonomous-drive.mjs uses it by default:

node scripts/autonomous-drive.mjs "Build a snake game"

To override the default transport explicitly, use the runner module path:

node scripts/autonomous-drive.mjs "Build a snake game" --runner-file ./scripts/codex-packet-runner.mjs

Or use it as a process shim:

node scripts/autonomous-drive.mjs "Build a snake game" --runner-command node --runner-args ./scripts/codex-packet-runner.mjs

scripts/codex-packet-runner.mjs accepts launcher controls such as --launcher-command, --launcher-args, --lane-policy-file, --sandbox, --model, --profile, --timeout-ms, --allow-subagents, --pass-through-env, and --pass-through-env-prefixes. By default it launches codex exec, validates the last message against schemas/packet-output.schema.json, and returns a stable runner envelope instead of scraping stdout.

By default, secret-bearing prefixes are only forwarded for trusted codex or omx launchers. Custom wrapper commands should opt in with --pass-through-env or --pass-through-env-prefixes if they need additional environment variables.

Review packet sets are dispatched in parallel by the autonomous loop, and verification packets now carry direct refs to current-iteration review packet outputs plus cloned review markdown artifacts under 05-verification/REVIEWS/.

Review finalization now also writes reconciled review artifacts under 04-review/REVIEWS/reconciliation.{json,md} and uses that unified view to deduplicate repeated findings, record reviewer conflicts, drive the authoritative review verdict and review-to-iteration handoff, and feed operator-facing status/memory summaries with the same unified review view.

That reconciliation is still deterministic, but it now also normalizes simple phrasing drift such as inflection, filler words, and word-order changes so equivalent findings can merge without requiring byte-for-byte statement matches. It also applies a small deterministic alias map for clearly equivalent review concepts such as tests vs coverage and broken vs fails, while recording merge provenance in the reconciliation artifact so operators can see why findings were unified.

The reconciliation artifact is schema-validated and caps per-finding provenance expansion with explicit omitted-input counts so context surfaces stay bounded as review lanes grow. Internal normalization details stay in memory only; the persisted artifact stores a slimmer operator-facing provenance summary.
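
To make the normalization idea concrete, here is a small deterministic sketch of merging findings that differ only in filler words, word order, or aliased terms. It is not the plugin's reconciliation code: the filler list, the alias direction, and the finding shape (`lane`, `statement`) are assumptions, and inflection handling is left out entirely.

```javascript
// Illustrative only. The filler list and alias map are assumed examples;
// the README names tests/coverage and broken/fails as equivalent concepts.
const FILLER_WORDS = new Set(['the', 'a', 'an', 'is', 'are', 'for', 'of', 'to']);
const ALIASES = new Map([['tests', 'coverage'], ['broken', 'fails']]);

// Builds an order-insensitive key so equivalent statements collide.
function findingKey(statement) {
  return statement
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((word) => word && !FILLER_WORDS.has(word))
    .map((word) => ALIASES.get(word) ?? word)
    .sort()
    .join(' ');
}

// Merges findings by key and records which lanes contributed to each one.
function reconcile(findings) {
  const merged = new Map();
  for (const { lane, statement } of findings) {
    const key = findingKey(statement);
    const entry = merged.get(key) ?? { statement, lanes: [] };
    entry.lanes.push(lane);
    merged.set(key, entry);
  }
  return [...merged.values()];
}
```

Recording the contributing lanes on each merged entry is a simplified stand-in for the merge provenance described above.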

For explicit host-side routing, scripts/codex-packet-runner.mjs also supports lane policy files. Those let you route execution, review, and verification packets, or individual lanes, to different launcher commands, args, profiles, models, and subagent policies.

When review lanes are not specified explicitly, the runtime now auto-escalates to security-reviewer for riskier work such as auth, production, database, schema, secret, token, or deploy-sensitive changes.
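
One way to picture that escalation rule is a keyword check over the task description. The keyword list mirrors the examples in the paragraph above; the naive substring matching, the function name, and the default lane name `reviewer` are assumptions and may not match how the runtime classifies risk.

```javascript
// Keywords taken from the examples in this README; the matching strategy
// and the default lane name are assumptions for illustration.
const RISK_KEYWORDS = ['auth', 'production', 'database', 'schema', 'secret', 'token', 'deploy'];

function selectReviewLanes(taskDescription, explicitLanes, defaultLanes = ['reviewer']) {
  // Explicitly requested lanes are respected as-is.
  if (explicitLanes && explicitLanes.length > 0) {
    return explicitLanes;
  }
  const text = taskDescription.toLowerCase();
  const risky = RISK_KEYWORDS.some((keyword) => text.includes(keyword));
  return risky ? [...defaultLanes, 'security-reviewer'] : defaultLanes;
}
```

Substring matching is deliberately crude here (for example, "author" would match "auth"); a real classifier would likely be more careful.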

For OMX-backed execution, point the launcher at omx exec instead:

node scripts/codex-packet-runner.mjs --launcher-command omx --launcher-args exec

Execution packets may use native Codex subagents when the lane covers independent implementation or verification sidecars. Review and verification lanes stay single-agent by default.

autonomous-host.mjs remains available as a compatibility alias for existing integrations. It continues an already-started autonomous run and does not create one or set governance options on its own. For new host integrations, prefer autonomous-drive.mjs. Existing integrations can keep invoking the alias with a process runner:

node scripts/autonomous-host.mjs -- node ./path/to/runner-process.mjs

The process runner receives one JSON request on stdin and must write exactly one packet-output JSON object to stdout.

If a process runner fails operationally, autonomous-drive and autonomous-host now return ok: false, error_code: "runner_failed", and a nested runner_error object with a stable failure classification instead of echoing raw child stdout or stderr.

If a runner returns an explicit failure envelope, the host surfaces that as nested runner_failure and stops before packet persistence.
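
A host wrapper could branch on those two failure shapes roughly as follows. The fields ok, error_code, runner_error, and runner_failure come from the paragraphs above; the function name and message wording are hypothetical, and the formal shapes live in the schemas under schemas/.

```javascript
// Hypothetical host-side summary of an autonomous-drive result object.
function summarizeDriveResult(result) {
  if (result.ok === false && result.error_code === 'runner_failed') {
    // Operational failure: rely on the stable classification, not raw output.
    return `runner failed operationally: ${JSON.stringify(result.runner_error)}`;
  }
  if (result.runner_failure) {
    // Explicit failure envelope: the host stopped before packet persistence.
    return `runner reported failure: ${JSON.stringify(result.runner_failure)}`;
  }
  return result.ok ? 'ok' : `stopped: ${result.error_code ?? 'unknown'}`;
}
```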

Development

Run the test suite with:

npm test

Run the full local signoff gate with:

npm run verify

To trim .delivery history without touching the current active run:

node scripts/delivery.mjs gc --keep-runs 3

To prune a specific workstream more aggressively:

node scripts/delivery.mjs gc --workstream feature-x --keep-runs 1
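
The retention rule behind gc (always keep the active run, keep the newest <n> inactive runs, prune the rest) can be sketched as below. The run record fields `id`, `active`, and `finishedAt` are assumptions for illustration; the real runtime reads run JSON under .delivery/workstreams/<name>/runs.

```javascript
// Hypothetical sketch of gc's retention rule. Returns the ids of runs
// whose historical artifacts would be pruned.
function selectRunsToPrune(runs, keepRuns) {
  const inactiveNewestFirst = runs
    .filter((run) => !run.active) // the active run is never a candidate
    .sort((a, b) => b.finishedAt - a.finishedAt);
  return inactiveNewestFirst.slice(keepRuns).map((run) => run.id);
}
```

With three inactive runs and `--keep-runs 1`, this sketch would select the two oldest inactive runs and leave the active run untouched.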
