agent-order
v0.3.0
The Order of the Agents - multi-model deliberation for consequential decisions. Anonymized critique, aggregator synthesis, rubric-scored review, and shareable HTML decision artifacts.
The Order of the Agents
The Order of the Agents turns a rough scenario into a reviewed PRD, ADR, RFC, memo, or plan. Multiple agents first think independently, then challenge each other without seeing model identities, then mix the strongest ideas into a final decision packet.
agent-order prd ./scenario.md
You get:
- final/report.md: the final PRD, ADR, RFC, memo, plan, or recommendation
- final/decision-log.md: what happened, rubric outcomes, blockers, cost/time
- index.html: a shareable visual run index
- index.md: a lightweight run summary
- turns/*.md: every agent position, critique, revision, synthesis, and review
- trace.jsonl: structured run events for replay, debugging, and evals
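As a rough illustration of how trace.jsonl might be inspected, the sketch below counts records by one field. It assumes only what the .jsonl extension suggests, namely one JSON object per line. The `type` key, the event names, and the `count_events` helper are hypothetical; the trace schema is not documented here.

```python
import json
from collections import Counter


def count_events(lines, key="type"):
    """Count JSON Lines records by the value of `key`; blank lines are skipped."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        counts[event.get(key, "<missing>")] += 1
    return counts


# Hypothetical records; the real trace schema is not documented in this README.
sample = [
    '{"type": "turn_started", "agent": "codex"}',
    '{"type": "turn_finished", "agent": "codex"}',
    "",
    '{"type": "turn_started", "agent": "claude"}',
]
print(count_events(sample))
```

For a real run, the same function accepts an open file handle on the run's trace.jsonl, since a file object iterates line by line.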
Short version: stop asking one model for decisions that matter.
Why
Single-model answers can be confident and incomplete. The Order of the Agents is built for decisions where the disagreement matters: product requirements, architecture choices, RFCs, migration plans, build-vs-buy decisions, and incident follow-ups.
The product is not another chat wrapper. It is a fair fight for good ideas: agents start with their own plans, critique anonymized peer responses, accept what improves their answer, and push back when feedback is weak. The final report is the mix of the best surviving ideas, with dissent preserved when it still matters.
Quick Start
agent-order prd ./scenario.md
open agent-order-runs/<latest>/index.html
Pick the artifact you want:
agent-order adr ./decision.md
agent-order rfc ./proposal.md
agent-order build-vs-buy ./analytics.md
Pick more deliberation only when you need it:
agent-order prd ./scenario.md --depth quick
agent-order prd ./scenario.md --depth standard
agent-order prd ./scenario.md --depth deep
Install
Install from npm:
npm install -g agent-order
Then run:
agent-order "Should we build or buy an internal analytics dashboard?"
For zero-install use:
npx agent-order@latest ./scenario.md
For local development from this repo:
npm install
npm run build
npm link
The Order assumes the agent CLIs you use, such as codex, claude, gemini, or other configured commands, are already installed and logged in.
First Demos
Run the mock demo without calling Codex or Claude:
npm run demo
Run a mock PRD demo using the built-in PRD template:
npm run demo:prd
Run a real two-agent deliberation:
agent-order ./examples/build-vs-buy-analytics/scenario.md --agents codex,claude --out ./agent-order-runs
Good first scenarios:
agent-order ./examples/build-vs-buy-analytics/scenario.md
agent-order ./examples/rest-to-trpc/scenario.md
agent-order grill ./examples/review-agent-order-readme/scenario.md
agent-order prd ./docs/evals/scenarios/prd/saved-search/scenario.md --depth quick
agent-order adr ./docs/evals/scenarios/adr/rest-vs-trpc/scenario.md --depth quick
agent-order build-vs-buy ./docs/evals/scenarios/build-vs-buy/analytics/scenario.md --depth quick
What It Writes
agent-order-runs/<timestamp>/
  scenario.md
  index.html
  index.md
  trace.jsonl
  schemas/
    agent-turn.schema.json
  prompts/
  raw/
  turns/
    0001-codex.initial-position.md
    0002-claude.initial-position.md
    0003-codex.critique.md
    0004-claude.critique.md
    ...
  final/
    report.md
    decision-log.md
turns/ is the audit trail. index.html is the easiest artifact to share. final/report.md is the final decision document.
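The turn filenames in the listing appear to follow a `<sequence>-<agent>.<phase>.md` pattern. As a minimal sketch, assuming that pattern holds for every file in turns/ (it is inferred only from the four example names, and `parse_turn_filename` and `group_by_phase` are hypothetical helpers, not part of agent-order), the audit trail could be grouped by phase like this:

```python
import re
from collections import defaultdict

# Assumed pattern, inferred only from the example names in the listing:
# 4-digit sequence, agent id, phase, ".md" extension.
TURN_NAME = re.compile(r"^(?P<seq>\d{4})-(?P<agent>[^.]+)\.(?P<phase>[^.]+)\.md$")


def parse_turn_filename(name):
    """Split a turn filename into (sequence, agent, phase), or None if it does not match."""
    match = TURN_NAME.match(name)
    if match is None:
        return None
    return int(match["seq"]), match["agent"], match["phase"]


def group_by_phase(names):
    """Map each phase to the agents that produced a turn in it, in sequence order."""
    phases = defaultdict(list)
    for _, agent, phase in sorted(filter(None, map(parse_turn_filename, names))):
        phases[phase].append(agent)
    return dict(phases)


example = [
    "0003-codex.critique.md",
    "0001-codex.initial-position.md",
    "0004-claude.critique.md",
    "0002-claude.initial-position.md",
]
print(group_by_phase(example))
```

Sorting on the numeric sequence keeps the phases in the order the run produced them, even when the directory listing comes back unordered.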
Commands
agent-order <scenario text | scenario.md> [options]
agent-order grill <scenario text | scenario.md> [options]
agent-order <template> <scenario text | scenario.md> [options]
agent-order replay <run-dir> [options]
agent-order init [--config agent-order.config.yaml]
agent-order check [--config agent-order.config.yaml]
agent-order doctor [--config agent-order.config.yaml]
Templates:
prd | adr | rfc | build-vs-buy | migration-plan | incident-review
Common options:
agent-order ./scenario.md --agents codex,claude
agent-order ./scenario.md --depth quick
agent-order prd ./scenario.md --depth standard
agent-order ./scenario.md --max-turns 10
agent-order ./scenario.md --human-input never
agent-order ./scenario.md --out ./runs
Templates
Templates give the agents a target artifact shape and a binary final-review rubric.
Built-in templates:
- prd: product requirements document
- adr: architecture decision record
- rfc: request for comments
- build-vs-buy: build vs buy memo
- migration-plan: staged migration plan
- incident-review: blameless incident review
Example:
agent-order prd ./docs/evals/scenarios/prd/launch-readiness/scenario.md --depth quick
Override or add templates with YAML/JSON files:
agent-order --template my-template --templates-dir ./templates ./scenario.md
Depth
Depth controls how much deliberation happens. You can omit it; the default behaves like quick.
- quick: fast two-agent review, good default for normal work
- standard: adds another model family and an aggregator pass
- deep: more agents and stronger synthesis for expensive decisions
- cheap: open-source-heavy roster for cost-conscious runs
Under the hood, depth presets choose a roster and synthesis strategy. Every external CLI in the preset must be installed locally.
Example:
agent-order rfc ./scenario.md --depth deep
Use doctor to see configured agents plus available presets and templates:
agent-order doctor
Deliberation Flow
The default flow is intentionally simple:
independent plans -> anonymized critique -> revisions -> synthesis -> rubric review
In more detail:
initial-position -> agents produce their own plans before seeing peers
critique -> agents critique Response A/B/C, not "Claude" or "Codex"
revision -> agents can accept good feedback or defend their position
synthesis -> strongest ideas are combined into one artifact
final-review -> the artifact is scored against the template rubric
synthesis-revision -> runs when blockers or failed rubric criteria need repair
Agent outputs include structured data:
- claims: recommendations, assumptions, risks, facts, decisions
- objections: critique items with target turn/claim and severity
- rubric_scores: binary pass/fail review criteria with evidence
- incorporated_objection_ids: which objections were accepted or addressed
Unincorporated major/blocking objections can be preserved in a ## Minority Report section.
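To make the structured fields concrete, here is a minimal sketch of how failed rubric criteria and unincorporated objections could be picked out. The field names `objections`, `rubric_scores`, and `incorporated_objection_ids` come from the list above. The record keys (`id`, `severity`, `criterion`, `passed`, `evidence`, `text`), the severity values, and both helper functions are assumptions for illustration; the actual agent-turn schema may differ.

```python
def failed_criteria(rubric_scores):
    """Return the names of rubric criteria that did not pass (scores are binary)."""
    return [score["criterion"] for score in rubric_scores if not score["passed"]]


def minority_report(objections, incorporated_objection_ids):
    """Return major or blocking objections that were never incorporated."""
    incorporated = set(incorporated_objection_ids)
    return [
        obj
        for obj in objections
        if obj["severity"] in {"major", "blocking"} and obj["id"] not in incorporated
    ]


# Hypothetical data shaped after the field names listed above.
objections = [
    {"id": "obj-1", "severity": "blocking", "text": "No rollback plan."},
    {"id": "obj-2", "severity": "minor", "text": "Rename the metrics table."},
    {"id": "obj-3", "severity": "major", "text": "Cost estimate ignores support load."},
]
rubric_scores = [
    {"criterion": "has-success-metrics", "passed": True, "evidence": "Section 3"},
    {"criterion": "has-rollback-plan", "passed": False, "evidence": "Not found"},
]

print(failed_criteria(rubric_scores))
print([obj["id"] for obj in minority_report(objections, ["obj-1"])])
```

In terms of the flow above, a non-empty list of failed criteria corresponds to the case where synthesis-revision runs, and the objections returned by the second function are the kind that could be kept in a Minority Report section.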
Human Input
Human input is part of the protocol, not a side channel.
Use grill mode when the scenario needs clarification before deliberation:
agent-order grill "Should we move our frontend to a monorepo?"
During a run, agents can also emit structured questions for the user. The orchestrator deduplicates them and pauses only when configured.
human_input:
  mode: on_blocking_questions
  max_questions_per_pause: 3
Disable human pauses:
agent-order ./scenario.md --human-input never
Replay
Replay reruns a previous frozen scenario with the current config, or with inherited metadata from the source run.
agent-order replay ./agent-order-runs/2026-04-26-144108
agent-order replay ./agent-order-runs/2026-04-26-144108 --depth deep
The new run links back to the source run with replay-source.md.
Eval Harness
The repo includes a small artifact eval harness under docs/evals/.
Current branch:
npm run eval -- --version current --scenario prd/saved-search
Named baselines require an explicit executable so version labels cannot accidentally evaluate the same binary:
npm run eval -- --version v0.1 --agent-order /path/to/agent-order-v0.1.js
npm run eval -- --version v0.2 --agent-order ./dist/bin/agent-order.js
npm run eval -- --compare v0.1 v0.2
The judge command defaults to claude and must be available on PATH:
npm run eval -- --version current --judge-command claude
See docs/evals/README.md for the harness layout and ship-gate notes.
Configuration
Create a starter config:
agent-order init
Default shape:
protocol: agent-order/v1
agents:
  - id: codex
    adapter: codex-cli
    command: codex
    preset: codex
  - id: claude
    adapter: claude-cli
    command: claude
    preset: claude
limits:
  max_turns: 12
output:
  dir: ./agent-order-runs
synthesis:
  agent: codex
  aggregators: null
  meta_synthesizer: null
intake:
  enabled: false
  mode: off
  facilitator: codex
  max_questions: 6
human_input:
  mode: on_blocking_questions
  max_questions_per_pause: 3
  ask_before_final: false
final_review:
  enabled: true
template: null
templates_dir: null
cost_warning_usd: 0
If no config sets limits.max_turns, The Order computes a turn budget from the roster, intake settings, and synthesis mode. The budget is enforced even with batched turns.
Adding Another Agent
Codex and Claude are built in. Any scriptable CLI can participate through generic-cli:
agents:
  - id: gemini
    adapter: generic-cli
    command: gemini
    args:
      - -p
      - "{{prompt}}"
    input:
      mode: arg
    output:
      mode: stdout
    check_args:
      - --version
The generic adapter can pass prompts through stdin, a prompt file, or templated args like {{prompt}}, then reads either JSON matching the agent-turn schema or plain Markdown from stdout.
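The two behaviors described here, filling the {{prompt}} placeholder and accepting either JSON or Markdown on stdout, can be sketched as follows. This is an illustration under stated assumptions, not the adapter's actual code: `render_args` and `parse_stdout` are hypothetical names, and the sketch does not validate JSON against the agent-turn schema.

```python
import json


def render_args(args, prompt):
    """Replace the {{prompt}} placeholder in each templated argument."""
    return [arg.replace("{{prompt}}", prompt) for arg in args]


def parse_stdout(stdout):
    """Treat stdout as a JSON object when it parses as one, otherwise as Markdown."""
    try:
        data = json.loads(stdout)
    except json.JSONDecodeError:
        return {"format": "markdown", "body": stdout}
    if isinstance(data, dict):
        return {"format": "json", "body": data}
    # Valid JSON that is not an object (a bare number, say) is kept as plain text.
    return {"format": "markdown", "body": stdout}


# Build the argument vector for the gemini entry shown in the config above.
argv = ["gemini"] + render_args(["-p", "{{prompt}}"], "Critique Response A.")
print(argv)
print(parse_stdout('{"claims": []}')["format"])
print(parse_stdout("## Position\nBuy, do not build.")["format"])
```

Keeping the command as a list of arguments rather than a single shell string means the prompt text never needs shell quoting, which matters when prompts contain quotes or newlines.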
Curated adapter presets exist for:
codex | claude | gemini | grok | qwen | deepseek
Resources
- docs/evals/README.md: eval harness usage
- docs/evals/scenarios/: fixed eval scenarios
- examples/: simple starter scenarios and mock config
- docs/showcase-video/: source for the showcase video
Product Positioning
The Order of the Agents is for senior developers, tech leads, staff engineers, PMs, and AI-heavy builders who already use terminal AI tools and want better review for consequential decisions:
- architecture choices
- build-vs-buy calls
- migration plans
- security and reliability reviews
- PRD/RFC critique
- incident remediation reviews
- developer-tool product strategy
Use it when the decision is worth a few minutes of critique. Do not use it for every prompt.

