planpong
v0.6.2
Multi-model adversarial plan review — orchestrates AI agents to critique and refine implementation plans
Planpong
Adversarial plan review for AI-assisted development. Two AI models play ping-pong with your plan — one critiques, the other revises — until the plan converges or you stop them.
Plans go through three review phases, each with a different lens:
| Round | Phase     | What the reviewer looks for |
| ----- | --------- | --------------------------- |
| 1     | Direction | Is this the right problem? Is the approach sound? Is the scope appropriate? |
| 2     | Risk      | Pre-mortem: assume the plan fails. Surface hidden assumptions, dependencies, and failure modes. |
| 3+    | Detail    | Implementation completeness: missing steps, edge cases, gaps, verification criteria. |
The planner model evaluates each piece of feedback independently — accepting, rejecting, or deferring with rationale — then rewrites the plan. This continues until the reviewer approves or the round limit is reached.
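The loop described above can be sketched roughly as follows. This is a minimal, synchronous illustration and not planpong's actual implementation: every type and function name here (Phase, Review, Revision, phaseForRound, runReview) is hypothetical, and the real tool shells out to AI CLIs rather than calling in-process functions.

```typescript
// Hypothetical types for illustration only; not planpong's real API.
type Phase = "direction" | "risk" | "detail";
type Decision = "accepted" | "rejected" | "deferred";

interface Issue { id: string; title: string; }
interface Review { approved: boolean; issues: Issue[]; }
interface Triage { issueId: string; decision: Decision; rationale: string; }
interface Revision { plan: string; triage: Triage[]; }

type Reviewer = (plan: string, phase: Phase) => Review;
type Planner = (plan: string, issues: Issue[]) => Revision;

// Round 1 is direction, round 2 is risk, every later round is detail.
function phaseForRound(round: number): Phase {
  if (round === 1) return "direction";
  if (round === 2) return "risk";
  return "detail";
}

// Alternate review and revision until the reviewer approves
// or the round limit is reached.
function runReview(
  plan: string,
  review: Reviewer,
  revise: Planner,
  maxRounds: number,
): { plan: string; rounds: number; approved: boolean } {
  let current = plan;
  for (let round = 1; round <= maxRounds; round++) {
    const result = review(current, phaseForRound(round));
    if (result.approved) {
      return { plan: current, rounds: round, approved: true };
    }
    // The planner triages each issue and returns the rewritten plan.
    current = revise(current, result.issues).plan;
  }
  return { plan: current, rounds: maxRounds, approved: false };
}
```

Whether the reviewer may approve before the detail phase is not stated here, so the sketch simply allows approval in any round.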
Prerequisites
You need at least one AI CLI installed and authenticated:
- Claude Code: npm install -g @anthropic-ai/claude-code (Anthropic API key or Max subscription)
- Codex CLI: npm install -g @openai/codex (ChatGPT account or OpenAI API key)
- Gemini CLI: npm install -g @google/gemini-cli (Google account auth; run gemini once to authenticate)
If multiple are installed, planpong uses one for planning and a different one for reviewing (configurable). If only one is available, it falls back to using that CLI for both roles.
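As a rough illustration of that fallback, the sketch below picks a planner and a reviewer from the list of installed CLIs. The function name pickRoles and the selection order are assumptions; the only parts taken from this README are the documented defaults (claude plans, codex reviews) and the single-CLI fallback.

```typescript
type Provider = "claude" | "codex" | "gemini";

interface Roles { planner: Provider; reviewer: Provider; }

// Hypothetical helper: choose roles from whatever CLIs were detected.
// The preference order is an assumption based on the documented defaults.
function pickRoles(
  installed: Provider[],
  preferred: Roles = { planner: "claude", reviewer: "codex" },
): Roles {
  const first = installed[0];
  if (first === undefined) {
    throw new Error("no supported AI CLI found");
  }
  // Use the preferred planner when it is installed, otherwise the first CLI found.
  const planner = installed.includes(preferred.planner) ? preferred.planner : first;
  // Prefer a reviewer that differs from the planner.
  const others = installed.filter((p) => p !== planner);
  const reviewer = others.includes(preferred.reviewer)
    ? preferred.reviewer
    : others[0] ?? planner; // only one CLI available: it fills both roles
  return { planner, reviewer };
}
```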
Note on gemini as reviewer: the gemini CLI does not expose a stable session-resume mechanism, so reviewer rounds run without persistent context. Expect noticeably slower per-round wall time than claude or codex when gemini is the reviewer. The first time you load a config that selects gemini as reviewer, planpong prints a stderr warning.
Verify your CLI works:
claude --version # or
codex --version # or
gemini --version
Planpong shells out to these CLIs; no API keys are configured in planpong itself.
Install
npm install -g planpong
Then run the interactive setup wizard:
planpong init
The wizard auto-detects which AI CLIs you have installed, lets you pick a planner + reviewer, and writes a working planpong.yaml for the current project. You can re-run it any time to tweak settings; only changed keys are written.
Setup (Claude Code MCP)
Add planpong as an MCP server so Claude Code can use it as a native tool:
claude mcp add planpong -- planpong-mcp
Allow the tools in your Claude Code settings (.claude/settings.json):
{
  "permissions": {
    "allow": ["mcp__planpong"]
  }
}
Restart Claude Code. The planpong tools should appear in your tool list.
Usage
Via Claude Code (recommended)
Ask Claude to review a plan:
Review my plan at docs/plans/my-feature.md using planpong
Or use the slash commands (auto-installed with the MCP server):
/planpong:review docs/plans/my-feature.md # autonomous — runs to completion
/planpong:review_interactive docs/plans/my-feature.md # pauses between rounds for your input
/planpong:status <session_id> # current state and round history
/planpong:sessions # list all review sessions in this project
/planpong:report <session_id> # detailed phase-specific report (direction confidence, risk register, round history)
Via CLI
planpong review docs/plans/my-feature.md
Configuration
Optional. Run planpong init to generate this interactively, or create planpong.yaml in your project root by hand:
planner:
  provider: claude # claude, codex, or gemini
  model: opus # provider-specific; aliases or full IDs both work
reviewer:
  provider: codex
  model: gpt-5.3-codex
  effort: xhigh # codex-only knob: low | medium | high | xhigh
max_rounds: 10
plans_dir: docs/plans
revision_mode: full # full or edits
planner_mode: inline # inline or external (see below)
Valid model and effort values are provider-specific and change as providers ship new versions. Run planpong config providers to see the current per-provider lists, or planpong init for an interactive picker. Don't copy the values above verbatim.
All fields are optional. Defaults: claude (planner) + codex (reviewer), 10 rounds, docs/plans/ directory, planner_mode: inline, revision_mode: full, human_in_loop: true.
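To make the defaulting behaviour concrete, here is a minimal sketch of how a partial config could be merged over the defaults listed above. The type names and the resolveConfig function are hypothetical; only the keys and default values come from this README.

```typescript
type ProviderName = "claude" | "codex" | "gemini";

interface RoleConfig { provider: ProviderName; model?: string; effort?: string; }

interface PlanpongConfig {
  planner: RoleConfig;
  reviewer: RoleConfig;
  max_rounds: number;
  plans_dir: string;
  revision_mode: "full" | "edits";
  planner_mode: "inline" | "external";
  human_in_loop: boolean;
}

// Every field may be omitted in planpong.yaml.
type PartialConfig = Partial<Omit<PlanpongConfig, "planner" | "reviewer">> & {
  planner?: Partial<RoleConfig>;
  reviewer?: Partial<RoleConfig>;
};

// Defaults as documented in this README.
const DEFAULTS: PlanpongConfig = {
  planner: { provider: "claude" },
  reviewer: { provider: "codex" },
  max_rounds: 10,
  plans_dir: "docs/plans",
  revision_mode: "full",
  planner_mode: "inline",
  human_in_loop: true,
};

// Hypothetical resolver: file values win, defaults fill the gaps.
// Nested planner/reviewer blocks are merged key by key.
function resolveConfig(file: PartialConfig = {}): PlanpongConfig {
  return {
    ...DEFAULTS,
    ...file,
    planner: { ...DEFAULTS.planner, ...file.planner },
    reviewer: { ...DEFAULTS.reviewer, ...file.reviewer },
  };
}
```

For example, resolveConfig({ max_rounds: 5 }) keeps claude and codex as planner and reviewer and changes only the round limit.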
Revision mode: full vs edits
revision_mode controls how the planner emits a revised plan after each round of feedback:
- full (default): the planner re-emits the entire plan markdown each round. Simple and robust; works for any plan size and any kind of change.
- edits: the planner emits a list of targeted text replacements ({ section, before, after }) which planpong applies server-side. Roughly 10× less output token volume on detail rounds that touch only a section or two, so revisions are noticeably faster on mature plans. Direction (round 1) still uses full rewrites since that's the round where sweeping changes are expected.
Use full for new plans where most rounds will rewrite large sections. Switch to edits once a plan has converged enough that rounds are touching one or two paragraphs at a time.
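As an illustration of what applying a { section, before, after } replacement might involve, here is a minimal sketch. It assumes a section is identified by the text of a markdown heading and that before must match exactly once inside that section; planpong's real matching rules may differ, and applyEdit and applyEdits are hypothetical names.

```typescript
interface PlanEdit { section: string; before: string; after: string; }

const isHeading = (line: string): boolean => /^#{1,6}\s/.test(line);

// Hypothetical: apply one targeted replacement inside the named section.
function applyEdit(plan: string, edit: PlanEdit): string {
  if (edit.before.length === 0) throw new Error("empty 'before' text");
  const lines = plan.split("\n");

  // Find the heading whose text equals the section name.
  const start = lines.findIndex(
    (l) => isHeading(l) && l.replace(/^#{1,6}\s+/, "").trim() === edit.section,
  );
  if (start === -1) throw new Error(`section not found: ${edit.section}`);

  // The section body runs until the next heading of any level (an assumption).
  let end = lines.length;
  for (let i = start + 1; i < lines.length; i++) {
    const line = lines[i];
    if (line !== undefined && isHeading(line)) { end = i; break; }
  }

  const body = lines.slice(start + 1, end).join("\n");
  const at = body.indexOf(edit.before);
  if (at === -1) throw new Error(`text not found in section: ${edit.section}`);
  if (body.indexOf(edit.before, at + 1) !== -1) {
    throw new Error(`ambiguous match in section: ${edit.section}`);
  }

  const patched = body.slice(0, at) + edit.after + body.slice(at + edit.before.length);
  return [...lines.slice(0, start + 1), patched, ...lines.slice(end)].join("\n");
}

// Apply a list of edits in order.
function applyEdits(plan: string, edits: PlanEdit[]): string {
  return edits.reduce(applyEdit, plan);
}
```

Scoping the match to one section means the same phrase elsewhere in the plan is left untouched, which is the property that makes small edits safe to apply mechanically.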
Planner mode: inline vs external
planner_mode is the most consequential operational choice. It decides who actually rewrites the plan after each round of feedback:
- inline (default): when you're driving planpong from Claude Code, you are the planner. Planpong returns the reviewer's issues; Claude reads the plan, edits it directly, and reports its accept/reject/defer decisions back via planpong_record_revision. No second model is invoked, so revisions are fast and use the conversational context Claude already has.
- external: planpong shells out to the configured planner provider (e.g. another claude -p or codex exec invocation) to produce the revision. Use this when running planpong outside Claude Code (CLI flow), or when you want a different model to plan than the one orchestrating.
Inline is the right default for the Claude-Code-as-orchestrator workflow; external is the right default for planpong review from a plain shell.
Viewing and changing config
planpong config # show resolved config with source annotations
planpong config path # print path to active config file
planpong config keys # list all keys with valid values, types, and defaults
planpong config providers # list per-provider model and effort values
planpong config get <key> # print a single resolved value
planpong config set <key> <value> # set a config value
Examples:
planpong config set reviewer.provider gemini
planpong config set reviewer.model gemini-2.5-pro
planpong config set max_rounds 5
planpong config set planner_mode inline
Valid keys: planner.provider, planner.model, planner.effort, reviewer.provider, reviewer.model, reviewer.effort, plans_dir, max_rounds, human_in_loop, revision_mode, planner_mode. Run planpong config keys for the canonical list with descriptions.
Config via MCP
Two MCP tools are available for programmatic config access:
- planpong_get_config: returns resolved config, file path, version, and per-key source provenance
- planpong_set_config: dry-run by default (confirm: false); pass confirm: true to write
MCP API notes
Planpong's MCP tools are designed to be safe under retries, duplicated calls, and orchestrator restarts:
- planpong_revise and planpong_record_revision require expected_round. Pass the round number returned by the most recent planpong_get_feedback. Stale calls (round mismatched lower) and out-of-order calls (mismatched higher) return precise errors instead of double-charging the planner.
- Tool calls are replay-safe. Calling planpong_get_feedback twice before the round's revision returns the existing feedback with idempotent_replay: true instead of re-invoking the reviewer. The same applies to planpong_revise and planpong_record_revision when the round's response artifact already exists.
- Per-session lock. Mutating MCP tools acquire an exclusive lock at .planpong/sessions/<id>/lock so two overlapping clients cannot both advance the same session.
- Reviewer findings carry an evidence flag. Each issue in planpong_get_feedback may include a quoted_text field (a short verbatim quote from the plan) and a verified: true | false flag set by planpong post-parse. verified: false means the quote could not be located in the plan, usually a hallucinated or paraphrased finding. Planners should deprioritize unverified issues. The response also includes unverified_count for quick triage.
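The expected_round check and replay behaviour might look something like the sketch below. It is an in-memory illustration under stated assumptions: SessionState, recordRevision, and the error codes stale_round and round_ahead are hypothetical names, and the real tools persist artifacts on disk. Only expected_round and idempotent_replay are taken from the notes above.

```typescript
interface SessionState {
  currentRound: number;            // round of the most recent feedback
  revisions: Map<number, string>;  // round -> stored revision artifact
}

type ReviseResult =
  | { ok: true; revision: string; idempotent_replay: boolean }
  | { ok: false; error: "stale_round" | "round_ahead"; expected: number; current: number };

// Hypothetical guard: reject mismatched rounds, replay existing artifacts,
// and only invoke the (expensive) planner when the round has no revision yet.
function recordRevision(
  state: SessionState,
  expectedRound: number,
  produce: () => string,
): ReviseResult {
  if (expectedRound < state.currentRound) {
    return { ok: false, error: "stale_round", expected: expectedRound, current: state.currentRound };
  }
  if (expectedRound > state.currentRound) {
    return { ok: false, error: "round_ahead", expected: expectedRound, current: state.currentRound };
  }
  const existing = state.revisions.get(expectedRound);
  if (existing !== undefined) {
    // A retry or duplicated call: return the stored artifact unchanged.
    return { ok: true, revision: existing, idempotent_replay: true };
  }
  const revision = produce();
  state.revisions.set(expectedRound, revision);
  return { ok: true, revision, idempotent_replay: false };
}
```

The point of the design is that a client can retry blindly after a timeout or restart: the worst case is a replayed response, never a second planner invocation for the same round.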
Most users driving planpong through Claude Code never see these primitives — the slash commands and the orchestrator's instructions handle them. They matter if you're building an external MCP client.
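If you are building such a client, the evidence flag is worth understanding. One plausible way to compute it is a whitespace-insensitive substring check, sketched below. This is an assumption about the approach, not planpong's actual matching logic; verifyIssues and ReviewIssue are hypothetical names, while quoted_text, verified, and unverified_count come from the notes above.

```typescript
interface ReviewIssue { title: string; quoted_text?: string; verified?: boolean; }

// Collapse runs of whitespace so line wrapping does not cause false negatives.
// (An assumption: the real check may be stricter or fuzzier.)
const normalize = (s: string): string => s.replace(/\s+/g, " ").trim();

// Hypothetical post-parse step: flag each quoted issue as verified or not.
function verifyIssues(
  plan: string,
  issues: ReviewIssue[],
): { issues: ReviewIssue[]; unverified_count: number } {
  const haystack = normalize(plan);
  const checked = issues.map((issue): ReviewIssue => {
    // Issues without a quote are left unflagged in this sketch.
    if (issue.quoted_text === undefined) return issue;
    const quote = normalize(issue.quoted_text);
    return { ...issue, verified: quote.length > 0 && haystack.includes(quote) };
  });
  const unverified_count = checked.filter((i) => i.verified === false).length;
  return { issues: checked, unverified_count };
}
```

A planner can then sort or filter on verified before deciding which findings to accept.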
What it produces
Planpong updates your plan file in-place and adds a status line tracking the review:
**planpong:** R3/10 | claude → codex | 2P2 1P3 → 1P3 → 0 | Accepted: 4 | +32/-8 lines | 5m 23s | Approved after 3 rounds
Reading left to right: round 3 of 10, claude planned / codex reviewed, issue trajectory across rounds, total accepted issues, line delta from original, elapsed time, and outcome.
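If you want to read the status line from a script, a small parser along these lines may help. It is a sketch that assumes the line always has exactly the seven pipe-separated fields shown in the example; the format is not documented as stable, and parseStatusLine and StatusLine are hypothetical names.

```typescript
interface StatusLine {
  round: number;
  maxRounds: number;
  planner: string;
  reviewer: string;
  trajectory: string[]; // issue counts per round, kept as raw strings
  accepted: number;
  added: number;
  removed: number;
  elapsed: string;
  outcome: string;
}

// Hypothetical parser for the example format; returns null on any mismatch.
function parseStatusLine(line: string): StatusLine | null {
  const prefix = /^\*\*planpong:\*\*\s*/;
  if (!prefix.test(line)) return null;
  const parts = line.replace(prefix, "").split(" | ").map((p) => p.trim());
  if (parts.length !== 7) return null;
  const [rounds, models, issues, accepted, delta, elapsed, outcome] =
    parts as [string, string, string, string, string, string, string];

  const r = /^R(\d+)\/(\d+)$/.exec(rounds);
  const m = /^(\S+) → (\S+)$/.exec(models);
  const a = /^Accepted: (\d+)$/.exec(accepted);
  const d = /^\+(\d+)\/-(\d+) lines$/.exec(delta);
  if (!r || !m || !a || !d) return null;

  return {
    round: Number(r[1]),
    maxRounds: Number(r[2]),
    planner: m[1] ?? "",
    reviewer: m[2] ?? "",
    trajectory: issues.split(" → ").map((s) => s.trim()),
    accepted: Number(a[1]),
    added: Number(d[1]),
    removed: Number(d[2]),
    elapsed,
    outcome,
  };
}
```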
Session data is stored in .planpong/sessions/ (add to .gitignore).
Development
git clone https://github.com/andrewhml/planpong.git
cd planpong
npm install # installs deps + configures git hooks
npm run build # compile TypeScript
npm run typecheck # type-check without emitting
A pre-commit hook automatically rebuilds dist/ when TypeScript files are staged.
Publishing
Automated via GitHub Actions with npm trusted publishing (OIDC). No tokens needed.
npm version patch # bumps version + creates git tag
git push && git push --tags # triggers publish to npm
License
MIT
