claude-skill-debate
v2.0.0
Claude vs GPT multi-turn AI debate skill for Claude Code with cross fact-checking and HTML report generation
The best way to stress-test an idea is to have two smart opponents try to destroy each other's arguments, then let a judge decide.
Most AI analysis gives you one perspective. You nod along, never knowing what it missed. Debate forces both sides to attack, defend, and cite evidence, while a neutral reporter picks the winner.
No false balance. No "both sides have a point." Someone wins.
Why
Single AI analysis:
"Here's my take" → You accept it → Confirmation bias
Debate:
Claude: "Here's why I'm right"
GPT: "Here's why you're wrong"
Claude: "Your data is cherry-picked, here's the full picture"
GPT: "Your model assumes 20% CAGR, Microsoft only did 12%"
Judge: "GPT wins 29.5 to 27.5. Here's why."

Two adversarial AIs will find flaws that a single AI never surfaces. The cross fact-check phase catches bad numbers, misleading citations, and logical gaps.
Install
```shell
npx claude-skill-debate
```
That's it. One command. The skill is copied to ~/.claude/skills/debate/ and ready to use.
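To sanity-check that the skill file landed where Claude Code expects it, you can test for the file. A minimal sketch; it uses a scratch directory in place of `$HOME` so it runs anywhere, and the file it creates is a stand-in for what the installer copies:

```shell
# Sanity-check sketch: is the skill file where Claude Code looks for it?
# SKILL_HOME is a scratch dir standing in for "$HOME" so the example is self-contained.
SKILL_HOME="$(mktemp -d)"
mkdir -p "$SKILL_HOME/.claude/skills/debate"
echo "# debate skill" > "$SKILL_HOME/.claude/skills/debate/SKILL.md"  # stand-in for the installed file

if [ -f "$SKILL_HOME/.claude/skills/debate/SKILL.md" ]; then
  echo "debate skill installed"
else
  echo "skill missing: re-run the installer"
fi
```

On a real machine, drop the scratch setup and test `"$HOME/.claude/skills/debate/SKILL.md"` directly.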
Or install manually:
```shell
git clone https://github.com/YunseobShin/claude-skill-debate.git
mkdir -p ~/.claude/skills/debate
cp claude-skill-debate/SKILL.md ~/.claude/skills/debate/SKILL.md
```
Usage
Inside a Claude Code session:
```
/debate Will AI replace software engineers?
/debate React vs Vue for new projects in 2026
/debate Tesla stock price outlook
/debate Should central banks adopt CBDCs?
```
Natural language also triggers it:
"Have two AIs debate this"
"Make Claude and GPT argue about this"The Arena
Five phases. Two combatants. One judge.
```
Phase 0  PREPARATION
         Topic → 3-5 key arguments → debate rules
──────────────────────────────────────────
Phase 1  INDEPENDENT ANALYSIS (parallel)
         Claude ──┐
                  ├──→ 500-800 word analysis each
         GPT ────┘
──────────────────────────────────────────
Phase 2  FREE DEBATE (up to 5 rounds)
         Round 1: Claude attacks → GPT counters
         Round 2: GPT attacks → Claude counters
         ...
         Convergence check: no new arguments → early exit
──────────────────────────────────────────
Phase 3  CROSS FACT-CHECK
         Claude → verifies GPT's top 3 claims
         GPT → verifies Claude's top 3 claims
         Verdict: [Confirmed / Refuted / Unverified]
──────────────────────────────────────────
Phase 4  VERDICT REPORT
         Neutral Reporter agent reads full transcript
         → HTML report (Tailwind CSS, dark mode)
         → Clear winner, not "both have merits"
```
Why Cross Fact-Check Matters
In our first real debate (Palantir stock outlook), fact-checking caught:
- Claude citing Q4 revenue growth as +70% (actual: +36%, it mixed up adjusted vs GAAP)
- GPT citing commercial revenue growth as +109% (actual: +60% global, +109% was US-only)
Both sides had compelling arguments. But one side had more wrong numbers. The fact-check changed the verdict.
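Because the Phase 3 verdict tags are plain text in the factcheck files, they are easy to tally with standard tools. A hedged sketch; the file contents below are invented for illustration, while real files live under the run directory described in the Output section:

```shell
# Tally fact-check verdicts from a run's factcheck files.
# RUN and the file body are fabricated for the demo; the [Confirmed/Refuted/Unverified]
# tags mirror the Phase 3 verdict format.
RUN="$(mktemp -d)"
cat > "$RUN/factcheck_by_claude.md" <<'EOF'
Claim 1: Q4 revenue grew +70%. Verdict: [Refuted]
Claim 2: US commercial revenue grew +109%. Verdict: [Confirmed]
Claim 3: Margin guidance was raised. Verdict: [Unverified]
EOF

for v in Confirmed Refuted Unverified; do
  n="$(grep -c "\[$v\]" "$RUN"/factcheck_by_*.md)"
  echo "$v: $n"
done
```

With the sample file above, each verdict tallies to 1; pointing `RUN` at a real debate directory tallies both sides' checks at once.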
Output
All artifacts saved to /tmp/debate/YYYYMMDD_HHMMSS/:
| File | Contents |
|:-----|:---------|
| topic.md | Topic & key arguments |
| rules.md | Debate rules |
| analysis_claude.md | Claude's independent analysis |
| analysis_codex.md | GPT's independent analysis |
| round_N_claude.md | Claude's round N statement |
| round_N_codex.md | GPT's round N statement |
| factcheck_by_claude.md | Claude fact-checks GPT |
| factcheck_by_codex.md | GPT fact-checks Claude |
| report.html | Final HTML verdict report |
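Since /tmp is cleared on reboot, it is worth archiving a run you care about. A sketch that copies the most recent run; `SRC_ROOT`, `DEST`, and the timestamped directory are scratch stand-ins for the real /tmp/debate layout:

```shell
# Archive the newest debate run before /tmp is cleared.
# SRC_ROOT and DEST are scratch dirs standing in for /tmp/debate and a permanent
# archive location; the run directory is fabricated for the demo.
SRC_ROOT="$(mktemp -d)"
DEST="$(mktemp -d)"
mkdir -p "$SRC_ROOT/20260101_120000"
touch "$SRC_ROOT/20260101_120000/report.html"

latest="$(ls -1d "$SRC_ROOT"/*/ | sort | tail -n 1)"  # newest by timestamped name
latest="${latest%/}"                                  # strip trailing slash for portable cp
cp -r "$latest" "$DEST/"
echo "archived $(basename "$latest")"
```

The YYYYMMDD_HHMMSS naming means a plain lexical `sort` is also chronological, so `tail -n 1` picks the latest run.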
The Report
The HTML report has 8 sections:
- Header with topic, date, participants
- Background (200 words max)
- Key issues in card layout
- Debate highlights with per-round quotes
- Fact-check results table (claim / verdict / evidence)
- Final verdict with clear winner and scoring
- Conclusion & takeaways
- Disclaimer
The report auto-opens in your browser on completion.
Prerequisites
| Requirement | Required | Notes |
|:---|:---:|:---|
| Claude Code | Yes | CLI installed and authenticated |
| OpenAI Codex CLI | Yes | npm i -g @openai/codex |
| Codex MCP Server | No | Enables multi-turn threading (recommended) |
Codex Setup
```shell
# Install
npm install -g @openai/codex

# Authenticate (ChatGPT Plus or Pro Plan)
codex auth login
```
Add to ~/.claude/settings.json:
```json
{
  "mcpServers": {
    "codex": {
      "command": "codex",
      "args": ["--full-auto", "mcp"]
    }
  }
}
```
Without MCP, the skill falls back to CLI mode automatically.
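If you prefer to script the config step, the JSON above can be written out and validated before Claude Code reads it. A sketch using a scratch file; the real path is ~/.claude/settings.json, and note that writing a whole file would clobber existing settings, so merge by hand if you already have some:

```shell
# Write the Codex MCP entry and confirm it parses as JSON.
# SETTINGS is a scratch file standing in for ~/.claude/settings.json.
# Caution: this writes a complete file; merge manually into existing settings.
SETTINGS="$(mktemp)"
cat > "$SETTINGS" <<'EOF'
{
  "mcpServers": {
    "codex": {
      "command": "codex",
      "args": ["--full-auto", "mcp"]
    }
  }
}
EOF

python3 -m json.tool "$SETTINGS" > /dev/null && echo "settings JSON is valid"
```

A trailing comma or missing brace here fails silently at runtime, so a parse check before launching Claude Code is cheap insurance.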
Design Principles
| Principle | Implementation |
|:----------|:---------------|
| No false balance | The report must pick a winner based on evidence |
| Cross-verification | Both sides fact-check the opponent's top 3 claims |
| Early termination | Debate stops when arguments start repeating |
| Parallel execution | Independent phases run concurrently |
| Adversarial by design | Each side is prompted to attack, not agree |
Timing
Typically 3-5 minutes depending on round count. MCP mode is slightly faster than CLI fallback.
Notes
- Debate logs live in /tmp and are lost on reboot. Copy them if you need to keep them.
- Investment/stock debates are for reference only, not financial advice.
- Codex CLI works with ChatGPT Plus or Pro Plan auth. API key usage is billed separately.
