debate-mcp

v1.0.0

MCP server that stress-tests your decisions with adversarial AI debate. GPT vs Gemini, Skeptic vs Steelman, grounded in web search.

Debate MCP

Stress-test your decisions before you commit. An MCP server that runs adversarial AI debates between frontier models, grounded in live web search.

Most AI tools optimize for consensus. Debate MCP optimizes for finding where your plan breaks.

How It Works

You describe your plan
        |
        v
  [Web Search] -- gathers current facts, laws, regulations
        |
        v
  +-----------+          +-----------+
  |  SKEPTIC  |          | STEELMAN  |
  |  (GPT)    |          | (Gemini)  |
  |           |          |           |
  | Attacks   |          | Finds the |
  | your plan |          | strongest |
  | ruthlessly|          | version,  |
  |           |          | then      |
  |           |          | stress-   |
  |           |          | tests it  |
  +-----------+          +-----------+
        |    Round 2: they     |
        |    read each other   |
        |    (anonymized) and  |
        +--- argue back -------+
                  |
                  v
        [Structured synthesis]
        Recommendation + Crux +
        What Would Falsify +
        Unresolved disagreements
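
The flow above can be sketched as prompt assembly for the two rounds. This is a minimal illustration, not debate-mcp's source: the function names, the `Role` type, and the prompt wording are all assumptions made for the example.

```typescript
// Hypothetical sketch: these names and prompt strings are illustrative,
// not taken from debate-mcp's source.
type Role = "skeptic" | "steelman";

const ROLE_INSTRUCTIONS: Record<Role, string> = {
  skeptic: "Attack this plan and find where it breaks.",
  steelman: "Build the strongest version of this plan, then stress-test it.",
};

// Round 1: both roles receive the same plan and search evidence,
// but different instructions (asymmetric roles).
function round1Prompt(role: Role, plan: string, evidence: string[]): string {
  const facts = evidence.map((fact) => `- ${fact}`).join("\n");
  return [
    `PLAN:\n${plan}`,
    `VERIFIED EVIDENCE:\n${facts}`,
    `YOUR TASK: ${ROLE_INSTRUCTIONS[role]}`,
  ].join("\n\n");
}

// Round 2: the opponent's answer is relabeled "another analyst" so the
// prompt does not reveal which model or role produced it.
function round2Prompt(
  role: Role,
  plan: string,
  evidence: string[],
  opponentAnswer: string,
): string {
  return [
    round1Prompt(role, plan, evidence), // restate the original task
    `ANOTHER ANALYST WROTE:\n${opponentAnswer}`,
    "Concede the points that hold and rebut the ones that do not.",
  ].join("\n\n");
}
```

In a real run, each prompt would be sent to the model assigned to that role, and the two Round 2 answers would then be handed to the caller for synthesis.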

Quick Start

1. Install

npx debate-mcp

2. Add to Claude Code

claude mcp add debate npx debate-mcp \
  -e OPENAI_API_KEY=sk-... \
  -e GEMINI_API_KEY=AI...

3. Use it

Just tell Claude: "debate this", "what am I missing", "stress-test this plan", or "is this the right call".

> [!TIP]
> You can also trigger it with domain and current_leaning for targeted debates: "Debate this as a tax attorney. I'm leaning toward electing S-Corp."

What Makes This Different

| Feature | Why it matters |
|---------|---------------|
| Asymmetric roles | One model attacks (Skeptic), one defends then stress-tests (Steelman). Research shows this outperforms giving both models the same prompt. |
| Anonymized cross-examination | In Round 2, models see each other's work labeled "another analyst" to prevent identity bias. Based on NeurIPS 2025 research. |
| Web search grounding | Before the debate, the server searches for current facts, laws, and regulations. Both models receive this as VERIFIED evidence and must flag ungrounded claims as UNVERIFIED. |
| Confirmation bias attack | Tell it what you're leaning toward. The Skeptic will specifically attack that leaning. |
| Domain expertise | Pass domain: "tax attorney" or "systems architect" to make both analysts domain-specific. |
| Constrained synthesis | The output forces a structured format: Recommendation, Crux of Disagreement, What Would Falsify, Risk of Acting vs Waiting. Prevents AI from smoothing real disagreements into false consensus. |
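
To make the constrained synthesis concrete, here is one way a caller might check that a draft synthesis fills in every required section. The section names come from the table above; the `Synthesis` interface, its field names, and `missingSections` are hypothetical and not part of debate-mcp.

```typescript
// Hypothetical shape: section names follow the README table,
// field names and the checker itself are illustrative.
interface Synthesis {
  recommendation: string;
  cruxOfDisagreement: string;
  whatWouldFalsify: string;
  riskOfActingVsWaiting: string;
}

const REQUIRED_SECTIONS: (keyof Synthesis)[] = [
  "recommendation",
  "cruxOfDisagreement",
  "whatWouldFalsify",
  "riskOfActingVsWaiting",
];

// Returns the sections that are absent or blank, in the order listed above.
function missingSections(draft: Partial<Synthesis>): (keyof Synthesis)[] {
  return REQUIRED_SECTIONS.filter((key) => {
    const value = draft[key];
    return value === undefined || value.trim() === "";
  });
}
```

A draft that only states a recommendation would be rejected under this check, which is the point of the constraint: the disagreement and the falsification conditions have to be written down too.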

Example

  • Input: "Should we elect S-Corp status? Net profit $40K, based in NYC."
  • Domain: tax attorney
  • Current leaning: "I think S-Corp will save on self-employment tax"

What happens:

  1. Web search pulls current NYC tax rates, QBI rules, IRS thresholds
  2. Skeptic leads with: "At $40K net profit in NYC, S-Corp election is mathematically guaranteed to lose you money" and explains exactly why
  3. Steelman finds the strongest case for S-Corp, then stress-tests it against NYC-specific tax penalties
  4. Cross-examination: the Skeptic concedes the QBI interaction point; the Steelman concedes that compliance costs erase the savings
  5. Synthesis: Don't elect. Here's the specific profit threshold where it flips.

Configuration

Environment Variables

Required (at minimum):

| Variable | Description |
|----------|-------------|
| OPENAI_API_KEY | API key for the Skeptic model (OpenAI by default) |
| GEMINI_API_KEY | API key for the Steelman model (Gemini by default) |

Model configuration:

| Variable | Default | Description |
|----------|---------|-------------|
| SKEPTIC_MODEL | gpt-5.4 | Model for the Skeptic role |
| SKEPTIC_BASE_URL | OpenAI default | Base URL for the Skeptic API (change to use Grok, Groq, Mistral, etc.) |
| STEELMAN_MODEL | gemini-3.1-pro-preview | Model for the Steelman role |
| STEELMAN_PROVIDER | gemini | Set to openai to use any OpenAI-compatible API for Steelman |
| STEELMAN_BASE_URL | - | Base URL when using STEELMAN_PROVIDER=openai |
| STEELMAN_API_KEY | Falls back to GEMINI_API_KEY | API key when using STEELMAN_PROVIDER=openai |
| CALL_TIMEOUT_MS | 90000 | Timeout per API call (ms) |
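
The defaults and fallbacks in the table can be read as a small resolution step over the environment. The sketch below shows one plausible way to apply them. The variable names and default values come from the table; `resolveConfig`, `DebateConfig`, and the handling of invalid timeouts are assumptions, not debate-mcp's actual loader.

```typescript
// Hypothetical helper: variable names and defaults follow the README table,
// but this is not debate-mcp's actual configuration code.
type Env = Record<string, string | undefined>;

interface DebateConfig {
  skepticModel: string;
  skepticBaseUrl: string | undefined; // undefined means the OpenAI default endpoint
  steelmanModel: string;
  steelmanProvider: "gemini" | "openai";
  steelmanBaseUrl: string | undefined;
  steelmanApiKey: string | undefined;
  callTimeoutMs: number;
}

function resolveConfig(env: Env): DebateConfig {
  const timeout = Number(env["CALL_TIMEOUT_MS"]);
  return {
    skepticModel: env["SKEPTIC_MODEL"] ?? "gpt-5.4",
    skepticBaseUrl: env["SKEPTIC_BASE_URL"],
    steelmanModel: env["STEELMAN_MODEL"] ?? "gemini-3.1-pro-preview",
    // Anything other than "openai" is treated as the default provider (assumption).
    steelmanProvider: env["STEELMAN_PROVIDER"] === "openai" ? "openai" : "gemini",
    steelmanBaseUrl: env["STEELMAN_BASE_URL"],
    // STEELMAN_API_KEY falls back to GEMINI_API_KEY, as the table states.
    steelmanApiKey: env["STEELMAN_API_KEY"] ?? env["GEMINI_API_KEY"],
    // Missing or non-positive values fall back to the documented 90000 ms (assumption).
    callTimeoutMs: Number.isFinite(timeout) && timeout > 0 ? timeout : 90000,
  };
}
```

Passing the environment in as a plain object, rather than reading it globally, keeps the function easy to test with different combinations of variables.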

Use Any Model Provider

The Skeptic role works with any OpenAI-compatible API out of the box. Just change the base URL:

# Grok (xAI)
SKEPTIC_BASE_URL=https://api.x.ai/v1 SKEPTIC_MODEL=grok-3 OPENAI_API_KEY=xai-...

# Groq
SKEPTIC_BASE_URL=https://api.groq.com/openai/v1 SKEPTIC_MODEL=llama-4-scout OPENAI_API_KEY=gsk_...

# Ollama (local, free)
SKEPTIC_BASE_URL=http://localhost:11434/v1 SKEPTIC_MODEL=llama3 OPENAI_API_KEY=ollama

# Mistral
SKEPTIC_BASE_URL=https://api.mistral.ai/v1 SKEPTIC_MODEL=mistral-large OPENAI_API_KEY=...

The Steelman role uses Gemini by default (for Google Search grounding). To use a different provider, set STEELMAN_PROVIDER=openai and configure the base URL.

MCP Configuration (.mcp.json)

{
  "mcpServers": {
    "debate": {
      "command": "npx",
      "args": ["-y", "debate-mcp"],
      "env": {
        "OPENAI_API_KEY": "sk-...",
        "GEMINI_API_KEY": "AI..."
      }
    }
  }
}

> [!NOTE]
> Bring your own API keys. Debate MCP calls OpenAI and Google APIs directly. You are responsible for your own API usage and costs. A typical debate uses ~20,000-30,000 tokens across both providers.

Tool Parameters

| Parameter | Required | Description |
|-----------|----------|-------------|
| context | Yes | The plan, decision, or situation to debate. Include all relevant details. |
| question | No | Specific question to focus the debate on. |
| domain | No | Domain expertise: "tax attorney", "systems architect", "financial advisor", etc. |
| current_leaning | No | What you're leaning toward. The Skeptic attacks this to counter confirmation bias. |
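
As an illustration of the parameter table, the arguments could be typed and validated roughly as follows. The four parameter names are from the table; the `DebateArgs` interface and `parseDebateArgs` function are hypothetical, and the server's real validation may differ.

```typescript
// Hypothetical types: parameter names follow the README table,
// the validation logic is an illustrative assumption.
interface DebateArgs {
  context: string;
  question?: string;
  domain?: string;
  current_leaning?: string;
}

function parseDebateArgs(input: Record<string, unknown>): DebateArgs {
  const context = input["context"];
  if (typeof context !== "string" || context.trim() === "") {
    throw new Error("context is required");
  }
  const args: DebateArgs = { context };
  // Optional parameters are copied only when they are non-empty strings.
  for (const key of ["question", "domain", "current_leaning"] as const) {
    const value = input[key];
    if (typeof value === "string" && value.trim() !== "") {
      args[key] = value;
    }
  }
  return args;
}

// The S-Corp example from this README, expressed as tool arguments.
const example = parseDebateArgs({
  context: "Should we elect S-Corp status? Net profit $40K, based in NYC.",
  domain: "tax attorney",
  current_leaning: "I think S-Corp will save on self-employment tax",
});
```

In normal use you would not build these arguments by hand: the MCP client fills them in from your request, so this mainly shows which details are worth stating explicitly.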

The Research Behind It

Debate MCP's design draws on published research on multi-agent debate and on established decision-making frameworks:

  • Asymmetric roles outperform identical prompts ("Peacemaker or Troublemaker: How Sycophancy Shapes Multi-Agent Debate", 2025)
  • Anonymized cross-examination prevents identity bias ("When Identity Skews Debate", NeurIPS 2025)
  • Steelmanning before disagreeing forces genuine engagement (Kahneman's Adversarial Collaboration framework)
  • Re-stating the original question each round prevents context drift ("Talk Isn't Always Cheap", ICML 2025)
  • Caller-model synthesis avoids positional commitment bias from debaters ("Auditing Multi-Agent LLM Reasoning Trees", 2025)
  • Ray Dalio's triangulation method: get independent expert opinions, map convergence and divergence, then decide

When To Use It

Good for: Taxes, legal decisions, financial planning, business strategy, architecture choices, investment analysis, contract terms, hiring decisions, production deployments.

Not for: Simple coding tasks, quick lookups, routine bug fixes, or questions with obvious answers.

License

MIT