
@ezark-publish/agentdesk-mcp

v1.3.0

MCP server for AgentDesk AI-to-AI Service Marketplace. Quality review, service catalog, and marketplace execution — all via MCP tools.

AgentDesk MCP — Adversarial AI Review

Quality control for AI pipelines — one MCP tool. Works with Claude Code, Claude Desktop, and any MCP client.

29.5% of teams do NO evaluation of AI outputs. (LangChain Survey)

Knowledge workers spend 4.3 hours/week fact-checking AI outputs. (Microsoft 2025)

AgentDesk MCP fixes this. Add independent adversarial review to any AI pipeline in 30 seconds.

Quick Start

npm (recommended)

npx @ezark-publish/agentdesk-mcp

Claude Code

claude mcp add agentdesk-mcp -- npx @ezark-publish/agentdesk-mcp

Claude Desktop

{
  "mcpServers": {
    "agentdesk-mcp": {
      "command": "npx",
      "args": ["-y", "@ezark-publish/agentdesk-mcp"],
      "env": { "ANTHROPIC_API_KEY": "sk-ant-..." }
    }
  }
}

HTTP Transport (Streamable HTTP)

Run as an HTTP server for remote access, Smithery hosting, or multi-client setups:

# Start with HTTP transport on port 3100
MCP_HTTP_PORT=3100 npx @ezark-publish/agentdesk-mcp

# Or use the --http flag (defaults to port 3100)
npx @ezark-publish/agentdesk-mcp --http

  • MCP endpoint: POST http://localhost:3100/mcp
  • Health check: GET http://localhost:3100/health

Install from GitHub (alternative)

npm install github:Rih0z/agentdesk-mcp

Requirements

  • ANTHROPIC_API_KEY environment variable (uses your own key — BYOK)

Tools

review_output

Adversarial quality review of any AI-generated output. An independent reviewer assumes the author made mistakes and actively looks for problems.

Input:

| Parameter | Required | Description |
|-----------|----------|-------------|
| output | Yes | The AI-generated output to review |
| criteria | No | Custom review criteria |
| review_type | No | Category: code, content, factual, translation, etc. |
| model | No | Reviewer model (default: claude-sonnet-4-6) |
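MCP clients normally build the tool call for you, but it can help to see roughly what goes over the wire. The sketch below builds a JSON-RPC 2.0 `tools/call` request carrying the parameters from the table. `buildReviewRequest` is a hypothetical helper, not part of this package, and the empty-output check is an assumption about what "required" means. A real session also performs the MCP `initialize` handshake before any tool call.

```typescript
// Arguments accepted by review_output, per the parameter table above.
interface ReviewArgs {
  output: string;        // required
  criteria?: string;     // optional custom review criteria
  review_type?: string;  // e.g. "code", "content", "factual", "translation"
  model?: string;        // optional reviewer model override
}

// JSON-RPC 2.0 envelope for an MCP tools/call request.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: ReviewArgs };
}

// Hypothetical helper (assumption, not exported by the package).
function buildReviewRequest(
  id: number,
  args: ReviewArgs,
  tool: "review_output" | "review_dual" = "review_output",
): ToolCallRequest {
  // Assumption: an empty or whitespace-only output is treated as missing.
  if (args.output.trim() === "") {
    throw new Error("output is required and must not be empty");
  }
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: tool, arguments: args },
  };
}

const request = buildReviewRequest(1, {
  output: "The Eiffel Tower was completed in 1889.",
  review_type: "factual",
});
console.log(JSON.stringify(request, null, 2));
```

Passing `"review_dual"` as the third argument targets the dual reviewer with the same arguments.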

Output:

{
  "verdict": "PASS | FAIL | CONDITIONAL_PASS",
  "score": 82,
  "issues": [
    {
      "severity": "high",
      "category": "accuracy",
      "description": "Claim about X is unsupported",
      "suggestion": "Add citation or remove claim"
    }
  ],
  "checklist": [
    {
      "item": "Factual accuracy",
      "status": "pass",
      "evidence": "All statistics match cited sources"
    }
  ],
  "summary": "Overall assessment...",
  "reviewer_model": "claude-sonnet-4-6"
}
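If you consume this result in a pipeline, typing it makes gating easier. The interfaces below are inferred from the sample above, so the set of allowed `severity` and `status` strings is an assumption and is left as plain `string`. `isAcceptable` and its `minScore` threshold are a hypothetical policy, not something the package provides.

```typescript
type Verdict = "PASS" | "FAIL" | "CONDITIONAL_PASS";

interface Issue {
  severity: string;
  category: string;
  description: string;
  suggestion: string;
}

interface ChecklistItem {
  item: string;
  status: string;
  evidence: string;
}

interface ReviewResult {
  verdict: Verdict;
  score: number;
  issues: Issue[];
  checklist: ChecklistItem[];
  summary: string;
  reviewer_model: string;
}

// Hypothetical gate: PASS is accepted, FAIL is rejected, and CONDITIONAL_PASS
// is accepted only with a high enough score and no high-severity issues.
function isAcceptable(result: ReviewResult, minScore: number = 80): boolean {
  if (result.verdict === "FAIL") return false;
  if (result.verdict === "PASS") return true;
  const hasHighSeverity = result.issues.some((issue) => issue.severity === "high");
  return result.score >= minScore && !hasHighSeverity;
}

// The sample output from above, with a concrete verdict filled in.
const sample: ReviewResult = {
  verdict: "CONDITIONAL_PASS",
  score: 82,
  issues: [
    {
      severity: "high",
      category: "accuracy",
      description: "Claim about X is unsupported",
      suggestion: "Add citation or remove claim",
    },
  ],
  checklist: [
    { item: "Factual accuracy", status: "pass", evidence: "All statistics match cited sources" },
  ],
  summary: "Overall assessment...",
  reviewer_model: "claude-sonnet-4-6",
};

console.log(isAcceptable(sample)); // false: the high-severity issue blocks a conditional pass
```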

review_dual

Dual adversarial review — two independent reviewers assess the output from different angles, then a merge agent combines findings.

  • If either reviewer finds a critical issue → merged verdict is FAIL
  • Takes the lower score
  • Combines and deduplicates all issues

Use for high-stakes outputs where quality is critical.

Same parameters as review_output.
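The three merge rules above can be sketched deterministically. This is an illustration only: the actual merge is done by a merge agent, and the code below swaps that for plain logic. It assumes critical issues are marked with `severity: "critical"`, that duplicates share a category and a description (ignoring case and surrounding whitespace), and that without a critical issue the stricter of the two verdicts wins. None of these details are confirmed by the source.

```typescript
type Verdict = "PASS" | "CONDITIONAL_PASS" | "FAIL";

interface Issue {
  severity: string;
  category: string;
  description: string;
}

interface Review {
  verdict: Verdict;
  score: number;
  issues: Issue[];
}

// Assumption: verdicts can be ordered from most lenient to most strict.
const STRICTNESS: Record<Verdict, number> = { PASS: 0, CONDITIONAL_PASS: 1, FAIL: 2 };

function mergeReviews(a: Review, b: Review): Review {
  // Combine and deduplicate issues, keeping the first occurrence of each.
  const seen = new Set<string>();
  const issues: Issue[] = [];
  for (const issue of [...a.issues, ...b.issues]) {
    const key = `${issue.category}|${issue.description.trim().toLowerCase()}`;
    if (!seen.has(key)) {
      seen.add(key);
      issues.push(issue);
    }
  }

  // A critical issue from either reviewer forces FAIL.
  const hasCritical = issues.some((issue) => issue.severity === "critical");
  const stricter = STRICTNESS[a.verdict] >= STRICTNESS[b.verdict] ? a.verdict : b.verdict;

  return {
    verdict: hasCritical ? "FAIL" : stricter,
    score: Math.min(a.score, b.score), // take the lower score
    issues,
  };
}

const reviewerA: Review = {
  verdict: "PASS",
  score: 90,
  issues: [{ severity: "medium", category: "style", description: "Long function" }],
};
const reviewerB: Review = {
  verdict: "CONDITIONAL_PASS",
  score: 70,
  issues: [
    { severity: "critical", category: "security", description: "SQL injection in query builder" },
    { severity: "medium", category: "style", description: "long function " },
  ],
};

const merged = mergeReviews(reviewerA, reviewerB);
console.log(merged.verdict, merged.score, merged.issues.length); // FAIL 70 2
```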

How It Works

  1. Adversarial prompting: The reviewer is instructed to assume mistakes were made. No benefit of the doubt.
  2. Evidence-based checklist: Every PASS item requires specific evidence. Items without evidence are automatically downgraded to FAIL.
  3. Anti-gaming validation: If >30% of checklist items lack evidence, the entire review is forced to FAIL with a capped score of 50.
  4. Structured output: Verdict + numeric score + categorized issues + checklist (not just "looks good").

Use Cases

  • Code review: Check for bugs, security issues, performance problems
  • Content review: Verify accuracy, readability, SEO, audience fit
  • Factual verification: Validate claims in AI-generated text
  • Translation quality: Check accuracy and naturalness
  • Data extraction: Verify completeness and correctness
  • Any AI output: Summaries, reports, proposals, emails, etc.

Why Not Just Ask the Same AI to Review?

Self-review has systematic leniency bias. An LLM reviewing its own output shares the same blind spots that created the errors. Research shows models are 34% more likely to use confident language when hallucinating.

AgentDesk uses a separate reviewer invocation with adversarial prompting — fundamentally different from self-review.

Comparison

| Feature | AgentDesk MCP | Manual prompt | Braintrust | DeepEval |
|---------|--------------|---------------|------------|----------|
| One-tool setup | Yes | No | No | No |
| Adversarial review | Yes | DIY | No | No |
| Dual reviewer | Yes | DIY | No | No |
| Anti-gaming validation | Yes | No | No | No |
| No SDK required | Yes | Yes | No | No |
| MCP native | Yes | No | No | No |

Limitations

  • Prompt injection: Like all LLM-as-judge systems, adversarial inputs could attempt to manipulate reviewer verdicts. The anti-gaming validation layer mitigates superficial gaming, but determined adversarial inputs remain a challenge. For high-stakes use cases, combine with deterministic validation.
  • BYOK cost: Each review_output call makes 1 LLM API call; review_dual makes 3. Factor this into your pipeline costs.
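For budgeting, the call counts above translate into a simple tally. A minimal sketch, where `llmCalls` is a hypothetical helper and the per-call price is left out because it depends on your model and token volume:

```typescript
// LLM API calls made per tool invocation, as stated above.
const CALLS_PER_TOOL = { review_output: 1, review_dual: 3 } as const;

type Tool = keyof typeof CALLS_PER_TOOL;

// Total LLM API calls for a given number of invocations of each tool.
function llmCalls(invocations: Record<Tool, number>): number {
  return (
    invocations.review_output * CALLS_PER_TOOL.review_output +
    invocations.review_dual * CALLS_PER_TOOL.review_dual
  );
}

// 100 single reviews plus 20 dual reviews: 100 * 1 + 20 * 3
console.log(llmCalls({ review_output: 100, review_dual: 20 })); // 160
```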

Hosted API (Separate Product)

For teams that prefer HTTP integration, a hosted REST API with additional features (agent marketplace, context learning, workflows) is available at agentdesk.usedevtools.com.

Development

git clone https://github.com/Rih0z/agentdesk-mcp.git
cd agentdesk-mcp
npm install
npm test        # 35 tests
npm run build

License

MIT


Built by EZARK Consulting