crosscheck-mcp

v0.1.9

Published

2 days ago

Multi-LLM MCP server: confer / debate / coordinate / audit / orchestrate across Anthropic + OpenAI + xAI + Gemini + Mistral + Groq + DeepSeek, with scoreboard-driven router, canary-leak detection, sandboxed shell verifiers, and a tier-aware cheap-mode pic

0High
0Medium
0Low

fxspeiser

mcp model-context-protocol claude claude-code claude-desktop cursor llm anthropic openai xai gemini multi-llm panel audit orchestrate

crosscheck-mcp

A multi-LLM MCP server. Ask several models the same question, debate them, orchestrate them, audit them, or rubric-grade their work — all from a single MCP tool surface that drops into Claude Desktop / Claude Code / Cursor / Continue / any other MCP host.

This is the TypeScript implementation. There is also a Python implementation in ../python/ — both ship the same tool surface; pick whichever fits your stack. The TypeScript build runs natively in Node 18.17+ and has no Python dependency at runtime.

What it does

Twenty-six tools across deliberation, planning, auditing, and operations. Highlights:

confer / debate / coordinate / triangulate — parallel panel calls, structured deliberation, structured synthesis with dissent
audit — rubric-graded scoring; single-judge or multi-judge consensus (median + std-dev disagreement detection, severity-aware obvious-failure flags)
orchestrate — planner-driven DAG of LLM subtasks with optional cheap-mode tier-aware routing
create / create_cheap — lifecycle macros: scope → build → review → audit, with retry-on-failure
solve — iterative propose → verify → retry with sandboxed shell verifiers
verify — deterministic property checks (text, shell, url_head)
recall / session_memory / scoreboard / explain — operational introspection

Cross-cutting features: scoreboard-driven router for panel selection, session circuit breakers (cost / tokens / wall / DAG breadth), cross-provider canary detection for indirect prompt-injection, early-stop on agreement, prompt canonicalisation cache, structured claims with supports/attacks edges.

Install

npm install -g crosscheck-mcp

This installs the crosscheck-mcp binary on your PATH.

The package needs at least one LLM provider API key in the environment. Supported providers (set any subset):

export ANTHROPIC_API_KEY=...
export OPENAI_API_KEY=...
export XAI_API_KEY=...
export GEMINI_API_KEY=...
export MISTRAL_API_KEY=...
export GROQ_API_KEY=...
export DEEPSEEK_API_KEY=...

Use it with an MCP host

Claude Code

Add the server to your project's .claude/settings.json (or the user-level config):

{
  "mcpServers": {
    "crosscheck": {
      "command": "crosscheck-mcp",
      "env": {
        "ANTHROPIC_API_KEY": "${ANTHROPIC_API_KEY}",
        "OPENAI_API_KEY":    "${OPENAI_API_KEY}"
      }
    }
  }
}

Restart Claude Code; the mcp__crosscheck__* tools become available.

Tip — type xc or XC at the start of a prompt to route it to crosscheck; the server ships an MCP instructions payload that teaches Claude Code the convention. Details in ../../CROSSCHECK_USAGE.md.

Claude Desktop

Edit ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows):

{
  "mcpServers": {
    "crosscheck": {
      "command": "crosscheck-mcp",
      "env": {
        "ANTHROPIC_API_KEY": "sk-ant-...",
        "OPENAI_API_KEY":    "sk-..."
      }
    }
  }
}

Restart Claude Desktop. Tools appear under the hammer icon.

Cursor

Cursor reads MCP servers from ~/.cursor/mcp.json. Same shape:

{
  "mcpServers": {
    "crosscheck": {
      "command": "crosscheck-mcp",
      "env": {
        "ANTHROPIC_API_KEY": "sk-ant-...",
        "OPENAI_API_KEY":    "sk-..."
      }
    }
  }
}

Any other MCP host

Spawn crosscheck-mcp over stdio. It speaks MCP JSON-RPC 2.0 — no custom protocol.

Optional configuration

Environment variables:

| Variable | Purpose | |-----------------------------------|-----------------------------------------------------------------------------------------------| | CROSSCHECK_DB_PATH | SQLite path for scoreboard / claims / session memory / recall. Default: .crosscheck/db.sqlite | | CROSSCHECK_DB_DISABLED=1 | Skip storage entirely (storage-driven tools return graceful errors). | | CROSSCHECK_TRANSCRIPTS_DIR | Where audit's session-id-only mode looks for transcripts. | | CROSSCHECK_PRICING_PATH | Path to pricing.json (used by cheap-mode tier picker). | | CROSSCHECK_REPO_ROOT | Override repo-root detection (used by fetch for evidence paths). | | CROSSCHECK_REJECT_CONFIG_DRIFT=1| Reject pinned-config drift instead of warning. | | CROSSCHECK_BRIDGE_PYTHON=1 | OPT-IN: spawn the Python server as a parity-testing backstop (see below). | | CROSSCHECK_EVENTS | Structured event emitter: stderr (default), off, file, both. See Observability. | | CROSSCHECK_EVENTS_PATH | Path for the file / both event sinks (ND-JSON, one event per line). | | <PROVIDER>_API_KEY | At least one is required for LLM-using tools. Optional for verify-only flows. | | <PROVIDER>_MODEL | Override a provider's default model. |

The Python backstop is OPTIONAL. Every v1 feature path on every tool runs natively in TypeScript. The bridge is useful for:

Cross-language parity testing (the test/parity/ fixtures check that the TS implementation produces byte-equal output against the Python oracle).
The reactive opt on orchestrate (mid-flight DAG mutation), which still defers to bridge.
A few remaining advanced opts that no longer have a Node-side implementation but kept the bridge as a fallback when host integrations aren't wired (e.g. worker_tools without innerCallers).

Everything in the Tools list above works without `CROSSCHECK_BRIDGE_PYTHON` being set.

Observability

Every tools/call request emits one structured tool_invoke event through a pluggable event bus. The default sink is ND-JSON to stderr — MCP hosts already capture stderr into their session logs, so this lights up out of the box with no configuration.

Event shape

{
  "event":          "tool_invoke",
  "id":             "1a3c0f5d",
  "tool":           "confer",
  "started_at":     "2026-06-06T19:42:11.231Z",
  "duration_ms":    1842,
  "status":         "ok",
  "args_keys":      ["question", "providers", "untrusted_input"],
  "result_keys":    ["answers", "question", "tool"],
  "envelope_bytes": 12451
}

Failure paths emit the same shape with status: "error" and an error_message (truncated to 512 chars). When a tool returns a domain-level error envelope (with error_code), the call still completes — status stays "ok" and error_code is mirrored to the event for easy filtering:

{
  "event": "tool_invoke", "id": "...", "tool": "audit",
  "status": "ok",
  "error_code": "AUDIT_NO_AUDITOR",
  "...": "..."
}

What's deliberately not in the event: argument values, response text, transcripts. Long prompts + multi-LLM debates would dwarf the signal and may carry sensitive content. Key inventory + envelope size is usually enough to triage a regression; pull the full envelope from a tool replay when you need it.

Routing

| CROSSCHECK_EVENTS= | Effect | |----------------------|---------------------------------------------------------| | stderr (default) | ND-JSON to stderr. | | off | Silent. Useful for noisy multi-tool benchmark runs. | | file | Append to CROSSCHECK_EVENTS_PATH only. | | both | Append to file AND write to stderr. |

CROSSCHECK_EVENTS_PATH is required when CROSSCHECK_EVENTS is file or both. Parent directories are created on first emit.

Custom emitters (programmatic)

import { setEventEmitter } from "crosscheck-mcp";

setEventEmitter({
  emit(event) {
    metrics.histogram("crosscheck.tool_duration_ms",
      event.duration_ms, { tool: event.tool, status: event.status });
  },
});

The emitter is replaced for the whole process; for test isolation use the RecordingEmitter export and restore the original via getEventEmitter() / setEventEmitter() in beforeEach / afterEach.

Programmatic use

import { createServer, connectAndServe } from "crosscheck-mcp";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";

const server = createServer({ /* providers, storage, pricing, repoRoot ... */ });
await server.connect(new StdioServerTransport());

Development

git clone https://github.com/fxspeiser/crosscheck-agent
cd crosscheck-agent/servers/typescript
npm install
npm run build        # tsup → dist/{node-stdio,node-http,browser-ext}.{js,cjs}
npm test             # vitest — 1,175 tests
npm run typecheck    # tsc --noEmit
npm run docs:usage   # regenerate the routing-convention doc from src/instructions.ts

End-to-end stdio handshake:

printf '%s\n' \
  '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"x","version":"0"}}}' \
  '{"jsonrpc":"2.0","method":"notifications/initialized"}' \
  '{"jsonrpc":"2.0","id":2,"method":"tools/list"}' \
  '{"jsonrpc":"2.0","id":3,"method":"tools/call","params":{"name":"ping","arguments":{"echo":"hello"}}}' \
  | node dist/node-stdio.js

License

PolyForm Noncommercial 1.0.0. Commercial licenses are tracked in ../../COMMERCIAL-LICENSES.md; contact [email protected] for inquiries.

Published

Vulnerabilities

Links

Maintainers

Keywords

Readme

crosscheck-mcp

What it does

Install

Use it with an MCP host

Claude Code

Claude Desktop

Cursor

Any other MCP host

Optional configuration

Observability

Event shape

Routing

Custom emitters (programmatic)

Programmatic use

Development

License