@iflow-mcp/pickle-pixel-hydramcp

v1.0.7

Published

2 months ago

Multi-model MCP server — compare, vote, and synthesize across GPT, Gemini, Claude, and local models from one terminal

0High
0Medium
0Low

chatflowdev

qystart

mcp llm multi-model claude-code openai gemini anthropic ollama ai model-comparison consensus mcp-server ai-agents

An MCP server that lets Claude Code query any LLM — compare, vote, and synthesize across GPT, Gemini, Claude, and local models from one terminal.

Quick Start

npx hydramcp setup

That's it. The wizard walks you through everything — API keys, subscriptions, local models. At the end it gives you the one-liner to add to Claude Code.

Or if you already have API keys:

claude mcp add hydramcp -e OPENAI_API_KEY=sk-... -- npx hydramcp

What It Looks Like

Four models, four ecosystems, one prompt. Real output from a live session:

> compare gpt-5-codex, gemini-3, claude-sonnet, and local qwen on this function review

## Model Comparison (4 models, 11637ms total)

| Model                      | Latency         | Tokens |
|----------------------------|-----------------|--------|
| gpt-5-codex                | 1630ms fastest  | 194    |
| gemini-3-pro-preview       | 11636ms         | 1235   |
| claude-sonnet-4-5-20250929 | 3010ms          | 202    |
| ollama/qwen2.5-coder:14b   | 8407ms          | 187    |

All four independently found the same async bug. Then each one caught something different the others missed.

And this is consensus with a local judge:

> get consensus from gpt-5, gemini-3, and claude-sonnet. use local qwen as judge.

## Consensus: REACHED

Strategy: majority (needed 2/3)
Agreement: 3/3 models (100%)
Judge: ollama/qwen2.5-coder:14b (686ms)

Three cloud models polled, local model judging them. 686ms to evaluate agreement.

Tools

| Tool | What It Does | |------|-------------| | list_models | See what's available across all providers | | ask_model | Query any model, optional response distillation | | compare_models | Same prompt to 2-5 models in parallel | | consensus | Poll 3-7 models, LLM-as-judge evaluates agreement | | synthesize | Combine best ideas from multiple models into one answer | | analyze_file | Offload file analysis to a worker model | | smart_read | Extract specific code sections without reading the whole file | | session_recap | Restore context from previous Claude Code sessions |

From inside Claude Code, just say things like:

"ask gpt-5 to review this function"
"compare gemini and claude on this approach"
"get consensus from 3 models on whether this is thread safe"
"synthesize responses from all models on how to design this API"

How It Works

Claude Code
    |
    HydraMCP (MCP Server)
    |
    SmartProvider (circuit breaker, cache, metrics)
    |
    MultiProvider (routes to the right backend)
    |
    |-- OpenAI     -> api.openai.com (API key)
    |-- Google     -> Gemini API (API key)
    |-- Anthropic  -> api.anthropic.com (API key)
    |-- Sub        -> CLI tools (Gemini CLI, Claude Code, Codex CLI)
    |-- Ollama     -> local models (your hardware)

Three Ways to Connect Models

API Keys (fastest setup)

Set environment variables. HydraMCP auto-detects them.

| Variable | Provider | |----------|----------| | OPENAI_API_KEY | OpenAI (GPT-4o, GPT-5, o3, etc.) | | GOOGLE_API_KEY | Google Gemini (2.5 Flash, Pro, etc.) | | ANTHROPIC_API_KEY | Anthropic Claude (Opus, Sonnet, Haiku) |

Subscriptions (use your monthly plan)

Already paying for ChatGPT Plus, Claude Pro, or Gemini Advanced? HydraMCP wraps the CLI tools those subscriptions include. No API billing.

npx hydramcp setup   # auto-installs CLIs and runs auth

The setup wizard detects which CLIs you have, installs missing ones, and walks you through authentication. Each CLI authenticates via browser once — then it's stored forever.

| Subscription | CLI Tool | What You Get | |-------------|----------|-------------| | Gemini Advanced | gemini | Gemini 2.5 Flash, Pro, etc. | | Claude Pro/Max | claude | Claude Opus, Sonnet, Haiku | | ChatGPT Plus/Pro | codex | GPT-5, o3, Codex models |

Local Models

Install Ollama, pull a model, done. Auto-detected.

ollama pull qwen2.5-coder:14b

Mix and Match

All three methods stack. Use API keys for some providers, subscriptions for others, and Ollama for local. They all show up in list_models together.

Route explicitly with prefixes:

openai/gpt-5 — force OpenAI API
google/gemini-2.5-flash — force Google API
sub/gemini-2.5-flash — force subscription CLI
ollama/qwen2.5-coder:14b — force local
gpt-5 — auto-detect (tries each provider)

Setup Details

Option A: npx (recommended)

npx hydramcp setup                           # interactive wizard
claude mcp add hydramcp -- npx hydramcp      # register with Claude Code

Config is saved to ~/.hydramcp/.env and persists across npx runs.

Option B: Clone

git clone https://github.com/Pickle-Pixel/HydraMCP.git
cd HydraMCP
npm install && npm run build
claude mcp add hydramcp -- node /path/to/HydraMCP/dist/index.js

Verify

Restart Claude Code and say "list models". You should see everything you configured.

Architecture

HydraMCP wraps all providers in a SmartProvider layer that adds:

Circuit breaker — per-model failure tracking. After 3 failures, the model is disabled for 60s and auto-recovers.
Response cache — SHA-256 keyed, 15-minute TTL. Identical queries are served instantly.
Metrics — per-model query counts, latency, token usage, cache hit rates.
Response distillation — set max_response_tokens on any query and a cheap model compresses the response while preserving code, errors, and specifics.

Contributing

Want to add a provider? The interface is three methods:

interface Provider {
  healthCheck(): Promise<boolean>;
  listModels(): Promise<ModelInfo[]>;
  query(model: string, prompt: string, options?: QueryOptions): Promise<QueryResponse>;
}

See src/providers/ollama.ts for a working example. Implement it, register in src/index.ts, done.

Providers we'd love to see: LM Studio, OpenRouter, Groq, Together AI, or anything that speaks HTTP.

License

MIT