@iflow-mcp/pickle-pixel-hydramcp
v1.0.7
Multi-model MCP server — compare, vote, and synthesize across GPT, Gemini, Claude, and local models from one terminal
An MCP server that lets Claude Code query any LLM — compare, vote, and synthesize across GPT, Gemini, Claude, and local models from one terminal.
Quick Start
npx hydramcp setup
That's it. The wizard walks you through everything: API keys, subscriptions, and local models. At the end it gives you the one-liner to add to Claude Code.
Or if you already have API keys:
claude mcp add hydramcp -e OPENAI_API_KEY=sk-... -- npx hydramcp
What It Looks Like
Four models, four ecosystems, one prompt. Real output from a live session:
> compare gpt-5-codex, gemini-3, claude-sonnet, and local qwen on this function review
## Model Comparison (4 models, 11637ms total)
| Model | Latency | Tokens |
|----------------------------|-----------------|--------|
| gpt-5-codex | 1630ms fastest | 194 |
| gemini-3-pro-preview | 11636ms | 1235 |
| claude-sonnet-4-5-20250929 | 3010ms | 202 |
| ollama/qwen2.5-coder:14b | 8407ms | 187 |

All four independently found the same async bug. Then each one caught something different the others missed.
And this is consensus with a local judge:
> get consensus from gpt-5, gemini-3, and claude-sonnet. use local qwen as judge.
## Consensus: REACHED
Strategy: majority (needed 2/3)
Agreement: 3/3 models (100%)
Judge: ollama/qwen2.5-coder:14b (686ms)

Three cloud models polled, local model judging them. 686ms to evaluate agreement.
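The "needed 2/3" line is consistent with a strict-majority rule. Here is a minimal sketch of that arithmetic, assuming the judge assigns each model's answer a group label and consensus means the largest group reaches the threshold. The names `majorityThreshold` and `evaluateConsensus` are hypothetical and are not taken from HydraMCP's source.

```typescript
// Hypothetical sketch: names and shapes are illustrative, not HydraMCP's API.

/** Smallest number of agreeing models that counts as a strict majority. */
function majorityThreshold(totalModels: number): number {
  return Math.floor(totalModels / 2) + 1;
}

interface ConsensusResult {
  reached: boolean;
  needed: number;
  agreeing: number;
  total: number;
}

/**
 * `groups` maps each model name to a label chosen by the judge.
 * Models whose answers the judge considers equivalent share a label.
 */
function evaluateConsensus(groups: Record<string, string>): ConsensusResult {
  const counts = new Map<string, number>();
  for (const label of Object.values(groups)) {
    counts.set(label, (counts.get(label) ?? 0) + 1);
  }
  const total = Object.keys(groups).length;
  // Size of the largest group of agreeing models (0 when no models are given).
  const agreeing = Math.max(0, ...Array.from(counts.values()));
  const needed = majorityThreshold(total);
  return { reached: agreeing >= needed, needed, agreeing, total };
}

const result = evaluateConsensus({
  "gpt-5": "A",
  "gemini-3": "A",
  "claude-sonnet": "A",
});
console.log(
  `Agreement: ${result.agreeing}/${result.total}, needed ${result.needed}/${result.total}`,
);
```

With three models the threshold is 2, so a 2-vs-1 split would still count as consensus under this assumed rule.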
Tools
| Tool | What It Does |
|------|-------------|
| list_models | See what's available across all providers |
| ask_model | Query any model, optional response distillation |
| compare_models | Same prompt to 2-5 models in parallel |
| consensus | Poll 3-7 models, LLM-as-judge evaluates agreement |
| synthesize | Combine best ideas from multiple models into one answer |
| analyze_file | Offload file analysis to a worker model |
| smart_read | Extract specific code sections without reading the whole file |
| session_recap | Restore context from previous Claude Code sessions |
From inside Claude Code, just say things like:
- "ask gpt-5 to review this function"
- "compare gemini and claude on this approach"
- "get consensus from 3 models on whether this is thread safe"
- "synthesize responses from all models on how to design this API"
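The compare_models tool sends the same prompt to several models in parallel, which is why the total time in the comparison output above is close to the slowest model's latency. A rough sketch of that fan-out pattern follows. `QueryFn`, `ComparisonRow`, and `compareModels` are hypothetical names for illustration, not HydraMCP's internals.

```typescript
// Hypothetical sketch of a parallel fan-out; not HydraMCP's actual code.

/** Any function that sends a prompt to a named model and returns its text. */
type QueryFn = (model: string, prompt: string) => Promise<string>;

interface ComparisonRow {
  model: string;
  latencyMs: number;
  output?: string;
  error?: string;
}

async function compareModels(
  models: string[],
  prompt: string,
  query: QueryFn,
): Promise<ComparisonRow[]> {
  // All requests start at once; each row records its own latency.
  return Promise.all(
    models.map(async (model): Promise<ComparisonRow> => {
      const start = Date.now();
      try {
        const output = await query(model, prompt);
        return { model, latencyMs: Date.now() - start, output };
      } catch (err) {
        // One failing model should not sink the whole comparison.
        const message = err instanceof Error ? err.message : String(err);
        return { model, latencyMs: Date.now() - start, error: message };
      }
    }),
  );
}
```

Catching errors per model keeps the result table complete even when one backend is down, at the cost of the caller having to check each row for `error`.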
How It Works
Claude Code
|
HydraMCP (MCP Server)
|
SmartProvider (circuit breaker, cache, metrics)
|
MultiProvider (routes to the right backend)
|
|-- OpenAI -> api.openai.com (API key)
|-- Google -> Gemini API (API key)
|-- Anthropic -> api.anthropic.com (API key)
|-- Sub -> CLI tools (Gemini CLI, Claude Code, Codex CLI)
|-- Ollama -> local models (your hardware)
Three Ways to Connect Models
API Keys (fastest setup)
Set environment variables. HydraMCP auto-detects them.
| Variable | Provider |
|----------|----------|
| OPENAI_API_KEY | OpenAI (GPT-4o, GPT-5, o3, etc.) |
| GOOGLE_API_KEY | Google Gemini (2.5 Flash, Pro, etc.) |
| ANTHROPIC_API_KEY | Anthropic Claude (Opus, Sonnet, Haiku) |
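As an illustration of what auto-detection from the environment could look like, here is a small sketch. The variable names come from the table above; `detectApiProviders` and the provider labels are assumptions made for this example, and the real detection logic may differ.

```typescript
// Hypothetical sketch; only the environment variable names come from the README.

type ApiProvider = "openai" | "google" | "anthropic";

const ENV_TO_PROVIDER: Record<string, ApiProvider> = {
  OPENAI_API_KEY: "openai",
  GOOGLE_API_KEY: "google",
  ANTHROPIC_API_KEY: "anthropic",
};

/** Returns the providers whose API key variable is set to a non-blank value. */
function detectApiProviders(
  env: Record<string, string | undefined>,
): ApiProvider[] {
  return Object.entries(ENV_TO_PROVIDER)
    .filter(([name]) => (env[name] ?? "").trim() !== "")
    .map(([, provider]) => provider);
}

// Example: only OpenAI has a usable key, so only "openai" is detected.
console.log(detectApiProviders({ OPENAI_API_KEY: "sk-example", GOOGLE_API_KEY: "" }));
```

In a Node.js process the argument would typically be `process.env`.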
Subscriptions (use your monthly plan)
Already paying for ChatGPT Plus, Claude Pro, or Gemini Advanced? HydraMCP wraps the CLI tools those subscriptions include. No API billing.
npx hydramcp setup # auto-installs CLIs and runs auth
The setup wizard detects which CLIs you have, installs missing ones, and walks you through authentication. Each CLI authenticates via the browser once; after that the login is stored and reused.
| Subscription | CLI Tool | What You Get |
|-------------|----------|-------------|
| Gemini Advanced | gemini | Gemini 2.5 Flash, Pro, etc. |
| Claude Pro/Max | claude | Claude Opus, Sonnet, Haiku |
| ChatGPT Plus/Pro | codex | GPT-5, o3, Codex models |
Local Models
Install Ollama, pull a model, done. Auto-detected.
ollama pull qwen2.5-coder:14b
Mix and Match
All three methods stack. Use API keys for some providers, subscriptions for others, and Ollama for local. They all show up in list_models together.
Route explicitly with prefixes:
- `openai/gpt-5`: force OpenAI API
- `google/gemini-2.5-flash`: force Google API
- `sub/gemini-2.5-flash`: force subscription CLI
- `ollama/qwen2.5-coder:14b`: force local
- `gpt-5`: auto-detect (tries each provider)
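One way to picture prefix routing is a small parser that splits on the first slash. The sketch below assumes only the four prefixes listed above are recognized and that anything else falls through to auto-detection. `parseModelId` and `Route` are hypothetical names.

```typescript
// Hypothetical sketch; the prefixes come from the README, the code does not.

const KNOWN_PREFIXES = ["openai", "google", "sub", "ollama"] as const;
type Prefix = (typeof KNOWN_PREFIXES)[number];

interface Route {
  provider: Prefix | "auto";
  model: string;
}

function parseModelId(id: string): Route {
  // Split on the first slash only, so tags such as ":14b" stay in the model name.
  const slash = id.indexOf("/");
  if (slash > 0) {
    const prefix = id.slice(0, slash);
    const match = KNOWN_PREFIXES.find((p) => p === prefix);
    if (match !== undefined) {
      return { provider: match, model: id.slice(slash + 1) };
    }
  }
  // No recognized prefix: leave the choice of provider to auto-detection.
  return { provider: "auto", model: id };
}

console.log(parseModelId("ollama/qwen2.5-coder:14b"));
console.log(parseModelId("gpt-5"));
```

An unrecognized prefix is treated as part of the model name here, which is a design choice for the sketch rather than documented behavior.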
Setup Details
Option A: npx (recommended)
npx hydramcp setup # interactive wizard
claude mcp add hydramcp -- npx hydramcp # register with Claude Code
Config is saved to ~/.hydramcp/.env and persists across npx runs.
Option B: Clone
git clone https://github.com/Pickle-Pixel/HydraMCP.git
cd HydraMCP
npm install && npm run build
claude mcp add hydramcp -- node /path/to/HydraMCP/dist/index.js
Verify
Restart Claude Code and say "list models". You should see everything you configured.
Architecture
HydraMCP wraps all providers in a SmartProvider layer that adds:
- Circuit breaker: per-model failure tracking. After 3 failures, the model is disabled for 60s and then auto-recovers.
- Response cache: SHA-256 keyed, 15-minute TTL. Identical queries are served from the cache instead of hitting the provider again.
- Metrics: per-model query counts, latency, token usage, cache hit rates.
- Response distillation: set max_response_tokens on any query and a cheap model compresses the response while preserving code, errors, and specifics.
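To make the first two bullets concrete, here is a minimal sketch of a circuit breaker and a TTL cache using the numbers quoted above (3 failures, 60s cooldown, SHA-256 keys, 15-minute TTL). Everything else is an assumption: the class names are hypothetical, failures are counted consecutively and reset on success, and the cache key is derived from the model name plus the prompt.

```typescript
import { createHash } from "node:crypto";

// Hypothetical sketch; not HydraMCP's SmartProvider implementation.

/** Disables a model for `cooldownMs` after `maxFailures` consecutive failures. */
class CircuitBreaker {
  private failures = new Map<string, number>();
  private openUntil = new Map<string, number>();
  private maxFailures: number;
  private cooldownMs: number;

  constructor(maxFailures: number = 3, cooldownMs: number = 60000) {
    this.maxFailures = maxFailures;
    this.cooldownMs = cooldownMs;
  }

  isAvailable(model: string, now: number = Date.now()): boolean {
    const until = this.openUntil.get(model);
    if (until === undefined) return true;
    if (now < until) return false;
    // Cooldown elapsed: recover automatically and start counting afresh.
    this.openUntil.delete(model);
    this.failures.delete(model);
    return true;
  }

  recordSuccess(model: string): void {
    this.failures.delete(model);
  }

  recordFailure(model: string, now: number = Date.now()): void {
    const count = (this.failures.get(model) ?? 0) + 1;
    this.failures.set(model, count);
    if (count >= this.maxFailures) {
      this.openUntil.set(model, now + this.cooldownMs);
    }
  }
}

/** In-memory cache keyed by a SHA-256 digest of (model, prompt). */
class ResponseCache {
  private entries = new Map<string, { value: string; expiresAt: number }>();
  private ttlMs: number;

  constructor(ttlMs: number = 15 * 60 * 1000) {
    this.ttlMs = ttlMs;
  }

  static key(model: string, prompt: string): string {
    // Assumption: the key covers model and prompt only.
    return createHash("sha256").update(JSON.stringify([model, prompt])).digest("hex");
  }

  get(model: string, prompt: string, now: number = Date.now()): string | undefined {
    const key = ResponseCache.key(model, prompt);
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (now >= entry.expiresAt) {
      this.entries.delete(key); // expired: drop it and report a miss
      return undefined;
    }
    return entry.value;
  }

  set(model: string, prompt: string, value: string, now: number = Date.now()): void {
    const key = ResponseCache.key(model, prompt);
    this.entries.set(key, { value, expiresAt: now + this.ttlMs });
  }
}
```

Passing `now` explicitly keeps both classes easy to test without waiting on real timers.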
Contributing
Want to add a provider? The interface is three methods:
interface Provider {
  healthCheck(): Promise<boolean>;
  listModels(): Promise<ModelInfo[]>;
  query(model: string, prompt: string, options?: QueryOptions): Promise<QueryResponse>;
}
See src/providers/ollama.ts for a working example. Implement it, register in src/index.ts, done.
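For orientation, here is a toy provider that satisfies the three-method interface. The README does not show `ModelInfo`, `QueryOptions`, or `QueryResponse`, so the field names below are placeholders invented for this sketch. Check the real definitions in the repository before copying anything. `EchoProvider` is likewise hypothetical.

```typescript
// Placeholder types: the real ones live in the HydraMCP source and may differ.
interface ModelInfo {
  id: string;
}
interface QueryOptions {
  maxTokens?: number;
}
interface QueryResponse {
  model: string;
  content: string;
  latencyMs: number;
}

// Interface as given in the README.
interface Provider {
  healthCheck(): Promise<boolean>;
  listModels(): Promise<ModelInfo[]>;
  query(model: string, prompt: string, options?: QueryOptions): Promise<QueryResponse>;
}

/** Toy provider that echoes the prompt back; handy as a wiring test. */
class EchoProvider implements Provider {
  async healthCheck(): Promise<boolean> {
    return true; // nothing external to reach
  }

  async listModels(): Promise<ModelInfo[]> {
    return [{ id: "echo-1" }];
  }

  async query(
    model: string,
    prompt: string,
    _options?: QueryOptions,
  ): Promise<QueryResponse> {
    const start = Date.now();
    if (model !== "echo-1") {
      throw new Error(`Unknown model: ${model}`);
    }
    return { model, content: prompt, latencyMs: Date.now() - start };
  }
}
```

A real provider would replace the body of `query` with an HTTP call to its backend and map the response into whatever shape `QueryResponse` actually has.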
Providers we'd love to see: LM Studio, OpenRouter, Groq, Together AI, or anything that speaks HTTP.
License
MIT
