llm-arena-mcp
v1.0.0
Multi-LLM MCP Server — Route prompts to GPT-4o, Claude, Gemini, Grok, Llama, and Mistral simultaneously. Compare responses side-by-side with automated behavioral pattern detection.
Multi-LLM MCP Server
Universal LLM router for Claude Code, Claude Desktop, and Claude.ai. Send prompts to GPT-4o, Claude, Gemini, Grok, Llama, and Mistral simultaneously.
Setup
cd "/c/Users/snowb/Documents/AI tech projects/LLM-MCP"
pip install -r requirements.txt
Environment Variables
Set the API keys for every provider you want to use. Providers without a key are skipped gracefully.
| Variable | Provider |
|---|---|
| OPENAI_API_KEY | OpenAI (GPT-4o) |
| ANTHROPIC_API_KEY | Anthropic (Claude Sonnet 4.6) |
| GOOGLE_API_KEY | Google (Gemini 2.0 Flash) |
| XAI_API_KEY | xAI (Grok 4) |
| TOGETHER_API_KEY | Together (Llama 3.3 70B) |
| MISTRAL_API_KEY | Mistral (Mistral Large) |
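Graceful skipping means a provider is simply left out of the pool when its key is absent. A minimal sketch of how that check might look (the env-var mapping follows the table above; the helper itself is illustrative, not the server's actual code):

```python
import os

# Map each provider name to the environment variable holding its API key.
# The mapping mirrors the table above; this helper is for illustration only.
PROVIDER_KEYS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "gemini": "GOOGLE_API_KEY",
    "xai": "XAI_API_KEY",
    "together": "TOGETHER_API_KEY",
    "mistral": "MISTRAL_API_KEY",
}

def configured_providers() -> list[str]:
    """Return the providers whose API key is set; the rest are skipped."""
    return [name for name, var in PROVIDER_KEYS.items() if os.environ.get(var)]
```

Run `llm_models` (below) to see the same information from inside a session.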
Register with Claude Code
claude mcp add llm-arena -- python "/c/Users/snowb/Documents/AI tech projects/LLM-MCP/server.py"
Tools
llm_broadcast
Send a prompt to all configured LLM providers simultaneously and return all responses for comparison.
- message (str, required) -- the prompt to send
- system_prompt (str, optional) -- system-level instruction
- temperature (float, default 0.7) -- sampling temperature
- providers (list[str], optional) -- subset of providers to target; omit to send to all
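Over MCP, these parameters arrive as the tool's arguments in a JSON-RPC 2.0 `tools/call` request. A hedged sketch of what a client might send (the `id`, prompt, and provider subset are example values, not required ones):

```python
import json

# Illustrative JSON-RPC 2.0 request for calling llm_broadcast over MCP.
# The message, temperature, and providers values are examples only.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "llm_broadcast",
        "arguments": {
            "message": "Explain quantum entanglement in one paragraph",
            "temperature": 0.7,
            "providers": ["openai", "anthropic"],  # omit to target all providers
        },
    },
}
print(json.dumps(request, indent=2))
```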
llm_send
Send a prompt to a single specific LLM provider.
- provider (str, required) -- one of: openai, anthropic, gemini, xai, together, mistral
- message (str, required)
- system_prompt (str, optional)
- temperature (float, default 0.7)
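The call shape is the same as for `llm_broadcast`; only the arguments differ. An illustrative argument set (all values are examples):

```python
# Illustrative arguments for a single-provider call via llm_send.
args = {
    "provider": "gemini",  # one of the six provider names listed above
    "message": "Translate 'hello world' to Japanese",
    "system_prompt": "Answer with only the translation.",  # optional
    "temperature": 0.2,  # defaults to 0.7 when omitted
}
```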
llm_models
List all available LLM providers and their configuration status. No parameters.
llm_cage_match
Run an automated behavioral analysis comparing how different LLMs respond to the same prompt. Detects patterns like praise loops, engagement menus, therapist questions, performative insight, and identity flattery.
- message (str, required) -- the prompt to analyze
- detect_patterns (bool, default true) -- run behavioral pattern detection on each response
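The exact heuristics behind these pattern labels aren't documented here, but a detector of this kind can be as simple as phrase matching. A minimal regex-based sketch of one detector, for the "engagement menu" pattern (the trigger phrases are assumptions for illustration, not the server's actual rules):

```python
import re

# Toy detector for an "engagement menu": a response that closes by offering
# the user a list of follow-up options. The trigger phrases below are
# assumptions for illustration, not the server's actual heuristics.
ENGAGEMENT_MENU = re.compile(
    r"(would you like me to|want me to|i can also|shall i)\b",
    re.IGNORECASE,
)

def detect_engagement_menu(response: str) -> bool:
    """Return True if the response appears to offer an engagement menu."""
    return bool(ENGAGEMENT_MENU.search(response))
```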
Example Usage
From Claude Code:
> Use llm_broadcast to ask all models "Explain quantum entanglement in one paragraph"
> Use llm_cage_match to compare how models respond to "What makes a good leader?"
> Use llm_send to ask gemini "Translate 'hello world' to Japanese"
> Use llm_models to see which providers are configured