model-agency
v2.0.1
MCP server providing an agency of AI models for expert consultation and idiomatic code generation
model-agency
MCP server that provides unified access to multiple AI models (OpenAI, Google, Anthropic, xAI) with structured output support, async operations, and idiomatic code pattern enforcement.
Features
- Multi-provider AI integration (OpenAI, Google, Anthropic, xAI)
- Automatic structured output detection with JSON response support
- Async operations with request tracking and caching
- Idiomatic pattern enforcement to prevent common anti-patterns
- Smart retry logic with exponential backoff
- Real-time model health checks and capability detection
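The retry behavior listed above isn't detailed further in this README. As a rough sketch of what exponential backoff can look like (the function and parameter names here are illustrative, not model-agency's actual API):

```typescript
// Retry an async call, doubling the delay after each failure.
// Illustrative sketch; model-agency's real retry logic may differ.
async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 4,
  baseDelayMs = 250,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (attempt === maxAttempts - 1) break;
      // Delays: 250ms, 500ms, 1000ms, ... (base * 2^attempt)
      const delay = baseDelayMs * 2 ** attempt;
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
  throw lastError;
}
```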
Installation
```sh
git clone https://github.com/yourusername/model-agency
cd model-agency
bun install
bun run build
```
Configuration
Set API keys as environment variables:
```sh
export OPENAI_API_KEY="sk-..."
export GOOGLE_API_KEY="..."     # or GEMINI_API_KEY
export ANTHROPIC_API_KEY="..."
export XAI_API_KEY="..."        # or GROK_API_KEY
```
Usage
Start the server:
```sh
bun start
```
Development mode:
```sh
bun run dev
```
Model-Specific Limitations
o3 Models
- The `temperature` parameter is NOT supported and will cause API errors if included
- Use `reasoning_effort` instead to control output quality
- Supports `minimal`, `low`, `medium`, and `high` reasoning levels
GPT-5 Models
- Supports the `verbosity` parameter for controlling output detail
- Temperature works as expected
Other Models
- Standard parameters (temperature, top_p, etc.) work normally
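The parameter rules above could be enforced with a small gate applied before each request. The following is a hypothetical sketch, not model-agency's actual implementation; `gateParams` and its behavior are illustrative:

```typescript
type RequestParams = Record<string, unknown>;

// Drop parameters a given model family rejects, per the rules above.
// Illustrative only; the server's real request shaping may differ.
function gateParams(model: string, params: RequestParams): RequestParams {
  const out = { ...params };
  const isOSeries = model.startsWith("o3") || model.startsWith("o4");
  const isGpt5 = model.startsWith("gpt-5");
  if (isOSeries) {
    // o3/o4 reasoning models error on `temperature`.
    delete out.temperature;
  }
  if (!isOSeries && !isGpt5) {
    // `reasoning_effort` only applies to o-series and GPT-5 models.
    delete out.reasoning_effort;
  }
  if (!isGpt5) {
    // `verbosity` is GPT-5 only.
    delete out.verbosity;
  }
  return out;
}
```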
Available Tools
models
List available models with capabilities and performance characteristics.
```jsonc
{ "detailed": false }  // Basic listing
{ "detailed": true }   // Include configuration status
```
advice
Get AI assistance with automatic capability detection.
```jsonc
{
  "model": "openai:gpt-4o",
  "prompt": "How do I optimize this React component?",
  "reasoningEffort": "medium",  // o3/GPT-5 only
  "verbosity": "low"            // GPT-5 only
}
```
Features:
- Intelligent routing: OpenAI models use async, others use sync
- Automatic structured output when supported
- Fallback to text mode for incompatible models
- Response includes confidence scores
OpenAI models get additional features:
- Multi-turn conversation support (use `conversation_id`)
- Request caching and deduplication
- Non-blocking operations (poll with `request_id`)
- Automatic context iteration (max 3 rounds)
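The README doesn't spell out how request deduplication works internally; one minimal way to share a single in-flight call between identical concurrent prompts might look like this (names such as `dedupedCall` are illustrative, not part of model-agency's API):

```typescript
// Share one pending promise across concurrent identical requests.
// Illustrative sketch only; the server's real caching layer may differ.
const inFlight = new Map<string, Promise<string>>();

function dedupedCall(
  model: string,
  prompt: string,
  run: () => Promise<string>,
): Promise<string> {
  const key = `${model}\u0000${prompt}`;
  const existing = inFlight.get(key);
  if (existing) return existing; // second caller piggybacks on the first
  const pending = run().finally(() => inFlight.delete(key));
  inFlight.set(key, pending);
  return pending;
}
```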
```jsonc
// Example with OpenAI (async features enabled)
{
  "model": "openai:o3",
  "prompt": "Analyze this architecture...",
  "conversation_id": "uuid",        // For multi-turn
  "max_completion_tokens": 2000,
  "wait_timeout_ms": 120000
}
```

```jsonc
// Example with Google (standard sync)
{
  "model": "google:gemini-2.5-flash",
  "prompt": "Quick code review..."
}
```
idiom
Get ecosystem-aware implementation approaches.
```json
{
  "task": "Implement global state management in React",
  "context": {
    "dependencies": "{ \"react\": \"^18.2.0\" }",
    "language": "typescript",
    "constraints": ["no new dependencies"]
  }
}
```
Returns:
- Recommended approach with rationale
- Packages to use/avoid
- Code examples
- Anti-patterns to avoid
Architecture
```
src/
├── server.ts             # Main MCP server
├── providers.ts          # Provider configuration
├── model-registry.ts     # Model factory registry
├── handlers/
│   ├── advice.ts         # Sync advice handler
│   ├── advice-async.ts   # Async with caching
│   ├── idiom.ts          # Pattern enforcement
│   └── models.ts         # Model listing
├── utils/
│   ├── errors.ts         # Error handling
│   └── optimization.ts   # Performance utils
└── clients/
    └── openai-async.ts   # OpenAI async client
```
Available Models
OpenAI
- Reasoning: o3, o3-mini, o3-pro, o4-mini (60-120s)
- Fast: gpt-4o, gpt-4o-mini (5-15s)
- GPT-5 Series: gpt-5, gpt-5-mini, gpt-5-nano (with reasoning)
- GPT-4.1 Series: gpt-4.1, gpt-4.1-mini, gpt-4.1-nano
Google
- Gemini 2.5: gemini-2.5-pro, gemini-2.5-flash (with thinking mode)
- Gemini 2.0: gemini-2.0-flash, gemini-2.0-flash-lite
- Gemini 1.x: gemini-1.5-pro, gemini-1.5-flash, gemini-1.0-pro
Anthropic
- Claude 4 (Opus): claude-opus-4-1-20250805, claude-opus-4-20250514 (hybrid reasoning)
- Claude 3.7: claude-3-7-sonnet-20250219
- Claude 3.5: claude-3-5-sonnet-20241022, claude-3-5-haiku-20241022
- Claude 3: claude-3-opus-20240229, claude-3-haiku-20240307
xAI
- Grok 4: grok-4 (with reasoning)
- Grok 3: grok-3, grok-3-mini
- Grok 2: grok-2
- Legacy: grok-beta
Claude Desktop Integration
Add to ~/Library/Application Support/Claude/claude_desktop_config.json:
```json
{
  "mcpServers": {
    "model-agency": {
      "command": "bun",
      "args": ["run", "/path/to/model-agency/dist/run.js"],
      "env": {
        "OPENAI_API_KEY": "sk-...",
        "GOOGLE_API_KEY": "...",
        "ANTHROPIC_API_KEY": "...",
        "XAI_API_KEY": "..."
      }
    }
  }
}
```
Testing
```sh
bun test        # Run all tests
bun test:watch  # Watch mode
bun check       # Type checking
```
Troubleshooting
No models available: Check that at least one API key is set.
Model not found: Use the full `provider:model-name` format (e.g., `openai:gpt-4o`).
Rate limits: Error -32003 indicates rate limiting. Try another provider or wait before retrying.
API key issues: Error -32002 indicates an authentication problem. Verify that the key is valid.
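The `provider:model-name` format mentioned above is simple to validate up front; a minimal sketch (the `parseModelId` helper is illustrative, not a model-agency export):

```typescript
// Split "provider:model-name" into its parts, rejecting bare model names.
// Illustrative helper, not part of model-agency's public API.
function parseModelId(id: string): { provider: string; model: string } {
  const sep = id.indexOf(":");
  if (sep === -1) {
    throw new Error(
      `Model not found: use the full "provider:model-name" format, e.g. "openai:gpt-4o"`,
    );
  }
  return { provider: id.slice(0, sep), model: id.slice(sep + 1) };
}
```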
Contributing
- Fork the repository
- Create your feature branch
- Commit changes
- Push to branch
- Open a Pull Request
License
MIT - see LICENSE file.
