seracade
v2.15.0
Published
Seracade is a drop-in OpenAI-compatible routing proxy for AI agent teams. Six named capabilities: Call (every request, addressable and replayable), Step (sub-Call routing context inside agent trajectories), Quality Score (calibrated, version-stamped quali
Maintainers
Readme
seracade
Route your AI agent's LLM calls to the cheapest model that meets your quality bar — automatically. Seracade audits your real traffic, proves savings with empirical scoring, then routes every call to the Pareto-optimal model for its task type.
Free audit. 15% of per-call savings after $500/month threshold. Your API keys, never stored.
Setup
Recommended: Set your API key as an environment variable so it is never passed inline through tool calls.
export SERACADE_API_KEY=your-provider-api-keyClaude Desktop
Add to claude_desktop_config.json:
{
"mcpServers": {
"seracade": {
"command": "npx",
"args": ["-y", "seracade"],
"env": {
"SERACADE_API_KEY": "your-provider-api-key"
}
}
}
}Claude Code
claude mcp add seracade -- npx -y seracadeCrewAI
from crewai import Agent
from crewai_tools import MCPServerAdapter
import os
mcp_adapter = MCPServerAdapter(
{"command": "npx", "args": ["-y", "seracade"]},
env={"SERACADE_API_KEY": os.environ["SERACADE_API_KEY"]}
)
agent = Agent(
role="LLM Cost Optimizer",
tools=mcp_adapter.tools
)LangGraph
from langchain_mcp_adapters.client import MultiServerMCPClient
import os
client = MultiServerMCPClient({
"seracade": {
"command": "npx",
"args": ["-y", "seracade"],
"env": {"SERACADE_API_KEY": os.environ["SERACADE_API_KEY"]},
"transport": "stdio",
}
})
tools = await client.get_tools()Vercel AI SDK
import { experimental_createMCPClient } from "ai";
import { Experimental_StdioMCPTransport } from "ai/mcp-stdio";
const mcp = await experimental_createMCPClient({
transport: new Experimental_StdioMCPTransport({
command: "npx",
args: ["-y", "seracade"],
env: { SERACADE_API_KEY: process.env.SERACADE_API_KEY! },
}),
});
const tools = await mcp.tools();How it works
Step 1 — Audit. Call seracade with action=start_audit. Set OPENAI_BASE_URL=https://seracade.com/v1. Your API calls flow through unchanged — zero latency impact, your keys, your data. At 50 calls, the audit triggers automatically.
Step 2 — Review. Call seracade with action=get_status to check progress. When the report is ready, the tool returns a URL. Share it with a human decision maker before activating routing.
Step 3 — Activate. Call seracade with action=activate_routing. Every subsequent call routes to the optimal model on the efficient frontier for its task type.
Step 4 — Route. Use seracade_route in place of your direct LLM calls. Seracade selects the cheapest model that meets your quality bar and returns the full completion with per-call savings.
Step 5 — Track. Call seracade with action=check_savings to see your monthly savings. Free until the savings Seracade generates for you exceed $500/month — then 15% of the per-call price difference. You keep 85%.
Tools
seracade_route
Route an LLM completion to the cheapest model that meets your quality bar. Requires routing to be active — call seracade with action=activate_routing first.
| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| model | string | Yes | The model you would normally use (e.g. gpt-4o, claude-sonnet-4-6) |
| messages | array | Yes | OpenAI-format messages array |
| temperature | number | No | Sampling temperature |
| max_tokens | integer | No | Max output tokens |
| api_key | string | No | Your provider API key. Prefer SERACADE_API_KEY env var. |
Returns: full completion, routed model, task type, and estimated savings per call.
seracade
Manage your Seracade account. Action-dispatched.
| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| action | string | Yes | One of: start_audit, get_status, activate_routing, check_savings, set_budget |
| api_key | string | No | Your provider API key. Prefer SERACADE_API_KEY env var. |
| email | string | No | Email for report delivery (start_audit only) |
| strategy | string | No | Routing strategy: balanced, max_savings, conservative (activate_routing only, default: balanced) |
| agent_name | string | No | Agent identifier (set_budget only) |
| monthly_budget_usd | number | No | Monthly spend cap in USD (set_budget only) |
| action_on_exceed | string | No | block, downgrade, or alert when budget exceeded (set_budget only, default: alert) |
Remote HTTP (no local install)
Connect directly without npx — for hosts that support streamable HTTP transport:
{
"mcpServers": {
"seracade": { "url": "https://seracade.com/mcp" }
}
}Discovery endpoint: https://seracade.com/.well-known/mcp.json
Pricing
- Free audit, always
- Smart routing: 15% of the price difference per routed call
- Free until the savings Seracade generates for you exceed $500/month
- You keep 85% of every dollar saved
Security
- BYOK: your API keys, your data — Seracade never bills for LLM usage
- Keys hashed for identification, never stored in plaintext
- All traffic over TLS
- 30-day response body retention, 90-day metadata retention
- Security
