llm-cost-router-mcp
v0.2.0
MCP server that advises LLM model routing & cost BEFORE the call — estimate_cost, recommend_model, should_offload_local. Pure logic, no API keys, $0 to run.
💸 LLM Cost Router — MCP
Know what an LLM call will cost before you make it — and route it to the cheapest model that can actually do the job (including free local Ollama).
An MCP server that gives your AI assistant (Claude Code, Cursor, Cline…) cost-awareness in context. Pure logic, no API keys, $0 to run — it never calls a paid API, it just advises.
Why
LLM gateways (LiteLLM, OpenRouter, Helicone…) sit in the request path and measure spend after the fact. This MCP works the other way around and does something a gateway cannot: it advises before the call, right inside the assistant's reasoning loop, so the model can decide to offload to free local compute or pick a cheaper cloud model, with no proxy in the path.
Tools
| Tool | What it does |
|------|--------------|
| estimate_cost | USD cost of a call for a given model + prompt/tokens. |
| recommend_model | Cheapest model capable of the task (complexity-scored), + a free local option + alternatives. |
| should_offload_local | Boolean: can this run free on local Ollama? + the USD you'd save vs cloud. |
| list_models | Known models, pricing (per 1M tokens), capability tier, context window. |
All pure logic: a pricing table + a complexity heuristic. No network, no keys, no per-call cost.
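As a rough illustration of what "a pricing table + a complexity heuristic" can look like, here is a minimal TypeScript sketch. It is not the package's actual source: the function names `estimateCostUsd` and `cheapestCapable`, the model names, and the prices are all hypothetical. The field names (`in`, `out`, `tier`, `ctx`) are borrowed from the MODEL_PRICING_JSON example, and the sketch assumes that a higher `tier` means a more capable model.

```typescript
// Hypothetical pricing entry. Field names follow the MODEL_PRICING_JSON example:
// `in` / `out` are USD per 1M tokens, `tier` is a capability level
// (assumed: higher = more capable), `ctx` is the context window in tokens.
interface ModelPricing {
  provider: string;
  in: number;
  out: number;
  tier: number;
  ctx: number;
}

// Illustrative table only: these model names and prices are made up.
const PRICING: Record<string, ModelPricing> = {
  "local-small": { provider: "ollama", in: 0, out: 0, tier: 1, ctx: 32000 },
  "cloud-mid": { provider: "x", in: 1, out: 3, tier: 2, ctx: 128000 },
  "cloud-large": { provider: "x", in: 5, out: 15, tier: 3, ctx: 200000 },
};

// Cost of one call: token counts scaled by the per-1M-token prices.
function estimateCostUsd(model: string, inputTokens: number, outputTokens: number): number {
  const p = PRICING[model];
  if (!p) throw new Error(`unknown model: ${model}`);
  return (inputTokens * p.in + outputTokens * p.out) / 1_000_000;
}

// Cheapest model whose tier meets the requirement and whose context window
// fits the whole call; returns undefined when nothing qualifies.
function cheapestCapable(
  requiredTier: number,
  inputTokens: number,
  outputTokens: number,
): string | undefined {
  let best: string | undefined;
  let bestCost = Infinity;
  for (const [name, p] of Object.entries(PRICING)) {
    if (p.tier < requiredTier || p.ctx < inputTokens + outputTokens) continue;
    const cost = estimateCostUsd(name, inputTokens, outputTokens);
    if (cost < bestCost) {
      best = name;
      bestCost = cost;
    }
  }
  return best;
}
```

Under these made-up prices, `estimateCostUsd("cloud-mid", 1000, 500)` works out to (1000 × 1 + 500 × 3) / 1,000,000 = 0.0025 USD, and `cheapestCapable(1, 1000, 500)` picks the free local entry. How the real server scores complexity into a required tier is not shown here.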
Install
// Claude Code / Cursor MCP config
{
  "mcpServers": {
    "llm-cost-router": { "command": "npx", "args": ["-y", "llm-cost-router-mcp"] }
  }
}
Or run directly: npx -y llm-cost-router-mcp
Example
"Before answering, ask llm-cost-router whether this can run locally."
recommend_model({ "prompt": "write a regex to validate an email in python", "priority": "cost" })
→ { "recommended_model": "ollama:qwen2.5-coder:7b", "estimated_cost_usd": 0,
    "complexity": { "label": "mid" },
    "best_capable": { "model": "gemini-2.5-pro", "cost_usd": 0.000654 } }
Custom pricing
Override or add models without touching code:
MODEL_PRICING_JSON='{"my-model":{"provider":"x","in":1,"out":3,"tier":2,"ctx":128000}}'
Free vs Pro
- Free (this package): estimate_cost, recommend_model, should_offload_local, list_models.
- Pro (coming): budget rules & alerts, per-project policies, multi-provider live pricing sync, local-first presets, team config.
License
MIT © Frederic Suretat
