llm-cost-router-mcp

v0.2.0

Published

3 days ago

MCP server that advises LLM model routing & cost BEFORE the call — estimate_cost, recommend_model, should_offload_local. Pure logic, no API keys, $0 to run.

0High
0Medium
0Low

fredericsuretat

mcp model-context-protocol llm cost routing ollama claude openai finops ai-gateway

💸 LLM Cost Router — MCP

Know what an LLM call will cost before you make it — and route it to the cheapest model that can actually do the job (including free local Ollama).

An MCP server that gives your AI assistant (Claude Code, Cursor, Cline…) cost-awareness in context. Pure logic, no API keys, $0 to run — it never calls a paid API, it just advises.

Why

LLM gateways (LiteLLM, OpenRouter, Helicone…) sit in the request path and measure spend after the fact. This MCP does the opposite and the thing they can't: it advises before the call, right inside the assistant's reasoning loop, so the model can decide to offload to free local compute or pick a cheaper cloud model — without any proxy.

Tools

| Tool | What it does | |------|--------------| | estimate_cost | USD cost of a call for a given model + prompt/tokens. | | recommend_model | Cheapest model capable of the task (complexity-scored), + a free local option + alternatives. | | should_offload_local | Boolean: can this run free on local Ollama? + the USD you'd save vs cloud. | | list_models | Known models, pricing (per 1M tokens), capability tier, context window. |

All pure logic: a pricing table + a complexity heuristic. No network, no keys, no per-call cost.

Install

// Claude Code / Cursor MCP config
{
  "mcpServers": {
    "llm-cost-router": { "command": "npx", "args": ["-y", "llm-cost-router-mcp"] }
  }
}

Or run directly: npx -y llm-cost-router-mcp

Example

"Before answering, ask llm-cost-router whether this can run locally."

recommend_model({ "prompt": "write a regex to validate an email in python", "priority": "cost" })
→ { "recommended_model": "ollama:qwen2.5-coder:7b", "estimated_cost_usd": 0,
    "complexity": { "label": "mid" },
    "best_capable": { "model": "gemini-2.5-pro", "cost_usd": 0.000654 } }

Custom pricing

Override or add models without touching code:

MODEL_PRICING_JSON='{"my-model":{"provider":"x","in":1,"out":3,"tier":2,"ctx":128000}}'

Free vs Pro

Free (this package) — estimate_cost, recommend_model, should_offload_local, list_models.
Pro (coming) — budget rules & alerts, per-project policies, multi-provider live pricing sync, local-first presets, team config.

Published

Vulnerabilities

Links

Maintainers

Keywords

Readme