llm-cost-router
v1.0.0
Published
Automatically route LLM requests to the right model based on complexity, cost, and speed. Stop overpaying for simple prompts.
Maintainers
Readme
llm-router
Automatically route LLM requests to the right model based on complexity, cost, and speed. Stop paying Opus prices for tasks that Haiku can handle.
npm install llm-routerThe Problem
// You're doing this:
const result = await anthropic.complete(prompt, { model: "claude-opus-4-6" })
// For EVERY request. Including:
// "Is this email valid?" → should be Haiku ($0.00025/1K)
// "Translate this to French" → should be Haiku
// "Summarize this article" → should be Sonnet
// "Debug this race condition" → fine, use OpusTypical savings: 60–80% on LLM API costs with zero quality loss.
Quick Start
import { createRouter } from "llm-router"
const router = createRouter({
routes: [
{ model: "claude-haiku-4-5-20251001", cost: "low", for: ["classify", "validate"] },
{ model: "claude-sonnet-4-6", cost: "medium", for: ["summarize", "rewrite"] },
{ model: "claude-opus-4-6", cost: "high", for: ["debug", "architect"] },
],
fallback: "claude-sonnet-4-6",
})
// Routing decision — no API call needed
const decision = router.route("Is this email format valid?")
// { model: "claude-haiku-4-5-20251001", tier: "low", reason: "...", confidence: 0.8 }
// Full completion — attach your own provider
router.useProvider(myAnthropicProvider)
const result = await router.complete("Debug this race condition across 3 files...")
// result.model → "claude-opus-4-6" (routed automatically)
// result.usage.estimatedCost → 0.0032Routing Strategies
rules (default) — Fast, zero extra API call
Analyzes prompt length, keywords, structure. Decision in under 1ms.
createRouter({ ...config, strategy: "rules" })smart — Intent-based matching
Matches against a curated intent map. More accurate for ambiguous prompts.
createRouter({ ...config, strategy: "smart" })Budget Control
const router = createRouter({
...config,
budget: {
maxCostPerRequest: 0.05, // $0.05 hard limit per call
},
onBudgetExceeded: "fallback", // "throw" | "skip" | "fallback"
})| Behaviour | What happens |
|---|---|
| "throw" | Throws an error before the API call |
| "skip" | Returns empty response, no API call |
| "fallback" | Downgrades to cheapest available model |
Cost Estimation
const cost = router.estimateCost("classify this as spam or not")
// → 0.000018 (estimated USD before making the call)Force a Model
const result = await router.complete(prompt, {
forceModel: "claude-opus-4-6", // bypass routing
})Observe Routing Decisions
createRouter({
...config,
onRoute: (prompt, model, reason) => {
console.log(`[router] ${model} — ${reason}`)
},
})Supported Models (built-in cost table)
| Model | Tier | Cost/1K tokens | |---|---|---| | claude-haiku-4-5-20251001 | low | $0.00025 | | claude-sonnet-4-6 | medium | $0.003 | | claude-opus-4-6 | high | $0.015 | | gpt-4o-mini | low | $0.00015 | | gpt-4o | medium | $0.005 | | gemini-1.5-flash | low | $0.000075 |
Any model string works — unknown models fall back to a default cost estimate.
TypeScript
Fully typed. All config, decisions, and responses are typed end to end.
import type { RouterConfig, RoutingDecision, LLMResponse } from "llm-router"License
MIT
