# llm-cost-guard
v1.2.3
Hard budget limits and kill switches for LLM API calls. No more surprise bills.
Drop-in spend tracking, rolling-window budgets, and automatic kill switches for any LLM provider. Wraps OpenAI, Anthropic, and Gemini SDKs transparently — or track usage manually. Zero runtime dependencies.
## Install

```sh
npm install llm-cost-guard
```

## Quick Start

### Track a single call

```js
import { createGuard } from "llm-cost-guard";

const guard = createGuard({
  budgets: [{ id: "global-hourly", limitUsd: 50, windowMs: 60 * 60 * 1000 }],
});

const result = await guard.track({
  model: "gpt-5",
  inputTokens: 1200,
  outputTokens: 800,
  userId: "u_123",
  feature: "chat",
});

console.log(`Cost: $${result.event.costUsd.toFixed(6)}`);
```

### Wrap an OpenAI client (auto-tracking)
```js
import OpenAI from "openai";
import { createGuard } from "llm-cost-guard";

const guard = createGuard({
  budgets: [
    { id: "global", limitUsd: 100, windowMs: 3_600_000 },
    { id: "per-user", limitUsd: 10, windowMs: 86_400_000, scopeBy: "user" },
  ],
});

const openai = guard.wrap(new OpenAI(), { userId: "u_42", feature: "assistant" });

// Every call is tracked automatically — budget enforced on every response
await openai.responses.create({ model: "gpt-5", input: "Summarize this transcript" });
```

### Listen for alerts and kills
```js
guard.onBudgetAlert((alert) => {
  console.warn(`⚠️ [${alert.thresholdPercent}%] ${alert.scopeKey}: $${alert.usageUsd.toFixed(4)} / $${alert.limitUsd}`);
});

guard.onKill((event) => {
  console.error(`🛑 Kill switch: ${event.scopeKey} blew past $${event.limitUsd}`);
});
```

### Query usage summaries
```js
const usage = await guard.getUsage({ windowMs: 3_600_000 });

console.log(`Last hour: $${usage.totalSpendUsd.toFixed(4)} across ${usage.totalCalls} calls`);
console.log("By model:", usage.byModel);
console.log("By user:", usage.byUser);
```

### Express middleware precheck
```js
import express from "express";
import { createGuard, createExpressMiddleware } from "llm-cost-guard";

const guard = createGuard({
  budgets: [{ id: "api", limitUsd: 500, windowMs: 86_400_000 }],
});

const app = express();

app.use(
  createExpressMiddleware(guard, {
    precheck: { enabled: true, maxSpendUsd: 500, windowMs: 86_400_000 },
    overBudgetStatusCode: 429,
    overBudgetMessage: "Daily LLM budget exceeded",
  })
);
```

## Budget Rules
Budget rules are the core of llm-cost-guard. Each rule defines a spending limit over a rolling time window.
```js
const guard = createGuard({
  budgets: [
    // $100/hour global limit
    { id: "global", limitUsd: 100, windowMs: 3_600_000 },
    // $10/day per user
    { id: "user-daily", limitUsd: 10, windowMs: 86_400_000, scopeBy: "user" },
    // $25/hour per feature
    { id: "feature-hourly", limitUsd: 25, windowMs: 3_600_000, scopeBy: "feature" },
    // $5/day per user+feature combo
    { id: "user-feature", limitUsd: 5, windowMs: 86_400_000, scopeBy: "user_feature" },
    // Scope to a specific model
    { id: "expensive-model", limitUsd: 50, windowMs: 3_600_000, model: "gpt-5.2-pro" },
    // Alert only (no kill switch)
    { id: "soft-limit", limitUsd: 200, windowMs: 86_400_000, killSwitch: false },
  ],
});
```

| Field | Type | Default | Description |
|---|---|---|---|
| id | string | auto | Rule identifier (used in alert scope keys) |
| limitUsd | number | required | Spending limit in USD |
| windowMs | number | required | Rolling time window in milliseconds |
| scopeBy | "global" \| "user" \| "feature" \| "user_feature" | "global" | How to partition spend |
| model | string | — | Restrict rule to a specific model |
| userId | string | — | Restrict rule to a specific user |
| feature | string | — | Restrict rule to a specific feature |
| killSwitch | boolean | true | Throw BudgetExceededError when limit is breached |
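Conceptually, a rolling-window rule is just a sum of event costs newer than `now - windowMs`, compared against the limit. A minimal standalone sketch of that idea (illustrative only — `isOverBudget` and the event shape below are not library exports):

```javascript
// Illustrative sketch of a rolling-window budget check.
// Events carry { createdAt, costUsd }, mirroring tracked usage events.
function isOverBudget(events, rule, now = Date.now()) {
  const windowStart = now - rule.windowMs;
  const spend = events
    .filter((e) => e.createdAt >= windowStart) // only events inside the window
    .reduce((sum, e) => sum + e.costUsd, 0);
  return spend > rule.limitUsd;
}

const rule = { limitUsd: 10, windowMs: 3_600_000 }; // $10/hour
const now = Date.now();
const events = [
  { createdAt: now - 1_000, costUsd: 6 },       // inside the window
  { createdAt: now - 10_000, costUsd: 5 },      // inside the window
  { createdAt: now - 7_200_000, costUsd: 100 }, // two hours ago (ignored)
];
console.log(isOverBudget(events, rule, now)); // true: $11 in the last hour
```

Because the window rolls, spend "frees up" automatically as old events age out — no reset schedule needed.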
## SDK Wrapping
guard.wrap() creates a transparent proxy around any LLM SDK client. It intercepts responses, extracts token usage, and tracks spend automatically.
```js
// OpenAI
const openai = guard.wrap(new OpenAI(), { userId: "u_1" });

// Anthropic
const anthropic = guard.wrap(new Anthropic(), { feature: "summarizer" });

// Any client that returns { usage: { input_tokens, output_tokens } }
const client = guard.wrap(myClient, {
  metadataExtractor: (args) => ({
    userId: args[0]?.metadata?.userId,
    feature: "custom",
  }),
});
```

Supported usage response formats:
- `usage.prompt_tokens` / `usage.completion_tokens` (OpenAI)
- `usage.input_tokens` / `usage.output_tokens` (Anthropic)
- `usageMetadata.promptTokenCount` / `usageMetadata.candidatesTokenCount` (Gemini)
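All three shapes reduce to the same two numbers; a hypothetical sketch of how such normalization can work (`normalizeUsage` is illustrative, not a library export or its actual internals):

```javascript
// Normalize the three documented usage shapes to { inputTokens, outputTokens }.
// Illustrative only — not the library's internals.
function normalizeUsage(response) {
  const u = response.usage ?? response.usageMetadata ?? {};
  const inputTokens =
    u.prompt_tokens ?? u.input_tokens ?? u.promptTokenCount ?? 0;
  const outputTokens =
    u.completion_tokens ?? u.output_tokens ?? u.candidatesTokenCount ?? 0;
  return { inputTokens, outputTokens };
}

// OpenAI-style response
console.log(normalizeUsage({ usage: { prompt_tokens: 12, completion_tokens: 3 } }));
// → { inputTokens: 12, outputTokens: 3 }

// Gemini-style response
console.log(normalizeUsage({ usageMetadata: { promptTokenCount: 5, candidatesTokenCount: 1 } }));
// → { inputTokens: 5, outputTokens: 1 }
```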
## Alerts & Kill Switch
Threshold alerts fire at 80%, 90%, and 100% of each budget rule. Alerts only fire once per threshold per scope (they reset when usage drops below 80%).
```js
// Subscribe to alerts
const unsubscribe = guard.onBudgetAlert((alert) => {
  // alert.thresholdPercent → 80 | 90 | 100
  // alert.usageUsd → current spend
  // alert.limitUsd → budget limit
  // alert.scopeKey → e.g. "global|global" or "per-user|user:u_42"
  slack.send(`Budget alert: ${alert.scopeKey} at ${alert.thresholdPercent}%`);
});

// Subscribe to kills
guard.onKill((event) => {
  pagerduty.trigger(`LLM budget exceeded: ${event.scopeKey}`);
});

// Unsubscribe when done
unsubscribe();
```

When a budget is exceeded and `killSwitch` is enabled (the default), `guard.track()` throws a `BudgetExceededError`:
```js
import { BudgetExceededError } from "llm-cost-guard";

try {
  await guard.track({ model: "gpt-5", inputTokens: 50000, outputTokens: 20000 });
} catch (err) {
  if (err instanceof BudgetExceededError) {
    console.error(err.event.scopeKey, "→", err.message);
  }
}
```

Set `throwOnKill: false` in the config to suppress throws and handle kills via callbacks only.
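The once-per-threshold behavior described above can be sketched as a tiny state machine over already-fired thresholds (a sketch of the idea, not the library's internals — `checkThresholds` is hypothetical):

```javascript
// Fire each of 80/90/100 at most once per scope; re-arm when usage
// drops back below 80%. Illustrative only.
const THRESHOLDS = [80, 90, 100];

function checkThresholds(fired, usageUsd, limitUsd) {
  const percent = (usageUsd / limitUsd) * 100;
  if (percent < 80) {
    fired.clear(); // usage dropped back down, so re-arm all thresholds
    return [];
  }
  const newlyFired = THRESHOLDS.filter((t) => percent >= t && !fired.has(t));
  newlyFired.forEach((t) => fired.add(t));
  return newlyFired;
}

const fired = new Set();
console.log(checkThresholds(fired, 85, 100));  // [80]
console.log(checkThresholds(fired, 92, 100));  // [90]
console.log(checkThresholds(fired, 92, 100));  // [] (90 already fired)
console.log(checkThresholds(fired, 50, 100));  // [] (state reset)
console.log(checkThresholds(fired, 101, 100)); // [80, 90, 100]
```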
## Built-in Pricing
Includes pricing for 40+ models across OpenAI, Anthropic, Google, DeepSeek, and MiniMax. Override or extend:
```js
import { BUILT_IN_PRICING, calculateCostUsd } from "llm-cost-guard";

// Check a model's pricing
console.log(BUILT_IN_PRICING["gpt-5"]);
// → { inputPerMillionUsd: 1.25, outputPerMillionUsd: 10 }

// Calculate cost manually
const cost = calculateCostUsd("claude-sonnet-4-6", 5000, 2000);
console.log(`$${cost.toFixed(6)}`);

// Add custom model pricing
const guard = createGuard({
  budgets: [{ limitUsd: 100, windowMs: 3_600_000 }],
  pricing: {
    "my-fine-tune": { inputPerMillionUsd: 6, outputPerMillionUsd: 12 },
  },
});
```

Unknown models throw `UnknownModelPricingError` by default. Set `onUnknownModel: "zero"` to treat them as free.
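The cost itself is simple per-million arithmetic. A standalone sketch of the formula, using the gpt-5 rates shown above (`costUsd` here is an illustration, not the library's `calculateCostUsd`):

```javascript
// cost = (tokens / 1e6) * per-million rate, summed over input and output.
function costUsd(pricing, inputTokens, outputTokens) {
  return (
    (inputTokens / 1_000_000) * pricing.inputPerMillionUsd +
    (outputTokens / 1_000_000) * pricing.outputPerMillionUsd
  );
}

// Same token counts as the Quick Start example: 1200 in, 800 out.
const gpt5 = { inputPerMillionUsd: 1.25, outputPerMillionUsd: 10 };
console.log(costUsd(gpt5, 1200, 800).toFixed(6)); // "0.009500"
```

So the $50/hour rule from the Quick Start allows roughly 5,000 such calls per hour before the kill switch trips.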
## HTTP Middleware

Pre-check guards for Express and Fastify that reject requests before they hit your LLM code.

### Express
```js
import { createGuard, createExpressMiddleware } from "llm-cost-guard";

const guard = createGuard({
  budgets: [{ id: "api", limitUsd: 500, windowMs: 86_400_000 }],
});

app.use(
  createExpressMiddleware(guard, {
    precheck: { enabled: true, maxSpendUsd: 500, windowMs: 86_400_000 },
    userIdResolver: (req) => req.headers["x-user-id"],
    featureResolver: (req) => req.headers["x-feature"],
    overBudgetStatusCode: 429,
    overBudgetMessage: "Budget exceeded",
  })
);
```

### Fastify
```js
import { createGuard, createFastifyPreHandler } from "llm-cost-guard";

const guard = createGuard({
  budgets: [{ id: "api", limitUsd: 500, windowMs: 86_400_000 }],
});

fastify.addHook(
  "preHandler",
  createFastifyPreHandler(guard, {
    precheck: { enabled: true, maxSpendUsd: 500, windowMs: 86_400_000 },
  })
);
```

## Custom Storage
The default `MemoryStorageAdapter` is process-local, with time-window queries optimized via binary search. For distributed deployments, implement the `StorageAdapter` interface:
```ts
import { createGuard, StorageAdapter, UsageEvent, StorageQuery } from "llm-cost-guard";

class RedisStorageAdapter implements StorageAdapter {
  async append(event: UsageEvent): Promise<void> {
    // Sorted set keyed by timestamp makes range queries cheap
    await redis.zadd("llm:usage", event.createdAt, JSON.stringify(event));
  }

  async list(filter?: StorageQuery): Promise<UsageEvent[]> {
    const min = filter?.since ?? 0;
    const max = filter?.until ?? "+inf";
    const raw = await redis.zrangebyscore("llm:usage", min, max);
    // matchesFilter: your helper applying the remaining filter fields
    return raw.map((s: string) => JSON.parse(s)).filter((e: UsageEvent) => matchesFilter(e, filter));
  }

  async reset(): Promise<void> {
    await redis.del("llm:usage");
  }
}

const guard = createGuard({
  budgets: [{ limitUsd: 100, windowMs: 3_600_000 }],
  storage: new RedisStorageAdapter(),
});
```

## Input Validation
guard.track() validates all inputs. Invalid calls throw InvalidTrackInputError:
```js
import { InvalidTrackInputError } from "llm-cost-guard";

try {
  await guard.track({ model: "", inputTokens: -1, outputTokens: NaN });
} catch (err) {
  if (err instanceof InvalidTrackInputError) {
    console.error(err.message);
    // "model must be a non-empty string"
  }
}
```

## CLI
```sh
npx llm-cost-guard status
```

Prints built-in model pricing. Useful for verifying what's included.
## Configuration Reference
| Option | Type | Default | Description |
|---|---|---|---|
| budgets | BudgetRule[] | required | Budget rules to enforce |
| pricing | PricingCatalog | built-in catalog | Override or extend model pricing |
| storage | StorageAdapter | MemoryStorageAdapter | Pluggable usage event storage |
| throwOnKill | boolean | true | Throw BudgetExceededError on budget breach |
| onUnknownModel | "error" \| "zero" | "error" | Behavior for models not in pricing catalog |
| now | () => number | Date.now | Injectable clock (useful for testing) |
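Pulling the options together, a sketch of a fully specified config — the fake clock, custom pricing entry, and budget values here are illustrative choices, not defaults:

```javascript
import { createGuard } from "llm-cost-guard";
import { MemoryStorageAdapter } from "llm-cost-guard/storage";

let fakeNow = 0; // advance manually in tests for deterministic windows

const guard = createGuard({
  budgets: [{ id: "global", limitUsd: 100, windowMs: 3_600_000 }],
  pricing: { "my-fine-tune": { inputPerMillionUsd: 6, outputPerMillionUsd: 12 } },
  storage: new MemoryStorageAdapter(),
  throwOnKill: false,     // handle kills via guard.onKill instead of throws
  onUnknownModel: "zero", // treat unlisted models as free
  now: () => fakeNow,     // injectable clock
});
```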
## Exports

```js
// Main
import { createGuard, BudgetExceededError, InvalidTrackInputError, UnknownModelPricingError } from "llm-cost-guard";

// Middleware
import { createExpressMiddleware, createFastifyPreHandler } from "llm-cost-guard/middleware";

// Pricing utilities
import { BUILT_IN_PRICING, calculateCostUsd, getModelPricing } from "llm-cost-guard/pricing";

// Storage
import { MemoryStorageAdapter } from "llm-cost-guard/storage";
```

Built with teeth. 🌑
