@advik1228/llm-cost-guard
v0.1.2
Published
[](https://www.npmjs.com/package/@advik1228/llm-cost-guard) [](https://www.npmjs.com/package/@advik1228
Downloads
390
Maintainers
Readme
@advik1228/llm-cost-guard
Drop-in cost guard for LLM API clients. Wrap your Anthropic, OpenAI, or Google Gemini client with a JavaScript Proxy — track spend, enforce limits, and get alerts without changing your application code.
Why This Exists
LLM API costs can spike silently. A single runaway loop or oversized prompt can burn through a daily budget in minutes. Most teams discover overspend only after the invoice arrives.
@advik1228/llm-cost-guard sits between your app and the LLM provider. It reads real token counts from API responses, calculates cost in USD, enforces configurable limits, and optionally fires webhook alerts — all with one wrapper call.
Features
- Zero code changes — wrap existing clients via Proxy; intercepts
messages.create,chat.completions.create, and GeminigenerateContent - Real token counts — reads
usagefrom API responses (not tiktoken estimates at runtime) - Daily, monthly, and per-request limits — throw, warn, or silently ignore on breach
- Per-user budget tracking — isolate spend by
userIdwith optionaluserDailyLimit - Pre-flight estimation — estimate cost before sending (char/4 heuristic)
- Streaming support — Anthropic and OpenAI streaming with usage captured after stream completes
- Webhook alerts — Slack/Discord/custom webhook on warn threshold or limit breach
- Pluggable storage — in-memory (default) or Redis for multi-instance deployments
- Custom pricing — override built-in model rates for private or fine-tuned models
- TypeScript-first — full type definitions included
Installation
npm install @advik1228/llm-cost-guardQuick Start
Anthropic
import Anthropic from "@anthropic-ai/sdk";
import { guard } from "@advik1228/llm-cost-guard";
const client = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });
const guarded = guard(client, {
dailyLimit: 5.0,
warnAt: 4.0,
onLimit: "throw",
});
const response = await guarded.messages.create({
model: "claude-sonnet-4-6",
max_tokens: 1024,
messages: [{ role: "user", content: "Hello!" }],
});
console.log(response.content);OpenAI
import OpenAI from "openai";
import { guard } from "@advik1228/llm-cost-guard";
const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
const guarded = guard(client, {
dailyLimit: 10.0,
monthlyLimit: 100.0,
perRequestLimit: 0.50,
onLimit: "throw",
});
const response = await guarded.chat.completions.create({
model: "gpt-4o",
messages: [{ role: "user", content: "Hello!" }],
});
console.log(response.choices[0].message.content);Configuration
Pass a GuardConfig object as the second argument to guard():
| Option | Type | Description |
|--------|------|-------------|
| dailyLimit | number | Max USD spend per UTC day |
| monthlyLimit | number | Max USD spend per UTC month |
| perRequestLimit | number | Max USD per single API call |
| warnAt | number | USD threshold to trigger a warning (and webhook if configured) |
| onLimit | "throw" \| "warn" \| "silent" | Behavior when a limit is exceeded (default: "throw") |
| userId | string | Track spend for a specific user |
| userDailyLimit | number | Per-user daily USD cap |
| alertWebhook | string | URL to POST alert payloads on warn/limit events |
| preflight | boolean | Reserved for pre-flight enforcement |
| storage | "memory" \| StorageAdapter | Storage backend (default: in-memory) |
| customPricing | Record<string, { inputPerMillion, outputPerMillion }> | Override built-in model pricing |
const guarded = guard(client, {
dailyLimit: 5.0,
monthlyLimit: 50.0,
perRequestLimit: 1.0,
warnAt: 4.0,
onLimit: "throw",
alertWebhook: "https://hooks.slack.com/services/...",
userId: "user_abc123",
userDailyLimit: 0.50,
});Per-User Budget Tracking
Track spend per end-user in multi-tenant apps:
const guarded = guard(client, {
dailyLimit: 100.0, // org-wide cap
userId: req.user.id,
userDailyLimit: 2.0, // per-user cap
onLimit: "throw",
});When a user's daily spend exceeds userDailyLimit, a LimitExceededError is thrown with limitType: "user".
Pre-Flight Cost Estimation
Estimate cost before making a call (uses char/4 token heuristic, assumes output = 25% of input):
import { estimateCost } from "@advik1228/llm-cost-guard";
const estimate = await estimateCost(
{
provider: "anthropic",
model: "claude-sonnet-4-6",
messages: [{ role: "user", content: "Summarize this 10-page document..." }],
},
2.5, // dailySpentSoFar (USD)
5.0 // dailyLimit (USD)
);
console.log(estimate.estimatedCostUSD); // e.g. 0.003
console.log(estimate.willBreachLimit); // falseStreaming requests run a pre-flight check via estimateCost() before the stream starts.
Webhook Alerts
Set alertWebhook to receive JSON POST payloads on warn threshold or limit breach:
const guarded = guard(client, {
dailyLimit: 10.0,
warnAt: 8.0,
alertWebhook: "https://your-webhook.example.com/alerts",
});Payload shape:
{
"event": "warn_threshold_reached",
"currentSpendUSD": 8.12,
"limitUSD": 10.0,
"warnAtUSD": 8.0,
"timestamp": "2026-06-13T12:00:00.000Z",
"provider": "anthropic",
"userId": "user_abc123"
}Events: warn_threshold_reached, limit_reached, per_request_limit_reached, user_limit_reached.
Alerts are fire-and-forget — webhook failures are logged but never throw.
Usage Stats
import { getStats } from "@advik1228/llm-cost-guard";
const stats = await getStats();
console.log(stats.todayUSD); // today's spend
console.log(stats.monthUSD); // this month's spend
console.log(stats.requestCount); // total requests tracked
console.log(stats.byModel); // spend by model (future)
console.log(stats.byUser); // spend by user (future)Note:
getStats()uses a module-level in-memory tracker. For production, pass a sharedStorageAdapterviaguard({ storage })and query it directly.
Error Reference
LimitExceededError
Thrown when a configured limit is exceeded (when onLimit: "throw").
import { LimitExceededError } from "@advik1228/llm-cost-guard";
try {
await guarded.messages.create({ ... });
} catch (err) {
if (err instanceof LimitExceededError) {
console.log(err.limitType); // "daily" | "monthly" | "perRequest" | "user"
console.log(err.currentSpend); // current USD spend
console.log(err.limit); // configured limit
}
}PreflightError
Reserved for pre-flight enforcement when estimated cost exceeds remaining budget.
import { PreflightError } from "@advik1228/llm-cost-guard";
// err.estimatedCost, err.remainingBudgetSupported Models & Pricing
Prices in USD per 1 million tokens (input / output):
| Model | Input | Output |
|-------|------:|-------:|
| claude-opus-4-6 | $15.00 | $75.00 |
| claude-sonnet-4-6 | $3.00 | $15.00 |
| claude-haiku-4-5 | $0.80 | $4.00 |
| gpt-4o | $5.00 | $15.00 |
| gpt-4o-mini | $0.15 | $0.60 |
| gpt-3.5-turbo | $0.50 | $1.50 |
| gemini-1.5-pro | $3.50 | $10.50 |
| gemini-1.5-flash | $0.075 | $0.30 |
| gemini-2.0-flash | $0.10 | $0.40 |
Custom Pricing Override
Override or add models not in the built-in table:
const guarded = guard(client, {
dailyLimit: 10.0,
customPricing: {
"my-fine-tuned-model": {
inputPerMillion: 1.0,
outputPerMillion: 3.0,
},
},
});Redis Storage
For multi-instance deployments, use Redis instead of in-memory storage:
import { createClient } from "redis";
import { guard, RedisAdapter } from "@advik1228/llm-cost-guard";
const redis = createClient({ url: process.env.REDIS_URL });
await redis.connect();
const guarded = guard(client, {
dailyLimit: 10.0,
storage: new RedisAdapter(redis),
});Or import the adapter directly:
import { RedisAdapter } from "@advik1228/llm-cost-guard/adapters";All Redis keys are prefixed with llmguard: to avoid collisions.
Custom Storage Adapter
Implement the StorageAdapter interface for your own backend:
import { guard, StorageAdapter } from "@advik1228/llm-cost-guard";
class PostgresAdapter implements StorageAdapter {
async get(key: string): Promise<number> { /* ... */ return 0; }
async set(key: string, value: number, ttlSeconds?: number): Promise<void> { /* ... */ }
async increment(key: string, by: number): Promise<number> { /* ... */ return 0; }
}
const guarded = guard(client, {
dailyLimit: 10.0,
storage: new PostgresAdapter(),
});Environment Variables
The library does not read environment variables directly. Pass credentials and config from your app:
ANTHROPIC_API_KEY=sk-ant-...
OPENAI_API_KEY=sk-...
REDIS_URL=redis://localhost:6379const guarded = guard(client, {
dailyLimit: parseFloat(process.env.DAILY_LLM_BUDGET ?? "5"),
alertWebhook: process.env.LLM_ALERT_WEBHOOK,
});TypeScript Support
Full type definitions are included. Import types as needed:
import {
guard,
GuardConfig,
UsageStats,
LimitExceededError,
PreflightError,
AlertPayload,
} from "@advik1228/llm-cost-guard";Contributing
Setup
git clone https://github.com/advikhingmire12-oss/llm-cost-guard.git
cd llm-cost-guard
npm install
npm run build
npm testProject Structure
src/
index.ts # Public exports
guard.ts # Proxy wrapper + limit enforcement
tracker.ts # SpendTracker — records and queries spend
pricing.ts # Model pricing table + calculateCost()
estimator.ts # Pre-flight cost estimation
errors.ts # LimitExceededError, PreflightError
alerts.ts # Webhook alert sender
adapters/
memory.ts # In-memory StorageAdapter (default)
redis.ts # Redis StorageAdapter
tests/
guard.test.ts
estimator.test.tsHow to Add a Provider
- In
src/guard.ts, detect the provider's API call pattern in the Proxygettrap (method name + parent object path). - Add token extraction in
extractTokens()using the provider'susageresponse shape. - For nested clients (like Gemini's
getGenerativeModel), wrap returned objects in a secondary Proxy. - Add model pricing to
PRICINGinsrc/pricing.ts. - Add tests in
tests/guard.test.tswith a mocked client.
PR Rules
- All tests must pass (
npm test) before submitting - Keep changes focused — one feature or fix per PR
- Add tests for new behavior
- Do not change public API without discussion
- Follow existing TypeScript style (strict mode, no
anyunless unavoidable)
Roadmap
- [ ] Full
byModelandbyUserbreakdown ingetStats() - [ ] Preflight enforcement via
preflight: trueconfig flag - [ ] Dashboard / CLI for live spend monitoring
- [ ] AWS Bedrock and Azure OpenAI adapters
- [ ] Rate limiting (requests per minute, not just cost)
License
MIT — see LICENSE.
Author
Advik — github.com/advikhingmire12-oss
