agent-budget-guard
v1.0.0
LLM API budget enforcement with real-time cost tracking, automatic model downgrade, and spending limits
Wraps fetch calls to OpenAI, Anthropic, and Google APIs. Tracks cumulative token costs per session and enforces three configurable thresholds:
- WARN — fires a callback, continues
- THROTTLE — auto-downgrades the model (e.g., Opus to Sonnet)
- STOP — halts with a spending summary
Zero external runtime dependencies. Works as middleware (wraps fetch) or standalone (pass usage stats manually).
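The three thresholds can be read as a mapping from cumulative spend to a level. The following is a minimal sketch of that idea, not the package's internals: the names `Level`, `Thresholds`, and `classifySpend` are hypothetical, and the choice of `>=` for each comparison is an assumption.

```typescript
// Hypothetical types for illustration; not the package's exported API.
type Level = 'ok' | 'warn' | 'throttle' | 'stop';

interface Thresholds {
  warn: number;     // fraction of budget, e.g. 0.50
  throttle: number; // fraction of budget, e.g. 0.80
  stop: number;     // fraction of budget, e.g. 1.00
}

// Map cumulative spend to the highest threshold reached.
// Assumption: a threshold counts as reached when spend is >= its fraction.
function classifySpend(spent: number, budget: number, t: Thresholds): Level {
  const fraction = budget > 0 ? spent / budget : 1;
  if (fraction >= t.stop) return 'stop';
  if (fraction >= t.throttle) return 'throttle';
  if (fraction >= t.warn) return 'warn';
  return 'ok';
}

const thresholds: Thresholds = { warn: 0.5, throttle: 0.8, stop: 1.0 };
console.log(classifySpend(2.0, 5.0, thresholds));  // 'ok' (40% of budget)
console.log(classifySpend(4.25, 5.0, thresholds)); // 'throttle' (85% of budget)
```

Checking the highest threshold first keeps the function a plain cascade, so a single large call that jumps from below WARN to above STOP still resolves to `stop`.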
Install
npm install agent-budget-guard
Quick Start
import { BudgetGuard } from 'agent-budget-guard';

const guard = new BudgetGuard({
  budget: 5.00, // $5.00 max
  thresholds: {
    warn: 0.50, // warn at 50%
    throttle: 0.80, // downgrade at 80%
    stop: 1.00, // halt at 100%
  },
  throttleModel: 'claude-sonnet-4-6',
  onWarn: (stats) => console.log('Budget warning:', stats),
  onStop: (stats) => { throw new Error('Budget exceeded') },
});

// Wrap your fetch calls
const response = await guard.fetch('https://api.anthropic.com/v1/messages', {
  method: 'POST',
  headers: {
    'content-type': 'application/json',
    'x-api-key': process.env.ANTHROPIC_API_KEY,
    'anthropic-version': '2023-06-01',
  },
  body: JSON.stringify({
    model: 'claude-opus-4-6',
    max_tokens: 1024,
    messages: [{ role: 'user', content: 'Hello' }],
  }),
});
API
new BudgetGuard(options)
| Option | Type | Default | Description |
|--------|------|---------|-------------|
| budget | number | required | Maximum spend in dollars |
| thresholds.warn | number | 0.50 | Fraction of budget to trigger warn |
| thresholds.throttle | number | 0.80 | Fraction to trigger model downgrade |
| thresholds.stop | number | 1.00 | Fraction to halt all calls |
| throttleModel | string | — | Model to downgrade to at throttle |
| pricing | Partial<PricingTable> | — | Custom per-model pricing overrides |
| onWarn | (stats) => void | — | Callback on warn threshold |
| onThrottle | (stats) => void | — | Callback on throttle threshold |
| onStop | (stats) => void | — | Callback on stop threshold |
| cocChainFile | string | — | Path to CoC JSONL file for provenance |
| sessionId | string | auto | Session identifier for CoC entries |
guard.fetch(url, init?)
Drop-in replacement for fetch. Auto-detects provider from URL, parses usage from response, calculates cost. Downgrades model at throttle threshold. Throws BudgetExceededError at stop.
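To make the "auto-detects provider, parses usage" step concrete, here is a rough sketch of how a wrapper could do it. The function names `detectProvider` and `extractUsage` are hypothetical and the package may work differently. The hostnames and usage field names are those of the providers' public non-streaming APIs (Anthropic Messages, OpenAI Chat Completions, Gemini generateContent).

```typescript
type Provider = 'openai' | 'anthropic' | 'google';

interface Usage {
  inputTokens: number;
  outputTokens: number;
}

// Hypothetical helper: pick a provider from the request hostname.
function detectProvider(url: string): Provider | undefined {
  const host = new URL(url).hostname;
  if (host === 'api.openai.com') return 'openai';
  if (host === 'api.anthropic.com') return 'anthropic';
  if (host === 'generativelanguage.googleapis.com') return 'google';
  return undefined; // unknown host: nothing to track
}

// Hypothetical helper: normalize each provider's usage block to one shape.
// Missing fields fall back to 0 rather than throwing.
function extractUsage(provider: Provider, body: any): Usage {
  switch (provider) {
    case 'anthropic':
      return {
        inputTokens: body?.usage?.input_tokens ?? 0,
        outputTokens: body?.usage?.output_tokens ?? 0,
      };
    case 'openai':
      return {
        inputTokens: body?.usage?.prompt_tokens ?? 0,
        outputTokens: body?.usage?.completion_tokens ?? 0,
      };
    case 'google':
      return {
        inputTokens: body?.usageMetadata?.promptTokenCount ?? 0,
        outputTokens: body?.usageMetadata?.candidatesTokenCount ?? 0,
      };
  }
}

const provider = detectProvider('https://api.anthropic.com/v1/messages');
if (provider) {
  const usage = extractUsage(provider, { usage: { input_tokens: 1500, output_tokens: 800 } });
  console.log(provider, usage);
}
```

A real wrapper would read the body from a cloned response (`response.clone().json()`) so the caller can still consume the original.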
guard.wrap()
Returns a bound fetch function: const myFetch = guard.wrap();
guard.recordUsage(options)
Manual usage recording for streaming responses or non-fetch integrations:
guard.recordUsage({
  inputTokens: 1500,
  outputTokens: 800,
  model: 'claude-opus-4-6',
  provider: 'anthropic',
});
guard.getStats()
Returns current UsageStats: total tokens, cost, budget percent, per-model breakdown, threshold events.
guard.isStopped() / guard.reset()
Check if halted. Reset all state for a new session.
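For the streaming case mentioned under guard.recordUsage, the token totals have to be collected from the stream before they can be recorded. The sketch below assumes Anthropic's streaming event shapes, where `message_start` carries the input token count and `message_delta` carries a cumulative output token count. The `StreamEvent` type and `usageFromEvents` helper are hypothetical, not part of this package.

```typescript
// Minimal shape of the events this sketch cares about (an assumption for
// illustration; real streams contain more event types and fields).
interface StreamEvent {
  type: string;
  message?: { usage?: { input_tokens?: number; output_tokens?: number } };
  usage?: { output_tokens?: number };
}

// Fold a sequence of parsed stream events into final token totals.
function usageFromEvents(events: StreamEvent[]): { inputTokens: number; outputTokens: number } {
  let inputTokens = 0;
  let outputTokens = 0;
  for (const ev of events) {
    if (ev.type === 'message_start') {
      inputTokens = ev.message?.usage?.input_tokens ?? inputTokens;
      outputTokens = ev.message?.usage?.output_tokens ?? outputTokens;
    } else if (ev.type === 'message_delta') {
      // The count on message_delta is cumulative, so overwrite instead of adding.
      outputTokens = ev.usage?.output_tokens ?? outputTokens;
    }
  }
  return { inputTokens, outputTokens };
}

const totals = usageFromEvents([
  { type: 'message_start', message: { usage: { input_tokens: 1500, output_tokens: 1 } } },
  { type: 'content_block_delta' },
  { type: 'message_delta', usage: { output_tokens: 800 } },
]);
console.log(totals); // inputTokens 1500, outputTokens 800

// The totals could then be handed to the guard, for example:
// guard.recordUsage({ ...totals, model: 'claude-opus-4-6', provider: 'anthropic' });
```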
CLI
# Summarize spending from CoC entries
agent-budget-guard report chain.jsonl
Built-in Pricing ($/MTok)
| Provider | Model | Input | Output |
|----------|-------|-------|--------|
| Anthropic | claude-opus-4-7 | $15 | $75 |
| Anthropic | claude-sonnet-4-6 | $3 | $15 |
| Anthropic | claude-haiku-4-5 | $1 | $5 |
| OpenAI | gpt-4 | $30 | $60 |
| OpenAI | gpt-4o | $5 | $15 |
| OpenAI | gpt-3.5-turbo | $0.50 | $1.50 |
| Google | gemini-pro | $7 | $21 |
| Google | gemini-1.5-flash | $0.075 | $0.30 |
Override with the pricing option. Unknown models are tracked (tokens counted) but at zero cost.
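Since prices are quoted per million tokens, the cost of one call is `inputTokens / 1e6 * inputPrice + outputTokens / 1e6 * outputPrice`. The sketch below works through that arithmetic with a few rows copied from the table above. `Price`, `PRICES`, and `costUsd` are illustrative names, not the package's exports, and the override lookup is an assumption about how the pricing option behaves.

```typescript
// Dollars per million tokens.
interface Price {
  input: number;
  output: number;
}

// A few rows from the built-in pricing table, for illustration only.
const PRICES: Record<string, Price> = {
  'claude-opus-4-7': { input: 15, output: 75 },
  'claude-sonnet-4-6': { input: 3, output: 15 },
  'gpt-4o': { input: 5, output: 15 },
};

// Cost of one call in dollars. Overrides win over built-in prices;
// unknown models cost zero, matching the behavior described above.
function costUsd(
  model: string,
  inputTokens: number,
  outputTokens: number,
  overrides: Record<string, Price> = {},
): number {
  const price = overrides[model] ?? PRICES[model];
  if (!price) return 0;
  return (inputTokens / 1e6) * price.input + (outputTokens / 1e6) * price.output;
}

// 1500 input + 800 output tokens on claude-opus-4-7:
// 0.0015 * 15 + 0.0008 * 75 = 0.0225 + 0.06, about $0.0825
console.log(costUsd('claude-opus-4-7', 1500, 800).toFixed(4));

// The same call on claude-sonnet-4-6: 0.0045 + 0.012, about $0.0165
console.log(costUsd('claude-sonnet-4-6', 1500, 800).toFixed(4));
```

At these rates the downgraded call costs one fifth of the original, which is the saving the THROTTLE threshold is meant to capture.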
CoC Integration
If @absupport/coc-writer is installed, threshold events are recorded as BUDGET_EVENT entries in the specified chain file. This is optional — the package works standalone.
Deploy
- npm run build: compiles TypeScript to CJS + ESM
- npm test: runs 39 tests via node --test
- npm publish: publishes to npm
License
Apache-2.0
