agent-cost-guard

Spending limits for AI agents. 3 lines of code. Zero dependencies.

The Problem

A LangChain agent loop cost $47K in November 2025 when four agents ping-ponged for 11 days before anyone noticed. A separate retry storm cost $47K in February 2026 after 2.3M API calls ran unchecked over a weekend. OpenAI removed dashboard budget caps -- client-side enforcement is the only option left.

The Solution

import OpenAI from "openai";
import { guard } from "agent-cost-guard";

const openai = guard(new OpenAI(), { budget: "$5.00" });
// All calls are now tracked and enforced

How It Works

  • Wraps your OpenAI SDK client via method-level interception (no HTTP proxies, no monkey-patching globals; see the sketch after this list)
  • Reads token usage from API responses -- ground truth, not estimation
  • Calculates cost with cached token discounts and reasoning token awareness
  • Enforces hard limits -- throws BudgetExceededError when budget is exceeded
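
For a sense of what method-level interception means in practice, here is a minimal conceptual sketch. It is not the library's source: it covers only one endpoint, uses made-up single-model pricing, and skips streaming, cached tokens, and reasoning tokens.

// Conceptual sketch only -- not the library's source. It shows the shape of
// method-level interception: replace one SDK method with a wrapper that forwards
// the call, reads usage from the response, and keeps a running total.
import OpenAI from "openai";

function wrapCreate(client, budgetDollars) {
  let spent = 0;
  const original = client.chat.completions.create.bind(client.chat.completions);

  // Made-up single-model pricing (dollars per 1M tokens), for illustration only.
  const PRICE = { input: 2.5, output: 10.0 };

  client.chat.completions.create = async (params, options) => {
    if (spent >= budgetDollars) {
      throw new Error(`Budget of $${budgetDollars} exceeded (spent $${spent.toFixed(4)})`);
    }
    const response = await original(params, options);
    if (response.usage) {
      spent +=
        (response.usage.prompt_tokens / 1_000_000) * PRICE.input +
        (response.usage.completion_tokens / 1_000_000) * PRICE.output;
    }
    return response;
  };

  return {
    get spent() {
      return spent;
    },
  };
}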

What's Supported

  • Chat Completions API (streaming + non-streaming)
  • Responses API (non-streaming)
  • Embeddings API
  • Cached token cost calculation (50-75% discount depending on model)
  • Reasoning token tracking (o3, o4-mini)

Installation

npm install agent-cost-guard

Requires openai SDK v4+ as a peer dependency. Node.js >= 18.
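
If the openai SDK isn't already in your project, install both in one step (a standard peer-dependency install, nothing package-specific assumed here):

npm install agent-cost-guard openai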

API Reference

guard(client, options)

Wraps an OpenAI client with budget tracking and enforcement.

import OpenAI from "openai";
import { guard } from "agent-cost-guard";

const openai = guard(new OpenAI(), {
  budget: "$5.00",          // Required. String ("$5.00") or number (5)
  warn: 0.8,                // Optional. Warning threshold as fraction (0-1). Default: 0.8
  onWarn: (info) => {       // Optional. Callback when warn threshold is reached
    console.log(`Warning: ${info.percent}% of budget used`);
  },
  onExceeded: "throw",      // Optional. "throw" (default) or "warn-only"
  pricing: {                 // Optional. Custom model pricing overrides (per 1M tokens)
    "my-fine-tune": { input: 3.00, output: 6.00, cachedInput: 1.50 },
  },
});

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| budget | string \| number | required | Budget in dollars. Accepts "$5.00" or 5. |
| warn | number | 0.8 | Fraction (0-1) of budget at which onWarn fires. |
| onWarn | (info: WarnInfo) => void | undefined | Called once when spending crosses the warn threshold. |
| onExceeded | "throw" \| "warn-only" | "throw" | Whether to throw or just warn when budget is exceeded. |
| pricing | Record<string, ModelPrice> | undefined | Custom pricing per model (per 1M tokens). |

client.$budget

After wrapping, the client exposes a $budget property with live budget status.

console.log(openai.$budget.spent);      // 0.0023 (dollars spent so far)
console.log(openai.$budget.remaining);  // 4.9977 (dollars remaining)
console.log(openai.$budget.calls);      // 1 (number of API calls tracked)
console.log(openai.$budget.budget);     // 5 (total budget in dollars)

openai.$budget.reset();                 // Reset counters to zero

| Property | Type | Description |
|----------|------|-------------|
| spent | number | Total dollars spent so far. |
| remaining | number | Remaining budget in dollars. |
| calls | number | Number of API calls tracked. |
| budget | number | Total budget in dollars. |
| reset() | () => void | Reset spent amount and call count to zero. |
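
A common pattern is to check the remaining budget before each iteration of an agent loop so you can stop cleanly instead of letting the next call throw mid-task. This sketch uses only the documented $budget fields; tasksRemain and runNextStep are hypothetical placeholders for your own loop.

while (tasksRemain()) {
  // Stop early instead of letting the next call throw mid-task.
  if (openai.$budget.remaining < 0.10) {
    console.warn("Budget nearly exhausted -- stopping the agent loop.");
    break;
  }
  await runNextStep(openai); // each step calls the wrapped client as usual
}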

BudgetExceededError

Thrown when the budget is exceeded (unless onExceeded is "warn-only").

import { BudgetExceededError } from "agent-cost-guard";

try {
  await openai.chat.completions.create({ model: "gpt-4o", messages });
} catch (err) {
  if (err instanceof BudgetExceededError) {
    console.log(err.spent);    // Total dollars spent
    console.log(err.budget);   // Budget limit in dollars
    console.log(err.lastCall); // { model, inputTokens, outputTokens, cachedTokens, reasoningTokens, cost }
  }
}

| Property | Type | Description |
|----------|------|-------------|
| spent | number | Total dollars spent when the error was thrown. |
| budget | number | The budget limit in dollars. |
| lastCall | LastCallInfo \| undefined | Details about the call that pushed spending over budget. |

With Streaming

stream_options: { include_usage: true } is injected automatically so you don't have to remember it.

const stream = await openai.chat.completions.create({
  model: "gpt-4o-mini",
  messages: [{ role: "user", content: "hello" }],
  stream: true,
});
// stream_options: { include_usage: true } is injected automatically
for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}
// Cost tracked after stream completes

With Responses API

const response = await openai.responses.create({
  model: "gpt-4.1",
  input: "Explain quantum computing",
});

With Embeddings

const embeddings = await openai.embeddings.create({
  model: "text-embedding-3-small",
  input: "hello world",
});

Supported Models

Built-in pricing for 11 OpenAI models (per 1M tokens):

| Model | Input | Output | Cached Input |
|-------|------:|-------:|-------------:|
| gpt-4o | $2.50 | $10.00 | $1.25 |
| gpt-4o-mini | $0.15 | $0.60 | $0.075 |
| gpt-4.1 | $2.00 | $8.00 | $0.50 |
| gpt-4.1-mini | $0.40 | $1.60 | $0.10 |
| gpt-4.1-nano | $0.10 | $0.40 | $0.025 |
| gpt-5 | $1.25 | $10.00 | $0.125 |
| gpt-5-mini | $0.25 | $2.00 | $0.025 |
| o3 | $2.00 | $8.00 | $0.50 |
| o3-mini | $1.10 | $4.40 | $0.55 |
| o4-mini | $1.10 | $4.40 | $0.275 |
| o1 | $15.00 | $60.00 | $7.50 |

Model names are matched by longest prefix, so gpt-4o-2024-11-20 automatically resolves to gpt-4o pricing.
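
Conceptually, this resolution works like a longest-prefix lookup over the pricing table. The sketch below is illustrative only, not the library's source:

// Illustrative only: pick the longest pricing key that prefixes the model name.
function resolvePricing(model, pricingTable) {
  const match = Object.keys(pricingTable)
    .filter((key) => model.startsWith(key))
    .sort((a, b) => b.length - a.length)[0];
  return match ? pricingTable[match] : undefined;
}

// "gpt-4o-2024-11-20" matches "gpt-4o"; "gpt-4o-mini-2024-07-18" matches the
// longer "gpt-4o-mini" key rather than "gpt-4o".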

Custom Pricing

Add pricing for any model -- fine-tunes, embedding models, or new releases:

const openai = guard(new OpenAI(), {
  budget: "$10.00",
  pricing: {
    "text-embedding-3-small": { input: 0.02, output: 0, cachedInput: 0 },
    "ft:gpt-4o-mini:my-org": { input: 0.30, output: 1.20, cachedInput: 0.15 },
  },
});

Custom pricing takes precedence over built-in pricing. For models found in neither, cost resolves to $0 and a warning is emitted rather than failing silently; check $budget.spent to confirm your calls are being priced.
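
One way to verify a model is actually being priced, using only the $budget API above (the fine-tune name here is a hypothetical placeholder for your own model):

const before = openai.$budget.spent;

await openai.chat.completions.create({
  model: "ft:gpt-4o-mini:my-org:custom:abc123", // placeholder for your own fine-tune
  messages: [{ role: "user", content: "ping" }],
});

if (openai.$budget.spent === before) {
  console.warn("No cost recorded for this model -- add a pricing entry for it.");
}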

FAQ

How does this compare to Langfuse/Helicone? They are observability platforms -- dashboards, traces, analytics. agent-cost-guard is a one-line enforcement layer that stops runaway spending in real time. They are complementary: use both.

How does this compare to LiteLLM Proxy? LiteLLM requires deploying and maintaining a gateway server. agent-cost-guard is npm install + 3 lines of code. No infrastructure.

How does this compare to agent-guard? agent-guard depends on tiktoken and uses HTTP interception. agent-cost-guard is zero-dependency and uses method-level wrapping -- lighter, simpler, and no token estimation drift.

What about provider dashboard limits? OpenAI's dashboard limits are org-level monthly caps. agent-cost-guard provides per-session, per-agent, real-time enforcement you can configure in code.
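
For example, each agent can get its own wrapped client and budget using the guard API above (the agent names are illustrative):

import OpenAI from "openai";
import { guard } from "agent-cost-guard";

const researcher = guard(new OpenAI(), { budget: "$2.00" });
const writer = guard(new OpenAI(), { budget: "$1.00" });

// Each wrapped client tracks and enforces its own budget independently.
console.log(researcher.$budget.remaining); // 2
console.log(writer.$budget.remaining);     // 1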

Roadmap

  • Anthropic SDK support
  • Gemini SDK support
  • MCP server for budget enforcement
  • Persistent cross-session budgets
  • Framework integrations (Vercel AI SDK, LangChain.js)

License

MIT