tokenfence

v0.2.2

Published

2 months ago

Cost circuit breaker for AI agents — guard your OpenAI/Anthropic spend with automatic downgrade and kill switch.

nexusaib

openai anthropic cost budget ai llm guardrail agent tokenfence

TokenFence — Cost Circuit Breaker for AI Agents

Stop runaway AI agent costs with per-workflow budget caps, automatic model downgrade, and a hard kill switch.

Quick Start

npm install tokenfence

OpenAI

import { guard } from "tokenfence";
import OpenAI from "openai";

const client = guard(new OpenAI(), {
  budget: "$0.50",           // Max spend for this workflow
  fallback: "gpt-4o-mini",  // Auto-downgrade at 80% budget
  onLimit: "stop",           // Graceful stop at budget cap
});

// Use exactly like your normal OpenAI client
const res = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Analyze this data..." }],
});

// Check spend anytime
console.log(`Spent: $${client.tokenfence.spent.toFixed(4)}`);
console.log(`Remaining: $${client.tokenfence.remaining.toFixed(4)}`);

Anthropic

import { guard } from "tokenfence";
import Anthropic from "@anthropic-ai/sdk";

const client = guard(new Anthropic(), {
  budget: "$1.00",
  fallback: "claude-3-haiku-20240307",
  onLimit: "stop",
});

const res = await client.messages.create({
  model: "claude-sonnet-4-20250514",
  max_tokens: 1024,
  messages: [{ role: "user", content: "Summarize this document..." }],
});

How It Works

Budget Tracking — Every API call is metered using real model pricing
Auto-Downgrade — At 80% budget (configurable), switches to your fallback model
Kill Switch — At 100%, blocks further calls with a synthetic response
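
The three stages above can be sketched as two small functions. This is a minimal illustration, not TokenFence's actual source: the names `callCost` and `stageFor`, the `Usage` and `Price` shapes, and the inclusive comparisons at the 80% and 100% marks are all assumptions made for the example.

```typescript
// Hypothetical sketch of the three-stage policy; not the library's real code.
type Stage = "primary" | "fallback" | "blocked";

interface Usage {
  inputTokens: number;
  outputTokens: number;
}

// Prices in USD per one million tokens (illustrative values supplied by the caller).
interface Price {
  inputPerMTok: number;
  outputPerMTok: number;
}

// Budget tracking: convert one call's token counts into dollars.
function callCost(usage: Usage, price: Price): number {
  return (
    (usage.inputTokens / 1_000_000) * price.inputPerMTok +
    (usage.outputTokens / 1_000_000) * price.outputPerMTok
  );
}

// Auto-downgrade and kill switch: choose a stage from the spend so far.
// Assumes both boundaries are inclusive.
function stageFor(spent: number, budget: number, threshold = 0.8): Stage {
  if (spent >= budget) return "blocked"; // kill switch at 100%
  if (spent >= budget * threshold) return "fallback"; // downgrade at the threshold
  return "primary";
}
```

With a budget of 0.50 and the default threshold, `stageFor(0.10, 0.5)` yields "primary", `stageFor(0.45, 0.5)` yields "fallback", and `stageFor(0.5, 0.5)` yields "blocked".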

Options

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| budget | string \| number | required | Max spend ("$0.50" or 0.50) |
| fallback | string | undefined | Model to downgrade to |
| onLimit | "stop" \| "warn" \| "raise" | "stop" | Behaviour at budget cap |
| threshold | number | 0.8 | Budget fraction to trigger downgrade |
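
Since `budget` accepts either a dollar string or a plain number, both forms have to be normalised to a single numeric value somewhere. The helper below is a hypothetical sketch of one way to do that; `parseBudget` is not a documented TokenFence export, and the library's real parsing and validation rules may differ.

```typescript
// Hypothetical helper: normalise "$0.50" or 0.50 to a number of dollars.
function parseBudget(budget: string | number): number {
  const value =
    typeof budget === "number"
      ? budget
      : Number(budget.trim().replace(/^\$/, "")); // drop a leading "$" if present

  // Reject NaN, Infinity, zero, and negative amounts (an assumed rule).
  if (!Number.isFinite(value) || value <= 0) {
    throw new RangeError(`Invalid budget: ${String(budget)}`);
  }
  return value;
}
```

Under these assumptions, `parseBudget("$0.50")` and `parseBudget(0.5)` return the same value, while a non-numeric string such as "abc" throws a `RangeError`.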

onLimit Modes

"stop" — Returns a synthetic response (no API call). Your code keeps running.
"warn" — Logs a warning, allows the call through anyway.
"raise" — Throws BudgetExceeded error.
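
The difference between the three modes can be illustrated with a small dispatcher. This is a sketch under stated assumptions, not the library's implementation: `applyLimit`, `LimitOutcome`, the synthetic message text, and the local `BudgetExceeded` class are stand-ins defined here for the example.

```typescript
type OnLimit = "stop" | "warn" | "raise";

// Local stand-in for the error thrown in "raise" mode (shape is assumed).
class BudgetExceeded extends Error {
  readonly spent: number;
  readonly budget: number;

  constructor(spent: number, budget: number) {
    super(`Budget of $${budget.toFixed(2)} exceeded (spent $${spent.toFixed(4)})`);
    this.name = "BudgetExceeded";
    this.spent = spent;
    this.budget = budget;
  }
}

// What the wrapper does once the cap is reached (hypothetical shape).
type LimitOutcome =
  | { action: "synthetic"; content: string } // no API call is made
  | { action: "forward" }; // the real API call proceeds

function applyLimit(mode: OnLimit, spent: number, budget: number): LimitOutcome {
  switch (mode) {
    case "stop":
      // Return a placeholder response so the caller's code keeps running.
      return { action: "synthetic", content: "Budget limit reached." };
    case "warn":
      console.warn(`Over budget: $${spent.toFixed(4)} of $${budget.toFixed(2)}`);
      return { action: "forward" };
    case "raise":
      throw new BudgetExceeded(spent, budget);
  }
}
```

In practice, "raise" suits workflows where a caller wraps the request in `try`/`catch` and handles the overrun explicitly, while "stop" suits agent loops that should wind down without an exception.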

Supported Models

OpenAI (GPT-4o, GPT-4o-mini, GPT-4, o1, o3-mini, GPT-5.4), Anthropic (Claude Opus 4, Sonnet 4, 3.7, 3.5, Haiku), Google Gemini (2.5, 2.0, 1.5), DeepSeek, and more.

Free Tier & Pricing

The free Hobby tier includes 50K tracked requests/month. For production workloads:

| Tier | Requests | Price |
|------|----------|-------|
| Hobby | 50K/mo | Free |
| Pro | 500K/mo | $49/mo |
| Team | 2M/mo | $149/mo |

→ Upgrade to Pro at tokenfence.dev — 7-day free trial, no credit card required to start.

License

MIT
