# @costcanary/sdk
Track per-user LLM costs with minimal integration.
## Install

```bash
npm install @costcanary/sdk
```

## Auto-Capture (Recommended)
Two integration points: `init` + `middleware`. Every LLM call is automatically captured with per-user, per-feature attribution.
```js
const { init, middleware } = require("@costcanary/sdk");

// 1. Entry point — before any LLM calls
init({ apiKey: "cck_your_api_key", autoCapture: true });

// 2. Express/Fastify middleware — sets user context per request
app.use(
  middleware((req) => ({
    userId: req.user.id,
    feature: req.path,
  })),
);
```

That's it. Every outbound call to OpenAI, Anthropic, or Google is intercepted at the HTTP layer, tokens extracted, cost calculated, and sent to your dashboard — with per-user, per-feature breakdown.
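For example, an existing route handler needs no changes. Assuming an `openai` client is already constructed, the call below is attributed to `req.user.id` under the feature `/chat`:

```js
app.post("/chat", async (req, res) => {
  // No CostCanary-specific code here: attribution comes from the middleware
  // above plus HTTP-level interception of the outbound OpenAI request.
  const completion = await openai.chat.completions.create({
    model: "gpt-4o-mini",
    messages: [{ role: "user", content: req.body.message }],
  });
  res.json(completion.choices[0].message);
});
```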
### Next.js / Cron jobs / Queue workers
For contexts without HTTP middleware, use `withContext`:
```js
const { withContext } = require("@costcanary/sdk");

await withContext({ userId: "system", feature: "daily-digest" }, async () => {
  await openai.chat.completions.create({ ... }); // auto-tracked
});
```

## How it works
- **Fetch interception** — patches `globalThis.fetch` to detect outbound calls to known LLM API hosts (sketched after this list)
- **AsyncLocalStorage** — propagates userId/feature through the async call chain (same technique as Datadog, Sentry, OpenTelemetry)
- **Zero dependencies** — uses only Node.js built-in APIs
- **Non-invasive** — never modifies your AI call sites
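Conceptually, the capture path fits in a few lines of Node.js. This is an illustrative sketch of the technique, not the SDK's actual source: the host list is abbreviated and `console.log` stands in for dashboard ingestion.

```js
const { AsyncLocalStorage } = require("node:async_hooks");

const contextStore = new AsyncLocalStorage();
const LLM_HOSTS = ["api.openai.com", "api.anthropic.com"]; // abbreviated

const originalFetch = globalThis.fetch;
globalThis.fetch = async (input, init) => {
  const response = await originalFetch(input, init);
  const url = input instanceof Request ? input.url : String(input);
  if (LLM_HOSTS.some((host) => url.includes(host))) {
    const ctx = contextStore.getStore(); // { userId, feature } or undefined
    // Clone before reading so the caller's response body is not consumed
    const body = await response.clone().json().catch(() => null);
    console.log("captured", { url, usage: body?.usage, ...ctx });
  }
  return response;
};

// withContext is then a thin wrapper around AsyncLocalStorage.run
const withContext = (ctx, fn) => contextStore.run(ctx, fn);
```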
## Manual tracking — `track()`
For granular control, wrap individual calls:
```js
const { init, track } = require("@costcanary/sdk");

init({ apiKey: "cck_your_api_key" });

const result = await track({
  userId: "user_123",
  feature: "chat",
  fn: () => openai.chat.completions.create({ ... }),
});
```

## Supported Models
| Provider    | Models                                                                                                     |
| ----------- | ---------------------------------------------------------------------------------------------------------- |
| OpenAI      | gpt-4o, gpt-4o-mini, gpt-4.1, gpt-4.1-mini, gpt-4.1-nano, gpt-4-turbo, gpt-3.5-turbo, o3, o3-mini, o4-mini |
| Anthropic   | claude-3-5-sonnet, claude-3-7-sonnet, claude-3-haiku, claude-3-5-haiku, claude-sonnet-4, claude-opus-4     |
| Google      | gemini-1.5-pro, gemini-1.5-flash, gemini-2.0-flash, gemini-2.5-flash, gemini-2.5-pro                       |
| Groq        | llama-3.1-70b-versatile, llama-3.1-8b-instant, mixtral-8x7b-32768, gemma2-9b-it                            |
| Perplexity  | sonar, sonar-pro, sonar-reasoning, sonar-reasoning-pro                                                     |
| Cohere      | command, command-r, command-r-plus, command-a                                                              |
| Mistral     | mistral-small, mistral-large, codestral, open-mixtral-8x7b, open-mistral-7b                                |
| Together AI | llama-3-70b-chat-hf                                                                                        |
| Fireworks   | llama-v3p1-70b-instruct, llama-v3p1-8b-instruct                                                            |
Unknown models fall back to a default rate and still appear in your dashboard.
## Config
| Option        | Default                                                | Description                         |
| ------------- | ------------------------------------------------------ | ----------------------------------- |
| `apiKey`      | *required*                                             | Your CostCanary API key             |
| `endpoint`    | `https://costcanary.saaspilot.workers.dev/api/ingest`  | Override for self-hosted            |
| `autoCapture` | `false`                                                | Enable automatic HTTP-level capture |
| `debug`       | `false`                                                | Log captured calls to console       |
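Putting the options together (the endpoint URL below is a placeholder for a self-hosted deployment):

```js
const { init } = require("@costcanary/sdk");

init({
  apiKey: "cck_your_api_key",                            // required
  endpoint: "https://your-host.example.com/api/ingest",  // self-hosted override (placeholder URL)
  autoCapture: true,                                     // intercept LLM calls automatically
  debug: true,                                           // log captured calls to console
});
```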
## Environment Support
| Environment        | autoCapture   | middleware | withContext |
| ------------------ | ------------- | ---------- | ----------- |
| Node.js 18+        | ✅ Full       | ✅         | ✅          |
| Bun                | ✅ Full       | ✅         | ✅          |
| Cloudflare Workers | ✅ Fetch-only | ❌         | ✅          |
| Vercel Edge        | ✅ Fetch-only | ❌         | ⚠️          |
| Deno               | ✅            | ❌         | ✅          |
| Browser            | ✅            | ❌         | ❌          |
## Limitations
- **Cloudflare Workers / Vercel Edge:** `AsyncLocalStorage` is unavailable. Provide context via `contextProvider` in `init()` or use `withContext()` explicitly (see the sketch after this list).
- **Browser:** CORS restrictions may prevent capturing calls to some LLM providers. Use `wrapClient()` as an alternative.
- **`http.request` interception** is only available in Node.js/Bun environments. Cloudflare Workers and browsers use fetch-only mode.
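A minimal sketch for a Cloudflare Worker, where the Express-style middleware is unavailable but `withContext` works per the environment table above. Workers use ES modules, hence `import`; the `x-user-id` header and the handler split are assumptions for illustration:

```js
import { init, withContext } from "@costcanary/sdk";

init({ apiKey: "cck_your_api_key", autoCapture: true });

export default {
  async fetch(request) {
    const ctx = {
      userId: request.headers.get("x-user-id") ?? "anonymous", // placeholder header
      feature: new URL(request.url).pathname,
    };
    // LLM calls made inside handleRequest run in fetch-only capture mode
    // and are attributed to ctx.
    return withContext(ctx, () => handleRequest(request));
  },
};

async function handleRequest(request) {
  // ... existing LLM calls here are auto-tracked ...
  return new Response("ok");
}
```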
