@usagetap/sdk (v0.8.0)
UsageTap SDK core client plus optional React helpers.
Server-only JavaScript/TypeScript client for UsageTap. The SDK helps you instrument call_begin → vendor call → call_end flows with built-in retries, idempotency helpers, and vendor adapters.
Module formats
@usagetap/sdk ships real dual ESM (.mjs) and CommonJS (.cjs) entrypoints. In ESM projects use import { UsageTapClient } from "@usagetap/sdk";. For CommonJS runtimes (including VS Code extensions) rely on const { UsageTapClient } = require("@usagetap/sdk");.
Optional adapters live behind subpath exports so their peer dependencies stay out of the core bundle:
- @usagetap/sdk/openai – OpenAI/OpenRouter helpers (wrapOpenAI, streamOpenAIRoute, etc.)
- @usagetap/sdk/express – Express middleware
- @usagetap/sdk/react – React chat hook
Install only the peer dependencies for the adapters you actually use.
Quick start
Install the peer dependency for your vendor (e.g. openai) and the UsageTap SDK in your server runtime.
npm install @usagetap/sdk openai

Create a UsageTap client, request entitlements, and choose the right model every time:
import OpenAI from "openai";
import { UsageTapClient } from "@usagetap/sdk";
const usageTap = new UsageTapClient({
apiKey: process.env.USAGETAP_API_KEY!,
baseUrl: process.env.USAGETAP_BASE_URL!,
});
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY! });
await usageTap.createCustomer({
customerId: "cust_123",
customerFriendlyName: "Acme AI",
customerEmail: "[email protected]",
});
function selectCapabilities(allowed: {
standard?: boolean;
premium?: boolean;
reasoningLevel?: "LOW" | "MEDIUM" | "HIGH" | null;
search?: boolean;
}) {
const tier = allowed.premium ? "premium" : "standard";
const model = tier === "premium" ? "gpt5" : "gpt5-mini";
const reasoningEffort = allowed.reasoningLevel === "HIGH"
? "high"
: allowed.reasoningLevel === "MEDIUM"
? "medium"
: allowed.reasoningLevel === "LOW"
? "low"
: undefined;
return {
model,
reasoning: reasoningEffort ? { effort: reasoningEffort } : undefined,
tools: allowed.search ? [{ type: "web_search" as const }] : undefined,
};
}
const completion = await usageTap.withUsage(
{
customerId: "cust_123",
feature: "chat.send",
requested: { standard: true, premium: true, search: true, reasoningLevel: "HIGH" },
},
async ({ begin, setUsage }) => {
const { model, reasoning, tools } = selectCapabilities(begin.data.allowed);
const response = await openai.responses.create({
model,
input: "Draft a welcome email for our Pro plan",
reasoning,
tools,
});
setUsage({
modelUsed: model,
inputTokens: response.usage?.input_tokens ?? response.usage?.prompt_tokens ?? 0,
responseTokens: response.usage?.output_tokens ?? response.usage?.completion_tokens ?? 0,
reasoningTokens: reasoning ? response.usage?.reasoning_tokens ?? 0 : 0,
searches: tools?.length ? response.usage?.web_search_queries ?? 0 : 0,
});
return response;
},
);
console.log(completion.output_text);

If you only need to toggle web search, keep the selected model and conditionally add the tool when UsageTap says it's allowed:
const response = await openai.responses.create({
model: "gpt5",
tools: begin.data.allowed.search ? [{ type: "web_search" }] : undefined,
input: "What was a positive news story from today?",
});

Prefer a zero-boilerplate integration? Keep scrolling: wrapOpenAI applies the same entitlement-aware defaults if you omit model from your request.
import { wrapOpenAI } from "@usagetap/sdk/openai";
const ai = wrapOpenAI(openai, usageTap, {
defaultContext: {
customerId: "cust_123",
feature: "chat.send",
requested: { standard: true, premium: true, search: true, reasoningLevel: "HIGH" },
},
});

Heads up:
UsageTapClient always negotiates the canonical UsageTap media type by sending Accept: application/vnd.usagetap.v1+json. Every response now uses the { result, data, correlationId } envelope exclusively, and the begin payload includes data.idempotency.key (always matching callId), per-meter snapshots, and subscription metadata. Set autoIdempotency: false (or pass your own idempotency) to skip the SDK's auto-generated key and rely on the server's deterministic fallback when retriable semantics are acceptable.
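For intuition, here is a minimal local model of that envelope and the check you would perform before trusting its data. The interface is a sketch inferred from the field names above; only result, data, and correlationId are taken from this README:

```typescript
// Minimal local model of the canonical envelope described above. Field names
// beyond result, data, and correlationId are not assumed; see the full
// call_begin example later in this README for a real payload.
interface UsageTapEnvelope<T> {
  result: { status: string; code: string; timestamp?: string };
  data: T;
  correlationId: string;
}

// Narrow on the ACCEPTED status before trusting `data`.
function isAccepted<T>(envelope: UsageTapEnvelope<T>): boolean {
  return envelope.result.status === "ACCEPTED";
}
```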
Streaming helpers
wrapOpenAI automatically instruments streaming responses. You can feed the wrapped stream directly into Next.js or an Express response using the exported helpers:
import { toNextResponse } from "@usagetap/sdk/openai";
export async function POST() {
const stream = await ai.chat.completions.create(
{
messages: [{ role: "user", content: "Stream it" }],
stream: true,
},
{
usageTap: {
requested: { standard: true, premium: true, search: true, reasoningLevel: "MEDIUM" },
},
},
);
return toNextResponse(stream, { mode: "text" });
}

wrapOpenAI inspects begin.data.vendorHints.preferredModel: premium entitlements resolve to gpt5, otherwise the wrapper falls back to gpt5-mini. Use the manual pattern shown earlier when you need to toggle reasoning effort or attach search tools based on the returned allowances.
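That resolution can be sketched as a pure function. This is an illustration of the documented behavior, not the wrapper's actual source; the VendorHints and AllowedTiers shapes are simplified for the example:

```typescript
// Simplified sketch of how wrapOpenAI picks a model when you omit `model`.
// Illustrative only; the real logic lives inside the wrapper.
interface VendorHints {
  preferredModel?: string;
}

interface AllowedTiers {
  premium?: boolean;
}

function resolveModel(allowed: AllowedTiers, hints: VendorHints): string {
  // A vendor hint wins outright; otherwise fall back by entitlement tier.
  if (hints.preferredModel) return hints.preferredModel;
  return allowed.premium ? "gpt5" : "gpt5-mini";
}
```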
Overriding usage context per request
You can override the UsageTap begin payload on a per-call basis via the usageTap option:
await ai.chat.completions.create(
{ messages },
{
usageTap: {
customerId: currentUser.id,
feature: "chat.assist",
tags: ["beta"],
requested: { standard: true, premium: true, search: true, reasoningLevel: "HIGH" },
},
},
);

The begin response for that call will promote premium plans to gpt5, fall back to gpt5-mini otherwise, and cap reasoning to the granted tier.
For streaming calls created with { stream: true }, UsageTap automatically calculates usage from the final OpenAI response (or falls back to estimates when available). The wrapped stream retains OpenAI-specific helpers like finalChatCompletion().
responses.create support
The wrapper also instruments openai.responses.create, applying vendor hints (preferred models, token limits) and collecting usage data the same way as chat completions.
OpenRouter support
wrapOpenAI works seamlessly with OpenRouter since it uses an OpenAI-compatible API. Just point the base URL to OpenRouter:
import OpenAI from "openai";
import { UsageTapClient } from "@usagetap/sdk";
import { wrapOpenAI } from "@usagetap/sdk/openai";
const usageTap = new UsageTapClient({
apiKey: process.env.USAGETAP_API_KEY!,
baseUrl: process.env.USAGETAP_BASE_URL!,
});
const openrouter = new OpenAI({
baseURL: "https://openrouter.ai/api/v1",
apiKey: process.env.OPENROUTER_API_KEY!,
});
const ai = wrapOpenAI(openrouter, usageTap, {
defaultContext: {
customerId: "cust_123",
feature: "chat.send",
requested: { standard: true, premium: true, search: true, reasoningLevel: "HIGH" },
},
});
const completion = await ai.chat.completions.create(
{
messages: [{ role: "user", content: "Hello from OpenRouter!" }],
},
{
usageTap: {
requested: { standard: true, premium: true, search: true, reasoningLevel: "MEDIUM" },
},
},
);

begin.data.models will surface the OpenRouter-specific identifiers the customer can use (for example, standard ⇒ gpt5-mini, premium ⇒ gpt5). Since wrapOpenAI honors those hints, you can omit model and let UsageTap keep the request aligned with the active entitlement.
Express middleware
For Express applications, use the withUsage middleware to attach UsageTap context to requests:
import express from "express";
import OpenAI from "openai";
import { UsageTapClient } from "@usagetap/sdk";
import { withUsage } from "@usagetap/sdk/express";
const app = express();
const usageTap = new UsageTapClient({
apiKey: process.env.USAGETAP_API_KEY!,
baseUrl: process.env.USAGETAP_BASE_URL!,
});
// Extract customer ID from your auth system
app.use(withUsage(usageTap, (req) => req.user.id));
app.post("/api/chat", async (req, res) => {
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY! });
const ai = req.usageTap!.openai(openai, {
feature: "chat.assistant",
requested: { standard: true, premium: true, search: true, reasoningLevel: "HIGH" },
});
const stream = await ai.chat.completions.create(
{
messages: req.body.messages,
stream: true,
},
{
usageTap: {
requested: { standard: true, premium: true, search: true, reasoningLevel: "HIGH" },
},
},
);
// Pipes stream to response and finalizes usage
req.usageTap!.pipeToResponse(stream, res);
});

With that context in place, premium calls receive gpt5 and everyone else falls back to gpt5-mini. To respect allowed.reasoningLevel or allowed.search, read the begin payload inside route handlers (see the manual withUsage example above) and shape the OpenAI request accordingly.
React hook for chat UIs
Build chat interfaces with automatic UsageTap tracking:
import { useChatWithUsage } from "@usagetap/sdk/react";
function ChatComponent({ userId }) {
const { messages, input, setInput, handleSubmit, isLoading } = useChatWithUsage({
api: "/api/chat",
customerId: userId,
feature: "chat.assistant",
});
return (
<div>
{messages.map((m) => (
<div key={m.id}>
<strong>{m.role}:</strong> {m.content}
</div>
))}
<form onSubmit={handleSubmit}>
<input
value={input}
onChange={(e) => setInput(e.target.value)}
disabled={isLoading}
/>
<button type="submit" disabled={isLoading}>
Send
</button>
</form>
</div>
);
}

The hook works with server routes that use the UsageTap SDK (see streamOpenAIRoute above).
wrapFetch: minimal integration
For the smallest possible integration, use wrapFetch to wrap the fetch function passed to the OpenAI SDK. This requires zero changes to your OpenAI code:
import OpenAI from "openai";
import { UsageTapClient, wrapFetch } from "@usagetap/sdk";
const usageTap = new UsageTapClient({
apiKey: process.env.USAGETAP_API_KEY!,
baseUrl: process.env.USAGETAP_BASE_URL!,
});
const wrappedFetch = wrapFetch(usageTap, {
defaultContext: {
customerId: "cust_123",
feature: "chat",
requested: { standard: true, premium: true, search: true, reasoningLevel: "MEDIUM" },
},
});
const openai = new OpenAI({
apiKey: process.env.OPENAI_API_KEY!,
fetch: wrappedFetch,
});
// Reuse the selectCapabilities helper shown above to map entitlements to models
// Pull the entitlements you cached after call_begin and pick the right tier
const { model } = selectCapabilities(session.entitlements.allowed);
const completion = await openai.chat.completions.create({
model,
messages: [{ role: "user", content: "Hello!" }],
});

wrapFetch detects OpenAI API endpoints, handles streaming and non-streaming responses, and automatically extracts usage data. Persist the begin.data.allowed blob wherever you store session context so every downstream openai call can resolve to gpt5 (premium) or gpt5-mini (standard). You can override context per-request using special headers:
await openai.chat.completions.create(
{ messages: [{ role: "user", content: "Hello!" }] },
{
headers: {
"x-usagetap-customer-id": currentUser.id,
"x-usagetap-feature": "chat.premium",
},
},
);

Unified /call endpoint (API-only)
Need a single round-trip without the SDK? The public REST API exposes POST /call, which wraps call_begin, an optional vendor invocation, and call_end into one atomic request. Supply your usual begin payload plus an optional vendor block containing the URL, headers, and body to execute. UsageTap merges usage metrics from the vendor response with any explicit overrides before finalizing the call.
async function getEntitlementsFor(customerId: string) {
// Call begin upfront or reuse a cached begin payload for this customer + feature
return sessionStore.read(customerId); // pseudo-code: use your own persistence layer
}
const entitlements = await getEntitlementsFor("cust_123"); // stash begin.data.allowed somewhere durable
const { model } = selectCapabilities(entitlements.allowed);
const response = await fetch(`${baseUrl}/call`, {
method: "POST",
headers: {
Authorization: `Bearer ${process.env.USAGETAP_API_KEY}`,
Accept: "application/vnd.usagetap.v1+json",
"Content-Type": "application/json",
},
body: JSON.stringify({
customerId: "cust_123",
requested: { standard: true, premium: true, search: true, reasoningLevel: "MEDIUM" },
feature: "chat.completions",
idempotency: crypto.randomUUID(),
vendor: {
url: "https://api.openai.com/v1/chat/completions",
method: "POST",
headers: {
Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
"Content-Type": "application/json",
},
body: {
model,
messages: [{ role: "user", content: "Hello" }],
},
responseType: "json",
},
usage: { modelUsed: model },
}),
});
const envelope = await response.json();
if (!response.ok || envelope.result.status !== "ACCEPTED") {
throw new Error(`UsageTap /call failed: ${envelope.result.code}`);
}
const { begin, end, vendor, endUsage } = envelope.data;

- When the vendor block is omitted, /call simply runs begin → end using the provided usage overrides.
- Non-2xx vendor responses still trigger call_end; the envelope returns CALL_VENDOR_WARNING alongside vendor error metadata.
- The canonical media type application/vnd.usagetap.v1+json is required; the SDK already sends this header automatically when you rely on UsageTapClient.
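The non-2xx case can be separated from full success with a small guard. CALL_VENDOR_WARNING comes from the contract described above; the success code used in the test is illustrative, since the exact success code for /call is not shown in this README:

```typescript
// Distinguish "usage finalized but vendor call failed" from full success.
// CALL_VENDOR_WARNING is the code documented above for non-2xx vendor responses.
interface CallResult {
  status: string;
  code: string;
}

function vendorCallSucceeded(result: CallResult): boolean {
  return result.status === "ACCEPTED" && result.code !== "CALL_VENDOR_WARNING";
}
```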
Exports
Key exports from @usagetap/sdk:
- UsageTapClient – minimal HTTP client for createCustomer, changePlan, incrementCustomMeter, call_begin, call_end, and checkUsage.
- createCustomer – idempotently ensure a customer subscription exists before starting a call.
- changePlan – switch a customer to a different usage plan with a configurable strategy (immediate reset, prorated, or scheduled).
- incrementCustomMeter – track custom usage metrics beyond standard LLM counters (agent actions, documents, API calls, etc.).
- checkUsage – lightweight method to query current usage status without creating a call session.
- wrapFetch – wraps a fetch function to automatically instrument OpenAI API calls (minimal integration).
- createIdempotencyKey – helper for generating UsageTap-compatible idempotency keys.
- Type definitions for canonical UsageTap request/response payloads.
Optional subpaths:
- @usagetap/sdk/openai – wrapOpenAI, createOpenAIAdapter, streamOpenAIRoute, toNextResponse, pipeToResponse, and related types.
- @usagetap/sdk/express – withUsage, withUsageMiddleware, and corresponding Express request types.
- @usagetap/sdk/react – useChatWithUsage and supporting types for building chat interfaces.
All helpers are designed for server runtimes. Use UsageTapClient with allowBrowser: true only for sandbox/test scenarios.
Ensure a customer subscription exists
Run createCustomer before you invoke call_begin (or higher-level helpers) to guarantee the customer has an active subscription. The endpoint is fully idempotent—repeat calls return the existing snapshot and set newCustomer: false:
const snapshot = await usageTap.createCustomer({
customerId: "cust_123",
customerFriendlyName: "Acme AI",
customerEmail: "[email protected]",
stripeCustomerId: "cus_123",
});
console.log("New customer?", snapshot.data.newCustomer);
console.log("Plan:", snapshot.data.plan);
console.log("Allowed entitlements:", snapshot.data.allowed);

This returns the same rich subscription snapshot surfaced by call_begin and checkUsage, making it safe to cache the response for onboarding flows. Pass idempotencyKey in CreateCustomerOptions when you need deterministic keys across services; otherwise the client auto-generates one by default. Both idempotencyKey (preferred) and idempotency (deprecated) are supported.
Change a customer's plan
Use changePlan to switch a customer to a different usage plan. You can control how the change is applied with the strategy option:
const result = await usageTap.changePlan({
customerId: "cust_123",
planId: "plan_premium_v2",
strategy: "IMMEDIATE_RESET", // or "IMMEDIATE_PRORATED" or "AT_NEXT_REPLENISH"
});
console.log("Plan changed:", result.data.success);
console.log("New subscription:", result.data.subscription);

Strategy options:

- IMMEDIATE_RESET: Switch plan immediately and reset all usage counters to zero
- IMMEDIATE_PRORATED: Switch plan immediately and prorate existing usage against new limits
- AT_NEXT_REPLENISH: Schedule the plan change for the next replenishment cycle (default)
The response includes the updated subscription details, including the new plan version, limits, and next replenishment timestamp. If strategy: "AT_NEXT_REPLENISH" is used, the subscription.pending field will indicate the scheduled plan change.
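If you schedule a change, you can detect it afterwards with a presence check. The README says subscription.pending indicates a scheduled change but does not document its shape, so this guard is an assumption that only tests for presence:

```typescript
// subscription.pending is documented above as indicating a scheduled plan
// change; its exact shape is not documented here, so only check presence.
function hasPendingPlanChange(subscription: { pending?: unknown }): boolean {
  return subscription.pending !== undefined && subscription.pending !== null;
}
```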
Check usage without creating a call
When you need to display current quota status, plan details, or remaining balances without tracking a vendor call, use checkUsage():
const usageStatus = await usageTap.checkUsage({ customerId: "cust_123" });
console.log("Meters:", usageStatus.data.meters);
console.log("Allowed:", usageStatus.data.allowed);
console.log("Plan:", usageStatus.data.plan);
console.log("Balances:", usageStatus.data.balances);

This returns the same rich usage snapshot as call_begin (meters, entitlements, subscription details, plan info, balances) but without creating a call record. Use this for dashboard widgets, pre-flight checks, or displaying quota status to users.
Increment custom meters
Custom meters allow you to track usage beyond standard LLM metrics—ideal for agent actions, document processing, API calls, or any custom usage you need to meter.
const result = await usageTap.incrementCustomMeter({
customerId: "cust_123",
meterSlot: "CUSTOM1", // or "CUSTOM2"
amount: 5,
feature: "agent_actions",
tags: ["workflow_automation"],
metadata: {
workflowId: "wf_abc123",
actionType: "email_send",
},
});
console.log("Event recorded:", result.data.eventId);
console.log("Remaining quota:", result.data.meter.remaining);
console.log("Blocked:", result.data.blocked);

Parameters:

- customerId (string, required): Customer identifier
- meterSlot ("CUSTOM1" | "CUSTOM2", required): Which custom meter to increment
- amount (number, required): Positive number to decrement from quota
- feature (string, optional): Feature identifier for tracking
- tags (string[], optional): Tags for categorization
- metadata (object, optional): Additional metadata
The method returns the updated meter snapshot showing remaining quota, limits, and usage. If the customer's plan has limitType: "BLOCK" and quota is exceeded, a UsageTapError is thrown with code USAGETAP_AUTH_ERROR.
Use cases:
// Track agent tool invocations
await usageTap.incrementCustomMeter({
customerId: "cust_123",
meterSlot: "CUSTOM1",
amount: 1,
feature: "agent.tool_call",
tags: ["web_search"],
});
// Track document processing (10 pages)
await usageTap.incrementCustomMeter({
customerId: "cust_456",
meterSlot: "CUSTOM2",
amount: 10,
feature: "document.ocr",
metadata: { documentId: "doc_789", pages: 10 },
});
// Track external API calls
await usageTap.incrementCustomMeter({
customerId: "cust_789",
meterSlot: "CUSTOM1",
amount: 1,
feature: "external_api.maps",
tags: ["geocoding"],
});

Important notes:

- Custom meters must be enabled in the customer's usage plan
- The amount decrements the remaining quota (like token usage)
- With BLOCK policy, exceeding quota throws an error
- With DOWNGRADE policy, usage continues but quota can go negative
- Unlimited meters don't track usage but still record events for analytics
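Under the DOWNGRADE policy you may still want to warn users once they have gone over. A sketch using the blocked flag and meter snapshot returned by incrementCustomMeter; the negative-remaining semantics are assumed from the notes above:

```typescript
// Decide whether to surface an overage warning after incrementing a meter.
// Uses data.blocked and data.meter.remaining as returned by incrementCustomMeter.
interface IncrementMeterData {
  blocked: boolean;
  meter: { remaining: number | null };
}

function shouldWarnOnOverage(data: IncrementMeterData): boolean {
  if (data.blocked) return true;
  // Under DOWNGRADE, quota can go negative instead of blocking.
  return data.meter.remaining !== null && data.meter.remaining < 0;
}
```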
Response envelope (canonical only)
UsageTap responds exclusively with the canonical { result, data, correlationId } envelope for every endpoint. The SDK automatically sends Accept: application/vnd.usagetap.v1+json, parses the envelope, and returns strongly typed data structures. Transitional raw payloads and the normalize* helpers have been removed—response.data already contains the canonical shape you should persist or render.
Example call_begin success
{
"result": {
"status": "ACCEPTED",
"code": "CALL_BEGIN_SUCCESS",
"timestamp": "2025-10-04T18:21:37.482Z"
},
"data": {
"callId": "call_123",
"startTime": "2025-10-04T18:21:37.482Z",
"policy": "DOWNGRADE",
"newCustomer": false,
"canceled": false,
"allowed": {
"standard": true,
"premium": true,
"audio": false,
"image": false,
"search": true,
"reasoningLevel": "MEDIUM"
},
"entitlementHints": {
"suggestedModelTier": "standard",
"reasoningLevel": "MEDIUM",
"policy": "DOWNGRADE",
"downgrade": {
"reason": "PREMIUM_QUOTA_EXHAUSTED",
"fallbackTier": "standard"
}
},
"meters": {
"standardCalls": {
"remaining": 12,
"limit": 20,
"used": 8,
"unlimited": false,
"ratio": 0.6
},
"premiumCalls": {
"remaining": null,
"limit": null,
"used": null,
"unlimited": true,
"ratio": null
},
"standardTokens": {
"remaining": 800,
"limit": 1000,
"used": 200,
"unlimited": false,
"ratio": 0.8
}
},
"remainingRatios": {
"standardCalls": 0.6,
"standardTokens": 0.8
},
"subscription": {
"id": "sub_123",
"usagePlanVersionId": "plan_2025_01",
"planName": "Pro",
"planVersion": "2025-01",
"limitType": "DOWNGRADE",
"reasoningLevel": "MEDIUM",
"lastReplenishedAt": "2025-10-04T00:00:00.000Z",
"nextReplenishAt": "2025-11-04T00:00:00.000Z",
"subscriptionVersion": 14
},
"models": {
"standard": ["gpt5-mini"],
"premium": ["gpt5"]
},
"idempotency": {
"key": "call_123",
"source": "derived"
}
},
"correlationId": "corr_abc123"
}

UsageTapClient exposes the normalized structure via UsageTapSuccessResponse<BeginCallResponseBody>. In addition to the flattened allowed map, the begin response now ships richer metadata:

- entitlementHints summarises the recommended model tier and downgrade rationale based on the active policy.
- meters is a per-counter snapshot including remaining quotas, total limits, usage to date, and convenience ratios.
- remainingRatios mirrors the same information in a compact map for quick lookups.
- subscription contains the active plan identity, versioning, and upcoming replenishment timestamps so you can render customer-facing UI without querying Dynamo yourself.
- models surfaces per-organization vendor hints (e.g. standard vs. premium model shortlists).
- idempotency reveals the actual key that was persisted (callId mirrors this value). When you omit idempotency in the request, the backend derives a deterministic hash from organization, customer, feature, and requested entitlements.
- plan and balances remain available alongside the core begin payload for backwards compatibility with earlier SDK versions.
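As one way to consume remainingRatios, a helper that finds the meter closest to exhaustion for a quota banner. The ratio semantics are taken from the sample payload above (remaining divided by limit, so lower means emptier); this helper itself is not part of the SDK:

```typescript
// Find the most exhausted meter from the compact remainingRatios map.
function tightestMeter(
  ratios: Record<string, number>,
): { name: string; ratio: number } | null {
  let tightest: { name: string; ratio: number } | null = null;
  for (const [name, ratio] of Object.entries(ratios)) {
    if (tightest === null || ratio < tightest.ratio) {
      tightest = { name, ratio };
    }
  }
  return tightest;
}
```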
Example call_end success
{
"result": {
"status": "ACCEPTED",
"code": "CALL_END_SUCCESS",
"timestamp": "2025-10-04T18:21:52.103Z"
},
"data": {
"callId": "call_123",
"costUSD": 0,
"metered": {
"tokens": 768,
"calls": 1,
"searches": 1
}
},
"correlationId": "corr_abc123"
}

metered is derived from the raw Dynamo deltas. Additional meters (audio seconds, reasoning tokens, balances) will populate in later phases without breaking the contract.
Premium detection and override
UsageTap automatically determines whether a call is premium based on the model's output token pricing:
- If the output token price exceeds $4.00 per million tokens, the call is classified as premium
- Otherwise, it's classified as standard
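The pricing rule above can be modeled as a one-line classifier. This is a sketch of the documented server-side behavior, not the actual implementation:

```typescript
// Model of the documented pricing rule: output pricing above $4.00 per
// million tokens classifies a call as premium; at or below, standard.
function classifyTier(outputPricePerMillionUSD: number): "premium" | "standard" {
  return outputPricePerMillionUSD > 4.0 ? "premium" : "standard";
}
```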
You can explicitly override this detection by passing isPremium in your call_end request:
await usageTap.endCall({
callId: begin.data.callId,
modelUsed: "custom-model-v2",
inputTokens: 100,
responseTokens: 200,
isPremium: true, // Explicitly mark this as a premium call
});This is useful when:
- You're using custom models that aren't in UsageTap's pricing database
- You want to enforce specific billing tiers regardless of pricing
- You're implementing your own tier classification logic
Raw fetch integrations
Prefer UsageTapClient whenever possible—it handles retries, headers, and idempotency for you. If you still need to work with fetch directly, remember to request the canonical media type and consume the envelope shape directly:
import type { BeginCallResponseBody, EndCallResponseBody } from "@usagetap/sdk";
const beginResponse = await fetch(`${baseUrl}/call_begin`, {
method: "POST",
headers: {
Authorization: `Bearer ${apiKey}`,
Accept: "application/vnd.usagetap.v1+json",
"Content-Type": "application/json",
},
body: JSON.stringify(payload),
}).then((r) => r.json());
if (beginResponse.result.status !== "ACCEPTED") {
throw new Error(`call_begin failed: ${beginResponse.result.code}`);
}
const begin = beginResponse.data as BeginCallResponseBody;
// ...later, when closing the call
const endResponse = await fetch(`${baseUrl}/call_end`, {
method: "POST",
headers: {
Authorization: `Bearer ${apiKey}`,
Accept: "application/vnd.usagetap.v1+json",
"Content-Type": "application/json",
},
body: JSON.stringify({ callId: begin.callId }),
}).then((r) => r.json());
if (endResponse.result.status !== "ACCEPTED") {
throw new Error(`call_end failed: ${endResponse.result.code}`);
}
const end = endResponse.data as EndCallResponseBody;

The canonical payloads (BeginCallResponseBody, EndCallResponseBody, etc.) now match the envelope exactly, keeping SDK and raw integrations aligned without extra helper utilities.
