tokenmeter


OpenTelemetry-native cost tracking for AI workflows. Track real USD costs per user, workflow, and provider with zero code changes.

Why tokenmeter?

AI costs are hard to track. Tokens flow through multiple providers, streaming responses don't report usage upfront, and attributing costs to users or workflows requires custom instrumentation everywhere.

tokenmeter solves this by:

  • Wrapping AI clients transparently - monitor(client) returns the same type, no code changes needed
  • Calculating costs automatically - Uses up-to-date pricing for OpenAI, Anthropic, Google, fal.ai, ElevenLabs
  • Propagating context - withAttributes() attaches user/org/workflow IDs to all nested AI calls
  • Integrating with OTel - Export to Datadog, Jaeger, Honeycomb, or persist to PostgreSQL

Installation

npm install tokenmeter @opentelemetry/api @opentelemetry/sdk-trace-node

Quick Start

import OpenAI from 'openai';
import { NodeTracerProvider } from '@opentelemetry/sdk-trace-node';
import { SimpleSpanProcessor, ConsoleSpanExporter } from '@opentelemetry/sdk-trace-base';
import { monitor, withAttributes } from 'tokenmeter';

// 1. Set up OpenTelemetry
const provider = new NodeTracerProvider();
provider.addSpanProcessor(new SimpleSpanProcessor(new ConsoleSpanExporter()));
provider.register();

// 2. Wrap your AI client (this is what adds cost tracking!)
const openai = monitor(new OpenAI());

// 3. Track costs with context
await withAttributes({ 'user.id': 'user_123' }, async () => {
  const response = await openai.chat.completions.create({
    model: 'gpt-4o',
    messages: [{ role: 'user', content: 'Hello!' }],
  });
  console.log(response.choices[0].message.content);
});

// Spans now include: tokenmeter.cost_usd, tokenmeter.provider, tokenmeter.model, user.id

Supported Providers

| Provider | Models | Pricing Unit |
|----------|--------|--------------|
| OpenAI | GPT-4o, GPT-4-turbo, o1, o3, GPT-3.5, embeddings, DALL-E, Whisper | per 1M tokens |
| Anthropic | Claude 4, Claude 3.5, Claude 3 | per 1M tokens |
| Google | Gemini 2.0, Gemini 1.5 | per 1M tokens |
| fal.ai | 900+ models (Flux, SDXL, Kling, Runway, etc.) | per request/megapixel/second |
| ElevenLabs | All TTS models | per 1K characters |

Core Concepts

monitor(client)

Wraps any supported AI client with a Proxy that intercepts API calls, extracts usage data, and creates OpenTelemetry spans with cost attributes.

import OpenAI from 'openai';
import Anthropic from '@anthropic-ai/sdk';
import { fal } from '@fal-ai/client';
import { monitor } from 'tokenmeter';

const openai = monitor(new OpenAI());
const anthropic = monitor(new Anthropic());
const trackedFal = monitor(fal);

// Types are fully preserved - no changes to your code
const response = await openai.chat.completions.create({...});

withAttributes(attrs, fn)

Sets context attributes inherited by all AI calls within the callback. Uses OpenTelemetry Baggage for propagation.

import { withAttributes } from 'tokenmeter';

await withAttributes({ 'user.id': 'user_123', 'org.id': 'acme' }, async () => {
  // All AI calls here are tagged with user.id and org.id
  await openai.chat.completions.create({...});
  await anthropic.messages.create({...});
});

// Nesting merges attributes
await withAttributes({ 'org.id': 'acme' }, async () => {
  await withAttributes({ 'user.id': 'user_123' }, async () => {
    // Has both org.id and user.id
  });
});
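
Because propagation uses standard OTel Baggage, the active attributes can be read anywhere with the OpenTelemetry API. A minimal sketch, assuming tokenmeter stores each attribute as a baggage entry under the same key:

import { context, propagation } from '@opentelemetry/api';
import { withAttributes } from 'tokenmeter';

await withAttributes({ 'user.id': 'user_123' }, async () => {
  // Assumption: entries are stored under the same keys passed to withAttributes()
  const baggage = propagation.getBaggage(context.active());
  console.log(baggage?.getEntry('user.id')?.value); // 'user_123'
});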

Request-Level Cost Attribution

Get cost data immediately after each API call using hooks or the withCost utility.

Using Hooks

Configure beforeRequest, afterResponse, and onError hooks when creating the monitored client:

import { monitor } from 'tokenmeter';

const openai = monitor(new OpenAI(), {
  beforeRequest: (ctx) => {
    console.log(`Calling ${ctx.spanName}`);
    // Throw to abort the request (useful for rate limiting)
    if (isRateLimited()) throw new Error('Rate limited');
  },
  afterResponse: (ctx) => {
    console.log(`Cost: $${ctx.cost.toFixed(6)}`);
    console.log(`Tokens: ${ctx.usage?.inputUnits} in, ${ctx.usage?.outputUnits} out`);
    console.log(`Duration: ${ctx.durationMs}ms`);
    
    // Track costs in your system
    trackCost(ctx.usage, ctx.cost);
  },
  onError: (ctx) => {
    console.error(`Error in ${ctx.spanName}:`, ctx.error.message);
    alertOnError(ctx.error);
  },
});

Hooks are read-only—they observe but cannot modify request arguments.

Using withCost

For ad-hoc cost capture without configuring hooks:

import { monitor, withCost } from 'tokenmeter';

const openai = monitor(new OpenAI());

const { result, cost, usage } = await withCost(() =>
  openai.chat.completions.create({
    model: 'gpt-4o',
    messages: [{ role: 'user', content: 'Hello!' }],
  })
);

console.log(`Response: ${result.choices[0].message.content}`);
console.log(`Cost: $${cost.toFixed(6)}`);
console.log(`Tokens: ${usage?.inputUnits} in, ${usage?.outputUnits} out`);

Provider-Specific Types

Use type guards to access provider-specific usage data:

import { 
  withCost, 
  isOpenAIUsage, 
  isAnthropicUsage,
  type ProviderUsageData 
} from 'tokenmeter';

const { usage } = await withCost(() => openai.chat.completions.create({...}));

if (isOpenAIUsage(usage)) {
  console.log(`OpenAI tokens: ${usage.inputUnits} in, ${usage.outputUnits} out`);
  if (usage.totalTokens) console.log(`Total: ${usage.totalTokens}`);
}

if (isAnthropicUsage(usage)) {
  console.log(`Anthropic tokens: ${usage.inputUnits} in, ${usage.outputUnits} out`);
  if (usage.cacheCreationTokens) console.log(`Cache: ${usage.cacheCreationTokens}`);
}

Available type guards: isOpenAIUsage, isAnthropicUsage, isGoogleUsage, isBedrockUsage, isFalUsage, isElevenLabsUsage, isBFLUsage, isVercelAIUsage.

TokenMeterProcessor

An OpenTelemetry SpanProcessor for debugging and validating cost calculations. It logs calculated costs for spans that have usage data.

Note: The processor cannot add cost attributes to spans after they end (OpenTelemetry limitation). For production cost tracking, use monitor() which adds tokenmeter.cost_usd before the span ends.

import { NodeTracerProvider } from '@opentelemetry/sdk-trace-node';
import { TokenMeterProcessor, configureLogger } from 'tokenmeter';

// Enable debug logging to see calculated costs
configureLogger({ level: 'debug' });

const provider = new NodeTracerProvider();
provider.addSpanProcessor(new TokenMeterProcessor());
provider.register();

Span Attributes

tokenmeter adds these attributes to spans:

| Attribute | Type | Description |
|-----------|------|-------------|
| tokenmeter.cost_usd | number | Calculated cost in USD |
| tokenmeter.provider | string | Provider name |
| tokenmeter.model | string | Model identifier |
| gen_ai.usage.input_tokens | number | Input token count |
| gen_ai.usage.output_tokens | number | Output token count |

Plus any attributes set via withAttributes() (e.g., user.id, org.id, workflow.id).
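
To consume these attributes in your own pipeline, here is a minimal sketch of a custom SpanProcessor that logs the cost of each finished span. CostLoggingProcessor is a hypothetical name; it relies only on the attribute keys listed above and the standard OTel SpanProcessor interface:

import { Context } from '@opentelemetry/api';
import { Span, ReadableSpan, SpanProcessor } from '@opentelemetry/sdk-trace-base';

class CostLoggingProcessor implements SpanProcessor {
  onStart(_span: Span, _context: Context): void {}
  onEnd(span: ReadableSpan): void {
    const cost = span.attributes['tokenmeter.cost_usd'];
    if (typeof cost !== 'number') return; // skip spans without cost data
    const provider = span.attributes['tokenmeter.provider'];
    const model = span.attributes['tokenmeter.model'];
    const user = span.attributes['user.id'] ?? 'unknown';
    console.log(`${provider}/${model} user=${user} cost=$${cost.toFixed(6)}`);
  }
  shutdown(): Promise<void> { return Promise.resolve(); }
  forceFlush(): Promise<void> { return Promise.resolve(); }
}

provider.addSpanProcessor(new CostLoggingProcessor());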

Framework Integrations

Next.js App Router

import { withTokenmeter } from 'tokenmeter/next';

async function handler(request: Request) {
  const response = await openai.chat.completions.create({...});
  return Response.json({ message: response.choices[0].message.content });
}

export const POST = withTokenmeter(handler, (request) => ({
  userId: request.headers.get('x-user-id') || undefined,
}));

Inngest

import { withInngest, getInngestTraceHeaders } from 'tokenmeter/inngest';

// Send events with trace context
await inngest.send({
  name: 'ai/generate',
  data: { prompt: '...' },
  ...getInngestTraceHeaders(),
});

// Restore context in function
export const generateFn = inngest.createFunction(
  { id: 'generate' },
  { event: 'ai/generate' },
  withInngest(async ({ event }) => {
    await openai.chat.completions.create({...}); // Linked to original trace
  })
);

Vercel AI SDK

For non-invasive integration with the Vercel AI SDK using experimental_telemetry:

import { generateText } from 'ai';
import { openai } from '@ai-sdk/openai';
import { telemetry } from 'tokenmeter/vercel-ai';

const { text } = await generateText({
  model: openai('gpt-4o'),
  prompt: 'Hello!',
  experimental_telemetry: telemetry({
    userId: 'user_123',
    orgId: 'org_456',
  }),
});

PostgreSQL Persistence

Store costs for querying and billing.

import { BatchSpanProcessor } from '@opentelemetry/sdk-trace-base';
import { PostgresExporter } from 'tokenmeter/exporter';
import { createQueryClient } from 'tokenmeter/client';

// Export spans to PostgreSQL
provider.addSpanProcessor(new BatchSpanProcessor(
  new PostgresExporter({ connectionString: process.env.DATABASE_URL })
));

// Query costs
const client = createQueryClient({ connectionString: process.env.DATABASE_URL });

const { totalCost } = await client.getCostByUser('user_123', {
  from: new Date('2024-01-01'),
  to: new Date('2024-01-31'),
});

const byModel = await client.getCosts({ groupBy: ['model'] });

See DATABASE_SETUP.md for schema and setup instructions.

Streaming Support

tokenmeter handles streaming responses automatically.

// OpenAI - requires stream_options for usage
const stream = await openai.chat.completions.create({
  model: 'gpt-4o',
  messages: [{...}],
  stream: true,
  stream_options: { include_usage: true },
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || '');
}
// Cost calculated when stream completes

// Anthropic streaming works out of the box
const anthropicStream = anthropic.messages.stream({...});
for await (const event of anthropicStream) {...}
const finalMessage = await anthropicStream.finalMessage();

Cross-Service Propagation

For distributed systems, propagate trace context across service boundaries.

import { extractTraceHeaders, withExtractedContext } from 'tokenmeter';

// Service A: Extract headers
const headers = extractTraceHeaders();
await fetch('https://service-b.example.com', { headers });

// Service B: Restore context
await withExtractedContext(req.headers, async () => {
  await openai.chat.completions.create({...}); // Part of Service A's trace
});

Pricing Configuration

tokenmeter fetches pricing from a remote manifest with local fallback.

import { configurePricing, loadManifest } from 'tokenmeter';

// Use offline mode (bundled pricing only)
configurePricing({ offlineMode: true });

// Custom pricing API
configurePricing({ apiUrl: 'https://your-api.com/pricing' });

// Force refresh
await loadManifest({ forceRefresh: true });

Export Destinations

Datadog

import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-http';

// Datadog's native v0.4 intake does not accept OTLP; route spans through a
// local Datadog Agent (or OTel Collector) with OTLP ingest enabled instead.
provider.addSpanProcessor(new BatchSpanProcessor(
  new OTLPTraceExporter({ url: 'http://localhost:4318/v1/traces' })
));

Honeycomb

provider.addSpanProcessor(new BatchSpanProcessor(
  new OTLPTraceExporter({
    url: 'https://api.honeycomb.io/v1/traces',
    headers: { 'x-honeycomb-team': process.env.HONEYCOMB_API_KEY },
  })
));

Jaeger

provider.addSpanProcessor(new BatchSpanProcessor(
  new OTLPTraceExporter({ url: 'http://localhost:4318/v1/traces' })
));

API Reference

Core

| Export | Description |
|--------|-------------|
| monitor(client, options?) | Wrap AI client with cost tracking |
| withAttributes(attrs, fn) | Set context attributes for nested calls |
| withCost(fn) | Capture cost from API calls in the function |
| extractTraceHeaders() | Get W3C trace headers for propagation |
| withExtractedContext(headers, fn) | Restore context from headers |

Monitor Options

| Option | Type | Description |
|--------|------|-------------|
| name | string | Custom name for span naming |
| provider | string | Override provider detection |
| attributes | Attributes | Custom attributes for all spans |
| beforeRequest | (ctx) => void | Hook called before each API call |
| afterResponse | (ctx) => void | Hook called after successful response |
| onError | (ctx) => void | Hook called on errors |
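
A hedged sketch combining the non-hook options, assuming the shapes given in the table:

import OpenAI from 'openai';
import { monitor } from 'tokenmeter';

const openai = monitor(new OpenAI(), {
  name: 'billing-service',                    // custom span naming
  attributes: { 'service.tier': 'premium' },  // added to every span from this client
});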

Type Guards

| Export | Description |
|--------|-------------|
| isOpenAIUsage(usage) | Check if usage is from OpenAI |
| isAnthropicUsage(usage) | Check if usage is from Anthropic |
| isGoogleUsage(usage) | Check if usage is from Google |
| isBedrockUsage(usage) | Check if usage is from AWS Bedrock |
| isFalUsage(usage) | Check if usage is from fal.ai |
| isElevenLabsUsage(usage) | Check if usage is from ElevenLabs |
| isBFLUsage(usage) | Check if usage is from Black Forest Labs |
| isVercelAIUsage(usage) | Check if usage is from Vercel AI SDK |

Processor & Exporter

| Export | Description |
|--------|-------------|
| TokenMeterProcessor | OTel SpanProcessor for cost calculation |
| PostgresExporter | OTel SpanExporter for PostgreSQL |

Query Client (tokenmeter/client)

| Method | Description |
|--------|-------------|
| createQueryClient(config) | Create query client |
| client.getCosts(options) | Query with filters and grouping |
| client.getCostByUser(userId, options?) | Get user costs |
| client.getCostByOrg(orgId, options?) | Get organization costs |
| client.getWorkflowCost(workflowId) | Get workflow costs |

Integrations

| Export | Description |
|--------|-------------|
| withTokenmeter (tokenmeter/next) | Next.js App Router wrapper |
| withInngest (tokenmeter/inngest) | Inngest function wrapper |
| getInngestTraceHeaders (tokenmeter/inngest) | Get headers for Inngest events |
| telemetry (tokenmeter/vercel-ai) | Vercel AI SDK telemetry settings |

Questions Engineers Ask

"What's the integration effort?"

One line per client. Wrap with monitor(), and you're done.

// Before
const openai = new OpenAI();

// After  
const openai = monitor(new OpenAI());

No changes to your API calls, no middleware, no schema migrations. TypeScript types are fully preserved—autocomplete works exactly as before.

"What's the performance overhead?"

Near-zero. The hot path is:

  1. A JavaScript Proxy intercepts the method call
  2. An OTel span is created (microseconds)
  3. Cost lookup happens synchronously from bundled pricing data

No network calls block your AI requests. Pricing manifest refresh happens in the background.

"What happens if something fails?"

Graceful degradation everywhere:

| Scenario | Behavior |
|----------|----------|
| Pricing data unavailable | Uses bundled fallback (works offline) |
| Unknown model | Logs warning, cost_usd = 0, doesn't throw |
| OTel not configured | Spans are no-ops, your code still works |
| Stream interrupted | Partial cost still recorded |

"Does it work with streaming responses?"

Yes, automatically. tokenmeter wraps async iterators and calculates cost when the stream completes.

For OpenAI, add stream_options: { include_usage: true } to get token counts in streaming mode.

"How do I attribute costs to users/orgs?"

Wrap your request handler with withAttributes(). All nested AI calls inherit the context automatically via OpenTelemetry Baggage:

await withAttributes({ 'user.id': userId, 'org.id': orgId }, async () => {
  // Every AI call in here gets tagged with user.id and org.id
  await openai.chat.completions.create({...});
  await anthropic.messages.create({...});  // Also tagged
});

"What if my provider isn't supported?"

Use registerProvider() to add custom providers without forking:

import { registerProvider } from 'tokenmeter';

registerProvider({
  name: 'my-provider',
  detect: (client) => 'myMethod' in client,
  extractUsage: (response) => ({
    inputUnits: response.usage?.input,
    outputUnits: response.usage?.output,
  }),
  extractModel: (args) => args[0]?.model || 'default',
});

"How accurate/up-to-date is the pricing?"

  • Bundled pricing is compiled from official provider pricing pages at build time
  • Remote refresh fetches updates from our Pricing API on startup (5-minute cache)
  • Model matching handles version suffixes (e.g., gpt-4o-2024-05-13 → gpt-4o) and aliases
  • Custom overrides via setModelAliases() for fine-tuned or custom-named models (see the sketch below)
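
The exact signature of setModelAliases() is not documented here; a hedged sketch, assuming it takes a map from custom model names to canonically priced ones:

import { setModelAliases } from 'tokenmeter';

// Assumption: keys are the names your code sends, values are models with known pricing
setModelAliases({
  'ft:gpt-4o:acme::abc123': 'gpt-4o',
  'my-gateway-sonnet': 'claude-3-5-sonnet-latest',
});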

"Can I use this without PostgreSQL?"

Yes. PostgreSQL is optional—only needed if you want to persist and query costs. The core monitor() and withAttributes() work with any OTel-compatible exporter (Datadog, Honeycomb, Jaeger, console, etc.) or no exporter at all.
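
For example, a minimal sketch that captures costs with hooks alone and no exporter, assuming (per the failure-mode table above) that hooks fire even when no tracer provider is registered:

import OpenAI from 'openai';
import { monitor } from 'tokenmeter';

// No OTel provider registered: spans are no-ops, but the hook still reports cost
const openai = monitor(new OpenAI(), {
  afterResponse: (ctx) => console.log(`Cost: $${ctx.cost.toFixed(6)}`),
});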

Contributing

Contributions are welcome! Please read our Contributing Guide for details.

# Install dependencies
pnpm install

# Run tests
pnpm test

# Build
pnpm build

# Type check
pnpm check-types

License

MIT