npm package discovery and stats viewer.

Discover Tips

  • General search

    [free text search, go nuts!]

  • Package details

    pkg:[package-name]

  • User packages

    @[username]

Sponsor

Optimize Toolset

I’ve always been into building performant and accessible sites, but lately I’ve been taking it extremely seriously. So much so that I’ve been building a tool to help me optimize and monitor the sites that I build to make sure that I’m making an attempt to offer the best experience to those who visit them. If you’re into performant, accessible and SEO friendly sites, you might like it too! You can check it out at Optimize Toolset.

About

Hi, 👋, I’m Ryan Hefner  and I built this site for me, and you! The goal of this site was to provide an easy way for me to check the stats on my npm packages, both for prioritizing issues and updates, and to give me a little kick in the pants to keep up on stuff.

As I was building it, I realized that I was actually using the tool to build the tool, and figured I might as well put this out there and hopefully others will find it to be a fast and useful way to search and browse npm packages as I have.

If you’re interested in other things I’m working on, follow me on Twitter or check out the open source projects I’ve been publishing on GitHub.

I am also working on a Twitter bot for this site to tweet the most popular, newest, random packages from npm. Please follow that account now and it will start sending out packages soon–ish.

Open Software & Tools

This site wouldn’t be possible without the immense generosity and tireless efforts from the people who make contributions to the world and share their work via open source initiatives. Thank you 🙏

© 2026 – Pkg Stats / Ryan Hefner

prompt-price

v0.3.1

Published

Pre-flight cost estimation for any prompt and model combination

Readme

prompt-price

Pre-flight cost estimation for any prompt and model combination. Know what an LLM call will cost before you send it.

npm version npm downloads license node TypeScript


Description

prompt-price answers the question "how much will this request cost?" with a single function call -- before the API call is made. Existing LLM SDKs report token usage and cost after the response arrives, by which time the money is already spent. This package composes token counting, provider-specific message overhead calculation, tool definition tokenization, and model pricing into a unified estimation pipeline.

The package supports six providers out of the box (OpenAI, Anthropic, Google, Meta, Mistral, Cohere) and uses calibrated per-provider character-to-token ratios for fast approximate counting without any WASM or native dependencies. A budget guard middleware wraps any OpenAI or Anthropic SDK client to enforce cost limits automatically, preventing runaway spend from unexpectedly large context windows, prompt injection, or developer error.


Installation

npm install prompt-price model-price-registry

model-price-registry is a required peer dependency that provides model pricing data.

Optional: exact token counting

npm install js-tiktoken

When js-tiktoken is installed, prompt-price can use native tiktoken encodings for exact OpenAI token counts. Without it, a calibrated heuristic is used (fast, no WASM overhead, typically within 5% of exact counts).


Quick Start

import { estimate, countTokens, compareModels, guard } from 'prompt-price';

// Estimate cost for a prompt
const est = await estimate({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Explain quantum computing' }],
});
console.log(`Estimated cost: $${est.totalCost.toFixed(6)}`);

// Count tokens without cost calculation
const tokens = await countTokens('Hello, world!', { model: 'gpt-4o' });
console.log(`Tokens: ${tokens.tokens} (${tokens.method})`);

// Compare costs across models
const comparison = await compareModels(
  [{ role: 'user', content: 'Write a haiku' }],
  ['gpt-4o', 'gpt-4o-mini', 'claude-3-5-sonnet', 'gemini-2.0-flash'],
);
comparison.forEach((e) =>
  console.log(`${e.model}: $${e.totalCost.toFixed(6)}`),
);
// Results are sorted cheapest-first

// Budget guard -- wraps an OpenAI or Anthropic client
import OpenAI from 'openai';
const openai = guard(new OpenAI(), { model: 'gpt-4o', maxCost: 0.05 });
// Throws BudgetExceededError if a request would exceed $0.05

Features

  • Pre-flight cost estimation -- Get token counts and dollar costs before making any API call.
  • Multi-provider support -- Built-in pricing and token heuristics for OpenAI, Anthropic, Google, Meta, Mistral, and Cohere models.
  • Budget guard middleware -- Wrap OpenAI or Anthropic SDK clients to enforce per-request cost limits automatically.
  • Multi-model comparison -- Estimate the same prompt against multiple models in one call, sorted by cost.
  • Tool definition accounting -- Estimates tokens consumed by function/tool definitions passed in the request.
  • Multimodal content support -- Handles text, images (with detail-level awareness), audio, and file content parts.
  • System prompt tracking -- Separately counts system prompt tokens with per-component breakdowns.
  • Sync and async APIs -- Every core function has both an async and synchronous variant.
  • Zero mandatory runtime dependencies -- Only requires model-price-registry for pricing data. Token counting works with built-in heuristics; js-tiktoken is optional.
  • Full TypeScript support -- Ships with declaration files and complete type exports.

API Reference

estimate(options: EstimateOptions): Promise<Estimate>

Estimate the cost of an LLM call before sending it.

const est = await estimate({
  model: 'gpt-4o',                   // required -- model name or "provider/model"
  provider: 'openai',                // optional -- explicit provider override
  messages: [                        // optional -- conversation messages
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'What is 2+2?' },
  ],
  systemPrompt: 'Be concise.',       // optional -- separate system prompt
  maxOutputTokens: 256,              // optional -- caps output token estimate
  tools: [...],                      // optional -- tool/function definitions
});

console.log(est.inputTokens);   // e.g. 42
console.log(est.outputTokens);  // e.g. 13
console.log(est.inputCost);     // e.g. 0.000105
console.log(est.outputCost);    // e.g. 0.000130
console.log(est.totalCost);     // e.g. 0.000235
console.log(est.currency);      // 'USD'
console.log(est.method);        // 'approximate'
console.log(est.provider);      // 'openai'
console.log(est.model);         // 'gpt-4o'
console.log(est.breakdown);     // { systemTokens, conversationTokens, toolTokens }

When maxOutputTokens is not provided, output tokens default to min(ceil(inputTokens * 0.3), 1000).

estimateSync(options: EstimateOptions): Estimate

Synchronous version of estimate(). Identical parameters and return value.

const est = estimateSync({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Hi' }],
});

estimatePrompt(prompt: string, options?): Promise<Estimate>

Convenience wrapper that treats a plain string as a single user message. Defaults to gpt-4o when no model is specified.

const est = await estimatePrompt('Explain general relativity', {
  model: 'claude-3-5-sonnet',
});

compareModels(messages: Message[], models: string[], options?): Promise<Estimate[]>

Run estimate() for multiple models and return results sorted by total cost (ascending). Useful for cost-aware model selection.

const results = await compareModels(
  [{ role: 'user', content: 'Summarize this article...' }],
  ['gpt-4o', 'gpt-4o-mini', 'claude-3-5-sonnet', 'gemini-1.5-flash'],
);
// results[0] is the cheapest option
results.forEach((r) => console.log(`${r.model}: $${r.totalCost.toFixed(6)}`));

countTokens(content: string, options?: CountTokensOptions): Promise<TokenCount>

Estimate the token count of a string without computing costs.

const result = await countTokens('Hello, world!', { model: 'gpt-4o' });
// { tokens: 4, method: 'approximate' }

countTokensSync(content: string, options?: CountTokensOptions): TokenCount

Synchronous version of countTokens().

const result = countTokensSync('Hello, world!', { model: 'claude-3-5-sonnet' });

checkBudget(options: CheckBudgetOptions): Promise<BudgetResult>

Check whether an API call would fit within a cost budget. Returns a detailed result without throwing.

const result = await checkBudget({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Hello' }],
  budget: 0.01,
  maxOutputTokens: 500,              // optional
  provider: 'openai',                // optional
});

console.log(result.withinBudget);    // true or false
console.log(result.utilizationPct);  // e.g. 0.53 (percent of budget used)
console.log(result.remaining);       // e.g. 0.009947 (dollars remaining)
console.log(result.estimate);        // full Estimate object
console.log(result.budget);          // the budget that was checked against

guard<T extends SupportedClient>(client: T, options: GuardOptions): T

Wrap an OpenAI or Anthropic SDK client to automatically check costs before every request. The wrapped client has the same type as the original, so it is a drop-in replacement.

import OpenAI from 'openai';
import Anthropic from '@anthropic-ai/sdk';
import { guard, BudgetExceededError } from 'prompt-price';

// Guard an OpenAI client
const openai = guard(new OpenAI(), {
  model: 'gpt-4o',
  maxCost: 0.05,
  onExceed: 'throw',   // 'throw' | 'warn' | 'log' (default: 'throw')
  maxOutputTokens: 1000, // optional
  provider: 'openai',    // optional
});

// Guard an Anthropic client
const anthropic = guard(new Anthropic(), {
  model: 'claude-3-5-sonnet',
  maxCost: 0.10,
  onExceed: 'warn',    // logs a warning but does not block the request
});

try {
  const response = await openai.chat.completions.create({
    model: 'gpt-4o',
    messages: [{ role: 'user', content: 'Write a novel...' }],
  });
} catch (err) {
  if (err instanceof BudgetExceededError) {
    console.log(`Blocked: $${err.estimate.totalCost} exceeds $${err.maxCost}`);
  }
}

onExceed behavior:

| Value | Behavior | |-----------|-----------------------------------------------------------------| | 'throw' | Throws BudgetExceededError before the request is sent | | 'warn' | Logs a warning via console.warn and proceeds with the request | | 'log' | Logs via console.log and proceeds with the request |

resolveModel(model: string, providerHint?: string): ResolvedModel

Resolve a model string to its provider, canonical ID, and pricing. Supports bare names (gpt-4o), provider-prefixed names (openai/gpt-4o), and versioned names (gpt-4o-2024-08-06). Throws ModelNotFoundError if the model is not recognized.

import { resolveModel } from 'prompt-price';

const resolved = resolveModel('gpt-4o');
console.log(resolved.provider);             // 'openai'
console.log(resolved.modelId);              // 'gpt-4o'
console.log(resolved.inputPricePerMillion); // 2.50
console.log(resolved.outputPricePerMillion); // 10.00

// With provider prefix
const r2 = resolveModel('anthropic/claude-3-5-sonnet');
// With provider hint
const r3 = resolveModel('gpt-4o', 'openai');

calculateCost(tokens: number, pricePerMillion: number): number

Calculate the USD cost for a given number of tokens at a given price per million tokens. Returns a value rounded to 6 decimal places.

import { calculateCost } from 'prompt-price';

const cost = calculateCost(1500, 2.50); // 1500 tokens at $2.50/MTok
// 0.00375

formatCost(cost: number): string

Format a numeric cost as a dollar string with 6 decimal places.

import { formatCost } from 'prompt-price';

formatCost(0.00375); // '$0.003750'

estimateTokens(text: string, provider: string): number

Estimate the token count for a text string using the provider-specific character-to-token ratio. Returns Math.ceil(text.length / ratio).

import { estimateTokens } from 'prompt-price';

estimateTokens('hello world', 'openai');    // ceil(11 / 3.9) = 3
estimateTokens('hello world', 'anthropic'); // ceil(11 / 3.5) = 4

countChars(text: string): number

Returns the character count of a string. Utility function used internally.

import { countChars } from 'prompt-price';
countChars('hello'); // 5

countContentTokens(content: string | ContentPart[], provider: string): number

Count tokens for message content, handling both plain strings and multimodal content part arrays. Accounts for image token costs by provider and detail level.

import { countContentTokens } from 'prompt-price';

// Plain text
countContentTokens('hello world', 'openai'); // 3

// Multimodal content
countContentTokens([
  { type: 'text', text: 'describe this:' },
  { type: 'image_url', image_url: { url: '...', detail: 'low' } },
], 'openai'); // text tokens + 85 (low-detail image)

Configuration

Model specification

Models can be specified in three formats:

| Format | Example | Description | |-----------------------|------------------------------|--------------------------------------| | Bare model name | gpt-4o | Provider is inferred from the name | | Provider/model | openai/gpt-4o | Explicit provider prefix | | Versioned name | gpt-4o-2024-08-06 | Resolved via prefix matching |

Provider inference

When no provider is specified, it is inferred from the model name prefix:

| Prefix pattern | Provider | |------------------------------|-------------| | gpt-*, o1*, o3*, o4* | openai | | claude-* | anthropic | | gemini-* | google | | llama-* | meta | | mistral-*, codestral-* | mistral | | command-* | cohere |

Supported models

| Provider | Models | |-----------|--------------------------------------------------------------------------------------------------------------| | OpenAI | gpt-4o, gpt-4o-mini, gpt-4.1, gpt-4.1-mini, gpt-4-turbo, gpt-3.5-turbo, o1, o1-mini, o3-mini | | Anthropic | claude-opus-4, claude-sonnet-4-5, claude-haiku-4-5, claude-3-5-sonnet, claude-3-5-haiku, claude-3-opus, claude-3-haiku | | Google | gemini-2.0-flash, gemini-1.5-pro, gemini-1.5-flash | | Meta | llama-3.1-70b, llama-3.3-70b | | Mistral | mistral-large, mistral-small | | Cohere | command-r-plus, command-r |


Error Handling

BudgetExceededError

Thrown by guard() (with onExceed: 'throw') when a request's estimated cost exceeds the configured budget.

Properties:

| Property | Type | Description | |------------|------------|------------------------------------| | name | string | Always 'BudgetExceededError' | | message | string | Human-readable description with cost, budget, model, and token counts | | estimate | Estimate | The full estimate that triggered the error | | maxCost | number | The budget threshold that was exceeded |

import { guard, BudgetExceededError } from 'prompt-price';

try {
  await guardedClient.chat.completions.create({ ... });
} catch (err) {
  if (err instanceof BudgetExceededError) {
    console.log(err.estimate.totalCost); // the estimated cost
    console.log(err.maxCost);            // the budget limit
    console.log(err.estimate.model);     // which model
    console.log(err.estimate.inputTokens);
  }
}

ModelNotFoundError

Thrown by estimate(), estimateSync(), resolveModel(), and any function that performs model resolution when the model string does not match any known model in the built-in catalog.

Properties:

| Property | Type | Description | |----------|----------|------------------------------------------| | name | string | Always 'ModelNotFoundError' | | message| string | Human-readable description | | model | string | The model string that was not recognized |

import { estimate, ModelNotFoundError } from 'prompt-price';

try {
  await estimate({ model: 'unknown-model', messages: [] });
} catch (err) {
  if (err instanceof ModelNotFoundError) {
    console.log(`Unknown model: ${err.model}`);
  }
}

Advanced Usage

Budget checking without middleware

Use checkBudget() for programmatic budget checks without wrapping a client:

import { checkBudget } from 'prompt-price';

const result = await checkBudget({
  model: 'gpt-4o',
  messages: conversation,
  budget: 0.50,
  maxOutputTokens: 2000,
});

if (!result.withinBudget) {
  console.log(`Request would cost $${result.estimate.totalCost.toFixed(6)}`);
  console.log(`Budget utilization: ${result.utilizationPct.toFixed(1)}%`);
  // Trim context, switch to a cheaper model, or abort
}

Cost-aware model selection

Use compareModels() to pick the cheapest model that fits your needs:

import { compareModels } from 'prompt-price';

const candidates = ['gpt-4o', 'gpt-4o-mini', 'claude-3-5-sonnet', 'gemini-2.0-flash'];
const results = await compareModels(messages, candidates);

// results[0] is the cheapest
const cheapest = results[0];
console.log(`Using ${cheapest.model} at $${cheapest.totalCost.toFixed(6)}/request`);

Tool definition cost estimation

Tool definitions consume tokens and affect cost. Include them in your estimate:

import { estimate } from 'prompt-price';

const est = await estimate({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'What is the weather?' }],
  tools: [
    {
      type: 'function',
      function: {
        name: 'get_weather',
        description: 'Get current weather for a location',
        parameters: {
          type: 'object',
          properties: {
            location: { type: 'string', description: 'City name' },
            units: { type: 'string', enum: ['celsius', 'fahrenheit'] },
          },
          required: ['location'],
        },
      },
    },
  ],
});

console.log(est.breakdown?.toolTokens);          // tokens from tool definitions
console.log(est.breakdown?.conversationTokens);   // tokens from messages

Multimodal content estimation

Messages with image content are accounted for with provider-specific token costs:

import { estimate } from 'prompt-price';

const est = await estimate({
  model: 'gpt-4o',
  messages: [{
    role: 'user',
    content: [
      { type: 'text', text: 'What is in this image?' },
      {
        type: 'image_url',
        image_url: { url: 'https://example.com/photo.jpg', detail: 'high' },
      },
    ],
  }],
});
// Image tokens: 512 for high detail (OpenAI), 85 for low detail

For audio or file content parts, set estimatedTokens on the content part to provide a manual token estimate:

const est = await estimate({
  model: 'gpt-4o',
  messages: [{
    role: 'user',
    content: [
      { type: 'audio', estimatedTokens: 500 },
    ],
  }],
});

Synchronous usage

All core functions have synchronous variants for contexts where async is not available:

import { estimateSync, countTokensSync } from 'prompt-price';

const est = estimateSync({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Hello' }],
});

const tokens = countTokensSync('Hello, world!', { model: 'gpt-4o' });

Token Counting Method

All token counts use a calibrated heuristic (Math.ceil(text.length / ratio)):

| Provider | Chars per token | Notes | |-----------|----------------|---------------------------------| | OpenAI | 3.9 | Typically within 5% of tiktoken | | Anthropic | 3.5 | Calibrated against Claude tokenizer | | Google | 4.0 | SentencePiece-based models | | Mistral | 3.8 | BPE tokenizer | | Meta | 3.7 | Llama BPE tokenizer | | Cohere | 4.0 | BPE tokenizer |

Image token costs:

| Provider | Low detail | High detail | |----------|-----------|-------------| | OpenAI | 85 tokens | 512 tokens | | Others | 512 tokens | 512 tokens |

Output token defaults: When maxOutputTokens is not provided, output tokens are estimated as min(ceil(inputTokens * 0.3), 1000).

Message overhead: Each provider adds structural tokens per message (role markers, formatting). OpenAI: 4 tokens/message + 3 base. Anthropic: 4 tokens/message + 10 base. Others: 5 tokens/message + 10 base.


TypeScript

prompt-price is written in TypeScript and ships with full declaration files. All 13 interfaces and both error classes are exported:

Types

import type {
  Message,              // A single conversation message
  ContentPart,          // Multimodal content (text, image_url, audio, file)
  ToolCall,             // A tool call made by an assistant
  ToolDefinition,       // A tool/function definition
  EstimateOptions,      // Options for estimate()
  CountTokensOptions,   // Options for countTokens()
  TokenCount,           // Result of a token count operation
  Estimate,             // Full cost estimate result
  GuardOptions,         // Options for the guard() middleware
  CheckBudgetOptions,   // Options for checkBudget()
  BudgetResult,         // Result of checkBudget()
  OpenAILikeClient,     // Type for OpenAI-compatible SDK clients
  AnthropicLikeClient,  // Type for Anthropic-compatible SDK clients
  SupportedClient,      // Union of OpenAILikeClient | AnthropicLikeClient
} from 'prompt-price';

Key interfaces

interface Estimate {
  model: string;
  provider: string;
  inputTokens: number;
  outputTokens: number;
  inputCost: number;
  outputCost: number;
  totalCost: number;
  currency: 'USD';
  method: 'exact' | 'approximate';
  breakdown?: {
    systemTokens?: number;
    conversationTokens?: number;
    toolTokens?: number;
  };
}

interface BudgetResult {
  withinBudget: boolean;
  estimate: Estimate;
  budget: number;
  remaining: number;
  utilizationPct: number;
}

interface TokenCount {
  tokens: number;
  encoding?: string;
  method: 'exact' | 'approximate';
}

License

MIT