@tokenometer/core
v1.1.0
Published
Tokenometer core library — multi-provider LLM token cost, latency, and vision-token estimation with empirical countTokens fallback (Anthropic, OpenAI, Google, Mistral, Cohere).
Maintainers
Keywords
Readme
@tokenometer/core
Core library powering tokenometer: tokenizer dispatch, format converters, versioned cost rate matrix, vision-token estimators, latency measurement, SARIF emitter, config loader, and an empirical-mode
countTokensadapter for Anthropic, OpenAI, Google, Mistral, and Cohere.
See the root README for findings, methodology, and the full project overview.
Live playground · Source · MIT
If you just want a CLI, npm install -g tokenometer. This package is for programmatic use — it's the engine the CLI, the GitHub Action, the VS Code / Cursor extension, and the playground all share, so counts and pricing stay identical across every surface.
API
import {
// Core tokenization
tokenize,
tokenizeMatrix,
countTokens,
// Empirical (real provider countTokens / tokenize endpoints)
tokenizeEmpirical,
tokenizeMatrixEmpirical,
// Latency benchmarking
measureLatency,
nthPercentile,
// Format conversion
toFormat,
isFormat,
allFormats,
// Config (.tokenometer.yml)
loadConfig,
parseConfig,
// SARIF + JSON emitter
toSarif,
// Vision-token estimators
anthropicVisionTokens,
openaiVisionTokens,
googleVisionTokens,
// Pricing / model registry
KNOWN_MODELS,
MODELS,
RATES,
RATES_VERSION,
getModel,
getRate,
} from '@tokenometer/core';
import type {
// Token results
CountResult,
TokenizeResult,
EmpiricalResult,
EmpiricalCountResult,
EmpiricalEnv,
// Latency
LatencyResult,
LatencyTrial,
LatencyStats,
LatencyDeps,
MeasureLatencyOptions,
// Aggregates / formatters
TokenometerResult,
TokenometerFileResult,
ToSarifOptions,
// Config
TokenometerConfig,
ConfigFormat,
// Vision input shapes
AnthropicVisionInput,
OpenAIVisionInput,
GoogleVisionInput,
// Registry
ModelDescriptor,
Provider,
RateEntry,
Format,
TokenizerKind,
} from '@tokenometer/core';Offline (deterministic, no API key)
const result = tokenize({
prompt: '{"hello": "world"}',
format: 'yaml',
modelId: 'claude-opus-4-7',
});
// {
// model: 'claude-opus-4-7',
// provider: 'anthropic',
// format: 'yaml',
// tokenizer: 'cl100k_base',
// inputTokens: 12,
// inputCost: 0.00018,
// approximate: true // ← Anthropic does not publish a public Claude 3+ tokenizer
// }Empirical (real provider counts, free)
const result = await tokenizeEmpirical({
prompt: '{"hello": "world"}',
format: 'yaml',
modelId: 'claude-opus-4-7',
env: { anthropicApiKey: process.env.ANTHROPIC_API_KEY! },
});
// approximate: false ← uses Anthropic's messages.countTokensLatency benchmarking
const stats = await measureLatency({
modelId: 'claude-opus-4-7',
prompt: 'Write a haiku about CI.',
trials: 3,
env: { anthropicApiKey: process.env.ANTHROPIC_API_KEY! },
});
// LatencyResult: { trials: LatencyTrial[], stats: { ttftMs, totalMs, tokensPerSec } }
// Each stat is { p50, p95, mean }.Supported providers: Anthropic (messages.stream), OpenAI (/v1/chat/completions SSE), Google (generateContentStream), Cohere (/v1/chat NDJSON), Mistral (/v1/chat/completions SSE). Each call is capped at max_tokens=200; trials retry once on transient failures.
Vision tokens
const tokens = anthropicVisionTokens({ width: 1280, height: 720 });
// 1228 (capped at 1600 for very large images)The openaiVisionTokens and googleVisionTokens exports are formula-equivalent to the OpenAI high-detail tile cost and Gemini's 258-per-768²-tile cost respectively.
SARIF + JSON output
const sarif = toSarif({ files: [{ path: 'prompt.md', results: [...] }] });
// SARIF 2.1.0 — drop into GitHub Code Scanning or any SARIF viewer.Rate table
RATES is a Record<modelId, { inputPer1k, outputPer1k, cachedInputPer1k? }>. RATES_VERSION ships as a date string so consumers can pin or audit. KNOWN_MODELS is the union (currently 63 across 5 providers).
Providers
| Provider | Models | Offline tokenizer | Exactness | Empirical (tokenizeEmpirical) |
|-----------|--------|-------------------|-----------|----------------------------------|
| OpenAI | gpt-4o, gpt-4o-mini, gpt-4-turbo, gpt-3.5-turbo, o1 family | gpt-tokenizer o200k_base | exact | same o200k_base (matches production) |
| Anthropic | claude-opus-4-7, claude-sonnet-4-6, claude-haiku-4-5, Claude 3.x family | gpt-tokenizer cl100k_base | approximate | messages.countTokens (free, exact) |
| Google | gemini-2.5-pro, gemini-2.5-flash, gemini-1.5-pro, gemini-1.5-flash | chars / 4 heuristic | approximate | model.countTokens (free, exact) |
| Mistral (19 models) | open-mistral-7b, open-mixtral-8x22b, mistral-large-latest, codestral-latest, mistral-nemo, pixtral-large-latest, mistral-medium-2505, magistral-small, ministral-3b-latest, devstral-small-2505 | mistral-tokenizer-js (V1/V2/V3 SentencePiece); chars/4 for Tekken family (NeMo, Pixtral, Mistral Small 2409+, Devstral, Mistral Medium 2505+, Magistral, Ministral) | exact for SentencePiece · approximate for Tekken | unsupported (no public token-count endpoint) |
| Cohere | command-r, command-r-plus | chars / 4 heuristic | approximate | POST /v1/tokenize (free, exact, requires COHERE_API_KEY) |
Pricing comes from @tokenlens/models plus a small LOCAL_OVERRIDES map for bleeding-edge models the registry hasn't picked up yet. Cohere lives entirely in LOCAL_OVERRIDES because @tokenlens/models does not yet ship a Cohere catalog at v1.3.0; pull from cohere.com/pricing whenever RATES_VERSION bumps.
Internally the dispatch helpers mistralCount, cohereCount, cohereTokenizeApi, and isTekken (in tokenize-mistral.ts / tokenize-cohere.ts) are not part of the public API — they're called from tokenize / tokenizeEmpirical. If you need them, import the files directly; they may move.
License
MIT
