llm-token-estimator
v1.0.5
Fast, offline token estimation for popular LLMs
Offline token estimation for Large Language Models.
Installation
npm install llm-token-estimator
Usage
Basic usage
const { estimateTokens } = require("llm-token-estimator");
const result = estimateTokens(
"Explain transformers like I'm five.",
{ model: "gpt-4o" }
);
console.log(result);
Output:
{
tokens: 9,
characters: 39,
model: "gpt-4o",
maxTokens: 128000,
vendor: "openai",
warning: null
}
Estimation strategies
Choose between the fast heuristic (the default), the more content-aware balanced estimation, or a custom tokenizer that you supply.
estimateTokens("const answer = items.map(x => x.id)", {
model: "gpt-4o",
strategy: "balanced",
language: "code"
});
estimateTokens("hello", {
model: "gpt-4o",
strategy: "tokenizer",
tokenizer: () => ({ tokens: 42 })
});
Language-aware estimation
For better accuracy with non-English content:
const result = estimateTokens(
"Bonjour, comment allez-vous?",
{
model: "gpt-4o",
language: "fr" // French
}
);
// Supported languages: en, es, fr, de, it, pt, ru, zh, ja, ko, ar, hi, code
Using chat-style inputs (array of messages)
Useful when estimating prompts made of multiple messages:
estimateTokens(
[
{ role: "system", content: "You are a helpful assistant." },
{ role: "user", content: "Summarize the following text:" },
{ role: "user", content: articleText }
],
{ model: "claude-3-sonnet" }
);
Handling context limit warnings
const { warning } = estimateTokens(longPrompt, {
model: "gpt-4"
});
if (warning) {
console.warn(warning);
}
Budgeting and truncation helpers
const { estimateCompletionBudget, fitsContextWindow, truncateToFit } = require("llm-token-estimator");
estimateCompletionBudget(prompt, {
model: "gpt-4o",
reservedOutputTokens: 1200
});
fitsContextWindow(prompt, {
model: "llama-2",
outputTokens: 256
});
truncateToFit(prompt, {
model: "gpt-4",
outputTokens: 512
});
Listing supported models
const { listModels, getModelInfo } = require("llm-token-estimator");
console.log(listModels());
console.log(listModels({ vendor: "openai" }));
console.log(getModelInfo("gpt-4o"));
Accuracy and Limitations
This library provides approximate token counts based on character-to-token ratios. While fast and dependency-free, it has limitations:
- ✅ Good for: Quick estimates, cost approximation, context limit checks
- ❌ Limitations: Accuracy varies with language, content type (for example code versus prose), and each model's actual tokenizer
For production applications requiring high accuracy, consider using:
- tiktoken for OpenAI models
- Model-specific tokenizers for other providers
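To make the idea of a character-to-token ratio concrete, here is a minimal sketch of such a heuristic. It is not the library's implementation: `roughTokenEstimate` is a hypothetical name, and the default of 4 characters per token is only a common rule of thumb for English text, assumed here for illustration.

```javascript
// Hypothetical sketch of a character-ratio estimate; NOT the library's code.
// charsPerToken = 4 is an assumed rule of thumb for English text.
function roughTokenEstimate(text, charsPerToken = 4) {
  if (typeof text !== "string") {
    throw new TypeError("text must be a string");
  }
  if (!(charsPerToken > 0)) {
    throw new RangeError("charsPerToken must be a positive number");
  }
  // Round up so a short, non-empty string never estimates to zero tokens.
  return Math.ceil(text.length / charsPerToken);
}

console.log(roughTokenEstimate("Hello, world!")); // 13 characters -> 4
```

A single fixed ratio like this is the reason estimates drift for code or non-Latin scripts, where the real characters-per-token figure can differ substantially.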
Supported Models
Includes 81 models from major providers:
- OpenAI: GPT-5.2, GPT-5, GPT-4.1, o3, o4-mini, GPT-OSS models, and more
- Anthropic: Claude 4 series (Opus 4.6, Sonnet 4.5, Haiku 4.5)
- Google: Gemini 3, Gemini 2.5 series
- Meta: LLaMA 3.x series
- Mistral: Large 3, Medium 3.1, Ministral 3 series
- Others: xAI Grok, Cohere, Alibaba Qwen, DeepSeek, Amazon Nova, and more
Use listModels() to see all supported models.
CLI usage
npx llm-token-estimator --model gpt-4o --input "Explain transformers like I'm five."
npx llm-token-estimator --list-models --vendor openai
npx llm-token-estimator --budget --model claude-3-sonnet --output-tokens 800 --file prompt.txt
Default behavior
- Default model: gpt-3.5-turbo
- Default language: en (English)
- Default strategy: fast
- Input can be:
  - a string
  - an array of strings
  - an array of chat messages
- Output tokens are not included (input only)
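The three accepted input shapes can all be reduced to one string before counting. The sketch below shows one way to do that; `flattenInput` is a hypothetical helper, and joining parts with a newline is an assumption. The library may treat message roles or separators differently.

```javascript
// Hypothetical helper, not part of llm-token-estimator.
// Accepts a string, an array of strings, or an array of { role, content }
// messages and returns a single string containing only the text content.
function flattenInput(input) {
  if (typeof input === "string") {
    return input;
  }
  if (!Array.isArray(input)) {
    throw new TypeError("input must be a string or an array");
  }
  return input
    .map((item) =>
      typeof item === "string" ? item : String(item?.content ?? "")
    )
    .join("\n"); // assumed separator
}

console.log(flattenInput([
  { role: "system", content: "You are a helpful assistant." },
  { role: "user", content: "Summarize the following text:" },
]));
```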
Example use cases
- Pre-flight prompt validation
- CI checks for context overflows
- Prompt truncation logic
- Cost estimation (approximate)
- Multi-language content estimation
- Model comparison and selection
- Rate limiting based on token counts
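For the cost estimation use case, an estimated token count can be turned into an approximate input cost with simple arithmetic. This is a sketch only: `estimateInputCost` is a hypothetical helper, and the price per million tokens is a placeholder that the caller must supply from the provider's current pricing.

```javascript
// Hypothetical helper, not part of llm-token-estimator.
// pricePerMillionTokens is a caller-supplied placeholder, not real pricing.
function estimateInputCost(tokens, pricePerMillionTokens) {
  if (!Number.isFinite(tokens) || tokens < 0) {
    throw new RangeError("tokens must be a non-negative number");
  }
  if (!Number.isFinite(pricePerMillionTokens) || pricePerMillionTokens < 0) {
    throw new RangeError("pricePerMillionTokens must be a non-negative number");
  }
  return (tokens / 1e6) * pricePerMillionTokens;
}

// Example with a made-up price of 2 currency units per million tokens.
console.log(estimateInputCost(1500, 2));
```

Because the token count itself is approximate, the result is best read as an order-of-magnitude figure rather than a billing prediction.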
API Reference
estimateTokens(input, options)
Parameters:
- input (string | string[] | array of chat messages): Text to estimate tokens for
- options (object):
  - model (string): Model name (default: "gpt-3.5-turbo")
  - language (string): Language code for better estimation (default: "en")
  - strategy ("fast" | "balanced" | "tokenizer"): Estimation strategy (default: "fast")
  - tokenizer (function): Custom tokenizer hook for exact counts
Returns: Object with tokens, characters, model, maxTokens, vendor, warning
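The tokenizer option expects a function that returns an object with a tokens field. The sketch below shows the shape of such a hook using a whitespace word count as a stand-in. Two assumptions apply: that the hook receives the input text as its first argument (the README only shows a zero-argument hook), and that a real tokenizer would replace the word count for exact results.

```javascript
// Hypothetical tokenizer hook. Assumption: the hook is called with the input
// text and returns an object with a numeric `tokens` field, the only return
// shape this README shows. Word counting is a stand-in, not an exact count.
const whitespaceTokenizer = (text) => ({
  tokens: String(text).trim().split(/\s+/).filter(Boolean).length,
});

console.log(whitespaceTokenizer("hello big world")); // { tokens: 3 }

// Wiring it in would look like this (requires the package to be installed):
// estimateTokens("hello big world", {
//   model: "gpt-4o",
//   strategy: "tokenizer",
//   tokenizer: whitespaceTokenizer
// });
```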
estimateCompletionBudget(input, options)
Returns prompt token count plus reserved output budget information.
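The arithmetic behind a completion budget can be sketched as follows. `completionBudget` and its field names are hypothetical and chosen for illustration; they are not the documented return shape of estimateCompletionBudget.

```javascript
// Hypothetical arithmetic sketch; field names are illustrative only and are
// not the return shape of the library's estimateCompletionBudget.
function completionBudget(promptTokens, maxTokens, reservedOutputTokens) {
  // Tokens left in the context window once the prompt is accounted for.
  const remainingTokens = maxTokens - promptTokens;
  return {
    promptTokens,
    reservedOutputTokens,
    remainingTokens,
    // The reservation fits only if enough room remains after the prompt.
    fits: remainingTokens >= reservedOutputTokens,
  };
}

// 128000 is the maxTokens value shown for gpt-4o in the basic usage output.
console.log(completionBudget(1000, 128000, 1200));
```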
fitsContextWindow(input, options)
Returns whether the prompt plus requested output fits the model context window.
truncateToFit(input, options)
Truncates oversized input until it fits the model context window.
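One way to truncate text to a token limit is a binary search over prefix length. The sketch below is not the library's algorithm: `truncateByBinarySearch` is a hypothetical function, and `countTokens` is any caller-supplied counter, assumed to return 0 for an empty string and never to report more tokens for a prefix than for the full text.

```javascript
// Hypothetical sketch of truncation; not the library's algorithm.
// countTokens maps a string to a token count and is assumed to be
// non-decreasing as the string gets longer, with countTokens("") === 0.
function truncateByBinarySearch(text, tokenLimit, countTokens) {
  if (countTokens(text) <= tokenLimit) {
    return text; // already fits, nothing to cut
  }
  let lo = 0; // longest prefix length known to fit
  let hi = text.length; // shortest prefix length known not to fit
  while (hi - lo > 1) {
    const mid = Math.floor((lo + hi) / 2);
    if (countTokens(text.slice(0, mid)) <= tokenLimit) {
      lo = mid;
    } else {
      hi = mid;
    }
  }
  return text.slice(0, lo);
}

// Example with an assumed 4-characters-per-token counter.
const fourCharCounter = (s) => Math.ceil(s.length / 4);
console.log(truncateByBinarySearch("a".repeat(100), 10, fourCharCounter).length); // 40
```

Binary search keeps the number of counter calls logarithmic in the text length, which matters when the counter is an exact tokenizer rather than a cheap heuristic.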
listModels(filters)
Returns model names, or metadata objects when includeMetadata: true is passed.
getModelInfo(modelName)
Returns enriched metadata for a single model.
Contributing
We welcome contributions! Feel free to:
- Add new models
- Improve estimation accuracy
- Add new languages
- Fix bugs or enhance documentation
