llm-token-estimator

v1.0.5

Published

23 days ago

Fast, offline token estimation for popular LLMs

Downloads

189

Vulnerabilities: 0 High, 0 Medium, 0 Low

er_imadahmed

llm ai tokens genai openai claude gemini llama

llm-token-estimator

Offline token estimation for Large Language Models.

Installation

npm install llm-token-estimator

Usage

Basic usage

const { estimateTokens } = require("llm-token-estimator");

const result = estimateTokens(
  "Explain transformers like I'm five.",
  { model: "gpt-4o" }
);

console.log(result);

Output:

{
  tokens: 9,
  characters: 35,
  model: "gpt-4o",
  maxTokens: 128000,
  vendor: "openai",
  warning: null
}

Estimation strategies

Choose between the "fast" heuristic (the default), the more content-aware "balanced" estimation, or a custom tokenizer that you supply yourself with the "tokenizer" strategy.

estimateTokens("const answer = items.map(x => x.id)", {
  model: "gpt-4o",
  strategy: "balanced",
  language: "code"
});

estimateTokens("hello", {
  model: "gpt-4o",
  strategy: "tokenizer",
  tokenizer: () => ({ tokens: 42 })
});
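A tokenizer hook can be any function that returns an object with a tokens field, as in the example above. The sketch below is a hypothetical stand-in that counts word and punctuation chunks. It assumes the hook receives the input text as its first argument, which this README does not confirm. For exact counts you would wrap a real tokenizer instead.

```javascript
// Hypothetical tokenizer hook (the name wordPunctTokenizer is illustrative).
// Assumption: the hook is called with the input text as its first argument.
function wordPunctTokenizer(text = "") {
  // Match runs of letters/digits, or any single non-space symbol.
  const pieces = String(text).match(/[\p{L}\p{N}]+|[^\s\p{L}\p{N}]/gu) || [];
  return { tokens: pieces.length };
}

console.log(wordPunctTokenizer("Hello, world!")); // { tokens: 4 }

// Possible wiring, following the option names shown above:
// estimateTokens("Hello, world!", {
//   model: "gpt-4o",
//   strategy: "tokenizer",
//   tokenizer: wordPunctTokenizer
// });
```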

Language-aware estimation

For better accuracy with non-English content:

const result = estimateTokens(
  "Bonjour, comment allez-vous?",
  { 
    model: "gpt-4o",
    language: "fr"  // French
  }
);

// Supported languages: en, es, fr, de, it, pt, ru, zh, ja, ko, ar, hi, code

Using chat-style inputs (array of messages)

Useful when estimating prompts made of multiple messages:

estimateTokens(
  [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "Summarize the following text:" },
    { role: "user", content: articleText }
  ],
  { model: "claude-3-sonnet" }
);
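When a multi-message prompt runs long, a per-message breakdown can show which message dominates. The sketch below is not part of the library: breakdownByMessage is a hypothetical helper that takes the estimator as an argument and relies only on the tokens field of its result. The stand-in estimator assumes roughly 4 characters per token; in practice you would pass estimateTokens.

```javascript
// Hypothetical helper: estimate each message's content separately.
// `estimate` is injected so any estimator with a { tokens } result works.
function breakdownByMessage(messages, estimate, options = {}) {
  return messages.map(({ role, content }) => ({
    role,
    tokens: estimate(content, options).tokens,
  }));
}

// Stand-in estimator (assumption: about 4 characters per token).
const roughEstimate = (text) => ({ tokens: Math.ceil(text.length / 4) });

const rows = breakdownByMessage(
  [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "Summarize the following text:" },
  ],
  roughEstimate
);

console.log(rows);
// [ { role: 'system', tokens: 7 }, { role: 'user', tokens: 8 } ]
```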

Handling context limit warnings

const { warning } = estimateTokens(longPrompt, {
  model: "gpt-4"
});

if (warning) {
  console.warn(warning);
}
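In a CI check or pre-flight validation you may prefer to fail hard rather than log. The following sketch works on a result object with the fields shown in the basic usage output (tokens, maxTokens, model, warning). The helper name assertFitsContext and its reservedOutputTokens parameter are hypothetical, and it assumes that any non-null warning should be treated as a failure.

```javascript
// Hypothetical guard over an estimateTokens-style result object.
// Returns the remaining token headroom, or throws if there is none.
function assertFitsContext(result, reservedOutputTokens = 0) {
  const { tokens, maxTokens, model, warning } = result;
  const remaining = maxTokens - tokens - reservedOutputTokens;
  if (warning || remaining < 0) {
    // Assumption: any non-null warning is treated as a hard failure.
    throw new Error(
      warning || `Prompt for ${model} exceeds the budget by ${-remaining} tokens`
    );
  }
  return remaining;
}

const sample = { tokens: 9, maxTokens: 128000, model: "gpt-4o", warning: null };
console.log(assertFitsContext(sample, 1000)); // 126991
```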

Budgeting and truncation helpers

const { estimateCompletionBudget, fitsContextWindow, truncateToFit } = require("llm-token-estimator");

estimateCompletionBudget(prompt, {
  model: "gpt-4o",
  reservedOutputTokens: 1200
});

fitsContextWindow(prompt, {
  model: "llama-2",
  outputTokens: 256
});

truncateToFit(prompt, {
  model: "gpt-4",
  outputTokens: 512
});
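To give a feel for what truncating to a budget involves, here is an independent sketch, not the library's implementation. It binary-searches for the longest prefix whose estimated count stays within a token budget. The names truncateByBudget and countByRatio are hypothetical, and the search assumes the counter never decreases as the text grows.

```javascript
// Hypothetical sketch: longest prefix of `text` that fits `budgetTokens`.
// Assumes countTokens is monotonic (longer text never yields fewer tokens).
// Note: slicing by UTF-16 index can split a surrogate pair at the cut point.
function truncateByBudget(text, budgetTokens, countTokens) {
  if (budgetTokens <= 0) return "";
  if (countTokens(text) <= budgetTokens) return text;

  let lo = 0;           // longest prefix length known to fit
  let hi = text.length; // shortest prefix length known not to fit
  while (hi - lo > 1) {
    const mid = Math.floor((lo + hi) / 2);
    if (countTokens(text.slice(0, mid)) <= budgetTokens) lo = mid;
    else hi = mid;
  }
  return text.slice(0, lo);
}

// Stand-in counter (assumption: about 4 characters per token).
const countByRatio = (s) => Math.ceil(s.length / 4);

const clipped = truncateByBudget("x".repeat(100), 10, countByRatio);
console.log(clipped.length); // 40
```

Cutting from the end is only one policy. For chat prompts it is often better to drop the oldest messages first.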

Listing supported models

const { listModels, getModelInfo } = require("llm-token-estimator");

console.log(listModels());
console.log(listModels({ vendor: "openai" }));
console.log(getModelInfo("gpt-4o"));
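One way to use the model list is to filter it down to the models whose context window can hold a given prompt. The sketch below is a hypothetical helper, not a library function. It takes the estimator as an argument and reads only the tokens and maxTokens fields documented for estimateTokens. The stub estimator and its model names and limits are made up for illustration.

```javascript
// Hypothetical helper: keep the models whose window fits prompt + output.
function modelsThatFit(prompt, modelNames, estimate, outputTokens = 0) {
  return modelNames.filter((model) => {
    const { tokens, maxTokens } = estimate(prompt, { model });
    return tokens + outputTokens <= maxTokens;
  });
}

// Stub estimator with made-up model names and limits (illustration only).
const HYPOTHETICAL_LIMITS = { "small-model": 10, "large-model": 1000 };
const stubEstimate = (text, { model }) => ({
  tokens: Math.ceil(text.length / 4),
  maxTokens: HYPOTHETICAL_LIMITS[model],
});

const fitting = modelsThatFit(
  "x".repeat(100),
  ["small-model", "large-model"],
  stubEstimate
);
console.log(fitting); // [ 'large-model' ]

// With the real package, this might look like:
// modelsThatFit(prompt, listModels({ vendor: "openai" }), estimateTokens, 256);
```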

Accuracy and Limitations

By default, this library approximates token counts from character-to-token ratios instead of running a real tokenizer. That keeps it fast and dependency-free, but the estimates can drift from the true count:

✅ Good for: Quick estimates, cost approximation, context limit checks
❌ Limitations: Accuracy varies with the language, the content type (prose vs. code), and each model's own tokenization

For production applications requiring high accuracy, consider using:

tiktoken for OpenAI models
Model-specific tokenizers for other providers
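To make the character-to-token idea concrete, here is a minimal sketch of a ratio-based estimate. The ratios in the table are illustrative placeholders, not the values this library uses, and ratioEstimate is a hypothetical name.

```javascript
// Illustrative characters-per-token ratios; NOT the library's actual values.
const CHARS_PER_TOKEN = { en: 4, code: 3, zh: 1.5 };

// Hypothetical ratio-based estimate: character count divided by the ratio.
function ratioEstimate(text, language = "en") {
  const ratio = CHARS_PER_TOKEN[language] ?? CHARS_PER_TOKEN.en;
  return Math.ceil(text.length / ratio);
}

console.log(ratioEstimate("Explain transformers like I'm five.")); // 9
console.log(ratioEstimate("abcdef", "code")); // 2
```

A fixed ratio like this is why the estimate drifts on mixed content: a prompt that interleaves prose and code has no single ratio that fits both.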

Supported Models

Includes 81 models from major providers:

OpenAI: GPT-5.2, GPT-5, GPT-4.1, o3, o4-mini, GPT-OSS models, and more
Anthropic: Claude 4 series (Opus 4.6, Sonnet 4.5, Haiku 4.5)
Google: Gemini 3, Gemini 2.5 series
Meta: LLaMA 3.x series
Mistral: Large 3, Medium 3.1, Ministral 3 series
Others: xAI Grok, Cohere, Alibaba Qwen, DeepSeek, Amazon Nova, and more

Use listModels() to see all supported models.

CLI usage

npx llm-token-estimator --model gpt-4o --input "Explain transformers like I'm five."
npx llm-token-estimator --list-models --vendor openai
npx llm-token-estimator --budget --model claude-3-sonnet --output-tokens 800 --file prompt.txt

Default behavior

Default model: gpt-3.5-turbo
Default language: en (English)
Default strategy: fast
Input can be:
- a string
- an array of strings
- an array of chat messages
Output tokens are not included (input only)

Example use cases

Pre-flight prompt validation
CI checks for context overflows
Prompt truncation logic
Cost estimation (approximate)
Multi-language content estimation
Model comparison and selection
Rate limiting based on token counts
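For the cost estimation use case, the arithmetic is a token count multiplied by a per-token price. The sketch below is a hypothetical helper. The prices are placeholders, not real provider pricing, and the library itself does not appear to ship pricing data.

```javascript
// Hypothetical cost helper. Prices are per one million tokens, in USD.
function estimateRequestCostUSD({
  inputTokens,
  outputTokens = 0,
  inputPricePerM,
  outputPricePerM = 0,
}) {
  return (inputTokens * inputPricePerM + outputTokens * outputPricePerM) / 1_000_000;
}

// Placeholder prices for illustration only.
const cost = estimateRequestCostUSD({
  inputTokens: 250_000,  // e.g. the tokens field from estimateTokens
  outputTokens: 50_000,  // expected completion size, chosen by you
  inputPricePerM: 2,
  outputPricePerM: 8,
});
console.log(cost); // 0.9
```

Because the input count is itself an estimate, treat the result as a rough figure rather than a billing prediction.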

API Reference

estimateTokens(input, options)

Parameters:

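input (string | string[] | object[]): Text to estimate tokens for, given as a string, an array of strings, or an array of chat messages ({ role, content })
options (object):
- model (string): Model name (default: "gpt-3.5-turbo")
- language (string): Language code for better estimation (default: "en")
- strategy ("fast" | "balanced" | "tokenizer"): Estimation strategy (default: "fast")
- tokenizer (function): Custom tokenizer hook for exact counts, used with strategy: "tokenizer"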

Returns: Object with tokens, characters, model, maxTokens, vendor, warning

estimateCompletionBudget(input, options)

Returns prompt token count plus reserved output budget information.

fitsContextWindow(input, options)

Returns whether the prompt plus requested output fits the model context window.

truncateToFit(input, options)

Truncates oversized input until it fits the model context window.

listModels(filters)

Returns model names, or metadata objects when includeMetadata: true is passed.

getModelInfo(modelName)

Returns enriched metadata for a single model.

Contributing

We welcome contributions! Feel free to:

Add new models
Improve estimation accuracy
Add new languages
Fix bugs or enhance documentation
