tokenfit
v0.1.1
Fit text into LLM token budgets — estimate, trim, and pack prompts with zero dependencies.
Every app that talks to an LLM eventually fights the same battle: the context window.
You have retrieved documents, chat history, logs, and system rules — and they don't all fit.
tokenfit is a tiny, dependency-free toolkit that helps you measure how much you have and
keep only what fits, without pulling in a megabyte-sized tokenizer.
import { pack, trim, estimateTokens } from "tokenfit";
estimateTokens("How many tokens is this?"); // → 7
// Keep the largest high-priority subset that fits in 4000 tokens
const { text, dropped } = pack(
[
{ text: systemRules, priority: 10 },
{ text: retrievedDocs, priority: 5 },
{ text: chatHistory, priority: 1 },
],
4000,
);

Why tokenfit?
- Zero dependencies. No native bindings, no 2 MB vocab files. Drops into edge functions, Cloudflare Workers, browsers, and serverless without a cold-start penalty.
- Conservative by design. The built-in estimator brackets real BPE behaviour so you under-fill rather than overflow the window.
- Bring your own tokenizer. Need exact counts? Pass tiktoken, Anthropic's tokenizer, or any (text) => number function as countTokens to trim and pack.
- Three things, done well. estimateTokens, trim, and pack: fully typed, tested, and documented.
- ESM + CJS + types, with a handy CLI.
Install
npm install tokenfit
# or: pnpm add tokenfit / yarn add tokenfit / bun add tokenfit

API
estimateTokens(text): number
Fast, dependency-free token estimate. Computes two signals, chars / 4 and words / 0.75, and
takes the larger of the two, so it stays conservative for both prose and code.
estimateTokens(""); // 0
estimateTokens("hello world"); // 3

Estimates are typically within ~10–15% of tiktoken for English and common code. When you need exact counts, supply countTokens (below).
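The heuristic described above (the larger of chars / 4 and words / 0.75) can be sketched as follows. This is an illustration, not tokenfit's source: the name sketchEstimate, the whitespace-based word split, and rounding up are all assumptions, chosen because they reproduce the example values shown in this README.

```typescript
// Hypothetical re-implementation of the described heuristic (not the library's code).
// Assumptions: words are whitespace-separated runs, and the result is rounded up.
function sketchEstimate(text: string): number {
  if (text.length === 0) return 0;
  const words = text.split(/\s+/).filter((w) => w.length > 0).length;
  const byChars = text.length / 4; // about 4 characters per token
  const byWords = words / 0.75;    // about 0.75 words per token
  return Math.ceil(Math.max(byChars, byWords));
}

console.log(sketchEstimate(""));                         // 0
console.log(sketchEstimate("hello world"));              // 3
console.log(sketchEstimate("How many tokens is this?")); // 7
```

Taking the maximum means dense code (many characters, few words) and short-word prose (many words, few characters) both push the estimate up rather than down.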
trim(text, budget, options?): string
Trim a string so its token count never exceeds budget. The result — including the
ellipsis marker — is guaranteed to fit.
trim(longLog, 2000, { strategy: "start" }); // keep the tail (newest log lines)
trim(bigFile, 1500, { strategy: "middle" }); // keep both ends, drop the middle
trim(article, 500); // strategy defaults to "end"

| Option | Type | Default | Description |
| ----------- | --------------------------------- | -------- | --------------------------------------------- |
| strategy | "end" \| "start" \| "middle" | "end" | Which part of the text to drop. |
| ellipsis | string | "…" | Marker inserted where text was removed. |
| countTokens | (text) => number | built-in | Custom token counter. |
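One way to honour the guarantee that the result, ellipsis included, fits the budget is to binary-search the number of characters kept and always count the candidate with the marker already attached. The sketch below is a minimal illustration under that assumption, not tokenfit's implementation; sketchTrim, candidate, and charCounter are hypothetical names, and the search assumes the counter never decreases as text grows.

```typescript
type Strategy = "end" | "start" | "middle";
type Counter = (text: string) => number;

// Stand-in counter for the demo (an assumption): about 4 characters per token.
const charCounter: Counter = (t) => Math.ceil(t.length / 4);

// Keep `keep` characters of `text`, placing the ellipsis where text was removed.
function candidate(text: string, keep: number, strategy: Strategy, ellipsis: string): string {
  if (strategy === "end") return text.slice(0, keep) + ellipsis;              // drop the end
  if (strategy === "start") return ellipsis + text.slice(text.length - keep); // drop the start
  const head = Math.ceil(keep / 2);
  const tail = keep - head;
  return text.slice(0, head) + ellipsis + text.slice(text.length - tail);     // drop the middle
}

function sketchTrim(
  text: string,
  budget: number,
  strategy: Strategy = "end",
  ellipsis = "…",
  count: Counter = charCounter,
): string {
  if (count(text) <= budget) return text; // already fits, nothing to drop
  if (count(candidate(text, 0, strategy, ellipsis)) > budget) return ""; // even the marker is too big
  let lo = 0; // largest kept length known to fit
  let hi = text.length - 1;
  while (lo < hi) {
    const mid = Math.ceil((lo + hi) / 2);
    if (count(candidate(text, mid, strategy, ellipsis)) <= budget) lo = mid;
    else hi = mid - 1;
  }
  return candidate(text, lo, strategy, ellipsis);
}

const sample = "abcdefghij".repeat(10); // 100 characters, 25 tokens under charCounter
console.log(sketchTrim(sample, 10, "end").length); // 40
```

Slicing by UTF-16 code units keeps the sketch short but can split a surrogate pair at the cut point; a production version would want to cut on code-point or grapheme boundaries.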
pack(items, budget, options?): PackResult
Greedily assemble the largest subset of items that fits the budget, highest priority
first, accounting for the separator between items.
const result = pack(
[
{ text: rules, priority: 10, id: "rules" },
{ text: docA, priority: 5, id: "docA" },
{ text: docB, priority: 5, id: "docB" },
],
3000,
{ separator: "\n\n---\n\n", trimLast: true },
);
result.text; // assembled prompt, ≤ 3000 tokens
result.tokens; // estimated token count of result.text
result.included; // items that made it in (output order)
result.dropped; // items left out

| Option | Type | Default | Description |
| -------------- | ------------------------------ | -------- | ---------------------------------------------------- |
| separator | string | "\n\n" | Inserted between included items. |
| trimLast | boolean | false | Trim the first non-fitting item to use the leftover. |
| trimStrategy | "end" \| "start" \| "middle" | "end" | Strategy used when trimLast is on. |
| countTokens | (text) => number | built-in | Custom token counter. |
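The greedy, priority-first packing described for pack can be sketched roughly as below. This is not tokenfit's implementation: sketchPack and charCounter are hypothetical, trimLast is left out, and two behaviours are assumptions the README does not confirm, namely that a non-fitting item is skipped while lower-priority items are still tried, and that output follows priority order with ties kept in input order.

```typescript
interface Item { text: string; priority: number; id?: string }
type Counter = (text: string) => number;

// Stand-in counter for the demo (an assumption): about 4 characters per token.
const charCounter: Counter = (t) => Math.ceil(t.length / 4);

function sketchPack(
  items: Item[],
  budget: number,
  separator = "\n\n",
  count: Counter = charCounter,
) {
  // Highest priority first; Array.prototype.sort is stable, so ties keep input order.
  const ordered = [...items].sort((a, b) => b.priority - a.priority);
  const included: Item[] = [];
  const dropped: Item[] = [];
  let text = "";
  for (const item of ordered) {
    // Count the joined string so the separator is charged against the budget.
    const next = included.length === 0 ? item.text : text + separator + item.text;
    if (count(next) <= budget) {
      text = next;
      included.push(item);
    } else {
      dropped.push(item);
    }
  }
  return { text, tokens: count(text), included, dropped };
}

const demo = sketchPack(
  [
    { text: "R".repeat(40), priority: 10, id: "rules" },
    { text: "A".repeat(400), priority: 5, id: "docA" },
    { text: "B".repeat(40), priority: 5, id: "docB" },
  ],
  30,
);
console.log(demo.included.map((i) => i.id)); // [ 'rules', 'docB' ]
console.log(demo.tokens);                    // 21
```

Re-counting the joined string on every step is quadratic in the worst case, but it keeps the separator accounting exact for any counter, including real tokenizers where token counts are not additive across a join.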
Bring your own tokenizer
For exact counts, pass your own counter to trim or pack via the countTokens option:
import { encoding_for_model } from "tiktoken";
import { pack } from "tokenfit";
const enc = encoding_for_model("gpt-4o");
const countTokens = (t: string) => enc.encode(t).length;
pack(items, 8000, { countTokens });

CLI
tokenfit ships a small CLI for shell pipelines:
# Estimate tokens
cat big.log | tokenfit count
tokenfit count README.md
# Trim to a budget (reads stdin or a file)
cat big.log | tokenfit trim -b 2000 -s start
tokenfit trim --budget 500 --strategy middle notes.md

tokenfit count [file]              Estimate tokens (stdin if no file)
tokenfit trim --budget <n> [file]  Trim text to fit a token budget
  --budget, -b <n>      Token budget (required)
  --strategy, -s <s>    end | start | middle (default: end)
  --ellipsis <str>      Marker for removed text

Recipes
Keep a chat history under budget (newest wins):
const history = messages.map((m) => ({ text: m.content, priority: m.index }));
const { text } = pack(history, 6000, { trimLast: true, trimStrategy: "start" });

Truncate a noisy log before sending it to a model:
const safe = trim(rawLog, 4000, { strategy: "start" }); // keep the most recent lines

How accurate is the estimator?
The default estimator is a heuristic, not a tokenizer. It is designed to be slightly
conservative — it tends to estimate a touch high so your prompts fit on the first try.
For budgeting, trimming, and packing this is exactly what you want. When you need
guaranteed-exact counts (e.g. billing), plug in a real tokenizer via countTokens.
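To check how conservative an estimator is on your own data, you can compare it against an exact counter over a set of samples. The helper below is a hypothetical sketch, not part of tokenfit; measureDrift and both demo counters are stand-ins, and in practice the exact argument would be a real tokenizer such as the tiktoken-based counter shown earlier.

```typescript
type Counter = (text: string) => number;

interface DriftReport {
  samples: number;           // samples actually compared
  underestimates: number;    // cases where the estimate fell below the exact count
  meanRelativeError: number; // positive means the estimator runs high on average
}

function measureDrift(samples: string[], estimate: Counter, exact: Counter): DriftReport {
  let underestimates = 0;
  let totalError = 0;
  let counted = 0;
  for (const s of samples) {
    const truth = exact(s);
    if (truth === 0) continue; // skip empty samples to avoid dividing by zero
    const guess = estimate(s);
    if (guess < truth) underestimates++;
    totalError += (guess - truth) / truth;
    counted++;
  }
  return {
    samples: counted,
    underestimates,
    meanRelativeError: counted === 0 ? 0 : totalError / counted,
  };
}

// Demo with stand-in counters (assumptions, not real tokenizers).
const roughEstimate: Counter = (t) => Math.ceil(t.length / 4);
const fakeExact: Counter = (t) => t.split(/\s+/).filter((w) => w.length > 0).length;
console.log(measureDrift(["one two three four", "hello"], roughEstimate, fakeExact));
```

A non-zero underestimates count is the number to watch: those are the samples where a prompt sized with the estimator could overflow a budget measured by the exact counter.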
Contributors ✨
This project follows the all-contributors specification. Contributions of any kind are welcome — code, docs, bug reports, ideas, reviews! See the emoji key for how each contribution is recognized, and open a PR or issue to get involved.
License
MIT © Tung Tran
