tokenfit

v0.1.1

Fit text into LLM token budgets — estimate, trim, and pack prompts with zero dependencies.


Every app that talks to an LLM eventually fights the same battle: the context window. You have retrieved documents, chat history, logs, and system rules — and they don't all fit. tokenfit is a tiny, dependency-free toolkit that helps you measure how much you have and keep only what fits, without pulling in a megabyte-sized tokenizer.

import { pack, trim, estimateTokens } from "tokenfit";

estimateTokens("How many tokens is this?"); // → 7

// Keep the largest high-priority subset that fits in 4000 tokens
const { text, dropped } = pack(
  [
    { text: systemRules,   priority: 10 },
    { text: retrievedDocs, priority: 5  },
    { text: chatHistory,   priority: 1  },
  ],
  4000,
);

Why tokenfit?

  • Zero dependencies. No native bindings, no 2 MB vocab files. Drops into edge functions, Cloudflare Workers, browsers, and serverless without a cold-start penalty.
  • Conservative by design. The built-in estimator is tuned to run slightly high relative to real BPE tokenizers, so you under-fill the window rather than overflow it.
  • Bring your own tokenizer. Need exact counts? Pass tiktoken, Anthropic's tokenizer, or any (text) => number to every API.
  • Three things, done well. estimateTokens, trim, and pack — fully typed, tested, and documented.
  • ESM + CJS + types, with a handy CLI.

Install

npm install tokenfit
# or: pnpm add tokenfit  /  yarn add tokenfit  /  bun add tokenfit

API

estimateTokens(text): number

Fast, dependency-free token estimate. Computes two signals, chars / 4 and words / 0.75, and returns the larger of the two, so it stays conservative for both prose and code.

estimateTokens("");                  // 0
estimateTokens("hello world");       // 3

Estimates are typically within ~10–15% of tiktoken for English and common code. When you need exact counts, supply countTokens (below).
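To make the heuristic above concrete, here is a minimal sketch of a max-of-two-signals estimator. It is not tokenfit's source: the name `estimateTokensSketch`, the whitespace-based word split, and the round-up step are all assumptions, chosen so that the results match the documented examples.

```typescript
// Hypothetical re-creation of the heuristic described above; not tokenfit's code.
function estimateTokensSketch(text: string): number {
  if (text.length === 0) return 0;

  // Signal 1: roughly four characters per token (rounded up, an assumption).
  const byChars = Math.ceil(text.length / 4);

  // Signal 2: roughly 0.75 words per token, counting whitespace-separated runs.
  const words = text.match(/\S+/g)?.length ?? 0;
  const byWords = Math.ceil(words / 0.75);

  // Take the larger signal so the estimate errs on the high side.
  return Math.max(byChars, byWords);
}

console.log(estimateTokensSketch("hello world")); // 3
```

Dense text such as minified code is dominated by the character signal, while short-word prose is dominated by the word signal, which is why taking the maximum tends to stay conservative for both.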

trim(text, budget, options?): string

Trim a string so its token count never exceeds budget. The result — including the ellipsis marker — is guaranteed to fit.

trim(longLog, 2000, { strategy: "start" });   // keep the tail (newest log lines)
trim(bigFile, 1500, { strategy: "middle" });  // keep both ends, drop the middle
trim(article, 500);                            // strategy defaults to "end"

| Option | Type | Default | Description |
| ----------- | --------------------------------- | -------- | --------------------------------------------- |
| strategy | "end" \| "start" \| "middle" | "end" | Which part of the text to drop. |
| ellipsis | string | "…" | Marker inserted where text was removed. |
| countTokens | (text) => number | built-in | Custom token counter. |
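One way to honour the "result including the ellipsis fits" guarantee is to search for the longest prefix that still fits once the marker is appended. The sketch below illustrates that idea for the "end" strategy only. `trimEndSketch` and `charCount` are hypothetical names, and the binary search assumes the counter never decreases as text grows; tokenfit's actual implementation may work differently.

```typescript
type Counter = (text: string) => number;

// Stand-in counter for this sketch: about four characters per token.
const charCount: Counter = (text) => Math.ceil(text.length / 4);

// Hypothetical "end" strategy: drop the end of the text and append the marker.
function trimEndSketch(
  text: string,
  budget: number,
  countTokens: Counter = charCount,
  ellipsis = "…",
): string {
  if (countTokens(text) <= budget) return text; // already fits, nothing to drop
  if (countTokens(ellipsis) > budget) return ""; // not even the marker fits

  // Binary search over prefix lengths; `lo` is always a length known to fit.
  // Note: slicing by UTF-16 code units can split a surrogate pair.
  let lo = 0;
  let hi = text.length;
  while (lo < hi) {
    const mid = Math.ceil((lo + hi) / 2);
    if (countTokens(text.slice(0, mid) + ellipsis) <= budget) {
      lo = mid;
    } else {
      hi = mid - 1;
    }
  }
  return text.slice(0, lo) + ellipsis;
}

const trimmed = trimEndSketch("a".repeat(100), 5);
console.log(trimmed.length, charCount(trimmed));
```

Counting the candidate with the marker already attached is what keeps the marker from pushing the result over budget.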

pack(items, budget, options?): PackResult

Greedily assemble the items that fit the budget, highest priority first, accounting for the separator inserted between items.

const result = pack(
  [
    { text: rules,  priority: 10, id: "rules" },
    { text: docA,   priority: 5,  id: "docA"  },
    { text: docB,   priority: 5,  id: "docB"  },
  ],
  3000,
  { separator: "\n\n---\n\n", trimLast: true },
);

result.text;      // assembled prompt, ≤ 3000 tokens
result.tokens;    // estimated token count of result.text
result.included;  // items that made it in (output order)
result.dropped;   // items left out

| Option | Type | Default | Description |
| -------------- | ------------------------------ | -------- | ---------------------------------------------------- |
| separator | string | "\n\n" | Inserted between included items. |
| trimLast | boolean | false | Trim the first non-fitting item to use the leftover. |
| trimStrategy | "end" \| "start" \| "middle" | "end" | Strategy used when trimLast is on. |
| countTokens | (text) => number | built-in | Custom token counter. |
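The greedy, priority-first behaviour can be sketched in a few lines. This is an illustration under stated assumptions, not the library's code: `packSketch`, `Item`, and `lengthCount` are hypothetical names, `trimLast` is left out, the sketch keeps scanning after an item fails to fit (the library may stop instead), and it emits items in priority order, which may differ from the library's output order.

```typescript
interface Item {
  text: string;
  priority: number;
  id?: string;
}

interface PackSketchResult {
  text: string;
  tokens: number;
  included: Item[];
  dropped: Item[];
}

// Stand-in counter for this sketch: about four characters per token.
const lengthCount = (text: string): number => Math.ceil(text.length / 4);

function packSketch(
  items: Item[],
  budget: number,
  separator = "\n\n",
  countTokens: (text: string) => number = lengthCount,
): PackSketchResult {
  // Array.prototype.sort is stable, so equal priorities keep their input order.
  const ordered = [...items].sort((a, b) => b.priority - a.priority);
  const included: Item[] = [];
  const dropped: Item[] = [];
  let text = "";

  for (const item of ordered) {
    // Count the whole candidate so the separator is charged against the budget.
    const candidate =
      included.length === 0 ? item.text : text + separator + item.text;
    if (countTokens(candidate) <= budget) {
      text = candidate;
      included.push(item);
    } else {
      dropped.push(item);
    }
  }
  return { text, tokens: countTokens(text), included, dropped };
}
```

Re-counting the joined candidate on every step is simple but calls the counter once per item on a growing string; with an expensive exact tokenizer, a running total would be the cheaper design.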

Bring your own tokenizer

For exact counts, hand any counter to any function:

import { encoding_for_model } from "tiktoken";
import { pack } from "tokenfit";

const enc = encoding_for_model("gpt-4o");
const countTokens = (t: string) => enc.encode(t).length;

pack(items, 8000, { countTokens });

enc.free(); // release the WASM encoder once you are done counting
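An exact tokenizer is slower than the heuristic, and a budgeting routine may count the same string more than once. A small cache around the counter is one way to soften that cost. The wrapper below is a sketch, not part of tokenfit: `memoizeCounter` and `maxEntries` are hypothetical names, and the only contract it relies on is the `(text) => number` shape described above.

```typescript
type Counter = (text: string) => number;

// Hypothetical helper: wraps any counter with a bounded cache keyed by the text.
function memoizeCounter(count: Counter, maxEntries = 1000): Counter {
  const cache = new Map<string, number>();
  return (text) => {
    const hit = cache.get(text);
    if (hit !== undefined) return hit;

    const value = count(text);
    if (cache.size >= maxEntries) {
      // Map iterates in insertion order, so the first key is the oldest entry.
      const oldest = cache.keys().next().value;
      if (oldest !== undefined) cache.delete(oldest);
    }
    cache.set(text, value);
    return value;
  };
}

// Usage with a stand-in counter; swap in a real tokenizer's counter here.
const cached = memoizeCounter((t) => Math.ceil(t.length / 4));
console.log(cached("hello world"), cached("hello world"));
```

The cache holds full strings as keys, so it suits repeated counts of the same documents more than a stream of unique inputs.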

CLI

tokenfit ships a small CLI for shell pipelines:

# Estimate tokens
cat big.log | tokenfit count
tokenfit count README.md

# Trim to a budget (reads stdin or a file)
cat big.log | tokenfit trim -b 2000 -s start
tokenfit trim --budget 500 --strategy middle notes.md

tokenfit count [file]                 Estimate tokens (stdin if no file)
tokenfit trim --budget <n> [file]     Trim text to fit a token budget
  --budget, -b <n>      Token budget (required)
  --strategy, -s <s>    end | start | middle   (default: end)
  --ellipsis <str>      Marker for removed text

Recipes

Keep a chat history under budget (newest wins):

// Use the array position as the priority so later (newer) messages rank higher.
const history = messages.map((m, i) => ({ text: m.content, priority: i }));
const { text } = pack(history, 6000, { trimLast: true, trimStrategy: "start" });

Truncate a noisy log before sending it to a model:

const safe = trim(rawLog, 4000, { strategy: "start" }); // keep the most recent lines

How accurate is the estimator?

The default estimator is a heuristic, not a tokenizer. It is designed to be slightly conservative — it tends to estimate a touch high so your prompts fit on the first try. For budgeting, trimming, and packing this is exactly what you want. When you need guaranteed-exact counts (e.g. billing), plug in a real tokenizer via countTokens.
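If you want to check how conservative an estimator is on your own data, you can compare it against an exact counter over a sample of texts. The helper below is a sketch with hypothetical names (`compareCounters`, `ErrorSummary`); both counters are passed in, so nothing here depends on tokenfit's internals, and the counters in the usage lines are stand-ins rather than real tokenizers.

```typescript
type Counter = (text: string) => number;

interface ErrorSummary {
  underestimates: number; // samples where the estimate was below the exact count
  maxOver: number;        // largest relative overestimate, e.g. 0.15 = 15% high
  maxUnder: number;       // largest relative underestimate, as a negative number
}

function compareCounters(
  estimate: Counter,
  exact: Counter,
  samples: string[],
): ErrorSummary {
  const summary: ErrorSummary = { underestimates: 0, maxOver: 0, maxUnder: 0 };
  for (const sample of samples) {
    const truth = exact(sample);
    if (truth === 0) continue; // relative error is undefined for empty samples
    const error = (estimate(sample) - truth) / truth;
    if (error < 0) summary.underestimates += 1;
    summary.maxOver = Math.max(summary.maxOver, error);
    summary.maxUnder = Math.min(summary.maxUnder, error);
  }
  return summary;
}

// Stand-in counters for illustration only.
const byChars: Counter = (t) => Math.ceil(t.length / 4);
const byWords: Counter = (t) => t.match(/\S+/g)?.length ?? 0;
console.log(compareCounters(byChars, byWords, ["hello world", "a b c d e f g h"]));
```

A non-zero `underestimates` count on representative data is the signal that a budget needs extra headroom or an exact counter.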

Contributors ✨

This project follows the all-contributors specification. Contributions of any kind are welcome — code, docs, bug reports, ideas, reviews! See the emoji key for how each contribution is recognized, and open a PR or issue to get involved.

License

MIT © Tung Tran