promptsize
v0.1.0
size-limit for your LLM prompts. Count the tokens in your prompt and context files, set budgets, and fail CI when a prompt quietly grows past its limit.
Prompts and context templates balloon over time (a few extra few-shot examples
here, a longer system prompt there) until you overflow a context window or blow
your per-call cost budget in production. promptsize treats your prompts like a
build artifact: it measures them on every PR and tells you (and your reviewers)
when they cross a line you set.
promptsize
agent system prompt
Limit: 1 K tokens
Size: 214 tokens (o200k_base)
✔ within budget (+38 vs baseline)
few-shot examples
Limit: 8 K tokens
Size: 9.2 K tokens (o200k_base)
✘ over budget by 1.2 K tokens
1 prompt over budget.

Why not just a token counter?
There are plenty of token counters and usage trackers. promptsize is the
piece that was missing: a budget gate that runs in CI, exits non-zero on a
breach, and tracks regressions against a committed baseline — the way
size-limit does for JS bundles.
- Offline & deterministic. Token counts come from gpt-tokenizer: pure JS, no network, no API key. Same input → same number, every time.
- Multi-model. Pick an encoding (o200k_base, cl100k_base, …) or a model name (gpt-4o, gpt-4, claude-*, gemini-*). Anthropic/Gemini are approximated with o200k_base (they ship no public JS tokenizer) and the encoding used is always printed, so the number is never a black box.
- Globs & grouping. Budget a single file or a whole directory of few-shot examples as one number.
- Regression tracking. Snapshot current sizes to .promptsize.json and see the delta on every run.
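The grouping and regression-tracking bullets can be sketched in a few lines. This is only an illustration: summarizeGroup is a hypothetical helper, not part of promptsize, and the flat name-to-tokens baseline shape is an assumption about .promptsize.json rather than its documented format. The numbers mirror the sample output above (214 tokens, +38 vs baseline).

```javascript
// Hypothetical helper, not promptsize's API: add up per-file token counts into
// one number for a group and compare it with a previously saved baseline.
function summarizeGroup(name, fileCounts, baseline = {}) {
  // One number for the whole group, however many files the glob matched.
  const size = Object.values(fileCounts).reduce((sum, n) => sum + n, 0);
  const previous = baseline[name];
  return {
    name,
    size,
    // delta is null when the baseline has no entry for this group yet.
    delta: previous === undefined ? null : size - previous,
  };
}

// Assumed baseline shape: display name -> token count at the last save.
const baseline = { "agent system prompt": 176 };

const summary = summarizeGroup(
  "agent system prompt",
  { "prompts/system.md": 214 },
  baseline
);
console.log(summary.size, summary.delta); // 214 38
```

Keying the baseline by display name matches the config table's note that name doubles as the baseline key; renaming an entry would therefore reset its delta in a scheme like this one.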
Install
npm install --save-dev promptsize
# or: pnpm add -D promptsize / yarn add -D promptsize

Requires Node ≥ 18.3.
Configure
Create promptsize.config.json (or a .js / .mjs / .ts config, or a
"promptsize" key in package.json):
{
  "tokenizer": "o200k_base",
  "limits": [
    { "name": "system prompt", "path": "prompts/system.md", "limit": "2k" },
    { "name": "few-shot examples", "path": "prompts/examples/*.md", "limit": 8000 },
    { "name": "rag template", "path": "src/rag/*.txt", "limit": "1.5k", "tokenizer": "gpt-4" }
  ]
}

| Field | Type | Notes |
| ----------- | -------------------- | ------------------------------------------------------------ |
| tokenizer | string | Default encoding/model for all entries. Default o200k_base. |
| limits[] | array | One budget per logical prompt. |
| .name | string? | Display name and baseline key. Defaults to the glob. |
| .path | string \| string[] | File path(s) / glob(s), relative to the config file. |
| .limit | number \| string | Budget in tokens. 2000, "2k", "1.5k" all work. |
| .tokenizer| string? | Per-entry override. |
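To make the table concrete, here is a minimal sketch of how one limits[] entry might be normalized. parseLimit, resolveEncoding, and normalizeEntry are hypothetical names, not promptsize's API, and the real implementation may differ. The gpt-4 to cl100k_base and gpt-4o to o200k_base pairs are OpenAI's published encodings; joining several globs into a default name is an assumption.

```javascript
// Hypothetical helpers for illustration only; not part of promptsize.

// Turn a limit such as 2000, "2k", or "1.5k" into a whole number of tokens.
function parseLimit(limit) {
  if (typeof limit === "number") {
    if (!Number.isFinite(limit) || limit < 0) {
      throw new Error(`Invalid limit: ${limit}`);
    }
    return Math.round(limit);
  }
  // Digits with an optional decimal part, then an optional "k" suffix.
  const match = /^(\d+(?:\.\d+)?)\s*(k?)$/i.exec(String(limit).trim());
  if (!match) {
    throw new Error(`Invalid limit: ${limit}`);
  }
  const value = Number(match[1]);
  return Math.round(match[2] ? value * 1000 : value);
}

// Only the model names mentioned in the README are mapped here.
const MODEL_ENCODINGS = { "gpt-4o": "o200k_base", "gpt-4": "cl100k_base" };

// Resolve an encoding or model name to the encoding used for counting.
function resolveEncoding(name) {
  if (name === "o200k_base" || name === "cl100k_base") {
    return { encoding: name, approximate: false };
  }
  if (MODEL_ENCODINGS[name]) {
    return { encoding: MODEL_ENCODINGS[name], approximate: false };
  }
  if (name.startsWith("claude-") || name.startsWith("gemini-")) {
    // Approximation described in the README: no public JS tokenizer exists.
    return { encoding: "o200k_base", approximate: true };
  }
  throw new Error(`Unknown tokenizer: ${name}`);
}

// Apply the defaults from the table to a single limits[] entry.
function normalizeEntry(entry, defaultTokenizer = "o200k_base") {
  const paths = Array.isArray(entry.path) ? entry.path : [entry.path];
  return {
    name: entry.name ?? paths.join(", "), // assumed default for several globs
    paths,
    limit: parseLimit(entry.limit),
    ...resolveEncoding(entry.tokenizer ?? defaultTokenizer),
  };
}

const rag = normalizeEntry({ path: "src/rag/*.txt", limit: "1.5k", tokenizer: "gpt-4" });
console.log(rag.name, rag.limit, rag.encoding); // src/rag/*.txt 1500 cl100k_base
```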
Use
promptsize # check all budgets; exit 1 if any are over
promptsize --why # show per-file token breakdown
promptsize --json # machine-readable output
promptsize --save # write current sizes to .promptsize.json (commit it)
promptsize --silent # exit code only, no output
promptsize -c path/to/config.json

Exit codes: 0 within budget · 1 over budget · 2 config/runtime error.
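The exit-code convention above is what makes the tool usable as a CI gate. A rough sketch of that decision follows; exitCodeFor is a hypothetical function, not promptsize's API, and treating "over budget" as strictly greater than the limit is an assumption.

```javascript
// Hypothetical helper, not part of promptsize: derive a process exit code
// from budget results, following the 0 / 1 / 2 convention.
function exitCodeFor(results, error = null) {
  if (error) return 2; // config or runtime error
  // Assumption: a prompt is over budget only when size exceeds the limit.
  const overBudget = results.filter((r) => r.size > r.limit);
  return overBudget.length > 0 ? 1 : 0;
}

// Sizes and limits taken from the sample output earlier in this README.
const results = [
  { name: "agent system prompt", size: 214, limit: 1000 },
  { name: "few-shot examples", size: 9200, limit: 8000 },
];

console.log(exitCodeFor(results)); // 1
```

Keeping code 2 separate from code 1 lets a CI job tell a broken config apart from a genuine budget breach.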
Track regressions
Commit a baseline once, then every run shows the delta:
promptsize --save && git add .promptsize.json

GitHub Action
# .github/workflows/promptsize.yml
name: promptsize
on: [pull_request]
jobs:
  budget:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: abdulmunimjemal/promptsize@v1
        with:
          why: "true" # optional: per-file breakdown in the log
          # config: custom/path.json # optional

The Action is self-contained (no install step) and adds a job summary plus an
inline ::error:: annotation for every prompt that's over budget.
Or just run the CLI in any workflow: npx promptsize.
Programmatic API
import { loadConfig, analyze, countTokens } from "promptsize";
const { config, dir } = await loadConfig();
const result = await analyze(config, { baseDir: dir });
console.log(result.ok, result.entries);
await countTokens("some prompt text", "gpt-4o"); // -> number

License
MIT © Abdulmunim Jemal
