prompt-minifier
Token-aware prompt minifier with explainable diff. Shrinks LLM prompts and tells you exactly what changed.
import { minify } from 'prompt-minifier';
const r = minify(
'In order to win, please could you take into consideration the fact that, ' +
'due to the fact that we are at this point in time absolutely essential, ' +
'you must respond in a timely manner.',
);
r.minified;
// → 'To win, consider the fact that, because we are now absolutely essential,
// you must respond promptly.'
r.savings.charPercent; // 45.1
r.changes.length; // 7
Zero runtime dependencies. Bring your own tokenizer (gpt-tokenizer, tiktoken, anything callable). Falls back to a chars / 4 estimator if you don't.
When you'd use this
prompt-minifier finds the wordy bits in a prompt and replaces them with shorter equivalents that mean the same thing. It's most useful when:
- you're embedding user-written prompts (template prompts, GPT instructions, customer support replies) where verbosity is normal
- you have a polite system prompt with "please could you", "I would like you to", "thanks in advance" that ships on every request
- you generated a prompt with another LLM and it came back full of "in order to", "due to the fact that", and "it is important to note that"
It is not going to find anything in a tightly-written engineering prompt — and that is fine. We measured: across all 1,624 prompts in awesome-chatgpt-prompts, the median prompt saves 0%, because those prompts are already well-engineered. The point of prompt-minifier is to flatten the long tail.
Install
npm i prompt-minifier
Node 18+. ESM and CJS, types included.
Quick start
Library
import { minify } from 'prompt-minifier';
const result = minify(prompt, { level: 'balanced' });
console.log(`${result.tokensBefore} → ${result.tokensAfter} tokens (-${result.savings.tokenPercent.toFixed(1)}%)`);
console.log(`${result.changes.length} changes`);
for (const c of result.changes) {
console.log(`${c.rule}: "${c.from}" → "${c.to}"`);
}
CLI
# minify a file (writes to stdout)
prompt-minifier my-prompt.txt > shrunk.txt
# pipe and inspect a colored diff
cat my-prompt.txt | prompt-minifier --diff
# emit the full result as JSON
cat my-prompt.txt | prompt-minifier --json
# list every available rule
prompt-minifier --rules
Levels
minify(prompt, { level: 'balanced' }) // default
| Level | Rules | Risk | What it does |
| --- | --- | --- | --- |
| safe | smart-quotes, dashes, nbsp, zero-width, collapse-whitespace | none | Character-level cleanup. Won't change a single word — but does fix the dozen-or-so Unicode characters that tokenize badly. |
| balanced (default) | safe + politeness, filler-words, verbose-phrases | low | Phrase-level rewrites: "in order to" → "to", strips "please could you", drops "as you can see". Preserves meaning. |
| aggressive | balanced + redundant-qualifiers, redundant-adverbs | medium | Strips tautological pairs ("end result" → "result", "very unique" → "unique"). May tighten phrasing in ways some readers consider stylistic, not literal. |
Each individual rule can be turned on or off — see API.
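For example, the rules option lets you drop a single rule from a level, or restrict a run to named rules (mirroring the CLI --include / --exclude flags). A minimal sketch using rule names from the table above:
import { minify } from 'prompt-minifier';
const prompt = 'Please could you summarise this, in order to save me time.';
// Balanced level, but leave polite phrasing untouched.
const keepPolite = minify(prompt, {
  level: 'balanced',
  rules: { exclude: ['politeness'] },
});
// Only run the named character-level rules.
const charCleanupOnly = minify(prompt, {
  rules: { include: ['smart-quotes', 'collapse-whitespace'] },
});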
Tokenizer setup
By default, prompt-minifier uses a quick Math.ceil(text.length / 4) estimator (OpenAI's rule of thumb). For accurate counts, plug in any tokenizer that exposes a (text) => count function.
gpt-tokenizer (OpenAI, pure JS)
import { encode } from 'gpt-tokenizer';
import { minify } from 'prompt-minifier';
minify(prompt, {
level: 'balanced',
countTokens: (t) => encode(t).length,
});
tiktoken (OpenAI, WASM)
import { encoding_for_model } from 'tiktoken';
import { minify } from 'prompt-minifier';
const enc = encoding_for_model('gpt-4o');
minify(prompt, {
level: 'balanced',
countTokens: (t) => enc.encode(t).length,
});
llama-tokenizer-js (Llama / Mistral)
import LlamaTokenizer from 'llama-tokenizer-js';
import { minify } from 'prompt-minifier';
minify(prompt, {
level: 'balanced',
countTokens: (t) => LlamaTokenizer.encode(t).length,
});
Preserve zones
Some text in a prompt is structural and must not be touched. By default prompt-minifier skips:
- ``` ... ``` fenced code blocks
- {{name}} (Handlebars / Jinja)
- ${name} (JS template literal)
You can disable either, or add custom regions:
minify(prompt, {
preserveCodeBlocks: false, // touch code too
preserveTemplateVars: false, // touch {{x}} and ${x} too
preserve: [/<thinking>[\s\S]*?<\/thinking>/g], // never touch <thinking> blocks
});
The matching string is left byte-for-byte identical in the output. The pipeline splits the input into editable + preserved chunks, runs rules only on the editable ones, and reassembles.
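For intuition, that chunking step works roughly like the sketch below. This is an illustrative outline of the idea, not the package's actual implementation; splitChunks, Chunk, and reassemble are hypothetical names.
type Chunk = { text: string; preserved: boolean };
// Split the input into editable and preserved chunks, using global regexes
// for the preserve zones (the preserve option expects /g patterns).
function splitChunks(input: string, preserve: RegExp[]): Chunk[] {
  const spans: { start: number; end: number }[] = [];
  for (const re of preserve) {
    for (const m of input.matchAll(re)) {
      spans.push({ start: m.index!, end: m.index! + m[0].length });
    }
  }
  spans.sort((a, b) => a.start - b.start);
  const chunks: Chunk[] = [];
  let cursor = 0;
  for (const s of spans) {
    if (s.start < cursor) continue; // ignore overlapping spans
    if (s.start > cursor) chunks.push({ text: input.slice(cursor, s.start), preserved: false });
    chunks.push({ text: input.slice(s.start, s.end), preserved: true });
    cursor = s.end;
  }
  if (cursor < input.length) chunks.push({ text: input.slice(cursor), preserved: false });
  return chunks;
}
// Rules only ever see editable chunks; preserved chunks are copied back verbatim.
const reassemble = (chunks: Chunk[], applyRules: (t: string) => string) =>
  chunks.map((c) => (c.preserved ? c.text : applyRules(c.text))).join('');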
API
function minify(input: string, options?: MinifyOptions): MinifyResult;
interface MinifyOptions {
level?: 'safe' | 'balanced' | 'aggressive'; // default 'balanced'
countTokens?: (text: string) => number; // default chars/4
preserve?: RegExp[];
preserveCodeBlocks?: boolean; // default true
preserveTemplateVars?: boolean; // default true
rules?: { include?: string[]; exclude?: string[] };
customRules?: Rule[];
}
interface MinifyResult {
minified: string;
original: string;
tokensBefore: number;
tokensAfter: number;
charsBefore: number;
charsAfter: number;
savings: {
tokens: number;
tokenPercent: number;
chars: number;
charPercent: number;
};
changes: Change[];
}
interface Change {
rule: string;
category: 'safe' | 'balanced' | 'aggressive';
from: string;
to: string;
index: number; // position in the input at the time the rule fired
savedChars: number;
}
Helpers:
import { listRules, listRulesForLevel } from 'prompt-minifier';
listRules(); // every built-in rule
listRulesForLevel('balanced'); // only the rules that fire at this level
Custom rules
A rule is a name, a category (which level it belongs to), and an apply function:
import type { Rule } from 'prompt-minifier';
const noEmoji: Rule = {
name: 'no-emoji',
category: 'safe',
description: 'Strip emoji',
apply(text) {
const re = /\p{Extended_Pictographic}/gu;
const matches = Array.from(text.matchAll(re));
let result = '';
let cursor = 0;
const changes = matches.map((m) => {
const idx = m.index!;
result += text.slice(cursor, idx);
cursor = idx + m[0].length;
return { from: m[0], to: '', index: idx, savedChars: m[0].length };
});
result += text.slice(cursor);
return { text: result, changes };
},
};
minify(prompt, { customRules: [noEmoji] });
Custom rules run after the built-ins selected for the level.
CLI reference
prompt-minifier [file] [options]
--level <level> safe | balanced | aggressive (default: balanced)
--diff colored before/after diff with per-rule savings
--json emit full MinifyResult as JSON
--quiet suppress trailing summary on TTY output
--rules list every built-in rule
--include <a,b> only run these rules
--exclude <a,b> run all except these
--no-preserve-code do not protect ``` ``` blocks
--no-preserve-vars do not protect {{var}} or ${var}
--version
--help
Benchmarks
Run npm run bench:corpus to reproduce these numbers locally — the script downloads awesome-chatgpt-prompts (CC0) and runs every prompt at level: 'balanced'.
Corpus: 1,624 prompts, 4.5M chars total.
| Metric | Value |
| --- | ---: |
| Aggregate savings | 0.40% |
| Median prompt | 0.00% |
| P90 (top decile) | 0.63% |
| Best single prompt | 59.89% |
| Prompts saving ≥ 2% | 47 (2.9%) |
| Prompts saving ≥ 5% | 14 (0.9%) |
The honest read: most prompts in this corpus are already well-engineered, so they save nothing. The package earns its keep on the long-tail prompts that aren't.
For a more representative picture of "what an actual verbose prompt saves," see the bundled fixtures under test/fixtures/:
| Fixture | Balanced savings |
| --- | ---: |
| synthesized-polite.txt (polite system prompt) | 29.6% |
| synthesized-technical.txt (verbose technical instructions) | 13.6% |
FAQ
Will this change my model's output? Probably not. The safe rules touch only invisible Unicode and whitespace. The balanced rules replace verbose phrases with their meaning-preserving short forms (the kind of edits a human editor would make). aggressive strips tautological pairs that some readers consider stylistic — when in doubt, stay on balanced. A verify mode that runs both versions through a cheap LLM and measures output similarity is on the v0.2 roadmap.
Why is the token count slightly different from my real tokenizer? Because by default, prompt-minifier uses chars / 4 as a free estimator. Pass your real tokenizer via countTokens for exact numbers — see Tokenizer setup.
Why didn't it touch my code block? Code fences (``` ... ```) and template variables ({{x}}, ${x}) are protected by default. Disable with preserveCodeBlocks: false or preserveTemplateVars: false. To protect arbitrary regions, pass preserve: [/.../].
Are the changes reversible? No — a minified prompt can't be exactly reconstructed from the output. But every change is recorded in result.changes (rule, before, after, position, chars saved), so you have a full audit trail of what the package did.
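If you want to keep that audit trail around, the documented Change fields serialize cleanly. A small sketch (the log format here is just an illustration, not part of the package):
import { minify } from 'prompt-minifier';
const prompt = 'Please could you respond in a timely manner.';
const result = minify(prompt);
// One line per change: which rule fired, where, and what it rewrote.
const auditLog = result.changes.map((c) =>
  JSON.stringify({ rule: c.rule, at: c.index, from: c.from, to: c.to, saved: c.savedChars }),
);
console.log(auditLog.join('\n'));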
Roadmap
- v0.2 — verify mode: run both prompts through a cheap LLM (Haiku, 4o-mini) and report a semantic-similarity score, so you can flag changes that move the needle.
- v0.3 — markdown-AST preserve mode (currently regex-based — works but doesn't understand nested fences).
- v0.4 — companion packages prompt-minifier-gpt, prompt-minifier-claude, prompt-minifier-llama that pre-wire popular tokenizers.
Contributing
The single biggest way to improve this package is to expand its phrase dictionaries. Each dictionary is a JSON file under src/data/:
- verbose-phrases.json — wordy → concise
- politeness.json — polite padding to delete
- filler-words.json — narrative filler
- redundant-qualifiers.json — tautological pairs
- redundant-adverbs.json — empty intensifiers
Open a PR adding entries — no code changes required for new phrases. Each entry should be unambiguous and meaning-preserving in nearly every context. If a substitution would change meaning under a plausible reading, leave it out.
Author
Cihangir Bozdogan — [email protected]
License
MIT © 2026 — see LICENSE.
