mdmin
v0.2.3
Markdown compression + context window management for LLM apps. Rule-based engine (up to 35% token savings) + ContextBudget sliding window manager.
Website: mdmin.dev • npm: npmjs.com/package/mdmin • Extension: Chrome Web Store (pending review)
What it does
Two things, used together or independently:
Compress markdown: a rule-based engine strips verbose phrases, redundant formatting, and structural waste. 13–35% token savings depending on content. Free, instant, zero API calls.
Manage context windows: ContextBudget automatically handles sliding message windows for chat apps, agents, and RAG pipelines. It compresses old turns before dropping them, pins critical facts that never disappear, and protects recent messages verbatim.
Install
npm install -g mdmin
CLI
# Compress a file (output to stdout)
mdmin compress README.md
# Save to file
mdmin compress README.md -o README.min.md
# Batch compress a directory
mdmin compress ./docs/ --level aggressive
# Show token stats across levels
mdmin stats README.md
# Pipe from stdin
cat file.md | mdmin compress -
Programmatic
const { compress, estimateTokens } = require('mdmin')
const { output, stats } = compress(markdownText, {
  level: 'medium', // light | medium | aggressive
})
console.log(stats)
// { inputTokens: 2273, outputTokens: 1765, saved: 508, pct: 22.3 }
Compression Levels
| Level | Savings | What it does |
|---|---|---|
| light | ~10% | Whitespace, comments, basic verbose patterns |
| medium | ~20-25% | + more verbose patterns, table compression, formatting cleanup |
| aggressive | ~25-35% | + article stripping, list compression, bold removal, dictionary dedup |
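The `+` entries in the table suggest that each level runs the rules of the level below it plus its own. A minimal sketch of that layering follows. The rule lists, the names `toyCompress` and `toyStats`, and the 4-characters-per-token estimate are illustrative assumptions, not mdmin's actual rules or tokenizer. The `pct` rounding is chosen only because it is consistent with the sample stats shown earlier (508 saved of 2273 gives 22.3).

```javascript
// Toy rule sets, one list per level. Illustrative only: mdmin's real rules differ.
const swap = (pattern, replacement) => (text) =>
  text.replace(pattern, (match) =>
    // Keep sentence-initial capitalisation: "In order to" -> "To".
    match[0] === match[0].toUpperCase()
      ? replacement[0].toUpperCase() + replacement.slice(1)
      : replacement)

const RULES = {
  light: [
    (text) => text.replace(/<!--[\s\S]*?-->/g, ''), // HTML comments
    (text) => text.replace(/[ \t]+$/gm, ''),        // trailing spaces
    (text) => text.replace(/\n{3,}/g, '\n\n'),      // runs of blank lines
    swap(/\bin order to\b/gi, 'to'),
  ],
  medium: [swap(/\bdue to the fact that\b/gi, 'because')],
  aggressive: [
    (text) => text.replace(/\*\*(.+?)\*\*/g, '$1'), // bold removal
    (text) => text.replace(/\b(?:the|an|a) /g, ''), // lowercase articles only
  ],
}
const LEVELS = ['light', 'medium', 'aggressive']

// Assumed heuristic: roughly 4 characters per token.
const roughTokens = (text) => Math.ceil(text.length / 4)

// Same field names as the stats object in the programmatic example.
function toyStats(inputTokens, outputTokens) {
  const saved = inputTokens - outputTokens
  const pct = inputTokens === 0 ? 0 : Math.round((saved / inputTokens) * 1000) / 10
  return { inputTokens, outputTokens, saved, pct }
}

// Apply every rule list up to and including the requested level.
function toyCompress(text, level = 'medium') {
  const depth = LEVELS.indexOf(level)
  if (depth === -1) throw new Error(`unknown level: ${level}`)
  let output = text
  for (const name of LEVELS.slice(0, depth + 1)) {
    for (const rule of RULES[name]) output = rule(output)
  }
  return { output, stats: toyStats(roughTokens(text), roughTokens(output)) }
}
```

Because the levels are cumulative in this sketch, `aggressive` output is never longer than `medium` output for the same input.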
What It Compresses
- Verbose phrases: 150+ patterns — "In order to" → "To", "Due to the fact that" → "Because"
- Whitespace: Blank lines, trailing spaces, decorative horizontal rules
- Tables: Markdown tables → compact CSV or key:value format
- Formatting: Redundant bold on headers, deep heading nesting, emphasis markers
- Lists: Short bullet lists → inline comma-separated (aggressive)
- Links: Empty titles, unused references, verbose alt text
- Dictionary dedup: Repeated phrases replaced with §1, §2 tokens
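Two of the items above, table compaction and dictionary dedup, are easier to picture with code. The sketch below is a rough illustration under stated assumptions: `tableToCsv` and `dictionaryDedup` are hypothetical names, the phrase detection (repeated three-word runs) is a stand-in for whatever mdmin actually does, and the `§1=phrase` legend format is invented for the example.

```javascript
// Markdown table -> CSV. Assumes simple cells with no commas or escaped pipes.
function tableToCsv(table) {
  return table
    .trim()
    .split('\n')
    .filter((row) => !/^[\s|:-]+$/.test(row)) // drop the |---|---| separator row
    .map((row) =>
      row
        .trim()
        .replace(/^\||\|$/g, '') // outer pipes
        .split('|')
        .map((cell) => cell.trim())
        .join(','))
    .join('\n')
}

// Replace phrases that occur at least twice with short tokens plus a legend.
// Assumes words are separated by single spaces.
function dictionaryDedup(text, phraseWords = 3) {
  const words = text.split(' ')
  const counts = new Map()
  for (let i = 0; i + phraseWords <= words.length; i++) {
    const phrase = words.slice(i, i + phraseWords).join(' ')
    counts.set(phrase, (counts.get(phrase) || 0) + 1)
  }
  let output = text
  const legend = []
  for (const [phrase, count] of counts) {
    if (count < 2) continue
    const parts = output.split(phrase)
    if (parts.length < 3) continue // an earlier replacement consumed it
    const token = `§${legend.length + 1}`
    output = parts.join(token)
    legend.push(`${token}=${phrase}`)
  }
  return { output, legend }
}
```

A dedup pass like this only pays off when the phrase is long or frequent enough to outweigh the legend it adds, which is presumably why it sits in the aggressive level.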
ContextBudget — Context Window Manager
ContextBudget solves a problem every LLM app developer hits: what do you do when your conversation history exceeds the model's context window? Most developers splice off old messages with brittle ad-hoc code. ContextBudget does it properly — compressing before dropping, always protecting what matters most.
Priority order (highest to lowest)
| Priority | What | Behaviour |
|----------|------|-----------|
| 1st | Pins | .pin() facts are never compressed or dropped, ever |
| 2nd | Recent turns | Last keepLastN messages always verbatim |
| 3rd | Context docs | .addContext() docs compressed on ingestion, never dropped |
| 4th | Older messages | Rule-compressed first, then dropped oldest-first if still over |
| Last resort | System prompt | Only compressed if nothing else fits |
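The priority order above can be sketched as a single trimming pass. This is a simplified illustration, not mdmin's implementation: `trimToBudget`, `squeeze`, and `roughTokens` are hypothetical names, the whitespace-collapsing `squeeze` stands in for rule compression, the token estimate is an assumed 4 characters per token, and message content is assumed to be a string.

```javascript
const roughTokens = (text) => Math.ceil(text.length / 4)   // assumed heuristic
const squeeze = (text) => text.replace(/\s+/g, ' ').trim() // stand-in for rule compression

function trimToBudget(state, { limit, reserve = 8000, keepLastN = 10 }) {
  const budget = limit - reserve
  let system = state.system
  // Split history: the last keepLastN messages are protected.
  const cut = Math.max(0, state.messages.length - keepLastN)
  const older = state.messages.slice(0, cut)
  const recent = state.messages.slice(cut)
  let dropped = 0
  let compressed = 0

  const used = () =>
    [system, ...state.pins, ...state.contexts,
      ...older.map((m) => m.content), ...recent.map((m) => m.content)]
      .reduce((sum, text) => sum + roughTokens(text), 0)

  // Older messages: compress first, oldest first, while over budget.
  for (let i = 0; i < older.length && used() > budget; i++) {
    const content = squeeze(older[i].content)
    if (content !== older[i].content) {
      older[i] = { ...older[i], content }
      compressed++
    }
  }
  // Still over: drop older messages, oldest first.
  while (older.length > 0 && used() > budget) {
    older.shift()
    dropped++
  }
  // Last resort: compress the system prompt. Pins, contexts, and recent
  // messages are never touched in this sketch.
  if (used() > budget) system = squeeze(system)

  const total = used()
  return {
    system,
    messages: [...older, ...recent],
    stats: { used: total, remaining: budget - total, dropped, compressed },
  }
}
```

In this sketch `compressed` counts every message that was rewritten, including ones dropped afterwards; whether mdmin counts the same way is not stated here.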
Basic usage
const { ContextBudget } = require('mdmin')
const budget = new ContextBudget({
  limit: 128_000, // your model's context window (tokens)
  reserve: 8_000, // headroom for LLM output
  keepLastN: 10, // recent turns always kept verbatim
  compressionLevel: 'aggressive', // light | medium | aggressive
})
// One-time setup
budget.setSystem('You are a helpful assistant.')
budget.pin('user_id=u_123') // survives every trim — use for session state
budget.pin('task=migrate schema v3') // useful for long agent loops
// RAG / knowledge base docs — compressed immediately on ingestion
budget.addContext(kbArticle, { label: 'Auth Guide' })
budget.addContext(pricingDoc, { label: 'Pricing' })
// Message loop — works with any LLM provider
await budget.addMessage({ role: 'user', content: userInput })
const { messages, stats } = budget.get()
// Pass directly to OpenAI, Anthropic, or any provider:
const response = await openai.chat.completions.create({ model: 'gpt-4o', messages })
await budget.addMessage({ role: 'assistant', content: response.choices[0].message.content })
// stats tells you what happened:
// { used: 14820, remaining: 105180, messageCount: 12, dropped: 0, compressed: 3 }
Options
new ContextBudget({
  limit: 128_000, // hard token ceiling (required)
  reserve: 8_000, // held back for LLM output (default: 8000)
  keepLastN: 10, // messages protected from compression/drop (default: 10)
  compressionLevel: 'aggressive', // rule compression level (default: 'aggressive')
  compressFn: null, // optional async (text) => string — plug in deep compress
  onTrim: null, // optional callback ({ dropped, compressed, stats }) => void
})
API reference
| Method | Returns | Description |
|--------|---------|-------------|
| .setSystem(text) | this | Set system prompt |
| .addContext(text, { label }) | this | Add RAG doc — compressed on ingestion, never dropped |
| .pin(fact) | this | Add a sticky fact — never dropped or compressed |
| .removePin(fact) | this | Remove a pinned fact |
| .addMessage(message) | Promise<this> | Add a chat message — triggers trim if over budget |
| .get() | { messages, stats } | Get final array ready for any LLM API |
| .tokenCount() | number | Current total token usage |
| .clear() | this | Reset messages + contexts, keep system + pins |
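The return types in the table mean the synchronous setters can be chained, and the sample stats from Basic usage are consistent with `remaining = limit - reserve - used` (128,000 - 8,000 - 14,820 = 105,180). The stand-in class below illustrates both points. It is a hypothetical `ToyBudget`, not mdmin's `ContextBudget`: it does no trimming or compression, assumes string content, and uses an assumed 4-characters-per-token estimate.

```javascript
// Hypothetical stand-in that mirrors the method names and return types in the
// table above. No trimming, no compression, string content only.
const roughTokens = (text) => Math.ceil(text.length / 4) // assumed heuristic

class ToyBudget {
  constructor({ limit, reserve = 8000 }) {
    this.limit = limit
    this.reserve = reserve
    this.system = ''
    this.pins = []
    this.contexts = []
    this.messages = []
  }
  setSystem(text) { this.system = text; return this }
  addContext(text, { label } = {}) { this.contexts.push({ label, text }); return this }
  pin(fact) { if (!this.pins.includes(fact)) this.pins.push(fact); return this }
  removePin(fact) { this.pins = this.pins.filter((p) => p !== fact); return this }
  async addMessage(message) { this.messages.push(message); return this } // Promise<this>
  tokenCount() {
    const parts = [
      this.system,
      ...this.pins,
      ...this.contexts.map((c) => c.text),
      ...this.messages.map((m) => m.content),
    ]
    return parts.reduce((sum, part) => sum + roughTokens(part), 0)
  }
  // Reset messages and contexts, keep system prompt and pins.
  clear() { this.messages = []; this.contexts = []; return this }
  get() {
    const used = this.tokenCount()
    return {
      messages: [...this.messages],
      stats: {
        used,
        remaining: this.limit - this.reserve - used,
        messageCount: this.messages.length,
      },
    }
  }
}
```

With a class shaped like this, setup reads as one chain, for example `new ToyBudget({ limit: 128000 }).setSystem('...').pin('user_id=u_123')`, while `addMessage` still needs `await` because it returns a promise.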
Tool / non-string content
Non-string message content (tool call arrays, image blocks) passes through unchanged. Its token count is estimated from the JSON.stringify output. Such content is never compressed; it is only dropped if the budget is truly exhausted.
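As a rough illustration of that behaviour, the sketch below sizes non-string content by its serialized form and skips compression for it. The names `roughTokens` and `maybeCompress` are hypothetical, and the 4-characters-per-token ratio is an assumption; mdmin's own estimator may use a different heuristic.

```javascript
// Assumed heuristic: about 4 characters per token.
function roughTokens(content) {
  const text = typeof content === 'string' ? content : JSON.stringify(content)
  return Math.ceil(text.length / 4)
}

// Only string content is eligible for compression; anything else is
// returned untouched (the same object, not a copy).
function maybeCompress(message, compressText) {
  if (typeof message.content !== 'string') return message
  return { ...message, content: compressText(message.content) }
}
```

A tool message in the format the library accepts looks like this: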
await budget.addMessage({
  role: 'tool',
  content: [{ type: 'tool_result', tool_use_id: 'abc', content: 'ok' }]
})
TypeScript
Full TypeScript types are included in web/lib/engine.ts for browser use. For Node.js, the class is written in standard JS with JSDoc-compatible private fields.
import { ContextBudget, ContextBudgetOptions, BudgetResult } from './lib/engine'
MCP Server
Use mdmin as a tool in Claude Desktop, Cursor, or any MCP-compatible assistant:
{
  "mcpServers": {
    "mdmin": {
      "command": "npx",
      "args": ["mdmin-mcp"]
    }
  }
}
With your API key (for compression history and Pro features):
{
  "mcpServers": {
    "mdmin": {
      "command": "npx",
      "args": ["mdmin-mcp"],
      "env": {
        "MDMIN_API_KEY": "mdmin_sk_..."
      }
    }
  }
}
Claude Code CLI:
claude mcp add mdmin npx mdmin-mcp
API
Rule-based compression via HTTP — free for all registered users.
curl https://mdmin.dev/api/v1/compress \
  -H "Authorization: Bearer $MDMIN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"markdown": "...", "level": "aggressive"}'
Get your API key at mdmin.dev/settings.
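The same request can be assembled in JavaScript. The helper below only builds the URL and options that mirror the curl call above. `buildCompressRequest` is a hypothetical name, and the response format is not documented in this README, so the sketch makes no assumption about it.

```javascript
// Builds the request shown in the curl example: POST with a bearer token
// and a JSON body containing `markdown` and `level`.
function buildCompressRequest(markdown, level, apiKey) {
  return {
    url: 'https://mdmin.dev/api/v1/compress',
    options: {
      method: 'POST', // curl -d implies POST
      headers: {
        Authorization: `Bearer ${apiKey}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({ markdown, level }),
    },
  }
}
```

On Node.js 18 or later, pass the result to the built-in `fetch(url, options)` from inside an async function, and check `res.status` and the raw body before relying on any particular response shape.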
Browser Extension
Compress any selected text on any webpage in 2 clicks — no copy-paste, no tab switching.
Chrome Web Store — pending review (link will be added once approved)
Select text → click the mdmin icon → compressed text is ready to copy. Works offline, no account required.
Tiers:
- Free (no account) — compress locally, zero network calls
- Free (with account) — compressions logged to your dashboard; token savings tracked
- Pro — Deep Compress button in the popup (50%+ savings on verbose text)
Connect your account by pasting your mdmin_sk_ key in the extension settings panel. Generate one at mdmin.dev/settings.
Pro
$8/month — adds Deep Compress (LLM rewriting on top of the rule layer), compression history, and Deep Compress in the browser extension. 500 runs/month, resets on billing date.
Details at mdmin.dev.
License
AGPL-3.0-only — see LICENSE.txt
