mdmin
v0.3.1
Markdown compression + context extraction for LLM apps. Rule-based engine (up to 35% savings) + TF-IDF context extractor (up to 95% reduction on targeted queries).
Website: mdmin.dev • npm: npmjs.com/package/mdmin • PyPI: pypi.org/project/mdmin • VS Code: marketplace
What it does
Three things, used together or independently:
Compress markdown: a rule-based engine strips verbose phrases, redundant formatting, and structural waste for 13–35% token savings. Free, instant, zero API calls.
Extract relevant context: given a large document and a query, returns only the relevant chunks within a token budget. TF-IDF based, no external API, runs in milliseconds. Achieves 70–95% reduction on targeted queries. Free.
Manage context windows: ContextBudget automatically handles sliding message windows for chat apps, agents, and RAG pipelines. It compresses old turns before dropping them and pins critical facts so they are never dropped.
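To make the extraction idea concrete, here is a minimal Python sketch of TF-IDF chunk ranking under a token budget. It is not mdmin's implementation: the heading-based chunking, the smoothed IDF formula, the `rough_tokens` estimate (about four characters per token), and every function name below are assumptions made for illustration.

```python
import math
import re
from collections import Counter


def rough_tokens(text):
    """Crude token estimate: about 4 characters per token (an assumption)."""
    return max(1, len(text) // 4)


def words(text):
    """Lowercase alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def split_chunks(markdown):
    """Split a document into chunks, starting a new chunk at each '#' heading."""
    chunks, current = [], []
    for line in markdown.splitlines():
        if line.startswith("#") and current:
            chunks.append("\n".join(current))
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current))
    return chunks


def extract(markdown, query, max_tokens=2000):
    """Return the best-scoring chunks that fit in max_tokens, in document order."""
    chunks = split_chunks(markdown)
    counts = [Counter(words(c)) for c in chunks]
    n = len(chunks)
    terms = set(words(query))

    def idf(term):
        df = sum(1 for c in counts if term in c)
        return math.log((1 + n) / (1 + df)) + 1.0  # smoothed IDF

    scored = []
    for i, c in enumerate(counts):
        total = sum(c.values()) or 1
        score = sum((c[t] / total) * idf(t) for t in terms if c[t])
        if score > 0:
            scored.append((score, i))

    picked, used = [], 0
    for _, i in sorted(scored, reverse=True):  # highest score first
        cost = rough_tokens(chunks[i])
        if used + cost <= max_tokens:
            picked.append(i)
            used += cost
    return "\n\n".join(chunks[i] for i in sorted(picked))
```

Chunks that share no term with the query score zero and are never returned, which is where most of the reduction on a targeted query comes from in this sketch.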
Install
npm install -g mdmin
pip install mdmin
CLI
Compress
mdmin compress README.md # output to stdout
mdmin compress README.md -o README.min.md # save to file
mdmin compress ./docs/ --level aggressive # batch compress directory
mdmin stats README.md # token stats at all levels
cat file.md | mdmin compress - # pipe from stdin
Extract
Extract only the sections of a document relevant to your query — no LLM required.
# Single file
mdmin extract CLAUDE.md --query "how does auth work"
mdmin extract bigdoc.md -q "database schema" --max 1500
# Query across a directory (multi-doc)
mdmin extract ./docs/ -q "authentication"
mdmin extract ./docs/ -q "auth" --recursive --max 4000
Example output:
Querying 12 files: "authentication"
Found relevant content in 3/12 files | 847 tokens
── api-docs.md ──
## Authentication
JWT tokens required. Pass Bearer token in Authorization header...
── architecture.md ──
## Auth Service
Node.js on port 3001. Handles all auth for the platform...
Programmatic
Node.js: Compress
const { compress, estimateTokens } = require('mdmin')
const { output, stats } = compress(markdownText, { level: 'medium' })
console.log(stats)
// { inputTokens: 2273, outputTokens: 1765, saved: 508, pct: 22.3 }
Node.js: Extract
const { ContextExtractor } = require('mdmin')
const extractor = new ContextExtractor()
extractor.index(largeDocument)
const { text, stats } = extractor.extract('how does auth work', { maxTokens: 2000 })
console.log(text) // only the relevant chunks
console.log(stats.reduction) // e.g. 91.2 (%)
Python: Compress + Extract
from mdmin import compress, extract
# Compress
result = compress(text, level="medium")
print(result.stats.pct) # 22.3
# Extract
result = extract(large_doc, "how does auth work", max_tokens=2000)
print(result.text) # relevant chunks only
print(result.stats.reduction) # 91.2

mdmin compress file.md --level aggressive
mdmin extract file.md -q "auth flow" --max 2000
mdmin stats file.md
Compression Levels
| Level | Savings | What it does |
|---|---|---|
| light | ~10% | Whitespace, comments, basic verbose patterns |
| medium | ~20-25% | + more verbose patterns, table compression, formatting cleanup |
| aggressive | ~25-35% | + article stripping, list compression, bold removal, dictionary dedup |
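The levels in the table are cumulative: each one applies everything the previous level does, plus more. A minimal Python sketch of that layering follows. The regexes and function names are assumptions for illustration and cover only a few of the listed rules; they are not mdmin's actual rule set.

```python
import re

# Two sample rewrites only (assumed patterns, for illustration).
VERBOSE = {
    r"\bin order to\b": "to",
    r"\bdue to the fact that\b": "because",
}


def _keep_case(short):
    """Capitalize the replacement when the matched phrase was capitalized."""
    def repl(match):
        return short.capitalize() if match.group(0)[0].isupper() else short
    return repl


def light(text):
    text = re.sub(r"<!--.*?-->", "", text, flags=re.DOTALL)   # HTML comments
    text = re.sub(r"[ \t]+$", "", text, flags=re.MULTILINE)    # trailing spaces
    return re.sub(r"\n{3,}", "\n\n", text).strip()             # extra blank lines


def medium(text):
    text = light(text)
    for pattern, short in VERBOSE.items():
        text = re.sub(pattern, _keep_case(short), text, flags=re.IGNORECASE)
    return text


def aggressive(text):
    text = medium(text)
    text = re.sub(r"\*\*(.+?)\*\*", r"\1", text)               # bold removal
    return re.sub(r"\b(?:a|an|the)\s+", "", text, flags=re.IGNORECASE)  # articles


LEVELS = {"light": light, "medium": medium, "aggressive": aggressive}


def compress(text, level="medium"):
    """Apply the rule set for the requested level."""
    return LEVELS[level](text)
```

Because each level calls the one below it, output at a higher level is never longer than output at a lower one in this sketch. Note that naive article stripping also touches code spans and quoted text, which a real engine would need to protect.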
What Gets Compressed
- Verbose phrases: 150+ patterns — "In order to" → "To", "Due to the fact that" → "Because"
- Whitespace: Blank lines, trailing spaces, decorative horizontal rules
- Tables: Markdown tables → compact CSV or key:value format
- Formatting: Redundant bold on headers, deep heading nesting, emphasis markers
- Lists: Short bullet lists → inline comma-separated (aggressive)
- Links: Empty titles, unused references, verbose alt text
- Dictionary dedup: Repeated phrases replaced with §1, §2 tokens
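Two of the rules above are easier to picture with code: table compaction and dictionary dedup. The Python sketch below is illustrative only. The function names, the three-word phrase length, and the legend format are assumptions, not mdmin's actual behaviour.

```python
import re
from collections import Counter


def table_to_csv(lines):
    """Turn Markdown table rows into compact CSV lines.

    Cells containing commas are not quoted in this sketch.
    """
    rows = []
    for line in lines:
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if all(re.fullmatch(r":?-+:?", c) for c in cells):
            continue  # skip the |---|---| separator row
        rows.append(",".join(cells))
    return "\n".join(rows)


def dedup_phrases(text, min_words=3, min_count=2):
    """Replace repeated phrases with short tokens and return a legend.

    Only fixed-length word n-grams are considered (an assumption).
    """
    tokens = text.split()
    grams = Counter(
        " ".join(tokens[i:i + min_words])
        for i in range(len(tokens) - min_words + 1)
    )
    legend = {}
    for phrase, count in grams.most_common():
        if count < min_count:
            break
        # An earlier replacement may already have consumed this phrase.
        if text.count(phrase) >= min_count:
            key = f"§{len(legend) + 1}"
            legend[key] = phrase
            text = text.replace(phrase, key)
    return text, legend
```

The legend has to travel with the compressed text so the reader (or model) can expand the tokens, so dedup only pays off when a phrase is long or repeated often enough to outweigh its legend entry.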
ContextBudget — Context Window Manager
ContextBudget solves a problem every LLM app developer hits: what do you do when your conversation history exceeds the model's context window? Most developers splice off old messages with brittle ad-hoc code. ContextBudget does it properly — compressing before dropping, always protecting what matters most.
Priority order (highest to lowest)
| Priority | What | Behaviour |
|----------|------|-----------|
| 1st | Pins | .pin() facts are never compressed or dropped, ever |
| 2nd | Recent turns | Last keepLastN messages always verbatim |
| 3rd | Context docs | .addContext() docs compressed on ingestion, never dropped |
| 4th | Older messages | Rule-compressed first, then dropped oldest-first if still over |
| Last resort | System prompt | Only compressed if nothing else fits |
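The priority order can be read as a trimming procedure. The Python sketch below walks through it under simplifying assumptions: messages are plain strings, `rough_tokens` estimates about four characters per token, and `squeeze` stands in for rule-based compression. None of these names come from mdmin, and the real ContextBudget may differ in detail.

```python
def rough_tokens(text):
    """Crude token estimate: about 4 characters per token (an assumption)."""
    return max(1, len(text) // 4)


def squeeze(text):
    """Stand-in for rule-based compression: collapse runs of whitespace."""
    return " ".join(text.split())


def fit(system, pins, contexts, messages, limit, reserve, keep_last_n,
        compress=squeeze, count=rough_tokens):
    """Trim a conversation to fit in (limit - reserve) tokens."""
    budget = limit - reserve
    split = max(0, len(messages) - keep_last_n)
    older, recent = list(messages[:split]), list(messages[split:])
    # Context docs: compressed on ingestion, never dropped.
    contexts = [compress(c) for c in contexts]

    def total():
        parts = [system, *pins, *contexts, *older, *recent]
        return sum(count(p) for p in parts)

    # Older messages: compress first, oldest first ...
    for i in range(len(older)):
        if total() <= budget:
            break
        older[i] = compress(older[i])
    # ... then drop oldest-first if still over budget.
    while older and total() > budget:
        older.pop(0)
    # Last resort: compress the system prompt.
    if total() > budget:
        system = compress(system)
    # Pins and recent turns are returned untouched.
    return system, pins, contexts, older + recent
```

With the values from the basic usage example, the working budget would be 128,000 - 8,000 = 120,000 tokens. If pins, recent turns, and context docs alone exceed that, this sketch still returns them all, since those tiers are never dropped.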
Basic usage
const { ContextBudget } = require('mdmin')
const budget = new ContextBudget({
  limit: 128_000, // your model's context window (tokens)
  reserve: 8_000, // headroom for LLM output
  keepLastN: 10, // recent turns always kept verbatim
  compressionLevel: 'aggressive',
})
budget.setSystem('You are a helpful assistant.')
budget.pin('user_id=u_123')
budget.addContext(kbArticle, { label: 'Auth Guide' })
// addMessage returns a Promise; in a CommonJS file, await it inside an async function
await budget.addMessage({ role: 'user', content: userInput })
const { messages, stats } = budget.get()
// Pass directly to any LLM provider
API reference
| Method | Returns | Description |
|--------|---------|-------------|
| .setSystem(text) | this | Set system prompt |
| .addContext(text, { label }) | this | Add RAG doc — compressed on ingestion, never dropped |
| .pin(fact) | this | Add a sticky fact — never dropped or compressed |
| .addMessage(message) | Promise<this> | Add a chat message — triggers trim if over budget |
| .get() | { messages, stats } | Get final array ready for any LLM API |
| .tokenCount() | number | Current total token usage |
| .clear() | this | Reset messages + contexts, keep system + pins |
MCP Server
Use mdmin as a tool in Claude Desktop, Cursor, or any MCP-compatible assistant:
{
  "mcpServers": {
    "mdmin": {
      "command": "npx",
      "args": ["mdmin-mcp"],
      "env": { "MDMIN_API_KEY": "mdmin_sk_..." }
    }
  }
}
claude mcp add mdmin npx mdmin-mcp
MCP tools: compress_markdown, extract_relevant, get_stats, manage_context_budget
API
Free for all registered users. Get your key at mdmin.dev/settings.
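The curl calls in this section map directly onto Python's standard library. The sketch below only builds the request. The endpoint paths, headers, and JSON fields are taken from the curl examples, while `build_request` and `API_BASE` are hypothetical names for illustration. The response format is not documented here, so the sketch does not assume one.

```python
import json
import os
import urllib.request

API_BASE = "https://mdmin.dev/api/v1"


def build_request(endpoint, payload, api_key):
    """Build (but do not send) a POST request equivalent to the curl calls."""
    return urllib.request.Request(
        f"{API_BASE}/{endpoint}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_request(
    "compress",
    {"markdown": "...", "level": "aggressive"},
    os.environ.get("MDMIN_API_KEY", ""),
)
# To send it, pass req to urllib.request.urlopen and inspect the JSON body
# before relying on any particular response field.
```

For the extract endpoint, swap in `"extract"` and a payload with `document`, `query`, and `max_tokens`.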
# Compress
curl https://mdmin.dev/api/v1/compress \
-H "Authorization: Bearer $MDMIN_API_KEY" \
-H "Content-Type: application/json" \
-d '{"markdown": "...", "level": "aggressive"}'
# Extract (free tier: 100KB doc, 2K token budget)
curl https://mdmin.dev/api/v1/extract \
-H "Authorization: Bearer $MDMIN_API_KEY" \
-H "Content-Type: application/json" \
-d '{"document": "...", "query": "how does auth work", "max_tokens": 2000}'
Browser Extension
Compress any selected text on any webpage in 2 clicks — no copy-paste, no tab switching.
Chrome Web Store — pending review
Select text → click the mdmin icon → compressed text is ready to copy. Works offline, no account required.
VS Code Extension
Token counter in status bar + compress document/selection from command palette or right-click.
Install from VS Code Marketplace or search "mdmin" in the Extensions panel.
Pro
$8/month — Deep Compress (LLM rewriting, 50%+ savings), compression history, Pro API limits (2MB docs, 16K token budgets on extract). Details at mdmin.dev.
License
AGPL-3.0-only — see LICENSE.txt
