
@anduril-code/compact.md

v0.1.6


Token-efficient, context-aware compression for agent pipelines.


compact.md

Token-efficient Markdown compression and document intelligence for agent pipelines.

License: MIT · Node >= 20 · Bun >= 1.3


Why compact.md

Markdown has become the lingua franca of AI agents, but it wastes 30–50% of tokens on formatting syntax: table borders, heading markers, repetitive delimiters, whitespace padding. Every token spent on structure is a token not spent on content.

compact.md gives agents a spectrum of strategies for fitting more useful content into a context window:

  • Lossless compression — compact()/expand() deterministically encode and decode Markdown with zero information loss. expand(compact(md)) === md, always.
  • Targeted extraction — pull out only the sections an agent needs, with optional truncation limits.
  • AI summarization — abstractive LLM summaries (~200 tokens by default) for breadth-first exploration of large docs, with results cached so repeated calls are free.

The library and CLI expose the lossless path. The MCP server exposes all three.


Features

  • Lossless round-trip — expand(compact(md)) === md, always, verified by property tests
  • 30–50% token reduction on typical agent documents (lossless path)
  • Zero runtime dependencies for the core encode/decode path
  • Library + CLI + MCP server — one package, three interfaces
  • Stage-based pipeline — structural, whitespace, dedup, and semantic stages, each independently toggleable
  • Readable without expansion — compact format is parseable by LLMs even before expanding
  • Section navigation — list document structure with per-section token counts before loading any content
  • Targeted extraction — retrieve specific sections verbatim with character/row/item truncation limits
  • AI summarization — LLM-powered abstractive summaries with docType-aware prompts and in-process caching

Installation

```sh
npm install compact.md
# or
bun add compact.md
```

Quick Start

```js
import { compact, expand, verify } from 'compact.md';

const md = `# Project Status

## Tasks

- [x] Database migration
- [ ] Frontend integration

| Name  | Role    | Status |
|-------|---------|--------|
| Alice | Lead    | Active |
| Bob   | Backend | Active |
`;

const result = compact(md);
console.log(result.output);
// # Project Status
// ## Tasks
// [x] Database migration
// [] Frontend integration
// |: Name, Role, Status
// | Alice, Lead, Active
// | Bob, Backend, Active

const restored = expand(result.output);
// restored === md  ✓

console.log(verify(md)); // true
```

With options and stats:

```js
const { output, stats } = compact(md, {
  dedup: true,
  semantic: true,
  stats: true,
});

console.log(stats.savings); // e.g. 0.38 (38% fewer tokens)
```

API Reference

Library

```js
import { compact, compactDiff, expand, pruneLog, verify, createPipeline } from 'compact.md';
```

compact(markdown, options?): CompactResult

Compresses a Markdown string. Returns { output: string, stats? }.

| Option | Type | Default | Description |
|---|---|---|---|
| dedup | boolean | false | Enable deduplication stage (dictionary substitution for repeated substrings) |
| semantic | boolean | false | Enable semantic stage (strip redundant markup, normalize unicode punctuation) |
| keepComments | boolean | false | Preserve HTML comments (stripped by default) |
| onlySections | string[] | — | Keep only the listed heading sections |
| stripSections | string[] | — | Remove the listed heading sections |
| unwrapLines | boolean | false | Join soft-wrapped paragraph lines into a single line |
| tableDelimiter | string | "," | Cell delimiter used in compact table rows |
| versionMarker | boolean | false | Prepend %compact.md:1 version header |
| stats | boolean | false | Compute and return token-saving statistics |

expand(compactText, options?): string

Expands compact.md format back to standard Markdown.

| Option | Type | Default | Description |
|---|---|---|---|
| tableDelimiter | string | "," | Cell delimiter used when reading compact table rows |

verify(markdown, options?): boolean

Returns true if expand(compact(markdown)) === markdown.

compactDiff(diffText, options?): string

Compresses unified git diff text (lossy, one-way). Useful for PR review and change analysis.

| Option | Type | Default | Description |
|---|---|---|---|
| context | number | 1 | Context lines to keep around changed lines (0 strips all context) |
| compactHeaders | boolean | true | Replace diff/index/---/+++ header block with === path |
| changesOnly | boolean | false | Emit only file path + changed lines (+/-) |
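To make the reduction concrete, here is a rough sketch of the kind of output a changes-only compaction aims at, written as plain string handling. This is illustrative only, not the library's algorithm; the === path line mirrors the compactHeaders replacement described above.

```js
// Illustrative sketch (NOT the library's implementation): keep one "=== path"
// line per file plus the +/- changed lines, dropping headers, hunk markers,
// and all context lines.
function sketchChangesOnly(diffText) {
  const out = [];
  for (const line of diffText.split('\n')) {
    if (line.startsWith('+++ b/')) {
      out.push('=== ' + line.slice(6)); // new-file path stands in for the header block
    } else if (line.startsWith('---') || line.startsWith('diff ') ||
               line.startsWith('index ') || line.startsWith('@@')) {
      continue; // drop remaining header/hunk metadata
    } else if (line.startsWith('+') || line.startsWith('-')) {
      out.push(line); // keep changed lines verbatim
    }
    // unprefixed context lines are dropped entirely
  }
  return out.join('\n');
}

const diff = [
  'diff --git a/a.txt b/a.txt',
  'index 0000000..1111111 100644',
  '--- a/a.txt',
  '+++ b/a.txt',
  '@@ -1,3 +1,3 @@',
  ' unchanged',
  '-old line',
  '+new line',
].join('\n');

console.log(sketchChangesOnly(diff));
// === a.txt
// -old line
// +new line
```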

pruneLog(logText, options?): LogPruneResult

Lossy log/terminal output pruning for test, build, and CI output.

| Option | Type | Default | Description |
|---|---|---|---|
| stripAnsi | boolean | true | Strip ANSI and terminal control sequences |
| foldProgress | boolean | true | Fold spinner/progress runs |
| stripTimestamps | 'auto' \| 'strip' \| 'keep' | 'auto' | Timestamp pruning mode |
| elidePassingTests | boolean | true | Remove passing tests when failures exist |
| foldDebugLines | boolean | true | Fold debug-level log lines into a summary count |
| elideHealthChecks | boolean | true | Remove /health- and /readyz-style noise |
| foldJsonLines | boolean | true | Aggregate JSON-per-line logs by severity |
| foldFrameworkStartup | boolean | true | Fold startup banner and boot boilerplate |
| stripUserAgents | boolean | true | Replace long user-agent strings with <ua> |
| dedupeStackTraces | boolean | true | Collapse repeated stack traces in retry loops |
| foldRepeatedLines | boolean | true | Fold repetitive normalized lines |
| foldGlobalRepeats | boolean | true | Fold non-consecutive repeated normalized lines |
| allowTokenExpansion | boolean | false | Keep transformed output even if token count increases |
| thresholdTokens | number | — | Optional token gate threshold metadata |
| profile | 'test' \| 'ci' \| 'lint' \| 'runtime' | — | Preset pruning strategy; can be overridden by explicit options |
| customRules | LogCustomRule[] | — | Optional strip/fold/block rules |

pruneLog() also accepts an optional tokenCounter ({ count(text): number }) for custom tokenization parity in no-regression decisions.
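For example, a minimal counter matching that shape (hypothetical; a real agent would more likely delegate to its model's actual tokenizer):

```js
// Hypothetical tokenCounter matching the documented { count(text): number }
// shape. It approximates tokens as whitespace-separated chunks, which is
// crude but enough to drive a no-regression check consistently.
const whitespaceCounter = {
  count(text) {
    const trimmed = text.trim();
    return trimmed === '' ? 0 : trimmed.split(/\s+/).length;
  },
};

console.log(whitespaceCounter.count('ERROR db timeout at 12:00')); // 5

// usage sketch (assumes compact.md is installed):
// pruneLog(logText, { profile: 'ci', tokenCounter: whitespaceCounter });
```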

createPipeline(stages): Pipeline

Assembles a custom pipeline from an ordered array of Stage objects for advanced use cases.


CLI

Install globally or run via npx:

```sh
npx compact.md <command> [options]
```

| Command | Description |
|---|---|
| compact | Compress a Markdown file to compact.md format |
| changes | Compress unified diff output for lower token usage |
| prune | Lossy prune of terminal/log output |
| expand | Expand a compact.md file back to Markdown |
| extract | Extract and compress specific sections only |
| verify | Assert lossless round-trip for a file |
| metrics | Report token savings without writing output |
| sections | List the heading sections in a document |
| locate | Search sections by keyword |

```sh
# Compress
compact.md compact input.md -o output.cmd

# Expand
compact.md expand output.cmd -o restored.md

# Verify round-trip
compact.md verify input.md

# Stats only
compact.md metrics input.md

# Pipe-friendly
cat doc.md | compact.md compact > compressed.cmd
git diff | compact.md changes --changes-only
cat test-output.log | compact.md prune --stats
cat lint.log | compact.md prune --profile lint --stats
cat server.log | compact.md prune --profile runtime

# With options
compact.md compact input.md --dedup --semantic --stats
```

MCP Server

Add to your MCP client config:

```json
{
  "mcpServers": {
    "compact-md": {
      "command": "npx",
      "args": ["compact-md-mcp"]
    }
  }
}
```

The MCP server exposes a spectrum of token-reduction strategies. Tools are grouped below by fidelity tier — from lossless to AI-summarized:

Lossless compression

| Tool | Description |
|---|---|
| compact_md_compact | Compress Markdown to compact.md format — fully reversible |
| compact_md_expand | Expand compact.md format back to standard Markdown |
| compact_md_verify | Assert that round-trip is lossless for a given input |
| compact_md_metrics | Report token savings without writing any output |
| compact_md_changes | Compress unified git diff text (one-way, lossy) |
| compact_md_prune | Lossy pruning for logs/terminal output with token gate + optional summarize fallback |

Section navigation (start here for unknown documents)

| Tool | Description |
|---|---|
| compact_md_sections | List the section TOC with per-section token counts — use this first to budget context before loading content |
| compact_md_locate | Search sections by keyword to find relevant content without reading the whole document |

Targeted extraction (verbatim content, optionally truncated)

| Tool | Description |
|---|---|
| compact_md_extract | Retrieve exact section content, with optional maxChars / maxListItems / maxTableRows truncation |

AI summarization (lossy, cached, higher token reduction)

| Tool | Description |
|---|---|
| compact_md_summarize | Abstractive LLM summary (~200 tokens by default). Supports docType: auto \| guide \| reference \| spec. Results are cached — repeated calls on unchanged files are instant. |
| compact_md_batch | Summarize multiple files in parallel in a single round-trip. Ideal for repo onboarding. |

Recommended agent workflow

```
1. compact_md_sections          → see document structure + token sizes
2a. doc is small (<500 tokens)  → read it directly
2b. need a high-level gist      → compact_md_summarize
2c. need a specific section     → compact_md_extract with onlySections
2d. need compressed full doc    → compact_md_compact
```
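One way to picture that decision in code (a hypothetical helper: the tool names are the MCP tools documented in this README, but the function itself is not part of the package):

```js
// Hypothetical dispatcher mirroring the recommended workflow. Token counts
// would come from compact_md_sections; "need" describes what the agent wants.
function chooseTool(tokenCount, need) {
  if (tokenCount < 500) return 'read-directly'; // small enough to just read
  switch (need) {
    case 'gist': return 'compact_md_summarize';    // high-level overview
    case 'section': return 'compact_md_extract';   // one section, verbatim
    case 'full': return 'compact_md_compact';      // whole doc, compressed
    default: return 'compact_md_sections';         // start by listing structure
  }
}

console.log(chooseTool(3200, 'gist')); // compact_md_summarize
```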

Compact Format Reference

Every transformation is lossless and reverses exactly on expand. Most of the token savings come from tables, list syntax, and tight block packing — not from rewriting every construct.

| Construct | Standard Markdown | compact.md output |
|---|---|---|
| Heading | ## Section | ## Section (unchanged) |
| Ordered list item | 1. First | + First |
| Nested unordered item | ··- Nested (2-space indent) | ..- Nested |
| Table header row | \| A \| B \| + \|---\|---\| separator | \|: A, B |
| Table data row | \| 1 \| 2 \| | \| 1, 2 |
| Task list (incomplete) | - [ ] Todo | [] Todo |
| Task list (complete) | - [x] Done | [x] Done |
| Code fence | ```python … ``` | ```python … ``` (unchanged) |
| Horizontal rule | --- | --- (unchanged) |
| Version marker (optional) | — | %compact.md:1 |

What changes: tables (separator row and padding eliminated), ordered list numbers (1. → +), nested list indentation (spaces → .. per level), and task list brackets (- [ ] → []). Consecutive compact blocks (headings, tables, HR) are also tightly packed with a single newline between them instead of a blank line.
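The table rule can be sketched in a few lines. This is an illustrative re-implementation of just that rule from the reference above, not the library's code:

```js
// Illustrative sketch (not the library's code) of the table rule: the header
// row becomes "|: A, B", the |---|---| separator row is dropped, and data
// rows become "| 1, 2".
function sketchCompactTable(lines) {
  const cells = (row) =>
    row.split('|').slice(1, -1).map((c) => c.trim()).join(', ');
  const out = [];
  let headerDone = false;
  for (const line of lines) {
    if (/^\|[\s\-|:]*\|$/.test(line) && line.includes('-')) continue; // drop separator row
    if (!headerDone) {
      out.push('|: ' + cells(line)); // first non-separator row is the header
      headerDone = true;
    } else {
      out.push('| ' + cells(line));
    }
  }
  return out;
}

console.log(sketchCompactTable([
  '| Name  | Role    |',
  '|-------|---------|',
  '| Alice | Lead    |',
]).join('\n'));
// |: Name, Role
// | Alice, Lead
```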

What passes through unchanged: headings, code blocks, horizontal rules, paragraphs, blockquotes, bold, italic, inline code, links, images, and frontmatter.

Note: The parser also accepts a shorthand heading syntax (:1 Title, :2 Section, …) and single-backtick code fences (`python … `) for manually authored compact input, but compact() does not produce these forms.
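As a sketch, the shorthand heading acceptance could look like the following (assumed mapping from :N to N heading markers, inferred from the examples above; this is not the library's parser):

```js
// Assumed behavior: ":1 Title" → "# Title", ":2 Section" → "## Section", etc.
// Lines that don't match the shorthand pass through unchanged.
function expandShorthandHeading(line) {
  const m = /^:([1-6]) (.*)$/.exec(line);
  return m ? '#'.repeat(Number(m[1])) + ' ' + m[2] : line;
}

console.log(expandShorthandHeading(':2 Section')); // ## Section
```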

Dedup dictionary

When dedup: true and savings exceed 5%, repeated substrings are replaced with §N tokens and a dictionary is prepended:

```
§1=repeated substring here
§2=another repeated phrase
§§
(rest of compact content)
```
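To make the substitution concrete, here is a sketch of how a reader could resolve such a dictionary. This is assumed behavior based on the format shown above, not the library's expand() implementation:

```js
// Sketch: split off the §N=value dictionary that precedes the §§ terminator,
// then substitute each §N key back into the body.
function sketchResolveDedup(text) {
  const lines = text.split('\n');
  const end = lines.indexOf('§§');
  if (end === -1) return text; // no dictionary present, pass through
  const dict = {};
  for (const entry of lines.slice(0, end)) {
    const eq = entry.indexOf('=');
    dict[entry.slice(0, eq)] = entry.slice(eq + 1);
  }
  let body = lines.slice(end + 1).join('\n');
  for (const [key, value] of Object.entries(dict)) {
    body = body.split(key).join(value); // replace every §N occurrence
  }
  return body;
}

const packed = '§1=lossless round-trip\n§§\nThe §1 guarantee: §1 always holds.';
console.log(sketchResolveDedup(packed));
// The lossless round-trip guarantee: lossless round-trip always holds.
```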

Development

```sh
bun install         # install dependencies
bun test            # run tests
bun run build       # compile ESM + CJS + type declarations
bun run lint        # biome check (lint + format)
bun run typecheck   # tsc --noEmit
```

Contributing

Read AGENTS.md before contributing — it documents the architecture invariants, the one-way dependency graph, and the rules that keep files small and the core zero-dependency.

The primary invariant is lossless round-trip: expand(compact(md)) === md for all inputs, always. When in doubt between two approaches, prefer the one that makes this guarantee easier to maintain.


License

MIT