context-compress

v2026.5.0

Published

18 days ago

Context-aware MCP server that compresses tool output for Claude Code

jiunbae

mcp claude context compression

context-compress

Stop drowning your AI agent in shell output. Compress tool output before it hits the context window — through an MCP server, a drop-in CLI, or both.

Quickstart · Compression Modes · vs RTK · How It Works · Configuration · CLI · Changelog

93% token reduction (aggressive mode)

+10.5pp over RTK (same commands)

4 modes, incl. LLM-judged: auto • aggressive • balanced • conservative

8 MCP tools + standalone CLI (RTK-compatible wrap)

Quickstart

# 1. Install
npm install -g context-compress

# 2. One-line setup — registers the MCP server, installs the hook,
#    enables transparent Bash compression
context-compress setup --auto

# 3. (optional) Pick a mode for the session
export CONTEXT_COMPRESS_MODE=balanced   # or: aggressive, conservative, auto

That's it. Restart Claude Code and shell output is now compressed before it enters context.

Prefer no MCP at all? context-compress wrap "<cmd>" compresses any shell command's output — drop-in for RTK. See CLI.

Install context-compress — an MCP server that compresses tool output for Claude Code.
Raw data stays in sandboxed subprocesses; only concise summaries enter your context window.
Saves ~99% of tokens on large outputs while keeping everything searchable via FTS5.

npm install -g context-compress
context-compress setup --auto
context-compress doctor

More info: https://github.com/Open330/context-compress

Why?

Every byte of tool output that enters your Claude Code context window reduces quality and speed. A single git log or npm test can dump 50KB+ into context — that's ~12,000 tokens gone.

context-compress intercepts these tools, processes output in a sandbox, and returns only what matters:

Before:  git log --oneline -100  →  8.2KB into context
After:   execute("git log ...")  →  0.3KB summary + full data searchable in FTS5

It works in two modes that compose freely:

MCP server — registers as a Claude Code MCP server with 8 tools (execute, search, batch_execute, fetch_and_index, index, execute_file, stats, discover). Agents call them directly when output would be large.
Standalone CLI — context-compress wrap "<cmd>" runs any shell command and pipes the output through the same compression pipeline. Drop-in for RTK and friends. The PreToolUse hook can route Bash calls through it transparently when CONTEXT_COMPRESS_FILTER_BASH=1.

Based on context-mode by Mert Koseoğlu — rewritten in TypeScript with security hardening, architectural improvements, and better DX.

Getting Started

Install

npm install -g context-compress

One-line setup

context-compress setup --auto

Writes ~/.claude/settings.json for you: registers the MCP server, installs the PreToolUse hook, enables transparent Bash compression. Idempotent — re-running with the same paths makes zero changes. Preserves any unrelated user settings.

Manual setup

claude mcp add context-compress -- node "$(which context-compress)"

Or add to your project's .mcp.json:

{
  "mcpServers": {
    "context-compress": {
      "command": "node",
      "args": ["/path/to/context-compress/dist/index.js"]
    }
  }
}

Verify

context-compress doctor

How It Works

┌─────────────────────────────────────────────────────────┐
│                     Claude Code                          │
│                                                          │
│  "Run tests"  ──→  PreToolUse Hook intercepts            │
│                          │                               │
│                          ▼                               │
│               ┌──────────────────┐                       │
│               │  context-compress │                      │
│               │   MCP Server      │                      │
│               └────────┬─────────┘                       │
│                        │                                 │
│            ┌───────────┼───────────┐                     │
│            ▼           ▼           ▼                     │
│     ┌──────────┐ ┌──────────┐ ┌──────────┐              │
│     │ Executor │ │  Store   │ │  Stats   │              │
│     │ (11 lang)│ │  (FTS5)  │ │ Tracker  │              │
│     └──────────┘ └──────────┘ └──────────┘              │
│            │           │                                 │
│            ▼           ▼                                 │
│     Raw output    Indexed &     Only summary             │
│     stays here    searchable    enters context           │
└─────────────────────────────────────────────────────────┘

8 MCP Tools

| Tool | What it does |
|:-----|:-------------|
| execute | Run code in 11 languages. Only stdout enters context. |
| execute_file | Process a file via the FILE_CONTENT variable; the file never enters context. |
| index | Chunk markdown/text into FTS5 knowledge base for search. |
| search | BM25 search with Porter stemming → trigram → fuzzy fallback. |
| fetch_and_index | Fetch URL → HTML-to-markdown → auto-index. Preview only in context. |
| batch_execute | Run N commands + search in ONE call. Replaces 30+ tool calls. |
| stats | Session + cumulative statistics: bytes saved, tokens avoided, savings ratio. |
| discover | Lists indexed sources, top searchable terms, and suggests next actions. |
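The search row describes a layered fallback: try the strictest matcher first and only widen when it finds nothing. The sketch below illustrates that control flow only. The `SearchLayer` type, `searchWithFallback`, and the in-memory `terms` list are hypothetical stand-ins; the real tool is backed by FTS5, which this example does not use.

```typescript
// Hypothetical stand-ins for the real FTS5-backed layers.
type SearchLayer = { name: string; search: (query: string) => string[] };

// Try each layer in order and stop at the first one that returns hits.
function searchWithFallback(
  query: string,
  layers: SearchLayer[],
): { layer: string; hits: string[] } {
  for (const layer of layers) {
    const hits = layer.search(query);
    if (hits.length > 0) return { layer: layer.name, hits };
  }
  return { layer: "none", hits: [] };
}

// Classic dynamic-programming edit distance, used by the fuzzy stand-in.
function levenshtein(a: string, b: string): number {
  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr.push(Math.min(prev[j]! + 1, curr[j - 1]! + 1, prev[j - 1]! + cost));
    }
    prev = curr;
  }
  return prev[b.length]!;
}

// Toy index: an exact layer first, then a fuzzy layer (edit distance <= 2).
const terms = ["compression", "sandbox", "hook"];
const layers: SearchLayer[] = [
  { name: "exact", search: (q) => terms.filter((t) => t === q) },
  { name: "fuzzy", search: (q) => terms.filter((t) => levenshtein(t, q) <= 2) },
];

console.log(searchWithFallback("hook", layers).layer);       // exact
console.log(searchWithFallback("compresion", layers).layer); // fuzzy
```

Ordering the layers from strict to loose keeps precise queries cheap and precise, while misspelled queries still return something.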

Supported Languages

javascript · typescript · python · shell · ruby · go · rust · php · perl · r · elixir

Bun auto-detected for 3-5x faster JS/TS execution.

Compression Modes

context-compress offers four compression modes that trade fidelity for compactness. Pass --mode to the CLI, set CONTEXT_COMPRESS_MODE in your environment, or let the default (balanced) just work.

# CLI flag (per-call override)
context-compress wrap --mode aggressive "git log -50"

# Env var (set once for the session)
export CONTEXT_COMPRESS_MODE=aggressive

The PreToolUse hook also forwards CONTEXT_COMPRESS_MODE automatically when wrapping Bash commands, so agents transparently get whatever mode you've configured.
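As a rough illustration of the selection order described above (per-call `--mode` flag, then `CONTEXT_COMPRESS_MODE`, then the `balanced` default), here is a minimal sketch. `resolveMode` and `isMode` are hypothetical names, not the package's API, and silently ignoring an unrecognized value is an assumption; the real CLI may reject it instead.

```typescript
// Hypothetical helper: illustrates the documented precedence, not the actual source.
type Mode = "conservative" | "balanced" | "aggressive" | "auto";

const MODES: readonly string[] = ["conservative", "balanced", "aggressive", "auto"];

function isMode(value: string | undefined): value is Mode {
  return value !== undefined && MODES.includes(value);
}

// Precedence: --mode flag, then CONTEXT_COMPRESS_MODE, then "balanced".
// Assumption: unrecognized values are skipped rather than treated as errors.
function resolveMode(
  flag: string | undefined,
  env: Record<string, string | undefined>,
): Mode {
  if (isMode(flag)) return flag;
  const fromEnv = env["CONTEXT_COMPRESS_MODE"];
  if (isMode(fromEnv)) return fromEnv;
  return "balanced";
}

console.log(resolveMode("aggressive", { CONTEXT_COMPRESS_MODE: "conservative" })); // aggressive
console.log(resolveMode(undefined, { CONTEXT_COMPRESS_MODE: "conservative" }));    // conservative
console.log(resolveMode(undefined, {}));                                           // balanced
```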

Head-to-head with RTK

Reproduce locally:

git clone https://github.com/rtk-ai/rtk /tmp/rtk && (cd /tmp/rtk && cargo build --release)
RTK_BIN=/tmp/rtk/target/release/rtk tsx scripts/benchmark-vs-rtk.ts

Result on this repository (RTK 0.39.0 vs context-compress 2026.5.0):

| Command | Raw | RTK | CC conservative | CC balanced | CC aggressive | CC auto (LLM) |
|:--|--:|--:|--:|--:|--:|--:|
| git status | 577 B | 241 B (58%) | 577 B (0%) | 375 B (35%) | 187 B (68%) | balanced (35%) |
| git log -10 (full) | 21.3 KB | 3.2 KB (85%) | 21.3 KB (0%) | 4.6 KB (79%) | 947 B (96%) | balanced (79%) |
| git log -50 (full) | 36.9 KB | 10.1 KB (73%) | 36.9 KB (0%) | 12.3 KB (67%) | 3.2 KB (91%) | balanced (67%) |
| git diff --stat | 425 B | 424 B (0%) | 425 B (0%) | 425 B (0%) | 425 B (0%) | balanced (0%) |
| ls src/ | 149 B | 229 B (-54%) | 149 B (0%) | 149 B (0%) | 149 B (0%) | conservative (0%) |
| ls -laR src/ | 3.8 KB | 229 B (94%) | 3.8 KB (0%) | 3.1 KB (19%) | 877 B (78%) | aggressive (78%) |
| find *.ts | 1.0 KB | 589 B (44%) | 1.0 KB (0%) | 183 B (83%) | 183 B (83%) | aggressive (83%) |
| npm test | 21.8 KB | 114 B (99%) | 16.7 KB (24%) | 120 B (99%) | 120 B (99%) | balanced (99%) |
| Overall (byte-weighted) | 85.9 KB | 15.0 KB (82.5%) | 80.8 KB (6.0%) | 21.2 KB (75.4%) | 6.0 KB (93.0%) | 19.0 KB (77.9%) |

Three things to take from this table:

balanced is competitive on its own. The default mode hits ~75% reduction without dropping any metadata — agents get full commit headers, file perms/dates, and complete test failure detail. Only 7pp behind RTK while making a different fidelity trade-off.
aggressive decisively wins on raw compression — 93.0%, beating RTK by 10.5pp. Pick this when you want maximum token savings and the agent will rarely re-read the dropped detail.
auto lets the model pick. Per-command LLM judgment landed at 77.9% overall — between balanced and aggressive. The interesting result is what it picked: balanced for git/test outputs (where commit bodies and failure detail matter), aggressive for ls -laR and find (where the question is "what's there?", not "show me everything"), conservative for tiny outputs where compression is pointless.
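The "Overall (byte-weighted)" row is a ratio of total bytes, not an average of the per-command percentages. The sketch below recomputes it for the aggressive column from the table's rounded figures. Treating 1 KB as 1024 bytes is an assumption, and because the inputs are rounded the result is approximate.

```typescript
// Raw and aggressive-mode sizes copied from the table, converted to bytes.
// Assumption: 1 KB = 1024 bytes; the table's values are rounded, so this is approximate.
const KB = 1024;
const rows: Array<{ command: string; raw: number; compressed: number }> = [
  { command: "git status", raw: 577, compressed: 187 },
  { command: "git log -10", raw: 21.3 * KB, compressed: 947 },
  { command: "git log -50", raw: 36.9 * KB, compressed: 3.2 * KB },
  { command: "git diff --stat", raw: 425, compressed: 425 },
  { command: "ls src/", raw: 149, compressed: 149 },
  { command: "ls -laR src/", raw: 3.8 * KB, compressed: 877 },
  { command: "find *.ts", raw: 1.0 * KB, compressed: 183 },
  { command: "npm test", raw: 21.8 * KB, compressed: 120 },
];

// Byte-weighted: large outputs dominate, which matches what the context window sees.
function byteWeightedReduction(data: Array<{ raw: number; compressed: number }>): number {
  const raw = data.reduce((sum, r) => sum + r.raw, 0);
  const compressed = data.reduce((sum, r) => sum + r.compressed, 0);
  return 1 - compressed / raw;
}

// Unweighted mean of per-command reductions, shown only for contrast.
function meanReduction(data: Array<{ raw: number; compressed: number }>): number {
  const total = data.reduce((sum, r) => sum + (1 - r.compressed / r.raw), 0);
  return total / data.length;
}

console.log(byteWeightedReduction(rows).toFixed(2)); // roughly 0.93
console.log(meanReduction(rows).toFixed(2));         // noticeably lower, roughly 0.64
```

The gap between the two numbers comes from the small outputs (`git diff --stat`, `ls src/`) that pass through at 0% but contribute almost no bytes.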

Aggressive mode covers a wider command surface than the table above hints — it also handles df (drops pseudo-filesystems), du (top-N by size), ps aux (PID/%CPU/%MEM/CMD only, drops kernel threads), npm ls (strips tree-drawing chars + deduped/extraneous markers), and grep/rg (groups by file, truncates long lines).

What balanced now does (over conservative):

ls -l* drops the total N line and the . and .. entries (universal noise) but keeps perms/dates
git log keeps headers + first 3 body lines per commit, replacing the rest with [+N lines omitted]
find / ls -R summarizes per-directory once output exceeds 20 entries
Generic dedup/progress/group runs at 5KB instead of 10KB
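To make the `git log` rule above concrete, here is a minimal sketch that keeps each commit's header plus the first three message lines and replaces the rest with an `[+N lines omitted]` marker. `truncateGitLog` and `truncateCommit` are hypothetical names, and the sketch assumes the default `git log` layout (a `commit` line, header lines, a blank line, then the message); the package's actual filter may count lines differently.

```typescript
// Hypothetical sketch of the "first 3 body lines" rule; assumes default `git log` layout.
function truncateCommit(lines: string[], keep: number): string {
  const split = lines.indexOf("");            // the first blank line ends the header
  if (split === -1) return lines.join("\n");  // header only, nothing to trim
  const header = lines.slice(0, split);
  const body = lines.slice(split + 1);
  // Drop trailing blank lines so they are not counted as omitted content.
  while (body.length > 0 && body[body.length - 1] === "") body.pop();
  const kept = body.slice(0, keep);
  const omitted = body.length - kept.length;
  if (omitted > 0) kept.push(`    [+${omitted} lines omitted]`);
  return [...header, "", ...kept].join("\n");
}

function truncateGitLog(log: string, keep = 3): string {
  const commits: string[][] = [];
  let current: string[] | undefined;
  for (const line of log.split("\n")) {
    // Message lines are indented, so an unindented "commit " starts a new entry.
    if (line.startsWith("commit ") || current === undefined) {
      current = [];
      commits.push(current);
    }
    current.push(line);
  }
  return commits.map((c) => truncateCommit(c, keep)).join("\n\n");
}

const sample = [
  "commit abc123",
  "Author: A <a@example.com>",
  "Date:   Thu Jan 1 00:00:00 2026 +0000",
  "",
  "    Subject line",
  "",
  "    Detail 1",
  "    Detail 2",
  "    Detail 3",
  "",
  "commit def456",
  "Author: B <b@example.com>",
  "Date:   Fri Jan 2 00:00:00 2026 +0000",
  "",
  "    Short fix",
].join("\n");

console.log(truncateGitLog(sample));
```

In the sample, the first commit keeps its subject, the blank separator, and `Detail 1`, followed by `[+2 lines omitted]`; the second commit is short enough to pass through unchanged.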

RTK has a single fixed compression strategy — comparable to context-compress aggressive. context-compress lets the agent choose: reach for aggressive when the question is "what changed", balanced when the question is "explain why".

Token Reduction

context-compress achieves 99.2% token reduction across a typical 12-operation coding session.

| Operation | Before | After | Reduction |
|:--|--:|--:|--:|
| Read bundled file (776KB) | 194,076 tok | 105 tok | 99.9% |
| Playwright snapshot (56KB) | 14,000 tok | 75 tok | 99.5% |
| Read CSV/JSON data (100KB) | 25,000 tok | 125 tok | 99.5% |
| Read source file (21KB) | 5,250 tok | 88 tok | 98.3% |
| npm install log (15KB) | 3,750 tok | 50 tok | 98.7% |
| curl API response (12KB) | 3,000 tok | 88 tok | 97.1% |
| npm test (42 tests) | 935 tok | 45 tok | 95.2% |
| batch_execute (5 cmds) | 6,250 tok | 375 tok | 94.0% |
| fetch_and_index (45KB page) | 11,250 tok | 750 tok | 93.3% |
| grep (small output) | 361 tok | 361 tok | 0% |
| Session Total | 267,121 tok | 2,223 tok | 99.2% |

Without context-compress, 12 operations consume 133% of the 200K context window — overflowing it entirely. With context-compress, the same operations use 1.1%, leaving 98.9% free for actual conversation.
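The percentages above follow directly from the session totals and the 200K window. The sketch below reproduces that arithmetic. The 4-bytes-per-token estimate is an inference from the table (for example, 56KB maps to 14,000 tokens), not a documented constant, and all function names are illustrative.

```typescript
const CONTEXT_WINDOW = 200000; // tokens, as stated in the text

// Share of the context window consumed by a given token count.
function windowShare(tokens: number, window: number = CONTEXT_WINDOW): number {
  return tokens / window;
}

// Fractional reduction from `before` to `after`.
function reduction(before: number, after: number): number {
  return 1 - after / before;
}

// Assumption inferred from the table: roughly 4 bytes per token.
function estimateTokens(bytes: number): number {
  return Math.round(bytes / 4);
}

const before = 267121; // session total without context-compress
const after = 2223;    // session total with context-compress

console.log((windowShare(before) * 100).toFixed(1));       // 133.6
console.log((windowShare(after) * 100).toFixed(1));        // 1.1
console.log((reduction(before, after) * 100).toFixed(1));  // 99.2
console.log(estimateTokens(56000));                        // 14000
```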

Data isn't deleted — it's indexed in FTS5 and searchable on demand. Small outputs (<5KB) pass through uncompressed.

Read the full Token Reduction Report — includes cost analysis, architecture deep-dive, and FAQ on context loss trade-offs.

What Changed from context-mode

| | context-mode | context-compress |
|:--|:------------|:-----------------|
| Credentials | 20+ auth env vars passed by default | Opt-in only (passthroughEnvVars: []) |
| Hook writes | Self-modifies settings.json | Zero filesystem writes |
| Rust compile | Shell string → injection risk | execFileSync with array args |
| Upgrade | git clone arbitrary code | Removed entirely |
| FTS5 indexing | Always dual-table (Porter + trigram) | Lazy trigram (50% fewer writes) |
| Runtime detect | Sequential execSync ~250ms | Parallel Promise.all ~40ms |
| batch_execute | Sequential commands | Promise.allSettled parallel |
| Config | None | ENV + file + defaults |
| Errors | 23 silent catch blocks | CONTEXT_COMPRESS_DEBUG=1 logs all |
| Uninstall | None | context-compress uninstall |
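The batch_execute row mentions `Promise.allSettled`. As a rough sketch of why that matters: unlike `Promise.all`, it lets every command finish and reports failures per command instead of rejecting the whole batch. `runBatch`, `CommandResult`, and the `run` callback are hypothetical; the real handler's shape is not shown in this document.

```typescript
type CommandResult = { command: string; ok: boolean; output: string };

// `run` is a hypothetical executor supplied by the caller.
async function runBatch(
  commands: string[],
  run: (command: string) => Promise<string>,
): Promise<CommandResult[]> {
  // Start every command at once; allSettled waits for all of them,
  // even if some reject.
  const settled = await Promise.allSettled(commands.map((c) => run(c)));
  return settled.map((result, i) => {
    const command = commands[i] ?? "";
    return result.status === "fulfilled"
      ? { command, ok: true, output: result.value }
      : { command, ok: false, output: String(result.reason) };
  });
}

// Fake executor for illustration: one command succeeds, one fails.
const fakeRun = async (command: string): Promise<string> => {
  if (command === "fail") throw new Error("boom");
  return `out:${command}`;
};

runBatch(["ok", "fail"], fakeRun).then((results) => {
  for (const r of results) console.log(r.command, r.ok, r.output);
});
```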

Configuration

Loaded in order: ENV vars → .context-compress.json → defaults

Environment Variables

# Enable debug logging (stderr)
CONTEXT_COMPRESS_DEBUG=1

# Pass specific env vars to subprocesses (default: none)
CONTEXT_COMPRESS_PASSTHROUGH_ENV=GH_TOKEN,AWS_PROFILE

# Disable curl/wget blocking
CONTEXT_COMPRESS_BLOCK_CURL=0

# Disable WebFetch blocking
CONTEXT_COMPRESS_BLOCK_WEBFETCH=0

# Disable Read/Grep nudges
CONTEXT_COMPRESS_NUDGE_READ=0
CONTEXT_COMPRESS_NUDGE_GREP=0

# Compression mode: conservative | balanced (default) | aggressive | auto
CONTEXT_COMPRESS_MODE=balanced

# Auto mode prefers the Anthropic API when this is set (faster than `claude -p` fallback)
ANTHROPIC_API_KEY=sk-ant-...

# RTK-style transparent Bash wrapping (default: off)
CONTEXT_COMPRESS_FILTER_BASH=1

# Override path to the context-compress binary used by the hook
CONTEXT_COMPRESS_BIN=/usr/local/bin/context-compress

Config File

Create .context-compress.json in your project root or home directory:

{
  "passthroughEnvVars": ["GH_TOKEN", "AWS_PROFILE", "KUBECONFIG"],
  "blockCurl": true,
  "blockWebFetch": true,
  "debug": false
}
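One plausible reading of "ENV vars → .context-compress.json → defaults" is that environment variables override the file, which overrides built-in defaults. The sketch below implements that reading. The precedence direction, the default values for `blockCurl`, `blockWebFetch`, and `debug`, and the names `resolveConfig` and `fromEnv` are all assumptions; only the keys and variable names come from this document.

```typescript
interface Config {
  passthroughEnvVars: string[];
  blockCurl: boolean;
  blockWebFetch: boolean;
  debug: boolean;
}

type Env = Record<string, string | undefined>;

// Assumed defaults: blocking on, debug off, no env passthrough.
const DEFAULTS: Config = {
  passthroughEnvVars: [],
  blockCurl: true,
  blockWebFetch: true,
  debug: false,
};

// Translate only the variables that are actually set, so unset ones
// do not mask values from the file or the defaults.
function fromEnv(env: Env): Partial<Config> {
  const out: Partial<Config> = {};
  const list = env["CONTEXT_COMPRESS_PASSTHROUGH_ENV"];
  if (list !== undefined) {
    out.passthroughEnvVars = list.split(",").map((s) => s.trim()).filter((s) => s.length > 0);
  }
  const curl = env["CONTEXT_COMPRESS_BLOCK_CURL"];
  if (curl !== undefined) out.blockCurl = curl !== "0";
  const webFetch = env["CONTEXT_COMPRESS_BLOCK_WEBFETCH"];
  if (webFetch !== undefined) out.blockWebFetch = webFetch !== "0";
  const debug = env["CONTEXT_COMPRESS_DEBUG"];
  if (debug !== undefined) out.debug = debug === "1";
  return out;
}

// Later spreads win: defaults, then file, then environment.
function resolveConfig(env: Env, file: Partial<Config>): Config {
  return { ...DEFAULTS, ...file, ...fromEnv(env) };
}

const resolved = resolveConfig(
  { CONTEXT_COMPRESS_BLOCK_CURL: "0", CONTEXT_COMPRESS_PASSTHROUGH_ENV: "GH_TOKEN, AWS_PROFILE" },
  { blockCurl: true, debug: true },
);
console.log(resolved);
```

Here the environment turns `blockCurl` off even though the file sets it to true, `debug` comes from the file, and `blockWebFetch` falls back to the assumed default.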

CLI

context-compress                            # Start MCP server (stdio)
context-compress setup                      # Detect runtimes, show install instructions
context-compress setup --auto               # One-line: write ~/.claude/settings.json
context-compress init --auto                # Alias for setup --auto
context-compress doctor                     # Diagnose: runtimes, hooks, FTS5, version
context-compress uninstall                  # Clean removal: hooks, MCP reg, stale DBs

# RTK-style transparent compression — use anywhere, agent doesn't need MCP
context-compress wrap "npm test"                       # default = balanced
context-compress wrap --mode aggressive "git log -50"  # max compression
context-compress wrap --stream "tail -f /var/log/app.log"  # line-by-line for long-running cmds
context-compress filter --cmd "git push" < captured.log    # pipe filter

Bash auto-wrap (transparent mode)

Set CONTEXT_COMPRESS_FILTER_BASH=1 and the PreToolUse hook will route output-heavy Bash calls through context-compress wrap automatically — the agent doesn't need to call execute() to benefit. Combine with CONTEXT_COMPRESS_MODE=aggressive for maximum compression.
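A simplified picture of the rewrite step, under stated assumptions: when the flag is set, the hook replaces the Bash command with an equivalent `context-compress wrap` invocation. `rewriteBashCommand` and `shellQuote` are hypothetical names. This sketch wraps every command, whereas the text says only output-heavy calls are routed and does not say how they are detected; forwarding the mode as a `--mode` flag is likewise an assumption.

```typescript
type Env = Record<string, string | undefined>;

// POSIX single-quote escaping: leave simple words alone, otherwise wrap in
// single quotes and rewrite each embedded ' as '\''.
function shellQuote(arg: string): string {
  if (/^[A-Za-z0-9_\/.:=@%+-]+$/.test(arg)) return arg;
  return `'${arg.replace(/'/g, `'\\''`)}'`;
}

// Hypothetical rewrite: not the hook's actual logic.
function rewriteBashCommand(command: string, env: Env): string {
  if (env["CONTEXT_COMPRESS_FILTER_BASH"] !== "1") return command; // feature is off by default
  const bin = env["CONTEXT_COMPRESS_BIN"] ?? "context-compress";
  const mode = env["CONTEXT_COMPRESS_MODE"];
  const modeFlag = mode ? ` --mode ${shellQuote(mode)}` : "";
  return `${shellQuote(bin)} wrap${modeFlag} ${shellQuote(command)}`;
}

console.log(
  rewriteBashCommand("git log -50", {
    CONTEXT_COMPRESS_FILTER_BASH: "1",
    CONTEXT_COMPRESS_MODE: "aggressive",
  }),
); // context-compress wrap --mode aggressive 'git log -50'
```

Quoting the original command as a single argument matters here: without it, pipes or redirects inside the command would apply to `wrap` itself rather than to the wrapped command.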

Doctor Output Example

  context-compress doctor

  [PASS] Performance: FAST — Bun detected
  [PASS] Language coverage: 7/11 (64%)
  [PASS] Server test: OK
  [PASS] PreToolUse hook configured
  [PASS] Hook integrity: SHA-256 verified (a3f1c8d2e4...)
  [PASS] FTS5 / better-sqlite3 works

  Version: v2026.5.0
  All checks passed.

Project Structure

context-compress/
├── src/
│   ├── index.ts              # MCP server entry
│   ├── server.ts             # Wires deps, registers tools (132 lines, was 845)
│   ├── executor.ts           # SubprocessExecutor + ANSI/dedup pipeline
│   ├── filters.ts            # Command-aware filters (git, npm, ls, find, ps, ...)
│   ├── store.ts              # ContentStore (FTS5 + BM25 + Porter + trigram + Levenshtein)
│   ├── network.ts            # SSRF / DNS rebinding protection
│   ├── stats.ts              # Session + cumulative stats tracker
│   ├── config.ts             # Config: ENV → file → defaults
│   ├── snippet.ts            # FTS5 snippet extraction
│   ├── logger.ts             # Debug logger
│   ├── types.ts              # Shared types
│   ├── utils.ts              # detectInjectionPatterns, limitConcurrency, formatBytes
│   ├── runtime/
│   │   ├── index.ts          # Parallel runtime detection + registry
│   │   ├── plugin.ts         # LanguagePlugin interface
│   │   └── languages/        # 11 language plugins (js, ts, py, sh, rb, go, rs, php, pl, r, ex)
│   ├── tools/                # MCP tool handlers (one file per tool)
│   │   ├── context.ts        # Shared ToolContext interface
│   │   ├── execute.ts
│   │   ├── execute-file.ts
│   │   ├── index-content.ts
│   │   ├── search.ts
│   │   ├── fetch-and-index.ts
│   │   ├── batch-execute.ts
│   │   ├── stats.ts
│   │   └── discover.ts
│   ├── util/                 # Pure utilities (extracted from server.ts for testability)
│   │   ├── path.ts           # isWithinProject (path-traversal safe)
│   │   ├── fetch-code.ts     # buildFetchCode (HTML→md sandbox script)
│   │   ├── intent-filter.ts  # createIntentFilter factory
│   │   ├── label.ts          # compactLabel (compression levels)
│   │   ├── version.ts        # getVersion (deduped across CLI commands)
│   │   ├── stream-compress.ts# Line-by-line StreamCompressor for `wrap --stream`
│   │   └── auto-mode.ts      # LLM-driven mode selection (Anthropic API + claude CLI)
│   ├── hooks/
│   │   └── pretooluse.ts     # PreToolUse hook (curl/Bash/Read/Grep/WebFetch/Task)
│   └── cli/
│       ├── index.ts          # CLI dispatcher
│       ├── lite.ts           # Single-binary entry (filter+wrap only, no MCP)
│       ├── filter.ts         # `filter` (stdin) + `wrap` (spawn) commands
│       ├── setup.ts          # `setup` / `init` — interactive + --auto
│       ├── doctor.ts         # `doctor` — diagnostics
│       └── uninstall.ts      # `uninstall` — clean removal
├── tests/
│   ├── unit/                 # 18 unit test files (213 tests, all passing)
│   └── integration/          # 3 integration test files
├── scripts/
│   ├── benchmark.ts          # Synthetic compression benchmark
│   ├── benchmark-real.ts     # Real-command benchmark on this repo
│   └── benchmark-vs-rtk.ts   # Head-to-head vs RTK with --auto support
├── hooks/                    # Pre-built hook bundle (shipped in npm package)
├── skills/                   # Slash command definitions
├── docs/                     # Token reduction report + architecture docs
└── dist/                     # Compiled output (build artifact)

server.ts is now thin (132 lines) — it constructs deps, builds a ToolContext, registers the 8 tool modules, and wires shutdown. All tool handlers live under src/tools/, all reusable helpers under src/util/.

Security

| Threat | Mitigation |
|:-------|:-----------|
| Credential leakage | passthroughEnvVars defaults to []: zero env vars passed to subprocesses unless opted in |
| Shell injection | execFileSync with array arguments throughout; no string interpolation into shells |
| SSRF / private-IP fetch | fetch_and_index blocks RFC1918, link-local, loopback, IPv4-mapped IPv6 (incl. hex form ::ffff:HHHH:HHHH), CGNAT |
| DNS rebinding (TOCTOU) | resolveAndValidate + URL pinning to the resolved IP with original Host header preserved |
| Path traversal | isWithinProject uses realpathSync to defeat symlink escapes; falls back to string-prefix for not-yet-existing paths |
| Hook self-modification | Hooks are read-only: no fs.writeFileSync in src/hooks/. Hook integrity SHA-256 verified by doctor |
| Arbitrary code execution | No upgrade command: no git clone or npm install at runtime. Setup writes only to ~/.claude/settings.json |
| Silent failures | CONTEXT_COMPRESS_DEBUG=1 surfaces all catch-block errors to stderr |
| Subprocess sandboxing | OS-level sandboxing not enforced (by design for the MCP trust model). See SECURITY.md for the full trust model. |
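To illustrate the SSRF row, here is a minimal sketch of an IPv4 blocklist check covering the ranges named there (RFC 1918, loopback, link-local, CGNAT). It is not the package's `network.ts`: `isBlockedIPv4`, `parseIPv4`, and `inCidr` are hypothetical names, IPv6 and IPv4-mapped IPv6 are out of scope, and failing closed on unparseable input is a design choice made for this example.

```typescript
// Parse dotted-quad IPv4 into a 32-bit number, or null if malformed.
function parseIPv4(ip: string): number | null {
  const parts = ip.split(".");
  if (parts.length !== 4) return null;
  let value = 0;
  for (const part of parts) {
    if (!/^\d{1,3}$/.test(part)) return null;
    const octet = Number(part);
    if (octet > 255) return null;
    value = value * 256 + octet;
  }
  return value;
}

// [network base, prefix length]
const BLOCKED_RANGES: Array<[string, number]> = [
  ["10.0.0.0", 8],     // RFC 1918
  ["172.16.0.0", 12],  // RFC 1918
  ["192.168.0.0", 16], // RFC 1918
  ["127.0.0.0", 8],    // loopback
  ["169.254.0.0", 16], // link-local
  ["100.64.0.0", 10],  // CGNAT (RFC 6598)
];

// Two addresses share a network if they agree after dropping the host bits.
// Plain arithmetic avoids the sign issues of 32-bit bitwise operators.
function inCidr(addr: number, base: number, prefix: number): boolean {
  const size = 2 ** (32 - prefix);
  return Math.floor(addr / size) === Math.floor(base / size);
}

function isBlockedIPv4(ip: string): boolean {
  const addr = parseIPv4(ip);
  if (addr === null) return true; // fail closed on anything we cannot parse
  return BLOCKED_RANGES.some(([base, prefix]) => {
    const baseAddr = parseIPv4(base);
    return baseAddr !== null && inCidr(addr, baseAddr, prefix);
  });
}

console.log(isBlockedIPv4("169.254.169.254")); // true
console.log(isBlockedIPv4("8.8.8.8"));         // false
```

A check like this only helps if it runs on the address actually connected to, which is why the table pairs it with resolving first and pinning the request to the resolved IP.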

Contributing

git clone https://github.com/Open330/context-compress
cd context-compress
npm install

npm run typecheck        # Strict TS
npm run lint             # Biome
npm test                 # All tests (unit + integration)
npm run test:unit        # Unit tests only

npm run build            # Compile + bundle MCP server + CLI
npm run build:hooks      # Bundle the PreToolUse hook (with SHA-256)
npm run build:bin        # Cross-compile single binaries via Bun (4 targets)

Reproducing the benchmarks

# Synthetic — fast, reproducible, includes RTK-style commands
tsx scripts/benchmark.ts

# Real-world — runs actual commands in your repo
tsx scripts/benchmark-real.ts            # full
tsx scripts/benchmark-real.ts --quick    # skip npm test

# Head-to-head with RTK (build it first)
git clone https://github.com/rtk-ai/rtk /tmp/rtk
(cd /tmp/rtk && cargo build --release)

RTK_BIN=/tmp/rtk/target/release/rtk tsx scripts/benchmark-vs-rtk.ts
RTK_BIN=... tsx scripts/benchmark-vs-rtk.ts --auto    # also run LLM-judged auto mode
RTK_BIN=... tsx scripts/benchmark-vs-rtk.ts --json    # machine-readable

License

MIT — Based on context-mode by Mert Koseoğlu.

Inspired by RTK for the command-aware filtering tactic. context-compress builds on the same idea with multi-mode trade-offs, an LLM-judged auto mode, MCP integration, sandbox execution, and a searchable knowledge base.
