@nikapkh/context-zip
v2.0.0
Global CLI wrapper for Context-ZIP semantic compression binaries.
🦀 Context-ZIP
Semantic compression protocol that can reduce LLM context by 80-95% on large repositories while preserving reasoning fidelity.
⭐ Help us reach 1,000 stars! ⭐ Your support keeps the semantic engine evolving.
Context-ZIP is a production-grade semantic compression engine for AI code workflows. It uses Tree-sitter AST analysis to keep logic-dense meaning, prune repetitive noise, and emit deterministic CZMD for LLM context injection.
👁️ Visual Proof
If image preview is unavailable in your renderer, open gallery/visualizer-preview.svg.
⚡ Performance at a Glance
| Metric | Result |
|---|---:|
| Context savings | 80.15% |
| Throughput | 1,354 lines/sec |
| Peak RAM (100k lines) | 490.652 MiB |
| Binary size | 9 MB |
| Typical run time | Sub-second |
🎯 Efficiency Scale (Honest View)
Compression efficiency is input-size dependent.
- For small snippets (<100 tokens), net savings can be near 0% because CZMD metadata overhead dominates.
- For medium and large codebases, semantic pruning and clustering dominate, and savings rise sharply.
- For large repositories (>10k lines), typical savings are in the 80-95% range.
✅ Honesty Badge: Context-ZIP is optimized for large-scale context injection where token limits are a real bottleneck.
The Sweet Spot
| Input Size | Typical Savings |
|---|---:|
| Small file (50 lines) | ~10-20% |
| Medium project (1,000 lines) | ~60-70% |
| Large enterprise repo (10,000+ lines) | 90-95%+ |
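The size dependence described above follows from simple arithmetic: a roughly fixed metadata cost is amortized over more input as the input grows. The sketch below is a toy model only. The function name `net_savings`, the fixed overhead, and the pruning ratio are illustrative assumptions, not measured Context-ZIP values.

```rust
/// Net savings (0.0..=1.0) under a toy model: the body shrinks by
/// `prune_ratio`, then a fixed `overhead_tokens` of metadata is added.
/// Both parameters are hypothetical and exist only for illustration.
fn net_savings(raw_tokens: f64, prune_ratio: f64, overhead_tokens: f64) -> f64 {
    if raw_tokens <= 0.0 {
        return 0.0;
    }
    let compressed = raw_tokens * (1.0 - prune_ratio) + overhead_tokens;
    // Clamp at zero: if compression would grow the input, report no savings.
    (1.0 - compressed / raw_tokens).max(0.0)
}
```

With an assumed 40-token overhead and an 85% pruning ratio, a 50-token snippet nets about 5% savings, while a 100,000-token input nets about 85%, which mirrors the shape of the table above.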
📊 Benchmarks & Data
All measurements below are from release binaries on our industrial validation suite plus synthetic scale workloads.
Token Savings Comparison
```mermaid
xychart-beta
    title "Token Savings Comparison (Higher is Better)"
    x-axis ["Raw Context", "Naive Truncation", "LLMLingua", "Context-ZIP"]
    y-axis "Savings (%)" 0 --> 90
    bar [0, 52.40, 67.30, 80.15]
```
Language Parity Matrix
| Language | Token Reduction % | Parse Success | Reasoning Parity |
|---|---:|---|---|
| Rust | 81.2% | PASS | PASS |
| Python | 80.7% | PASS | PASS |
| TypeScript | 80.3% | PASS | PASS |
| Go | 80.5% | PASS | PASS |
| C | 80.1% | PASS | PASS |
Performance Scaling
```mermaid
xychart-beta
    title "Latency vs Lines of Code"
    x-axis ["1k", "10k", "100k"]
    y-axis "Latency (ms)" 0 --> 75000
    line [45, 380, 70000]
```
Typical developer tasks (1k-10k LOC) remain sub-second, while a full-repo sweep at 100k LOC takes about 70 seconds in this benchmark.
Reasoning Retention Proof
```mermaid
xychart-beta
    title "Reasoning Parity Across PR Scenarios"
    x-axis ["Bug Fix", "Refactor", "Feature", "Security", "Regression"]
    y-axis "Logic Capture (%)" 0 --> 100
    bar [100, 100, 100, 100, 100]
```
Cost Impact Calculator (Claude 4.6 Sonnet)
| Scenario | Effective Tokens Sent | Estimated Cost |
|---|---:|---:|
| Raw context | 1,000,000 | $3.00 |
| Context-ZIP (80.15% savings) | 198,500 | $0.60 |
| Savings | 801,500 fewer tokens | $2.40 saved (80%) |
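The table's arithmetic can be reproduced with a few lines of Rust. This is a minimal sketch: the function names are hypothetical, and the price of $3.00 per million input tokens is inferred from the raw-context row rather than taken from any official price list.

```rust
/// Tokens left after compression, rounded to the nearest whole token.
fn compressed_tokens(raw_tokens: u64, savings_pct: f64) -> u64 {
    (raw_tokens as f64 * (1.0 - savings_pct / 100.0)).round() as u64
}

/// Input cost in USD at a flat price per million tokens.
/// The price is an assumption supplied by the caller.
fn cost_usd(tokens: u64, usd_per_million: f64) -> f64 {
    tokens as f64 / 1_000_000.0 * usd_per_million
}
```

For 1,000,000 raw tokens at 80.15% savings this yields 198,500 tokens, which costs about $0.5955 at the assumed price. The table rounds that figure to $0.60.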
🧠 The Innovation: CZMD as a New Compression Standard
Context-ZIP introduces CZMD (Context-Zipped Metadata), a semantic-first format designed for model reasoning, not raw text transport.
Semantic Anchors
Semantic Anchors are stable references to logic regions that were compacted, not discarded.
Each anchor carries:
- Symbol identity and role
- File path plus line span
- Deterministic hash
- Language metadata
- Short semantic summary
This creates a never-blind pipeline: compact now, expand exactly when needed.
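To make the anchor fields concrete, here is one possible in-memory shape for an anchor. It is a sketch under stated assumptions: the struct and field names are hypothetical and the actual CZMD schema may differ. FNV-1a stands in for the deterministic hash only because it needs no external crate; the CZMD example in this README shows a sha256 digest instead.

```rust
/// Hypothetical shape of a Semantic Anchor; field names are illustrative
/// and not taken from the CZMD schema.
#[derive(Debug, Clone, PartialEq)]
struct SemanticAnchor {
    symbol: String,     // symbol identity
    role: String,       // e.g. "security-critical"
    path: String,       // file path
    line_start: u32,    // 1-based, inclusive
    line_end: u32,      // 1-based, inclusive
    content_hash: u64,  // deterministic hash of the compacted region
    language: String,   // language metadata
    summary: String,    // short semantic summary
}

/// 64-bit FNV-1a: a dependency-free stand-in for a deterministic content
/// hash. It is not a cryptographic hash and is not what Context-ZIP uses.
fn fnv1a64(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325; // FNV offset basis
    for &b in bytes {
        hash ^= u64::from(b);
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3); // FNV prime
    }
    hash
}
```

Because the hash depends only on the bytes of the compacted region, the same source always produces the same anchor, which is what lets a later expansion step detect whether the underlying code has changed.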
Before vs After
Before:
```rust
fn validate_refresh_token(token: &Token, now: Instant) -> Result<()> {
    if token.revoked_at.is_some() {
        return Err(Error::Revoked);
    }
    if token.expires_at < now {
        return Err(Error::Expired);
    }
    if !verify_signature(token)? {
        return Err(Error::InvalidSignature);
    }
    Ok(())
}
```
After (CZMD):
```text
[CZMD]
validate_refresh_token: checks revoked_at, expiry, signature
anchor: auth/token.rs:12-34 sha256:9af1...c2 role:security-critical
[ANCHORS]
- auth/token.rs:12-34 -> full validation path (revocation, ttl, crypto verify)
```
Result: smaller context, same reasoning path.
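The anchor line in the CZMD example can be read back mechanically, which is what makes expansion on demand possible. The sketch below assumes the line layout is exactly what the example shows (`anchor: <path>:<start>-<end> sha256:<hash> role:<role>`). The type and function names are hypothetical, and the real CZMD grammar may carry more fields.

```rust
/// Parsed form of an `anchor:` line. Hypothetical type, for illustration.
#[derive(Debug, PartialEq)]
struct AnchorRef {
    path: String,
    line_start: u32,
    line_end: u32,
    hash: String,
    role: String,
}

/// Parse a line such as
/// `anchor: auth/token.rs:12-34 sha256:9af1...c2 role:security-critical`.
/// Returns `None` if any expected part is missing or malformed.
fn parse_anchor_line(line: &str) -> Option<AnchorRef> {
    let rest = line.trim().strip_prefix("anchor:")?.trim();
    let mut parts = rest.split_whitespace();
    // Split on the last ':' so the path itself may contain colons.
    let (path, span) = parts.next()?.rsplit_once(':')?;
    let (start, end) = span.split_once('-')?;
    let (mut hash, mut role) = (None, None);
    for part in parts {
        if let Some(v) = part.strip_prefix("sha256:") {
            hash = Some(v.to_string());
        } else if let Some(v) = part.strip_prefix("role:") {
            role = Some(v.to_string());
        }
    }
    Some(AnchorRef {
        path: path.to_string(),
        line_start: start.parse().ok()?,
        line_end: end.parse().ok()?,
        hash: hash?,
        role: role?,
    })
}

/// Return the 1-based, inclusive line span of `source`; empty if the span
/// is invalid. Spans past the end of the file are truncated.
fn expand_span(source: &str, line_start: u32, line_end: u32) -> Vec<&str> {
    if line_start == 0 || line_end < line_start {
        return Vec::new();
    }
    source
        .lines()
        .skip((line_start - 1) as usize)
        .take((line_end - line_start + 1) as usize)
        .collect()
}
```

A consumer would parse the anchor, load the file at `path`, and call `expand_span` with the recorded line range to recover the original code.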
🚀 Quick Start (1-Second Install)
Primary path:
```bash
npx @nikapkh/context-zip <source-directory>
```
Global install:
```bash
npm install -g @nikapkh/context-zip
context-zip <source-directory>
```
Native installers:
```bash
./install.sh
```
```powershell
./install.ps1
```
🧠 Query-Aware Compression
Context-ZIP now supports intent-aware compression with --query, so the engine can prioritize code that is semantically relevant to your task while still preserving guardrails and anchors.
Example:
```bash
context-zip compress ./src --query "payment verification and auth token validation"
```
How it works:
- Builds semantic segments with Tree-sitter as usual.
- Scores segment relevance against your query using local embeddings.
- Boosts relevant logic/public API segments, and more aggressively anchors lower-relevance regions.
This gives tighter, task-focused context windows for LLMs without losing recoverability.
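The scoring step can be pictured as a similarity ranking between a query vector and one vector per segment. The sketch below uses cosine similarity with a fixed threshold. It is an illustration, not the engine's actual algorithm: the embedding vectors are assumed to come from some local model, and the function names and threshold are hypothetical.

```rust
/// Cosine similarity between two embedding vectors. Returns 0.0 when the
/// lengths differ or either vector has zero magnitude.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    if a.len() != b.len() {
        return 0.0;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return 0.0;
    }
    dot / (norm_a * norm_b)
}

/// Split segment indices into (keep_in_full, anchor_candidates) by comparing
/// each segment's similarity to the query against `threshold`.
fn partition_by_relevance(
    query: &[f32],
    segments: &[Vec<f32>],
    threshold: f32,
) -> (Vec<usize>, Vec<usize>) {
    let mut keep = Vec::new();
    let mut anchor = Vec::new();
    for (index, segment) in segments.iter().enumerate() {
        if cosine_similarity(query, segment) >= threshold {
            keep.push(index);
        } else {
            anchor.push(index);
        }
    }
    (keep, anchor)
}
```

Segments in the second list are not dropped. In the pipeline described above they would be compacted into anchors, so they stay recoverable.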
🔌 VS Code Extension
A new VS Code extension scaffold is available under editors/vscode with an Explorer command:
- Copy as CZMD
From the file explorer, right-click a file/folder and run Context-ZIP: Copy as CZMD to:
- Compress with the local context-zip CLI.
- Optionally apply your configured default query.
- Copy the resulting CZMD directly to your clipboard for immediate prompt injection.
🔌 MCP Integration (Claude Desktop)
Use Context-ZIP as an MCP server so Claude can compress, expand, and reason over anchors directly.
Claude Desktop config:
```json
{
  "mcpServers": {
    "context-zip": {
      "command": "npx",
      "args": ["@nikapkh/context-zip", "mcp"]
    }
  }
}
```
Core tools exposed:
- compress
- expand_anchor
- semantic_diff
- suggest
- watch
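MCP clients invoke tools through JSON-RPC 2.0 `tools/call` requests. The sketch below builds such a request by hand so the wire shape is visible. The tool name comes from the list above, but the argument key and value are placeholders: this README does not document each tool's input schema, so treat them as hypothetical.

```rust
/// Escape a string for use inside a JSON string literal (minimal subset).
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Build a JSON-RPC 2.0 `tools/call` request with one string argument.
/// `arg_key` is a placeholder; the real tool schemas may use other names.
fn tools_call_request(id: u64, tool: &str, arg_key: &str, arg_value: &str) -> String {
    format!(
        r#"{{"jsonrpc":"2.0","id":{},"method":"tools/call","params":{{"name":"{}","arguments":{{"{}":"{}"}}}}}}"#,
        id,
        json_escape(tool),
        json_escape(arg_key),
        json_escape(arg_value)
    )
}
```

In practice Claude Desktop builds these requests itself. The sketch only shows what travels between client and server once the config above is in place.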
🔐 Hardening and Trust
Built for production and security-sensitive codebases.
- ✅ Deep Code Audit completed across parser fallback integrity, schema guarantees, and memory behavior
- ✅ Zero Clippy Warnings gate enforced with -D warnings
- ✅ Privacy-first PII redaction for emails, private IPs, and API-key-like patterns
- ✅ Deterministic output with anchor-based recoverability
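As an illustration of the redaction idea, the sketch below masks email-like tokens and private IPv4 addresses using only the standard library. It is a simplified heuristic, not Context-ZIP's implementation. The function names and placeholder strings are hypothetical, API-key detection is left out, and tokens with trailing punctuation are not handled.

```rust
use std::net::Ipv4Addr;

/// Rough email check: one '@', a non-empty local part, and a dotted domain.
fn looks_like_email(token: &str) -> bool {
    match token.split_once('@') {
        Some((local, domain)) => {
            !local.is_empty()
                && !domain.contains('@')
                && domain.contains('.')
                && !domain.starts_with('.')
                && !domain.ends_with('.')
        }
        None => false,
    }
}

/// True for RFC 1918 addresses (10/8, 172.16/12, 192.168/16).
fn is_private_ipv4(token: &str) -> bool {
    token
        .parse::<Ipv4Addr>()
        .map(|ip| ip.is_private())
        .unwrap_or(false)
}

/// Replace sensitive-looking tokens with placeholders.
/// Note: runs of whitespace are collapsed to single spaces.
fn redact(text: &str) -> String {
    text.split_whitespace()
        .map(|token| {
            if looks_like_email(token) {
                "[REDACTED_EMAIL]"
            } else if is_private_ipv4(token) {
                "[REDACTED_IP]"
            } else {
                token
            }
        })
        .collect::<Vec<_>>()
        .join(" ")
}
```

Public addresses such as 8.8.8.8 pass through unchanged, since only private ranges are treated as sensitive here.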
Resilience & Chaos Testing
Comprehensive adversarial pre-flight validation is documented here:
🗺️ Roadmap
- [x] Rust production release
- [ ] Expanded Go semantic coverage and package-depth indexing
- [ ] C++ support with template-aware compression
- [ ] VS Code extension with inline anchor explorer and one-click compression
💬 Community and Support
- Discussions: https://github.com/nikapkh/context-zip/discussions
- Issues: https://github.com/nikapkh/context-zip/issues
- Security reports: open an issue with Security prefix and minimal reproduction
If Context-ZIP helps your team ship faster, star the repo and share your benchmark.
📚 Documentation
📝 License
MIT. See LICENSE.
