context-mem
v0.4.0
Context optimization for AI coding assistants — 90%+ token savings via summarization, search, and progressive disclosure
context-mem
Context optimization for AI coding assistants — 99% token savings, zero configuration, no LLM dependency.
AI coding assistants waste 60–80% of their context window on raw tool outputs — full npm logs, verbose test results, uncompressed JSON. This means shorter sessions, lost context, and repeated work.
context-mem captures tool outputs via hooks, compresses them using 14 content-aware summarizers, stores everything in local SQLite with full-text search, and serves compressed context back through the MCP protocol. No LLM calls, no cloud, no cost.
How It Compares
| | context-mem | claude-mem | context-mode | Context7 |
|---|---|---|---|---|
| Approach | 14 specialized summarizers | LLM-based compression | Sandbox + intent filter | External docs injection |
| Token Savings | 99% (benchmarked) | ~95% (claimed) | 98% (claimed) | N/A |
| Search | BM25 + Trigram + Fuzzy | Basic recall | BM25 + Trigram + Fuzzy | Doc lookup |
| LLM Calls | None (free, deterministic) | Every observation ($$$) | None | None |
| Knowledge Base | 5 categories, auto-extraction, relevance decay | No | No | No |
| Budget Management | Configurable limits + overflow | No | Basic throttling | No |
| Event Tracking | P1–P4, error-fix detection | No | Session events only | No |
| Dashboard | Real-time web UI | No | No | No |
| Session Continuity | Snapshot save/restore | Partial | Yes | No |
| Content Types | 14 specialized detectors | Generic LLM | Generic sandbox | Docs only |
| Privacy | Fully local, tag stripping | Local | Local | Cloud |
| License | MIT | AGPL-3.0 | Elastic v2 | Open |
Quick Start
Claude Code (recommended):
/plugin marketplace add JubaKitiashvili/context-mem
/plugin install context-mem@context-mem
npm (manual):
npm install -g context-mem
cd your-project
context-mem init
context-mem serve
Cursor (.cursor/mcp.json):
{ "mcpServers": { "context-mem": { "command": "npx", "args": ["-y", "context-mem", "serve"] } } }
Windsurf (.windsurf/mcp.json):
{ "mcpServers": { "context-mem": { "command": "npx", "args": ["-y", "context-mem", "serve"] } } }
GitHub Copilot (.vscode/mcp.json):
{ "servers": { "context-mem": { "type": "stdio", "command": "npx", "args": ["-y", "context-mem", "serve"] } } }
Cline (add to MCP settings):
{ "mcpServers": { "context-mem": { "command": "npx", "args": ["-y", "context-mem", "serve"], "disabled": false } } }
Roo Code: same format as Cline above.
Gemini CLI (.gemini/settings.json):
{ "mcpServers": { "context-mem": { "command": "npx", "args": ["-y", "context-mem", "serve"] } } }
Goose (add to profile extensions):
extensions:
  context-mem:
    type: stdio
    cmd: npx
    args: ["-y", "context-mem", "serve"]
OpenClaw (add to MCP config):
{ "mcpServers": { "context-mem": { "command": "npx", "args": ["-y", "context-mem", "serve"] } } }
CrewAI / LangChain: see configs/ for Python integration examples.
Runtime Context Optimization (benchmark-verified)
| Mechanism | How it works | Savings |
|---|---|---|
| Content summarizer | Auto-detects 14 content types, produces statistical summaries | 97–100% per output |
| Index + Search | FTS5 BM25 retrieval returns only relevant chunks, code preserved exactly | 80% per search |
| Smart truncation | 4-tier fallback: JSON schema → Pattern → Head/Tail → Binary hash | 83–100% per output |
| Session snapshots | Captures full session state in <2 KB | ~50% vs log replay |
| Budget enforcement | Throttling at 80% prevents runaway token consumption | Prevents overflow |
Result: In a full coding session, 99% of tool output tokens are eliminated — leaving 99.6% of your context window free for actual problem solving. See BENCHMARK.md for complete results.
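To make the "Head/Tail" tier of smart truncation concrete, here is a minimal sketch. It is not context-mem's actual implementation: the function name `headTailTruncate`, its default line counts, and the marker text are assumptions chosen for illustration.

```typescript
// Hypothetical helper (not part of the context-mem API): keep the first
// and last lines of a long output and replace the middle with a marker.
function headTailTruncate(text: string, head = 5, tail = 5): string {
  const lines = text.split("\n");
  if (lines.length <= head + tail) {
    return text; // short enough, nothing to drop
  }
  const omitted = lines.length - head - tail;
  return [
    ...lines.slice(0, head),
    `... [${omitted} lines omitted] ...`,
    ...lines.slice(lines.length - tail),
  ].join("\n");
}

// A 100-line log reduced to 2 head lines, a marker, and 2 tail lines.
const log = Array.from({ length: 100 }, (_, i) => `line ${i + 1}`).join("\n");
const truncated = headTailTruncate(log, 2, 2);
console.log(truncated);
```

The start of an output usually shows the command that ran and the end usually shows the result or the error, which is why a head/tail cut is a reasonable fallback when no structured summarizer matches.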
Headline Numbers
| Scenario | Raw | Compressed | Savings / Throughput |
|---|---|---|---|
| Full coding session (50 tools) | 365.5 KB | 3.2 KB | 99% |
| 14 content types (555.9 KB) | 555.9 KB | 5.6 KB | 99% |
| Index + Search (6 scenarios) | 38.9 KB | 8.0 KB | 80% |
| BM25 search latency | N/A | 0.3 ms avg | 3,342 ops/s |
| Trigram search latency | N/A | 0.008 ms avg | 120,122 ops/s |
Verified on Apple M3 Pro, Node.js v22.22.0, 555.9 KB real-world test data across 21 scenarios.
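The savings percentages can be recomputed from the raw and compressed sizes in the Headline Numbers table. The sketch below assumes savings means `1 - compressed / raw`; the helper name `savingsPercent` is made up for this example.

```typescript
// Assumed definition: savings = 1 - compressed / raw, as a percentage.
function savingsPercent(rawKb: number, compressedKb: number): number {
  return (1 - compressedKb / rawKb) * 100;
}

// Sizes copied from the Headline Numbers table.
const rows: Array<[string, number, number]> = [
  ["Full coding session", 365.5, 3.2],
  ["14 content types", 555.9, 5.6],
  ["Index + Search", 38.9, 8.0],
];

for (const [name, raw, compressed] of rows) {
  console.log(`${name}: ${savingsPercent(raw, compressed).toFixed(1)}%`);
}
// Full coding session: 99.1%
// 14 content types: 99.0%
// Index + Search: 79.4%
```

Under this definition the Index + Search row comes out near 79.4% from the rounded sizes, which the table reports as 80%.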
What Gets Compressed
14 summarizers detect content type automatically and apply the optimal compression:
| Content Type | Example | Strategy |
|---|---|---|
| Shell output | npm install, build logs | Command + exit code + error extraction |
| JSON | API responses, configs | Schema extraction (keys + types, no values) |
| Errors | Stack traces, crashes | Error type + message + top frames |
| Test results | Jest, Vitest | Pass/fail/skip counts + failure details |
| TypeScript errors | error TS2345: | Error count by file + top error codes |
| Build output | Webpack, Vite, Next.js | Routes + bundle sizes + warnings |
| Git log | Commits, diffs | Commit count + authors + date range |
| CSV/TSV | Data files, analytics | Row/column count + headers + aggregation |
| Markdown | Docs, READMEs | Heading tree + code blocks + links |
| HTML | Web pages | Title + nav + headings + forms |
| Network | HTTP logs, access logs | Method/status distribution |
| Code | Source files | Function/class signatures |
| Log files | App logs, access logs | Level distribution + error extraction |
| Binary | Images, compiled files | SHA256 hash + byte count |
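As an illustration of the JSON strategy in the table above (keys + types, no values), here is a minimal sketch of schema extraction. It is an assumption about how such a summarizer could work, not the package's code; `extractSchema` and the `Schema` type are hypothetical names, and arrays are described by their first element only.

```typescript
// Hypothetical schema shape: a type name, an array schema, or an object schema.
type Schema = string | Schema[] | { [key: string]: Schema };

// Replace every value with the name of its type, keeping the key structure.
function extractSchema(value: unknown): Schema {
  if (value === null) return "null";
  if (Array.isArray(value)) {
    // Simplification: describe an array by its first element.
    return value.length > 0 ? [extractSchema(value[0])] : [];
  }
  if (typeof value === "object") {
    const record = value as Record<string, unknown>;
    const out: { [key: string]: Schema } = {};
    for (const key of Object.keys(record)) {
      out[key] = extractSchema(record[key]);
    }
    return out;
  }
  return typeof value; // "string", "number", "boolean", ...
}

const response = JSON.parse(
  '{"user":{"id":42,"name":"Ada","tags":["admin","dev"]},"ok":true}'
);
const schema = extractSchema(response);
console.log(JSON.stringify(schema));
// {"user":{"id":"number","name":"string","tags":["string"]},"ok":"boolean"}
```

The result grows with the number of distinct keys rather than with the size of the data, which is where the savings on large API responses would come from.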
Features
Search — 3-layer hybrid: BM25 full-text → trigram fuzzy → Levenshtein typo-tolerant. Sub-millisecond latency with intent classification.
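The two fuzzy layers named in the Search feature can be sketched as follows. This is a generic illustration of trigram similarity (Jaccard overlap of 3-character windows) and Levenshtein edit distance, not the package's implementation. All function names and the padding scheme are assumptions, and the BM25 layer is left out because SQLite FTS5 provides it.

```typescript
// Split a term into overlapping 3-character windows, padded so that
// prefixes and suffixes contribute their own trigrams.
function trigrams(s: string): Set<string> {
  const padded = `  ${s.toLowerCase()} `;
  const out = new Set<string>();
  for (let i = 0; i + 3 <= padded.length; i++) {
    out.add(padded.slice(i, i + 3));
  }
  return out;
}

// Jaccard similarity of the two trigram sets, between 0 and 1.
function trigramSimilarity(a: string, b: string): number {
  const ta = trigrams(a);
  const tb = trigrams(b);
  let shared = 0;
  ta.forEach((t) => {
    if (tb.has(t)) shared++;
  });
  return shared / (ta.size + tb.size - shared);
}

// Classic dynamic-programming edit distance, keeping one row at a time.
function levenshtein(a: string, b: string): number {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr.push(Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost));
    }
    prev = curr;
  }
  return prev[b.length];
}

// Rank candidate terms for a misspelled query.
const query = "sumarizer";
const candidates = ["summarizer", "serializer", "snapshot"];
const ranked = candidates
  .map((term) => ({
    term,
    similarity: trigramSimilarity(query, term),
    distance: levenshtein(query, term),
  }))
  .sort((x, y) => y.similarity - x.similarity);
console.log(ranked[0].term, ranked[0].distance); // summarizer 1
```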
Knowledge Base — Save and search patterns, decisions, errors, APIs, components. Time-decay relevance scoring with automatic archival. Auto-extraction — decisions, errors, commits, and frequently-accessed files are automatically saved to the knowledge base without manual intervention.
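The README does not specify the decay formula, so the following is only one plausible shape for time-decay relevance with archival: exponential decay with a half-life. The 30-day half-life, the archive threshold of 0.1, and both function names are assumptions for illustration.

```typescript
// Assumed model: the score halves every `halfLifeDays`.
function decayedScore(baseScore: number, ageDays: number, halfLifeDays = 30): number {
  return baseScore * Math.pow(0.5, ageDays / halfLifeDays);
}

// Hypothetical cutoff below which an entry would be archived.
const ARCHIVE_THRESHOLD = 0.1;

function shouldArchive(baseScore: number, ageDays: number): boolean {
  return decayedScore(baseScore, ageDays) < ARCHIVE_THRESHOLD;
}

console.log(decayedScore(1, 0));    // 1
console.log(decayedScore(1, 30));   // 0.5
console.log(decayedScore(1, 120));  // 0.0625
console.log(shouldArchive(1, 120)); // true
```

With a model like this, an entry that is accessed again could simply have its age reset, which keeps frequently used knowledge near the top.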
Export/Import — Transfer knowledge between machines: context-mem export dumps knowledge, snapshots, and events as JSON; context-mem import restores them in another project. Merge or replace modes.
Budget Management — Session token limits with three overflow strategies: aggressive truncation, warn, hard stop.
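A minimal sketch of how the three overflow strategies and the 80% throttle could behave. The strategy names `"truncate"`, `"warn"`, and `"stop"`, the `BudgetDecision` shape, and both functions are hypothetical; the actual configuration keys may differ.

```typescript
type OverflowStrategy = "truncate" | "warn" | "stop";

interface BudgetDecision {
  allowed: boolean;
  tokens: number; // tokens actually granted
  note?: string;
}

// Throttling starts once 80% of the session budget is used.
function isThrottled(used: number, limit: number): boolean {
  return used / limit >= 0.8;
}

function applyBudget(
  used: number,
  limit: number,
  requested: number,
  strategy: OverflowStrategy
): BudgetDecision {
  const remaining = Math.max(0, limit - used);
  if (requested <= remaining) {
    return { allowed: true, tokens: requested };
  }
  switch (strategy) {
    case "truncate": // aggressive truncation: grant only what is left
      return { allowed: true, tokens: remaining, note: "truncated to fit budget" };
    case "warn": // let the request through but flag the overflow
      return { allowed: true, tokens: requested, note: "budget exceeded" };
    case "stop": // hard stop: reject the request
      return { allowed: false, tokens: 0, note: "budget exhausted" };
  }
}

console.log(isThrottled(850, 1000));                  // true
console.log(applyBudget(900, 1000, 300, "truncate")); // grants 100 tokens
console.log(applyBudget(900, 1000, 300, "stop"));     // rejected
```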
Event Tracking — P1–P4 priority events with automatic error→fix detection.
Session Snapshots — Save/restore session state across restarts with progressive trimming.
Dashboard — Real-time web UI at http://localhost:51893 — auto-starts with serve, supports multi-project aggregation. Token economics, observations, search, knowledge base, events, system health. Switch between projects or see everything at once.
VS Code Extension — Sidebar dashboard, status bar with live savings, command palette (start/stop/search/stats). Install from marketplace: context-mem.
Auto-Detection — context-mem init detects your editor (Cursor, Windsurf, VS Code, Cline, Roo Code) and creates MCP config + AI rules automatically. First serve run also triggers lightweight auto-setup (.gitignore, rules) — zero manual config needed.
OpenClaw Native Plugin — Full ContextEngine integration with lifecycle hooks (bootstrap, ingest, assemble, compact, afterTurn, dispose). See openclaw-plugin/.
Privacy — Everything local. <private> tag stripping, custom regex redaction. No telemetry, no cloud.
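The privacy step can be pictured as two passes over the text before it is stored. This sketch assumes that tag stripping removes the `<private>` tags together with their content, and that redaction patterns are plain regular expressions. The function names and the `[REDACTED]` placeholder are made up for this example.

```typescript
// Assumption: everything between <private> and </private> is dropped.
function stripPrivate(text: string): string {
  return text.replace(/<private>[\s\S]*?<\/private>/g, "");
}

// Apply each custom pattern in turn, replacing matches with a placeholder.
function redact(text: string, patterns: RegExp[]): string {
  return patterns.reduce((acc, pattern) => acc.replace(pattern, "[REDACTED]"), text);
}

const raw = "token=abc123 <private>my notes</private>done";
const clean = redact(stripPrivate(raw), [/token=\w+/g]);
console.log(clean); // [REDACTED] done
```

Patterns need the `g` flag to replace every match; without it only the first occurrence in each text would be redacted.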
Architecture
Tool Output → Hook Capture → HTTP Bridge (:51894) → Pipeline → Summarizer (14 types) → SQLite + FTS5
                                      ↓                ↓                                     ↓
                                ObserveQueue     SHA256 Dedup                         3-Layer Search
                             (burst protection)        ↓                                     ↓
                                               4-Tier Truncation                  Progressive Disclosure
                                                       ↓                                     ↓
                                                Auto-Extract KB                 AI Assistant ← MCP Server
MCP Tools
| Tool | Description |
|---|---|
| observe | Store an observation with auto-summarization |
| search | Hybrid search across all observations |
| get | Retrieve full observation by ID |
| timeline | Reverse-chronological observation list |
| stats | Token economics for current session |
| summarize | Summarize content without storing |
| configure | Update runtime configuration |
| execute | Run code snippets (JS/Python) |
| index_content | Index content with code-aware chunking |
| search_content | Search indexed content chunks |
| save_knowledge | Save to knowledge base |
| search_knowledge | Search knowledge base |
| budget_status | Current budget usage |
| budget_configure | Set budget limits |
| restore_session | Restore session from snapshot |
| emit_event | Emit a context event |
| query_events | Query events with filters |
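MCP clients invoke these tools through JSON-RPC 2.0 `tools/call` requests. The sketch below builds such a request for the `search` tool. The `tools/call` envelope follows the MCP specification, but the `query` argument name is an assumption, since the README does not list each tool's input schema.

```typescript
// JSON-RPC 2.0 envelope used by MCP for tool invocation.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

function toolCall(id: number, name: string, args: Record<string, unknown>): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// `query` is a hypothetical argument name; check the tool's schema.
const request = toolCall(1, "search", { query: "failing jest test" });
console.log(JSON.stringify(request));
// {"jsonrpc":"2.0","id":1,"method":"tools/call","params":{"name":"search","arguments":{"query":"failing jest test"}}}
```

In normal use the editor or assistant sends these requests itself, so a hand-built request is mainly useful for debugging a server started with context-mem serve.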
CLI Commands
context-mem init # Initialize in current project
context-mem serve # Start MCP server (stdio)
context-mem status # Show database stats
context-mem doctor # Run health checks
context-mem dashboard # Open web dashboard
context-mem export # Export knowledge, snapshots, events as JSON
context-mem import # Import data from JSON export file
Configuration
{
  "storage": "auto",
  "plugins": {
    "summarizers": ["shell", "json", "error", "log", "code"],
    "search": ["bm25", "trigram"],
    "runtimes": ["javascript", "python"]
  },
  "privacy": {
    "strip_tags": true,
    "redact_patterns": []
  },
  "token_economics": true,
  "lifecycle": {
    "ttl_days": 30,
    "max_db_size_mb": 500,
    "max_observations": 50000,
    "cleanup_schedule": "on_startup",
    "preserve_types": ["decision", "commit"]
  },
  "port": 51893,
  "db_path": ".context-mem/store.db"
}
Documentation
| Doc | Description |
|---|---|
| Benchmark Results | Full benchmark suite: 21 scenarios, 7 parts |
| Configuration Guide | All config options with defaults |
Platform Support
| Platform | MCP Config | AI Rules | Auto-Setup |
|---|---|---|---|
| Claude Code | CLAUDE.md | Appends to CLAUDE.md | init + serve |
| Cursor | mcp.json | .cursor/rules/context-mem.mdc | init + serve |
| Windsurf | mcp_config.json | .windsurf/rules/context-mem.md | init + serve |
| GitHub Copilot | mcp.json | .github/copilot-instructions.md | init + serve |
| Cline | cline_mcp_settings.json | .clinerules/context-mem.md | init + serve |
| Roo Code | mcp_settings.json | .roo/rules/context-mem.md | init + serve |
| Gemini CLI | GEMINI.md | Appends to GEMINI.md | init + serve |
| Antigravity | GEMINI.md | Appends to GEMINI.md | serve |
| Goose | recipe.yaml | — | Manual |
| OpenClaw | mcp_config.json | — | Manual |
| CrewAI | example.py | — | Manual |
| LangChain | example.py | — | Manual |
AI Rules teach the AI when and how to use context-mem tools automatically — calling observe after large outputs, restore_session on startup, search before re-reading files.
Available On
- npm: npm install -g context-mem
- VS Code Marketplace: Context Mem
- Claude Code Plugin: /plugin marketplace add JubaKitiashvili/context-mem
License
MIT — use it however you want.
