@sparn/cortex
v1.3.0
Published
Context optimization for AI coding agents — reduce token usage by 60-90%
Downloads
412
Maintainers
Readme
Cortex
Context optimization for AI coding agents. Keeps your Claude Code sessions lean so they last longer, cost less, and stay focused.
The Problem
A typical Claude Code session generates 100K+ tokens in under an hour. File reads, test outputs, build logs — it all piles up. Eventually your context window is full of noise and Claude starts forgetting the decisions you made 20 minutes ago.
Cortex sits between Claude and your project, quietly compressing verbose outputs, tracking what's still relevant, and making sure critical context (like that error you're debugging) never gets lost.
Quick Start
npm install -g @sparn/cortex
cd your-project
cortex setupThat's it. One command handles everything: detects Claude Code, creates the config, installs hooks, starts the background daemon, and generates your CLAUDE.md. You don't need to change anything about how you use Claude Code — cortex works through hooks that fire automatically.
If you want more control, you can run each step yourself:
cortex init # Create .cortex/ config
cortex hooks install # Install Claude Code hooks
cortex daemon start # Start background optimizer
cortex docs # Generate CLAUDE.mdWhat It Actually Does
Hook-based compression
Every time Claude runs a Bash command, reads a file, or greps your codebase, cortex checks the output size. If it's large (3000+ tokens), it generates a content-aware summary and attaches it as additional context. Claude still sees the full output, but also gets a compact version to reference later when the original scrolls out of view.
The compression is type-aware:
- Test results — extracts pass/fail lines, skips the noise
- TypeScript errors — groups by error code (
TS2304(12), TS7006(3)) - Lint output — aggregates by rule
- Git diffs — lists changed files
- JSON responses — shows structure (array length, object keys)
- Build logs — pulls error and warning lines
Typical reduction: 60-90% on verbose outputs.
Session awareness
At the start of each prompt, cortex tells you what's going on:
[cortex] Session: 4.2MB | 3 outputs compressed | ~12K tokens savedWhen your session grows past 2MB, it nudges Claude to stay concise. Small thing, but it helps.
Cross-session memory
When you start a fresh Claude Code session in a project cortex knows about, it injects a briefing with:
- Important context from past sessions (errors, architectural decisions, patterns)
- Overdue technical debt items
- Active implementation plans
So Claude doesn't start from zero every time.
Daemon
The background daemon watches your session files and optimizes them when they get large. It also handles periodic consolidation (merging duplicate entries, cleaning up stale data). The daemon auto-starts when you open a Claude Code session — if it dies, the next prompt brings it back.
cortex daemon status # Check if it's running
cortex daemon stop # Stop itAuto-updating hooks
When you upgrade cortex via npm, the postinstall script detects your existing hooks and updates their paths. No manual re-installation needed.
Codebase Intelligence
Beyond compression, cortex can analyze your project structure to help Claude navigate smarter.
Dependency graph
cortex graph --analyze # Full analysis: entry points, hot paths, orphans
cortex graph --focus auth # Focus on files related to "auth"
cortex graph --entry src/index.ts # Trace from an entry pointSearch
Full-text search backed by SQLite FTS5, with ripgrep fallback:
cortex search init # Index your codebase
cortex search validateToken # Search
cortex search refresh # Re-index after changesDocs generation
Auto-generates a CLAUDE.md from your project structure, scripts, and dependency graph:
cortex docs
cortex docs --no-graph # Skip dependency analysis
cortex docs -o docs/CLAUDE.mdWorkflow planner
Create implementation plans with token budgets, then execute and verify:
cortex plan "Add user auth" --files src/auth.ts src/routes.ts
cortex exec <plan-id>
cortex verify <plan-id>Tech debt tracker
Track technical debt with severity levels:
cortex debt add "Fix N+1 queries" --severity P0 --due 2026-04-01 --files src/db.ts
cortex debt list --overdue
cortex debt resolve <id>Codebase analyzer
Multi-dimensional scoring across architecture, quality, database, security, tokens, and test coverage:
cortex analyze # Full analysis with score + grade
cortex analyze --json # Machine-readable output with fixable/fixType hints
cortex analyze --file src/api.ts # Per-file health score
cortex analyze --changed # Only files changed since last commit
cortex analyze --history # Score trend over time
cortex analyze --save-baseline # Save current results as baseline
cortex analyze --diff # Compare against saved baselineSupports .cortexignore for excluding files or suppressing specific rules:
# .cortexignore
src/generated/**
src/legacy/** QUAL-001,QUAL-003CLI
Running cortex with no arguments shows project status. Help shows the essential commands by default:
cortex # Project status
cortex --help # Essential commands
cortex --help --all # EverythingEssential commands: setup, status, optimize, stats, hooks.
Advanced commands (visible with --all): analyze, graph, search, docs, plan, exec, verify, debt, config, relay, consolidate, interactive, daemon, mcp:server.
Configuration
After setup, edit .cortex/config.yaml if you want to tune things:
pruning:
threshold: 5 # Keep top 5% of context
aggressiveness: 50 # 0-100
decay:
defaultTTL: 24 # Hours before context starts fading
decayThreshold: 0.95
realtime:
tokenBudget: 40000
autoOptimizeThreshold: 60000
agent: claude-code # auto-detected during setupOr use the CLI:
cortex config get pruning.threshold
cortex config set pruning.threshold 10MCP Server
Cortex runs as an MCP server for Claude Desktop or any MCP client:
cortex mcp:serverExposes four tools: cortex_optimize, cortex_stats, cortex_search, cortex_consolidate.
Programmatic API
import { createSparsePruner, estimateTokens } from '@sparn/cortex';
const pruner = createSparsePruner({ threshold: 5 });
const result = pruner.prune(largeContext, 5);
console.log(`${estimateTokens(largeContext)} -> ${estimateTokens(result.prunedContext)} tokens`);Full API: createDependencyGraph, createSearchEngine, createWorkflowPlanner, createDocsGenerator, createDebtTracker, createAnalysisHistory, createCortexIgnore, buildAnalysisContext, buildChangedFilesContext, buildSingleFileContext, createKVMemory, createBudgetPrunerFromConfig, createIncrementalOptimizer, and more.
How It Works
Cortex uses a multi-stage pipeline:
- Relevance filtering — Only the top 2-5% of context carries real signal
- Time decay — Older context fades unless reinforced by reuse
- Entry classification — Active, ready, or silent based on score
- Critical event detection — Errors and stack traces get permanently flagged (BTSP)
- Consolidation — Periodic merging of duplicates and cleanup of stale data
Development
git clone https://github.com/sparn-labs/cortex.git
cd cortex
npm install
npm run build
npm test
npm run lint
npm run typecheckLicense
MIT
