zerocontext
v0.1.0
Published
Serve only the AI rules that match the file you're working on — over MCP, just in time.
Maintainers
Readme
ZeroContext
Your AI tools waste tokens on rules you don't need right now. ZeroContext serves only the rules that match the code you're working on.
CLAUDE.md and .cursorrules load everything you've ever written into every
message — keeping rules resident in the window that have nothing to do with the code in
front of you. ZeroContext is a local MCP server that inspects your project and hands the
agent only the rules it asks for, when it asks — just-in-time, per file and stack. A
smaller, sharper context window, better-targeted answers, and zero config files
cluttering your repo.
npx zerocontext init # writes MCP config pointing at the local server
npx zerocontext serve # stdio MCP server: inspects your project, serves matching rules on request
npx zerocontext analyze # see how many rule tokens your CLAUDE.md / .cursorrules keeps in every message📊 This repo's own
CLAUDE.mdkeeps 1,913 rule tokens resident in every message's context window — ZeroContext keeps ~80% of them out, serving only the slice the task needs. Measured byanalyze(o200k estimate; run it yourself below). This is a window-scoping number, not a lower bill: a preloadedCLAUDE.mdis prompt-cached, so the real win is relevance + a smaller window. (Phase 0 A/B: just-in-time ran cost-neutral-to-higher end to end — we don't claim dollar savings.)
See your own rule-token footprint in 5 seconds
Before installing anything, run analyze on any repo with a CLAUDE.md or
.cursorrules:
$ npx zerocontext analyze
ZeroContext analyze — rule tokens resident in every message
Tokenizer: o200k_base (gpt-tokenizer) — proxy for Claude, ~±10–15% vs. Claude's exact count
CLAUDE.md
1,913 tokens · 7,533 chars · 9 sections (avg 213 tok/section)
→ resident in your context window on every message, relevant or not.
With ZeroContext (just-in-time, 1 slice/task) — rule tokens resident in the window:
fixed overhead 135 tok (MCP instructions + get_rules schema, measured)
+ tool call(s) 40 tok (assumption: agent writes 1 call)
+ rule slice(s) 213 tok (your avg section size)
= 388 tok resident vs 1,913 tok preloaded (every message)
→ 80% fewer rule tokens in the window (1,525 tok)
Break-even: ZeroContext keeps fewer resident once rules exceed ~388 tok. Yours: 1,913 ✓
Scope, not $: this is window-resident rule tokens, not session cost — a cached
CLAUDE.md is ~free to re-read, so the win is relevance, not a lower bill. [Phase 0 A/B]That's a real run on this repo (snapshot 2026-06; the input is this repo's live
CLAUDE.md, so re-run to refresh) — your numbers will differ, and bigger or polyglot rule
files keep more out of the window. The percentage is the headline (tokenizer-robust); the
absolute count is a labeled estimate. It measures rule tokens resident in the window,
not session cost: a preloaded CLAUDE.md is prompt-cached, so ZeroContext's win is
relevance and a leaner window, not a smaller bill. If your rules fall below the
break-even, analyze says so instead of inflating the number.
CLAUDE.md vs. ZeroContext
| | CLAUDE.md / .cursorrules | ZeroContext |
|---|---|---|
| When rules load | Front-loaded into every message | Pulled by the agent, only when it needs them |
| What loads | Everything you've ever written | Only the slice matching the active file / stack |
| Irrelevant rule tokens resident in the window | ~2–8k per message | ~0 |
| What the agent's attention is on | Every rule, every turn | The rules that match the code in front of it |
| Rule files in your repo | One per AI tool | Zero (rules live in the server, served on demand) |
| Config footprint | The rule file itself | One gitignored .mcp.json pointer — no rules |
How it works (3 sentences)
ZeroContext runs as a local MCP server that your AI tool spawns per session. It inspects
the working directory, detects the stack from project signals, and exposes a get_rules
tool. When the agent starts working on a file or task it calls that tool and gets back
only the matching rule slice — nothing else lands in the context window.
It's agent-pulled, not a reactive editor hook. Claude Code is a terminal agent with no "current file" event for a server to chase, so ZeroContext serves rules just in time when the agent asks — the honest mechanism that actually works today, no IDE extension required.
Supported stacks
Start narrow. Detection is by project signal:
| Signal | Rule set served |
|---|---|
| next.config.* or a next dependency | Next.js (also activates React + TypeScript) |
| tsconfig.json + a react/react-dom dependency | React + TypeScript |
| drizzle.config.* or drizzle-orm + pg/postgres | Node + Postgres + Drizzle |
| pyproject.toml with a fastapi dependency | FastAPI |
| Cargo.toml | Rust |
Detection is additive: a Next.js app is also React + TypeScript, so both sets activate and each file gets only the slices whose globs match it. More stacks land as PRs come in — they aren't curated up front.
Launch rule sets
Five hand-crafted sets ship at launch:
- React + TypeScript
- Next.js
- FastAPI
- Rust
- Node + Postgres + Drizzle
Monorepo override (planned)
Not wired yet — autodetection handles single-stack repos today. The intended shape: a
single repo-root .zerocontext.yaml maps a path prefix to a bundled rule-set name, so
each workspace gets the right slice. Pointers only, never inline rules — the one allowed
footprint in your repo:
# .zerocontext.yaml — planned, not yet read by the server
version: 1
routes:
apps/web: nextjs
services/api: fastapiFollow this section for monorepo routing as it lands.
How is this different?
- vs.
awesome-cursorrules— that's a rules library; this is a runtime that picks the right slice and serves it to the agent on demand. - vs. native Cursor / Claude rules — those are static and front-loaded; ZeroContext is just-in-time, so the tokens only land when a rule is actually relevant.
- Why not just use
CLAUDE.md? — it keeps every rule you've ever written resident in the window on every message, relevant or not. Runanalyzeto see your footprint.
Status
Pre-launch, solo OSS. Phase 0 is resolved. In a live headless session the agent
called get_rules on its own and got back the correct stack-matching slice, so
agent-pulled just-in-time injection works end to end. The honest metric is rule tokens
resident in the window (what analyze reports), not session cost: the same-task A/B
found just-in-time ran cost-neutral-to-higher, because a preloaded CLAUDE.md is
prompt-cached and every turn re-reads the whole window anyway. So the pitch is relevance
and a leaner window — never a lower bill. The mechanism we build and market is
agent-pulled, just-in-time (plus optional session-start); reactive file-switch is
not promised, because it isn't available to a terminal agent without an IDE extension.
Nothing user-facing claims behavior that isn't built — this README tracks only what actually ships today.
License
MIT — see LICENSE. An open-source tool that scratches a real itch.
