openprune
v0.8.1
Open-source CLI to minimize LLM token usage via smart code context extraction
OpenPrune
Open-source, language-agnostic CLI that minimizes LLM token usage by pruning any codebase down to the context that actually matters. Works offline on repos of any size.
Commands
| Command | Description |
|---------|-------------|
| openprune init | Full index + embeddings + hash/snapshot baseline |
| openprune query "…" | Assemble skeleton + RAG chunks + diffs + session summary |
| openprune dashboard | Web UI at http://localhost:4242 |
| openprune watch | Debounced incremental re-index on file save |
| openprune stats | Token savings dashboard + index health |
| openprune diff | Unified diffs since last baseline |
| openprune compress | Manually compress session.json history |
| openprune session-end | Snapshot hashes + file baselines after a session |
| openprune mcp | MCP server on stdio for Cursor / Claude Desktop |
Quick start (Cursor MCP — recommended)
No manual copy-paste. The agent fetches pruned context automatically.
npm install -g openprune
cd your-project
openprune init
Add to .cursor/mcp.json in the folder you opened in Cursor (must be the same folder that contains .cursor/). See examples/cursor/mcp.json:
{
  "mcpServers": {
    "openprune": {
      "type": "stdio",
      "command": "npx",
      "args": ["-y", "openprune", "mcp"],
      "env": { "OPENPRUNE_ROOT": "${workspaceFolder}" }
    }
  }
}
Important: Cursor requires "type": "stdio" for local servers. ${workspaceFolder} is the directory that contains .cursor/mcp.json (not a subfolder).
| You opened in Cursor | Put mcp.json at |
|----------------------|-------------------|
| your-app/ | your-app/.cursor/mcp.json |
| project-folder/ (monorepo) | parent .cursor/mcp.json pointing at openprune/dist/... (see repo root example) |
| openprune/ only | openprune/.cursor/mcp.json |
Copy examples/cursor/rules/openprune.mdc to .cursor/rules/openprune.mdc (or use the rule from this repo). Restart Cursor, enable the openprune MCP server in settings, then use Agent as usual — it should call get_codebase_context before broad file reads.
MCP tools
| Tool | When to use |
|------|-------------|
| evaluate_task_scope | Unsure if the task needs repo context |
| get_codebase_context | Required for implement / debug / explore tasks |
| openprune_index_status | Check init / stale index |
Quick start (CLI paste)
npm install -g openprune
cd your-project
openprune init
openprune query "how does auth work"
# paste the output into Cursor / ChatGPT / Claude, then ask your question
openprune watch # optional: keep index fresh while coding
From source:
git clone https://github.com/openprune/openprune.git
cd openprune
npm install
npm run build
npm run openprune -- init
How OpenPrune saves cost
LLM tools (Cursor, Claude, ChatGPT, etc.) charge mainly on input tokens — the code and chat history you send each turn.
Without OpenPrune, a common pattern is:
- @-mention whole folders or large files
- Let the agent re-read many unrelated files each turn
- Paste long threads again on follow-ups
On a medium repo that can mean tens of thousands of tokens per message, repeated many times.
With OpenPrune MCP in Cursor, the agent calls get_codebase_context and receives the same pruned packet (~8k token cap) without you pasting anything.
With the CLI, you run openprune query "…" and paste one pruned context block that contains only:
- Files and symbols related to your question (skeleton)
- The few code regions that matter (RAG chunks)
- Small diffs for what you changed recently
- A short session summary (after multiple turns)
Same question, far less input — OpenPrune prints an estimate in the header, e.g. Tokens used: 1,400 / 8000 (vs ~45,000 raw).
| Phase | Command | Saves money? |
|-------|---------|----------------|
| Setup | openprune init | No (local indexing only) |
| Per question | openprune query or MCP get_codebase_context | Yes |
| While coding | openprune watch | Indirectly (keeps retrieval accurate) |
| Track savings | openprune stats | Shows history, does not save by itself |
OpenPrune integrates with Cursor via MCP (recommended) or manual query + paste.
Rough illustration (varies by repo and model pricing):
| Approach | Input per heavy turn | 20-turn session @ $3/1M input |
|----------|----------------------|-------------------------------|
| Paste whole src/ context | ~50,000 tokens | ~$3.00 |
| OpenPrune query packet | ~1,500 tokens | ~$0.09 |
Check your provider’s pricing; run openprune stats to see measured usage on your machine.
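The figures in the table follow from simple arithmetic, sketched below in Python. The function name and the assumption that every turn sends the same number of input tokens are illustrative only; real sessions vary turn by turn.

```python
def session_input_cost(tokens_per_turn: int, turns: int, usd_per_million: float) -> float:
    """Input-token cost in USD for a session where every turn sends the same amount."""
    total_tokens = tokens_per_turn * turns
    return total_tokens * usd_per_million / 1_000_000


# Values taken from the table above: 20 turns at $3 per 1M input tokens.
raw = session_input_cost(50_000, 20, 3.0)    # whole src/ pasted each turn
pruned = session_input_cost(1_500, 20, 3.0)  # OpenPrune query packet each turn

print(f"raw: ${raw:.2f}, pruned: ${pruned:.2f}")
```

Swap in your own per-turn token counts (for example from openprune stats) and your provider's price to get a figure for your project.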
New user guide
1. One-time setup (per repo)
cd /path/to/your-app
openprune init
This creates .openprune/ and adds it to .gitignore for you. Nothing is sent to the cloud.
2. Before each coding task (saves tokens)
openprune query "your specific question in one sentence"
Output is copied to your clipboard by default. Paste it as the first message or pinned context in your LLM chat.
Use --stdout to print the output only, without copying it to the clipboard:
openprune query "how does billing work" --stdout
3. Optional: keep the index fresh
openprune watch
Re-indexes on save so the next query stays accurate. Run in a separate terminal while you code.
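To illustrate what "debounced" means for a watcher, here is a minimal Python sketch that groups save events into re-index batches. It is not OpenPrune's implementation: the function name, the event shape, and the exact boundary rule are assumptions. The 400 ms default mirrors watchDebounceMs from the sample configuration.

```python
from typing import Iterable, List, Optional, Set, Tuple


def coalesce_saves(events: Iterable[Tuple[int, str]], debounce_ms: int = 400) -> List[Set[str]]:
    """Group (timestamp_ms, path) save events into re-index batches.

    A new batch starts once at least debounce_ms has passed since the previous save,
    so a burst of saves triggers a single re-index of the files it touched.
    """
    batches: List[Set[str]] = []
    last_ts: Optional[int] = None
    for ts, path in sorted(events):
        if last_ts is None or ts - last_ts >= debounce_ms:
            batches.append(set())  # quiet period elapsed: start a new batch
        batches[-1].add(path)
        last_ts = ts
    return batches


events = [(0, "auth.ts"), (120, "auth.ts"), (300, "login.ts"), (1500, "billing.ts")]
for batch in coalesce_saves(events):
    print(sorted(batch))
```

The first three saves arrive within 400 ms of each other and collapse into one batch; the save at 1500 ms forms its own.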
4. End of session
openprune session-end
openprune stats
Updates file baselines for diffs and shows token usage history.
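A hash baseline of the kind session-end snapshots can be compared against the current tree to find what changed. The sketch below shows the general idea in Python; the use of SHA-256, the function names, and the in-memory file map are assumptions for illustration, not OpenPrune's actual storage format.

```python
import hashlib
from typing import Dict, Set


def snapshot(files: Dict[str, bytes]) -> Dict[str, str]:
    """Map each path to a content hash (SHA-256 is an assumed choice)."""
    return {path: hashlib.sha256(data).hexdigest() for path, data in files.items()}


def changed_since(baseline: Dict[str, str], current: Dict[str, str]) -> Set[str]:
    """Paths that were added, removed, or modified relative to the baseline."""
    paths = baseline.keys() | current.keys()
    return {p for p in paths if baseline.get(p) != current.get(p)}


baseline = snapshot({"login.ts": b"v1", "oauth.ts": b"v1"})
current = snapshot({"login.ts": b"v1", "oauth.ts": b"v2", "callback.ts": b"v1"})
print(sorted(changed_since(baseline, current)))
```

Only the changed paths would need fresh diffs or re-indexing; unchanged files keep their existing entries.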
Using OpenPrune with Cursor (MCP)
- Run openprune init in your app repo
- Add .cursor/mcp.json and .cursor/rules/openprune.mdc (from examples/cursor/)
- Ask Agent: "Add Google OAuth next to email login"
The rule tells the agent to call get_codebase_context first and to avoid @src/ unless something is missing from the packet.
Manual paste (without MCP)
Scenario: You have an existing app and ask Cursor to add a new feature (e.g. Google OAuth next to email login).
Step A — Prepare context
cd your-app
openprune init # skip if already done
openprune query "Add Google OAuth alongside existing login; callback route, token exchange, reuse session cookies"
You get a compact block: auth-related skeleton, login/middleware/route chunks, recent edits to oauth.ts, and your question.
Step B — Paste into Cursor (sample prompt)
Copy the full openprune query output, then start a new Cursor chat (or Agent) with:
Use the project context below. Do not re-scan the whole repo unless something is missing.
Prefer editing the files listed in the skeleton; follow existing patterns in the code chunks.
--- OPENPRUNE CONTEXT START ---
(paste entire openprune query output here)
--- OPENPRUNE CONTEXT END ---
Task:
Implement Google OAuth alongside the existing email/password login.
- Reuse session and token patterns from login.ts
- Add routes for /auth/google and /auth/google/callback
- Wire requireAuth consistently with the middleware shown above
- List files you will change before editing
Cursor implements against ~1–3k tokens of targeted context instead of pulling in large parts of the tree on its own.
Step C — Follow-up questions (save again)
After you change several files, run a new query instead of re-pasting the whole chat:
openprune query "OAuth callback: exchange code for tokens and store refresh token like login()"
Follow-up prompt template:
Updated context from OpenPrune (use this instead of re-reading all prior files):
--- OPENPRUNE CONTEXT START ---
(paste new query output)
--- OPENPRUNE CONTEXT END ---
Continue the OAuth work. Only change what this task requires.
Step D — Shorter prompts when context is already in the thread
Once Cursor has the OpenPrune block in the thread, you can ask normally:
Add error handling for invalid OAuth state and log failures without leaking tokens.
Re-run openprune query when you switch to a new area of the codebase (e.g. from auth to billing).
Sample prompts (copy-paste)
Explore unfamiliar code
--- OPENPRUNE CONTEXT START ---
(openprune query output)
--- OPENPRUNE CONTEXT END ---
Explain how authentication works in this codebase in 5 bullets.
Point to the exact files I should read next.
Implement a feature
--- OPENPRUNE CONTEXT START ---
(openprune query output)
--- OPENPRUNE CONTEXT END ---
Implement: <describe feature in 2–3 sentences>
Constraints: match existing style; minimal diff; add no new dependencies unless necessary.
Debug a bug
openprune query "why does validateToken fail on expired JWT in production config"
--- OPENPRUNE CONTEXT START ---
(openprune query output)
--- OPENPRUNE CONTEXT END ---
Bug: <symptom>. Propose the smallest fix and which tests to add.
Refactor safely
--- OPENPRUNE CONTEXT START ---
(openprune query output)
--- OPENPRUNE CONTEXT END ---
Refactor <module> for clarity without behavior changes.
List risks and files touched before making edits.
Tips for maximum savings
- Run openprune query with a specific sentence — not vague one-word queries.
- Paste the packet once per task, then ask focused follow-ups.
- Avoid @-mentioning entire src/ if OpenPrune already listed the right files.
- Use openprune watch or re-init after large refactors so chunks stay current.
- Check openprune stats to see layer breakdown (skeleton vs RAG vs diff) and top retrieved files.
Large codebase features
- SQLite symbol index (index.db) auto-enabled above 2,000 files
- Batched indexing/embedding with configurable concurrency
- Incremental watch — only changed files re-indexed/re-embedded
- Batched vector search — min-heap top-k over SQLite cursors (no full-RAM load)
- SQLite state store (state.db) for hashes + usage stats at scale
- Full rebuild only when >15% of files change (configurable)
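The "min-heap top-k" idea behind batched vector search can be sketched in a few lines of Python. This is an illustration of the general technique, not OpenPrune's code: the function names are hypothetical, cosine similarity is an assumed scoring function, and a plain generator stands in for the SQLite cursor that would supply batches.

```python
import heapq
import math
from typing import Iterable, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def top_k(query: Sequence[float],
          batches: Iterable[List[Tuple[str, Sequence[float]]]],
          k: int = 5) -> List[Tuple[float, str]]:
    """Stream batches of (chunk_id, vector) and keep only the k best scores in memory."""
    heap: List[Tuple[float, str]] = []  # min-heap: the weakest kept score sits at heap[0]
    for batch in batches:
        for chunk_id, vec in batch:
            score = cosine(query, vec)
            if len(heap) < k:
                heapq.heappush(heap, (score, chunk_id))
            elif score > heap[0][0]:
                heapq.heapreplace(heap, (score, chunk_id))  # evict the weakest
    return sorted(heap, reverse=True)  # best first


def fake_cursor():
    # Stand-in for rows fetched batch by batch from a database cursor.
    yield [("a", [1.0, 0.0]), ("b", [0.0, 1.0])]
    yield [("c", [1.0, 1.0]), ("d", [-1.0, 0.0])]


print([chunk_id for _, chunk_id in top_k([1.0, 0.0], fake_cursor(), k=2)])
```

Because the heap never holds more than k entries, memory use depends on k and the batch size rather than on the total number of stored vectors.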
Configuration (.openprune/config.json)
{
  "maxTokens": 8000,
  "topK": 5,
  "embeddingModel": "tfidf",
  "indexBackend": "auto",
  "sqliteIndexThreshold": 2000,
  "indexConcurrency": 32,
  "searchBatchSize": 2000,
  "watchDebounceMs": 400,
  "fullRebuildChangeRatio": 0.15,
  "maxDiffFiles": 15,
  "maxDiffTokens": 250
}
Optional project ignore file: .openprune-ignore (same rules as .gitignore).
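Two of these settings act as thresholds: sqliteIndexThreshold (SQLite index above 2,000 files) and fullRebuildChangeRatio (full rebuild when more than 15% of files change). The Python sketch below shows how such checks could be expressed. The function names are hypothetical, and the strict "greater than" comparisons and the empty-index case are assumptions based on the wording in this README.

```python
def use_sqlite_index(file_count: int, threshold: int = 2000) -> bool:
    """With indexBackend set to "auto": switch to SQLite above the threshold."""
    return file_count > threshold


def needs_full_rebuild(changed_files: int, total_files: int, ratio: float = 0.15) -> bool:
    """Full rebuild only when the changed share exceeds the configured ratio."""
    if total_files == 0:
        return True  # assumption: with nothing indexed yet, build everything
    return changed_files / total_files > ratio


print(use_sqlite_index(1_800), use_sqlite_index(12_000))
print(needs_full_rebuild(40, 1_000), needs_full_rebuild(400, 1_000))
```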
Output layers
- [PROJECT SKELETON] — compressed symbol manifest
- [RELEVANT CODE CHUNKS] — TF-IDF / Ollama RAG top-k
- [RECENT CHANGES] — unified diffs vs last baseline
- [CONVERSATION SUMMARY] — compressed session history
- [USER QUERY]
Privacy
All indexing and embeddings stay on your machine under .openprune/. No telemetry by default.
v0.8 MCP (Cursor)
- openprune mcp — stdio MCP server; no manual paste required.
- Tools: get_codebase_context, openprune_index_status, evaluate_task_scope.
- Agent policy resource at openprune://agent-policy + Cursor rule template in examples/cursor/.
npm run verify:mcp # MCP client end-to-end (after init)
npm run verify:sprint # retrieval regression (after init)
MCP not showing in Cursor?
- Workspace root — Cursor only reads .cursor/mcp.json from the opened folder. If you opened Code-Token-Saver, config must be there (not only under openprune/.cursor/).
- "type": "stdio" — required in each server entry (Cursor MCP docs).
- Enable the server — new MCP servers appear disabled in Settings → MCP; turn openprune on once.
- Build — local dev uses node ${workspaceFolder}/dist/index.js; run npm run build first.
- Index path — run openprune init in the folder passed to --root. If your workspace is Code-Token-Saver but the app is in openprune/, use "--root", "${workspaceFolder}/openprune" in mcp.json (see repo root .cursor/mcp.json).
- Reload — restart Cursor or run Developer: Reload Window.
- Logs — Output panel → MCP Logs (startup line shows MCP project root: …).
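The first two checks in the list above can be automated with a few lines of Python. This helper is hypothetical and not shipped with OpenPrune; it only inspects the JSON text of an mcp.json file for the server entry and the "type": "stdio" field described in this README.

```python
import json
from typing import List


def check_mcp_config(text: str, server: str = "openprune") -> List[str]:
    """Return a list of problems found in the contents of a .cursor/mcp.json file."""
    config = json.loads(text)
    entry = config.get("mcpServers", {}).get(server)
    if entry is None:
        return [f'no "{server}" entry under "mcpServers"']
    problems = []
    if entry.get("type") != "stdio":
        problems.append('"type" must be "stdio" for a local server')
    if not entry.get("command"):
        problems.append('"command" is missing')
    return problems


sample = '{"mcpServers": {"openprune": {"command": "npx", "args": ["-y", "openprune", "mcp"]}}}'
print(check_mcp_config(sample))
```

To check a real file, pass its contents, for example the text read from .cursor/mcp.json in the folder you opened in Cursor.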
v0.7 retrieval
- Query-focused skeleton — lists files/symbols relevant to your question, not the whole repo tree.
- Hybrid RAG — vector search + lexical boost (paths, symbols, summaries) + related files in the same module.
- Chunk deduplication — overlapping regions from the same file are merged to save tokens.
- Stale index warning — openprune query warns when files changed since last init.
- Layer token breakdown — header shows skeleton / RAG / diff / session token use.
- Stats — openprune stats shows top retrieved files from real query history.
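Chunk deduplication, as described in the list above, amounts to merging overlapping line ranges from the same file. A minimal Python sketch follows, assuming chunks are represented as (path, start_line, end_line) tuples with inclusive line numbers; that representation and the function name are illustrative, not OpenPrune's internal format.

```python
from typing import List, Tuple

Chunk = Tuple[str, int, int]  # (path, start_line, end_line), inclusive


def merge_chunks(chunks: List[Chunk]) -> List[Chunk]:
    """Merge overlapping regions from the same file so no line is emitted twice."""
    merged: List[Chunk] = []
    for path, start, end in sorted(chunks):
        if merged and merged[-1][0] == path and start <= merged[-1][2]:
            _, prev_start, prev_end = merged[-1]
            merged[-1] = (path, prev_start, max(prev_end, end))  # extend previous region
        else:
            merged.append((path, start, end))
    return merged


chunks = [("login.ts", 10, 40), ("login.ts", 30, 60), ("oauth.ts", 1, 20), ("login.ts", 100, 120)]
for chunk in merge_chunks(chunks):
    print(chunk)
```

Here the two overlapping login.ts regions collapse into lines 10 to 60, while the distant region at lines 100 to 120 and the oauth.ts chunk stay separate.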
Migrating from older dev previews
Run openprune init once — it automatically renames .codesieve/, .contextmesh/, or related ignore files to .openprune/ when present.
