openprune

v0.8.1

Published

25 days ago

Open-source CLI to minimize LLM token usage via smart code context extraction


ajayjeena


OpenPrune

Open-source, language-agnostic CLI that minimizes LLM token usage by pruning any codebase down to the context that actually matters. Works offline on repos of any size.

Commands

| Command | Description |
|---------|-------------|
| openprune init | Full index + embeddings + hash/snapshot baseline |
| openprune query "…" | Assemble skeleton + RAG chunks + diffs + session summary |
| openprune dashboard | Web UI at http://localhost:4242 |
| openprune watch | Debounced incremental re-index on file save |
| openprune stats | Token savings dashboard + index health |
| openprune diff | Unified diffs since last baseline |
| openprune compress | Manually compress session.json history |
| openprune session-end | Snapshot hashes + file baselines after a session |
| openprune mcp | MCP server on stdio for Cursor / Claude Desktop |

Quick start (Cursor MCP — recommended)

No manual copy-paste. The agent fetches pruned context automatically.

npm install -g openprune
cd your-project
openprune init

Add the following to .cursor/mcp.json in the folder you opened in Cursor (this must be the folder that contains .cursor/). See examples/cursor/mcp.json:

{
  "mcpServers": {
    "openprune": {
      "type": "stdio",
      "command": "npx",
      "args": ["-y", "openprune", "mcp"],
      "env": { "OPENPRUNE_ROOT": "${workspaceFolder}" }
    }
  }
}

Important: Cursor requires "type": "stdio" for local servers. ${workspaceFolder} is the directory that contains .cursor/mcp.json (not a subfolder).

| You opened in Cursor | Put mcp.json at |
|----------------------|-------------------|
| your-app/ | your-app/.cursor/mcp.json |
| project-folder/ (monorepo) | parent .cursor/mcp.json pointing at openprune/dist/... (see repo root example) |
| openprune/ only | openprune/.cursor/mcp.json |

Copy examples/cursor/rules/openprune.mdc to .cursor/rules/openprune.mdc (or use the rule from this repo). Restart Cursor, enable the openprune MCP server in settings, then use Agent as usual — it should call get_codebase_context before broad file reads.

MCP tools

| Tool | When to use |
|------|-------------|
| evaluate_task_scope | Unsure if the task needs repo context |
| get_codebase_context | Required for implement / debug / explore tasks |
| openprune_index_status | Check init / stale index |

Quick start (CLI paste)

npm install -g openprune
cd your-project
openprune init
openprune query "how does auth work"
# paste the output into Cursor / ChatGPT / Claude, then ask your question
openprune watch   # optional: keep index fresh while coding

From source:

git clone https://github.com/openprune/openprune.git
cd openprune
npm install
npm run build
npm run openprune -- init

How OpenPrune saves cost

For coding work, most of what LLM tools (Cursor, Claude, ChatGPT, etc.) bill is input tokens: the code and chat history you send on every turn.

Without OpenPrune, a common pattern is:

- @-mention whole folders or large files
- Let the agent re-read many unrelated files each turn
- Paste long threads again on follow-ups

On a medium repo that can mean tens of thousands of tokens per message, repeated many times.

With OpenPrune MCP in Cursor, the agent calls get_codebase_context and receives the same pruned packet that openprune query produces (capped at about 8k tokens), without you pasting anything.

With the CLI, you run openprune query "…" and paste one pruned context block that contains only:

- Files and symbols related to your question (skeleton)
- The few code regions that matter (RAG chunks)
- Small diffs for what you changed recently
- A short session summary (after multiple turns)

Same question, far less input — OpenPrune prints an estimate in the header, e.g. Tokens used: 1,400 / 8000 (vs ~45,000 raw).
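To track savings across runs, the header line can be parsed with a small script. The sketch below assumes the header always matches the example format shown above; the real format may differ between versions, and `parse_token_header` is a hypothetical helper, not part of OpenPrune.

```python
import re

# Pattern modeled on the example header in this README (an assumption about the format).
HEADER_RE = re.compile(
    r"Tokens used:\s*([\d,]+)\s*/\s*([\d,]+)\s*\(vs\s*~?([\d,]+)\s*raw\)"
)


def parse_token_header(line: str) -> dict:
    """Extract used / cap / raw token counts and the fraction of input saved."""
    match = HEADER_RE.search(line)
    if match is None:
        raise ValueError("line does not look like an OpenPrune token header")
    used, cap, raw = (int(group.replace(",", "")) for group in match.groups())
    return {"used": used, "cap": cap, "raw": raw, "saved_fraction": 1 - used / raw}


info = parse_token_header("Tokens used: 1,400 / 8000 (vs ~45,000 raw)")
print(info["used"], info["cap"], info["raw"])
print(f"input saved: {info['saved_fraction']:.1%}")
```

With the example header, the packet is about 3% of the raw estimate.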

| Phase | Command | Saves money? |
|-------|---------|----------------|
| Setup | openprune init | No (local indexing only) |
| Per question | openprune query or MCP get_codebase_context | Yes |
| While coding | openprune watch | Indirectly (keeps retrieval accurate) |
| Track savings | openprune stats | Shows history, does not save by itself |

OpenPrune integrates with Cursor via MCP (recommended) or manual query + paste.

Rough illustration (varies by repo and model pricing):

| Approach | Input per heavy turn | Cost of a 20-turn session at $3 per 1M input tokens |
|----------|----------------------|-------------------------------|
| Paste whole src/ context | ~50,000 tokens | ~$3.00 |
| OpenPrune query packet | ~1,500 tokens | ~$0.09 |

Check your provider’s pricing; run openprune stats to see measured usage on your machine.
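The arithmetic behind the illustration can be reproduced in a few lines. This is only a sketch of the calculation; the token counts and the $3 per 1M price are the illustrative figures from the table, not measurements, and `session_input_cost` is a hypothetical helper name.

```python
def session_input_cost(tokens_per_turn: int, turns: int, usd_per_million: float) -> float:
    """Return the input-token cost in USD for a whole session."""
    total_tokens = tokens_per_turn * turns
    return total_tokens * usd_per_million / 1_000_000


# Illustrative figures: 20 heavy turns at $3 per 1M input tokens.
raw_cost = session_input_cost(50_000, turns=20, usd_per_million=3.0)
pruned_cost = session_input_cost(1_500, turns=20, usd_per_million=3.0)

print(f"whole src/ paste: ${raw_cost:.2f}")
print(f"OpenPrune packet: ${pruned_cost:.2f}")
print(f"input cost reduced by {1 - pruned_cost / raw_cost:.0%}")
```

Swap in your own per-turn token counts and your provider's price to get a figure for your setup.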

New user guide

1. One-time setup (per repo)

cd /path/to/your-app
openprune init

This creates .openprune/ and adds it to .gitignore for you. Nothing is sent to the cloud.

2. Before each coding task (saves tokens)

openprune query "your specific question in one sentence"

Output is copied to your clipboard by default. Paste it as the first message or pinned context in your LLM chat.

Use --stdout to print the output to the terminal instead:

openprune query "how does billing work" --stdout
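If you want to drive the CLI from a script, the --stdout form is the easiest to capture. The following is a minimal sketch, assuming openprune is installed on PATH and that --stdout writes the whole packet to standard output; `build_query_command` and `fetch_context` are hypothetical helper names.

```python
import shutil
import subprocess


def build_query_command(question: str) -> list[str]:
    """Build the argv for a query that prints the packet to stdout."""
    return ["openprune", "query", question, "--stdout"]


def fetch_context(question: str, repo_dir: str) -> str:
    """Run the query inside repo_dir and return whatever the CLI printed."""
    if shutil.which("openprune") is None:
        raise RuntimeError("openprune is not on PATH; install it with npm first")
    result = subprocess.run(
        build_query_command(question),
        cwd=repo_dir,          # run in the repo that was indexed with `openprune init`
        capture_output=True,
        text=True,
        check=True,            # raise if the CLI exits with a non-zero status
    )
    return result.stdout


print(build_query_command("how does billing work"))
```

Passing the question as a single list element avoids shell quoting problems with long questions.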

3. Optional: keep the index fresh

openprune watch

Re-indexes on save so the next query stays accurate. Run in a separate terminal while you code.
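Debouncing means a burst of saves to the same file triggers one re-index once the file has been quiet for a short window. The sketch below illustrates the idea with an explicit clock; it is not OpenPrune's watcher. The 400 ms window mirrors the watchDebounceMs value in the sample config, and per-file tracking is an assumption.

```python
class SaveDebouncer:
    """Collect save events and release a path once it has been quiet long enough."""

    def __init__(self, debounce_ms: int = 400):
        self.debounce_ms = debounce_ms
        self._last_seen: dict[str, int] = {}

    def record_save(self, path: str, now_ms: int) -> None:
        # A new save restarts the quiet period for that path.
        self._last_seen[path] = now_ms

    def ready(self, now_ms: int) -> list[str]:
        """Return (and forget) paths whose last save is at least debounce_ms old."""
        due = sorted(
            path for path, seen in self._last_seen.items()
            if now_ms - seen >= self.debounce_ms
        )
        for path in due:
            del self._last_seen[path]
        return due


debouncer = SaveDebouncer(debounce_ms=400)
debouncer.record_save("src/login.ts", now_ms=0)
debouncer.record_save("src/login.ts", now_ms=300)   # second save restarts the window
debouncer.record_save("src/oauth.ts", now_ms=350)

first = debouncer.ready(now_ms=500)    # both files saved less than 400 ms ago
second = debouncer.ready(now_ms=700)   # login.ts has now been quiet for 400 ms
third = debouncer.ready(now_ms=750)    # oauth.ts has now been quiet for 400 ms
print(first, second, third)
```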

4. End of session

openprune session-end
openprune stats

session-end updates the file baselines used for diffs; stats shows your token usage history.

Using OpenPrune with Cursor (MCP)

1. Run openprune init in your app repo
2. Add .cursor/mcp.json and .cursor/rules/openprune.mdc (from examples/cursor/)
3. Ask Agent: "Add Google OAuth next to email login"

The rule tells the agent to call get_codebase_context first and to avoid @src/ unless something is missing from the packet.

Manual paste (without MCP)

Scenario: You have an existing app and ask Cursor to add a new feature (e.g. Google OAuth next to email login).

Step A — Prepare context

cd your-app
openprune init   # skip if already done
openprune query "Add Google OAuth alongside existing login; callback route, token exchange, reuse session cookies"

You get a compact block: auth-related skeleton, login/middleware/route chunks, recent edits to oauth.ts, and your question.

Step B — Paste into Cursor (sample prompt)

Copy the full openprune query output, then start a new Cursor chat (or Agent) with:

Use the project context below. Do not re-scan the whole repo unless something is missing.
Prefer editing the files listed in the skeleton; follow existing patterns in the code chunks.

--- OPENPRUNE CONTEXT START ---
(paste entire openprune query output here)
--- OPENPRUNE CONTEXT END ---

Task:
Implement Google OAuth alongside the existing email/password login.
- Reuse session and token patterns from login.ts
- Add routes for /auth/google and /auth/google/callback
- Wire requireAuth consistently with the middleware shown above
- List files you will change before editing

Cursor implements against ~1–3k tokens of targeted context instead of pulling in large parts of the tree on its own.
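If you paste often, wrapping the packet in the start and end markers can be scripted. This is a small sketch of that wrapper; `build_prompt` is a hypothetical helper, and the packet string stands in for real openprune query output.

```python
START = "--- OPENPRUNE CONTEXT START ---"
END = "--- OPENPRUNE CONTEXT END ---"


def build_prompt(packet: str, task: str) -> str:
    """Wrap a context packet in the markers used above and append the task."""
    parts = [
        "Use the project context below. Do not re-scan the whole repo unless something is missing.",
        "",
        START,
        packet.strip(),
        END,
        "",
        "Task:",
        task.strip(),
    ]
    return "\n".join(parts)


prompt = build_prompt(
    packet="(paste entire openprune query output here)",
    task="Implement Google OAuth alongside the existing email/password login.",
)
print(prompt)
```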

Step C — Follow-up questions (save again)

After you change several files, run a new query instead of re-pasting the whole chat:

openprune query "OAuth callback: exchange code for tokens and store refresh token like login()"

Follow-up prompt template:

Updated context from OpenPrune (use this instead of re-reading all prior files):

--- OPENPRUNE CONTEXT START ---
(paste new query output)
--- OPENPRUNE CONTEXT END ---

Continue the OAuth work. Only change what this task requires.

Step D — Shorter prompts when context is already in the thread

Once Cursor has the OpenPrune block in the thread, you can ask normally:

Add error handling for invalid OAuth state and log failures without leaking tokens.

Re-run openprune query when you switch to a new area of the codebase (e.g. from auth to billing).

Sample prompts (copy-paste)

Explore unfamiliar code

--- OPENPRUNE CONTEXT START ---
(openprune query output)
--- OPENPRUNE CONTEXT END ---

Explain how authentication works in this codebase in 5 bullets.
Point to the exact files I should read next.

Implement a feature

--- OPENPRUNE CONTEXT START ---
(openprune query output)
--- OPENPRUNE CONTEXT END ---

Implement: <describe feature in 2–3 sentences>
Constraints: match existing style; minimal diff; add no new dependencies unless necessary.

Debug a bug

openprune query "why does validateToken fail on expired JWT in production config"

--- OPENPRUNE CONTEXT START ---
(openprune query output)
--- OPENPRUNE CONTEXT END ---

Bug: <symptom>. Propose the smallest fix and which tests to add.

Refactor safely

--- OPENPRUNE CONTEXT START ---
(openprune query output)
--- OPENPRUNE CONTEXT END ---

Refactor <module> for clarity without behavior changes.
List risks and files touched before making edits.

Tips for maximum savings

- Run openprune query with a specific sentence, not a vague one-word query.
- Paste the packet once per task, then ask focused follow-ups.
- Avoid @-mentioning entire src/ if OpenPrune already listed the right files.
- Use openprune watch or re-init after large refactors so chunks stay current.
- Check openprune stats to see layer breakdown (skeleton vs RAG vs diff) and top retrieved files.

Large codebase features

SQLite symbol index (index.db) auto-enabled above 2,000 files
Batched indexing/embedding with configurable concurrency
Incremental watch — only changed files re-indexed/re-embedded
Batched vector search — min-heap top-k over SQLite cursors (no full-RAM load)
SQLite state store (state.db) for hashes + usage stats at scale
Full rebuild only when >15% of files change (configurable)
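The batched top-k search can be pictured as follows: rows are read from a SQLite cursor in fixed-size batches, and a min-heap of size k keeps only the best scores seen so far, so memory use does not grow with the index. This is a sketch of the general technique, not OpenPrune's code; the `chunks` table, its JSON-encoded vectors, and cosine scoring are all assumptions made for the example.

```python
import heapq
import json
import math
import sqlite3


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(conn: sqlite3.Connection, query_vec: list[float], k: int = 5, batch_size: int = 2000):
    """Stream rows in batches and keep the k best (score, chunk_id) pairs."""
    heap: list[tuple[float, str]] = []  # min-heap: heap[0] is the weakest kept result
    cursor = conn.execute("SELECT chunk_id, vector FROM chunks")
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        for chunk_id, vector_json in rows:
            score = cosine(query_vec, json.loads(vector_json))
            if len(heap) < k:
                heapq.heappush(heap, (score, chunk_id))
            elif score > heap[0][0]:
                heapq.heapreplace(heap, (score, chunk_id))
    return sorted(heap, reverse=True)  # best first


# Tiny in-memory index with a hypothetical schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE chunks (chunk_id TEXT PRIMARY KEY, vector TEXT)")
sample = [
    ("login.ts#1", [1.0, 0.0, 0.0]),
    ("oauth.ts#1", [0.9, 0.1, 0.0]),
    ("billing.ts#1", [0.0, 1.0, 0.0]),
    ("readme#1", [0.0, 0.0, 1.0]),
]
conn.executemany(
    "INSERT INTO chunks VALUES (?, ?)",
    [(chunk_id, json.dumps(vec)) for chunk_id, vec in sample],
)

best = top_k(conn, [1.0, 0.0, 0.0], k=2, batch_size=2)
print([chunk_id for _, chunk_id in best])
```

At most `batch_size` rows and k heap entries are held in memory at any time, which is the point of the "no full-RAM load" note above.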

Configuration (`.openprune/config.json`)

{
  "maxTokens": 8000,
  "topK": 5,
  "embeddingModel": "tfidf",
  "indexBackend": "auto",
  "sqliteIndexThreshold": 2000,
  "indexConcurrency": 32,
  "searchBatchSize": 2000,
  "watchDebounceMs": 400,
  "fullRebuildChangeRatio": 0.15,
  "maxDiffFiles": 15,
  "maxDiffTokens": 250
}
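To make two of these thresholds concrete, the sketch below shows how sqliteIndexThreshold and fullRebuildChangeRatio could drive decisions. It is an illustration under assumptions: SHA-256 as the content hash, the baseline file count as the denominator, and all function names are hypothetical rather than taken from OpenPrune.

```python
import hashlib


def content_hash(data: bytes) -> str:
    # SHA-256 is an assumption; any stable content hash would do for this sketch.
    return hashlib.sha256(data).hexdigest()


def auto_uses_sqlite(file_count: int, threshold: int = 2000) -> bool:
    """With indexBackend set to "auto", switch to SQLite above the threshold."""
    return file_count > threshold


def changed_files(baseline: dict[str, str], current: dict[str, str]) -> set[str]:
    """Paths that were added, removed, or whose hash differs from the baseline."""
    paths = baseline.keys() | current.keys()
    return {path for path in paths if baseline.get(path) != current.get(path)}


def needs_full_rebuild(baseline: dict[str, str], current: dict[str, str], ratio: float = 0.15) -> bool:
    """Full rebuild when the share of changed files exceeds the configured ratio."""
    total = max(len(baseline), 1)
    return len(changed_files(baseline, current)) / total > ratio


baseline = {f"src/file{i}.ts": content_hash(b"v1") for i in range(10)}

one_edit = dict(baseline, **{"src/file0.ts": content_hash(b"v2")})
two_edits = dict(one_edit, **{"src/file1.ts": content_hash(b"v2")})

print(needs_full_rebuild(baseline, one_edit))    # 1 of 10 changed: incremental
print(needs_full_rebuild(baseline, two_edits))   # 2 of 10 changed: full rebuild
print(auto_uses_sqlite(2000), auto_uses_sqlite(2001))
```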

Optional project ignore file: .openprune-ignore (same rules as .gitignore).

Output layers

[PROJECT SKELETON] — compressed symbol manifest
[RELEVANT CODE CHUNKS] — TF-IDF / Ollama RAG top-k
[RECENT CHANGES] — unified diffs vs last baseline
[CONVERSATION SUMMARY] — compressed session history
[USER QUERY]
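One way to picture how the layers fit under maxTokens is a simple budgeted assembly: the query is always kept, and each other layer is added in order only if it still fits. This sketch is an assumption about the general approach, not OpenPrune's implementation; the four-characters-per-token estimate is a crude heuristic, and dropping (rather than truncating) an oversized layer is a simplification.

```python
OPTIONAL_LAYERS = [
    "PROJECT SKELETON",
    "RELEVANT CODE CHUNKS",
    "RECENT CHANGES",
    "CONVERSATION SUMMARY",
]


def estimate_tokens(text: str) -> int:
    # Rough heuristic (about 4 characters per token); not a real tokenizer.
    return max(1, len(text) // 4) if text else 0


def assemble_packet(layers: dict[str, str], max_tokens: int = 8000):
    """Return (packet text, per-layer token breakdown) within the budget."""
    query = layers.get("USER QUERY", "").strip()
    used = estimate_tokens(query)      # the query is always included
    sections, breakdown = [], {}
    for name in OPTIONAL_LAYERS:
        body = layers.get(name, "").strip()
        cost = estimate_tokens(body)
        if not body or used + cost > max_tokens:
            continue                   # skip empty layers and layers that do not fit
        sections.append(f"[{name}]\n{body}")
        breakdown[name] = cost
        used += cost
    sections.append(f"[USER QUERY]\n{query}")
    breakdown["USER QUERY"] = estimate_tokens(query)
    return "\n\n".join(sections), breakdown


packet, breakdown = assemble_packet(
    {
        "PROJECT SKELETON": "a" * 40,
        "RELEVANT CODE CHUNKS": "b" * 400,   # too large for the tiny demo budget
        "RECENT CHANGES": "c" * 80,
        "USER QUERY": "how does auth work",
    },
    max_tokens=50,
)
print(breakdown)
```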

Privacy

All indexing and embeddings stay on your machine under .openprune/. No telemetry by default.

v0.8 MCP (Cursor)

openprune mcp — stdio MCP server; no manual paste required.
Tools: get_codebase_context, openprune_index_status, evaluate_task_scope.
Agent policy resource at openprune://agent-policy + Cursor rule template in examples/cursor/.

npm run verify:mcp      # MCP client end-to-end (after init)
npm run verify:sprint   # retrieval regression (after init)

MCP not showing in Cursor?

Workspace root — Cursor only reads .cursor/mcp.json from the opened folder. If you opened Code-Token-Saver, config must be there (not only under openprune/.cursor/).
"type": "stdio" — required in each server entry (Cursor MCP docs).
Enable the server — new MCP servers appear disabled in Settings → MCP; turn openprune on once.
Build — local dev uses node ${workspaceFolder}/dist/index.js; run npm run build first.
Index path — run openprune init in the folder passed to --root. If your workspace is Code-Token-Saver but the app is in openprune/, use "--root", "${workspaceFolder}/openprune" in mcp.json (see repo root .cursor/mcp.json).
Reload — restart Cursor or run Developer: Reload Window.
Logs — Output panel → MCP Logs (startup line shows MCP project root: …).
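Two of the checks above (the server entry exists and declares "type": "stdio") are easy to automate. The sketch below validates an already-parsed mcp.json; `check_mcp_config` is a hypothetical helper, and it only covers the fields discussed in this README.

```python
import json


def check_mcp_config(config: dict, server_name: str = "openprune") -> list[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    server = config.get("mcpServers", {}).get(server_name)
    if server is None:
        return [f'no "{server_name}" entry under "mcpServers"']
    problems = []
    if server.get("type") != "stdio":
        problems.append('"type" must be "stdio" for a local server')
    if not server.get("command"):
        problems.append('"command" is missing')
    return problems


# The config from the quick start section.
readme_config = json.loads("""
{
  "mcpServers": {
    "openprune": {
      "type": "stdio",
      "command": "npx",
      "args": ["-y", "openprune", "mcp"],
      "env": { "OPENPRUNE_ROOT": "${workspaceFolder}" }
    }
  }
}
""")
broken_config = {"mcpServers": {"openprune": {"command": "npx"}}}

print(check_mcp_config(readme_config))
print(check_mcp_config(broken_config))
```

To check a real file, load it with `json.load(open(".cursor/mcp.json"))` from the folder you opened in Cursor and pass the result to the function.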

v0.7 retrieval

Query-focused skeleton — lists files/symbols relevant to your question, not the whole repo tree.
Hybrid RAG — vector search + lexical boost (paths, symbols, summaries) + related files in the same module.
Chunk deduplication — overlapping regions from the same file are merged to save tokens.
Stale index warning — openprune query warns when files changed since last init.
Layer token breakdown — header shows skeleton / RAG / diff / session token use.
Stats — openprune stats shows top retrieved files from real query history.
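Chunk deduplication is essentially interval merging per file. The sketch below shows that technique on (path, start_line, end_line) tuples; the tuple shape is an assumption, and treating directly adjacent ranges as mergeable is a choice made for this example rather than documented behavior.

```python
def merge_chunks(chunks: list[tuple[str, int, int]]) -> list[tuple[str, int, int]]:
    """Merge overlapping or directly adjacent line ranges that share a file."""
    merged: list[tuple[str, int, int]] = []
    for path, start, end in sorted(chunks):
        if merged and merged[-1][0] == path and start <= merged[-1][2] + 1:
            _, prev_start, prev_end = merged[-1]
            merged[-1] = (path, prev_start, max(prev_end, end))  # extend previous range
        else:
            merged.append((path, start, end))
    return merged


retrieved = [
    ("src/login.ts", 10, 40),
    ("src/login.ts", 35, 60),    # overlaps the first chunk
    ("src/oauth.ts", 1, 20),
    ("src/login.ts", 61, 70),    # starts right after line 60
    ("src/login.ts", 90, 100),   # separate region, kept as its own chunk
]
deduped = merge_chunks(retrieved)
print(deduped)
```

Here five retrieved chunks collapse to three, so lines 35 to 40 of login.ts are sent once instead of twice.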

Migrating from older dev previews

Run openprune init once. When a .codesieve/ or .contextmesh/ directory or a related ignore file is present, init automatically renames it to its .openprune equivalent.