@ruso-0/tokenguard

v4.0.2

Published

2 months ago

MCP plugin for Claude Code — 3 router tools with invisible middleware for token-optimized code navigation, compression, and safety. Lite mode (instant) or Pro mode (semantic search).

ruso-0

mcp claude-code code-safety ast-validation circuit-breaker tree-sitter defensive-coding surgical-edit ai-safety semantic-search

TokenGuard v4.0.2 - 3 Tools. 480 Tests. Zero Cloud. Instant Startup.

What Changed in v4.0

Batch Edit, Architecture Map, Blast Radius, and Sniper Refactor.

| Feature | What It Does |
|---|---|
| `batch_edit` | Atomically edit multiple symbols across multiple files. All-or-nothing: if any file fails validation, nothing is written. |
| Architecture Map | `tg_navigate` map now shows dependency tiers (core/logic/leaf) based on import centrality. |
| Blast Radius | When you change a function's signature, TokenGuard warns which files import it and suggests `batch_edit`. |
| `prepare_refactor` | AST-based confidence classification for safe renaming. Classifies each occurrence as "high" or "review" (strings, comments, keys). |
| AST Symbol Names | Parser uses tree-sitter `@_name` captures instead of ~10 fragile regexes. |

v4.0.0 Bugfixes: multi-line console.log stripping, Python # in strings, proper glob matching via picomatch, stale docstring.

v4.0.1 Fix: Corrected inflated tokensAvoided metric that double-counted file reads.

v4.0.2 Fixes (5 logic + 7 doc): Blind Sniper (exhaustive SQL scan for prepare_refactor), batch edit race condition (two-phase file locking), indexOf wrong function (local window search), extractSignature string confusion (string-state tracking), silent plan amnesia (visible warning for oversized plans).
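The all-or-nothing behavior of `batch_edit` can be pictured as two passes: validate every proposed file, then write only if all of them passed. The sketch below is illustrative and is not TokenGuard's code. `applyBatch`, `FileEdit`, and `FileStore` are hypothetical names, files are modeled as an in-memory record, and the bracket-balance check stands in for the tree-sitter validation the tool actually performs.

```typescript
// Hypothetical sketch of all-or-nothing batch editing (not TokenGuard's code).
type FileEdit = { path: string; newContent: string };
type FileStore = Record<string, string>;

// Stand-in validator: checks that (), [] and {} are balanced.
// TokenGuard validates syntax with tree-sitter instead.
function isValid(source: string): boolean {
  const closers: Record<string, string> = { ")": "(", "]": "[", "}": "{" };
  const stack: string[] = [];
  for (let i = 0; i < source.length; i++) {
    const ch = source.charAt(i);
    if (ch === "(" || ch === "[" || ch === "{") {
      stack.push(ch);
    } else if (closers[ch] !== undefined && stack.pop() !== closers[ch]) {
      return false;
    }
  }
  return stack.length === 0;
}

function applyBatch(
  store: FileStore,
  edits: FileEdit[]
): { ok: boolean; failed: string[] } {
  // Pass 1: validate everything before touching the store.
  const failed = edits.filter((e) => !isValid(e.newContent)).map((e) => e.path);
  if (failed.length > 0) return { ok: false, failed };
  // Pass 2: every edit passed, so commit them all.
  for (const e of edits) store[e.path] = e.newContent;
  return { ok: true, failed: [] };
}
```

Because nothing is written until the first pass succeeds, a single invalid file leaves every other file in the batch untouched.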

What Changed in v3.1

Creative Circuit Breaker — 3-level escalation that teaches Claude new strategies instead of just blocking:

| Level | Strategy | What Claude Does |
|---|---|---|
| Level 1: Rewrite | Stop patching, start fresh | Reads uncompressed code, writes `symbol_v2` via `insert_after`, tests, then swaps |
| Level 2: Decompose | Break into smaller pieces | Extracts 2-3 pure helpers, tests each, rewrites original as thin orchestrator |
| Level 3: Hard Stop | Ask the human | Explains what failed, the error pattern, why strategies 1-2 didn't work |

Each redirect includes compress:false so Claude sees actual code, not compressed placeholders.
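As a rough mental model, the escalation behaves like a small per-file state machine: repeated failures trip the breaker, the level goes up by one, and the failure history is cleared so the new strategy gets a clean run. The sketch below is a simplification under stated assumptions: `BreakerSketch` and `strategyFor` are hypothetical names, only a per-file failure count is tracked, and the threshold of 3 mirrors the "3 clean attempts" wording rather than any confirmed internal constant.

```typescript
// Hypothetical sketch of 3-level escalation (not TokenGuard's implementation).
type Strategy = "none" | "rewrite" | "decompose" | "hard_stop";

function strategyFor(level: number): Strategy {
  if (level <= 0) return "none";
  if (level === 1) return "rewrite";
  if (level === 2) return "decompose";
  return "hard_stop";
}

class BreakerSketch {
  private failures: Record<string, number> = {};
  private levels: Record<string, number> = {};
  private threshold: number;

  constructor(threshold: number = 3) {
    this.threshold = threshold;
  }

  // Record one failed attempt on a file and return the strategy now in force.
  recordFailure(file: string): Strategy {
    const level = this.levels[file] || 0;
    const count = (this.failures[file] || 0) + 1;
    if (count < this.threshold) {
      this.failures[file] = count;
      return strategyFor(level);
    }
    // Tripped: escalate one level (capped at 3) and purge this file's history.
    const next = Math.min(level + 1, 3);
    this.levels[file] = next;
    this.failures[file] = 0;
    return strategyFor(next);
  }
}
```

With the default threshold, the third consecutive failure on a file returns `"rewrite"`, the sixth returns `"decompose"`, and the ninth returns `"hard_stop"`.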

Other v3.1 additions:

Amnesia total — softReset() purges ALL history for the tripped file, giving Claude 3 clean attempts with the new strategy
Topological edits — insert_before / insert_after modes for tg_code edit (add code without replacing)
Smart auto-indent — Relative rebase: strips Claude's indent, applies the target symbol's indent (works with tabs, spaces, Python, Go)
Behavioral advisor — Suggests compression when Claude reads large files raw
Danger zones — tg_guard status shows the 5 heaviest unread files so Claude avoids raw-reading them
CLI hygiene — --help / --version flags
Cross-platform splice — Verified byte indices with indexOf fallback for Linux/macOS/Windows consistency
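The "relative rebase" used by smart auto-indent can be sketched as: find the smallest indent in the incoming snippet, strip that much from every line, then prefix each line with the target symbol's indent. This is a minimal illustration, assuming the snippet uses one consistent indent character; `rebaseIndent` is a hypothetical helper name, not a TokenGuard export.

```typescript
// Hypothetical sketch of relative indent rebasing (not TokenGuard's code).
function rebaseIndent(snippet: string, targetIndent: string): string {
  const lines = snippet.split("\n");

  // Smallest leading-whitespace width among non-blank lines.
  let strip = Infinity;
  for (const line of lines) {
    if (line.trim() === "") continue;
    const lead = line.length - line.replace(/^[ \t]+/, "").length;
    strip = Math.min(strip, lead);
  }
  if (strip === Infinity) strip = 0; // snippet had no non-blank lines

  // Drop the shared indent, keep relative nesting, apply the target indent.
  return lines
    .map((line) => (line.trim() === "" ? "" : targetIndent + line.slice(strip)))
    .join("\n");
}
```

Nested lines keep their extra indentation relative to the first level, so a block written at 8 spaces lands correctly inside a tab-indented or 2-space-indented target.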

What Changed in v3.0

TokenGuard v2 had 16 tools. That meant ~3,520 tokens of fixed overhead just for tool definitions, plus wasted output tokens as the LLM reasoned about which of 16 tools to call. For small/medium projects, TokenGuard was net-negative.

v3.0 fixes this by collapsing 16 tools into 3 routers and moving validation/safety into invisible middleware:

| v2 (16 tools) | v3 (3 tools) | What Changed |
|---|---|---|
| `tg_search`, `tg_def`, `tg_refs`, `tg_outline`, `tg_map` | `tg_navigate` | One router, action parameter selects behavior |
| `tg_read`, `tg_compress`, `tg_semantic_edit`, `tg_undo`, `tg_terminal` | `tg_code` | Edits auto-validated via AST before disk write |
| `tg_pin`, `tg_status`, `tg_session_report` | `tg_guard` | Safety + monitoring unified |
| `tg_validate` | invisible middleware | Runs automatically inside `tg_code` edit |
| `tg_circuit_breaker` | invisible middleware | Monitors all calls, 3-level creative escalation |
| `tg_audit` | CLI only | Removed from MCP, available via `npx @ruso-0/tokenguard --audit` |

Result: ~660 tokens of tool definitions instead of ~3,520. 81% reduction in fixed overhead.

Lite Mode vs Pro Mode

| | Lite (Default) | Pro (Opt-in) |
|---|---|---|
| Startup | Instant (~100ms) | ~5-10s (ONNX model load) |
| Search | BM25 keyword search | Hybrid semantic + BM25 with RRF |
| Dependencies | Tree-sitter only | Tree-sitter + ONNX Runtime |
| Enable | Default | `--enable-embeddings` flag |

Lite mode is sufficient for most projects. Pro mode adds semantic search, which helps most on large codebases where keyword matches alone miss relevant code.
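Pro mode's "hybrid semantic + BM25 with RRF" refers to Reciprocal Rank Fusion, which merges ranked lists by summing `1 / (k + rank)` for each result across the lists. The sketch below shows the general technique only. `fuseRankings` is a hypothetical name, and `k = 60` is a commonly used default, not a value confirmed for TokenGuard.

```typescript
// Generic Reciprocal Rank Fusion sketch (not TokenGuard's code).
// Each ranking is a list of result ids, best first.
function fuseRankings(rankings: string[][], k: number = 60): string[] {
  const scores: Record<string, number> = {};
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      // Ranks are 1-based, so the top result contributes 1 / (k + 1).
      scores[id] = (scores[id] || 0) + 1 / (k + index + 1);
    });
  }
  // Highest fused score first.
  return Object.keys(scores).sort(
    (a, b) => (scores[b] || 0) - (scores[a] || 0)
  );
}
```

A result that appears near the top of both the keyword list and the semantic list outranks one that tops only a single list, which is the point of fusing them.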

The Problem

You're 90 minutes into a Claude Pro session. You've been exploring a codebase, reading files, running grep searches. Suddenly: context limit reached. Your session is over.

Why? Because every grep reads entire files. Every Read dumps thousands of tokens. Every broken code write causes a fix-retry loop that burns your remaining context.

Best for: Medium-to-large codebases (50+ files) where reading every file would exhaust Claude's context window. For small projects (<20 files), native Claude Code tools may be sufficient.

The Solution

TokenGuard sits between you and token waste with 3 smart tools:

| What You Do Now | What TokenGuard Does | Savings |
|---|---|---|
| `grep "auth" ./src` reads 50 files | `tg_navigate action:"search" query:"authentication"` returns 5 relevant chunks | ~97% (estimated) |
| `Read src/engine.ts` dumps 5,502 tokens | `tg_code action:"compress" path:"src/engine.ts"` sends 1,753 tokens | ~68% (estimated) |
| Read file + skim for function | `tg_navigate action:"definition" symbol:"AuthService"` jumps straight there | 300x faster |
| Copy-paste 500 lines of npm errors | `tg_code action:"filter_output"` extracts the 3 actual errors | ~89% (estimated) |
| Rewrite entire file to change one function | `tg_code action:"edit"` patches only the AST node | ~98% output saved (estimated) |
| Write broken code → see error → retry loop | Automatic AST validation blocks bad writes before disk | Prevents loop |
| Claude gets stuck in write-test-fail loops | Creative circuit breaker teaches new strategies | Saves session |
| Claude forgets "always use fetch, not axios" | `tg_guard action:"pin"` keeps rules in every response | Never forgotten |

Note on compression: Medium compression keeps function signatures + key body lines (return, throw, await, assignments). If you're debugging a specific function's internals, use compress:false or level:"light" to see the full code. The Creative Circuit Breaker automatically instructs compress:false when it redirects you to rewrite a function.
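To make "signatures + key body lines" concrete, here is a deliberately naive line filter that keeps function/class signatures, closing braces, and lines containing `return`, `throw`, `await`, or an assignment. It is a sketch only: TokenGuard's structural compression works on the AST, while the regexes and the `compressMedium` name below are assumptions made for illustration.

```typescript
// Naive, regex-based illustration of medium compression (not TokenGuard's code).
function compressMedium(source: string): string {
  // Lines that open a function or class.
  const signature =
    /^\s*(export\s+)?(async\s+)?function\b|^\s*(export\s+)?class\b/;
  // Lines with return/throw/await, a declaration with a value, or a plain assignment.
  const keyLine =
    /\b(return|throw|await)\b|^\s*(const|let|var)\s+\w+\s*=|^\s*[\w.]+\s*=[^=]/;

  return source
    .split("\n")
    .filter(
      (line) =>
        signature.test(line) || keyLine.test(line) || line.trim() === "}"
    )
    .join("\n");
}
```

Comments and logging calls fall away while the lines that determine what a function returns or throws survive, which is why internals-level debugging calls for `compress:false` instead.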

The 3 Tools

`tg_navigate` — Search & Navigate

| Action | Description |
|---|---|
| `search` | Hybrid semantic + BM25 search (Pro) or keyword search (Lite). Returns compressed AST chunks. |
| `definition` | Go-to-definition by symbol name. 100% precise AST lookup. |
| `references` | Find all references to a symbol across the project. |
| `outline` | List all symbols in a file with signatures and line ranges. |
| `map` | Static repo map with pinned rules and architecture tiers (core/logic/leaf). Prompt-cache-friendly. |
| `prepare_refactor` | AST-based confidence classification for safe renaming. Classifies each occurrence as "high" or "review". |

`tg_code` — Read, Compress & Edit

| Action | Description |
|---|---|
| `read` | Smart file reader with behavioral advisor (suggests compression for large files). |
| `compress` | Full-control compression. 3 levels (light/medium/aggressive) or 6 tiers. |
| `edit` | Surgically edit a function/class by name. Supports replace, insert_before, insert_after. Auto-validated via AST. |
| `batch_edit` | Atomically edit multiple symbols across multiple files. All-or-nothing with reverse splice ordering. |
| `undo` | Revert the last edit. One-shot backup restore. |
| `filter_output` | Filter noisy terminal output. Strips ANSI, deduplicates, extracts errors. |
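The three steps named for `filter_output` (strip ANSI, deduplicate, extract errors) can be sketched in a few lines. This is an illustration under assumptions, not the tool's implementation: `filterOutput` is a hypothetical name, and treating any line containing "error" or "failed" as an error line is a simplification.

```typescript
// Hypothetical sketch of terminal-output filtering (not TokenGuard's code).
function filterOutput(raw: string): string[] {
  // Matches ANSI CSI sequences such as color codes (ESC [ ... letter).
  const ansi = /\x1b\[[0-9;]*[A-Za-z]/g;
  const kept: string[] = [];
  for (const rawLine of raw.split("\n")) {
    const line = rawLine.replace(ansi, "").trim();
    // Assumed heuristic: keep only lines that look like errors.
    if (!/\b(error|failed)\b/i.test(line)) continue;
    // Deduplicate: repeated identical errors are reported once.
    if (kept.indexOf(line) !== -1) continue;
    kept.push(line);
  }
  return kept;
}
```

Colored and plain copies of the same message collapse into one entry because the ANSI codes are removed before the duplicate check.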

`tg_guard` — Safety & Memory

| Action | Description |
|---|---|
| `pin` | Pin a rule Claude should never forget. Injected into every map response. |
| `unpin` | Remove a pinned rule. |
| `status` | Token burn rate, exhaustion prediction, danger zones (heaviest unread files), and alert levels. |
| `report` | Full session savings receipt with per-file-type breakdown and USD estimates. |
| `reset` | Clear circuit breaker state to let Claude retry with a fresh approach. |
| `set_plan` | Anchor a master plan file for heartbeat re-injection (every ~15 tool calls). Bankruptcy Shield rejects plans >4000 tokens. |
| `memorize` | Write progress notes to persistent scratchpad. Re-injected during heartbeat to survive context compaction. |
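The burn rate and exhaustion prediction reported by `status` come down to simple arithmetic: tokens used per minute, and remaining budget divided by that rate. The sketch below shows that arithmetic only. `predictExhaustion` and its parameters are hypothetical, and the context limit passed in is whatever you assume your session allows.

```typescript
// Hypothetical sketch of burn-rate arithmetic (not TokenGuard's code).
function predictExhaustion(
  tokensUsed: number,
  elapsedMinutes: number,
  contextLimit: number
): { burnRate: number; minutesLeft: number } {
  // Average tokens consumed per minute so far.
  const burnRate = elapsedMinutes > 0 ? tokensUsed / elapsedMinutes : 0;
  const remaining = Math.max(contextLimit - tokensUsed, 0);
  // With no measurable burn yet, exhaustion cannot be predicted.
  const minutesLeft = burnRate > 0 ? remaining / burnRate : Infinity;
  return { burnRate, minutesLeft };
}
```

For example, 90,000 tokens used over 45 minutes against an assumed 200,000-token limit gives a burn rate of 2,000 tokens per minute and 55 minutes left.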

Supported Languages

TokenGuard's features have different levels of support depending on the language:

| Feature | TS/JS | Python | Go | Other (Rust, Java, C++, etc.) |
|---|---|---|---|---|
| BM25 keyword search | ✅ | ✅ | ✅ | ✅ |
| Compression (Stage 1-2: comments, whitespace, token filtering) | ✅ | ✅ | ✅ | ✅ |
| Compression (Stage 3: AST body stripping) | ✅ | ✅ | ✅ | ❌ |
| AST validation before write | ✅ | ✅ | ✅ | ❌ |
| Semantic edit (replace/insert) | ✅ | ✅ | ✅ | ❌ |
| Go-to-definition / references | ✅ | ✅ | ✅ | ❌ |
| Semantic search (Pro mode) | ✅ | ✅ | ✅ | ✅ |

For unsupported languages, TokenGuard still works as a keyword search engine and text-level compressor. AST features (validation, structural compression, surgical edits) require a Tree-sitter grammar — contributions for additional languages are welcome.

Invisible Middleware

These run automatically — you never call them directly:

AST Validation: Every tg_code action:"edit" validates syntax via tree-sitter before writing to disk. Invalid code is blocked with exact line/column error details and fix suggestions.
Creative Circuit Breaker: Monitors all tool calls for destructive patterns (same error 3x, same file 5x). Instead of just blocking, it escalates through 3 creative strategies: Rewrite → Decompose → Hard Stop. Each level includes compress:false file reads and concrete step-by-step instructions. Auto-resets with amnesia total on strategy change.
File Lock: File-level mutex prevents concurrent edit corruption. When tg_code action:"edit" or batch_edit targets a file, it acquires an exclusive lock. Stale locks auto-expire after 30 seconds.
Behavioral Advisor: When Claude reads a large file raw (without compression), advises using tg_code action:"compress" next time. Teaches efficient patterns without blocking.
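The File Lock behavior (exclusive per-file lock, stale after 30 seconds) can be modeled as a table of acquisition timestamps. The sketch below is an in-memory illustration with an injected clock so it can be reasoned about deterministically. `FileLockSketch` and its methods are hypothetical names, and the real middleware may differ in how it stores and expires locks.

```typescript
// Hypothetical sketch of a file-level mutex with stale-lock expiry
// (not TokenGuard's code).
class FileLockSketch {
  // path -> timestamp (ms) at which the lock was acquired
  private locks: Record<string, number> = {};
  private ttlMs: number;

  constructor(ttlMs: number = 30000) {
    this.ttlMs = ttlMs;
  }

  // Returns true if the caller now holds the lock for this path.
  tryAcquire(path: string, now: number): boolean {
    const heldSince = this.locks[path];
    // A live lock blocks the caller; a stale one is silently taken over.
    if (heldSince !== undefined && now - heldSince < this.ttlMs) return false;
    this.locks[path] = now;
    return true;
  }

  release(path: string): void {
    delete this.locks[path];
  }
}
```

Expiry matters because an edit that crashes mid-write would otherwise leave its file locked for the rest of the session.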

Auto-Context Inlining (X-Ray Vision)

When Claude asks for a function definition, it often needs to understand the dependencies that function calls. Without Auto-Context, Claude makes N additional tool calls to look up each dependency — burning tokens and time.

TokenGuard solves this by automatically resolving imported dependencies and injecting their signatures in the same response:

| Without Auto-Context | With Auto-Context |
|---|---|
| 1. `tg_navigate definition "validateToken"` | 1. `tg_navigate definition "validateToken"` |
| 2. `tg_navigate definition "HashUtils"` | (signatures auto-injected) |
| 3. `tg_navigate definition "TokenStore"` | |
| 3 tool calls, ~1,800 tokens | 1 tool call, ~700 tokens |

Security: Signatures containing passwords, API keys, or auth tokens are automatically excluded. JSDoc comments are stripped to prevent prompt injection.

Disable: Pass auto_context: false if you want pure output without injected signatures.
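The two security rules above (drop signatures that mention credentials, strip JSDoc) amount to a map-then-filter pass over candidate signatures. The following is a sketch under assumptions: the `SENSITIVE` pattern and the `inlineableSignatures` name are illustrative, and the exact terms TokenGuard screens for are not documented here.

```typescript
// Hypothetical sketch of the auto-context security filter (not TokenGuard's code).
// Assumed list of credential-like terms; the real list may differ.
const SENSITIVE = /(password|passwd|secret|api[_-]?key|auth[_-]?token)/i;

function inlineableSignatures(signatures: string[]): string[] {
  return signatures
    // Strip JSDoc blocks so comment text cannot carry injected instructions.
    .map((sig) => sig.replace(/\/\*\*[\s\S]*?\*\/\s*/g, "").trim())
    // Exclude anything that mentions a credential-like name.
    .filter((sig) => sig.length > 0 && !SENSITIVE.test(sig));
}
```

Stripping comments before filtering means a harmless signature is still inlined even when its original JSDoc contained text that should not reach the model.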

Context Heartbeat (Anti-Amnesia Protocol)

Claude Code compacts context after extended sessions, destroying plans, schemas, and architectural decisions. Context Heartbeat solves this by silently re-injecting your critical constraints every ~15 tool calls.

Setup:

# Anchor your plan at the start of a session
tg_guard action:"set_plan" text:"PLAN.md"

# Leave notes as you progress
tg_guard action:"memorize" text:"Finished auth module. Starting on payments. Using Stripe SDK."

How it works:

Counts tool calls deterministically (no async lag)
Only injects during safe operations (read, search, definition) — never during edits
Places memory ABOVE the tool response (Attention Sandwich pattern)
Detects server restarts and resets the counter automatically
Rejects plans >4,000 tokens to prevent accelerating compaction
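Put together, the rules above describe a counter with a gate: count every tool call, inject only when the interval has elapsed and the current action is a safe one, and refuse plans that are too large. The sketch below models that logic only. `HeartbeatSketch` is a hypothetical name, the chars/4 token estimate is borrowed from the methodology note in this README, and deferring a due heartbeat until the next safe action is an assumption about behavior.

```typescript
// Hypothetical sketch of heartbeat scheduling (not TokenGuard's code).
const SAFE_ACTIONS = ["read", "search", "definition"];

class HeartbeatSketch {
  private calls = 0;
  private plan = "";
  private interval: number;

  constructor(interval: number = 15) {
    this.interval = interval;
  }

  // Rejects plans estimated above 4,000 tokens (chars/4 heuristic).
  setPlan(text: string): boolean {
    if (Math.ceil(text.length / 4) > 4000) return false;
    this.plan = text;
    return true;
  }

  // Returns the text to place above the tool response, or null.
  onToolCall(action: string): string | null {
    this.calls += 1;
    const due = this.calls >= this.interval;
    const safe = SAFE_ACTIONS.indexOf(action) !== -1;
    if (!due || !safe || this.plan === "") return null;
    this.calls = 0; // restart the count after injecting
    return this.plan;
  }
}
```

If the interval elapses during an edit, this sketch waits and injects on the next read, search, or definition call instead.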

What gets re-injected:

Your master plan (schemas, constraints, architecture)
Your scratchpad notes (progress, decisions)
Pinned rules
Recent successful edits (spatial awareness)
Circuit Breaker state (if active)

Installation

# One command — runs directly from npm:
npx @ruso-0/tokenguard

Or install globally:

npm install -g @ruso-0/tokenguard

CLI

tokenguard --help       # Show usage and options
tokenguard --version    # Show version (4.0.2)
tokenguard init         # Generate optimal CLAUDE.md instructions
tokenguard --audit      # Run security audit (CLI only)

Claude Code Configuration

Option A — CLI (recommended):

# Lite mode (instant startup, keyword search):
claude mcp add tokenguard -- npx @ruso-0/tokenguard

# Pro mode (semantic search, requires ONNX model download on first run):
claude mcp add tokenguard -- npx @ruso-0/tokenguard --enable-embeddings

Option B — Manual config in .claude.json or claude_desktop_config.json:

{
  "mcpServers": {
    "tokenguard": {
      "command": "npx",
      "args": ["-y", "@ruso-0/tokenguard"]
    }
  }
}

For Pro mode, add "--enable-embeddings" to the args array.

Cleanup

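TokenGuard creates .tokenguard.db and .tokenguard/backups/ in your project root. These are automatically excluded from git via standard .gitignore patterns. To remove them, along with the CLAUDE.md generated by tokenguard init (leave CLAUDE.md out of the command if you maintain that file by hand):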

rm -rf .tokenguard.db .tokenguard/ CLAUDE.md

Benchmark

We're running reproducible benchmarks on real-world refactors (Express.js, Axios). Results with full methodology and API billing logs will be published here.

Star the repo to get notified when benchmarks drop.

Quick Start

# TokenGuard runs as an MCP server — just use the tools:

# 1. Pin your project rules (they'll never be forgotten)
tg_guard action:"pin" text:"Always use fetch, not axios"
tg_guard action:"pin" text:"API base URL is /api/v2"

# 2. Get the repo map (cached by Anthropic prompt cache, includes pinned rules)
tg_navigate action:"map"

# 3. Search the index (keyword in Lite mode, semantic in Pro mode; replaces grep)
tg_navigate action:"search" query:"authentication middleware"

# 4. Jump to a definition (replaces Read + Ctrl+F)
tg_navigate action:"definition" symbol:"AuthService"

# 5. Surgically edit a function (auto-validated, no file rewrite needed)
tg_code action:"edit" path:"src/auth.ts" symbol:"validateToken" new_code:"..."

# 6. Add a new function after an existing one (topological edit)
tg_code action:"edit" path:"src/auth.ts" symbol:"validateToken" mode:"insert_after" new_code:"..."

# 7. Filter noisy terminal output
tg_code action:"filter_output" output:"<paste error output>"

# 8. Check danger zones + burn rate
tg_guard action:"status"

# 9. Full session report with receipt
tg_guard action:"report"

Architecture

+-------------------------------------------------------------+
|                  Claude Code (MCP Client)                    |
+----------------------------+--------------------------------+
                             | stdio (JSON-RPC)
+----------------------------v--------------------------------+
|          TokenGuard MCP Server (3 router tools)              |
|                                                              |
|  +--------------------------------------------------------+  |
|  |  Middleware Layer (invisible)                            |  |
|  |  +------------------+ +---------------------+          |  |
|  |  | AST Validator    | | Creative Circuit    |          |  |
|  |  | (pre-edit check) | | Breaker (3 levels)  |          |  |
|  |  +------------------+ +---------------------+          |  |
|  |  +------------------+ +---------------------+          |  |
|  |  | File Lock        | | Behavioral          |          |  |
|  |  | (edit mutex)     | | Advisor (reads)     |          |  |
|  |  +------------------+ +---------------------+          |  |
|  +--------------------------------------------------------+  |
|                                                              |
|  +------------------+------------------+------------------+  |
|  | tg_navigate      | tg_code          | tg_guard         |  |
|  | search           | read             | pin / unpin      |  |
|  | definition       | compress         | status           |  |
|  | references       | edit (validated) | report           |  |
|  | outline          | batch_edit       | set_plan         |  |
|  | map              | undo             | memorize         |  |
|  | prepare_refactor | filter_output    | reset            |  |
|  +--------+---------+--------+---------+--------+---------+  |
|           |                  |                  |            |
|  +--------v------------------v------------------v---------+  |
|  |                    Core Layer                           |  |
|  |  +----------+ +----------+ +----------+ +----------+  |  |
|  |  | Embedder | |  Parser  | | Database | | Sandbox  |  |  |
|  |  |(jina v2) | |(TreeSit.)| | (SQLite) | |(Validate)|  |  |
|  |  +----------+ +----------+ +----------+ +----------+  |  |
|  +---------------------------------------------------------+  |
+--------------------------------------------------------------+

Stress Tested

480 tests. 0 failures. 22 test suites. Cross-platform CI on Ubuntu, Windows, and macOS.

| Scenario | What We Tested | Result |
|---|---|---|
| Router dispatch | All 19 {tool, action} combinations | Pass |
| Middleware wrap | Creative circuit breaker 3-level escalation, amnesia total | Pass |
| AST validation | Valid/invalid code, error formatting | Pass |
| Backward compat | All 16 original tool behaviors preserved | Pass |
| Empty files | 0-byte input through every pipeline stage | Pass |
| 500KB TypeScript | ~3,500 generated functions | Pass |
| Binary data | Random bytes, null bytes, non-UTF-8 | Pass |
| Unicode / CJK / Emoji | Japanese identifiers, emoji in strings | Pass |
| Minified 50KB JS | Single-line, no whitespace, 2000 functions | Pass |
| 20-level nesting | Deeply nested function chains | Pass |
| 50-file concurrent batch | Batch insert + hybrid search | Pass |
| Surgical edits | Replace, insert_before, insert_after with auto-indent | Pass |
| Pin memory | Add/remove/persist/limits/deterministic output | Pass |
| E2E circuit breaker | 3 failures → Level 1 redirect → amnesia → recovery | Pass |
| Batch edit ACID | Multi-file atomic edits, rollback on syntax error | Pass |
| Architecture map | Import centrality, percentile tiers, FastLookup | Pass |
| Blast radius | Signature change detection, dependent file warnings | Pass |
| Prepare refactor | AST confidence classification (high/review) | Pass |
| Cross-platform splice | Verified byte indices on Linux, Windows, macOS | Pass |
| Auto-Context Inlining | Import extraction, security filters, Go namespace inference | Pass |
| Context Heartbeat | Anti-amnesia re-injection, restart detection, bankruptcy shield | Pass |
| v4.0.2 bugfix regression | Exhaustive symbol search, duplicate functions, signature strings | Pass |
| Plaintext fallback | BM25 search for unsupported languages (.rs, .java, .cpp) | Pass |

Real-World Validation

Tested against a 57-file production Next.js + Supabase app (SICAEP):

~94% token reduction (estimated, tier 1 compression)
10,532 tokens saved on a single search query
480/480 tests passed across 3 operating systems
Surgically fixed a real .single() → .maybeSingle() bug via tg_code action:"edit"
Creative circuit breaker correctly detected and redirected repeated error patterns
Path traversal attack (../../../../etc/passwd) → BLOCKED

Methodology note: Token savings are estimated using a chars/4 heuristic (±30% vs actual tokenizer). Comparisons are against raw file reads, not against other AI coding tools. These numbers represent the upper bound of savings.
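The chars/4 heuristic and the percentage figures quoted in this README can be reproduced with two small functions. `estimateTokens` and `savingsPercent` are hypothetical helper names written for this example. Only the chars/4 rule and the input numbers come from the text above.

```typescript
// Sketch of the chars/4 token estimate and the savings percentage.
function estimateTokens(text: string): number {
  // Roughly 4 characters per token; the README quotes ±30% vs a real tokenizer.
  return Math.ceil(text.length / 4);
}

function savingsPercent(rawTokens: number, sentTokens: number): number {
  if (rawTokens <= 0) return 0;
  return Math.round(((rawTokens - sentTokens) / rawTokens) * 100);
}
```

Plugging in the figures quoted earlier: `savingsPercent(5502, 1753)` gives 68, matching the ~68% compression example, and `savingsPercent(3520, 660)` gives 81, matching the 81% reduction in tool-definition overhead.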

Note: TokenGuard is most effective on projects with 50+ files. For very small projects (<20 files), the overhead may not justify the savings.

Security

Zero cloud: All processing is local. No API keys, no telemetry, and no network calls at runtime (Pro mode downloads its ONNX model once, on first run).
No data leaves your machine: Embeddings computed locally via ONNX Runtime.
Path traversal protection: All file paths validated with safePath().
Symlink resolution: All file paths resolved via realpathSync() to prevent symlink escapes.
Sensitive file blocklist: .env, .ssh, .git/credentials, .pem, .key files are blocked automatically.
Pin sanitization: Pinned rules are sanitized to block URLs, shell commands, and path traversal.
File-level mutex: Concurrent edits to the same file are blocked to prevent corruption.
SQLite storage: Your code index stays in .tokenguard.db in your project root.
WASM memory safety: All tree-sitter parsing wrapped in safeParse() with guaranteed cleanup.
MIT licensed: Fully open source, audit the code yourself.
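Path traversal protection and the sensitive file blocklist can be illustrated together: normalize the requested path segment by segment, refuse anything that would climb above the project root, then refuse blocked file names. This is a pure-string sketch under assumptions and is not the `safePath()` implementation. `resolveInside` and the `BLOCKED` patterns are hypothetical, and symlink resolution (which the real tool handles via `realpathSync()`) is out of scope here.

```typescript
// Hypothetical sketch of path validation (not TokenGuard's safePath()).
// Assumed patterns approximating the documented blocklist.
const BLOCKED = [
  /(^|\/)\.env(\.|$)/,
  /(^|\/)\.ssh(\/|$)/,
  /(^|\/)\.git\/credentials$/,
  /\.pem$/,
  /\.key$/,
];

function resolveInside(root: string, requested: string): string | null {
  // Absolute paths (POSIX or Windows drive) are rejected outright.
  if (/^([A-Za-z]:)?[\\/]/.test(requested)) return null;

  const parts: string[] = [];
  for (const segment of requested.replace(/\\/g, "/").split("/")) {
    if (segment === "" || segment === ".") continue;
    if (segment === "..") {
      if (parts.length === 0) return null; // would escape the root
      parts.pop();
    } else {
      parts.push(segment);
    }
  }

  const relative = parts.join("/");
  for (const pattern of BLOCKED) {
    if (pattern.test(relative)) return null;
  }
  return root.replace(/\/+$/, "") + "/" + relative;
}
```

Normalizing before checking the blocklist means a request like `src/../.env` is caught even though it does not start with a blocked name.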

Contributing

PRs welcome! Please read CONTRIBUTING.md before submitting.

# Development
git clone https://github.com/Ruso-0/TokenGuard.git
cd TokenGuard
npm install
npm run build
npm test

License

MIT