token-shield v1.0.7
🛡️ Token Shield
AI token optimizer — compress prompts, enhance Flash model, smart routing. Save 80% tokens. Make free models work like paid ones.
🚀 Install
```bash
npm install -g token-shield
```
⚡ Quick Start
```bash
# 1. Initialize in your project
cd your-project
token-shield init

# 2. Hook into your AI tools (Claude Code, Cursor, Windsurf, Cline, Antigravity...)
token-shield install

# 3. Compress a prompt
token-shield compress "Hey could you please help me fix this JavaScript bug?"

# 4. View status + quota dashboard
token-shield status
```
🎯 Two Problems Solved
| Problem | Solution |
|---------|----------|
| Claude token limit runs out in 1-2 days | Prompt compression → limit lasts 4-5× longer |
| Flash model (free, 5h reset) gives weak results | Flash Enhancer → quality from ~50% to ~85% |
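As a rough illustration of how prompt compression can work, here is a minimal sketch: strip filler phrases and collapse whitespace. The filler list and logic are hypothetical, not token-shield's actual compressor.

```javascript
// Minimal sketch of input compression: remove common filler phrases,
// then collapse whitespace. Illustrative only — the real compressor in
// token-shield is more sophisticated than this phrase list.
const FILLERS = [
  /\bhey,?\s*/gi,
  /\bcould you please\s*/gi,
  /\bplease\s*/gi,
  /\bhelp me\s*/gi,
  /\bi was wondering if\s*/gi,
];

function compressPrompt(prompt) {
  let out = prompt;
  for (const filler of FILLERS) out = out.replace(filler, "");
  return out.replace(/\s+/g, " ").trim();
}

// The quick-start prompt from above shrinks to its essential request:
console.log(compressPrompt("Hey could you please help me fix this JavaScript bug?"));
// → "fix this JavaScript bug?"
```

Even this naive pass removes a large share of conversational overhead; the savings compound when history pruning and code extraction are applied on top.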
🧩 7-Layer System
| Layer | What it does | Saving |
|-------|-------------|--------|
| 1. Input Compression | Remove filler, prune history, extract relevant code | 40-80% input tokens |
| 2. Project Intelligence | anatomy.md file index + mistake memory (cerebrum.md) | 80% file-reading tokens |
| 3. Smart Routing | Flash (free) → Claude → escalate based on quota | Max free usage |
| 4. CLI Proxy Wrapper | Intercept console commands (npm test, git log) & summarize | ~90% terminal output tokens |
| 5. Flash Enhancer | Add expert role + step-by-step + format for Flash model | +35% Flash quality |
| 6. Caveman Mode | AI responds in compressed style (Lite/Full/Ultra) | 40-75% output tokens |
| 7. Quality + Escalation | Score responses, auto-escalate bad Flash answers | Zero wasted tokens |
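Layers 3 and 7 work together: route to the free model while quota remains, score the answer, and escalate when quality is low. A toy sketch of that loop, where the model names, the 0.5 threshold, the scoring heuristic, and the injected `callModel` function are all hypothetical:

```javascript
// Hypothetical sketch of smart routing with quality escalation.
// `callModel(name, prompt)` is an injected async function — an assumption,
// not a real token-shield API.
function scoreResponse(text) {
  // Crude heuristic: longer answers and answers containing a code
  // fence score higher. Real scoring would be far richer.
  const codeFence = "`".repeat(3); // markdown code fence marker
  let score = Math.min(text.length / 500, 0.6);
  if (text.includes(codeFence)) score += 0.3;
  return score;
}

async function routePrompt(prompt, quota, callModel) {
  if (quota.flashRemaining > 0) {
    const answer = await callModel("flash", prompt);
    if (scoreResponse(answer) >= 0.5) return { model: "flash", answer };
    // Weak Flash answer: fall through and escalate to the paid model.
  }
  return { model: "claude", answer: await callModel("claude", prompt) };
}
```

The key design point is that escalation happens automatically after scoring, so a bad free-tier answer costs one Flash call rather than a round-trip with the user.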
💻 Commands
```bash
token-shield init                 # Create .token-shield/ intelligence files
token-shield install              # Inject hooks into AI tools
token-shield proxy "npm run test" # Wrap noisy commands to suppress terminal token spam
                                  # (try wrapping git log, ls, or pytest)
token-shield compress "prompt"    # Compress + Flash enhance + route
token-shield status               # Live quota + savings dashboard
token-shield dashboard            # Web dashboard at localhost:4242
token-shield uninstall            # Remove token-shield hooks and files
```
🪝 AI Tool Support
token-shield install auto-detects and injects compression rules into:
| Tool | Config file |
|------|------------|
| Claude Code | CLAUDE.md |
| Antigravity (Gemini) | GEMINI.md |
| Cursor | .cursorrules |
| Windsurf | .windsurf/rules.md |
| Cline / Roo | .clinerules |
| Any tool | AGENTS.md |
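The exact text injected into these config files is internal to token-shield and version-dependent; purely as a hypothetical illustration, the appended rules might look something like:

```markdown
<!-- token-shield rules (hypothetical example, not the actual injected text) -->
- Respond in compressed style: no filler, no restating the question.
- Read .token-shield/anatomy.md before opening source files.
- Check .token-shield/cerebrum.md to avoid repeating past mistakes.
```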
📁 Project Intelligence (.token-shield/)
After token-shield init, every project gets:
```
.token-shield/
├── anatomy.md   ← File index with token estimates (AI reads this, not source files)
├── cerebrum.md  ← Mistake memory (AI never repeats the same error)
├── memory.md    ← Session learnings
├── ledger.json  ← Lifetime token savings log
└── config.json  ← Settings (caveman level, quota timers)
```
anatomy.md tells the AI what each file contains before reading it → 6-49× fewer file reads. cerebrum.md prevents the same mistake from happening twice → zero repeated errors.
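For illustration, a generated anatomy.md might index files like this (the format, file names, and token counts here are hypothetical; the real layout may differ):

```markdown
# anatomy.md (hypothetical example)
| File | Contains | ~Tokens |
|------|----------|---------|
| src/auth.js | login/logout handlers, JWT helpers | 1,800 |
| src/db.js | connection pool, query wrappers | 950 |
| src/routes/api.js | REST endpoints, request validation | 2,400 |
```

The AI consults this index first and opens only the one or two files that are actually relevant, instead of reading the whole tree.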
🪨 Caveman Mode
Inspired by caveman (36k ⭐) — makes AI respond in compressed style:
```
Normal AI: "The reason your React component is re-rendering is likely because
you're creating a new object reference on each render cycle..."
→ 69 tokens

🪶 Lite:  "Component re-renders: new object ref each render. Use useMemo."
→ 41 tokens (40% saved)

🪨 Full:  "New obj ref each render. Inline prop = new ref = re-render. useMemo."
→ 24 tokens (65% saved)

🔥 Ultra: "Inline obj → new ref → re-render. useMemo."
→ 17 tokens (75% saved)
```
⚡ Flash Model Enhancer
Flash model is free (resets every 5h) but gives weaker results than Claude. Token Shield enhances prompts for Flash automatically:
```
Before → Flash: "Fix my React bug"

After  → Flash: "You are a Senior React/Next.js developer.
                 Think step by step before answering.
                 Format: (1) Root cause (2) Fix with code (3) One-line explanation.
                 Fix my React bug"
```
Result: Flash quality goes from ~50% to ~85% of Claude quality. Free.
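The transformation above is essentially a prompt template: prepend an expert role, a step-by-step instruction, and an answer format. A minimal sketch, assuming a hypothetical `enhanceForFlash` helper (the actual template token-shield uses is internal):

```javascript
// Hypothetical sketch of the Flash Enhancer template. The role line,
// instruction, and format string mirror the example above but are
// illustrative, not token-shield's exact internals.
function enhanceForFlash(prompt, domain = "software") {
  return [
    `You are a senior ${domain} developer.`,
    "Think step by step before answering.",
    "Format: (1) Root cause (2) Fix with code (3) One-line explanation.",
    "",
    prompt,
  ].join("\n");
}

console.log(enhanceForFlash("Fix my React bug", "React/Next.js"));
```

Because the added lines are short and fixed, the quality gain costs only a few dozen extra input tokens per prompt.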
📊 Expected Results
| Metric | Before | After |
|--------|--------|-------|
| Claude limit duration | 1-2 days | 5-7 days |
| Input tokens per prompt | 847 avg | ~150 avg |
| Flash result quality | 50% | 85% |
| Output tokens (Ultra) | 200 avg | ~50 avg |
🏗️ Built With Inspiration From
- caveman — output compression
- openwolf — project anatomy + memory
- code-review-graph — codebase indexing
Token Shield adds: Flash Enhancer + Smart Routing + Model Escalation — features none of them have.
🔐 Privacy
- 100% local processing — no server, no cloud
- API keys never transmitted anywhere
- .token-shield/ stays in your project folder
- Zero telemetry, zero tracking
License
MIT
