@simosphere/tokenizer-speedup
v0.8.1
Intelligent context optimization for LLM sessions
The Problem
Every Claude Code session loads context — CLAUDE.md, project docs, feature specs, lessons learned. Most of it isn't needed for the current task, yet it consumes thousands of tokens every time.
TokenizerSpeedUp detects what you're working on and loads only the relevant context. The result: 30-60% fewer tokens per session.
How It Works
```
Prompt: "fix the authentication bug"
        |
  Keyword Detection
  [fix] [bug] matched
        |
  Rule Engine loads only:
  docs/lessons-learned/*.md
        |
  Context Compression
  (summaries if needed)
        |
  Optimized context injected
  2,300 tokens instead of 5,000
```

| Layer | Function |
|---|---|
| Context Gating | YAML rules detect keywords and load only matching docs |
| Context Compression | Summaries instead of full files when appropriate |
| Session Intelligence | Proactive context management to avoid compacting |
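The gating step above can be sketched in a few lines. This is an illustrative model of keyword-based rule matching, not the plugin's actual code; the rule shapes mirror the tokenizer-speedup.yaml schema, but the function names are made up:

```javascript
// Hypothetical sketch of keyword-based context gating.
// Rule shapes mirror the tokenizer-speedup.yaml schema.
const rules = [
  { trigger: ['fix', 'bug', 'error'], load: 'docs/lessons-learned/*.md' },
  { trigger: ['design', 'architecture'], load: 'docs/decisions/*.md' },
];

// Return the glob patterns whose trigger keywords appear in the prompt.
function matchRules(prompt, rules) {
  const text = prompt.toLowerCase();
  return rules
    .filter((rule) => rule.trigger.some((kw) => text.includes(kw)))
    .map((rule) => rule.load);
}

console.log(matchRules('fix the authentication bug', rules));
// → ['docs/lessons-learned/*.md']
```

Only the matched globs are then read, summarized if needed, and injected; everything else stays out of the context window.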
Quick Start
```sh
# Install globally
npm install -g @simosphere/tokenizer-speedup

# Setup with your API key
tsu setup --api-key YOUR_KEY
```

Restart Claude Code. Done.

Get your API key at tokenizer-speedup.simosphereai.com.
Claude Desktop (MCP)
Since v0.8.0, TokenizerSpeedUp includes an MCP server for Claude Desktop.
Setup
```sh
npm install -g @simosphere/tokenizer-speedup
```

Add to your claude_desktop_config.json:

```json
{
  "mcpServers": {
    "tokenizer-speedup": {
      "command": "tsu-mcp"
    }
  }
}
```

Restart Claude Desktop.
Available Tools
| Tool | Description |
|---|---|
| optimize-context | Optimize LLM context for a given prompt |
| analyze-tokens | Count and estimate tokens for text or files |
| get-savings | Return cumulative session savings metrics |
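Since the server speaks JSON-RPC over stdio (see Architecture), a tools/call request for optimize-context would have roughly this shape. The envelope follows JSON-RPC 2.0 / MCP conventions; the "prompt" argument name is an assumption, not documented API:

```javascript
// Hypothetical MCP tools/call request for the optimize-context tool.
// The envelope is standard JSON-RPC 2.0; the argument name "prompt"
// is illustrative, not a documented parameter.
const request = {
  jsonrpc: '2.0',
  id: 1,
  method: 'tools/call',
  params: {
    name: 'optimize-context',
    arguments: { prompt: 'fix the authentication bug' },
  },
};
process.stdout.write(JSON.stringify(request) + '\n');
```

In practice Claude Desktop constructs and sends these messages for you once the server is registered in claude_desktop_config.json.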
Available Resources
| Resource | Description |
|---|---|
| config://current | Current YAML configuration |
| report://savings | Session savings report |
Results
Real-world data from SIMO GmbH (2 weeks):
| Metric | Value |
|---|---|
| Token reduction | 54% |
| Before | 550 lines / ~5,000 tokens |
| After | 251 lines / ~2,300 tokens |
| Monthly savings | $36 |
| ROI | 1:6.7 |
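The headline figure follows directly from the before/after token counts:

```javascript
// Reproduce the reduction percentage from the measured token counts.
const before = 5000; // ~tokens per session before gating
const after = 2300;  // ~tokens per session after gating
const reduction = Math.round(((before - after) / before) * 100);
console.log(`${reduction}% fewer tokens`); // → 54% fewer tokens
```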
Configuration
Create a tokenizer-speedup.yaml in your project root:
```yaml
version: 1
rules:
  - trigger: [fix, bug, error]
    load: docs/lessons-learned/*.md
    max_tokens: 2000
  - trigger: [design, architecture]
    load: docs/decisions/*.md
    summary_only: true
  - trigger: [feature, implement]
    load: docs/features/*.md
    max_tokens: 4000
context_limits:
  max_injected_tokens: 8000
  max_file_tokens: 4000
```

See Configuration in the wiki for details.
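The max_tokens and context_limits settings cap how much of each matched file is injected. A minimal sketch of budget enforcement, assuming a whitespace word count as a stand-in for real token counting (the plugin uses gpt-tokenizer locally):

```javascript
// Hypothetical sketch of enforcing a per-file token budget.
// Real token counts come from gpt-tokenizer; words approximate them here.
function truncateToBudget(text, maxTokens) {
  const words = text.split(/\s+/);
  if (words.length <= maxTokens) return text;
  return words.slice(0, maxTokens).join(' ') + ' …[truncated]';
}

console.log(truncateToBudget('a b c d e', 3)); // → a b c …[truncated]
```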
CLI
```sh
tsu setup --api-key KEY   # Configure plugin + hooks
tsu analyze file.md       # Count tokens in a file
tsu config                # Validate YAML config
```

Pricing
| Free | Pro | Team |
|---|---|---|
| $0/month | $5/month | $15/seat/month |
| 100K tokens saved | Unlimited | Unlimited |
| Basic dashboard | Full analytics | Admin dashboard |
| Default rules | Custom rules | Team-wide sync |
Privacy
- Your prompts never leave your machine
- Only anonymous token counters are transmitted
- No tracking, no analytics cookies
- Open source core — inspect and audit anytime
Architecture
```
Claude Code
 |
 |-- UserPromptSubmit --> Rule Engine --> Context Injection
 |-- Stop --> Metering Client --> Cloud API
```

- Plugin — Bash/Node.js, Claude Code Hooks API
- Rule Engine — YAML config, Node.js parser
- Token Meter — gpt-tokenizer (local)
- Cloud — Metering API + Dashboard + Stripe Billing
- MCP Server — JSON-RPC over stdio, for Claude Desktop
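On the Claude Code side, the plugin hangs off the UserPromptSubmit hook: the hook receives the event as JSON on stdin, and whatever it writes to stdout is added to the session context. A simplified sketch of the decision logic (function name and injected text are illustrative, not the plugin's real implementation):

```javascript
// Hypothetical sketch of the UserPromptSubmit hook's decision step:
// given the hook payload, return extra context to inject, or null.
function handleUserPromptSubmit(payload) {
  const prompt = (payload.prompt || '').toLowerCase();
  if (['fix', 'bug', 'error'].some((kw) => prompt.includes(kw))) {
    // In the real plugin this would be the gated, budget-trimmed
    // contents of docs/lessons-learned/*.md.
    return 'Injected: docs/lessons-learned summaries';
  }
  return null; // nothing matched; inject nothing
}

console.log(handleUserPromptSubmit({ prompt: 'fix the login bug' }));
```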
Requirements
- Node.js 20+
- Claude Code or Claude Desktop
License
Business Source License 1.1 (BSL-1.1)
