@simosphere/tokenizer-speedup
v0.8.1
Intelligent context optimization for LLM sessions
The Problem
Every Claude Code session loads context — CLAUDE.md, project docs, feature specs, lessons learned. Most of it isn't needed for the current task, yet it consumes thousands of tokens every time.
TokenizerSpeedUp detects what you're working on and loads only the relevant context. The result: 30-60% fewer tokens per session.
How It Works
```
Prompt: "fix the authentication bug"
        |
  Keyword Detection
  [fix] [bug] matched
        |
  Rule Engine loads only:
  docs/lessons-learned/*.md
        |
  Context Compression
  (summaries if needed)
        |
  Optimized context injected
  2,300 tokens instead of 5,000
```

| Layer | Function |
|---|---|
| Context Gating | YAML rules detect keywords and load only matching docs |
| Context Compression | Summaries instead of full files when appropriate |
| Session Intelligence | Proactive context management to avoid compacting |
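The gating step above can be sketched in a few lines. This is an illustrative model of keyword-based rule matching, not the plugin's actual code; the rule shapes mirror the tokenizer-speedup.yaml schema, but the function names are made up:

```javascript
// Hypothetical sketch of keyword-based context gating.
// Rule shapes mirror the tokenizer-speedup.yaml schema.
const rules = [
  { trigger: ['fix', 'bug', 'error'], load: 'docs/lessons-learned/*.md' },
  { trigger: ['design', 'architecture'], load: 'docs/decisions/*.md' },
];

// Return the glob patterns whose trigger keywords appear in the prompt.
function matchRules(prompt, rules) {
  const text = prompt.toLowerCase();
  return rules
    .filter((rule) => rule.trigger.some((kw) => text.includes(kw)))
    .map((rule) => rule.load);
}

console.log(matchRules('fix the authentication bug', rules));
// → ['docs/lessons-learned/*.md']
```

Only the matched globs are then read, summarized if needed, and injected; everything else stays out of the context window.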
Quick Start
```sh
# Install globally
npm install -g @simosphere/tokenizer-speedup

# Setup with your API key
tsu setup --api-key YOUR_KEY
```

Restart Claude Code. Done.

Get your API key at tokenizer-speedup.simosphereai.com.
Claude Desktop (MCP)
Since v0.8.0, TokenizerSpeedUp includes an MCP server for Claude Desktop.
Setup
```sh
npm install -g @simosphere/tokenizer-speedup
```

Add to your claude_desktop_config.json:

```json
{
  "mcpServers": {
    "tokenizer-speedup": {
      "command": "tsu-mcp"
    }
  }
}
```

Restart Claude Desktop.
Available Tools
| Tool | Description |
|---|---|
| optimize-context | Optimize LLM context for a given prompt |
| analyze-tokens | Count and estimate tokens for text or files |
| get-savings | Return cumulative session savings metrics |
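Since the server speaks JSON-RPC over stdio (see Architecture), a tools/call request for optimize-context would have roughly this shape. The envelope follows JSON-RPC 2.0 / MCP conventions; the "prompt" argument name is an assumption, not documented API:

```javascript
// Hypothetical MCP tools/call request for the optimize-context tool.
// The envelope is standard JSON-RPC 2.0; the argument name "prompt"
// is illustrative, not a documented parameter.
const request = {
  jsonrpc: '2.0',
  id: 1,
  method: 'tools/call',
  params: {
    name: 'optimize-context',
    arguments: { prompt: 'fix the authentication bug' },
  },
};
process.stdout.write(JSON.stringify(request) + '\n');
```

In practice Claude Desktop constructs and sends these messages for you once the server is registered in claude_desktop_config.json.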
Available Resources
| Resource | Description |
|---|---|
| config://current | Current YAML configuration |
| report://savings | Session savings report |
Results
Real-world data from SIMO GmbH (2 weeks):
| Metric | Value |
|---|---|
| Token reduction | 54% |
| Before | 550 lines / ~5,000 tokens |
| After | 251 lines / ~2,300 tokens |
| Monthly savings | $36 |
| ROI | 1:6.7 |
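The headline figure follows directly from the before/after token counts:

```javascript
// Reproduce the reduction percentage from the measured token counts.
const before = 5000; // ~tokens per session before gating
const after = 2300;  // ~tokens per session after gating
const reduction = Math.round(((before - after) / before) * 100);
console.log(`${reduction}% fewer tokens`); // → 54% fewer tokens
```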
Configuration
Create a tokenizer-speedup.yaml in your project root:
```yaml
version: 1
rules:
  - trigger: [fix, bug, error]
    load: docs/lessons-learned/*.md
    max_tokens: 2000
  - trigger: [design, architecture]
    load: docs/decisions/*.md
    summary_only: true
  - trigger: [feature, implement]
    load: docs/features/*.md
    max_tokens: 4000
context_limits:
  max_injected_tokens: 8000
  max_file_tokens: 4000
```

See Configuration in the wiki for details.
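The max_tokens and context_limits settings cap how much of each matched file is injected. A minimal sketch of budget enforcement, assuming a whitespace word count as a stand-in for real token counting (the plugin uses gpt-tokenizer locally):

```javascript
// Hypothetical sketch of enforcing a per-file token budget.
// Real token counts come from gpt-tokenizer; words approximate them here.
function truncateToBudget(text, maxTokens) {
  const words = text.split(/\s+/);
  if (words.length <= maxTokens) return text;
  return words.slice(0, maxTokens).join(' ') + ' …[truncated]';
}

console.log(truncateToBudget('a b c d e', 3)); // → a b c …[truncated]
```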
CLI
```sh
tsu setup --api-key KEY   # Configure plugin + hooks
tsu analyze file.md       # Count tokens in a file
tsu config                # Validate YAML config
```

Pricing
| Free | Pro | Team |
|---|---|---|
| $0/month | $5/month | $15/seat/month |
| 100K tokens saved | Unlimited | Unlimited |
| Basic dashboard | Full analytics | Admin dashboard |
| Default rules | Custom rules | Team-wide sync |
Privacy
- Your prompts never leave your machine
- Only anonymous token counters are transmitted
- No tracking, no analytics cookies
- Open source core — inspect and audit anytime
Architecture
```
Claude Code
 |
 |-- UserPromptSubmit --> Rule Engine --> Context Injection
 |-- Stop --> Metering Client --> Cloud API
```

- Plugin — Bash/Node.js, Claude Code Hooks API
- Rule Engine — YAML config, Node.js parser
- Token Meter — gpt-tokenizer (local)
- Cloud — Metering API + Dashboard + Stripe Billing
- MCP Server — JSON-RPC over stdio, for Claude Desktop
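On the Claude Code side, the plugin hangs off the UserPromptSubmit hook: the hook receives the event as JSON on stdin, and whatever it writes to stdout is added to the session context. A simplified sketch of the decision logic (function name and injected text are illustrative, not the plugin's real implementation):

```javascript
// Hypothetical sketch of the UserPromptSubmit hook's decision step:
// given the hook payload, return extra context to inject, or null.
function handleUserPromptSubmit(payload) {
  const prompt = (payload.prompt || '').toLowerCase();
  if (['fix', 'bug', 'error'].some((kw) => prompt.includes(kw))) {
    // In the real plugin this would be the gated, budget-trimmed
    // contents of docs/lessons-learned/*.md.
    return 'Injected: docs/lessons-learned summaries';
  }
  return null; // nothing matched; inject nothing
}

console.log(handleUserPromptSubmit({ prompt: 'fix the login bug' }));
```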
Requirements
- Node.js 20+
- Claude Code or Claude Desktop
License
Business Source License 1.1 (BSL-1.1)
