cheap-claude

v1.1.0

57% token cost reduction for Claude Code. Zero quality loss. Plugin + API proxy.

   _____ _                         _____ _                 _      
  / ____| |                       / ____| |               | |     
 | |    | |__   ___  __ _ _ __   | |    | | __ _ _   _  __| | ___ 
 | |    | '_ \ / _ \/ _` | '_ \  | |    | |/ _` | | | |/ _` |/ _ \
 | |____| | | |  __/ (_| | |_) | | |____| | (_| | |_| | (_| |  __/
  \_____|_| |_|\___|\__,_| .__/   \_____|_|\__,_|\__,_|\__,_|\___|
                          | |                                       
                          |_|                                       

70% cost reduction on API calls. 57% on Claude Code sessions. Zero quality loss.

Why This Exists

On April 4, 2026, Anthropic blocked Claude Pro and Max subscribers from using their flat-rate plans with third-party AI agent frameworks — starting with OpenClaw, expanding to all third-party harnesses this month.

The impact is brutal:

  • 135,000+ OpenClaw instances were running when the announcement hit
  • A $200/month Max subscription was covering $1,000-$5,000 in actual compute — third-party tools bypass Anthropic's Prompt Cache optimizations that official tools use to cut costs
  • Users are now forced onto pay-as-you-go API billing, where a heavy day of agentic coding can cost $20-50+
  • Anthropic is offering a one-time credit and up to 30% off pre-purchased bundles, but that doesn't close the 5x price gap
  • Google made a parallel move against Gemini CLI users connecting third-party tools — this is an industry-wide reckoning with the economics of subsidizing agentic compute at flat rates

Cheap Claude exists to close that gap. If you're one of the 135K+ developers who just lost flat-rate pricing, this tool automatically applies the same Prompt Cache optimizations that Anthropic's official tools use — bringing your API costs back down to something sustainable.

Before & After

                        BEFORE                          AFTER
                   (no Cheap Claude)              (with Cheap Claude)

  API session (10 turns):
    Cost:       $0.42/session      →         $0.13/session    (70% saved)
    Cache:           0%            →              90-98%

  Claude Code session (12 turns):
    Cost:       $0.39/session      →         $0.17/session    (57% saved)
    Cache:           0%            →              75-94%

  ┌──────────────────────────────────────────────────────────────┐
  │  Monthly cost (20 devs, Opus 4.6, 5 sessions/day):          │
  │                                                              │
  │  API users:                                                  │
  │  Before:  ████████████████████████████████████████  $1,269   │
  │  After:   ████████████                              $384     │
  │  Saved:                 ░░░░░░░░░░░░░░░░░░░░░░░░░░  $885     │
  │                                                              │
  │  Claude Code users:                                          │
  │  Before:  ████████████████████████████████████████  $1,172   │
  │  After:   ██████████████████                        $446     │
  │  Saved:                     ░░░░░░░░░░░░░░░░░░░░░░  $726     │
  └──────────────────────────────────────────────────────────────┘

Install

Claude Code Plugin (recommended)

# Option A: npm (fastest)
npm install -g cheap-claude
mkdir -p ~/.claude/plugins   # make sure the plugin directory exists
ln -sf "$(npm root -g)/cheap-claude/plugin" ~/.claude/plugins/cheap-claude

# Option B: clone
git clone https://github.com/ajsai47/cheap-claude.git ~/cheap-claude
mkdir -p ~/.claude/plugins   # make sure the plugin directory exists
ln -sf ~/cheap-claude/plugin ~/.claude/plugins/cheap-claude

Restart Claude Code. The plugin activates automatically: the /cheap-stats command becomes available, and duplicate-read warnings appear in your sessions.

API Proxy (for SDKs, scripts, custom apps)

# Clone + start
git clone https://github.com/ajsai47/cheap-claude.git ~/cheap-claude
cd ~/cheap-claude && npm install
npx tsx src/proxy/server.ts &
export ANTHROPIC_BASE_URL=http://localhost:8082
# Dashboard at http://localhost:8082/dashboard

Commands

Once the plugin is installed, these are available in any Claude Code session:

| Command | What it does |
|---------|-------------|
| /cheap-stats | Show session costs, duplicate reads, saving tips |
| cheap_session_stats | MCP tool: compact cost summary (~50 tokens) |
| cheap_session_details | MCP tool: full breakdown with per-tool stats |
| cheap_dedup_report | MCP tool: list every duplicate file read this session |
| cheap_cost_tip | MCP tool: get personalized cost-saving suggestions |
| cheap_outline | MCP tool: structural file overview (4-8x cheaper than Read) |
| cheap_unfold | MCP tool: read just one function from a file |
| cheap_search | MCP tool: compact grep across directory (max 20 results) |

Example: /cheap-stats

> /cheap-stats

  Cheap Claude — Session Stats
  ─────────────────────────────────────
  Tool calls:        47
  Estimated cost:    $0.14
  File reads:        12 (3 duplicates caught)
  Tokens wasted:     ~4,200 on re-reads

  Tips:
  • You re-read src/index.ts 3 times. Use your existing knowledge.
  • 8 Bash calls — consider using Grep/Glob instead.
  ─────────────────────────────────────

How It Works

Plugin (Claude Code users)

The plugin drops into ~/.claude/plugins/ and activates automatically:

┌─────────────────────────────────────────────────────────┐
│  Claude Code                                            │
│                                                          │
│  ┌──────────┐  CLAUDE.md injects terse + efficiency     │
│  │  Cheap   │  hints every session                      │
│  │  Claude  │                                            │
│  │  Plugin  │  PreToolUse hook warns before re-reading   │
│  │          │  files you already have in context         │
│  │          │                                            │
│  │          │  PostToolUse hook logs every tool call     │
│  │          │  and tracks file read patterns             │
│  │          │                                            │
│  │          │  MCP server provides /cheap-stats and      │
│  │          │  cost tracking tools                       │
│  └──────────┘                                            │
└─────────────────────────────────────────────────────────┘

| Component | File | What it does |
|-----------|------|-------------|
| Context injection | CLAUDE.md | Terse output rules, file-read efficiency hints |
| Read dedup warning | scripts/pre-tool-use.sh | Catches duplicate reads before they happen |
| Tool tracking | scripts/post-tool-use.sh | Logs all tool calls with token estimates |
| Session init | scripts/session-start.sh | Initializes per-session tracking |
| Cost tools | mcp/server.ts | cheap_session_stats, cheap_dedup_report, cheap_cost_tip |
| Stats command | skills/cheap-stats/ | /cheap-stats slash command |
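
The read dedup warning can be pictured roughly as follows. This is a TypeScript sketch for illustration only, not the shipped scripts/pre-tool-use.sh; the `ReadTracker` class, its method names, and the warning text are hypothetical.

```typescript
// Hypothetical sketch of per-session duplicate-read tracking.
// The real plugin does this in a shell hook; all names here are illustrative.
class ReadTracker {
  // Maps a file path to how many times it has been read this session.
  private readonly reads = new Map<string, number>();

  // Call before a Read tool runs. Returns a warning for repeat reads, else null.
  check(filePath: string): string | null {
    const previous = this.reads.get(filePath) ?? 0;
    this.reads.set(filePath, previous + 1);
    return previous > 0
      ? `Already read ${filePath} ${previous} time(s) this session.`
      : null;
  }

  // Total number of reads beyond the first, summed over all files.
  duplicateCount(): number {
    let total = 0;
    for (const count of this.reads.values()) total += count - 1;
    return total;
  }
}

const tracker = new ReadTracker();
tracker.check("src/index.ts");              // first read, no warning
console.log(tracker.check("src/index.ts")); // repeat read, prints a warning
```

Keying on the path alone is the simplest choice; a real hook would probably also need to notice when a file changed between reads, which this sketch does not attempt.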

Proxy (API consumers)

Six engines run on every /v1/messages request:

Request  ─→  Model Router  ─→  MCP Optimizer  ─→  History Compressor
                                                          │
api.anthropic.com  ←─  Cache Maximizer  ←─  Output Enforcer  ←─  Result Dedup
         │
         └─→  SSE stream back to client (cost tracked per turn)

| # | Engine | What it does | Savings |
|---|--------|-------------|---------|
| 1 | Model Router | Routes simple turns ("yes", "run tests") to Haiku | ~5% on Opus |
| 2 | MCP Optimizer | Defers unused tool schemas from turn 1 | 85% tool overhead |
| 3 | History Compressor | Haiku summarizes old turns (keeps last 4 messages) | 50-70% on history |
| 4 | Result Deduplicator | SHA256 dedup of identical file re-reads | 2-5 dupes/session |
| 5 | Output Enforcer | Terse prompt injection (never caps max_tokens) | ~35% output |
| 6 | Cache Maximizer | Adds cache_control breakpoints to prefix | 80-94% cache hits |
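
To make the Cache Maximizer row concrete, here is a minimal sketch of adding one breakpoint to a request. The `cache_control: { type: "ephemeral" }` field is part of the Anthropic Messages API; the `addCacheBreakpoint` helper, the trimmed-down request types, and the choice to mark only the last system block are assumptions for illustration, not a description of what src/proxy/engines/ actually does.

```typescript
// Minimal request shapes; only the fields this sketch touches.
type CacheControl = { type: "ephemeral" };
type TextBlock = { type: "text"; text: string; cache_control?: CacheControl };
type MessagesRequest = {
  model: string;
  max_tokens: number;
  system?: string | TextBlock[];
  messages: { role: "user" | "assistant"; content: string }[];
};

// Hypothetical helper: returns a copy of the request whose system prompt
// ends with a cache breakpoint, so the stable prefix can be cached.
function addCacheBreakpoint(req: MessagesRequest): MessagesRequest {
  if (req.system === undefined) return req; // nothing stable to cache
  // Normalize a plain-string system prompt into block form (copying blocks).
  const blocks: TextBlock[] =
    typeof req.system === "string"
      ? [{ type: "text", text: req.system }]
      : req.system.map((b) => ({ ...b }));
  const last = blocks[blocks.length - 1];
  if (!last) return req; // empty block list
  // Everything up to and including the marked block forms the cached prefix.
  last.cache_control = { type: "ephemeral" };
  return { ...req, system: blocks };
}

const marked = addCacheBreakpoint({
  model: "your-model-id", // placeholder, not a real model ID
  max_tokens: 1024,
  system: "You are a support bot. <long, stable instructions here>",
  messages: [{ role: "user", content: "Hello" }],
});
console.log(JSON.stringify(marked.system));
```

The helper copies rather than mutates the incoming request, which keeps it safe to run inside a pipeline where other engines still hold a reference to the original.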

Dashboard

The proxy serves a live dashboard at http://localhost:8082/dashboard:

┌──────────────────────────────────────────────────┐
│  Cheap Claude                   proxy active     │
│                                                   │
│  Today          Savings          Setup            │
│  ─────────────────────────────────────────────── │
│                                                   │
│  Before:  $5.82 / day                            │
│  Today:   $2.51 / day         ↓ 57%              │
│                                                   │
│  ████████████████████░░░  $3.31 saved today      │
│                                                   │
│  Cache Maximizer      $1.42   42%                │
│  MCP Optimizer        $0.81   24%                │
│  History Compressor   $0.68   21%                │
│  Model Router         $0.22    7%                │
│  Result Deduplicator  $0.12    4%                │
│  Output Enforcer      $0.06    2%                │
└──────────────────────────────────────────────────┘

Pricing Context

Why this matters — the cost difference between cached and uncached:

                    Input         Cache Read       You Save
  Opus 4.6:     $5.00/MTok  →   $0.50/MTok        90%
  Sonnet 4.6:   $3.00/MTok  →   $0.30/MTok        90%
  Haiku 4.5:    $1.00/MTok  →   $0.10/MTok        90%

The Cache Maximizer's job is to make as much of every request cacheable as possible. On a typical work turn, about 90% of input tokens are read from cache, and each of those cached tokens costs 10 cents on the dollar.
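
As a worked example of the arithmetic: the 90% discount applies only to tokens served from cache, while the uncached remainder still pays full price. The sketch below computes the blended input price for a given hit rate. The `blendedInputPrice` name is hypothetical, and the calculation deliberately ignores cache-write surcharges and output tokens.

```typescript
// Blended input price in $/MTok for a given cache hit rate.
// Assumption: cache-write surcharges and output tokens are ignored.
function blendedInputPrice(
  inputPrice: number,     // uncached input price, $/MTok
  cacheReadPrice: number, // cache-read price, $/MTok
  hitRate: number,        // fraction of input tokens served from cache, 0..1
): number {
  if (hitRate < 0 || hitRate > 1) throw new RangeError("hitRate must be in [0, 1]");
  return hitRate * cacheReadPrice + (1 - hitRate) * inputPrice;
}

// Opus 4.6 row from the table above, at a 90% hit rate.
const blended = blendedInputPrice(5.0, 0.5, 0.9);
console.log(blended.toFixed(2)); // 0.9 * 0.50 + 0.1 * 5.00 = 0.95
```

At a 90% hit rate the blended input price comes to about $0.95/MTok against $5.00/MTok uncached, so input costs fall by roughly 81% on that turn.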

API Users — What to Expect

The proxy works for any app using the Anthropic SDK. Your savings depend on your system prompt size:

  Your System Prompt          Cache Hit Rate     Savings
  ────────────────────────    ──────────────     ───────
  8K+ tokens (large app)      90-98%             70%     ← verified on 10 turns
  2-8K tokens (medium app)    80-95%             40-60%
  Under 2K tokens             0%                 0%      ← too small to cache

  Minimum for caching:
    Haiku 4.5:   2,048 tokens (~8K chars)
    Sonnet/Opus: 1,024 tokens (~4K chars)

If your system prompt is under the minimum, the proxy can't help with caching. The Output Enforcer and Model Router still provide modest savings.
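
A quick way to check which side of the minimum a prompt falls on is sketched below. It uses the thresholds listed above and the roughly 4 characters per token ratio those figures imply; the helper names are hypothetical, and a real tokenizer will give different counts.

```typescript
// Minimum cacheable prompt sizes, in tokens, as listed above.
const MIN_CACHE_TOKENS = {
  haiku: 2048,
  sonnet: 1024,
  opus: 1024,
} as const;

type ModelFamily = keyof typeof MIN_CACHE_TOKENS;

// Rough estimate using the ~4 chars/token ratio implied above
// (2,048 tokens is listed as ~8K chars). Not a real tokenizer.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Hypothetical helper: true if the system prompt is likely large enough to cache.
function isLikelyCacheable(systemPrompt: string, family: ModelFamily): boolean {
  return estimateTokens(systemPrompt) >= MIN_CACHE_TOKENS[family];
}

const prompt = "x".repeat(6000); // ~1,500 estimated tokens
console.log(isLikelyCacheable(prompt, "sonnet")); // true: 1,500 >= 1,024
console.log(isLikelyCacheable(prompt, "haiku"));  // false: 1,500 < 2,048
```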

Verified: 10-turn API chatbot with 8K system prompt

  Turn  1:   0% cache  $0.011  (one-time cache write)
  Turn  2:  98% cache  $0.001  ← 87% cheaper immediately
  Turn  3:  97% cache  $0.002
  ...
  Turn 10:  90% cache  $0.002
  Total:    $0.026 (would have been $0.085 → 70% saved)
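
The headline percentage follows directly from the two totals. A small sketch of the calculation, with a hypothetical `savingsPercent` helper:

```typescript
// Percentage saved relative to the baseline cost.
function savingsPercent(before: number, after: number): number {
  if (before <= 0) throw new RangeError("before must be positive");
  return ((before - after) / before) * 100;
}

// Totals from the 10-turn run above.
console.log(savingsPercent(0.085, 0.026).toFixed(1)); // 69.4
```

The result, 69.4%, rounds to the 70% quoted for this run.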

Best for

  • Multi-turn chatbots with stable system prompts → 70% savings
  • AI agents with tools and growing history → 57% savings
  • RAG apps with large static context + dynamic retrieval → 30-50% savings
  • Batch processing with repeated prompts → 50-60% savings

Not helpful for

  • One-shot API calls with tiny system prompts
  • Apps that already implement cache_control manually
  • Streaming-only apps where latency matters more than cost (proxy adds <5ms)

Quick test for your app

# Start the proxy
git clone https://github.com/ajsai47/cheap-claude.git && cd cheap-claude
npm install && npx tsx src/proxy/server.ts &

# Point your app at it
export ANTHROPIC_BASE_URL=http://localhost:8082

# Run your app normally, then check savings
curl http://localhost:8082/stats

The Honest Story

We ran 21+ iterations using an autoagent-style loop. Here's what actually happened:

Iteration   Score    What we tried
─────────   ─────    ─────────────────────────────────────────
Baseline    14.5%    5 engines, nothing tuned
   1        56.9%    ← THE BIG ONE: MCP Optimizer from turn 1
   3        57.9%    Tuned compression thresholds
   8        70.1%    Aggressive file truncation (reverted — loses context)
  15        76.6%    Keep only 1 recent message (reverted — loses context)
  23        79.3%    Cap max_tokens at 30% (reverted — breaks code gen)
  Reset     57.3%    Reverted everything that hurt quality
  Final     62%*     Added model routing (*projected on Opus)
  API test  70%      10-turn chatbot with 8K system prompt (verified)

The jump from 14% to 57% came from one insight: run the MCP Optimizer from turn 1 so the cached prefix is byte-identical across all turns. Everything else was tuning.
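
One way to see why the timing matters: prompt caching matches on an exact prefix, so a tool list that changes between turns produces a different prefix and a cache miss. The sketch below illustrates this by hashing a simplified prefix. The `prefixHash` helper, the `Tool` shape, and the two example tools are hypothetical; this is not the proxy's actual code.

```typescript
import { createHash } from "node:crypto";

type Tool = { name: string; description: string };

// Hash of a simplified cacheable prefix (system prompt + tool schemas).
// Any byte-level change in the prefix yields a different hash.
function prefixHash(system: string, tools: Tool[]): string {
  return createHash("sha256")
    .update(JSON.stringify({ system, tools }))
    .digest("hex");
}

const system = "You are a coding assistant.";
const allTools: Tool[] = [
  { name: "read", description: "Read a file" },
  { name: "deploy", description: "Deploy the app" },
];
const usedTools = allTools.filter((t) => t.name === "read");

// Trimming tools only from turn 2: turn 1 and turn 2 send different prefixes.
const lateOptimized = prefixHash(system, allTools) === prefixHash(system, usedTools);

// Trimming tools from turn 1: every turn sends the same prefix.
const earlyOptimized = prefixHash(system, usedTools) === prefixHash(system, usedTools);

console.log({ lateOptimized, earlyOptimized }); // { lateOptimized: false, earlyOptimized: true }
```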

We hit 79% by cheating — truncating file content Claude needs, capping output length, stripping conversation context. It looked great on the eval. It would break real usage. We reverted all of it.

The 70% API number is real — tested on a 10-turn multi-turn chatbot with an 8K-token system prompt. Turns 2-10 all hit 90-98% cache.

57% is the honest number. 62% with model routing on Opus.

Project Structure

cheap-claude/
├── plugin/                        # Claude Code plugin
│   ├── .claude-plugin/plugin.json
│   ├── .mcp.json                  # MCP server registration
│   ├── CLAUDE.md                  # Injected every session
│   ├── hooks/hooks.json           # Lifecycle hooks
│   ├── mcp/server.ts              # Cost tracking tools
│   ├── scripts/                   # Hook implementations
│   └── skills/cheap-stats/        # /cheap-stats command
├── src/proxy/                     # API proxy (6 engines)
│   ├── server.ts                  # HTTP proxy + dashboard
│   ├── pipeline.ts                # Engine orchestrator
│   └── engines/                   # The 6 engines
├── src/dashboard/index.html       # Live cost dashboard
├── src/cli/                       # init + stop
├── test/harness.ts                # 12-turn benchmark
├── eval.sh                        # Automated eval
└── program.md                     # Autoagent config

Data

All data is stored in ~/.claude-cheap/. Nothing is written to your repo, and none of it leaves your machine.

License

MIT

Acknowledgments

ClaudeMem · ccusage · token-optimizer · autoagent