prompt-cache-optimizer

v0.2.1

Drop-in wrapper for the Anthropic SDK that auto-places cache_control breakpoints, measures cache hit rate, and explains exactly why your prompt caching silently broke.

Drop-in wrapper for the Anthropic SDK that makes prompt caching effortless. Auto-places cache_control breakpoints based on observed prompt stability, measures real cache hit rate from the response usage object, and explains exactly what changed when your cache silently breaks.

Figure: real output from bun run example. Across six calls, autoCache marks the system prompt cacheable after observing it twice, calls 3–5 hit the cache (~1569 cached tokens each, $0.0042 saved per call), and a final deliberate drift triggers the cache-miss diagnostic showing the exact characters that changed. client.stability() reports system score=0.80 cumulative across the run.

Status: v0.2 — auto-placement + cache-miss diagnostics + per-segment stability report. Backwards compatible with v0.1.

Why this exists

Anthropic prompt caching bills cache reads at roughly 10% of the base input price, which is a 90% discount on the cached portion of your prompt. But the API is finicky:

  • A misplaced cache_control breakpoint silently degrades to a full-price call
  • You only get 4 breakpoints per request — they have to be spent well
  • Cache prefixes break if message order shifts even slightly
  • The default TTL is 5 minutes; lots of setups silently regress when calls come in slower than that
  • The only way to know it's working is to parse cache_read_input_tokens yourself

prompt-cache-optimizer handles all of that for you.
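For a sense of what the last bullet involves, here is a minimal sketch of the manual bookkeeping. It is not the package's own code. The token field names come from the usage object on Anthropic Messages API responses; summarizeUsage, CacheSummary, and the $3 per million token base price are assumptions made for illustration.

```typescript
// Token counters as they appear on the Messages API `usage` object.
interface Usage {
  input_tokens: number;
  cache_read_input_tokens?: number | null;
  cache_creation_input_tokens?: number | null;
}

// Hypothetical summary shape, loosely modeled on the cacheInfo output shown later.
interface CacheSummary {
  hit: boolean;
  cachedTokens: number;
  cacheWriteTokens: number;
  uncachedTokens: number;
  dollarsSaved: number;
}

// Assumptions: the price is USD per million input tokens, and a cache read
// costs 10% of the base input price (the 90% discount mentioned above).
function summarizeUsage(usage: Usage, basePricePerMTok: number): CacheSummary {
  const cachedTokens = usage.cache_read_input_tokens ?? 0;
  const cacheWriteTokens = usage.cache_creation_input_tokens ?? 0;
  const savedPerMTok = basePricePerMTok * 0.9;
  return {
    hit: cachedTokens > 0,
    cachedTokens,
    cacheWriteTokens,
    uncachedTokens: usage.input_tokens,
    dollarsSaved: (cachedTokens * savedPerMTok) / 1_000_000,
  };
}

// Example with an assumed base price of $3 per million input tokens.
const summary = summarizeUsage(
  { input_tokens: 312, cache_read_input_tokens: 1569, cache_creation_input_tokens: 0 },
  3,
);
console.log(summary.hit, summary.dollarsSaved.toFixed(4)); // true 0.0042
```

A call counts as a hit here whenever any tokens were read from the cache; a stricter definition (for example, requiring a minimum cached share) would be an equally reasonable choice.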

Install

npm install prompt-cache-optimizer @anthropic-ai/sdk
# or
bun add prompt-cache-optimizer @anthropic-ai/sdk

Quick start (v0.2 — auto-placement)

import { CachedAnthropic } from "prompt-cache-optimizer";

const client = new CachedAnthropic({
  apiKey: process.env.ANTHROPIC_API_KEY!,
  autoCache: true,           // ← let the wrapper place cache_control for you
  diagnoseMisses: true,      // ← explain what changed when the cache misses
  warnIfHitRateBelow: 0.6,
});

// Use the SDK exactly like normal. No placeBreakpoints() needed.
const response = await client.messages.create({
  model: "claude-sonnet-4-6",
  max_tokens: 1024,
  system: longSystemPrompt,
  messages: conversation,
});

// Values below are illustrative and show a warm cache; the very first call reports hit: false.
console.log(response.cacheInfo);
// { hit: true, cachedTokens: 8420, uncachedTokens: 312, dollarsSaved: 0.024, ... }

console.log(client.stats());
// { totalCalls: 1, hitRate: 1, totalCachedTokens: 8420, dollarsSaved: 0.024, ... }

console.log(client.stability());
// { entries: [{ segment: 'system', stabilityScore: 1, approxTokens: 2103, ... }], ... }

The first call always misses (that's when the cache is written). Once the wrapper has seen the system prompt twice unchanged, it auto-marks it cacheable and subsequent calls hit. No code changes needed when your prompt shape evolves — auto-placement re-evaluates each call.

How auto-placement decides what to cache

On every call the wrapper:

  1. Fingerprints each candidate segment — system, tools, and every cumulative messages[0..N] prefix — using SHA-256 over a canonical form (cache_control markers stripped so they don't affect the hash).
  2. Tracks the fingerprint history per segment.
  3. Once a segment has been seen unchanged for at least autoCacheMinObservations consecutive calls (default 2), it qualifies for auto-placement.
  4. Picks the highest-value placements within Anthropic's 4-breakpoint budget: system first, then tools, then the longest stable message prefix.

You can inspect this state live with client.stability().
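The four steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the wrapper's actual implementation: canonicalize, fingerprint, StabilityTracker, and pickPlacements are hypothetical names, and the canonical form (sorted keys, cache_control removed, JSON-serialized) is one plausible choice. Only createHash from node:crypto is a real API.

```typescript
import { createHash } from "node:crypto";

// Step 1: canonical form. Drop cache_control markers and sort object keys so
// equivalent segments always serialize to the same string.
function canonicalize(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(canonicalize);
  if (value !== null && typeof value === "object") {
    const source = value as Record<string, unknown>;
    const out: Record<string, unknown> = {};
    for (const key of Object.keys(source).sort()) {
      if (key !== "cache_control") out[key] = canonicalize(source[key]);
    }
    return out;
  }
  return value;
}

function fingerprint(segment: unknown): string {
  const canonical = JSON.stringify(canonicalize(segment)) ?? "null";
  return createHash("sha256").update(canonical).digest("hex");
}

// Steps 2 and 3: remember the last fingerprint per segment and count how many
// consecutive calls it has stayed unchanged.
class StabilityTracker {
  private readonly seen = new Map<string, { hash: string; streak: number }>();

  observe(name: string, segment: unknown): number {
    const hash = fingerprint(segment);
    const prev = this.seen.get(name);
    const streak = prev !== undefined && prev.hash === hash ? prev.streak + 1 : 1;
    this.seen.set(name, { hash, streak });
    return streak;
  }
}

// Step 4: spend the budget in priority order: system, tools, then the longest
// stable message prefix. prefixStreaks[i] is the streak of messages[0..i].
function pickPlacements(
  streaks: { system: number; tools: number; prefixStreaks: number[] },
  minObservations = 2,
  budget = 4,
): string[] {
  const picks: string[] = [];
  if (streaks.system >= minObservations) picks.push("system");
  if (streaks.tools >= minObservations) picks.push("tools");
  for (let i = streaks.prefixStreaks.length - 1; i >= 0; i--) {
    if ((streaks.prefixStreaks[i] ?? 0) >= minObservations) {
      picks.push(`messages[0..${i}]`);
      break;
    }
  }
  return picks.slice(0, budget);
}

const tracker = new StabilityTracker();
const first = tracker.observe("system", [{ type: "text", text: "You are helpful." }]);
const second = tracker.observe("system", [
  { type: "text", text: "You are helpful.", cache_control: { type: "ephemeral" } },
]);
console.log(first, second); // 1 2 (the marker does not change the fingerprint)
console.log(pickPlacements({ system: second, tools: 1, prefixStreaks: [3, 2, 1] }));
```

Stripping cache_control before hashing matters because the wrapper itself adds those markers; without that step, placing a breakpoint would look like a prompt change on the next call.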

Manual breakpoint placement (still supported)

If you want explicit control, placeBreakpoints from v0.1 still works exactly as before. Auto-placement is a no-op whenever you've already marked anything cacheable yourself — your intent is always respected.

import { placeBreakpoints } from "prompt-cache-optimizer";

const { system, messages } = placeBreakpoints({
  system: longSystemPrompt,
  messages: conversation,
  strategy: "after-system",
});

// client is the CachedAnthropic instance from the quick start.
await client.messages.create({
  model: "claude-sonnet-4-6",
  max_tokens: 1024,
  system,
  messages,
});

Three strategies are available:

  • after-system — cache the system prompt (best for RAG and long instructions)
  • after-last-assistant — cache the conversation history (best for chat)
  • system-and-history — cache both (uses 2 of your 4 breakpoints)
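To make the three strategies concrete, the sketch below shows one way the resulting request shape could look. It is an interpretation of the strategy names, not the package's placeBreakpoints source: sketchPlaceBreakpoints, markLastBlock, and the local types are hypothetical. The cache_control: { type: "ephemeral" } marker on a content block is the form the Anthropic API expects.

```typescript
type CacheControl = { type: "ephemeral" };
interface TextBlock { type: "text"; text: string; cache_control?: CacheControl }
interface Message { role: "user" | "assistant"; content: string | TextBlock[] }
type Strategy = "after-system" | "after-last-assistant" | "system-and-history";

const EPHEMERAL: CacheControl = { type: "ephemeral" };

// Normalize a string into a single text block, then mark the final block.
// A breakpoint caches everything up to and including the marked block.
function markLastBlock(content: string | TextBlock[]): TextBlock[] {
  const blocks: TextBlock[] =
    typeof content === "string" ? [{ type: "text", text: content }] : content;
  return blocks.map((block, i): TextBlock =>
    i === blocks.length - 1 ? { ...block, cache_control: EPHEMERAL } : block,
  );
}

// Hypothetical stand-in for placeBreakpoints; returns new objects and leaves
// the inputs untouched.
function sketchPlaceBreakpoints(input: {
  system: string | TextBlock[];
  messages: Message[];
  strategy: Strategy;
}) {
  const markSystem = input.strategy !== "after-last-assistant";
  const markHistory = input.strategy !== "after-system";
  const lastAssistant = input.messages.map((m) => m.role).lastIndexOf("assistant");
  return {
    system: markSystem ? markLastBlock(input.system) : input.system,
    messages: input.messages.map((m, i): Message =>
      markHistory && i === lastAssistant ? { ...m, content: markLastBlock(m.content) } : m,
    ),
  };
}

const placed = sketchPlaceBreakpoints({
  system: "Long instructions...",
  messages: [
    { role: "user", content: "Hi" },
    { role: "assistant", content: "Hello!" },
    { role: "user", content: "What changed?" },
  ],
  strategy: "system-and-history",
});
console.log(JSON.stringify(placed, null, 2));
```

With system-and-history, the output carries two markers: one on the system block and one on the last assistant turn, which leaves the newest user message outside the cached prefix.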

Stats

client.stats();
// {
//   totalCalls: 142,
//   cacheHits: 124,
//   hitRate: 0.873,
//   totalCachedTokens: 1_240_000,
//   totalUncachedTokens: 52_400,
//   totalCacheWriteTokens: 21_000,
//   dollarsSaved: 3.72,
//   dollarsSpent: 1.41,
// }

Cache-miss diagnostics

Enable diagnoseMisses: true and every cache-write-without-read warning gets a structured diff explaining what changed. Example:

new CachedAnthropic({
  apiKey,
  diagnoseMisses: true,
  onWarning: (event) => {
    if (event.code === "cache-write-without-read") {
      console.error(event.message);
      // → "...Detected: system prompt changed at character 1240: ...the docs as of [Tuesday|Wednesday]..."
      console.error(event.detail?.diff);
      // → [{ segment: 'system', summary: '...', detail: { changeIndex: 1240, ... } }]
    }
  },
});

Common things it catches:

  • system prompt drift (inserted timestamps, dynamic context)
  • tool order changes
  • retrieved-document reordering
  • TTL expiration (cache was fine, then nobody called within 5 minutes)
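The character-level diagnostic for system prompt drift can be approximated with a common-prefix and common-suffix scan. The sketch below is an assumption about how such a diff might work, not the package's algorithm; describeDrift and DriftReport are hypothetical names, and only changeIndex mirrors a field shown in the example output above.

```typescript
interface DriftReport {
  changeIndex: number; // index of the first character that differs
  removed: string;     // text present only in the previous prompt
  inserted: string;    // text present only in the current prompt
}

function describeDrift(previous: string, current: string): DriftReport | null {
  if (previous === current) return null;

  // Walk forward while both prompts agree.
  const limit = Math.min(previous.length, current.length);
  let start = 0;
  while (start < limit && previous[start] === current[start]) start++;

  // Walk backward to trim the shared suffix, without crossing `start`.
  let endPrev = previous.length;
  let endCur = current.length;
  while (endPrev > start && endCur > start && previous[endPrev - 1] === current[endCur - 1]) {
    endPrev--;
    endCur--;
  }

  return {
    changeIndex: start,
    removed: previous.slice(start, endPrev),
    inserted: current.slice(start, endCur),
  };
}

const report = describeDrift("the docs as of Tuesday.", "the docs as of Wednesday.");
console.log(report);
```

Because the scan trims the shared suffix character by character, this example reports "Tu" versus "Wedn" rather than whole words; widening the span to word boundaries is one option for more readable messages.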

Warnings

The client emits passive warnings (never throws, never blocks a request):

  • no-cache-control-found — you forgot to mark anything cacheable AND auto-cache hasn't activated yet
  • cache-write-without-read — your prefix changed call-over-call; cache is broken (carries a diff when diagnoseMisses: true)
  • low-hit-rate — rolling hit rate fell below your threshold
  • unknown-model — pricing unknown, so dollar accounting is skipped
  • auto-placement-applied — info-level: the wrapper just placed cache_control on a newly-stable segment

Route them anywhere:

new CachedAnthropic({
  apiKey,
  onWarning: (event) => logger.warn(event),
});
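As an illustration of how a passive low-hit-rate warning could be produced, here is a small rolling-window monitor. It is a sketch under assumptions: RollingHitRate, the window size, and the message wording are hypothetical, and the package may compute its rolling rate differently. The threshold corresponds to warnIfHitRateBelow in the quick start.

```typescript
interface WarningEvent {
  code: "low-hit-rate";
  message: string;
}

class RollingHitRate {
  private readonly window: boolean[] = [];
  private readonly size: number;
  private readonly threshold: number;
  private readonly onWarning: (event: WarningEvent) => void;

  constructor(size: number, threshold: number, onWarning: (event: WarningEvent) => void) {
    this.size = size;
    this.threshold = threshold;
    this.onWarning = onWarning;
  }

  // Record one call and return the hit rate over the most recent `size` calls.
  record(hit: boolean): number {
    this.window.push(hit);
    if (this.window.length > this.size) this.window.shift();
    const rate = this.window.filter(Boolean).length / this.window.length;

    // Only warn once the window is full, so early calls do not trigger noise.
    if (this.window.length === this.size && rate < this.threshold) {
      try {
        this.onWarning({
          code: "low-hit-rate",
          message: `hit rate ${rate.toFixed(2)} is below ${this.threshold}`,
        });
      } catch {
        // Passive by design: a failing handler must not break the request path.
      }
    }
    return rate;
  }
}

const events: WarningEvent[] = [];
const monitor = new RollingHitRate(4, 0.6, (event) => events.push(event));
for (const hit of [true, true, false, false]) monitor.record(hit);
console.log(events.length, events[0]?.message); // 1 hit rate 0.50 is below 0.6
```

Swallowing handler errors keeps the monitor in line with the "never throws, never blocks a request" behavior described above.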

Roadmap

  • ~~v0.2 — auto-placement of cache_control breakpoints based on observed prompt stability~~ ✅ shipped
  • v0.3 — safe message and tool reordering to maximize the stable prefix
  • v0.4 — OpenAI and Gemini prompt caching support
  • v1.0 — persistent stats adapter, middleware mode

Zero runtime dependencies

@anthropic-ai/sdk is a peer dependency. prompt-cache-optimizer itself has zero runtime deps. v0.2 uses Node's built-in node:crypto for fingerprinting.

Contributing

PRs welcome — see CONTRIBUTING.md.

Support this project

If this package saved you money on your Anthropic bill, consider buying me a coffee. This project is MIT-licensed and free forever; sponsorship just helps me spend more time on it.

GitHub Sponsors

License

MIT © Leonhail Paypa


If this package saved you money on your Anthropic bill, please star the repo. It's the single biggest signal that helps other developers find it.