anthropic-router

v0.1.0

Drop-in Anthropic SDK wrapper that automatically routes messages.create() calls to the cheapest model that can reliably handle them.

npm install anthropic-router

The problem

You default to Sonnet on every request — including the ones Haiku handles just as well. The pricing difference:

| Model | Price / 1K tokens |
|---|---|
| Haiku | $0.0002 |
| Sonnet | $0.003 |
| Opus | $0.015 |

That's a 15x spread between Haiku and Sonnet. Most apps have 40–70% of requests that are Haiku-eligible.

Projected savings

| Monthly Anthropic spend | Haiku-eligible requests | Spend with router | Monthly savings |
|---|---|---|---|
| $50 | 50% | $27 | $23 |
| $100 | 50% | $54 | $46 |
| $200 | 60% | $75 | $125 |
| $500 | 60% | $187 | $313 |
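The arithmetic behind a projection like this can be sketched as follows. This is a simplified model and not the package's own calculation: it assumes the Haiku-eligible share applies to spend rather than to request count, and that every eligible dollar costs 1/15 as much after routing (the Haiku/Sonnet ratio from the pricing table). The `projectSpend` helper and its parameter names are hypothetical. Real traffic mixes token counts unevenly, so this model will not necessarily reproduce every row of the table.

```typescript
// Hypothetical helper: estimate monthly spend after routing.
// Assumption: `eligibleShare` is the fraction of current Sonnet spend
// that could move to Haiku, and Haiku costs `1 / priceRatio` as much.
function projectSpend(
  monthlySpend: number,
  eligibleShare: number,
  priceRatio = 15, // Sonnet / Haiku per the pricing table ($0.003 / $0.0002)
): { spend: number; savings: number } {
  const staysOnSonnet = monthlySpend * (1 - eligibleShare);
  const movesToHaiku = (monthlySpend * eligibleShare) / priceRatio;
  const spend = staysOnSonnet + movesToHaiku;
  return { spend, savings: monthlySpend - spend };
}

const estimate = projectSpend(50, 0.5);
console.log(estimate.spend.toFixed(2), estimate.savings.toFixed(2));
// prints "26.67 23.33"
```

Under these assumptions, a $50 monthly spend with a 50% eligible share lands close to the first row of the table.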

Usage

// Before
import Anthropic from "@anthropic-ai/sdk";
const client = new Anthropic();

// After — zero other changes needed
import { AnthropicRouter } from "anthropic-router";
const client = new AnthropicRouter();

// Same interface
const response = await client.messages.create({
  model: "auto",  // or omit — defaults to auto-routing
  max_tokens: 1024,
  messages: [{ role: "user", content: "What is the capital of France?" }],
});

console.log(response.routing);
// {
//   requested: "auto",
//   selected: "claude-haiku-4-5-20251001",
//   confidence: 0.94,
//   signals: ["short_prompt", "factual_keywords"],
//   retried: false,
//   latencyMs: 0
// }

Override: explicit model always wins

// Pin a specific call — bypasses routing entirely
const response = await client.messages.create({
  model: "claude-sonnet-4-6",  // explicit model: no routing
  max_tokens: 2048,
  messages: [{ role: "user", content: "Design a scalable auth system." }],
});

Streaming

messages.stream() works as a drop-in: the router classifies the request and injects the selected model before opening the stream. Streams are not buffered, and there is no retry logic on streams, because retrying requires a full response.

const stream = client.messages.stream({
  model: "auto",
  max_tokens: 1024,
  messages: [{ role: "user", content: "Tell me a short story." }],
});

for await (const event of stream) {
  // same as SDK stream events
}

Routing logic

Every request is classified by 11 heuristic signals before the API call. The table below lists nine of them:

| Signal | Direction |
|---|---|
| Short prompt (< 200 tokens) | → Haiku |
| Factual keywords (what is, who is, translate, summarize...) | → Haiku |
| Long prompt (> 800 tokens) | → Sonnet |
| Complex keywords (analyze, debug, architecture, reason through...) | → Sonnet |
| Multi-step instructions | → Sonnet |
| Simple tools (≤ 2 tools, flat schema) | → neutral |
| Complex tools (3+ tools or nested schemas) | → Sonnet |
| Long system prompt (> 500 chars) | → Sonnet |
| Deep conversation (> 5 turns) | → Sonnet |
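To make a few of these signals concrete, here is a minimal sketch of how prompt-length and keyword checks might be implemented. It is not the package's actual classifier. The `detectSignals` function, the 4-characters-per-token estimate, and the `long_prompt` and `complex_keywords` names are assumptions for illustration; only `short_prompt` and `factual_keywords` appear in the sample output earlier in this README. The keyword lists contain just the examples named in the table.

```typescript
type Direction = "haiku" | "sonnet";

interface Signal {
  name: string;
  direction: Direction;
}

// Rough token estimate. About 4 characters per token is an assumption.
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

// Only the keywords listed in the table; the real lists are likely longer.
const FACTUAL = ["what is", "who is", "translate", "summarize"];
const COMPLEX = ["analyze", "debug", "architecture", "reason through"];

// Hypothetical detector covering four of the signals from the table.
function detectSignals(prompt: string): Signal[] {
  const lower = prompt.toLowerCase();
  const tokens = estimateTokens(prompt);
  const signals: Signal[] = [];
  if (tokens < 200) signals.push({ name: "short_prompt", direction: "haiku" });
  if (tokens > 800) signals.push({ name: "long_prompt", direction: "sonnet" });
  if (FACTUAL.some((k) => lower.includes(k))) {
    signals.push({ name: "factual_keywords", direction: "haiku" });
  }
  if (COMPLEX.some((k) => lower.includes(k))) {
    signals.push({ name: "complex_keywords", direction: "sonnet" });
  }
  return signals;
}

const fired = detectSignals("What is the capital of France?");
console.log(fired.map((s) => s.name));
// prints [ 'short_prompt', 'factual_keywords' ]
```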

Conservative default: routes UP (Sonnet) when confidence < 0.85 or fewer than 2 independent signals support downrouting. When in doubt, quality over cost.
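The conservative default can be read as a two-part gate: downroute only when both the confidence threshold and the signal count are satisfied. The sketch below illustrates that rule under stated assumptions. `chooseTier` and the `Classification` shape are hypothetical, and how the package actually computes confidence is not described in this README.

```typescript
type Tier = "haiku" | "sonnet";

// Hypothetical classifier output.
interface Classification {
  confidence: number;     // 0 to 1 score that Haiku is sufficient
  haikuSignals: string[]; // independent signals supporting downrouting
}

// Route down to Haiku only when both conditions hold; otherwise route up.
function chooseTier(c: Classification, confidenceThreshold = 0.85): Tier {
  const enoughConfidence = c.confidence >= confidenceThreshold;
  const enoughSignals = c.haikuSignals.length >= 2;
  return enoughConfidence && enoughSignals ? "haiku" : "sonnet";
}

// Two supporting signals and high confidence: Haiku.
const confident = chooseTier({
  confidence: 0.94,
  haikuSignals: ["short_prompt", "factual_keywords"],
});

// High confidence but only one supporting signal: Sonnet.
const singleSignal = chooseTier({
  confidence: 0.9,
  haikuSignals: ["short_prompt"],
});

console.log(confident, singleSignal); // prints "haiku sonnet"
```

Raising `confidenceThreshold` makes the gate stricter, so fewer requests qualify for Haiku.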

Auto-retry: if a Haiku response looks truncated (stop_reason: max_tokens on a substantive prompt) or matches a refusal pattern, the router retries once on Sonnet automatically. routing.retried is set to true when this happens.

Rate limit fallback: if Haiku returns 429, the router automatically retries on Sonnet and sets routing.fallback to "rate_limit".
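The two escalation paths above can be sketched as a single decision function. This is an illustration only: `escalationReason`, the `HaikuOutcome` shape, the refusal regex, and the `"truncated"` and `"refusal"` labels are hypothetical. Only `"rate_limit"` and the `max_tokens` stop reason come from this README.

```typescript
// Hypothetical summary of what came back from the Haiku call.
interface HaikuOutcome {
  status: number;            // HTTP status of the Haiku call
  stopReason: string | null; // stop_reason from the response, if any
  text: string;              // response text, empty on error
}

type EscalationReason = "rate_limit" | "truncated" | "refusal";

// Assumed refusal pattern; the package's real patterns are not documented here.
const REFUSAL_PATTERN = /^\s*(i can't|i cannot|i'm unable to)\b/i;

// Returns why the request should be retried on Sonnet, or null to keep the Haiku result.
function escalationReason(
  outcome: HaikuOutcome,
  promptIsSubstantive: boolean,
): EscalationReason | null {
  if (outcome.status === 429) return "rate_limit";
  if (outcome.stopReason === "max_tokens" && promptIsSubstantive) return "truncated";
  if (REFUSAL_PATTERN.test(outcome.text)) return "refusal";
  return null;
}

const rateLimited = escalationReason({ status: 429, stopReason: null, text: "" }, true);
const truncated = escalationReason(
  { status: 200, stopReason: "max_tokens", text: "Once upon a" },
  true,
);
const accepted = escalationReason({ status: 200, stopReason: "end_turn", text: "Paris." }, false);

console.log(rateLimited, truncated, accepted); // prints "rate_limit truncated null"
```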

Options

const client = new AnthropicRouter({
  confidenceThreshold: 0.85,  // default; raise to be more conservative
  telemetry: false,            // opt-in only; sends signals, never prompt text
  onMisroute: (event) => {
    // called when you manually override the selected model
    // event: { signals, selectedModel, developerOverrideModel, confidence }
    console.log("Router would have used:", event.selectedModel);
  },
});

Pass an existing Anthropic client as the first argument:

import Anthropic from "@anthropic-ai/sdk";
import { AnthropicRouter } from "anthropic-router";

const anthropic = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });
const client = new AnthropicRouter(anthropic, { telemetry: true });

Routing metadata

Every messages.create() response includes a routing field:

type RoutingMetadata = {
  requested: string;       // "auto" or the model you passed
  selected: string;        // model that was actually called
  selectedTier: "haiku" | "sonnet" | "opus";
  confidence: number;      // 0–1
  signals: string[];       // which signals fired
  override: string | null; // set when you pinned an explicit model
  retried: boolean;
  retryReason: string | null;
  fallback: string | null; // e.g. "rate_limit"
  latencyMs: number;       // classification time (not total request latency)
};
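One way to use this metadata is to collect it across requests and check how traffic is actually being split. The sketch below is hypothetical: `summarizeRouting` is not part of the package, and `RoutingInfo` redeclares only the two `RoutingMetadata` fields it needs so the example stays self-contained.

```typescript
// Subset of the RoutingMetadata fields shown above.
interface RoutingInfo {
  selectedTier: "haiku" | "sonnet" | "opus";
  retried: boolean;
}

interface RoutingSummary {
  total: number;
  byTier: Record<RoutingInfo["selectedTier"], number>;
  retried: number;
  haikuShare: number; // fraction of requests that ended up on Haiku
}

// Hypothetical aggregation over routing metadata collected from responses.
function summarizeRouting(entries: RoutingInfo[]): RoutingSummary {
  const byTier = { haiku: 0, sonnet: 0, opus: 0 };
  let retried = 0;
  for (const entry of entries) {
    byTier[entry.selectedTier] += 1;
    if (entry.retried) retried += 1;
  }
  const total = entries.length;
  const haikuShare = total === 0 ? 0 : byTier.haiku / total;
  return { total, byTier, retried, haikuShare };
}

const summary = summarizeRouting([
  { selectedTier: "haiku", retried: false },
  { selectedTier: "sonnet", retried: true },
  { selectedTier: "haiku", retried: false },
  { selectedTier: "sonnet", retried: false },
]);

console.log(summary.haikuShare, summary.retried); // prints "0.5 1"
```

A high `retried` count relative to `byTier.haiku` would suggest the threshold is too permissive for that workload.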

Accuracy

The classifier is tested against a labeled ground truth dataset of real API call patterns. The CI gate requires > 90% accuracy on every PR. Current: 100% on 30 examples.

Add your own real API calls to tests/ground-truth.ts to calibrate the classifier against your specific usage patterns before deploying.
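An accuracy gate of this kind boils down to comparing classifier output against labels and checking the ratio. The sketch below shows the idea under loud assumptions: the `LabeledExample` shape is hypothetical (the real format of tests/ground-truth.ts is not shown in this README), and `classify` is a stand-in rule, not the package's classifier.

```typescript
type Tier = "haiku" | "sonnet";

// Hypothetical shape for one labeled example.
interface LabeledExample {
  prompt: string;
  expected: Tier;
}

// Stand-in classifier for illustration only: short prompts go to Haiku.
const classify = (prompt: string): Tier =>
  prompt.length < 800 ? "haiku" : "sonnet";

// Fraction of examples where the classifier matches the label.
function accuracy(
  examples: LabeledExample[],
  classifier: (prompt: string) => Tier,
): number {
  if (examples.length === 0) return 0;
  const correct = examples.filter((e) => classifier(e.prompt) === e.expected).length;
  return correct / examples.length;
}

const examples: LabeledExample[] = [
  { prompt: "What is the capital of France?", expected: "haiku" },
  { prompt: "Translate 'hello' to Spanish.", expected: "haiku" },
  { prompt: "Debug this race condition. ".repeat(40), expected: "sonnet" },
];

const score = accuracy(examples, classify);
const passesGate = score > 0.9; // mirrors the "> 90%" CI gate described above
console.log(score, passesGate); // prints "1 true"
```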

What's out of scope in v0

  • Streaming routing with retry: messages.stream() classifies and routes but does not retry on low-confidence responses. Retry logic requires buffering the full response and is planned for v0.2.
  • Multi-provider routing (OpenAI, Gemini): planned post-v1.

License

MIT