@dev-kasibhatla/grate-limiter

v0.1.1


Anticipatory rate-limit orchestration engine for multi-provider systems — predict 429s before they happen


grate-limiter


Anticipatory rate-limit orchestration engine for multi-provider systems.

Stop reacting to 429 Too Many Requests. grate-limiter predicts quota exhaustion before it happens and routes requests to the best available provider — all within a single in-process call, with zero network overhead.

  • Zero dependencies — pure TypeScript, no external runtime packages
  • Browser + Node.js — ESM and CommonJS builds with full TypeScript declarations
  • Anticipatory routing — scores every provider on quota, health, priority, and latency before each request
  • Automatic failover — cooldown tracking with EWMA health decay means degraded providers are bypassed automatically
  • Deterministic testing — built-in MockClock lets you simulate time-based behavior without setTimeout
  • Thread-safe by construction: JavaScript is single-threaded, and state mutations go through a single observe() path

Part of a multi-language monorepo — identical algorithm and conformance tests across Rust, Python, and TypeScript.


Installation

npm install @dev-kasibhatla/grate-limiter
yarn add @dev-kasibhatla/grate-limiter
pnpm add @dev-kasibhatla/grate-limiter

Requirements: Node.js 18+ or any modern browser. No build step required for browser ESM.


Quick Start

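// Values used at runtime in this example
import { GrateLimiter, Dimension, Window, StatusClass } from "@dev-kasibhatla/grate-limiter";

// Types are exported too, for annotating your own code
import type {
  EngineConfig,
  ProviderConfig,
  CapabilityConfig,
  CapabilityProvider,
  QuotaConfig,
  Observation,
  Usage,
  Outcome,
} from "@dev-kasibhatla/grate-limiter";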

// Create the engine
const engine = new GrateLimiter();

// Register providers with their rate-limit quotas
engine.upsertProvider({
  name: "openai",
  quotas: [{ dimension: Dimension.REQUESTS, limit: 5000, window: Window.MINUTE }],
  priority: 10,
  cooldownSeconds: 30,
});

engine.upsertProvider({
  name: "anthropic",
  quotas: [{ dimension: Dimension.REQUESTS, limit: 3000, window: Window.MINUTE }],
  priority: 8,
  cooldownSeconds: 30,
});

// Register a capability (logical operation that can be served by multiple providers)
engine.upsertCapability({
  name: "chat-completion",
  providers: [
    { provider: "openai", priority: 10 },
    { provider: "anthropic", priority: 8 },
  ],
});

// Select the best provider for the next request
const decision = engine.select("chat-completion");
console.log(`Use: ${decision.provider} (score: ${decision.score.toFixed(2)})`);
// → "Use: openai (score: 0.94)"

// After the request completes, report the outcome
engine.observe({
  provider: "openai",
  capability: "chat-completion",
  usage: { requests: 1, tokens: 1200 },
  outcome: { status: StatusClass.SUCCESS, latencyMs: 830 },
});
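
The select, call, observe sequence above can be wrapped in one helper so that no request goes unreported. This is a minimal sketch, not part of the package: `EngineLike`, `ObservationLike`, `callWithRouting`, and the string status values are hypothetical stand-ins that mirror the shapes used in the Quick Start, so the snippet runs on its own.

```typescript
// Hypothetical stand-ins mirroring the shapes used in the Quick Start.
// With the real package you would pass a GrateLimiter and StatusClass values.
type Status = "SUCCESS" | "ERROR";

interface ObservationLike {
  provider: string;
  capability: string;
  usage: { requests: number };
  outcome: { status: Status; latencyMs: number };
}

interface EngineLike {
  select(capability: string): { provider: string };
  observe(obs: ObservationLike): void;
}

// Select a provider, run the request, and always report the outcome,
// whether the request resolved or threw.
async function callWithRouting<T>(
  engine: EngineLike,
  capability: string,
  send: (provider: string) => Promise<T>,
): Promise<T> {
  const { provider } = engine.select(capability);
  const startedAt = Date.now();
  let status: Status = "SUCCESS";
  try {
    return await send(provider);
  } catch (err) {
    status = "ERROR";
    throw err;
  } finally {
    engine.observe({
      provider,
      capability,
      usage: { requests: 1 },
      outcome: { status, latencyMs: Date.now() - startedAt },
    });
  }
}
```

Reporting from `finally` means a thrown error still reaches the engine, which is what lets health tracking react to failures.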

Core Concepts

Providers and Capabilities

A provider is a named upstream service (e.g. "openai", "anthropic") with associated rate-limit quotas. A capability is a logical operation (e.g. "chat-completion", "embeddings") that can be served by one or more providers.

// Provider with multiple quota dimensions
engine.upsertProvider({
  name: "openai-gpt4",
  quotas: [
    { dimension: Dimension.REQUESTS, limit: 500, window: Window.MINUTE },
    { dimension: Dimension.TOKENS, limit: 150_000, window: Window.MINUTE },
    { dimension: Dimension.CONCURRENCY, limit: 20 },
  ],
  priority: 10,
  cooldownSeconds: 60,
});

Scoring Algorithm

Every call to select() scores all eligible providers using a weighted formula:

score = quota_score × 0.40
      + health_score × 0.35
      + priority_score × 0.20
      + latency_score × 0.05

The provider with the highest score wins. Providers in cooldown or below minimum health are excluded entirely.
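
To make the arithmetic concrete, here is a small sketch of the weighted sum. It assumes each component score is already normalized to the range 0.0–1.0; `ScoreComponents`, `compositeScore`, and `pickBest` are hypothetical names for illustration, not the library's internals.

```typescript
// Hypothetical component scores, each assumed to be normalized to [0, 1].
interface ScoreComponents {
  quota: number;
  health: number;
  priority: number;
  latency: number;
}

// Weights copied from the formula above.
const WEIGHTS: ScoreComponents = { quota: 0.4, health: 0.35, priority: 0.2, latency: 0.05 };

function compositeScore(c: ScoreComponents, w: ScoreComponents = WEIGHTS): number {
  return c.quota * w.quota + c.health * w.health + c.priority * w.priority + c.latency * w.latency;
}

// Pick the highest-scoring candidate; returns null when there are none.
function pickBest(candidates: Record<string, ScoreComponents>): string | null {
  let best: string | null = null;
  let bestScore = -Infinity;
  for (const [name, components] of Object.entries(candidates)) {
    const score = compositeScore(components);
    if (score > bestScore) {
      best = name;
      bestScore = score;
    }
  }
  return best;
}
```

With these weights, a provider with nearly exhausted quota (quota score 0.1) and top priority scores about 0.64, while a lower-priority provider with plenty of headroom (quota score 0.9, priority score 0.5) scores about 0.86 and wins.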

Health Tracking

Health is tracked as an Exponentially Weighted Moving Average (EWMA): it decays with each failure and recovers gradually with successes. Providers that hit consecutive failures enter a cooldown period and are bypassed until it expires.

// Observe a rate-limit response — health will decay, cooldown may trigger
engine.observe({
  provider: "openai",
  outcome: { status: StatusClass.RATE_LIMITED, latencyMs: 200 },
  usage: { requests: 1 },
});

// Check if a provider is currently in cooldown
const inCooldown = engine.providerInCooldown("openai");
const health = engine.providerHealth("openai"); // 0.0–1.0
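
For intuition about the decay, the textbook EWMA update looks like the sketch below. Treating a success as a sample of 1 and a failure as a sample of 0 is an assumption of this example, as is the `updateHealth` name; the library's exact update rule may differ.

```typescript
// Standard EWMA update: blend the newest sample into the running value.
// Assumption for this sketch: success counts as 1, failure counts as 0.
function updateHealth(health: number, success: boolean, alpha: number): number {
  const sample = success ? 1 : 0;
  return alpha * sample + (1 - alpha) * health;
}

// Three consecutive failures starting from full health, with alpha = 0.3.
let healthValue = 1.0;
for (let i = 0; i < 3; i++) {
  healthValue = updateHealth(healthValue, false, 0.3);
}
// healthValue is now approximately 0.343 (1.0 → 0.7 → 0.49 → 0.343)
```

A larger alpha gives recent outcomes more weight, so health both drops and recovers faster.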

Quota Strategies

| Strategy | When to use |
|----------|-------------|
| Dimension.REQUESTS | Per-request rate limits (RPM / RPD) |
| Dimension.TOKENS | Token-based limits (TPM / TPD) |
| Dimension.CONCURRENCY | Max simultaneous in-flight requests |
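
For windowed dimensions such as REQUESTS and TOKENS, one simple way to anticipate exhaustion is a linear projection of current usage to the end of the window. The sketch below illustrates that idea only; it is not necessarily how grate-limiter predicts exhaustion, and `WindowUsage`, `projectedUsage`, and `willExhaust` are hypothetical names.

```typescript
// Linear projection for a fixed window: at the current rate, will usage
// exceed the limit before the window resets? All names are hypothetical.
interface WindowUsage {
  limit: number;     // e.g. 5000 requests
  used: number;      // consumed so far in this window
  elapsedMs: number; // time since the window started
  windowMs: number;  // e.g. 60_000 for a one-minute window
}

function projectedUsage(u: WindowUsage): number {
  // Nothing to extrapolate from at the very start of a window.
  if (u.elapsedMs <= 0) return u.used;
  return (u.used / u.elapsedMs) * u.windowMs;
}

function willExhaust(u: WindowUsage): boolean {
  return projectedUsage(u) > u.limit;
}
```

For example, 60 of 100 requests used halfway through a one-minute window projects to roughly 120 by the end of the window, so `willExhaust` returns true well before the provider would answer with a 429.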


Deterministic Testing

Use MockClock to write fully deterministic tests — no real timers, no flakiness:

import { GrateLimiter, MockClock, Dimension, Window, StatusClass } from "@dev-kasibhatla/grate-limiter";
import { describe, it, expect } from "vitest";

describe("rate limit failover", () => {
  it("routes to backup after primary hits limit", () => {
    const clock = new MockClock();
    const engine = new GrateLimiter({ clock });

    engine.upsertProvider({
      name: "primary",
      quotas: [{ dimension: Dimension.REQUESTS, limit: 2, window: Window.MINUTE }],
      priority: 10,
      cooldownSeconds: 30,
    });
    engine.upsertProvider({
      name: "backup",
      quotas: [{ dimension: Dimension.REQUESTS, limit: 100, window: Window.MINUTE }],
      priority: 5,
      cooldownSeconds: 30,
    });
    engine.upsertCapability({
      name: "api",
      providers: [{ provider: "primary", priority: 10 }, { provider: "backup", priority: 5 }],
    });

    // Exhaust primary with rate-limited responses
    for (let i = 0; i < 3; i++) {
      clock.advanceMs(1000);
      engine.observe({
        provider: "primary",
        outcome: { status: StatusClass.RATE_LIMITED, latencyMs: 50 },
        usage: { requests: 1 },
      });
    }

    // Should now route to backup
    const decision = engine.select("api");
    expect(decision.provider).toBe("backup");

    // After cooldown expires, primary is eligible again
    clock.advanceSecs(60);
    const recovered = engine.select("api");
    expect(recovered.provider).toBe("primary");
  });
});

API Reference

GrateLimiter

class GrateLimiter {
  constructor(config?: EngineConfig)

  // Register or update a provider and its quota configuration
  upsertProvider(config: ProviderConfig): void

  // Register or update a capability and its provider mappings
  upsertCapability(config: CapabilityConfig): void

  // Select the best provider for a capability
  // Throws NoAvailableProviders if every provider is in cooldown or below minimum health
  // Throws UnknownCapability if capability is not registered
  select(capability: string): Decision

  // Record the outcome of a completed request
  // Throws UnknownProvider if provider is not registered
  observe(obs: Observation): void

  // Query provider state
  providerHealth(provider: string): number | null
  providerInCooldown(provider: string): boolean
  remainingQuota(provider: string, dimension: Dimension): number | null

  // Get aggregate metrics
  metrics(): MetricsSnapshot
}

Decision

interface Decision {
  provider: string       // Chosen provider name
  score: number          // Composite score (0.0–1.0)
  alternatives: Alternative[]  // Other eligible providers, ranked
  breakdown: ScoreBreakdown    // Score components for observability
}

EngineConfig

interface EngineConfig {
  clock?: Clock               // Override for testing (use MockClock)
  scoring?: ScoringWeights    // Adjust score component weights
  health?: HealthConfig       // Tune EWMA decay and cooldown thresholds
}

MockClock

class MockClock {
  advanceMs(ms: number): void
  advanceSecs(secs: number): void
  now(): Timestamp
}
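
The idea behind an injectable clock can be shown with a few lines. The `ManualClock` below is a hypothetical stand-in with the same method names as MockClock, written only to illustrate the technique; it returns plain milliseconds from `now()`, whereas the library's `Timestamp` type is not specified here.

```typescript
// A hypothetical manual clock: time only moves when the test says so.
class ManualClock {
  private currentMs = 0;

  advanceMs(ms: number): void {
    if (ms < 0) throw new RangeError("cannot move time backwards");
    this.currentMs += ms;
  }

  advanceSecs(secs: number): void {
    this.advanceMs(secs * 1000);
  }

  // Milliseconds elapsed since the clock was created.
  now(): number {
    return this.currentMs;
  }
}

// Example: a cooldown check that depends only on the injected clock.
function cooldownExpired(clock: ManualClock, startedAtMs: number, cooldownSeconds: number): boolean {
  return clock.now() - startedAtMs >= cooldownSeconds * 1000;
}
```

Because nothing reads the system time, a test can step across a 30-second cooldown instantly and get the same result on every run.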

CommonJS Usage

const { GrateLimiter, Dimension, Window, StatusClass } = require("@dev-kasibhatla/grate-limiter");

const engine = new GrateLimiter();
engine.upsertProvider({
  name: "provider-a",
  quotas: [{ dimension: Dimension.REQUESTS, limit: 1000, window: Window.MINUTE }],
  priority: 10,
  cooldownSeconds: 30,
});

Error Handling

import { UnknownCapability, UnknownProvider, NoAvailableProviders } from "@dev-kasibhatla/grate-limiter";

try {
  const decision = engine.select("chat-completion");
  // use decision...
} catch (e) {
  if (e instanceof NoAvailableProviders) {
    // All providers are in cooldown or unhealthy
    // Implement circuit-breaker or return 503
  } else if (e instanceof UnknownCapability) {
    // Capability was never registered
  }
}

Advanced Configuration

import { GrateLimiter, defaultScoringWeights, defaultHealthConfig } from "@dev-kasibhatla/grate-limiter";

const engine = new GrateLimiter({
  scoring: {
    ...defaultScoringWeights(),
    quota: 0.50,    // Weight the quota component more heavily
    health: 0.30,
    priority: 0.15,
    latency: 0.05,
  },
  health: {
    ...defaultHealthConfig(),
    ewmaAlpha: 0.3,                   // Faster decay on failures
    cooldownThreshold: 0.2,           // Enter cooldown below 20% health
    minHealthForSelection: 0.1,       // Exclude below 10%
    maxCooldownSecs: 300,             // Cap cooldown at 5 minutes
  },
});
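
The custom weights above sum to 1.0, which keeps the composite score in the 0.0–1.0 range documented for Decision.score. Whether the engine normalizes weights itself is not stated, so a small guard like the following sketch can rescale them before they are passed in. `Weights` and `normalizeWeights` are hypothetical helpers, not exports of the package.

```typescript
// Hypothetical shape matching the four weight keys used above.
interface Weights {
  quota: number;
  health: number;
  priority: number;
  latency: number;
}

// Rescale weights so they sum to 1; rejects negative, non-finite, or all-zero input.
function normalizeWeights(w: Weights): Weights {
  const values = [w.quota, w.health, w.priority, w.latency];
  if (values.some((v) => !isFinite(v) || v < 0)) {
    throw new RangeError("weights must be finite and non-negative");
  }
  const total = values.reduce((sum, v) => sum + v, 0);
  if (total === 0) {
    throw new RangeError("at least one weight must be positive");
  }
  return {
    quota: w.quota / total,
    health: w.health / total,
    priority: w.priority / total,
    latency: w.latency / total,
  };
}
```

For instance, relative weights of 2, 1, 1, and 0 normalize to 0.5, 0.25, 0.25, and 0.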

Contributing

Issues and pull requests are welcome at github.com/dev-kasibhatla/grate-limiter.


License

Apache-2.0 © Aditya Kasibhatla