ratelimit-flex

v3.3.0

Flexible rate limiting for Node.js (Express, Fastify, NestJS, Hono) — Redis, PostgreSQL, MongoDB, DynamoDB, or in-memory stores; sliding window, token bucket, fixed window

ratelimit-flex

Flexible, TypeScript-first rate limiting for Node.js with Express, Fastify, NestJS, and Hono.

License: MIT

Features

  • Three algorithms: Sliding window, Token bucket, Fixed window — implemented across MemoryStore, RedisStore (Lua), PgStore, MongoStore (exact for all strategies), and DynamoStore (exact fixed window & token bucket; sliding window uses a weighted approximation on DynamoDB — see docs/stores/dynamo.md)
  • Frameworks: Express and Fastify (separate entry for Fastify to keep bundles lean); NestJS (ratelimit-flex/nestjs) and Hono (ratelimit-flex/hono)
  • Stores: MemoryStore, RedisStore, ClusterStore, PgStore, MongoStore, DynamoStore
  • Request queuing: Queue over-limit requests instead of rejecting them immediately (expressQueuedRateLimiter, fastifyQueuedRateLimiter, createRateLimiterQueue)
  • TypeScript-first: strict types, discriminated options where it matters
  • Redis resilience: insurance limiter fallback, circuit breaker, counter sync on recovery; or fail-open / fail-closed when Redis is unavailable without insurance
  • In-memory block shielding: InMemoryShield / inMemoryBlock — cache blocked keys in process memory so hot keys stop hitting Redis under attack
  • Metrics & observability (Express & Fastify): aggregated snapshots, Prometheus, OpenTelemetry — metrics: true (see the full metrics docs)
  • Weighted requests: incrementCost (or store.increment(..., { cost })) so expensive endpoints consume more quota than cheap ones
  • Presets: singleInstancePreset, multiInstancePreset, resilientRedisPreset, clusterPreset, queuedClusterPreset, apiGatewayPreset, authEndpointPreset, publicApiPreset, postgresPreset, mongoPreset, dynamoPreset
  • Limiter composition: compose.all(), compose.overflow(), compose.firstAvailable(), compose.race(), compose.windows(), compose.withBurst(), nested ComposedStore
  • Programmatic key management: KeyManager for blocks, penalties, rewards, events, audit log, and optional admin HTTP API
  • Security: key cardinality, Redis namespaces, Lua usage, and locking down admin routes

Comparison with rate-limiter-flexible

  • Store backends: rate-limiter-flexible ships more drivers (Memcached, MySQL, SQLite, Etcd, Prisma, Drizzle, and others). ratelimit-flex focuses on fewer, high-traffic integrations: Redis, PostgreSQL, MongoDB, and DynamoDB, each with atomic window semantics (where the datastore allows) and shared test coverage.

Installation

npm install ratelimit-flex
yarn add ratelimit-flex
pnpm add ratelimit-flex

Peer dependencies (install only what you use):

| Package | When you need it |
|---------|------------------|
| express (+ @types/express for TS) | Express middleware |
| fastify, fastify-plugin | Fastify plugin (ratelimit-flex/fastify) |
| @nestjs/common, @nestjs/core (+ optional @nestjs/graphql for GraphQL context) | NestJS module (ratelimit-flex/nestjs) |
| hono | Hono middleware (ratelimit-flex/hono) |
| ioredis | RedisStore with url (or use your own Redis client adapter) |
| pg | PgStore (ratelimit-flex/postgres) |
| mongodb | MongoStore (ratelimit-flex/mongo) |
| @aws-sdk/client-dynamodb, @aws-sdk/lib-dynamodb | DynamoStore (ratelimit-flex/dynamo) |
| prom-client | Optional: metrics.prometheus.registry integration |
| @opentelemetry/api | Optional: metrics.openTelemetry.meter integration |

All peers are optional at install time; the runtime you choose must be present when you import that integration.

Node.js: >= 20 (see package.json engines).

Quick Start

Redis (shared limits across instances)

import express from 'express';
import { expressRateLimiter, multiInstancePreset } from 'ratelimit-flex';

const app = express();
app.use(expressRateLimiter(multiInstancePreset({ url: process.env.REDIS_URL! })));
app.get('/health', (_req, res) => res.json({ ok: true }));

PostgreSQL

See docs/stores/postgres.md for schema, indexes, and operations notes.

import express from 'express';
import { Pool } from 'pg';
import { expressRateLimiter, postgresPreset } from 'ratelimit-flex';
import { pgStoreSchema } from 'ratelimit-flex/postgres';

const pool = new Pool({ connectionString: process.env.DATABASE_URL });
// Run once during deploy / migrations (not per request):
await pool.query(pgStoreSchema);

const app = express();
app.use(expressRateLimiter(postgresPreset({ pool })));

MongoDB

See docs/stores/mongo.md for TTL indexes and client shapes.

import express from 'express';
import { MongoClient } from 'mongodb';
import { expressRateLimiter, mongoPreset } from 'ratelimit-flex';

const client = new MongoClient(process.env.MONGODB_URI!);
await client.connect();

const app = express();
app.use(expressRateLimiter(mongoPreset({ client, dbName: 'myapp' })));

DynamoDB

See docs/stores/dynamo.md for table creation, TTL, and sliding-window behavior.

import {
  CreateTableCommand,
  DynamoDBClient,
  UpdateTimeToLiveCommand,
} from '@aws-sdk/client-dynamodb';
import { DynamoDBDocumentClient } from '@aws-sdk/lib-dynamodb';
import express from 'express';
import { dynamoPreset, expressRateLimiter } from 'ratelimit-flex';
import {
  dynamoStoreEnableTtlParams,
  dynamoStoreTableSchema,
} from 'ratelimit-flex/dynamo';

const raw = new DynamoDBClient({ region: process.env.AWS_REGION ?? 'us-east-1' });
// Once at deploy (prefer CDK / Terraform in production):
await raw.send(new CreateTableCommand(dynamoStoreTableSchema));
await raw.send(new UpdateTimeToLiveCommand(dynamoStoreEnableTtlParams));

const doc = DynamoDBDocumentClient.from(raw);
const app = express();
app.use(expressRateLimiter(dynamoPreset({ client: doc, tableName: 'rate_limits' })));

Express (in-process defaults)

import express from 'express';
import rateLimit, { RateLimitStrategy } from 'ratelimit-flex';

const app = express();

// Sliding window (default) - smooth, accurate rate limiting
app.use(rateLimit({
  strategy: RateLimitStrategy.SLIDING_WINDOW, // optional, this is the default
  maxRequests: 100,
  windowMs: 60_000,
}));

// Token bucket - allows bursts
app.use(rateLimit({
  strategy: RateLimitStrategy.TOKEN_BUCKET,
  tokensPerInterval: 20,
  interval: 60_000,
  bucketSize: 60,
}));

// Fixed window - simplest, lowest memory
app.use(rateLimit({
  strategy: RateLimitStrategy.FIXED_WINDOW,
  maxRequests: 100,
  windowMs: 60_000,
}));

app.get('/health', (_req, res) => res.json({ ok: true }));

Fastify (same strategies)

import Fastify from 'fastify';
import { fastifyRateLimiter, RateLimitStrategy } from 'ratelimit-flex/fastify';

const app = Fastify();

// Sliding window (default)
await app.register(fastifyRateLimiter, {
  strategy: RateLimitStrategy.SLIDING_WINDOW,
  maxRequests: 100,
  windowMs: 60_000,
});

app.get('/health', async () => ({ ok: true }));

⚠️ Security Considerations: Before deploying to production, review Security and abuse for guidance on key cardinality, Redis namespaces, and admin API authentication.

Framework Integration

NestJS

// app.module.ts
import { Controller, Inject, Injectable, Module, Post } from '@nestjs/common';
import { ConfigModule, ConfigService } from '@nestjs/config';
import { KeyManager, RedisStore } from 'ratelimit-flex';
import { RateLimit, RateLimitModule, SkipRateLimit, RATE_LIMIT_KEY_MANAGER } from 'ratelimit-flex/nestjs';

@Module({
  imports: [
    RateLimitModule.forRoot({
      maxRequests: 100,
      windowMs: 60_000,
    }),
  ],
})
export class AppModule {}

// Async config with ConfigService (use in @Module({ imports: [...] }))
@Module({
  imports: [
    RateLimitModule.forRootAsync({
      imports: [ConfigModule],
      inject: [ConfigService],
      useFactory: (config: ConfigService) => ({
        store: new RedisStore({ url: config.get('REDIS_URL')!, /* ... */ }),
        maxRequests: config.get('RATE_LIMIT_MAX'),
      }),
    }),
  ],
})
export class AppModuleAsync {}

// Per-route override (RateLimit and SkipRateLimit are already imported above)

@Controller('auth')
export class AuthController {
  @RateLimit({ maxRequests: 5, windowMs: 60_000 })
  @Post('login')
  async login() {
    // ...
  }
}

@SkipRateLimit()
@Controller('health')
export class HealthController {
  // ...
}

// Inject store/keyManager in services
@Injectable()
export class AdminService {
  constructor(@Inject(RATE_LIMIT_KEY_MANAGER) private km: KeyManager) {}
  async blockUser(key: string) {
    await this.km.block(key, 3600_000);
  }
}

NestJS: Per-Route Configuration

RateLimitGuard uses the same RateLimitEngine and backing store for the whole app (or feature module). Per-route @RateLimit({ ... }) can override maxRequests, windowMs, cost, and keyGenerator.

Per-route strategy: The module uses one strategy for all routes. To apply different algorithms (e.g. token bucket vs sliding window) to different routes, register multiple RateLimitModule instances in separate feature modules with different strategy settings.

Performance note: The guard caches one RateLimitEngine per handler. Prefer static limits in decorators; avoid mutating reflected metadata at runtime.

NestJS: KeyManager Lifecycle

Simple rule: The module destroys KeyManagers it creates. User-supplied KeyManagers are never touched by the module.

  • Auto-created (from penaltyBox): Module calls keyManager.destroy() on onModuleDestroy
  • User-supplied (passed via keyManager option): You manage the lifecycle — call destroy() in your own OnModuleDestroy hook
  • Testing: await app.close() handles cleanup for auto-created KeyManagers
  • Non-Nest apps: Call keyManager.destroy() when shutting down

NestJS: globalGuard and module scope

globalGuard: true (default):

  • Registers APP_GUARD for automatic rate limiting on all routes
  • Makes the module global — RATE_LIMIT_* injection tokens available everywhere
  • Use @SkipRateLimit() decorator to exclude specific controllers/routes

globalGuard: false:

  • Does NOT register APP_GUARD
  • Module is NOT global — feature modules must add imports: [RateLimitModule] to access tokens
  • Manually apply @UseGuards(RateLimitGuard) where needed

Upgrading from v2.x? See the Migration Guide for breaking changes in v3.0.0.

Hono

import { Hono } from 'hono';
import { rateLimiter } from 'ratelimit-flex/hono';

const app = new Hono();

// Basic usage
const limiter = rateLimiter({
  maxRequests: 100,
  windowMs: 60_000,
  keyGenerator: (c) => c.req.header('x-api-key') ?? 'anon',
});

app.use('*', limiter);

// Per-route
app.post('/login', rateLimiter({ maxRequests: 5, windowMs: 60_000 }), async (c) => {
  return c.json({ ok: true });
});

// With Redis and in-memory shield
import { RedisStore } from 'ratelimit-flex';

const REDIS_URL = process.env.REDIS_URL!;

app.use(
  '*',
  rateLimiter({
    store: new RedisStore({ url: REDIS_URL }),
    maxRequests: 100,
    windowMs: 60_000,
    standardHeaders: 'draft-8',
    inMemoryBlock: true, // Enable DoS protection
  }),
);

// With metrics
const limiterWithMetrics = rateLimiter({
  maxRequests: 100,
  windowMs: 60_000,
  metrics: {
    enabled: true,
    intervalMs: 10_000,
  },
});

app.use('*', limiterWithMetrics);

// Access metrics
app.get('/metrics', (c) => {
  const snapshot = limiterWithMetrics.getMetricsSnapshot();
  return c.json(snapshot);
});

// Cleanup on shutdown
process.on('SIGTERM', async () => {
  await limiterWithMetrics.shutdown();
  process.exit(0);
});

// Queued rate limiter (wait instead of reject)
import { queuedRateLimiter } from 'ratelimit-flex/hono';

app.use(
  '/api/*',
  queuedRateLimiter({
    maxRequests: 10,
    windowMs: 60_000,
    maxQueueSize: 50,
    maxQueueTimeMs: 30_000,
  }),
);

// WebSocket rate limiting
import { webSocketLimiter } from 'ratelimit-flex/hono';
import { upgradeWebSocket } from 'hono/cloudflare-workers';

app.get(
  '/ws',
  webSocketLimiter({
    maxRequests: 10,
    windowMs: 60_000,
    keyGenerator: (c) => c.req.header('x-api-key') ?? 'anon',
  }),
  upgradeWebSocket(() => ({
    onMessage(event, ws) {
      ws.send('pong');
    },
  })),
);

Hono: engine parity

Same options as Express: rateLimiter accepts the full merged RateLimitOptions surface — including limits, compose.windows / ComposedStore, draft, groupedWindowStores, penaltyBox, keyManager, onLayerBlock, and incrementCost. Composed layers are available as c.get('rateLimitComposed') (same idea as Express req.rateLimitComposed).

queuedRateLimiter: Uses the same merge path as rateLimiter (full engine options: limits, composed store, inMemoryBlock, metrics, cost / incrementCost, allowlist/blocklist, standard headers, etc.). The returned handler matches rateLimiter for observability (metricsManager, shield, keyManager, openTelemetryAdapter, event hooks, shutdown, …) and adds queue. It still drives RateLimiterQueue via store.increment only — it does not run RateLimitEngine, so engine-only behavior is unavailable: no draft, no pre-increment keyManager / penaltyBox enforcement, and no c.get('rateLimitComposed'). Same trade-off as Express expressQueuedRateLimiter.

skipFailedRequests / skipSuccessfulRequests: The middleware awaits next() after a successful consume, then uses resolvedHonoRollbackStatus (exported from ratelimit-flex/hono) so that a missing c.res, a status of 0, or an invalid c.res.status is treated as 200 before the rollback is applied. Rollbacks use resolveIncrementOpts / matchingDecrementOptions for weighted, grouped, and composed stores (same as Express / Fastify).

Cloudflare Workers: Pass waitUntil: (p) => c.executionCtx.waitUntil(p) so post-response decrement work for skip-response rules is scheduled on the execution context (optional on Node).

Custom rollback rules: If you need logic beyond HTTP status (e.g. body shape), add middleware after rateLimiter and call store.decrement with resolveIncrementOpts / matchingDecrementOptions; use HONO_RATE_LIMIT_INCREMENT_COST with the cost option for weighted quota.

Core Features

In-memory block shielding

Problem statement

Under DoS conditions, every blocked request still hits Redis — 100k req/sec from an attacker means 100k Redis calls/sec from your own app. InMemoryShield caches blocked keys in local memory so subsequent requests for the same key never touch the store. Result: 7x+ faster under attack, 99%+ fewer store calls.

Quick start

import { expressRateLimiter, RedisStore, shield } from 'ratelimit-flex';

// Option 1: via middleware options (simplest)
app.use(expressRateLimiter({
  store: new RedisStore({ url: REDIS_URL, /* ... */ }),
  maxRequests: 100,
  windowMs: 60_000,
  inMemoryBlock: true, // shield kicks in at maxRequests
}));

// Option 2: explicit shield with custom config
const shielded = shield(new RedisStore({ /* ... */ }), {
  blockOnConsumed: 100,
  maxBlockedKeys: 10_000,
  onBlock: (key) => console.log(`Shielded: ${key}`),
});
app.use(expressRateLimiter({ store: shielded, maxRequests: 100, windowMs: 60_000 }));

Metrics

const metrics = limiter.shield?.getMetrics();
// {
//   blockedKeyCount: 42,        // keys currently blocked in memory
//   storeCallsSaved: 98721,     // total store calls avoided
//   totalKeysBlocked: 150,      // total keys blocked since startup
//   totalKeysExpired: 80,       // keys removed due to window expiry
//   totalKeysEvicted: 28,       // keys removed due to LRU eviction
//   hitRate: 0.993,             // cache hit rate
//   storeCalls: 684             // actual store calls made
// }

MetricsManager and periodic snapshots: When metrics are enabled, middleware passes the same InMemoryShield instance used as the engine store into MetricsManager. Each onMetrics snapshot may include shield — that object is shield.getMetrics() for that instance (blocked-key counts, hit rate, store calls avoided, etc.). Request, block, and latency totals still describe traffic through the engine, which calls increment on the outer store. If you pass an InMemoryShield as store and set inMemoryBlock: true, a second shield wraps the first; snapshot.shield reflects the outer layer only, and in non-production a one-time console.warn flags possible double-shielding (intentional stacking is supported).

How it works

Each request first checks an in-memory map for the key: if the key is still “shielded” (blocked and not yet expired), the limiter returns the cached blocked result in about 0.01ms — no Redis round-trip. If there is no entry, or it has expired, the request takes the slow path: increment() on the backing store (typically 2–5ms for Redis, depending on network and load). When the store shows the key has consumed enough quota, the shield records that state locally and keeps serving blocked responses from RAM until the block window expires or you invalidate the entry (for example via KeyManager).
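The fast path / slow path split described above can be sketched in a few lines. This is a simplified illustration, not the library's InMemoryShield: the StoreResult shape, the createShield helper, and the synchronous increment callback are all assumptions made to keep the example small (a real backing store such as Redis would be asynchronous).

```typescript
// Hypothetical types for illustration; not the library's RateLimitStore interface.
type StoreResult = { blocked: boolean; resetAt: number };
type Increment = (key: string) => StoreResult;

function createShield(increment: Increment, now: () => number = Date.now) {
  const blockedUntil = new Map<string, number>(); // key -> block expiry (epoch ms)
  let storeCallsSaved = 0;

  return {
    increment(key: string): StoreResult {
      const expiry = blockedUntil.get(key);
      if (expiry !== undefined) {
        if (expiry > now()) {
          storeCallsSaved += 1; // fast path: answered from memory
          return { blocked: true, resetAt: expiry };
        }
        blockedUntil.delete(key); // block window is over: forget the entry
      }
      const result = increment(key); // slow path: ask the backing store
      if (result.blocked) blockedUntil.set(key, result.resetAt);
      return result;
    },
    getSaved: (): number => storeCallsSaved,
  };
}
```

Once a key is recorded as blocked, repeated requests for it cost a Map lookup rather than a store call, which is where the reduction in store traffic under attack comes from.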

InMemoryShield implements RateLimitStore: wrap Redis, a composed store, or any custom implementation; use it with compose.* and multi-layer setups; expose shield metrics alongside Prometheus/OpenTelemetry; opt into onBlock, onExpire, and onShieldHit callbacks; and wire KeyManager so reward, unblock, and delete operations clear stale shield entries.

Programmatic key management

ratelimit-flex exposes a KeyManager for programmatic control of rate limit keys. Block abusive clients, apply penalty/reward points, inspect state, and react to events — all with full TypeScript types, an audit trail, and optional Redis persistence.

Basic usage

import express from 'express';
import { KeyManager, MemoryStore, RateLimitStrategy, expressRateLimiter } from 'ratelimit-flex';

const app = express();
const store = new MemoryStore({ strategy: RateLimitStrategy.SLIDING_WINDOW, windowMs: 60_000, maxRequests: 100 });
const keyManager = new KeyManager({ store, maxRequests: 100, windowMs: 60_000 });

const limiter = expressRateLimiter({ store, keyManager });
app.use(limiter);

// Programmatic control — from an admin route, webhook handler, etc.
await keyManager.block('abusive-ip', 3600_000, { type: 'manual', message: 'Spam detected' });
await keyManager.penalty('suspicious-user', 5);
await keyManager.reward('verified-user', 10);
const state = await keyManager.get('any-key');

Escalating penalties

import { KeyManager, exponentialEscalation } from 'ratelimit-flex';

const keyManager = new KeyManager({
  store,
  maxRequests: 100,
  windowMs: 60_000,
  penaltyBlockThreshold: 3,
  penaltyEscalation: exponentialEscalation(60_000), // 1min, 2min, 4min, 8min...
});
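The doubling schedule in the comment above (1min, 2min, 4min, 8min) can be written out as a small function. This is a sketch of the arithmetic only: exponentialBlockMs and its maxMs cap are hypothetical names, and the library's exponentialEscalation may use a different signature or add caps of its own.

```typescript
// Hypothetical helper; not the library's exponentialEscalation.
// offenseCount is 1 for the first threshold breach, 2 for the second, and so on.
function exponentialBlockMs(baseMs: number, offenseCount: number, maxMs: number = Infinity): number {
  if (offenseCount < 1) return 0; // no breach yet, no block
  const duration = baseMs * 2 ** (offenseCount - 1); // base, 2x base, 4x base, ...
  return Math.min(duration, maxMs);
}
```

Capping the duration is a design choice worth considering in any escalation scheme, since uncapped doubling reaches multi-day blocks after about a dozen breaches with a one-minute base.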

Event-driven alerting

keyManager.on('blocked', ({ key, reason }) => {
  alerting.send(`Key ${key} blocked: ${reason.type}`);
});

Admin endpoints

import { createAdminRouter } from 'ratelimit-flex';

app.use('/admin/ratelimit', authMiddleware, createAdminRouter(keyManager));
// GET /admin/ratelimit/keys/:key
// POST /admin/ratelimit/keys/:key/block
// etc.

What KeyManager provides

KeyManager gives you typed block reasons (manual, penalty-escalation, abuse-pattern, custom), an event emitter (blocked, unblocked, penalized, rewarded, and more), an audit log with filtering, escalation strategies for automatic penalty blocks, optional admin REST endpoints (createAdminRouter, fastifyAdminPlugin), and optional Redis-backed block persistence (RedisBlockStore) so block state can be shared across processes.

Redis-backed block persistence

Share block state across processes using RedisBlockStore:

import { KeyManager, RedisBlockStore, RedisStore, RateLimitStrategy } from 'ratelimit-flex';
import Redis from 'ioredis';

// Create a single Redis client instance
const redis = new Redis(process.env.REDIS_URL!);

// Share the client between RedisStore (for rate limit counters) and RedisBlockStore (for blocks)
const store = new RedisStore({
  client: redis,
  strategy: RateLimitStrategy.SLIDING_WINDOW,
  windowMs: 60_000,
  maxRequests: 100,
});

const blockStore = new RedisBlockStore(redis, { keyPrefix: 'rlf:blocks:' });

const keyManager = new KeyManager({
  store,
  blockStore,
  maxRequests: 100,
  windowMs: 60_000,
  syncIntervalMs: 5000, // Pull remote blocks every 5 seconds
});

// Blocks are now persisted to Redis and visible across all processes
await keyManager.block('abusive-ip', 3600_000, { type: 'manual', message: 'Spam' });

Cross-process consistency: KeyManager syncs blocks from Redis every syncIntervalMs (default 5000ms). Call await keyManager.syncBlocks() manually for immediate consistency.

Migrating from penaltyBox

Why you cannot set penaltyBox and keyManager together: mergeRateLimiterOptions throws if both appear in the same options object. penaltyBox uses the engine’s built-in violation counter and penaltyUntil map. A user-supplied KeyManager adds a separate blocking and penalty-point system (penalty(), escalation, audit). Allowing both would pit two policies against each other for the same keys.

Option A — keep penaltyBox: If you only need “N real rate-limit blocks within violationWindowMs, then ban for penaltyDurationMs”, keep penaltyBox and do not pass your own keyManager. (Frameworks may still synthesize an internal KeyManager for Nest lifecycle or related wiring when you only use penaltyBox; that is not the same as configuring both options yourself.)

Option B — migrate to an explicit KeyManager: Drop penaltyBox and drive bans through KeyManager. Map fields roughly like this:

| penaltyBox | KeyManager |
|--------------|----------------|
| violationsThreshold | penaltyBlockThreshold (penalty points before an automatic block) |
| penaltyDurationMs | penaltyBlockDurationMs (base duration), or replace with penaltyEscalation for longer blocks on repeat offenses |
| onPenalty | keyManager.on('blocked', …) (and/or audit entries) |

The engine does not call keyManager.penalty() when a request hits the rate limit — you wire that yourself, typically from onLimitReached, so each limit hit adds a penalty point toward the threshold:

Before (penaltyBox):

app.use(
  expressRateLimiter({
    store,
    maxRequests: 100,
    windowMs: 60_000,
    penaltyBox: {
      violationsThreshold: 3,
      violationWindowMs: 3_600_000,
      penaltyDurationMs: 60_000,
    },
  }),
);

After (KeyManager + onLimitReached + escalation):

import { expressRateLimiter, KeyManager, exponentialEscalation } from 'ratelimit-flex';

const keyGenerator = (req: import('express').Request) =>
  /* same key you use for rate limiting, e.g. forwarded IP */ String(req.ip ?? '');

const keyManager = new KeyManager({
  store,
  maxRequests: 100,
  windowMs: 60_000,
  penaltyBlockThreshold: 3,
  penaltyEscalation: exponentialEscalation(60_000), // 1m, 2m, 4m, … after each threshold breach
});

app.use(
  expressRateLimiter({
    store,
    maxRequests: 100,
    windowMs: 60_000,
    keyGenerator,
    keyManager,
    onLimitReached: async (req) => {
      await keyManager.penalty(keyGenerator(req), 1);
    },
  }),
);

keyManager.on('blocked', ({ key, reason }) => {
  console.log(`Blocked: ${key}`, reason);
});

Semantics note: penaltyBox counts blocks in a sliding violationWindowMs (default one hour). KeyManager penalty points are tracked in an adjustment window tied to the limiter’s windowMs, not to violationWindowMs. If your old config relied on a long violation window and a short rate-limit window, either keep penaltyBox or add your own sliding-window counting before calling penalty().
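If you go the "add your own sliding-window counting" route, one possible shape is a small per-key counter that remembers violation timestamps over the long window. The ViolationCounter class below is hypothetical (it is not part of ratelimit-flex) and keeps its state in process memory, so it only approximates the old behavior on a single instance.

```typescript
// Hypothetical helper: counts limit hits per key over a long sliding window.
class ViolationCounter {
  private readonly hits = new Map<string, number[]>();
  private readonly windowMs: number;
  private readonly threshold: number;

  constructor(windowMs: number, threshold: number) {
    this.windowMs = windowMs;
    this.threshold = threshold;
  }

  /** Records one violation; returns true once the threshold is reached inside the window. */
  record(key: string, now: number = Date.now()): boolean {
    const cutoff = now - this.windowMs;
    const recent = (this.hits.get(key) ?? []).filter((t) => t > cutoff);
    recent.push(now);
    if (recent.length >= this.threshold) {
      this.hits.delete(key); // start counting afresh after escalating
      return true;
    }
    this.hits.set(key, recent);
    return false;
  }
}
```

In onLimitReached you would call record() with the same key your keyGenerator produces and only escalate through KeyManager when it returns true.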

Benefits of Option B:

  • Typed block reasons (manual, penalty-escalation, abuse-pattern, custom)
  • Event system for real-time alerting
  • Audit log with filtering
  • Escalation strategies (exponential, fibonacci, etc.)
  • Admin HTTP endpoints
  • Redis-backed block persistence

Security and abuse

Key cardinality and keyGenerator

Rate limit state (memory stores, InMemoryShield block maps, KeyManager bookkeeping, Redis keys, etc.) grows with distinct storage keys. A keyGenerator that returns a new high-cardinality value per request (full URL including unbounded query strings, raw JWTs, unbounded device fingerprints) lets attackers inflate memory or Redis usage.

Mitigations: Prefer stable, low-cardinality identifiers (user id, tenant id, API key id). Normalize or hash untrusted inputs before using them as keys. The library does not cap key string length—enforce a maximum or digest in your keyGenerator if inputs are user-controlled. Use InMemoryShieldOptions.maxBlockedKeys and related limits where applicable.
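As one way to apply the length advice above, a keyGenerator can pass short, well-formed identifiers through and replace anything else with a fixed-length digest. The boundedKey helper, the 64-character cap, and the allowed character set are illustrative assumptions. Note that hashing bounds key length, not the number of distinct keys, so stable low-cardinality identifiers are still the first line of defense.

```typescript
import { createHash } from 'node:crypto';

const MAX_RAW_KEY_LENGTH = 64; // assumed cap; tune for your identifiers

/** Builds a length-bounded storage key from an untrusted identifier. */
function boundedKey(namespace: string, raw: string | undefined): string {
  const value = (raw ?? '').trim() || 'anon';
  // Short, plain identifiers are used as-is so keys stay readable in Redis.
  if (value.length <= MAX_RAW_KEY_LENGTH && /^[A-Za-z0-9._:-]+$/.test(value)) {
    return `${namespace}:${value}`;
  }
  // Anything long or unusual is replaced by a fixed-length SHA-256 hex digest.
  const digest = createHash('sha256').update(value).digest('hex');
  return `${namespace}:h:${digest}`;
}
```

For example, an Express keyGenerator could return boundedKey('api', req.get('x-api-key')).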

Redis namespace (keyPrefix)

RedisStore (and RedisBlockStore) prefix all logical keys. Use a different keyPrefix (and/or Redis DB index) per application or tenant when multiple services share one Redis so counters and blocks do not collide. Document the convention for your org.

Lua scripts (RedisStore)

All Lua in RedisStore is static source in the package. Quota and key data are passed only as KEYS / ARGV to EVAL—never build Lua by concatenating user input into the script body.

Key Manager admin HTTP API

createAdminRouter (Express) and createFastifyAdminPlugin expose full control over rate limit and block state. In production, mount them only behind authentication, authorization, and ideally network isolation (VPN, admin-only ingress). The JSDoc on those factories repeats this warning—treat it as mandatory for exposed deployments.

Atomicity & Distributed Systems

Redis operations are atomic

All RedisStore operations use Lua scripts for atomicity. Each rate limit check executes as a single atomic operation on the Redis server—no race conditions, no requests slipping through under concurrent load from multiple processes or nodes.

What this means:

  • Sliding window: ZREMRANGEBYSCORE (prune expired) + ZADD (add entries) + ZCARD (count) + PEXPIRE — all in one EVAL
  • Fixed window: INCRBY + conditional PEXPIRE + PTTL — all in one EVAL
  • Token bucket: HGET (read state) + refill calculation + token deduction + HSET (write) + PEXPIRE — all in one EVAL

No interleaving: Other Redis clients cannot execute commands between the steps of a rate limit operation. The entire check-and-increment logic runs atomically.
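To make the sliding-window steps concrete, here is an in-process sketch that follows the same prune, add, count order, with an array of timestamps standing in for the Redis sorted set. The SlidingWindowLog class is illustrative only. It also assumes that blocked requests are recorded in the log, which may not match how the library's stores treat over-limit hits.

```typescript
// In-process sliding-window log; the timestamp array plays the role of the ZSET.
class SlidingWindowLog {
  private readonly entries = new Map<string, number[]>();
  private readonly windowMs: number;
  private readonly maxRequests: number;

  constructor(windowMs: number, maxRequests: number) {
    this.windowMs = windowMs;
    this.maxRequests = maxRequests;
  }

  hit(key: string, now: number = Date.now()): { allowed: boolean; count: number } {
    const cutoff = now - this.windowMs;
    // Prune entries older than the window (ZREMRANGEBYSCORE in the Redis version).
    const log = (this.entries.get(key) ?? []).filter((t) => t > cutoff);
    log.push(now); // add this request (ZADD)
    this.entries.set(key, log);
    // Count what is left in the window (ZCARD) and compare with the limit.
    return { allowed: log.length <= this.maxRequests, count: log.length };
  }
}
```

In a single Node.js process these three steps cannot interleave because hit() runs synchronously. Across processes that guarantee disappears, which is the gap the single EVAL closes.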

Why Lua? Redis EVAL executes Lua scripts as atomic blocks. While a script runs, Redis does not process other commands from other clients. This guarantees that:

  1. Concurrent requests from multiple app instances cannot race
  2. Distributed systems get consistent, accurate rate limiting
  3. Multi-step operations (read → calculate → write) are safe

Script caching: Redis keeps each script in its server-side script cache after the first EVAL, so later calls skip recompilation. Clients that register scripts (for example ioredis defineCommand or node-redis defineScript) can also invoke them by hash with EVALSHA, which avoids resending the source. The library passes the full script source on every call; whether EVALSHA is used depends on how your client sends it.

MemoryStore & ClusterStore: In-process stores use JavaScript synchronous operations (no atomicity concerns within a single event loop). ClusterStore uses IPC message passing with acknowledgments to coordinate across Node.js cluster workers.

Distributed deployment considerations

When running multiple app instances with RedisStore:

  • Shared state: All instances see the same counters in Redis
  • Consistent limits: A user hitting 100 req/min is enforced globally, not per instance
  • No coordination needed: Each instance independently calls Redis; Lua atomicity handles races
  • Network latency: Redis round-trip adds ~1-5ms per request (use InMemoryShield to cache blocked keys and eliminate Redis calls for hot attackers)

Cluster vs Redis:

  • ClusterStore: Coordinates rate limits across Node.js cluster workers in a single machine (IPC, no network)
  • RedisStore: Coordinates across multiple machines/containers/regions (network, shared Redis)

For multi-instance deployments (Kubernetes, serverless, multiple VMs), use RedisStore. For single-machine concurrency (one server, multiple CPU cores), use ClusterStore.

Limiter composition

Combine multiple rate limiters with the compose builder. Every composition mode implements RateLimitStore, so composed stores plug directly into expressRateLimiter / fastifyRateLimiter.

Composition Modes

| Mode | Behavior | Use case |
|------|----------|----------|
| all | Block if any layer blocks | Multi-window limiting (10/sec AND 100/min) |
| overflow | Try primary first; if blocked, try burst pool | Steady rate + burst allowance |
| first-available | Try layers in order; first that allows wins | Failover chain (Redis → memory) |
| race | Fire all layers in parallel; fastest wins | Multi-region latency optimization |
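The difference between the all and overflow modes in the table is easiest to see with toy layers. The sketch below is not the compose API: Layer, all, overflow, and budget are hypothetical names, each layer is just a function that returns true when it allows the request, and no rollback is attempted when a later layer blocks.

```typescript
type Layer = (key: string) => boolean; // true = this layer allows the request

// "all": every layer is consulted (so each one counts the request) and all must allow.
const all = (...layers: Layer[]): Layer => (key) =>
  layers.map((layer) => layer(key)).every(Boolean);

// "overflow": use the primary; only when it blocks, spend from the burst pool.
const overflow = (primary: Layer, burst: Layer): Layer => (key) =>
  primary(key) || burst(key);

// Toy layer for the demo: a fixed budget per key with no time window.
function budget(max: number): Layer {
  const used = new Map<string, number>();
  return (key) => {
    const next = (used.get(key) ?? 0) + 1;
    used.set(key, next);
    return next <= max;
  };
}
```

With overflow(budget(1), budget(2)), the first request is served by the primary, the next two come out of the burst pool, and the fourth is blocked.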

Quick Examples

Multi-window (10/sec AND 100/min):

import { compose, expressRateLimiter } from 'ratelimit-flex';

const store = compose.windows(
  { windowMs: 1_000, maxRequests: 10 },
  { windowMs: 60_000, maxRequests: 100 },
);

app.use(expressRateLimiter({ store }));

Burst allowance (steady + burst):

const store = compose.withBurst({
  steady: { windowMs: 1_000, maxRequests: 5 },
  burst:  { windowMs: 60_000, maxRequests: 20 },
});

app.use(expressRateLimiter({ store }));

Failover chain (Redis → memory):

const store = compose.firstAvailable(
  compose.layer('redis', redisStore),
  compose.layer('memory', memoryStore),
);

app.use(expressRateLimiter({ store }));

Full documentation: See docs/COMPOSITION.md for:

  • Nested composition patterns
  • Per-layer observability
  • Redis composition presets
  • Migration from limits array

Request queuing

Source of truth: Full FIFO semantics, head-of-line blocking, and multi-key patterns are documented in JSDoc on src/queue/RateLimiterQueue.ts (RateLimiterQueueOptions, RateLimiterQueue). That file is the canonical explanation; this section summarizes it for README readers.

Typical use case: Outbound API throttling (one queue per external API, single key for all requests).

Head-of-line blocking (by design): The queue is one FIFO array. If you share that queue across different keys, a waiting request for key A sits in front of a request for key B — even when B still has rate-limit capacity — because release order follows enqueue order, not per-key fairness.

flowchart LR
  A["enqueue: key A — over limit, waits first"] --> B["enqueue: key B — has quota but queued after A"]
  B --> C["B cannot skip ahead — one FIFO per RateLimiterQueue"]

Request queuing

Queue over-limit requests instead of rejecting them immediately. Requests wait in a FIFO queue and are released when quota becomes available.

Quick Start

Express:

import { expressQueuedRateLimiter } from 'ratelimit-flex';

app.use('/api', expressQueuedRateLimiter({
  maxRequests: 5,
  windowMs: 10_000,
  maxQueueSize: 50,
  maxQueueTimeMs: 30_000,
}));

Fastify:

import { fastifyQueuedRateLimiter } from 'ratelimit-flex/fastify';

await app.register(fastifyQueuedRateLimiter, {
  maxRequests: 5,
  windowMs: 10_000,
  maxQueueSize: 50,
  maxQueueTimeMs: 30_000,
});

Outbound API throttling:

import { createRateLimiterQueue } from 'ratelimit-flex';

const githubQueue = createRateLimiterQueue({
  maxRequests: 30,
  windowMs: 60_000,
  maxQueueSize: 200,
});

await githubQueue.removeTokens('github-api');
const response = await fetch('https://api.github.com/repos/...');

Important: Head-of-Line Blocking

The queue is one FIFO array. If you share that queue across different keys, a waiting request for key A blocks requests for key B — even when B has capacity.

Solution: Use one queue per key, or use KeyedRateLimiterQueue for automatic per-key queues with LRU eviction.
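The head-of-line effect can be reproduced with a toy model. The sketch below is not the library's `RateLimiterQueue`; `createToyFifo` and its `hasCapacity` callback are hypothetical names used only to show why a single FIFO array stalls key B behind key A, and why one queue per key avoids it.

```typescript
type Waiting = { id: number; key: string };

// Toy model: one FIFO array, released strictly in enqueue order.
function createToyFifo(hasCapacity: (key: string) => boolean) {
  const waiting: Waiting[] = [];
  return {
    enqueue(item: Waiting): void {
      waiting.push(item);
    },
    // Release from the head only; stop at the first item whose key is over limit.
    drain(): number[] {
      const released: number[] = [];
      for (;;) {
        const head = waiting[0];
        if (head === undefined || !hasCapacity(head.key)) break;
        waiting.shift();
        released.push(head.id);
      }
      return released;
    },
  };
}

// Key A is over its limit, key B still has quota.
const capacity = (key: string): boolean => key === "B";

// Shared queue: B is stuck behind A, so nothing is released.
const sharedQueue = createToyFifo(capacity);
sharedQueue.enqueue({ id: 1, key: "A" });
sharedQueue.enqueue({ id: 2, key: "B" });
const releasedFromShared = sharedQueue.drain();

// One queue per key: B's queue drains independently of A's.
const queueForB = createToyFifo(capacity);
queueForB.enqueue({ id: 2, key: "B" });
const releasedFromPerKey = queueForB.drain();
```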

Full documentation: See [docs/QUEUING.md][doc-queuing] for:

  • Multi-key patterns
  • Graceful shutdown
  • Store ownership
  • Advanced patterns (per-tenant, priority queuing)

Sliding window (default)

Implementation:

  • Redis: ZSET with ZREMRANGEBYSCORE + ZADD + ZCARD in atomic Lua
  • Memory: Sorted array of timestamps per key
  • Boundary behavior: Smooth - no 2x burst at window edges

import { expressRateLimiter, RateLimitStrategy } from 'ratelimit-flex';

app.use(
  expressRateLimiter({
    strategy: RateLimitStrategy.SLIDING_WINDOW, // default
    windowMs: 60_000,
    maxRequests: 100,
  }),
);
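For intuition about the "sorted array of timestamps per key" approach, here is a minimal sliding-window log written from scratch. It is a sketch, not the library's `MemoryStore`: `createSlidingWindow` is a hypothetical name, the clock is passed in explicitly for determinism, and the choice not to record blocked requests is an assumption.

```typescript
// Sketch of a sliding-window log: one array of hit timestamps per key,
// naturally sorted because hits are appended in arrival order.
function createSlidingWindow(windowMs: number, maxRequests: number) {
  const hits = new Map<string, number[]>();
  return function allow(key: string, now: number): boolean {
    const log = hits.get(key) ?? [];
    // Drop timestamps that have slid out of the window (oldest first).
    while (log.length > 0 && (log[0] as number) <= now - windowMs) log.shift();
    const allowed = log.length < maxRequests;
    // Assumption: only admitted requests are recorded.
    if (allowed) log.push(now);
    hits.set(key, log);
    return allowed;
  };
}
```

Because each old hit expires individually, capacity returns one slot at a time rather than all at once, which is why there is no 2x burst at a window edge.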

Token bucket (for bursty traffic)

Refills tokens on a schedule; clients can burst up to bucketSize. Best for spiky traffic (mobile apps, retries, webhooks).

Implementation:

  • Redis: HASH with atomic refill calculation + token deduction in Lua
  • Memory: Stores { tokens, lastRefill } per key
  • Burst control: Allows bursts when bucket is full

import { expressRateLimiter, RateLimitStrategy } from 'ratelimit-flex';

app.use(
  expressRateLimiter({
    strategy: RateLimitStrategy.TOKEN_BUCKET,
    tokensPerInterval: 20,  // Add 20 tokens per minute
    interval: 60_000,       // Every 60 seconds
    bucketSize: 60,         // Max 60 tokens (allows 3x burst)
  }),
);
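The `{ tokens, lastRefill }` bookkeeping can be sketched in a few lines. This is an illustration under assumptions, not the library's implementation: `createTokenBucket` is a hypothetical helper, the clock is injected, and refill is modeled as continuous (proportional to elapsed time), whereas the real store may refill in discrete steps.

```typescript
type Bucket = { tokens: number; lastRefill: number };

function createTokenBucket(tokensPerInterval: number, interval: number, bucketSize: number) {
  const buckets = new Map<string, Bucket>();
  return function take(key: string, now: number, cost = 1): boolean {
    // New keys start with a full bucket, which is what permits an initial burst.
    const bucket = buckets.get(key) ?? { tokens: bucketSize, lastRefill: now };
    // Assumption: continuous refill proportional to elapsed time, capped at bucketSize.
    const refill = ((now - bucket.lastRefill) * tokensPerInterval) / interval;
    bucket.tokens = Math.min(bucketSize, bucket.tokens + refill);
    bucket.lastRefill = now;
    const allowed = bucket.tokens >= cost;
    if (allowed) bucket.tokens -= cost;
    buckets.set(key, bucket);
    return allowed;
  };
}
```

With the values from the example above (20 tokens per 60 seconds, bucket size 60), a fresh client in this sketch can send 60 requests at once and then regains one token every 3 seconds.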

Fixed window (simplest)

One counter per fixed time slice. Simplest and lowest memory; acceptable when occasional boundary spikes are OK.

Implementation:

  • Redis: INCRBY + PEXPIRE in atomic Lua script
  • Memory: Single counter per key
  • Warning: Users can burst up to 2x the limit at window boundaries (e.g. with a limit of 50: 50 requests at 11:59:59, then 50 more at 12:00:00)

import { expressRateLimiter, RateLimitStrategy } from 'ratelimit-flex';

app.use(
  expressRateLimiter({
    strategy: RateLimitStrategy.FIXED_WINDOW,
    windowMs: 60_000,
    maxRequests: 100,
  }),
);
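The boundary spike is easy to reproduce with a toy counter. The sketch below assumes clock-aligned windows (each slice starts at a multiple of `windowMs`); the library's stores may anchor the window differently, and `createFixedWindow` is a hypothetical name.

```typescript
// Sketch of a fixed-window counter: one counter per key per time slice.
function createFixedWindow(windowMs: number, maxRequests: number) {
  const counters = new Map<string, { windowStart: number; count: number }>();
  return function allow(key: string, now: number): boolean {
    // Assumption: slices are aligned to multiples of windowMs.
    const windowStart = Math.floor(now / windowMs) * windowMs;
    let counter = counters.get(key);
    if (counter === undefined || counter.windowStart !== windowStart) {
      counter = { windowStart, count: 0 }; // new slice: the counter resets
      counters.set(key, counter);
    }
    if (counter.count >= maxRequests) return false;
    counter.count += 1;
    return true;
  };
}

// Helper: how many of `attempts` requests at time `now` are admitted.
function countAdmitted(allow: (key: string, now: number) => boolean, now: number, attempts: number): number {
  let admitted = 0;
  for (let i = 0; i < attempts; i++) if (allow("user", now)) admitted++;
  return admitted;
}

const fixedAllow = createFixedWindow(60_000, 50);
const admittedBeforeEdge = countAdmitted(fixedAllow, 59_999, 60); // last ms of slice 0
const admittedAfterEdge = countAdmitted(fixedAllow, 60_000, 60); // first ms of slice 1
```

Both calls admit the full limit, so twice the limit passes within two milliseconds of wall-clock time.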

Performance Benchmarks

Benchmarks were measured on an Apple M1 Pro with Node.js v20, using an isolated test harness. Your results may vary with hardware, network latency (Redis), and load patterns.

Throughput (requests/second)

| Store | Strategy | Throughput (req/s) | Notes |
|-------|----------|--------------------|-------|
| MemoryStore | Sliding Window | ~450,000 | Single process, in-memory only |
| MemoryStore | Fixed Window | ~750,000 | Lowest overhead |
| MemoryStore | Token Bucket | ~550,000 | Refill calculation overhead |
| RedisStore | Sliding Window | ~35,000 | Network-bound, local Redis |
| RedisStore | Fixed Window | ~45,000 | Simpler Lua script |
| InMemoryShield (hit) | n/a | ~1,800,000 | Blocked keys cached in memory |
| InMemoryShield (miss) | n/a | ~35,000 | Falls through to Redis |

Latency Overhead (p50 / p95 / p99)

| Store | p50 | p95 | p99 | Notes |
|-------|-----|-----|-----|-------|
| MemoryStore | 0.05ms | 0.12ms | 0.25ms | Pure JavaScript, no I/O |
| RedisStore (local) | 1.8ms | 4.2ms | 8.5ms | Includes network + Lua execution |
| RedisStore (remote) | 5-15ms | 15-30ms | 30-50ms | Depends on network latency |
| InMemoryShield (hit) | 0.01ms | 0.03ms | 0.06ms | Hash map lookup only |
| InMemoryShield (miss) | 1.8ms | 4.2ms | 8.5ms | Same as RedisStore |

Memory Usage (per 10k keys)

| Store | Strategy | Memory | Notes |
|-------|----------|--------|-------|
| MemoryStore | Sliding Window | ~2.5 MB | Stores timestamps per hit |
| MemoryStore | Fixed Window | ~0.8 MB | Single counter per key |
| MemoryStore | Token Bucket | ~1.2 MB | Stores tokens + lastRefill |
| InMemoryShield | n/a | ~1.5 MB | Blocked keys + expiry times |

Scalability

Single process (MemoryStore):

  • Linear scaling with CPU cores (use Node.js cluster or ClusterStore)
  • No network overhead
  • Memory grows with unique keys

Multi-process (RedisStore):

  • Horizontal scaling across machines
  • Network latency adds ~1-5ms per request (local Redis)
  • Shared state across all instances

InMemoryShield + Redis:

  • Best of both: shared state + local caching for hot keys
  • 7x faster for blocked keys under attack
  • 99%+ reduction in Redis calls for repeat offenders

Benchmark Methodology

Benchmarks use:

  • Isolated test harness with controlled load
  • Single key (worst case for contention)
  • Mixed read/write patterns
  • Local Redis (Docker) for network tests
  • No other services running

Run benchmarks yourself:

git clone https://github.com/yourusername/ratelimit-flex
cd ratelimit-flex
npm install
npm run benchmark

Note: These are micro-benchmarks. Real-world performance depends on your application's request patterns, key cardinality, network topology, and Redis configuration.

Weighted / cost-based rate limiting

By default each request consumes one quota unit. For endpoints that should count more (file uploads, heavy database work, high GraphQL complexity), use a cost greater than 1.

Middleware / engine — set incrementCost on the rate limiter options (number or function of the request):

import { expressRateLimiter } from 'ratelimit-flex';

app.use(
  expressRateLimiter({
    maxRequests: 100,
    windowMs: 60_000,
    incrementCost: (req) =>
      String((req as import('express').Request).path ?? '').startsWith('/upload') ? 10 : 1,
  }),
);

Custom pipelines — call the store directly with increment / decrement options:

await store.increment(key, { cost: 10 });
// … later, undo the same weight (e.g. custom skip logic):
await store.decrement(key, { cost: 10 });

Dynamic caps plus cost still work together: increment accepts { maxRequests?, cost? } on window strategies.

Helpers resolveIncrementOpts(options, req) and matchingDecrementOptions(incOpts) are exported if you build your own middleware and need the same increment/decrement pairing as the built-in engine.

Redis implementation note: for sliding windows with cost > 1, each ZSET member is a distinct random value so Redis never silently merges two hits into one.
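The reason distinct members matter is that a Redis sorted set keeps one entry per member: adding an existing member only updates its score. The sketch below models that with a `Map` instead of a real Redis connection. `recordHit` and the member format are hypothetical and chosen for illustration; the library's actual member encoding is not shown in this README.

```typescript
// A Map stands in for a Redis sorted set: members are unique keys, and
// ZADD on an existing member only overwrites its score.
type ToyZset = Map<string, number>;

function recordHit(zset: ToyZset, now: number, cost: number, member: (unit: number) => string): number {
  for (let unit = 0; unit < cost; unit++) zset.set(member(unit), now); // ZADD score member
  return zset.size; // ZCARD
}

const hitTime = 1_700_000_000_000;

// Naive: the timestamp alone is the member, so a cost-3 hit collapses into one entry.
const naiveCount = recordHit(new Map(), hitTime, 3, () => String(hitTime));

// Distinct: a per-unit suffix (index plus random tag) keeps all three entries.
const distinctCount = recordHit(
  new Map(),
  hitTime,
  3,
  (unit) => `${hitTime}:${unit}:${Math.random().toString(36).slice(2)}`,
);
```

In the naive variant the window would count one hit where three quota units were meant to be consumed, which silently under-enforces the limit.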

Store backends

Choose a backend from latency, consistency, and operational constraints. Deeper setup for SQL, MongoDB, and DynamoDB lives in docs/stores/postgres.md, docs/stores/mongo.md, and docs/stores/dynamo.md.

| Backend | Atomic sliding window | TTL cleanup | Latency | Best for |
|---------|----------------------|-------------|---------|----------|
| MemoryStore | exact (in-process) | in-process | <0.01ms | Single process |
| RedisStore | exact (Lua ZSET) | Redis EXPIRE | 1–5ms | Multi-instance, high TPS |
| ClusterStore | exact (IPC) | in-process | <0.1ms | Node cluster, one host |
| PgStore | exact (JSONB array) | background sweep | 2–10ms | Postgres shops without Redis |
| MongoStore | exact (aggregation pipeline) | TTL index | 2–10ms | MongoDB shops |
| DynamoStore | approximate (weighted sub-window) | DynamoDB TTL | 5–20ms | AWS-native deployments |

Deployment guide

When to use MemoryStore

Use MemoryStore when:

  • One Node process serves all traffic (no horizontal scale)
  • Local development and prototyping
  • Automated tests
  • Small deployments with a single instance

Counters live only in that process. No Redis required.

import { expressRateLimiter, MemoryStore, RateLimitStrategy } from 'ratelimit-flex';

const store = new MemoryStore({
  strategy: RateLimitStrategy.SLIDING_WINDOW,
  windowMs: 60_000,
  maxRequests: 100,
});

app.use(expressRateLimiter({ store, windowMs: 60_000, maxRequests: 100 }));

If you omit store, the middleware creates a MemoryStore from windowMs / maxRequests (or token-bucket fields).

When to use ClusterStore

Use ClusterStore when:

  • Node.js native cluster module (not PM2)
  • No Redis available or desired
  • Single server with multiple CPU cores

// primary.ts (ESM, top-level await)
import cluster from 'node:cluster';
import { ClusterStorePrimary } from 'ratelimit-flex';

if (cluster.isPrimary) {
  ClusterStorePrimary.init();
  for (let i = 0; i < 4; i++) cluster.fork();
} else {
  await import('./app.js');
}

// app.ts (worker)
import express from 'express';
import { expressRateLimiter, clusterPreset } from 'ratelimit-flex';

const app = express();
app.use(expressRateLimiter(clusterPreset({ maxRequests: 100, windowMs: 60_000 })));

IPC protocol version: Worker init and primary init_ack carry protocolVersion (constants CLUSTER_IPC_PROTOCOL_VERSION and MIN_CLUSTER_IPC_PROTOCOL_VERSION in src/cluster/protocol.ts). During rolling deploys, if a worker’s version is newer than the primary, the primary responds with init_nack so the process fails fast instead of corrupting counters. Legacy peers that omit protocolVersion are treated as version 1.
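The handshake rules described above can be sketched as a pure function. Everything here is illustrative: the constant values, the message shapes, and the `handleInit` name are assumptions, and the lower-bound check against the minimum version is a guess at how `MIN_CLUSTER_IPC_PROTOCOL_VERSION` is used. The real definitions live in src/cluster/protocol.ts.

```typescript
// Illustrative values only; the real constants live in src/cluster/protocol.ts.
const PRIMARY_PROTOCOL_VERSION = 2;
const MIN_SUPPORTED_PROTOCOL_VERSION = 1;

// Hypothetical message shapes for the sketch.
type InitMessage = { type: "init"; protocolVersion?: number };
type InitReply =
  | { type: "init_ack"; protocolVersion: number }
  | { type: "init_nack"; reason: string };

function handleInit(message: InitMessage): InitReply {
  // Legacy peers omit protocolVersion and are treated as version 1.
  const workerVersion = message.protocolVersion ?? 1;
  if (workerVersion > PRIMARY_PROTOCOL_VERSION) {
    // Worker is newer than the primary (e.g. mid rolling deploy): fail fast.
    return { type: "init_nack", reason: "worker protocol is newer than primary" };
  }
  if (workerVersion < MIN_SUPPORTED_PROTOCOL_VERSION) {
    // Assumption: versions below the minimum are also rejected.
    return { type: "init_nack", reason: "worker protocol is too old" };
  }
  return { type: "init_ack", protocolVersion: PRIMARY_PROTOCOL_VERSION };
}
```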

When to use RedisStore

Use RedisStore when:

  • Multiple Node processes (e.g. PM2 cluster)
  • Multiple servers behind a load balancer
  • Kubernetes, Docker Swarm, or similar
  • Microservices where the same client can hit different instances
  • You need one global limit across replicas

import { expressRateLimiter, RedisStore, RateLimitStrategy } from 'ratelimit-flex';

const store = new RedisStore({
  strategy: RateLimitStrategy.SLIDING_WINDOW,
  windowMs: 60_000,
  maxRequests: 100,
  url: process.env.REDIS_URL!,
});

app.use(expressRateLimiter({ store, strategy: RateLimitStrategy.SLIDING_WINDOW }));

Prefer passing a shared Redis URL or client from every instance. Use a distinct key prefix (keyPrefix) per app or per limiter if several services share one Redis.

Clients and adapters: The default url path uses the optional peer dependency ioredis. To pass your own client, wrap @redis/client (node-redis) with adaptNodeRedisClient or ioredis with adaptIoRedisClient (see RedisLikeClient in the API reference). Bun and Upstash need thin wrappers because Lua EVAL is required; copy-paste starters live in examples/redis/README.md (these are not published packages, so maintain them locally).

Lua EVAL, EVALSHA, and connections: RedisStore always invokes eval(fullScript, …) on your client. It does not embed EVALSHA. Clients often optimize repeated EVAL into EVALSHA after Redis caches the script. Reuse one long-lived client per process (or warm serverless instance) where possible—per-request connections add latency and can reduce script-cache hits on the Redis side.

Multi-window: The limits: [{ windowMs, max }, …] option (see Multi-window limits (limits)) defaults to one MemoryStore per window. Pass a sliding/fixed-window RedisStore as store together with limits to reuse connection settings (and optional resilience, cloned per slot) and get one Redis-backed slot per window with distinct key prefixes. A MemoryStore with limits is accepted and ignored (same as omitting store). Alternatively use compose.windows(redisTemplate, …), multiWindowPreset, or groupedWindowStores.

When to use PgStore

Use PgStore when:

  • You already run PostgreSQL and prefer not to add Redis
  • You want exact sliding, fixed-window, and token-bucket semantics with atomic INSERT … ON CONFLICT / JSONB sliding arrays
  • A small amount of background sweep work for expired rows is acceptable

import { expressRateLimiter, postgresPreset } from 'ratelimit-flex';
import { pgStoreSchema } from 'ratelimit-flex/postgres';
import { Pool } from 'pg';

const pool = new Pool({ connectionString: process.env.DATABASE_URL! });
await pool.query(pgStoreSchema);

app.use(expressRateLimiter(postgresPreset({ pool })));

Run pgStoreSchema (or your migration) once at deploy. Tune autoSweepIntervalMs / worker estimates if you shard many app processes—see docs/stores/postgres.md.

When to use MongoStore

Use MongoStore when:

  • Your system of record is MongoDB (Atlas or self-hosted)
  • You are on MongoDB 4.2+ (aggregation pipelines in findOneAndUpdate)
  • You can maintain a TTL index on the reset field for passive expiry

import { expressRateLimiter, mongoPreset } from 'ratelimit-flex';
import { MongoClient } from 'mongodb';

const client = new MongoClient(process.env.MONGODB_URI!);
await client.connect();

app.use(expressRateLimiter(mongoPreset({ client, dbName: 'myapp' })));

See docs/stores/mongo.md for index requirements and failure modes.

When to use DynamoStore

Use DynamoStore when:

  • You deploy on AWS and want a managed, serverless-friendly store
  • Fixed window and token bucket must be exact on DynamoDB
  • Sliding window can be approximate (weighted sub-windows; typically <2% error, higher near window edges; see docs/stores/dynamo.md)

import { dynamoPreset, expressRateLimiter } from 'ratelimit-flex';
import { DynamoDBDocumentClient } from '@aws-sdk/lib-dynamodb';

const doc = DynamoDBDocumentClient.from(/* DynamoDBClient */);
app.use(expressRateLimiter(dynamoPreset({ client: doc, tableName: 'rate_limits' })));

Create the table and enable TTL on the ttl attribute once (CDK / Terraform / console). dynamoPreset defaults to fixed window so out-of-the-box counting is exact; pass strategy: RateLimitStrategy.SLIDING_WINDOW when you accept the weighted approximation.
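To give a feel for what a weighted approximation looks like, here is one common form: weight the previous fixed window's count by how much of it still overlaps the sliding window, then add the current window's count. This is an assumption about the general technique, not a description of DynamoStore's actual formula (see docs/stores/dynamo.md for that); `estimateSlidingCount` is a hypothetical name.

```typescript
// Sketch: estimate hits in the trailing windowMs from two fixed-window counters.
function estimateSlidingCount(prevCount: number, currCount: number, now: number, windowMs: number): number {
  const elapsedInCurrent = now % windowMs; // how far into the current slice we are
  // Fraction of the previous slice still covered by the sliding window.
  const prevWeight = (windowMs - elapsedInCurrent) / windowMs;
  return prevCount * prevWeight + currCount;
}

// 15s into a 60s slice: 75% of the previous slice still counts.
const estimated = estimateSlidingCount(80, 30, 5 * 60_000 + 15_000, 60_000);
```

The estimate assumes hits in the previous slice were evenly spread, which is where the approximation error comes from when traffic is clustered near a window edge.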

Deployment topology

| Setup | Store | What’s shared | What’s per-process |
|-------|-------|---------------|--------------------|
| Single process | MemoryStore | Everything (one process) | N/A |
| Node.js native cluster (same host, forked workers) | ClusterStore + ClusterStorePrimary | Rate limit counters (on primary) | Allowlist, blocklist, penalty |
| PM2 cluster (same host) | RedisStore | Rate limit counters | Allowlist, blocklist, penalty |
| Multiple servers + LB | RedisStore | Rate limit counters | Allowlist, blocklist, penalty |
| Multiple servers + LB | PgStore / MongoStore (shared DB) | Rate limit rows in the database | Allowlist, blocklist, penalty |
| Kubernetes pods | RedisStore | Rate limit counters | Allowlist, blocklist, penalty |
| Kubernetes pods | PgStore / MongoStore | Rate limit rows | Allowlist, blocklist, penalty |
| AWS Lambda / Fargate / multi-AZ | DynamoStore | DynamoDB table & TTL | Allowlist, blocklist, penalty |
| Microservices (one global limit) | RedisStore (same namespace/prefix) | Rate limit counters | Allowlist, blocklist, penalty |
| Microservices (per-service limits) | RedisStore (different prefix/DB) | Per-service counters | Allowlist, blocklist, penalty |

PM2 vs Node cluster: ClusterStore (Node’s native cluster IPC with ClusterStorePrimary on the primary) is not for PM2 cluster mode. PM2 runs independent worker processes and uses its own IPC to the daemon, not a Node cluster primary/worker tree. For PM2, use RedisStore (or another shared store). At startup, ClusterStore detects PM2 (PM2_HOME or pm_id) and throws a clear error if the process is not a Node cluster worker.

Sticky sessions: If your load balancer uses sticky sessions, MemoryStore can appear to work, but it is fragile—deploys and restarts reset counters per instance. RedisStore survives restarts and stays consistent across nodes.

Auto-detection and warnings

detectEnvironment() returns flags such as isKubernetes, isDocker, isCluster, isMultiInstance, and a recommended store ('memory' | 'redis'). Use it in your own startup logging or configuration.

import { detectEnvironment } from 'ratelimit-flex';

const env = detectEnvironment();
if (env.recommended === 'redis' && !process.env.REDIS_URL) {
  console.warn('Production-like environment detected; consider Redis for shared limits.');
}

Express and Fastify integrations also call warnIfMemoryStoreInCluster once at startup: if a MemoryStore is used and the process looks like a multi-instance environment (e.g. Docker, Kubernetes, PM2), a one-time stderr warning is printed.

Suppress with:

RATELIMIT_FLEX_NO_MEMORY_WARN=1

Similarly, if RedisStore is used without an insurance limiter (resilience.insuranceLimiter) in a multi-instance-looking environment, a one-time stderr reminder suggests resilientRedisPreset or configuring insurance for failover protection.

Suppress with:

RATELIMIT_FLEX_NO_RESILIENCE_WARN=1

Presets

Presets return a Partial<RateLimitOptions> you can pass to expressRateLimiter / fastifyRateLimiter (or spread and override).

singleInstancePreset(options?)

When: Dev, tests, single-process apps.

  • Sliding window, 100 req / min (defaults), in-memory (no store in preset; the middleware builds a MemoryStore).

import { expressRateLimiter, singleInstancePreset } from 'ratelimit-flex';

app.use(expressRateLimiter(singleInstancePreset({ maxRequests: 200 })));

multiInstancePreset(redisOptions, options?)

When: Production with Redis, multiple workers or nodes.

  • RedisStore, sliding window, 100 req / min
  • onRedisError: fail-open by default (override via redisOptions.onRedisError)

import { expressRateLimiter, multiInstancePreset } from 'ratelimit-flex';

app.use(
  expressRateLimiter(
    multiInstancePreset({ url: process.env.REDIS_URL! }, { maxRequests: 500 }),
  ),
);

resilientRedisPreset(redisOptions, options?)

When: Production Redis with insurance (in-memory fallback), circuit breaker, optional counter sync on recovery, and per-worker limit scaling. See Redis resilience for behavior, examples, and comparison with fail-open / fail-closed.

clusterPreset(options?)

When: Node.js native cluster module (not PM2), single server with multiple CPU cores, no Redis.

  • ClusterStore, sliding window, 100 req / min
  • Requires ClusterStorePrimary.init() on the primary process

// primary.ts
import cluster from 'node:cluster';
import { ClusterStorePrimary } from 'ratelimit-flex/cluster';

if (cluster.isPrimary) {
  ClusterStorePrimary.init();
  for (let i = 0; i < 4; i++) cluster.fork();
} else {
  await import('./app.js');
}

// app.ts (worker)
import { expressRateLimiter, clusterPreset } from 'ratelimit-flex';

app.use(expressRateLimiter(clusterPreset({ maxRequests: 100, windowMs: 60_000 })));

queuedClusterPreset(options?)

When: Node.js native cluster + request queuing (queue over-limit requests instead of rejecting them).

  • ClusterStore + expressQueuedRateLimiter / fastifyQueuedRateLimiter
  • Sliding window, 100 req / min, queue size 100, 30s max wait
  • Requires ClusterStorePrimary.init() on the primary process

// primary.ts
import cluster from 'node:cluster';
import { ClusterStorePrimary } from 'ratelimit-flex/cluster';

if (cluster.isPrimary) {
  ClusterStorePrimary.init();
  for (let i = 0; i < 4; i++) cluster.fork();
} else {
  await import('./app.js');
}

// app.ts (worker)
import { expressQueuedRateLimiter, queuedClusterPreset } from 'ratelimit-flex';

app.use('/api', expressQueuedRateLimiter(queuedClusterPreset({
  maxRequests: 50,
  windowMs: 60_000,
  maxQueueSize: 200,
})));

apiGatewayPreset(redisOptions, options?)

When: API gateway–style traffic, key per client credential.

  • Token bucket (~30 tokens/min, burst 60), x-api-key key generator
  • fail-closed when Redis is down (override possible)

import { expressRateLimiter, apiGatewayPreset } from 'ratelimit-flex';

app.use('/v1', expressRateLimiter(apiGatewayPreset({ url: process.env.REDIS_URL! })));

authEndpointPreset(redisOptions, options?)

When: Login, signup, password reset—brute-force protection.

  • Fixed window, 5 req / min per IP (default), IP-based key
  • fail-closed when Redis is down

import { expressRateLimiter, authEndpointPreset } from 'ratelimit-flex';

app.post(
  '/login',
  expressRateLimiter(authEndpointPreset({ url: process.env.REDIS_URL! }, { maxRequests: 10 })),
  loginHandler,
);

publicApiPreset(options?)

When: Public HTTP APIs with a simple in-memory limit and structured JSON errors.

  • Sliding window, 60 req / min, default message object

import { expressRateLimiter, publicApiPreset } from 'ratelimit-flex';

app.use('/public', expressRateLimiter(publicApiPreset()));

Redis failure handling

| Mode | Behavior if Redis errors during quota check |
|------|-----------------------------------------------|
| fail-open (default for RedisStore) | Request is allowed; warning logged |
| fail-closed | Request is treated as blocked; middleware responds 503 with { error: 'Service temporarily unavailable' } |

Recommendation: fail-open for most general APIs (availability over strict quota). fail-closed for auth, payments, or when you must not serve traffic without a working limiter.

// Fail-open (default)
new RedisStore({ url: REDIS_URL, strategy: RateLimitStrategy.SLIDING_WINDOW, windowMs: 60_000, maxRequests: 100 });

// Fail-closed
new RedisStore({
  url: REDIS_URL,
  strategy: RateLimitStrategy.SLIDING_WINDOW,
  windowMs: 60_000,
  maxRequests: 100,
  onRedisError: 'fail-closed',
});

Policy vs counters: Allowlist, blocklist, and penalty box are enforced in the RateLimitEngine (in-memory) before the store runs. They still apply when Redis is down. Only quota / window / bucket counting depends on RedisStore.increment.
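The ordering described above (policy first, then counting, with the error mode deciding what happens when counting fails) can be sketched as follows. This is not the `RateLimitEngine`: `decide`, `Outcome`, and the `increment` callback are hypothetical stand-ins, and only the 503 status for fail-closed is taken from this README.

```typescript
type Outcome =
  | { allowed: true }
  | { allowed: false; reason: "blocklist" | "quota" }
  | { allowed: false; reason: "store-error"; status: 503 };

type ErrorMode = "fail-open" | "fail-closed";

function decide(
  key: string,
  blocklist: ReadonlySet<string>,
  // Stand-in for the store's increment: returns true while under quota and
  // throws when the backing store (e.g. Redis) is unreachable.
  increment: (key: string) => boolean,
  mode: ErrorMode,
): Outcome {
  // Policy check first: purely in-memory, so it still works during an outage.
  if (blocklist.has(key)) return { allowed: false, reason: "blocklist" };
  try {
    return increment(key) ? { allowed: true } : { allowed: false, reason: "quota" };
  } catch {
    // Counting failed: fail-open admits the request, fail-closed answers 503.
    return mode === "fail-open"
      ? { allowed: true }
      : { allowed: false, reason: "store-error", status: 503 };
  }
}
```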

Redis resilience

When Redis is unavailable, the default fail-open / fail-closed modes either allow every request or block every request globally—there is no per-client quota during the outage. An insurance limiter fixes that: a dedicated MemoryStore that activates automatically when the circuit breaker decides Redis is unhealthy, so each process still enforces per-process limits. Configure that in-memory cap as roughly total shared limit ÷ expected worker count (e.g. 300 requests/minute across 5 replicas → 60 per process) so failover traffic stays in the same ballpark as your global Redis budget.
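The sizing rule above is simple division. The helper below is a hypothetical convenience, not a library export; rounding down and enforcing a floor of 1 are assumptions chosen so the combined fallback budget never exceeds the shared limit while no process ends up with a zero cap.

```typescript
// Per-process insurance cap: total shared limit divided by expected workers.
function insuranceLimit(totalLimit: number, expectedWorkers: number): number {
  if (!Number.isFinite(expectedWorkers) || expectedWorkers < 1) {
    throw new RangeError("expectedWorkers must be at least 1");
  }
  // Assumption: round down, but never below 1 request per window.
  return Math.max(1, Math.floor(totalLimit / expectedWorkers));
}

const perProcessCap = insuranceLimit(300, 5); // the 300 / 5 example from the text
```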

Manual setup (RedisStore + resilience)

import { expressRateLimiter, RedisStore, MemoryStore, RateLimitStrategy } from 'ratelimit-flex';

const insuranceStore = new MemoryStore({
  strategy: RateLimitStrategy.SLIDING_WINDOW,
  windowMs: 60_000,
  maxRequests: 60, // 300 / 5 workers
});

const store = new RedisStore({
  strategy: RateLimitStrategy.SLIDING_WINDOW,
  windowMs: 60_000,
  maxRequests: 300,
  url: process.env.REDIS_URL!,
  resilience: {
    insuranceLimiter: { store: insuranceStore },
    circuitBreaker: { failureThreshold: 3, recoveryTimeMs: 5000 },
    hooks: {
      onFailover: (err) => console.error('Redis down, using fallback', err),
      onRecovery: (ms) => console.log(`Redis recovered after ${ms}ms`),
    },
  },
});

app.use(expressRateLimiter({ store, strategy: RateLimitStrategy.SLIDING_WINDOW }));

Preset (resilientRedisPreset)

resilientRedisPreset wires the same idea—Redis + insurance MemoryStore + circuit breaker—and estimates worker count from the environment (or estimatedWorkers) so you do not hand-divide limits yourself:

import { expressRateLimiter, resilientRedisPreset } from 'ratelimit-flex';

app.use(expressRateLimiter(
  resilientRedisPreset(
    { url: process.env.REDIS_URL! },
    { maxRequests: 300, estimatedWorkers: 5 }
  )
));

Circuit breaker

The breaker around Redis has three states:

  • Closed — Redis is used; successes reset failure streaks.
  • Open — Too many consecutive failures; requests are not sent to Redis (they go to the insurance store instead), avoiding wasted round-trips to a dead server.
  • Half-open — After a recovery window, a probe allows one Redis attempt; success closes the circuit, failure reopens it.
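The three states can be sketched as a small state machine. This is an illustration, not the library's breaker: `createBreaker` and its method names are hypothetical, the clock is injected, and the sketch does not limit how many probes run concurrently while half-open.

```typescript
type BreakerState = "closed" | "open" | "half-open";

function createBreaker(failureThreshold: number, recoveryTimeMs: number) {
  let state: BreakerState = "closed";
  let consecutiveFailures = 0;
  let openedAt = 0;
  return {
    // Should this request go to Redis? false means: use the insurance store.
    canAttempt(now: number): boolean {
      if (state === "open" && now - openedAt >= recoveryTimeMs) state = "half-open";
      return state !== "open";
    },
    // Any success closes the circuit and resets the failure streak.
    onSuccess(): void {
      consecutiveFailures = 0;
      state = "closed";
    },
    // A failed probe, or too many consecutive failures, (re)opens the circuit.
    onFailure(now: number): void {
      consecutiveFailures += 1;
      if (state === "half-open" || consecutiveFailures >= failureThreshold) {
        state = "open";
        openedAt = now;
      }
    },
    current(): BreakerState {
      return state;
    },
  };
}
```

With `failureThreshold: 3` and `recoveryTimeMs: 5000`, three consecutive failures open the circuit, and the first request at least five seconds later is treated as the probe.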

Counter sync

When the circuit closes again after an outage, accumulated hits in the insurance MemoryStore can be replayed into Redis (INCRBY-style paths per strategy) so shared state catches up. The syncOnRecovery option on resilience.insuranceLimiter controls this; it defaults to true and can be set to false if you do not want that merge step.

Sliding window note: replay bulk-inserts synthetic hits with timestamps at recovery time (counts match; the visible window is not time-smoothed across the outage — see JSDoc on RedisStore sync). Fixed window and token bucket sync paths behave as described in code comments.

Comparison: fail-open / fail-closed vs insurance limiter

| Feature | fail-open / fail-closed | Insurance limiter |
|---------|------------------------|-------------------|
| Redis down behavior | Allow all or block all | Fallback to in-memory rate limiting |
| Rate limiting during outage | None (open) or total block (closed) | Per-process limits enforced |
| Circuit breaker | No | Yes (avoids wasted Redis round-trips) |
| Counter sync on recovery | No | Yes (replays in-memory hits to Redis) |
| Observability hooks | onRedisError only | onFailover, onRecovery, onCircuitOpen, onCircuitClose, onInsuranceHit, onCounterSync |

When insurance is configured, it replaces the binary fail-open/fail-closed behavior for quota operations (see Redis failure handling).

Redis resilience

Handle Redis outages gracefully with insurance limiters and circuit breakers. When Redis is unavailable, an insurance limiter (dedicated MemoryStore) activates automatically, so each process still enforces per-process limits.

Quick Start

Manual setup:

import { expressRateLimiter, RedisStore, MemoryStore, RateLimitStrategy } from 'ratelimit-flex';

const insuranceStore = new MemoryStore({
  strategy: RateLimitStrategy.SLIDING_WINDOW,
  windowMs: 60_000,
  maxRequests: 60, // 300 / 5 workers
});

const store = new RedisStore({
  strategy: RateLimitStrategy.SLIDING_WINDOW,
  windowMs: 60_000,
  maxRequests: 300,
  url: process.env.REDIS_URL!,
  resilience: {
    insuranceLimiter: { store: insuranceStore },
    circuitBreaker: { failureThreshold: 3, recoveryTimeMs: 5000 },
    hooks: {
      onFailover: (err) => console.error('Redis down, using fallback', err),
      onRecovery: (ms) => console.log(`Redis recovered after ${ms}ms`),
    },
  },
});

app.use(expressRateLimiter({ store }));

Preset:

import { expressRateLimiter, resilientRedisPreset } from 'ratelimit-flex';

app.use(expressRateLimiter(
  resilientRedisPreset(
    { url: process.env.REDIS_URL! },
    { maxRequests: 300, estimatedWorkers: 5 }
  )
));

How It Works

Circuit Breaker States:

  • Closed — Redis is used; successes reset failure streaks
  • Open — Too many failures; requests use insurance store instead
  • Half-open — After recovery window, probe Redis; success closes circuit

Counter Sync: When Redis recovers, accumulated hits in insurance MemoryStore are replayed to Redis (syncOnRecovery: true by default).

Full documentation: See [docs/REDIS_RESILIENCE.md][doc-redis-resilience] for:

  • Circuit breaker configuration
  • Counter synchronization details
  • Observability hooks
  • Comparison with fail-open/fail-closed
  • Best practices and monitoring

Options are merged with strategy defaults. Omit store to get an auto-created MemoryStore (unless you use limits, which builds grouped in-memory stores).

| Option | Type | Default | Description | |--------|------|---------|-------------| | strategy | RateLimitStrategy | SLIDING_WINDOW | SLIDING_WINDOW, FIXED_WINDOW, TOKEN_BUCKET | | store | RateLimitStore | auto MemoryStore | Backing store | | windowMs | number | 60000 | Window length (sliding / fixed) | | maxRequests | number | (req) => number | 100 | Max requests per window (sliding / fixed) | | incrementCost | number | (req) => number | — | Quota units per request (1 if omitted); use with weighted store.increment semantics | | limits | { windowMs, max }[] | — | Multiple windows; block if any exceeded (details) | | tokensPerInterval | number | 10 | Token bucket refill rate | | interval | number | 60000 | Refill interval (token bucket) | | bucketSize | number | 100 | Max tokens / burst (token bucket) | | keyGenerator | (req) => string | IP / socket fallback | Storage key (Client IP & reverse proxies) | | headers | boolean | true | Legacy X-RateLimit-* when standardHeaders is omitted; see Standard headers | | standardHeaders | boolean | 'legacy' | 'draft-6' | 'draft-7' | 'draft-8' | (see defaults) | Which response header profile to send (Standard headers) | | identifier | string | {limit}-per-{windowSeconds} | Policy name for draft-8 / draft-7 policy strings | | legacyHeaders | boolean | (profile-dependent) | Also emit X-RateLimit-* alongside draft profiles | | statusCode | number | 429 | Status when rate-limited | | message | string | object | "Too many requests" | Response body ({ error: message }) | | skip | (req) => boolean | — | Skip limiting | | skipFailedRequests | boolean | false | Decrement on >= 400 responses | | skipSuccessfulRequests | boolean | false | Decrement on `< 40