asillios-limiter
don't get a surprise bill from the openai or anthropic api when you offer a free tier.
token-based rate limiting for llm apps. tracks usage per user, supports openai and anthropic, includes cost tracking.
asillios-limiter is a typescript library for controlling and monitoring api usage in apps that use large language models like claude or gpt. when you build an app that lets users interact with an llm, you're paying for every token they use. this library wraps your llm calls and automatically tracks how many tokens each user consumes, letting you set limits and get alerts when users approach their quotas. it works with both anthropic and openai response formats out of the box.
install
npm install asillios-limiter
quick start
import { createLimiter } from "asillios-limiter";
const limiter = createLimiter({
limit: 100000,
window: 60 * 60 * 1000, // 1 hour
onThreshold: (userId, percent) => {
console.log(`user ${userId} hit ${percent}%`);
},
});
// wrap your llm calls - tokens tracked automatically
const response = await limiter.wrap("user-123", () =>
openai.chat.completions.create({
model: "gpt-4",
messages: [{ role: "user", content: "hello" }],
})
);
// check limits and get stats
const canProceed = await limiter.check("user-123");
const remaining = await limiter.getRemainingTokens("user-123");
const stats = await limiter.stats("user-123");
features
sliding window
uses a rolling window instead of fixed reset times. smoother limiting and harder to game.
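to make this concrete, here's a minimal sketch of the idea (an illustration of sliding-window counting, not the library's internals):
// keep timestamped token entries; only entries inside the window count
type Entry = { at: number; tokens: number };

function tokensInWindow(entries: Entry[], windowMs: number, now = Date.now()): number {
  return entries
    .filter((e) => now - e.at < windowMs)
    .reduce((sum, e) => sum + e.tokens, 0);
}

// a fixed window resets usage all at once at the boundary; here old
// entries age out one by one, so the count falls off smoothly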
multiple limits
enforce multiple limits simultaneously (e.g., per-minute AND per-day):
const limiter = createLimiter({
limits: [
{ tokens: 10000, window: 60 * 1000 }, // 10k per minute
{ tokens: 100000, window: 24 * 60 * 60 * 1000 }, // 100k per day
],
});
burst allowance
let users temporarily exceed limits:
const limiter = createLimiter({
limit: 50000,
window: 60 * 60 * 1000,
burstPercent: 20, // allow 20% overage
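// with these settings the effective hard cap works out to 60,000 tokens (50,000 × 1.2)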
});
cost tracking
track estimated api costs alongside tokens:
const limiter = createLimiter({
limit: 100000,
window: 60 * 60 * 1000,
trackCost: true,
costLimit: 5.0, // $5 limit
});
// pass model name to calculate cost
const response = await limiter.wrap(
"user-123",
() => anthropic.messages.create({ ... }),
{ model: "claude-3-sonnet" }
);
const stats = await limiter.stats("user-123");
// { tokensUsed, remaining, costUsed, costRemaining, ... }
supported models: gpt-4, gpt-4-turbo, gpt-4o, gpt-4o-mini, gpt-3.5-turbo, claude-3-opus, claude-3-sonnet, claude-3-haiku, claude-sonnet-4
middleware helpers
express
import { createLimiter, expressMiddleware } from "asillios-limiter";
const limiter = createLimiter({ limit: 50000, window: 3600000 });
app.use(
"/api/chat",
expressMiddleware(limiter, (req) => req.user?.id ?? null)
);
next.js
import { createLimiter, nextMiddleware } from "asillios-limiter";
const limiter = createLimiter({ limit: 50000, window: 3600000 });
const checkLimit = nextMiddleware(limiter, (req) => req.headers.get("x-user-id"));
export async function POST(req: Request) {
const { allowed, response } = await checkLimit(req);
if (!allowed) return response;
// proceed with llm call...
}
redis storage
built-in redis adapter (bring your own client), useful when usage counts need to be shared across multiple server processes:
import { createLimiter, RedisStorage } from "asillios-limiter";
import Redis from "ioredis";
const redis = new Redis();
const limiter = createLimiter({
limit: 100000,
window: 60 * 60 * 1000,
storage: new RedisStorage(redis),
});
api
createLimiter(config)
interface LimiterConfig {
// simple: single limit
limit?: number;
window?: number;
// advanced: multiple limits
limits?: { tokens: number; window: number }[];
// burst allowance percentage
burstPercent?: number;
// cost tracking
trackCost?: boolean;
costLimit?: number;
// storage and callbacks
storage?: StorageAdapter;
onThreshold?: (userId: string, percent: number) => void;
thresholds?: number[]; // default: [80, 90, 100]
}
limiter.wrap(userId, fn, options?)
wraps an async function and tracks tokens from its response.
await limiter.wrap("user-123", () => llmCall(), {
throwOnLimit: true, // throw if over limit
model: "gpt-4", // for cost calculation
});
limiter.check(userId)
returns true if user is within all limits.
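for example, gating a route by hand instead of using the middleware (the 429 response is this sketch's choice, not something the library sends for you):
app.post("/api/chat", async (req, res) => {
  const userId = req.user?.id ?? "anonymous";
  if (!(await limiter.check(userId))) {
    // check() only reports state; how you respond is up to you
    return res.status(429).json({ error: "token limit reached" });
  }
  // proceed with the llm call via limiter.wrap as usual
});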
limiter.getRemainingTokens(userId)
returns remaining tokens for primary limit.
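for example, to surface the remaining budget to clients (the header name is just illustrative):
res.setHeader("x-tokens-remaining", String(await limiter.getRemainingTokens("user-123")));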
limiter.stats(userId)
returns full usage stats:
{
tokensUsed: number;
remaining: number;
resetAt: Date;
percentUsed: number;
costUsed?: number; // if trackCost enabled
costRemaining?: number; // if costLimit set
}
limiter.addTokens(userId, tokens, cost?)
manually add tokens (for streaming, etc).
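for example, with openai streaming you can ask for a final usage chunk and feed it in (stream_options.include_usage is an openai api option; the wiring here is a sketch):
const stream = await openai.chat.completions.create({
  model: "gpt-4o-mini",
  messages: [{ role: "user", content: "hello" }],
  stream: true,
  stream_options: { include_usage: true }, // final chunk carries usage totals
});

for await (const chunk of stream) {
  // ...forward chunk.choices[0]?.delta?.content to the client...
  if (chunk.usage) {
    await limiter.addTokens("user-123", chunk.usage.total_tokens);
  }
}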
limiter.reset(userId)
reset user's usage.
license
mit
