@mcptoolshop/throttleai

v1.1.2

Lightweight, token-based AI governance for TypeScript

Downloads

181

ThrottleAI is a zero-dependency governor for concurrency, rate, and token budgets, with adapters for fetch, OpenAI, tools, Express, and Hono.

60-second quickstart

pnpm add throttleai

import { createGovernor, withLease, presets } from "throttleai";

const gov = createGovernor(presets.balanced());

const result = await withLease(
  gov,
  { actorId: "user-1", action: "chat" },
  async () => await callMyModel(),
);

if (result.granted) {
  console.log(result.result);
} else {
  console.log("Throttled:", result.decision.recommendation);
}

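That's it. The governor enforces concurrency, rate limits, and fairness. If you forget to release, the lease expires automatically.

Why

AI applications hit rate limits, blow through budgets, and cause stampedes. ThrottleAI sits between your code and the model call and enforces:

  • Concurrency: cap in-flight calls with weighted slots and an interactive reserve.
  • Rate: requests per minute and tokens per minute, tracked over a rolling window.
  • Fairness: prevent any single actor from monopolizing capacity.
  • Leases: acquire before the call, release after it, and auto-expire on timeout.
  • Observability: snapshot(), onEvent, and formatEvent() for debugging.

Zero dependencies. Node.js 18+. Tree-shakeable.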
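To make the rolling-window idea concrete, here is a minimal sketch of a request counter that admits calls only while the window has room. It is not ThrottleAI's implementation: the class name `RollingWindowCounter` and its methods are hypothetical, and the caller passes in the clock so the behavior is easy to follow.

```typescript
// Hypothetical sketch of a rolling-window request counter.
// Names and logic are assumptions, not ThrottleAI internals.
class RollingWindowCounter {
  private timestamps: number[] = [];
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  /** Drops entries that have left the window and returns how many remain. */
  count(now: number): number {
    const cutoff = now - this.windowMs;
    while (true) {
      const oldest = this.timestamps[0];
      if (oldest === undefined || oldest > cutoff) break;
      this.timestamps.shift();
    }
    return this.timestamps.length;
  }

  /** Records a request if the window has room; returns whether it was admitted. */
  tryRecord(now: number): boolean {
    if (this.count(now) >= this.limit) return false;
    this.timestamps.push(now);
    return true;
  }

  /** Milliseconds until the oldest entry leaves the window (0 if there is room). */
  retryAfterMs(now: number): number {
    if (this.count(now) < this.limit) return 0;
    const oldest = this.timestamps[0];
    return oldest === undefined ? this.windowMs : oldest + this.windowMs - now;
  }
}
```

A token-per-minute limiter could follow the same shape by storing a token count alongside each timestamp and summing instead of counting.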

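Choose your limiter

| Limiter | What it caps | When to use |
| --------- | ------------- | ------------- |
| Concurrency | In-flight calls | Always. This is the most important knob. |
| Rate | Requests per minute | When the upstream API has a documented rate limit. |
| Token rate | Tokens per minute | When you have a per-minute token budget. |
| Fairness | Per-actor share of capacity | Multi-tenant apps where one user should not hog capacity. |
| Adaptive | Auto-tuned concurrency ceiling | When upstream latency is unpredictable. |

Start with concurrency. Add rate limits only when you need them. See the tuning guide for scenario-based recommendations.

Presets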

import { createGovernor, presets } from "throttleai";

// Single user, CLI tools — 1 call at a time, 10 req/min
createGovernor(presets.quiet());

// SaaS backend — 5 concurrent (2 interactive reserve), 60 req/min, fairness
createGovernor(presets.balanced());

// Batch processing — 20 concurrent, 300 req/min, fairness + adaptive tuning
createGovernor(presets.aggressive());

// Override any field
createGovernor({ ...presets.balanced(), leaseTtlMs: 30_000 });

Common patterns

Server endpoint: 429 vs. queue

// Option A: immediate deny with 429
const immediate = await withLease(gov, request, fn);
// immediate.granted === false → respond with 429

// Option B: wait with bounded retries
const queued = await withLease(gov, request, fn, {
  strategy: "wait-then-deny",
  maxAttempts: 3,
  maxWaitMs: 5_000,
});

UI interaction vs. background tasks

// User-facing chat gets priority
gov.acquire({ actorId: "user", action: "chat", priority: "interactive" });

// Background embedding can wait
gov.acquire({ actorId: "pipeline", action: "embed", priority: "background" });

With interactiveReserve: 2, background tasks are blocked once only 2 slots remain; those slots are held for interactive requests.
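The reserve rule can be pictured as a small admission check. The sketch below is an assumption about how such a check might look, not the library's code. The hypothetical function `canAdmit` takes the same quantities as the `maxInFlight` and `interactiveReserve` config fields plus a per-call weight.

```typescript
type Priority = "interactive" | "background";

// Hypothetical admission check. The parameter names mirror the config fields,
// but the logic is an illustrative assumption, not ThrottleAI's implementation.
function canAdmit(
  inFlight: number,
  maxInFlight: number,
  interactiveReserve: number,
  priority: Priority,
  weight: number = 1,
): boolean {
  const free = maxInFlight - inFlight;
  // Background work may only use capacity that lies outside the reserve.
  const usable = priority === "interactive" ? free : free - interactiveReserve;
  return weight <= usable;
}
```

With `maxInFlight: 5` and `interactiveReserve: 2`, a background call is refused as soon as three slots are in use, while an interactive call can still take either of the last two.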

Streaming calls

const decision = gov.acquire({ actorId: "user", action: "stream" });
if (!decision.granted) return;

try {
  const stream = await openai.chat.completions.create({ stream: true, ... });
  for await (const chunk of stream) {
    // process chunk
  }
  gov.release(decision.leaseId, { outcome: "success" });
} catch (err) {
  gov.release(decision.leaseId, { outcome: "error" });
  throw err;
}

Acquire once, release once. The lease stays valid for the entire duration of the stream.

Observability: see why it throttles

import { createGovernor, formatEvent, formatSnapshot, presets } from "throttleai";

const gov = createGovernor({
  ...presets.balanced(),
  onEvent: (e) => console.log(formatEvent(e)),
  // [deny] actor=user-1 action=chat reason=concurrency retryAfterMs=500 — All 5 slots in use...
});

// Point-in-time view
console.log(formatSnapshot(gov.snapshot()));
// concurrency=3/5 rate=12/60 leases=3

Configuration

createGovernor({
  // Concurrency (optional)
  concurrency: {
    maxInFlight: 5,          // max simultaneous weight
    interactiveReserve: 1,   // slots reserved for interactive priority
  },

  // Rate limiting (optional)
  rate: {
    requestsPerMinute: 60,   // request-rate cap
    tokensPerMinute: 100_000, // token-rate cap
    windowMs: 60_000,         // rolling window (default 60s)
  },

  // Advanced (optional)
  fairness: true,             // prevent actor monopolization
  adaptive: true,             // auto-tune concurrency from deny rate + latency
  strict: true,               // throw on double release / unknown ID (dev mode)

  // Lease settings
  leaseTtlMs: 60_000,         // auto-expire (default 60s)
  reaperIntervalMs: 5_000,    // sweep interval (default 5s)

  // Observability
  onEvent: (e) => { /* acquire, deny, release, expire, warn */ },
});
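The lease settings are easier to reason about with a toy model. The sketch below shows one way a lease table with a TTL and a periodic sweep could behave. The `LeaseTable` class is hypothetical and takes the clock as an argument; in a real governor a timer firing every `reaperIntervalMs` would call the sweep.

```typescript
// Hypothetical lease table illustrating TTL expiry.
// An assumption-based sketch, not ThrottleAI internals.
interface Lease {
  id: string;
  expiresAt: number;
}

class LeaseTable {
  private leases = new Map<string, Lease>();
  private nextId = 1;
  private readonly leaseTtlMs: number;

  constructor(leaseTtlMs: number) {
    this.leaseTtlMs = leaseTtlMs;
  }

  /** Grants a lease that expires leaseTtlMs after `now`. */
  acquire(now: number): Lease {
    const lease: Lease = { id: `lease-${this.nextId++}`, expiresAt: now + this.leaseTtlMs };
    this.leases.set(lease.id, lease);
    return lease;
  }

  /** Returns true if the lease was still held, false if unknown or already gone. */
  release(id: string): boolean {
    return this.leases.delete(id);
  }

  /** One reaper sweep: drops every lease whose TTL has passed and returns their ids. */
  sweep(now: number): string[] {
    const expired: string[] = [];
    this.leases.forEach((lease, id) => {
      if (lease.expiresAt <= now) expired.push(id);
    });
    for (const id of expired) this.leases.delete(id);
    return expired;
  }

  get active(): number {
    return this.leases.size;
  }
}
```

Under this model a forgotten lease holds its slot for at most `leaseTtlMs` plus one sweep interval, which is why a shorter TTL frees capacity sooner.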

API

createGovernor(config): Governor

Factory function. Returns a Governor instance.

governor.acquire(request): AcquireDecision

Requests a lease. Returns:

// Granted
{ granted: true, leaseId: string, expiresAt: number }

// Denied
{ granted: false, reason, retryAfterMs, recommendation, limitsHint? }

Deny reasons: "concurrency" | "rate" | "budget" | "policy"

governor.release(leaseId, report?): void

Releases a lease. Always call this, even when an error occurs.

withLease(governor, request, fn, options?)

Runs fn under a lease and releases it automatically when fn completes.

withLease(gov, request, fn, {
  // strategy is one of:
  //   "deny"           (default) fail immediately
  //   "wait"           retry with backoff until maxWaitMs
  //   "wait-then-deny" retry up to maxAttempts
  strategy: "deny",
  maxWaitMs: 10_000,          // max total wait (default 10s)
  maxAttempts: 3,             // for "wait-then-deny" (default 3)
  initialBackoffMs: 250,      // starting backoff (default 250ms)
});
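To see how `initialBackoffMs`, `maxAttempts`, and `maxWaitMs` could interact, the sketch below lists the sleeps a bounded retry loop might take between attempts. The helper `backoffDelays` is hypothetical, and the doubling schedule is an assumption: the options above name a starting backoff but do not state the growth factor.

```typescript
// Hypothetical helper: lists the delays a bounded retry loop might sleep between attempts.
// Doubling the delay each time is an assumption, not documented ThrottleAI behavior.
function backoffDelays(
  initialBackoffMs: number,
  maxAttempts: number,
  maxWaitMs: number,
): number[] {
  const delays: number[] = [];
  let delay = initialBackoffMs;
  let waited = 0;
  // One sleep sits between consecutive attempts, so there are at most maxAttempts - 1 sleeps.
  for (let attempt = 1; attempt < maxAttempts; attempt++) {
    const remaining = maxWaitMs - waited;
    if (remaining <= 0) break;
    // Never sleep past the total wait budget.
    const next = Math.min(delay, remaining);
    delays.push(next);
    waited += next;
    delay *= 2;
  }
  return delays;
}
```

Under these assumptions the attempt cap usually binds first with the defaults shown above, while a small `maxWaitMs` truncates the final sleep.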

governor.snapshot(): GovernorSnapshot

Current state: concurrency, rate, tokens, and the last denial.

formatEvent(event): string / formatSnapshot(snap): string

Single-line, human-readable formatters.

Status getters

gov.activeLeases         // active lease count
gov.concurrencyActive    // in-flight weight
gov.concurrencyAvailable // remaining capacity
gov.rateCount            // requests in current window
gov.tokenRateCount       // tokens in current window

governor.dispose(): void

Stops the TTL reaper. Call on shutdown.

Adapters

Tree-shakeable wrappers: import only what you use. No runtime dependencies.

| Adapter | Import | Auto-reports |
| --------- | -------- | ------------- |
| fetch | throttleai/adapters/fetch | Outcome (from HTTP status) + latency |
| OpenAI | throttleai/adapters/openai | Outcome + latency + token usage |
| Tool | throttleai/adapters/tools | Outcome + latency + custom weight |
| Express | throttleai/adapters/express | Outcome (from res.statusCode) + latency |
| Hono | throttleai/adapters/hono | Outcome + latency |

All adapters return { ok: true, result, latencyMs } when granted and { ok: false, decision } when denied.
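That return shape works well as a TypeScript discriminated union, so the compiler can check that `result` is only read on the granted branch. The types below are a sketch derived from the sentence above. The names `Granted`, `Denied`, `AdapterResult`, and `summarize` are hypothetical, and the fields inside `decision` are assumed from the denied shape listed under acquire.

```typescript
// Hypothetical types modelled on the documented return shape.
// Field types are assumptions; check the package's own typings before relying on them.
interface Granted<T> {
  ok: true;
  result: T;
  latencyMs: number;
}

interface Denied {
  ok: false;
  decision: { reason: string; retryAfterMs: number };
}

type AdapterResult<T> = Granted<T> | Denied;

function summarize<T>(r: AdapterResult<T>): string {
  if (r.ok) {
    // Narrowed to Granted<T>: result and latencyMs are available here.
    return `ok in ${r.latencyMs}ms`;
  }
  // Narrowed to Denied: only decision is available here.
  return `denied (${r.decision.reason}), retry in ${r.decision.retryAfterMs}ms`;
}
```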

fetch

import { wrapFetch } from "throttleai/adapters/fetch";
const throttledFetch = wrapFetch(fetch, { governor: gov });
const r = await throttledFetch("https://api.example.com/v1/chat");
if (r.ok) console.log(r.response.status);

OpenAI-compatible

import { wrapChatCompletions } from "throttleai/adapters/openai";
const chat = wrapChatCompletions(openai.chat.completions.create, { governor: gov });
const r = await chat({ model: "gpt-4", messages });
if (r.ok) console.log(r.result.choices[0].message.content);

Tool calls

import { wrapTool } from "throttleai/adapters/tools";
const embed = wrapTool(myEmbedFn, { governor: gov, toolId: "embed", costWeight: 2 });
const r = await embed("hello");
if (r.ok) console.log(r.result);

Express

import { throttleMiddleware } from "throttleai/adapters/express";
app.use("/ai", throttleMiddleware({ governor: gov }));
// 429 + Retry-After header + JSON body on deny

See the examples/express-adaptive/ directory for a complete runnable server with adaptive tuning.
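For a framework without an adapter, the same deny behavior (429, Retry-After header, JSON body) can be built from a denial by hand. The sketch below is framework-agnostic and hypothetical: `toThrottledResponse` and the body fields are illustrative and are not the middleware's actual output format.

```typescript
// Hypothetical helper that turns a denial into an HTTP 429 payload.
// The body layout is an assumption, not what the Express adapter actually sends.
interface DenyInfo {
  reason: string;
  retryAfterMs: number;
}

interface ThrottledResponse {
  status: number;
  headers: Record<string, string>;
  body: { error: string; reason: string; retryAfterMs: number };
}

function toThrottledResponse(deny: DenyInfo): ThrottledResponse {
  // Retry-After carries whole seconds, so round up and never advertise 0.
  const seconds = Math.max(1, Math.ceil(deny.retryAfterMs / 1000));
  return {
    status: 429,
    headers: { "Retry-After": String(seconds) },
    body: { error: "throttled", reason: deny.reason, retryAfterMs: deny.retryAfterMs },
  };
}
```

Rounding up keeps well-behaved clients from retrying before the governor is likely to have capacity again.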

Hono

import { throttle } from "throttleai/adapters/hono";
app.use("/ai/*", throttle({ governor: gov }));
// 429 JSON on deny, leaseId stored on context

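Documentation

| Document | What it covers |
| ---------- | --------------- |
| Tuning cheatsheet | Scenario-based configuration guide, decision tree, knob reference |
| Troubleshooting | Common issues: always denied, stalls, adaptive oscillation |
| Release manifest | Release process and build artifact details |
| Repo hygiene | Asset policy and history rewrite log |

Tuning quick reference

| You see this | Adjust this |
|---|---|
| reason: "concurrency" | Increase maxInFlight or reduce call duration |
| reason: "rate" | Increase requestsPerMinute / tokensPerMinute |
| reason: "policy" (fairness) | Lower softCapRatio or increase maxInFlight |
| High retryAfterMs | Reduce leaseTtlMs so expired leases free up sooner |
| Background tasks are blocked | Increase maxInFlight or reduce interactiveReserve |
| Interactive latency is too high | Increase interactiveReserve |
| Adaptive tuning reacts too quickly | Lower alpha or raise targetDenyRate |

For more detailed guidance, see the Tuning cheatsheet.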
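The quick reference can also be surfaced in logs at the moment a denial happens. The sketch below maps each deny reason to the hint from the table. `TUNING_HINTS` and `tuningHint` are hypothetical names, and the table has no row for "budget", so that entry is a placeholder.

```typescript
type DenyReason = "concurrency" | "rate" | "budget" | "policy";

// Hints copied from the quick-reference table.
// "budget" has no row there, so its text is a placeholder (an assumption).
const TUNING_HINTS: Record<DenyReason, string> = {
  concurrency: "Increase maxInFlight or reduce call duration",
  rate: "Increase requestsPerMinute / tokensPerMinute",
  policy: "Lower softCapRatio or increase maxInFlight",
  budget: "No row in the quick reference table",
};

function tuningHint(decision: { reason: DenyReason; retryAfterMs: number }): string {
  return `[${decision.reason}] retry in ${decision.retryAfterMs}ms. ${TUNING_HINTS[decision.reason]}`;
}
```

A function like this could be called from an `onEvent` handler for deny events, assuming the event exposes the same `reason` and `retryAfterMs` fields as a denied decision.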

Examples

See the examples/ directory for runnable demos:

npx tsx examples/node-basic.ts

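Stability

ThrottleAI follows semantic versioning. The public API, meaning everything exported from throttleai and throttleai/adapters/*, has been stable since v1.0.0. Breaking changes require a major version bump.

For details on the public versus internal API, see API stability. To report a security vulnerability, see SECURITY.md.

License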

MIT