# @wildix/xbees-conversations-utils (v1.3.0)
## streamRateLimitHandlingProxy

This package wraps a `StreamChat` client in a proxy that adds shared rate-limit protection around Stream SDK calls without changing the normal sync API contract. Async calls go through retry / cooldown handling, while sync methods stay sync. If a proxied call returns a channel-like object, that object is wrapped too, so the same protection continues on nested channel operations.
### Public entry points

- `createRateLimitedStreamProxy(stream, {redis, logger})`: creates the proxy, attaches response interceptors once, and returns a rate-limit-aware `StreamChat` instance. Code: `src/streamRateLimitHandlingProxy/createRateLimitedStreamProxy.ts`
- `withStreamRateLimitOptions(options)`: builds a marker token that can be passed as the last argument of a proxied async call to override retry behavior for that single call. Code: `src/streamRateLimitHandlingProxy/withStreamRateLimitOptions.ts`
- Public exports are re-exported from `src/streamRateLimitHandlingProxy/index.ts` and then from `src/index.ts`.
### How it works

- `createRateLimitedStreamProxy(...)` tries to register axios response interceptors on the underlying Stream transport.
- The proxy classifies Stream methods into groups generated in `generatedMethodNames.ts`:
  - pure sync methods: returned as-is;
  - sync methods returning channel/client objects: returned sync, but results are re-wrapped;
  - async methods: executed through the rate-limit handler.
- Every async call is normalized through `executeWithRateLimitHandling(...)`.
- Returned channel-like objects are detected and wrapped again, so calls like `stream.channel(...).sendMessage(...)` stay protected end to end.

If the underlying Stream SDK transport does not expose an axios instance, the proxy still works, but shared protection driven by response headers becomes less effective because no interceptors can be attached.
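The classification above can be sketched with a plain `Proxy`. This is a simplified illustration, not the actual implementation: the method-group membership shown here is hypothetical (the real package generates it in `generatedMethodNames.ts`), and the handler is a stub for the retry / cooldown logic described in the next sections.

```typescript
type AnyObject = Record<string, any>;

// Hypothetical group membership; the real lists are generated from the SDK surface.
const PURE_SYNC = new Set(["getUserAgent"]);
const SYNC_RETURNING_CHANNEL = new Set(["channel"]);

function isChannelLike(value: unknown): value is AnyObject {
  // Heuristic: channel-like objects expose async operations such as sendMessage.
  return !!value && typeof (value as AnyObject).sendMessage === "function";
}

async function executeWithRateLimitHandling<T>(fn: () => Promise<T>): Promise<T> {
  // Stub for the retry / cooldown handling described below.
  return fn();
}

export function wrapWithRateLimitProxy<T extends object>(target: T): T {
  return new Proxy(target, {
    get(obj, prop) {
      const value = (obj as AnyObject)[prop as string];
      if (typeof value !== "function") return value;

      // Pure sync methods are returned as-is.
      if (PURE_SYNC.has(prop as string)) return value.bind(obj);

      // Sync methods returning channel/client objects stay sync,
      // but the result is re-wrapped so nested calls stay protected.
      if (SYNC_RETURNING_CHANNEL.has(prop as string)) {
        return (...args: unknown[]) => wrapWithRateLimitProxy(value.apply(obj, args));
      }

      // Everything else is treated as async and routed through the handler;
      // channel-like results are wrapped again on the way out.
      return (...args: unknown[]) =>
        executeWithRateLimitHandling(async () => {
          const result = await value.apply(obj, args);
          return isChannelLike(result) ? wrapWithRateLimitProxy(result) : result;
        });
    },
  });
}
```

With this sketch, `wrap(stream).channel(...).sendMessage(...)` stays protected end to end: `channel(...)` returns synchronously but wrapped, and `sendMessage(...)` then goes through the async handler.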
### Protection layers

#### 1. Soft global throttle

The interceptor reads `x-ratelimit-limit` and `x-ratelimit-remaining`. When usage becomes high, it enables a short shared Redis cooldown before real 429 responses start happening. This smooths bursts instead of waiting for hard throttling.

- Trigger source: response headers on successful or failed requests
- Effect:
  - usage `>= 95%`: delay next requests by `3000ms`;
  - usage between `85%` and `95%`: delay by `1500ms`;
  - usage between `70%` and `85%`: delay by `500ms`;
  - usage `< 70%`: clear soft throttle.
- Scope: global shared throttle
- Code: `helpers/processStreamRateLimitHeaders.ts`
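The tiered mapping above can be sketched as a small pure function. The function name and shape are assumptions for illustration; the real logic lives in `helpers/processStreamRateLimitHeaders.ts`.

```typescript
// Maps x-ratelimit-limit / x-ratelimit-remaining to a soft-throttle delay,
// following the thresholds documented above.
export function softThrottleDelayMs(limit: number, remaining: number): number {
  if (!Number.isFinite(limit) || limit <= 0 || !Number.isFinite(remaining)) {
    return 0; // malformed headers: do not throttle
  }
  const usage = (limit - remaining) / limit; // fraction of the window consumed
  if (usage >= 0.95) return 3000;
  if (usage >= 0.85) return 1500;
  if (usage >= 0.7) return 500;
  return 0; // below 70%: clear the soft throttle
}
```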
#### 2. Hard per-operation 429 cooldown

If Stream returns HTTP 429, the handler converts that error into `RateLimitExceededException`, calculates the retry delay from `retry-after` or fallback values, stores a Redis cooldown for the specific operation, and optionally retries the call.

If another worker hits the same operation while that cooldown is still active, the request is short-circuited locally before touching Stream. To keep that synthetic error useful, the latest real `x-ratelimit-*` snapshot is cached in Redis next to the cooldown and reused in the locally generated exception.

- Trigger source: HTTP 429 / `retry-after` / `x-ratelimit-*`
- Effect: per-operation block, optional retry, synthetic local short-circuit when the cooldown is already known
- Scope: operation-level
- Code:
  - execution flow: `helpers/executeWithRateLimitHandling.ts`
  - 429 parsing: `helpers/processStreamRateLimitException.ts`
  - exception creation: `helpers/createRateLimitExceededException.ts`
  - snapshot serialization: `helpers/rateLimitSnapshot.ts`
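The per-operation short-circuit can be sketched as follows, with an in-memory `Map` standing in for Redis. The key layout, helper names, and exception fields are illustrative assumptions, not the package's actual API.

```typescript
// Illustrative synthetic exception; the real one is built in
// helpers/createRateLimitExceededException.ts.
class RateLimitExceededException extends Error {
  constructor(
    public readonly operation: string,
    public readonly retryAfterMs: number,
  ) {
    super(`Rate limit exceeded for ${operation}`);
  }
}

// In-memory stand-in for the shared Redis cooldown store: operation -> epoch ms.
const cooldownUntil = new Map<string, number>();

// Called after a real 429: record when this operation may be tried again.
export function setOperationCooldown(operation: string, delayMs: number, now = Date.now()): void {
  cooldownUntil.set(operation, now + delayMs);
}

// Preflight check: if another worker already observed a 429 for this operation,
// fail fast locally instead of hitting Stream again.
export function checkOperationCooldown(operation: string, now = Date.now()): void {
  const until = cooldownUntil.get(operation);
  if (until !== undefined && until > now) {
    throw new RateLimitExceededException(operation, until - now);
  }
}
```

In the real package the synthetic exception would also carry the cached `x-ratelimit-*` snapshot, so callers see realistic limit data even on locally generated errors.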
#### 3. App-wide budget cooldown

The interceptor also reads `x-budget-*` headers. If overall Stream app budget usage becomes high, it enables a global Redis cooldown. While this cooldown is active, requests are delayed in small sleep chunks so they can resume early if a fresher response lowers the cooldown.

- Trigger source: `x-budget-limit-ms`, `x-budget-remaining-ms`, `x-budget-used-ms`
- Effect:
  - usage `>= 80%`: enable strong cooldown (30s-60s);
  - usage between `70%` and `80%`: keep a relaxed cooldown (5s-10s);
  - usage between `60%` and `70%`: keep a minimal cooldown (1s-2s);
  - usage `< 60%`: clear cooldown.
- Scope: whole application, not a single operation
- Code: `helpers/processStreamBudgetHeaders.ts`
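The budget tiers above can be sketched the same way as the soft throttle. Function names and the range-picking approach are assumptions; the real logic lives in `helpers/processStreamBudgetHeaders.ts`.

```typescript
type BudgetCooldown = { minMs: number; maxMs: number } | null;

// Maps x-budget-* usage to a cooldown range, following the tiers above.
export function budgetCooldownRange(limitMs: number, usedMs: number): BudgetCooldown {
  if (!Number.isFinite(limitMs) || limitMs <= 0) return null; // malformed headers
  const usage = usedMs / limitMs;
  if (usage >= 0.8) return { minMs: 30_000, maxMs: 60_000 }; // strong cooldown
  if (usage >= 0.7) return { minMs: 5_000, maxMs: 10_000 }; // relaxed cooldown
  if (usage >= 0.6) return { minMs: 1_000, maxMs: 2_000 }; // minimal cooldown
  return null; // below 60%: clear the cooldown
}

// A concrete delay would be drawn from the range with jitter, e.g.:
export function pickCooldownMs(
  range: { minMs: number; maxMs: number },
  rng: () => number = Math.random,
): number {
  return Math.round(range.minMs + rng() * (range.maxMs - range.minMs));
}
```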
### Retry and cooldown behavior

- Local preflight checks happen before request execution when `enableCooldown` is on:
  - active per-operation cooldown: fail fast with a synthetic `RateLimitExceededException`;
  - active soft throttle: delay the request briefly;
  - active budget cooldown: delay the request globally.
- Real 429 responses can still be retried if:
  - `attempt < maxAttempts`;
  - computed delay `<= maxRetryableDelayMs`.
- Delay calculation rules:
  - if Stream provides `Retry-After` / reset information, that server value is preferred even if it is larger than `maxDelayMs`;
  - `maxDelayMs` only caps the fallback delay when server retry timing is missing or malformed;
  - `maxRetryableDelayMs` decides whether the computed delay is still retryable or should be propagated immediately.
- If retry is not allowed, the processed rate-limit exception is propagated to the caller.

Core logic: `src/streamRateLimitHandlingProxy/helpers/executeWithRateLimitHandling.ts`
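The delay and retry rules above can be condensed into two small decision functions. These are a sketch of the documented rules, not the package's actual signatures; the real flow is in `helpers/executeWithRateLimitHandling.ts`.

```typescript
interface RetryConfig {
  maxAttempts: number;
  maxDelayMs: number; // caps only the locally computed fallback delay
  maxRetryableDelayMs: number; // above this, propagate instead of retrying
}

// Rule 1: server-provided retry timing wins, even when it exceeds maxDelayMs.
// Rule 2: otherwise the fallback delay applies, capped by maxDelayMs.
export function computeRetryDelayMs(
  serverRetryAfterMs: number | undefined,
  fallbackDelayMs: number,
  config: RetryConfig,
): number {
  if (serverRetryAfterMs !== undefined && Number.isFinite(serverRetryAfterMs) && serverRetryAfterMs >= 0) {
    return serverRetryAfterMs;
  }
  return Math.min(fallbackDelayMs, config.maxDelayMs);
}

// Rule 3: retry only while attempts remain and the delay is still acceptable.
export function shouldRetry(attempt: number, delayMs: number, config: RetryConfig): boolean {
  return attempt < config.maxAttempts && delayMs <= config.maxRetryableDelayMs;
}
```

With the documented defaults (`maxDelayMs = 5000`, `maxRetryableDelayMs = 10000`), a server `Retry-After` of 8 seconds is still honored and retried, while a 12-second one is propagated to the caller immediately.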
### Configuration

Default behavior is defined in `src/streamRateLimitHandlingProxy/constants.ts`:

- `maxAttempts = 3`
- `enableCooldown = true`
- `maxDelayMs = 5000`
- `maxRetryableDelayMs = 10000`

The same file also contains:

- Redis key prefixes for hard cooldown, soft throttle, and budget cooldown
- fallback retry values and jitter
- budget thresholds and cooldown ranges
- soft-throttle thresholds and delays

Notes:

- `enableCooldown` controls local Redis-based preflight checks and shared cooldown reuse.
- `maxDelayMs` is a fallback cap, not a hard upper bound for a server-provided `Retry-After`.
- Missing or invalid rate-limit headers do not break the proxy, but they reduce how much shared throttling state can be inferred.
Per-call overrides are passed as the last argument:

```ts
stream.queryChannels(filters, sort, options, withStreamRateLimitOptions({
  maxAttempts: 1,
  maxRetryableDelayMs: 0,
}));
```

Argument extraction is implemented in `helpers/splitCallArgsAndOptions.ts`.
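One way the marker-token extraction could work is sketched below. The `Symbol` tagging scheme and these exact signatures are assumptions for illustration; the real extraction is in `helpers/splitCallArgsAndOptions.ts`.

```typescript
// Private tag so ordinary option objects are never mistaken for the marker.
const RATE_LIMIT_OPTIONS = Symbol("streamRateLimitOptions");

interface StreamRateLimitOptions {
  maxAttempts?: number;
  maxRetryableDelayMs?: number;
}

// Builds the marker token passed as the last argument of a proxied call.
export function withStreamRateLimitOptions(options: StreamRateLimitOptions) {
  return { [RATE_LIMIT_OPTIONS]: true as const, options };
}

// Splits the real SDK arguments from a trailing marker token, if present.
export function splitCallArgsAndOptions(args: unknown[]): {
  callArgs: unknown[];
  options?: StreamRateLimitOptions;
} {
  const last = args[args.length - 1] as any;
  if (last && last[RATE_LIMIT_OPTIONS] === true) {
    return { callArgs: args.slice(0, -1), options: last.options };
  }
  return { callArgs: args };
}
```

Because the tag is a unique `Symbol`, plain option objects pass through to the SDK untouched; only tokens built by `withStreamRateLimitOptions(...)` are stripped.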
