@splashcodex/api-key-manager
v4.1.0
Published
Universal API Key Rotation System for rate-limited APIs
Maintainers
Readme
@splashcodex/ApiKeyManager v4.1 — Mastermind Edition
Universal API Key Rotation System with Resilience, Load Balancing, Semantic Caching & AI Gateway Features
Features
- Circuit Breaker — Keys transition through
CLOSED → OPEN → HALF_OPEN → DEAD - Error Classification — Automatic detection of 429 (Quota), 403 (Auth), 5xx (Transient), Timeout, Safety blocks
- Pluggable Strategies —
StandardStrategy,WeightedStrategy,LatencyStrategy execute()Wrapper — Single method: get key → call → latency → retry → fallback- Event Emitter — Typed lifecycle hooks for monitoring & alerting
- Auto-Retry with Backoff — Built-in retry loop with exponential backoff + jitter
- Request Timeout —
AbortController-based timeout per attempt - Fallback Function — Graceful degradation when all keys fail
- Provider Tagging — Multi-provider routing (
openai,gemini, etc.) - Health Checks — Periodic key validation and auto-recovery
- Bulkhead / Concurrency — Limits concurrent
execute()calls - Semantic Cache (v4 NEW) — Cosine-similarity cache with pluggable embeddings
- Streaming Support (v4.1 NEW) —
executeStream()with initial retry + cache replay - Recursion Guard (v4 NEW) — Prevents infinite loops when
getEmbeddingcallsexecute() - State Persistence — Survives restarts via pluggable storage
- 100% Backward Compatible — v1.x, v2.x, and v3.x code works without changes
Installation
npm install @splashcodex/api-key-managerQuick Start
import { ApiKeyManager } from '@splashcodex/api-key-manager';
// Simple (v1/v2 compatible)
const manager = new ApiKeyManager(['key1', 'key2', 'key3']);
const key = manager.getKey();
manager.markSuccess(key!);
// v3+ — Full power
const result = await manager.execute(
(key) => fetch(`https://api.example.com?key=${key}`),
{ maxRetries: 3, timeoutMs: 5000 }
);v4.0 — Semantic Cache (Mastermind Edition)
Automatically cache API responses by semantic similarity. Identical or near-identical prompts return cached results without consuming API quota.
import { ApiKeyManager } from '@splashcodex/api-key-manager';
const manager = new ApiKeyManager(['key1', 'key2'], {
semanticCache: {
threshold: 0.92, // 92% cosine similarity to match
ttlMs: 86400000, // 24-hour TTL (default)
getEmbedding: async (text) => {
// Your embedding function (e.g. OpenAI, Gemini, local model)
return await myEmbeddingModel.embed(text);
}
}
});
// First call → API hit, cached
const r1 = await manager.execute(apiFn, { prompt: 'What is the weather?' });
// Second call → Semantic Cache HIT (no API call)
const r2 = await manager.execute(apiFn, { prompt: 'How is the weather today?' });v4.1 — Streaming Support
Real-time response handling with the same resilience as execute().
const stream = await manager.executeStream(async (key) => {
return await gemini.generateContentStream({ prompt: "..." });
}, { prompt: "..." });
for await (const chunk of stream) {
console.log(chunk.text()); // Zero-latency interaction
}- Smart Retries: If the initial connection fails, it rotates keys and retries.
- Cache Replay: Semantic cache accumulates stream chunks and replays the full stream on a cache hit.
- Event Driven: Emits
executeSuccess,executeFailed, andretryjust like the standard wrapper.
Recursion Guard: If your
getEmbeddingcallback internally callsexecute(), the cache automatically skips on nested calls to prevent infinite recursion.
execute() Wrapper
Wraps the entire lifecycle into one method:
const manager = new ApiKeyManager(keys, {
storage: localStorage,
strategy: new WeightedStrategy(),
fallbackFn: () => cachedResponse,
concurrency: 10
});
const result = await manager.execute(
async (key, signal) => {
const res = await fetch(url, { headers: { 'x-api-key': key }, signal });
return res.json();
},
{ maxRetries: 3, timeoutMs: 10000 }
);
// Handles: key selection → cache → timeout → retry → fallback → latency trackingEvent Emitter
Monitor every state change:
manager.on('keyDead', (key) => alertTeam(`Key ${key} permanently dead`));
manager.on('circuitOpen', (key) => metrics.increment('circuit_opens'));
manager.on('keyRecovered', (key) => log(`Key ${key} recovered`));
manager.on('retry', (key, attempt, delay) => log(`Retry #${attempt} in ${delay}ms`));
manager.on('fallback', (reason) => log(`Fallback triggered: ${reason}`));
manager.on('allKeysExhausted', () => alert('No healthy keys!'));
manager.on('bulkheadRejected', () => metrics.increment('rejected'));
manager.on('healthCheckPassed', (key) => log(`${key} healthy`));
manager.on('healthCheckFailed', (key, err) => log(`${key} unhealthy`));Load Balancing Strategies
Weighted (Cost Optimization)
import { ApiKeyManager, WeightedStrategy } from '@splashcodex/api-key-manager';
const manager = new ApiKeyManager(
[
{ key: 'free-key-1', weight: 1.0 },
{ key: 'free-key-2', weight: 1.0 },
{ key: 'paid-backup', weight: 0.1 },
],
{ strategy: new WeightedStrategy() }
);Latency (Performance)
import { ApiKeyManager, LatencyStrategy } from '@splashcodex/api-key-manager';
const manager = new ApiKeyManager(keys, { strategy: new LatencyStrategy() });
// After execute(), latency is tracked automaticallyProvider Tagging
Route requests to specific providers:
const manager = new ApiKeyManager([
{ key: 'sk-openai-1', weight: 1.0, provider: 'openai' },
{ key: 'sk-openai-2', weight: 1.0, provider: 'openai' },
{ key: 'AIza-gemini', weight: 0.5, provider: 'gemini' },
]);
const openaiKey = manager.getKeyByProvider('openai');
const geminiKey = manager.getKeyByProvider('gemini');Health Checks
Proactively detect recovered keys:
manager.setHealthCheck(async (key) => {
const res = await fetch(`https://api.openai.com/v1/models`, {
headers: { Authorization: `Bearer ${key}` }
});
return res.ok;
});
manager.startHealthChecks(60_000); // Check every 60 seconds
// manager.stopHealthChecks(); // Stop when doneError Handling
try {
const result = await callApi(key);
manager.markSuccess(key, duration);
} catch (error) {
const classification = manager.classifyError(error);
manager.markFailed(key, classification);
if (classification.retryable) {
const delay = manager.calculateBackoff(attempt);
await sleep(delay);
}
}API Reference
Constructor
// Legacy (v1/v2)
new ApiKeyManager(keys, storage?, strategy?)
// v3+ Options
new ApiKeyManager(keys, {
storage?, // Pluggable storage { getItem, setItem }
strategy?, // LoadBalancingStrategy instance
fallbackFn?, // () => any — called when all keys exhausted
concurrency?, // Max concurrent execute() calls
semanticCache?, // v4: { threshold, ttlMs, getEmbedding }
})Methods
| Method | Description |
|--------|-------------|
| getKey() | Returns best available key via strategy |
| getKeyByProvider(provider) | Get key filtered by provider tag |
| markSuccess(key, durationMs?) | Report success + optional latency |
| markFailed(key, classification) | Report failure with error type |
| classifyError(error, finishReason?) | Classify an error automatically |
| execute(fn, options?) | Full lifecycle wrapper with retry/timeout |
| executeStream(fn, options?) | Streaming lifecycle wrapper |
| calculateBackoff(attempt) | Get backoff delay with jitter |
| getStats() | Get pool health statistics |
| getKeyCount() | Count of non-DEAD keys |
| setHealthCheck(fn) | Set health check function |
| startHealthChecks(ms) | Start periodic health checks |
| stopHealthChecks() | Stop health checks |
Events
| Event | Payload | Trigger |
|-------|---------|---------|
| keyDead | key: string | Key marked as permanently dead |
| circuitOpen | key: string | Key circuit opened (cooldown) |
| circuitHalfOpen | key: string | Key entering test phase |
| keyRecovered | key: string | Key recovered from failure |
| fallback | reason: string | Fallback function invoked |
| allKeysExhausted | — | All keys dead, no fallback |
| retry | key, attempt, delayMs | Retry attempt starting |
| executeSuccess | key, durationMs | execute() completed successfully |
| executeFailed | key, error | execute() attempt failed |
| bulkheadRejected | — | Concurrency limit reached |
| healthCheckPassed | key: string | Health check succeeded |
| healthCheckFailed | key, error | Health check failed |
Custom Errors
| Error | When |
|-------|------|
| TimeoutError | Request exceeded timeoutMs |
| BulkheadRejectionError | Concurrency limit exceeded |
| AllKeysExhaustedError | All keys dead, no fallback |
Strategies
| Strategy | Algorithm | Best For |
|----------|-----------|----------|
| StandardStrategy | Least Failures → LRU | General use |
| WeightedStrategy | Probabilistic by weight | Cost optimization |
| LatencyStrategy | Lowest avg latency | Performance |
License
ISC
