ai-resilience
v0.3.1
Axios retry++ primitives for resilient AI and backend systems.
ai-resilience
Reliable Axios retries, AI response recovery, and provider fallback for TypeScript apps.
ai-resilience helps AI and backend systems survive flaky APIs, malformed model output, rate limits, provider outages, and distributed retry problems.
Highlights
| Capability | What it helps with |
| -------------------------- | ---------------------------------------------------------------------------- |
| Smart Axios retries | Retry temporary network errors, rate limits, and retryable HTTP status codes |
| Semantic AI retry | Retry successful 200 OK responses when the AI output is invalid |
| JSON repair and validation | Repair malformed JSON and validate responses with Zod |
| Provider fallback | Move from a failing provider to a healthy backup provider |
| Circuit breakers | Stop sending traffic to unhealthy providers temporarily |
| Distributed coordination | Coordinate retries, locks, and rate limits across processes |
| Telemetry hooks | Add logs, metrics, lifecycle hooks, and OpenTelemetry-style spans |
| TypeScript-first API | Strong types for retry config, policies, hooks, providers, and metrics |
Installation
npm install ai-resilience axios axios-retry
Quick Start
Add retry behavior to any Axios instance.
import axios from "axios";
import { applyAiResilience, ConsoleRetryLogger } from "ai-resilience";
const client = axios.create({
  baseURL: "https://api.example.com",
});

// Up to 3 retries, exponential backoff from 150 ms, capped at 10 s, with full jitter.
applyAiResilience(client, {
  retries: 3,
  strategy: "exponential",
  baseDelayMs: 150,
  maxDelayMs: 10_000,
  jitter: "full",
  logger: new ConsoleRetryLogger("info"),
  hooks: {
    // Called before each retry with the attempt number and the upcoming delay.
    onRetry: ({ attempt, nextDelayMs }) => {
      console.log(`Retry ${attempt} in ${nextDelayMs}ms`);
    },
  },
});
const response = await client.get("/health");
console.log(response.data);
Why This Exists
Traditional HTTP retries only help when the request fails.
AI systems have a different problem: the request can succeed, but the content can still be unusable. A model might return malformed JSON, an empty answer, a refusal, a truncated response, or a payload that does not match your schema.
ai-resilience handles both layers:
| Layer | Example problem | Tool |
| ------------------ | ------------------------------------------- | -------------------------- |
| HTTP failure | 429, 503, timeout, connection reset | applyAiResilience |
| AI content failure | Invalid JSON, schema mismatch, empty answer | requestWithSemanticRetry |
| Provider failure | OpenAI is down or too slow | createProviderFallback |
| Platform failure | Multiple workers retrying the same job | RedisRetryCoordinator |
Semantic Retry for AI Responses
Use requestWithSemanticRetry when the response must be valid, structured, and schema-safe.
import axios from "axios";
import { requestWithSemanticRetry } from "ai-resilience";
import { z } from "zod";
const client = axios.create({
  baseURL: "https://api.example.com",
});

const result = await requestWithSemanticRetry(client, {
  request: {
    method: "post",
    url: "/generate",
    data: {
      prompt: "Return a JSON object with title and tags.",
    },
  },
  // The response body must match this Zod schema to count as valid.
  schema: z.object({
    title: z.string(),
    tags: z.array(z.string()),
  }),
  repairJson: true,
  requireJson: true,
  hooks: {
    // Called when a response arrives but fails semantic validation.
    onSemanticRetry: ({ issue, attempt }) => {
      console.log(`Semantic retry ${attempt}: ${issue.kind}`);
    },
  },
});
console.log(result.data);
Semantic retry can detect:
- Invalid JSON
- Schema mismatches
- Empty responses
- Refusals
- Truncated answers
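To make the failure kinds above concrete, here is a minimal standalone sketch of how such a classifier could work. It is not the library's implementation: `classifyOutput`, `ShapeCheck`, and `hasTitleAndTags` are hypothetical names, the refusal and truncation heuristics are deliberately naive assumptions, and a hand-written shape check stands in for a Zod schema so the sketch has no dependencies.

```typescript
// Illustrative only: these heuristics are assumptions, not ai-resilience internals.
type FailureKind =
  | "invalid_json"
  | "schema_mismatch"
  | "empty_response"
  | "refusal"
  | "truncated";

// Hypothetical shape check standing in for a Zod schema.
type ShapeCheck = (value: unknown) => boolean;

// Returns the detected failure kind, or null when the output looks usable.
function classifyOutput(text: string, matchesShape: ShapeCheck): FailureKind | null {
  const trimmed = text.trim();
  if (trimmed.length === 0) return "empty_response";

  // Naive refusal heuristic: a few common refusal openers.
  if (/^(i'm sorry|i cannot|i can't)\b/i.test(trimmed)) return "refusal";

  let parsed: unknown;
  try {
    parsed = JSON.parse(trimmed);
  } catch {
    // Starts like JSON but never closes: treat it as truncated output.
    const opensJson = trimmed.startsWith("{") || trimmed.startsWith("[");
    const closesJson = trimmed.endsWith("}") || trimmed.endsWith("]");
    return opensJson && !closesJson ? "truncated" : "invalid_json";
  }

  return matchesShape(parsed) ? null : "schema_mismatch";
}

// Shape check for the { title, tags } object used in the example above.
const hasTitleAndTags: ShapeCheck = (value) => {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  return (
    typeof record.title === "string" &&
    Array.isArray(record.tags) &&
    record.tags.every((tag) => typeof tag === "string")
  );
};

console.log(classifyOutput('{"title":"Hi","tags":["a"]}', hasTitleAndTags)); // null
console.log(classifyOutput('{"title":"Hi"}', hasTitleAndTags)); // "schema_mismatch"
console.log(classifyOutput('{"title": "Hi", "tags": ["a"', hasTitleAndTags)); // "truncated"
```

A caller would retry the request whenever the result is not null, which is the step that plain HTTP retries cannot cover because the response status is still 200.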
Provider Fallback
Use provider fallback when one AI provider should automatically hand off to another.
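The idea behind priority fallback with a circuit breaker can be sketched independently of the library. The sketch below is an assumption about how such a mechanism typically works, not a description of ai-resilience internals: `CircuitBreaker`, `ProviderSlot`, and `pickProvider` are hypothetical names, and only `failureThreshold`, `cooldownMs`, and `priority` are borrowed from this section's configuration.

```typescript
// Illustrative sketch: a consecutive-failure circuit breaker with a cooldown.
class CircuitBreaker {
  private consecutiveFailures = 0;
  private openedAt: number | null = null;
  private readonly failureThreshold: number;
  private readonly cooldownMs: number;
  private readonly now: () => number;

  // `now` is injectable so time can be controlled in tests.
  constructor(failureThreshold: number, cooldownMs: number, now: () => number = Date.now) {
    this.failureThreshold = failureThreshold;
    this.cooldownMs = cooldownMs;
    this.now = now;
  }

  // True when the breaker is closed, or open but past its cooldown (trial request).
  canRequest(): boolean {
    if (this.openedAt === null) return true;
    return this.now() - this.openedAt >= this.cooldownMs;
  }

  recordSuccess(): void {
    this.consecutiveFailures = 0;
    this.openedAt = null;
  }

  recordFailure(): void {
    this.consecutiveFailures += 1;
    if (this.consecutiveFailures >= this.failureThreshold) {
      // Open (or re-open) the breaker and restart the cooldown.
      this.openedAt = this.now();
    }
  }
}

interface ProviderSlot {
  id: string;
  priority: number; // lower number = preferred
  breaker: CircuitBreaker;
}

// Picks the highest-priority provider whose breaker currently allows traffic.
function pickProvider(providers: ProviderSlot[]): ProviderSlot | undefined {
  return [...providers]
    .sort((a, b) => a.priority - b.priority)
    .find((provider) => provider.breaker.canRequest());
}
```

With a threshold of 3 and a 30-second cooldown, three consecutive failures on the primary route traffic to the backup until the cooldown elapses, after which the primary gets a trial request.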
import { createProviderFallback } from "ai-resilience";
// openaiClient and anthropicClient are provider SDK clients assumed to be created elsewhere.
const fallback = createProviderFallback(
  [
    {
      id: "openai-primary",
      type: "openai",
      priority: 1,
      request: (input) => openaiClient.responses.create(input),
    },
    {
      id: "anthropic-backup",
      type: "anthropic",
      priority: 2,
      request: (input) => anthropicClient.messages.create(input),
    },
  ],
  {
    strategy: "priority",
    circuitBreaker: {
      failureThreshold: 3,
      cooldownMs: 30_000,
    },
    hooks: {
      onProviderFallback: ({ fromProviderId, toProviderId }) => {
        console.log(`Fallback: ${fromProviderId} -> ${toProviderId}`);
      },
    },
  },
);

const response = await fallback.request({
  prompt: "Summarize this text.",
});
const metrics = fallback.snapshot();
Distributed Tools
Retry Coordination
RedisRetryCoordinator coordinates retries and locks across multiple processes.
import { RedisRetryCoordinator } from "ai-resilience";
// `redis` is assumed to be a Redis client created elsewhere.
const coordinator = new RedisRetryCoordinator({
  namespace: "my-api",
  redis,
});
await coordinator.incrementRetry("tenant-a:request-123");
const locked = await coordinator.acquireLock("provider-routing");
console.log(locked);
Rate Limiting
DistributedRateLimiter provides Redis-compatible fixed-window rate limiting.
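For readers unfamiliar with the technique, fixed-window limiting counts requests per key within consecutive, non-overlapping time windows. The in-memory sketch below illustrates only that counting rule and is not the library's implementation; `InMemoryFixedWindowLimiter` and the `remaining` field are hypothetical, while `key`, `limit`, `windowSeconds`, and `allowed` mirror the names used in this section. In a Redis-backed version the same counter is commonly kept with `INCR` plus `EXPIRE` on a per-window key.

```typescript
interface ConsumeInput {
  key: string;
  limit: number;
  windowSeconds: number;
}

interface ConsumeResult {
  allowed: boolean;
  remaining: number; // hypothetical field, added for illustration
}

// Illustrative single-process fixed-window limiter.
class InMemoryFixedWindowLimiter {
  private readonly counters = new Map<string, number>();
  private readonly now: () => number;

  // `now` returns milliseconds since the epoch; injectable for testing.
  constructor(now: () => number = Date.now) {
    this.now = now;
  }

  consume({ key, limit, windowSeconds }: ConsumeInput): ConsumeResult {
    // Every request in the same window maps to the same counter key.
    const windowIndex = Math.floor(this.now() / (windowSeconds * 1000));
    const counterKey = `${key}:${windowIndex}`;
    const count = (this.counters.get(counterKey) ?? 0) + 1;
    this.counters.set(counterKey, count);
    return { allowed: count <= limit, remaining: Math.max(0, limit - count) };
  }
}
```

Note that this sketch never evicts counters from past windows, and that fixed windows allow short bursts around a window boundary; both are acceptable for illustration but matter in production.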
import { DistributedRateLimiter } from "ai-resilience";
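// `redis` is assumed to be a Redis client created elsewhere.
const limiter = new DistributedRateLimiter(redis);

// Consume one unit from a budget of 100 requests per 60-second window.
const result = await limiter.consume({
  key: "tenant-a:openai",
  limit: 100,
  windowSeconds: 60,
});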
console.log(result.allowed);
Adaptive Routing
AdaptiveRouter ranks providers by latency, failure rate, and cost.
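One plausible way to combine latency, failure rate, and cost into a single ranking is a weighted sum over normalized signals. The sketch below shows that idea under stated assumptions and does not describe how AdaptiveRouter actually scores providers: `ProviderMetrics`, its field names, `rankProviders`, and the normalize-by-maximum step are all hypothetical; only the three weight names come from this section.

```typescript
// Hypothetical metric shape; the library's real metric fields may differ.
interface ProviderMetrics {
  latencyMs: number;
  failureRate: number; // fraction of failed requests, 0..1
  costPerRequest: number;
}

interface RoutingWeights {
  latencyWeight: number;
  failureWeight: number;
  costWeight: number;
}

// Returns provider ids ordered from best (lowest score) to worst.
function rankProviders(
  metricsByProvider: Record<string, ProviderMetrics>,
  weights: RoutingWeights,
): string[] {
  const entries = Object.entries(metricsByProvider);

  // Normalize each signal by its maximum so different units are comparable.
  const maxOf = (pick: (m: ProviderMetrics) => number): number =>
    Math.max(...entries.map(([, m]) => pick(m)), Number.EPSILON);

  const maxLatency = maxOf((m) => m.latencyMs);
  const maxFailure = maxOf((m) => m.failureRate);
  const maxCost = maxOf((m) => m.costPerRequest);

  const score = (m: ProviderMetrics): number =>
    weights.latencyWeight * (m.latencyMs / maxLatency) +
    weights.failureWeight * (m.failureRate / maxFailure) +
    weights.costWeight * (m.costPerRequest / maxCost);

  return entries
    .map(([id, m]) => ({ id, score: score(m) }))
    .sort((a, b) => a.score - b.score)
    .map(({ id }) => id);
}
```

With a failure weight of 2 against a latency weight of 1, a slower but more reliable provider can outrank a faster, flakier one, which is the trade-off the weights are meant to express.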
import { AdaptiveRouter } from "ai-resilience";
// `providers` is assumed to be the provider list defined elsewhere.
const router = new AdaptiveRouter(providers, {
  latencyWeight: 1,
  failureWeight: 2,
  costWeight: 0.5,
});
const provider = router.select(metricsByProvider);
Streaming Recovery
recoverStream collects streaming chunks and calls recovery hooks when chunk gaps are longer than your configured threshold.
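The gap check itself is simple to state: compare each chunk's arrival time with the previous one and flag any interval longer than the threshold. The sketch below isolates that rule over recorded arrival timestamps; it is a simplification for illustration, not the library's streaming code, and `findChunkGaps` and `ChunkGap` are hypothetical names.

```typescript
interface ChunkGap {
  afterChunkIndex: number; // index of the last chunk received before the gap
  gapMs: number;
}

// Finds every interval between consecutive chunks that exceeds maxChunkGapMs.
function findChunkGaps(arrivalTimesMs: number[], maxChunkGapMs: number): ChunkGap[] {
  const gaps: ChunkGap[] = [];
  let previous: number | undefined;

  arrivalTimesMs.forEach((time, index) => {
    if (previous !== undefined) {
      const gapMs = time - previous;
      // Strictly longer than the threshold counts as a gap.
      if (gapMs > maxChunkGapMs) {
        gaps.push({ afterChunkIndex: index - 1, gapMs });
      }
    }
    previous = time;
  });

  return gaps;
}

// Chunks arrive at 0 s, 1 s, 7 s, 7.5 s, and 13 s with a 5 s threshold.
console.log(findChunkGaps([0, 1000, 7000, 7500, 13000], 5000));
```

In a live stream the same comparison would run against a timer rather than recorded timestamps, so that recovery can start while the stream is still stalled.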
import { recoverStream } from "ai-resilience";
// `stream` is assumed to be a streaming response obtained elsewhere.
const chunks = await recoverStream(stream, {
  maxChunkGapMs: 5_000,
  onRecover: (chunks) => {
    console.log(`Stream recovery started after ${chunks.length} chunks`);
  },
});
Telemetry
withTelemetry wraps an async operation with an OpenTelemetry-style tracer.
The package does not force @opentelemetry/api as a runtime dependency. Pass any tracer object that supports startSpan.
import { withTelemetry } from "ai-resilience";
// `tracer`, `client`, and `payload` are assumed to be defined elsewhere.
const result = await withTelemetry(
  tracer,
  "ai.request",
  () => client.post("/chat/completions", payload),
  {
    provider: "openai",
  },
);
Feature Overview
| Area | Features |
| ------------------- | ----------------------------------------------------------------------------------------- |
| Retry engine | Fixed, linear, and exponential retry strategies |
| Jitter | none, full, equal, and decorrelated jitter |
| Retry rules | Method controls, status-code controls, and async custom retry conditions |
| Hooks | Retry, success, give-up, semantic retry, and provider fallback hooks |
| Logging | Structured logger interface and console logger |
| AI validation | JSON parsing, JSON repair, Zod schema validation, and AI failure detection |
| Providers | Priority routing, round-robin routing, least-latency routing, and health-weighted routing |
| Protection | Circuit breakers, provider health tracking, and adaptive fallback delay |
| Distributed systems | Redis-compatible retry coordination and fixed-window rate limiting |
| Observability | Metrics snapshots, lifecycle hooks, and OpenTelemetry-style spans |
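To illustrate how the retry strategies and jitter modes listed above interact, here is a standalone sketch of a delay calculation. The formulas follow the widely used definitions of full, equal, and decorrelated jitter; whether ai-resilience computes delays exactly this way is an assumption. `computeDelayMs`, `DelayOptions`, the 1-based `attempt`, and the injectable `random` are hypothetical, while the strategy names, jitter names, `baseDelayMs`, and `maxDelayMs` come from this README.

```typescript
type Strategy = "fixed" | "linear" | "exponential";
type Jitter = "none" | "full" | "equal" | "decorrelated";

interface DelayOptions {
  strategy: Strategy;
  jitter: Jitter;
  baseDelayMs: number;
  maxDelayMs: number;
  // Random source in [0, 1); injectable so results can be made deterministic.
  random?: () => number;
}

// attempt is 1-based; previousDelayMs is only used by decorrelated jitter.
function computeDelayMs(
  attempt: number,
  options: DelayOptions,
  previousDelayMs: number = options.baseDelayMs,
): number {
  const { strategy, jitter, baseDelayMs, maxDelayMs } = options;
  const random = options.random ?? Math.random;

  // Step 1: the un-jittered delay for this attempt, capped at maxDelayMs.
  let raw = baseDelayMs;
  if (strategy === "linear") raw = baseDelayMs * attempt;
  if (strategy === "exponential") raw = baseDelayMs * 2 ** (attempt - 1);
  const capped = Math.min(raw, maxDelayMs);

  // Step 2: apply jitter.
  if (jitter === "none") return capped;
  // Full jitter: uniform in [0, capped).
  if (jitter === "full") return random() * capped;
  // Equal jitter: half fixed, half random, so uniform in [capped / 2, capped).
  if (jitter === "equal") return capped / 2 + random() * (capped / 2);
  // Decorrelated jitter: uniform in [base, previous * 3), capped at maxDelayMs.
  const upper = previousDelayMs * 3;
  return Math.min(baseDelayMs + random() * (upper - baseDelayMs), maxDelayMs);
}

// Third attempt with the Quick Start settings and no jitter: 150 * 2^2.
const base = { strategy: "exponential", baseDelayMs: 150, maxDelayMs: 10_000 } as const;
console.log(computeDelayMs(3, { ...base, jitter: "none" })); // 600
```

Jitter spreads retries from many clients over time, which avoids synchronized retry bursts against a provider that is already struggling.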
API Reference
| API | Purpose |
| ---------------------------------------------------- | ------------------------------------------------------------------ |
| applyAiResilience(instance, config) | Installs retry behavior on an existing Axios instance |
| createAiResilienceClient(config) | Creates a new Axios instance with retry behavior already applied |
| requestWithSemanticRetry(instance, config) | Runs an Axios request and retries invalid AI responses |
| semanticRetry(operation, policy) | Wraps any async operation with semantic validation and retry |
| parseJsonResponse(data, options) | Parses, repairs, and validates JSON responses |
| applySemanticRecovery(instance, policy) | Installs a response interceptor for semantic validation and repair |
| createProviderFallback(providers, options) | Creates a provider fallback engine |
| RedisRetryCoordinator | Coordinates retries and locks across processes |
| DistributedRateLimiter | Provides Redis-compatible fixed-window rate limiting |
| AdaptiveRouter | Selects providers using latency, failure, and cost signals |
| recoverStream(stream, options) | Collects stream chunks and triggers stream recovery hooks |
| withTelemetry(tracer, name, operation, attributes) | Wraps an async operation in a telemetry span |
Retry Configuration
type AiResilienceRetryConfig = {
  retries?: number;
  strategy?: "fixed" | "linear" | "exponential";
  baseDelayMs?: number;
  maxDelayMs?: number;
  jitter?: "none" | "full" | "equal" | "decorrelated";
  retryStatusCodes?: number[];
  retryMethods?: string[];
  retryCondition?: (error) => boolean | Promise<boolean>;
  hooks?: RetryHooks;
  logger?: RetryLogger;
  metadata?: Record<string, unknown>;
};
Semantic Retry Policy
type AiRetryPolicy = {
  maxSemanticRetries?: number;
  retryOnFailureKinds?: Array<
    | "invalid_json"
    | "schema_mismatch"
    | "empty_response"
    | "refusal"
    | "truncated"
  >;
  repairJson?: boolean;
  requireJson?: boolean;
  detectRefusals?: boolean;
  detectTruncation?: boolean;
  schema?: z.ZodType;
  validate?: (
    data,
    response,
  ) => SemanticValidationResult | Promise<SemanticValidationResult>;
  hooks?: SemanticRetryHooks;
  logger?: RetryLogger;
  metadata?: Record<string, unknown>;
};
Scripts
npm run build
npm test
npm run lint
npm run format
License
MIT
