
resilient-llm

v1.7.3


ResilientLLM is a resilient, unified LLM interface with in-built circuit breaker, token bucket rate limiting, caching, and adaptive retry with dynamic backoff support.


Resilient LLM


A minimalist but robust LLM integration layer designed to ensure reliable, seamless interactions across multiple LLM providers by intelligently handling failures and rate limits.


ResilientLLM makes your AI Agents or LLM apps production-ready by dealing with challenges such as:

  • ❌ Unstable network conditions
  • ⚠️ Inconsistent errors
  • ⏳ Unpredictable LLM API rate limit errors

Check out the ready-to-ship examples.

Key Features

  • Unified API: One .chat() works seamlessly across OpenAI, Anthropic, Google, Ollama, and custom providers
  • Built-in Resilience: Automatic retries, exponential backoff, and circuit breakers handle failures gracefully
  • Token Bucket Algorithm: Automatically enforces provider rate limits intelligently
  • Automatic Token Counting: Accurate token estimation for every request, no manual calculation needed
  • Multi-Provider Fallback: Seamlessly switches to alternative providers when one fails
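The token bucket algorithm behind the rate limiting can be pictured with a short sketch. This is an illustration of the general technique, not the library's internal implementation:

```javascript
// Illustrative token bucket: a bucket holds up to `capacity` tokens and
// refills at a fixed rate; a request proceeds only if enough tokens remain.
class TokenBucket {
  constructor(capacity, refillPerSecond) {
    this.capacity = capacity;
    this.tokens = capacity; // start full
    this.refillPerSecond = refillPerSecond;
    this.lastRefill = Date.now();
  }

  refill() {
    const now = Date.now();
    const elapsedSeconds = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(
      this.capacity,
      this.tokens + elapsedSeconds * this.refillPerSecond
    );
    this.lastRefill = now;
  }

  // Consume `cost` tokens if available; returns false when the caller must wait.
  tryConsume(cost) {
    this.refill();
    if (this.tokens >= cost) {
      this.tokens -= cost;
      return true;
    }
    return false;
  }
}

// Example: 60 requests per minute ≈ capacity 60 refilling 1 token per second.
const bucket = new TokenBucket(60, 1);
console.log(bucket.tryConsume(1)); // true: the bucket starts full
```

Mapping this to the `rateLimitConfig` options below, one bucket would meter requests per minute and a second bucket would meter estimated LLM tokens per minute.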

Installation

npm i resilient-llm

Quickstart

import { ResilientLLM } from 'resilient-llm';

const llm = new ResilientLLM({
  aiService: 'openai', // or 'anthropic', 'google', 'ollama'
  model: 'gpt-5-nano',
  maxTokens: 2048,
  temperature: 0.7,
  rateLimitConfig: {
    requestsPerMinute: 60,      // Limit to 60 requests per minute
    llmTokensPerMinute: 90000   // Limit to 90,000 LLM tokens per minute
  },
  retries: 3, // Number of retry attempts, applied only when the failure is retryable
  backoffFactor: 2 // Increase delay between retries by this factor
});

const conversationHistory = [
  { role: 'system', content: 'You are a helpful assistant.' },
  { role: 'assistant', content: 'Hi, I am here to help.' },
  { role: 'user', content: 'What is the capital of France?' }
];

(async () => {
  try {
    const { content, toolCalls, metadata } = await llm.chat(conversationHistory);
    console.log('LLM response:', content);
  } catch (err) {
    console.error('Error:', err);
  }
})();
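The `retries` and `backoffFactor` options above imply exponentially growing delays between attempts. A minimal sketch of that schedule (the 1-second base delay is an assumption for illustration, not a documented library default):

```javascript
// Exponential backoff: the delay grows by `factor` on every attempt.
function backoffDelayMs(attempt, baseMs = 1000, factor = 2) {
  // attempt 0 → baseMs, attempt 1 → baseMs * factor, attempt 2 → baseMs * factor², ...
  return baseMs * Math.pow(factor, attempt);
}

// With retries: 3 and backoffFactor: 2, the waits would be 1s, 2s, 4s.
console.log([0, 1, 2].map((a) => backoffDelayMs(a))); // [ 1000, 2000, 4000 ]
```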

Error handling: TypeScript users can import ResilientLLMError and switch on error.code; the canonical list of codes is ResilientLLMErrorCode in the source. JavaScript users can rely on the error object's properties: { name: "ResilientLLMError", code, message, metadata, retryable }.
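In plain JavaScript, a caller can branch on those documented properties. The `shouldRetry` helper below is hypothetical (not part of the library), and the `code` value is a placeholder, since the canonical codes live in ResilientLLMErrorCode in the source:

```javascript
// Hypothetical helper: decide how to react to an error shaped like
// { name: 'ResilientLLMError', code, message, metadata, retryable }.
function shouldRetry(err) {
  if (err.name !== 'ResilientLLMError') return false; // unknown error: surface it
  return err.retryable === true; // the library marks transient failures as retryable
}

// 'SOME_CODE' is illustrative; see ResilientLLMErrorCode for real values.
const transient = {
  name: 'ResilientLLMError',
  code: 'SOME_CODE',
  message: 'temporary failure',
  retryable: true,
};
console.log(shouldRetry(transient)); // true
```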

Key Methods

Instance methods

const llm = new ResilientLLM(llmOptions);
  • llm.chat(conversationHistory, llmOptions?) - Send chat completion requests with automatic retries and rate limiting
  • llm.abort() - Cancel all ongoing requests for this instance
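llm.abort() builds on Node's native AbortController. The standalone sketch below shows the underlying pattern without any library calls (cancellableDelay is an illustrative stand-in for an in-flight request):

```javascript
// A long-running task watches an AbortSignal and rejects as soon as
// abort() is called, instead of running to completion.
function cancellableDelay(ms, signal) {
  return new Promise((resolve, reject) => {
    if (signal.aborted) return reject(new Error('aborted'));
    const timer = setTimeout(resolve, ms);
    signal.addEventListener('abort', () => {
      clearTimeout(timer);
      reject(new Error('aborted'));
    });
  });
}

const controller = new AbortController();
const pending = cancellableDelay(10_000, controller.signal).catch((err) => err.message);
controller.abort(); // analogous to llm.abort(): in-flight work stops immediately
pending.then((msg) => console.log(msg)); // "aborted"
```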

Static public methods

import { ResilientLLM } from 'resilient-llm';
  • ResilientLLM.estimateTokens(text) - Estimate token count for any text string
import { ProviderRegistry } from 'resilient-llm';
  • ProviderRegistry.list(options?) - List all configured LLM providers (AI services such as openai, anthropic, etc.)
  • ProviderRegistry.getModels(providerName?, apiKey?) - Get all models for a provider
  • ProviderRegistry.configure(providerName, config) - Configure or update a provider with custom settings
  • ProviderRegistry.hasApiKey(providerName) - Check if an API key is configured for a provider

See the full API reference for complete documentation.

Structured output (JSON + schema)

Use llm.chat(..., { responseFormat }) when you need the assistant to return machine-readable JSON, optionally matching a specific JSON Schema.

Simple JSON without any specific schema

// JSON mode (single JSON object)
const { content: obj } = await llm.chat(messages, { responseFormat: { type: 'json_object' } });

JSON with a specific schema only

const conversationHistory = [{ role: 'user', content: 'Add 2 and 3 and respond ONLY with JSON having sum and explanationSteps' }];
const { content: result } = await llm.chat(conversationHistory, {
  responseFormat: {
    type: 'json_schema',
    json_schema: {
      name: 'math_answer',
      schema: {
        type: 'object',
        properties: {
            sum: { type: 'number' },
            explanationSteps: {
                type: 'array',
                items: { type: 'string' }
            }
        },
        required: ['sum', 'explanationSteps'],
        // Anthropic Messages structured output requires explicit false here for object roots.
        additionalProperties: false
      }
    }
  }
});

// `result` is the parsed JS object (not a string).
console.log(JSON.stringify(result));
// {"sum":5,"explanationSteps":["Add 2 and 3.","The result is 5."]}
// (the full return value also carries metadata: { requestId, finishReason, ... })

// If the model returns invalid JSON or fails schema validation,
// `llm.chat(...)` throws a StructuredOutputError with `code` and `validation` details.

For all supported shapes (including plain schema objects) and parsing/validation behavior, see responseFormat docs.
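Conceptually, the parse-and-validate step works like the sketch below. This is an illustration of the behavior, not the library's actual validator (which also checks types and nested schemas):

```javascript
// Illustrative parse-then-validate step for json_schema responses:
// parse the raw model text, then check that all required keys are present.
function parseWithSchema(rawText, schema) {
  let parsed;
  try {
    parsed = JSON.parse(rawText);
  } catch (e) {
    throw new Error('invalid JSON: ' + e.message);
  }
  for (const key of schema.required || []) {
    if (!(key in parsed)) throw new Error(`missing required key: ${key}`);
  }
  return parsed;
}

const schema = { required: ['sum', 'explanationSteps'] };
console.log(parseWithSchema('{"sum":5,"explanationSteps":["Add 2 and 3."]}', schema));
// { sum: 5, explanationSteps: [ 'Add 2 and 3.' ] }
```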

Supported LLM Providers

ResilientLLM comes with built-in support for all text models provided by OpenAI, Anthropic, Google/Gemini, Ollama, and more.

Adding custom providers: You can add support for other LLM providers (e.g., Together AI, Groq, self-hosted vLLM, or any OpenAI/Anthropic-compatible API) using ProviderRegistry.configure(). See the Custom Provider Guide for detailed instructions and examples.

API Key Setup

API keys are required for most LLM providers. The simplest way is using environment variables:

export OPENAI_API_KEY=sk-your-key-here
export ANTHROPIC_API_KEY=sk-ant-your-key-here
export GOOGLE_API_KEY=your-key-here
export OLLAMA_API_KEY=your-key-here

For more ways to configure API key, see the API Key Configuration guide in the reference documentation.

Examples and Playground

Complete working projects that use Resilient LLM as the core library for calling LLM APIs with resilience.


  • Minimal AI Chat
  • React Playground - Interactive playground to test and experience ResilientLLM with multiple LLM providers, conversation management, and version control

Motivation

ResilientLLM is a resilient, unified LLM interface featuring circuit breaker, token bucket rate limiting, caching, and adaptive retry with dynamic backoff support.

In 2023, I developed multiple AI Agents and LLM apps. I chose not to use complex tools like LangChain just to make a simple LLM API call; a simple class encapsulating my LLM call (llm.chat) was enough. Each app used a different LLM and configuration. For every new project, I found myself copying the same LLM orchestration code with minor adjustments, and with each new release of those projects I added bug fixes and essential features to this orchestration code. It was a tiny class, so syncing those improvements back to the other projects was not a major problem. Soon I had a class that unified API calls to multiple LLMs behind a single interface, unifiedLLM.chat(conversationHistory, llmOptions), and it worked flawlessly, at least on my development machine.

When I deployed my AI agents to production, they started facing failures, some predictable (e.g. hitting an LLM provider's rate limits), some unpredictable (Anthropic's overload error, network issues, CPU/memory spikes crashing the server, etc.). Some of these were already handled by a simple exponential backoff and retry strategy, but that was not good enough for production. I could have put a rate-limit gateway in front of my app server, but it wouldn't have enough user or app context to recover from these failures and would still leave a gap for unpredictable errors. It would also have been an extra chore and expense to manage. So, for the multiple agentic apps I was creating, the LLM calls had to be more resilient, and the solution to most of these failures had to live in the app itself.

The Vercel AI SDK seemed to offer convenient, unified abstractions. It even follows a more structured approach than mine (Vercel maintains adapters for each LLM provider), which enables advanced use cases such as out-of-the-box multi-modal support for many providers. That approach allows more use cases than my tiny LLM class did, but I wanted the interface to be more production-ready (resilient) and unified (supporting new LLM APIs for the same AI agent use cases: chat, tool calls, etc.). Only after diving deeper did I understand that it does not focus on resilience beyond a simple backoff/retry strategy similar to what I had. LangChain was still more complex than needed, and it didn't have everything required to make my LLM orchestration more robust.

The final solution was to extract the tiny LLM orchestration class out of all my AI Agents and add circuit breakers, adaptive retries with backoff, and token bucket rate limiting, responding dynamically to API signals like Retry-After headers. I used native JavaScript/Node.js features such as AbortController to support on-demand aborts and timeouts.

This library solves my challenges in building production-ready AI Agents such as:

  • unstable network conditions
  • inconsistent error handling
  • unpredictable LLM API rate limit errors

This library aims to solve the same challenges for you by providing a resilient layer that intelligently manages failures and rate limits, enabling you (developers) to integrate LLMs confidently and effortlessly at scale.

Scope

What's in scope

  • Unified LLM Interface: Simple, consistent API across multiple LLM providers (OpenAI, Anthropic, Google Gemini, Ollama)
  • Resilience Features: Circuit breakers, adaptive retries with exponential backoff, and intelligent failure recovery
  • Rate Limiting: Token bucket rate limiting with automatic token estimation and enforcement
  • Production Readiness: Handling of network issues, API rate limits, timeouts, and server overload scenarios
  • Basic Chat Functionality: Support for conversational chat interfaces and message history
  • Request Control: AbortController support for on-demand request cancellation and timeouts
  • Error Recovery: Dynamic response to API signals like retry-after headers and provider-specific error codes

What's not in scope

  • Complex LLM Orchestration: Advanced workflows, chains, or multi-step LLM interactions (use LangChain or similar for complex use cases)
  • Multi-modal Support: Image, audio, or video processing capabilities
  • Tool/Function Calling: Advanced function calling or tool integration features
  • Streaming Responses: Real-time streaming of LLM responses
  • Vector Databases: Embedding storage, similarity search, or RAG (Retrieval-Augmented Generation) capabilities
  • Fine-tuning or Training: Model training, fine-tuning, or custom model deployment
  • UI Components: Frontend widgets, chat interfaces, or user interface elements
  • Data Processing Pipelines: ETL processes, data transformation, or batch processing workflows

License

This project is licensed under the MIT License - see the LICENSE file for details.