smart-ai-cache

v1.0.7

Published

4 months ago

🚀 Lightning-fast AI response caching with Redis support. Reduce OpenAI/Claude/Gemini API costs by 40-80% with sub-millisecond cache lookups.

`smart-ai-cache`


A lightweight, intelligent caching middleware for AI responses, designed to reduce API costs and improve response times for repetitive LLM queries.

Performance benchmarks

Benchmark results measured against the project's targets:

| Metric | Target | Actual | Performance |
|--------|--------|--------|-------------|
| Cache lookup | < 1ms | 0.0009ms | 1,111x faster |
| Memory usage | < 100MB | 2.86MB | 35x more efficient |
| Throughput | High | 451,842 req/s | Exceptional |

Purpose

Smart AI Cache is a Node.js package that helps developers building AI applications reduce costs and improve response times. It caches responses from large language models to avoid redundant API calls.

What you get

Lower costs: Save 40-80% on repetitive AI API calls
Better performance: Sub-millisecond response times for cached queries
Simple setup: Works out of the box with zero configuration
Multiple providers: Compatible with OpenAI, Anthropic Claude, and Google Gemini
Production ready: Built-in error handling, retries, and monitoring
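
To see how a cache hit rate turns into cost savings, the arithmetic can be sketched as follows. This is an illustrative calculation, not part of the package: `estimateSavings` and its parameters are hypothetical names, and the request count and per-call price are made-up figures.

```typescript
// Hypothetical helper: estimates API spend avoided by caching.
// Assumes every cache hit avoids exactly one paid API call of equal cost.
function estimateSavings(
  totalRequests: number,
  hitRate: number, // fraction between 0 and 1
  costPerCallUsd: number,
): { withoutCache: number; withCache: number; saved: number } {
  if (hitRate < 0 || hitRate > 1) {
    throw new RangeError('hitRate must be between 0 and 1');
  }
  const withoutCache = totalRequests * costPerCallUsd;
  const withCache = totalRequests * (1 - hitRate) * costPerCallUsd;
  return { withoutCache, withCache, saved: withoutCache - withCache };
}

// Example: 10,000 requests at $0.01 each, half of them served from cache.
const estimate = estimateSavings(10_000, 0.5, 0.01);
console.log(`Saved about $${estimate.saved.toFixed(2)} of $${estimate.withoutCache.toFixed(2)}`);
```

Real savings depend on how repetitive your prompts are, so treat the 40-80% range as something to verify against your own hit rate.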

Installation

npm install smart-ai-cache
# or
yarn add smart-ai-cache
# or  
pnpm add smart-ai-cache

For Redis support (optional):

npm install smart-ai-cache ioredis

Quick Start

Here's how to get started:

import { AIResponseCache } from 'smart-ai-cache';
import OpenAI from 'openai';

// Initialize with zero configuration
const cache = new AIResponseCache();
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

const response = await cache.wrap(
  () => openai.chat.completions.create({
    model: 'gpt-4',
    messages: [{ role: 'user', content: 'Hello, world!' }],
  }),
  { provider: 'openai', model: 'gpt-4' }
);

// First call hits the API, second call uses cache
const cachedResponse = await cache.wrap(
  () => openai.chat.completions.create({
    model: 'gpt-4', 
    messages: [{ role: 'user', content: 'Hello, world!' }],
  }),
  { provider: 'openai', model: 'gpt-4' }
);

// Check your savings
const stats = cache.getStats();
console.log(`Hit rate: ${stats.hitRate}%, Cost saved: $${stats.totalCostSaved}`);
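
Conceptually, wrapping a call this way follows the cache-aside pattern: look up a key, and only invoke the function on a miss. The sketch below shows that idea with a plain `Map`. It is not the package's implementation; `wrapSketch`, `fakeApiCall`, and the key string are hypothetical names used for illustration.

```typescript
// Minimal cache-aside sketch; not the package's implementation.
const store = new Map<string, unknown>();

async function wrapSketch<T>(key: string, fn: () => Promise<T>): Promise<T> {
  if (store.has(key)) {
    return store.get(key) as T; // cache hit: fn is never invoked
  }
  const value = await fn(); // cache miss: call the underlying API
  store.set(key, value);
  return value;
}

// Stand-in for an LLM call that counts how often it is invoked.
let apiCalls = 0;
const fakeApiCall = async (): Promise<string> => {
  apiCalls += 1;
  return 'Hello!';
};

async function demo(): Promise<number> {
  await wrapSketch('openai:gpt-4:hello', fakeApiCall);
  await wrapSketch('openai:gpt-4:hello', fakeApiCall);
  return apiCalls; // the second call was served from the Map
}
```

Calling `demo()` invokes `fakeApiCall` once, because the second lookup finds the stored value.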

Storage Options

Memory Storage

Default option - Fast and simple, ideal for single-instance applications:

import { AIResponseCache } from 'smart-ai-cache';

const cache = new AIResponseCache({
  storage: 'memory',          // Default
  maxSize: 1000,             // Max entries (default: 1000)
  ttl: 3600,                 // 1 hour expiration (default)
});

// Automatic LRU eviction when maxSize is exceeded
// Sub-millisecond lookup times
// Zero external dependencies

Pros: Fastest possible performance, no setup required
Cons: Not shared across instances, lost on restart
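
For readers curious how LRU eviction and TTL expiry can fit together, here is a minimal sketch that relies on the insertion order of a `Map`. It is an assumption about the general technique, not the package's actual `MemoryStorage` code; `LruTtlCache` is a hypothetical name, and the `now` parameter exists only to make the example deterministic.

```typescript
// Illustrative LRU cache with per-entry TTL, built on Map insertion order.
class LruTtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private maxSize: number;
  private ttlSeconds: number;

  constructor(maxSize: number, ttlSeconds: number) {
    this.maxSize = maxSize;
    this.ttlSeconds = ttlSeconds;
  }

  get(key: string, now: number = Date.now()): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= now) {
      this.entries.delete(key); // expired: drop it and report a miss
      return undefined;
    }
    // Re-insert so this key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, entry);
    return entry.value;
  }

  set(key: string, value: V, now: number = Date.now()): void {
    this.entries.delete(key);
    this.entries.set(key, { value, expiresAt: now + this.ttlSeconds * 1000 });
    if (this.entries.size > this.maxSize) {
      // The first key in iteration order is the least recently used.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
  }

  count(): number {
    return this.entries.size;
  }
}

const lru = new LruTtlCache<string>(2, 3600);
lru.set('a', 'first', 0);
lru.set('b', 'second', 0);
lru.get('a', 1); // touch 'a' so 'b' becomes least recently used
lru.set('c', 'third', 2); // exceeds maxSize, so 'b' is evicted
```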

Redis Storage

Production option - Persistent, distributed caching:

import { AIResponseCache } from 'smart-ai-cache';

const cache = new AIResponseCache({
  storage: 'redis',
  redisOptions: {
    host: 'localhost',
    port: 6379,
    password: 'your-redis-password',    // If required
    db: 0,                              // Redis database number
    connectTimeout: 10000,              // Connection timeout
    retryDelayOnFailover: 1000,        // Failover retry delay
  },
  keyPrefix: 'ai-cache:',              // Namespace your keys
  ttl: 7200,                           // 2 hours
});

// Automatic fallback to memory storage if Redis fails
// Shared across multiple application instances  
// Survives application restarts

Pros: Persistent, scalable, shared across instances
Cons: Requires Redis server, network latency
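
The fallback behavior mentioned above can be pictured as a wrapper that tries a primary store and switches to a secondary one on error. The sketch below is one way to express that idea under stated assumptions: `KeyValueStore`, `InMemoryStore`, `FallbackStore`, and `unreachableStore` are hypothetical names, and the package's real storage interfaces may look different.

```typescript
// Hypothetical storage contract; the package's real interfaces may differ.
interface KeyValueStore {
  get(key: string): Promise<string | undefined>;
  set(key: string, value: string): Promise<void>;
}

class InMemoryStore implements KeyValueStore {
  private data = new Map<string, string>();

  async get(key: string): Promise<string | undefined> {
    return this.data.get(key);
  }

  async set(key: string, value: string): Promise<void> {
    this.data.set(key, value);
  }
}

// Tries the primary store first and falls back to the secondary on any error.
class FallbackStore implements KeyValueStore {
  private primary: KeyValueStore;
  private secondary: KeyValueStore;

  constructor(primary: KeyValueStore, secondary: KeyValueStore) {
    this.primary = primary;
    this.secondary = secondary;
  }

  async get(key: string): Promise<string | undefined> {
    try {
      return await this.primary.get(key);
    } catch {
      return this.secondary.get(key);
    }
  }

  async set(key: string, value: string): Promise<void> {
    try {
      await this.primary.set(key, value);
    } catch {
      await this.secondary.set(key, value);
    }
  }
}

// Stand-in for a Redis client whose connection is down.
const unreachableStore: KeyValueStore = {
  get: async () => { throw new Error('connection refused'); },
  set: async () => { throw new Error('connection refused'); },
};

const resilient = new FallbackStore(unreachableStore, new InMemoryStore());
```

One design consequence worth noting: entries written to the secondary store during an outage are not visible to other instances and are not copied back to the primary when it recovers.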

Redis Production Setup

Docker Compose:

services:
  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"
    volumes:
      - redis_data:/data
    command: redis-server --appendonly yes --requirepass your-secure-password
      
  your-app:
    build: .
    environment:
      - REDIS_HOST=redis
      - REDIS_PORT=6379  
      - REDIS_PASSWORD=your-secure-password
    depends_on:
      - redis

volumes:
  redis_data:

Environment Variables:

# .env file
REDIS_HOST=your-redis-host.com
REDIS_PORT=6379
REDIS_PASSWORD=your-secure-password
REDIS_DB=0

const cache = new AIResponseCache({
  storage: 'redis',
  redisOptions: {
    host: process.env.REDIS_HOST,
    port: parseInt(process.env.REDIS_PORT || '6379', 10),
    password: process.env.REDIS_PASSWORD,
    db: parseInt(process.env.REDIS_DB || '0', 10),
  },
});
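
If you prefer to fail fast on a malformed environment, the variables can be validated before they reach the cache constructor. The helper below is a sketch, not something the package ships: `readRedisEnv` and `RedisEnvOptions` are hypothetical names, and the default values are assumptions chosen to match the examples in this README.

```typescript
interface RedisEnvOptions {
  host: string;
  port: number;
  password: string | undefined;
  db: number;
}

// Hypothetical helper: validates Redis settings read from environment variables.
function readRedisEnv(env: Record<string, string | undefined>): RedisEnvOptions {
  const port = Number.parseInt(env.REDIS_PORT ?? '6379', 10);
  const db = Number.parseInt(env.REDIS_DB ?? '0', 10);
  if (Number.isNaN(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid REDIS_PORT: ${env.REDIS_PORT}`);
  }
  if (Number.isNaN(db) || db < 0) {
    throw new Error(`Invalid REDIS_DB: ${env.REDIS_DB}`);
  }
  return {
    host: env.REDIS_HOST ?? 'localhost',
    port,
    password: env.REDIS_PASSWORD,
    db,
  };
}

// In an application you would pass process.env instead of a literal object.
const parsedRedisOptions = readRedisEnv({ REDIS_HOST: 'cache.internal', REDIS_PORT: '6380' });
console.log(parsedRedisOptions);
```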

Provider Examples

The package includes specialized classes for each AI provider with automatic cost tracking:

OpenAI

import { OpenAICache } from 'smart-ai-cache';

const cache = new OpenAICache({
  ttl: 7200,                    // 2 hours
  maxSize: 5000,               // 5K entries  
  storage: 'redis',            // Use Redis
  redisOptions: {
    host: 'localhost',
    port: 6379,
  }
});

// Automatically handles OpenAI-specific response types
const response = await cache.chatCompletion({
  model: 'gpt-4o',
  messages: [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'Explain quantum computing in simple terms' }
  ],
  temperature: 0.7,
  max_tokens: 500,
});

console.log(response.choices[0].message.content);

// Get OpenAI-specific analytics
const stats = cache.getStats();
console.log(`Cache hit rate: ${stats.hitRate}%`);
console.log(`Cost saved: $${stats.totalCostSaved.toFixed(4)}`);
console.log(`OpenAI requests: ${stats.byProvider.openai?.requests || 0}`);

Anthropic Claude

import { AnthropicCache } from 'smart-ai-cache';

const cache = new AnthropicCache({
  storage: 'memory',
  maxSize: 2000,
  ttl: 3600,
});

const response = await cache.messages({
  model: 'claude-3-5-sonnet-20241022',
  max_tokens: 1000, 
  messages: [
    { 
      role: 'user', 
      content: 'Write a Python function to calculate fibonacci numbers' 
    }
  ],
});

console.log(response.content[0].text);

// Claude-specific cost tracking
const stats = cache.getStats();
console.log(`Claude cost saved: $${stats.byProvider.anthropic?.costSaved || 0}`);

Google Gemini

import { GoogleCache } from 'smart-ai-cache';

const cache = new GoogleCache({
  ttl: 1800,                   // 30 minutes
  storage: 'redis',
}, process.env.GOOGLE_API_KEY);

const response = await cache.generateContent({
  contents: [{ 
    role: 'user', 
    parts: [{ text: 'What are the benefits of renewable energy?' }] 
  }],
}, 'gemini-1.5-pro');

console.log(response.response.text());

Configuration

Complete configuration options:

interface CacheConfig {
  ttl?: number;                    // Time to live in seconds (default: 3600)
  maxSize?: number;                // Maximum cache entries (default: 1000)
  storage?: 'memory' | 'redis';    // Storage backend (default: 'memory')
  redisOptions?: RedisOptions;     // Redis connection options 
  keyPrefix?: string;              // Cache key prefix (default: 'ai-cache:')
  enableStats?: boolean;           // Enable statistics tracking (default: true)
  debug?: boolean;                 // Enable debug logging (default: false)
}

Advanced Configuration:

const cache = new AIResponseCache({
  ttl: 7200,                      // 2 hours expiration
  maxSize: 10000,                 // 10K entries max
  storage: 'redis',
  redisOptions: {
    host: process.env.REDIS_HOST || 'localhost',
    port: parseInt(process.env.REDIS_PORT || '6379', 10),
    password: process.env.REDIS_PASSWORD,
    db: 0,
    connectTimeout: 10000,
    retryDelayOnFailover: 1000,
    maxRetriesPerRequest: 3,
  },
  keyPrefix: 'myapp:ai-cache:',   // Custom namespace
  enableStats: true,              // Track performance metrics
  debug: process.env.NODE_ENV === 'development',
});

Performance

Run the built-in benchmark to validate performance in your environment:

npm run benchmark

Sample Output:

Cache lookup average: 0.94μs (0.0009ms)
Memory used for 10,000 entries: 2.86MB  
Throughput: 451,842 requests/second
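
To get a rough feel for in-memory lookup cost without the package's benchmark script, you can time lookups on a plain `Map`. This sketch is not the built-in benchmark and will not reproduce its numbers; `benchmarkLookups` is a hypothetical name, and the measurement includes the cost of building each key string.

```typescript
// Rough micro-benchmark sketch; results vary by machine and runtime.
function benchmarkLookups(entryCount: number, lookups: number) {
  const cache = new Map<string, string>();
  for (let i = 0; i < entryCount; i++) {
    cache.set(`key-${i}`, `value-${i}`);
  }

  let hits = 0;
  const start = performance.now();
  for (let i = 0; i < lookups; i++) {
    if (cache.get(`key-${i % entryCount}`) !== undefined) hits++;
  }
  const elapsedMs = performance.now() - start;

  return {
    hits,
    averageLookupMs: elapsedMs / lookups,
    lookupsPerSecond: lookups / (elapsedMs / 1000),
  };
}

const benchmarkResult = benchmarkLookups(10_000, 100_000);
console.log(benchmarkResult);
```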

Cache Management

Manual Cache Operations:

// Check cache size
const size = await cache.getCacheSize();
console.log(`Cache contains ${size} entries`);

// Clear specific entries by pattern
const deleted = await cache.deleteByPattern('openai:gpt-4:*');
console.log(`Deleted ${deleted} OpenAI GPT-4 entries`);

// Clear entire cache
await cache.clear();

// Delete specific key  
const key = cache.generateKey('openai', 'gpt-4', prompt, params);
await cache.delete(key);

// Check if key exists
const exists = await cache.has(key);

Graceful Shutdown:

process.on('SIGTERM', async () => {
  await cache.disconnect(); // Closes Redis connections
  process.exit(0);
});

Migration Guide

From v1.0.4 to v1.0.5+

Breaking Changes:

Cache methods are now async (clear(), delete(), has(), getCacheSize())
Provider classes no longer require passing client instances

Before:

// v1.0.4 and earlier
const cache = new AIResponseCache();
cache.clear();                    // Synchronous
const size = cache.getCacheSize(); // Synchronous

const openaiCache = new OpenAICache(config, openaiClient);

After:

// v1.0.5+
const cache = new AIResponseCache();  
await cache.clear();                    // Async
const size = await cache.getCacheSize(); // Async

const openaiCache = new OpenAICache(config); // No client needed

Migration Steps:

Add await to cache management operations
Remove client instances from provider constructors
Update your error handling to use the new retry logic
Consider upgrading to Redis for production deployments

Adding Redis to Existing Projects

Step 1: Install Redis dependency

npm install ioredis

Step 2: Update your cache configuration

// Before - Memory only
const cache = new AIResponseCache({ ttl: 3600 });

// After - Redis with fallback
const cache = new AIResponseCache({
  storage: 'redis',
  redisOptions: {
    host: process.env.REDIS_HOST || 'localhost', 
    port: parseInt(process.env.REDIS_PORT || '6379', 10),
  },
  ttl: 3600,
});

Step 3: Add graceful shutdown

process.on('SIGTERM', async () => {
  await cache.disconnect();
  process.exit(0);
});

Advanced Features

Custom Key Generation

const cache = new AIResponseCache();

// Generate custom cache keys
const customKey = cache.generateKey('openai', 'gpt-4', prompt, params);

// Use custom keys for manual cache management
await cache.wrap(apiCall, {
  provider: 'openai',
  model: 'gpt-4', 
  cacheKey: customKey
});
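
A cache key is only useful if the same request always produces the same key. The sketch below shows one way to get that property by serializing parameters with sorted object keys. It is not how `generateKey` is known to work internally; `stableStringify`, `buildCacheKey`, and the `provider:model:payload` layout are assumptions for illustration, and a real implementation would likely hash the payload to keep keys short.

```typescript
// Serializes a value with object keys sorted, so equal params give equal text.
function stableStringify(value: unknown): string {
  if (Array.isArray(value)) {
    return `[${value.map((item) => stableStringify(item)).join(',')}]`;
  }
  if (value !== null && typeof value === 'object') {
    const obj = value as Record<string, unknown>;
    const body = Object.keys(obj)
      .filter((k) => obj[k] !== undefined) // mirror JSON.stringify, which omits undefined
      .sort()
      .map((k) => `${JSON.stringify(k)}:${stableStringify(obj[k])}`)
      .join(',');
    return `{${body}}`;
  }
  return JSON.stringify(value) ?? 'null';
}

// Hypothetical key layout: provider:model:<serialized prompt and params>.
function buildCacheKey(
  provider: string,
  model: string,
  prompt: unknown,
  params: Record<string, unknown> = {},
): string {
  return `${provider}:${model}:${stableStringify({ prompt, params })}`;
}

const keyA = buildCacheKey('openai', 'gpt-4', 'Hello', { temperature: 0.7, max_tokens: 500 });
const keyB = buildCacheKey('openai', 'gpt-4', 'Hello', { max_tokens: 500, temperature: 0.7 });
console.log(keyA === keyB); // true: property order does not affect the key
```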

Statistics and Monitoring

const stats = cache.getStats();

console.log('Cache Performance:');
console.log(`├─ Total Requests: ${stats.totalRequests}`);
console.log(`├─ Cache Hits: ${stats.cacheHits} (${stats.hitRate.toFixed(1)}%)`);
console.log(`├─ Cost Saved: $${stats.totalCostSaved.toFixed(4)}`);
console.log(`└─ Avg Response Time: ${stats.averageResponseTime.toFixed(2)}ms`);

// Provider-specific stats
Object.entries(stats.byProvider).forEach(([provider, data]) => {
  console.log(`${provider}: ${data.hits}/${data.requests} hits, $${data.costSaved.toFixed(4)} saved`);
});

// Reset statistics 
cache.resetStats();
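
The figures above boil down to a few counters. As a sketch of how a hit rate and a cost-saved total can be derived, consider the tracker below. It is not the package's statistics code: `StatsSketch` is a hypothetical name, the field names only loosely mirror the stats object shown above, and the per-call cost is a made-up figure.

```typescript
// Hypothetical counter set; field names loosely mirror the stats shown above.
class StatsSketch {
  private totalRequests = 0;
  private cacheHits = 0;
  private totalCostSaved = 0;

  record(hit: boolean, costOfCallUsd: number): void {
    this.totalRequests += 1;
    if (hit) {
      this.cacheHits += 1;
      this.totalCostSaved += costOfCallUsd; // a hit avoids paying for the call
    }
  }

  snapshot() {
    const hitRate =
      this.totalRequests === 0 ? 0 : (this.cacheHits / this.totalRequests) * 100;
    return {
      totalRequests: this.totalRequests,
      cacheHits: this.cacheHits,
      hitRate, // percentage; 0 when nothing has been recorded
      totalCostSaved: this.totalCostSaved,
    };
  }
}

const tracker = new StatsSketch();
tracker.record(false, 0.02); // miss
tracker.record(true, 0.02); // hit
tracker.record(true, 0.02); // hit
tracker.record(false, 0.02); // miss
console.log(tracker.snapshot().hitRate); // 50
```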

Cache Invalidation Patterns

// Delete all OpenAI GPT-4 entries
await cache.deleteByPattern('*openai:gpt-4:*');

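// Delete all entries from a specific time period
// (assumes your custom cache keys embed a date string such as 2025-01-31)
const today = new Date().toISOString().slice(0, 10);
await cache.deleteByPattern(`*${today}*`);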

// Delete provider-specific entries
await cache.deleteByPattern('*anthropic:*');
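
To reason about which keys a pattern would select, it helps to treat `*` as "any run of characters". The sketch below implements that matching rule over a list of example keys. It is an assumption about glob semantics in general, not the package's matching code; `globToRegExp`, `selectKeys`, and the sample keys are hypothetical.

```typescript
// Converts a glob pattern where `*` matches any run of characters into a RegExp.
function globToRegExp(pattern: string): RegExp {
  const escaped = pattern
    .split('*')
    .map((part) => part.replace(/[.+?^${}()|[\]\\]/g, '\\$&'))
    .join('.*');
  return new RegExp(`^${escaped}$`);
}

function selectKeys(keys: string[], pattern: string): string[] {
  const matcher = globToRegExp(pattern);
  return keys.filter((key) => matcher.test(key));
}

// Made-up keys that include the default 'ai-cache:' prefix.
const sampleKeys = [
  'ai-cache:openai:gpt-4:abc',
  'ai-cache:openai:gpt-4o:def',
  'ai-cache:anthropic:claude-3-5-sonnet-20241022:ghi',
];

console.log(selectKeys(sampleKeys, '*openai:gpt-4:*'));
// [ 'ai-cache:openai:gpt-4:abc' ]
```

Under this rule a pattern without a leading `*`, such as `openai:gpt-4:*`, matches nothing when keys carry a prefix. Whether the package applies `keyPrefix` before matching is not stated here, so test your patterns against real keys before relying on them.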

Error handling and reliability

The package handles errors automatically:

Automatic retries with exponential backoff (3 attempts)
Graceful degradation when cache storage fails
Circuit breaker pattern for Redis connection issues
Fallback to memory when Redis is unavailable
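
The circuit breaker pattern named in the list can be sketched as a small state machine: after a run of failures the breaker opens and calls are skipped, then a single trial call is allowed once a cooldown has passed. The code below illustrates the pattern in general, not the package's internals; `CircuitBreakerSketch`, the threshold of 3, and the 30-second cooldown are assumptions.

```typescript
type BreakerState = 'closed' | 'open' | 'half-open';

// Illustrative circuit breaker; thresholds and names are assumptions.
class CircuitBreakerSketch {
  private failures = 0;
  private openedAt = 0;
  private state: BreakerState = 'closed';
  private failureThreshold: number;
  private cooldownMs: number;

  constructor(failureThreshold: number, cooldownMs: number) {
    this.failureThreshold = failureThreshold;
    this.cooldownMs = cooldownMs;
  }

  // Returns true when a call to the protected resource may be attempted.
  canAttempt(now: number = Date.now()): boolean {
    if (this.state === 'open' && now - this.openedAt >= this.cooldownMs) {
      this.state = 'half-open'; // cooldown elapsed: allow a trial call
    }
    return this.state !== 'open';
  }

  recordSuccess(): void {
    this.failures = 0;
    this.state = 'closed';
  }

  recordFailure(now: number = Date.now()): void {
    this.failures += 1;
    if (this.state === 'half-open' || this.failures >= this.failureThreshold) {
      this.state = 'open';
      this.openedAt = now;
    }
  }

  currentState(): BreakerState {
    return this.state;
  }
}

const breaker = new CircuitBreakerSketch(3, 30_000);
breaker.recordFailure(0);
breaker.recordFailure(0);
breaker.recordFailure(0); // third consecutive failure opens the circuit
console.log(breaker.currentState()); // 'open'
```

While the breaker is open, a caller would skip Redis and go straight to the memory fallback instead of waiting on a connection that is known to be failing.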

// Errors are handled automatically, but you can catch them
try {
  const response = await cache.wrap(apiCall, options);
} catch (error) {
  console.error('API call failed after retries:', error);
  // Your fallback logic here
}
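
For reference, retrying with exponential backoff generally means waiting longer after each failed attempt and rethrowing the last error once attempts run out. The sketch below shows that technique in isolation. It is not the package's retry code; `retryWithBackoff`, `sleep`, `flakyCall`, and the 100ms base delay are hypothetical, with only the three-attempt limit taken from the list above.

```typescript
const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Retries fn up to maxAttempts times, doubling the delay after each failure.
async function retryWithBackoff<T>(
  fn: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 100,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
      if (attempt < maxAttempts) {
        // Delay grows as baseDelayMs, 2 * baseDelayMs, 4 * baseDelayMs, ...
        await sleep(baseDelayMs * 2 ** (attempt - 1));
      }
    }
  }
  throw lastError; // all attempts failed: surface the final error
}

// Stand-in API call that fails twice, then succeeds.
let attempts = 0;
const flakyCall = async (): Promise<string> => {
  attempts += 1;
  if (attempts < 3) throw new Error('temporary failure');
  return 'ok';
};
```

Calling `retryWithBackoff(flakyCall)` resolves on the third attempt. A call that keeps failing rejects with the last error, which is what the `catch` block in the example above would receive.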

API Reference

Core Classes

AIResponseCache - Main caching class with storage abstraction
OpenAICache - OpenAI-specific wrapper with cost tracking
AnthropicCache - Anthropic Claude wrapper with cost tracking
GoogleCache - Google Gemini wrapper
MemoryStorage - In-memory storage implementation
RedisStorage - Redis storage implementation

Key Methods

wrap(fn, options) - Cache a function call
clear() - Clear all cache entries
delete(key) - Delete specific entry
deleteByPattern(pattern) - Pattern-based deletion
getStats() - Get performance statistics
disconnect() - Close storage connections

For complete API documentation, see the project's TypeDoc documentation.

Contributing

We welcome contributions! Here's how to get started:

Fork the repository
Create your feature branch (git checkout -b feature/amazing-feature)
Run tests (npm test)
Run benchmarks (npm run benchmark)
Commit your changes (git commit -m 'Add amazing feature')
Push to the branch (git push origin feature/amazing-feature)
Open a Pull Request

License

MIT License - see LICENSE file for details.

Made with ❤️ for the AI developer community

⭐ Star on GitHub | Documentation | Report Issues