smart-ai-cache
A lightweight, intelligent caching middleware for AI responses, designed to reduce API costs and improve response times for repetitive LLM queries.
Performance benchmarks
Measured with the built-in benchmark suite (npm run benchmark), the cache exceeds its performance targets:
| Metric | Target | Actual | Performance |
|--------|--------|--------|-------------|
| Cache lookup | < 1 ms | 0.0009 ms | 1,111x faster than target |
| Memory usage (10,000 entries) | < 100 MB | 2.86 MB | 35x more efficient |
| Throughput | High | 451,842 req/s | Exceptional |
Table of contents
- Purpose
- Installation
- Quick Start
- Storage Options
- Provider Examples
- Configuration
- Performance
- Migration Guide
- API Reference
Purpose
Smart AI Cache is a Node.js package that helps developers building AI applications reduce costs and improve response times. It caches responses from large language models to avoid redundant API calls.
What you get
- Lower costs: Save 40-80% on repetitive AI API calls
- Better performance: Sub-millisecond response times for cached queries
- Simple setup: Works out of the box with zero configuration
- Multiple providers: Compatible with OpenAI, Anthropic Claude, and Google Gemini
- Production ready: Built-in error handling, retries, and monitoring
Installation
npm install smart-ai-cache
# or
yarn add smart-ai-cache
# or
pnpm add smart-ai-cache
For Redis support (optional):
npm install smart-ai-cache ioredis
Quick Start
Here's how to get started:
import { AIResponseCache } from 'smart-ai-cache';
import OpenAI from 'openai';
// Initialize with zero configuration
const cache = new AIResponseCache();
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
const response = await cache.wrap(
() => openai.chat.completions.create({
model: 'gpt-4',
messages: [{ role: 'user', content: 'Hello, world!' }],
}),
{ provider: 'openai', model: 'gpt-4' }
);
// First call hits the API, second call uses cache
const cachedResponse = await cache.wrap(
() => openai.chat.completions.create({
model: 'gpt-4',
messages: [{ role: 'user', content: 'Hello, world!' }],
}),
{ provider: 'openai', model: 'gpt-4' }
);
// Check your savings
const stats = cache.getStats();
console.log(`Hit rate: ${stats.hitRate}%, Cost saved: $${stats.totalCostSaved}`);
Storage Options
Memory Storage
Default option - Fast and simple, ideal for single-instance applications:
import { AIResponseCache } from 'smart-ai-cache';
const cache = new AIResponseCache({
storage: 'memory', // Default
maxSize: 1000, // Max entries (default: 1000)
ttl: 3600, // 1 hour expiration (default)
});
// Automatic LRU eviction when maxSize is exceeded
// Sub-millisecond lookup times
// Zero external dependencies
Pros: Fastest possible performance, no setup required
Cons: Not shared across instances, lost on restart
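To see the LRU eviction in action, here is a minimal sketch; the tiny maxSize and the stubbed API calls are illustrative assumptions, not recommended settings:
import { AIResponseCache } from 'smart-ai-cache';
// Deliberately tiny cache so eviction is easy to observe
const demo = new AIResponseCache({ storage: 'memory', maxSize: 2, ttl: 3600 });
// Stubbed "API call" so the sketch runs without provider credentials
const fakeCall = (answer) => async () => ({ choices: [{ message: { content: answer } }] });
await demo.wrap(fakeCall('A'), { provider: 'openai', model: 'gpt-4' });
await demo.wrap(fakeCall('B'), { provider: 'openai', model: 'gpt-4o' });
await demo.wrap(fakeCall('C'), { provider: 'openai', model: 'gpt-4o-mini' });
// With maxSize: 2, the oldest entry (gpt-4) should have been evicted,
// so repeating that request re-runs the function instead of hitting the cache
const before = demo.getStats().cacheHits;
await demo.wrap(fakeCall('A'), { provider: 'openai', model: 'gpt-4' });
console.log(demo.getStats().cacheHits === before); // true if the entry was evicted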
Redis Storage
Production option - Persistent, distributed caching:
import { AIResponseCache } from 'smart-ai-cache';
const cache = new AIResponseCache({
storage: 'redis',
redisOptions: {
host: 'localhost',
port: 6379,
password: 'your-redis-password', // If required
db: 0, // Redis database number
connectTimeout: 10000, // Connection timeout
retryDelayOnFailover: 1000, // Failover retry delay
},
keyPrefix: 'ai-cache:', // Namespace your keys
ttl: 7200, // 2 hours
});
// Automatic fallback to memory storage if Redis fails
// Shared across multiple application instances
// Survives application restarts
Pros: Persistent, scalable, shared across instances
Cons: Requires Redis server, network latency
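Because entries live in Redis rather than in process memory, separate cache instances (for example, two replicas of your service) can serve each other's entries. A rough sketch, assuming a local Redis on the default port and a stubbed API call:
import { AIResponseCache } from 'smart-ai-cache';
const redisOptions = { host: 'localhost', port: 6379 };
// Pretend these are two separate application instances
const instanceA = new AIResponseCache({ storage: 'redis', redisOptions, keyPrefix: 'ai-cache:' });
const instanceB = new AIResponseCache({ storage: 'redis', redisOptions, keyPrefix: 'ai-cache:' });
const fakeCall = async () => ({ choices: [{ message: { content: 'hello' } }] });
// Instance A populates the shared cache...
await instanceA.wrap(fakeCall, { provider: 'openai', model: 'gpt-4' });
// ...and instance B should answer the same request from Redis without re-running the call
await instanceB.wrap(fakeCall, { provider: 'openai', model: 'gpt-4' });
console.log(instanceB.getStats().cacheHits); // expected: 1 if the entry was shared
await instanceA.disconnect();
await instanceB.disconnect();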
Redis Production Setup
Docker Compose:
services:
  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"
    volumes:
      - redis_data:/data
    command: redis-server --appendonly yes
    environment:
      - REDIS_PASSWORD=your-secure-password
  your-app:
    build: .
    environment:
      - REDIS_HOST=redis
      - REDIS_PORT=6379
      - REDIS_PASSWORD=your-secure-password
    depends_on:
      - redis
volumes:
  redis_data:
Environment Variables:
# .env file
REDIS_HOST=your-redis-host.com
REDIS_PORT=6379
REDIS_PASSWORD=your-secure-password
REDIS_DB=0
const cache = new AIResponseCache({
storage: 'redis',
redisOptions: {
host: process.env.REDIS_HOST,
port: parseInt(process.env.REDIS_PORT || '6379', 10),
password: process.env.REDIS_PASSWORD,
db: parseInt(process.env.REDIS_DB || '0'),
},
});
Provider Examples
The package includes specialized classes for each AI provider with automatic cost tracking:
OpenAI
import { OpenAICache } from 'smart-ai-cache';
const cache = new OpenAICache({
ttl: 7200, // 2 hours
maxSize: 5000, // 5K entries
storage: 'redis', // Use Redis
redisOptions: {
host: 'localhost',
port: 6379,
}
});
// Automatically handles OpenAI-specific response types
const response = await cache.chatCompletion({
model: 'gpt-4o',
messages: [
{ role: 'system', content: 'You are a helpful assistant.' },
{ role: 'user', content: 'Explain quantum computing in simple terms' }
],
temperature: 0.7,
max_tokens: 500,
});
console.log(response.choices[0].message.content);
// Get OpenAI-specific analytics
const stats = cache.getStats();
console.log(`Cache hit rate: ${stats.hitRate}%`);
console.log(`Cost saved: $${stats.totalCostSaved.toFixed(4)}`);
console.log(`OpenAI requests: ${stats.byProvider.openai?.requests || 0}`);
Anthropic Claude
import { AnthropicCache } from 'smart-ai-cache';
const cache = new AnthropicCache({
storage: 'memory',
maxSize: 2000,
ttl: 3600,
});
const response = await cache.messages({
model: 'claude-3-5-sonnet-20241022',
max_tokens: 1000,
messages: [
{
role: 'user',
content: 'Write a Python function to calculate fibonacci numbers'
}
],
});
console.log(response.content[0].text);
// Claude-specific cost tracking
const stats = cache.getStats();
console.log(`Claude cost saved: $${stats.byProvider.anthropic?.costSaved || 0}`);
Google Gemini
import { GoogleCache } from 'smart-ai-cache';
const cache = new GoogleCache({
ttl: 1800, // 30 minutes
storage: 'redis',
}, process.env.GOOGLE_API_KEY);
const response = await cache.generateContent({
contents: [{
role: 'user',
parts: [{ text: 'What are the benefits of renewable energy?' }]
}],
}, 'gemini-1.5-pro');
console.log(response.response.text());
Configuration
Complete configuration options:
interface CacheConfig {
ttl?: number; // Time to live in seconds (default: 3600)
maxSize?: number; // Maximum cache entries (default: 1000)
storage?: 'memory' | 'redis'; // Storage backend (default: 'memory')
redisOptions?: RedisOptions; // Redis connection options
keyPrefix?: string; // Cache key prefix (default: 'ai-cache:')
enableStats?: boolean; // Enable statistics tracking (default: true)
debug?: boolean; // Enable debug logging (default: false)
}
Advanced Configuration:
const cache = new AIResponseCache({
ttl: 7200, // 2 hours expiration
maxSize: 10000, // 10K entries max
storage: 'redis',
redisOptions: {
host: process.env.REDIS_HOST || 'localhost',
port: parseInt(process.env.REDIS_PORT) || 6379,
password: process.env.REDIS_PASSWORD,
db: 0,
connectTimeout: 10000,
retryDelayOnFailover: 1000,
maxRetriesPerRequest: 3,
},
keyPrefix: 'myapp:ai-cache:', // Custom namespace
enableStats: true, // Track performance metrics
debug: process.env.NODE_ENV === 'development',
});
Performance
Run the built-in benchmark to validate performance in your environment:
npm run benchmark
Sample Output:
Cache lookup average: 0.94μs (0.0009ms)
Memory used for 10,000 entries: 2.86MB
Throughput: 451,842 requests/second
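For a quick sanity check without the full suite, a rough micro-benchmark along these lines (stubbed API call; numbers vary by machine) approximates cached-lookup latency:
import { AIResponseCache } from 'smart-ai-cache';
const cache = new AIResponseCache();
const fakeCall = async () => ({ choices: [{ message: { content: 'cached' } }] });
const options = { provider: 'openai', model: 'gpt-4' };
// Prime the cache once, then time repeated lookups of the same request
await cache.wrap(fakeCall, options);
const iterations = 10000;
const start = process.hrtime.bigint();
for (let i = 0; i < iterations; i++) {
  await cache.wrap(fakeCall, options);
}
const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
console.log(`Average cached lookup: ${(elapsedMs / iterations).toFixed(4)}ms`);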
Cache Management
Manual Cache Operations:
// Check cache size
const size = await cache.getCacheSize();
console.log(`Cache contains ${size} entries`);
// Clear specific entries by pattern
const deleted = await cache.deleteByPattern('openai:gpt-4:*');
console.log(`Deleted ${deleted} OpenAI GPT-4 entries`);
// Clear entire cache
await cache.clear();
// Delete specific key
const key = cache.generateKey('openai', 'gpt-4', prompt, params);
await cache.delete(key);
// Check if key exists
const exists = await cache.has(key);
Graceful Shutdown:
process.on('SIGTERM', async () => {
await cache.disconnect(); // Closes Redis connections
process.exit(0);
});
Migration Guide
From v1.0.4 to v1.0.5+
Breaking Changes:
- Cache methods are now async (clear(), delete(), has(), getCacheSize())
- Provider classes no longer require passing client instances
Before:
// v1.0.4 and earlier
const cache = new AIResponseCache();
cache.clear(); // Synchronous
const size = cache.getCacheSize(); // Synchronous
const openaiCache = new OpenAICache(config, openaiClient);
After:
// v1.0.5+
const cache = new AIResponseCache();
await cache.clear(); // Async
const size = await cache.getCacheSize(); // Async
const openaiCache = new OpenAICache(config); // No client needed
Migration Steps:
- Add await to cache management operations
- Remove client instances from provider constructors
- Update your error handling to use the new retry logic (see the sketch after this list)
- Consider upgrading to Redis for production deployments
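For the error-handling step, the main change is that hand-rolled retry loops around cached calls are no longer needed; the "before" loop below is only an illustration of code you may have written yourself (with cache and apiCall as placeholders), not something the package ever required:
// Before: manual retries around the cached call
let response;
for (let attempt = 0; attempt < 3; attempt++) {
  try {
    response = await cache.wrap(apiCall, { provider: 'openai', model: 'gpt-4' });
    break;
  } catch (error) {
    if (attempt === 2) throw error;
  }
}
// After (v1.0.5+): retries with exponential backoff are built in,
// so a single try/catch around the call is enough
try {
  response = await cache.wrap(apiCall, { provider: 'openai', model: 'gpt-4' });
} catch (error) {
  console.error('API call failed after retries:', error);
}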
Adding Redis to Existing Projects
Step 1: Install Redis dependency
npm install ioredis
Step 2: Update your cache configuration
// Before - Memory only
const cache = new AIResponseCache({ ttl: 3600 });
// After - Redis with fallback
const cache = new AIResponseCache({
storage: 'redis',
redisOptions: {
host: process.env.REDIS_HOST || 'localhost',
port: parseInt(process.env.REDIS_PORT) || 6379,
},
ttl: 3600,
});
Step 3: Add graceful shutdown
process.on('SIGTERM', async () => {
await cache.disconnect();
process.exit(0);
});
Advanced Features
Custom Key Generation
const cache = new AIResponseCache();
// Generate custom cache keys
const customKey = cache.generateKey('openai', 'gpt-4', prompt, params);
// Use custom keys for manual cache management
await cache.wrap(apiCall, {
provider: 'openai',
model: 'gpt-4',
cacheKey: customKey
});
Statistics and Monitoring
const stats = cache.getStats();
console.log('Cache Performance:');
console.log(`├─ Total Requests: ${stats.totalRequests}`);
console.log(`├─ Cache Hits: ${stats.cacheHits} (${stats.hitRate.toFixed(1)}%)`);
console.log(`├─ Cost Saved: $${stats.totalCostSaved.toFixed(4)}`);
console.log(`└─ Avg Response Time: ${stats.averageResponseTime.toFixed(2)}ms`);
// Provider-specific stats
Object.entries(stats.byProvider).forEach(([provider, data]) => {
console.log(`${provider}: ${data.hits}/${data.requests} hits, $${data.costSaved.toFixed(4)} saved`);
});
// Reset statistics
cache.resetStats();
Cache Invalidation Patterns
// Delete all OpenAI GPT-4 entries
await cache.deleteByPattern('*openai:gpt-4:*');
// Delete all entries from a specific time period
await cache.deleteByPattern(`*${today}*`);
// Delete provider-specific entries
await cache.deleteByPattern('*anthropic:*');
Error handling and reliability
The package handles errors automatically:
- Automatic retries with exponential backoff (3 attempts)
- Graceful degradation when cache storage fails
- Circuit breaker pattern for Redis connection issues
- Fallback to memory when Redis is unavailable
// Errors are handled automatically, but you can catch them
try {
const response = await cache.wrap(apiCall, options);
} catch (error) {
console.error('API call failed after retries:', error);
// Your fallback logic here
}
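What that fallback looks like is up to you. One possibility, sketched below with a hypothetical canned response (the cache and openai instances come from the earlier examples; the fallback message is not something the package provides):
async function answerWithFallback(prompt) {
  try {
    return await cache.wrap(
      () => openai.chat.completions.create({
        model: 'gpt-4',
        messages: [{ role: 'user', content: prompt }],
      }),
      { provider: 'openai', model: 'gpt-4' }
    );
  } catch (error) {
    // All retries exhausted: serve a degraded answer instead of failing the request
    console.error('Falling back after repeated failures:', error);
    return { choices: [{ message: { content: 'Sorry, I cannot answer that right now.' } }] };
  }
}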
API Reference
Core Classes
- AIResponseCache - Main caching class with storage abstraction
- OpenAICache - OpenAI-specific wrapper with cost tracking
- AnthropicCache - Anthropic Claude wrapper with cost tracking
- GoogleCache - Google Gemini wrapper
- MemoryStorage - In-memory storage implementation
- RedisStorage - Redis storage implementation
Key Methods
- wrap(fn, options) - Cache a function call
- clear() - Clear all cache entries
- delete(key) - Delete a specific entry
- deleteByPattern(pattern) - Pattern-based deletion
- getStats() - Get performance statistics
- disconnect() - Close storage connections
For complete API documentation, see the TypeDoc documentation.
Contributing
We welcome contributions! Here's how to get started:
- Fork the repository
- Create your feature branch (git checkout -b feature/amazing-feature)
- Run tests (npm test)
- Run benchmarks (npm run benchmark)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
License
MIT License - see LICENSE file for details.
Made with ❤️ for the AI developer community
