resilient-cache
v0.3.1
Resilient Redis/Valkey cache client with graceful degradation, fast failure detection, and circuit breaker-style reconnection.
Features
- Non-critical: App works even if cache is down
- Fail-fast: Quick failure timeout (don't wait for slow connections)
- Circuit breaker: Automatic cooldown after failure, prevents cascade failures
- Graceful degradation: Return default values when unavailable
- Auto-connect: No manual connect() required, lazy connection on first command
- Cache-aside pattern: Built-in getOrSet() for common fetch-or-compute pattern
- Configurable error handling: Graceful (default) or throw exceptions
- Await-friendly API: Modern async/await interface
- Replaceable: Provider pattern for swapping implementations
- TypeScript: Full type safety with comprehensive type definitions
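The fail-fast item can be pictured as a race between the command and a timer. The sketch below is a generic illustration of that idea, not the library's implementation; `withTimeout` is a hypothetical helper name, and the error it rejects with is a plain `Error` rather than the library's `CacheTimeoutError`.

```typescript
// Hypothetical helper: reject if `work` does not settle within `timeoutMs`.
function withTimeout<T>(work: Promise<T>, timeoutMs: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(
      () => reject(new Error(`timed out after ${timeoutMs} ms`)),
      timeoutMs,
    );
  });
  // Whichever settles first wins; the timer is always cleaned up afterwards.
  return Promise.race([work, timeout]).finally(() => clearTimeout(timer));
}
```

A caller in graceful mode would catch the rejection and return its default value instead of waiting on a slow connection.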
Installation
npm install resilient-cache
Requirements: Node.js >= 20.0.0
Usage
Basic Usage (Graceful Mode - Default)
import { ResilientCacheClient, CacheKeyBuilder, CacheProvider } from 'resilient-cache';
// Initialize at app startup - no connect() needed, auto-connects on first command
const client = new ResilientCacheClient({
host: process.env.REDIS_HOST!,
port: 6379,
password: process.env.REDIS_PASSWORD,
// onError: 'graceful' is the default
// autoConnect: true is the default
});
CacheProvider.initialize(client);
// Key builder for consistent key naming
const keys = new CacheKeyBuilder({ app: 'myapp', env: 'production' });
// In service code - returns null if cache unavailable (no try/catch needed)
const orgKeys = keys.forTenant('legal').forOrg('org123');
const cached = await client.get(orgKeys.key('prompts'), null);
if (!cached) {
// Cache miss or unavailable - fetch from DB
const fresh = await loadFromDatabase();
await client.set(orgKeys.key('prompts'), fresh, 7200); // Returns false if unavailable
}
Cache-Aside Pattern with getOrSet()
// Simplify fetch-or-compute pattern with getOrSet()
const userData = await client.getOrSet(
keys.forUser(userId).key('profile'),
async () => {
// Called only on cache miss
return await fetchUserFromDatabase(userId);
},
3600 // TTL in seconds
);
// Returns cached value if available, otherwise calls factory and caches result
// With staleness validation - refetch if cached value is stale
const prompts = await client.getOrSet(
keys.forTenant(tenant).key('prompts'),
async () => fetchPromptsFromDatabase(tenant),
7200,
{
isValid: (cached) => cached.version >= requiredVersion
}
);
// If isValid returns false, factory is called and cache is updated
Batch Operations
// Fetch multiple keys in one round trip
const userKeys = ['user:1', 'user:2', 'user:3']; // named userKeys to avoid clashing with the CacheKeyBuilder `keys` above
const users = await client.getMany<User>(userKeys);
// Returns [User | null, User | null, User | null]
// Set multiple keys with same TTL
await client.setMany([
{ key: 'user:1', value: user1 },
{ key: 'user:2', value: user2 },
], 3600);
Distributed Locks with setIfNotExists
// Simple lock - graceful mode (cache down = lock not acquired)
const acquired = await client.setIfNotExists(lockKey, 'owner-id', 30);
if (acquired) {
try {
// Do critical work
} finally {
await client.remove(lockKey);
}
}
// For mutex where you need to distinguish "lock held" vs "cache down":
try {
const acquired = await client.setIfNotExists(lockKey, 'owner-id', 30, { onError: 'throw' });
if (!acquired) {
console.log('Lock held by another process');
}
} catch (e) {
if (e instanceof CacheUnavailableError) {
console.log('Cache unavailable - cannot acquire lock');
// Handle differently: retry, fail, or proceed without lock
}
}
Note: In graceful mode (default), setIfNotExists returns false both when the key already exists and when the cache is unavailable. For mutex patterns where this distinction matters, use { onError: 'throw' }.
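One way to make the three outcomes explicit is to wrap the throw-mode call and map it to a tagged result. The sketch below is self-contained so it can run on its own: `LockClient`, `LockOutcome`, `tryAcquire`, and the local `CacheUnavailableError` stand-in are hypothetical names for illustration. In real code the error class would be the one exported by resilient-cache.

```typescript
// Stand-in for the library's error class so the sketch runs in isolation.
class CacheUnavailableError extends Error {}

// Assumed minimal slice of the client, mirroring the call shown above.
interface LockClient {
  setIfNotExists(
    key: string,
    value: string,
    ttlSeconds: number,
    options: { onError: 'throw' },
  ): Promise<boolean>;
}

type LockOutcome = 'acquired' | 'held' | 'unavailable';

async function tryAcquire(
  client: LockClient,
  key: string,
  owner: string,
  ttlSeconds: number,
): Promise<LockOutcome> {
  try {
    const ok = await client.setIfNotExists(key, owner, ttlSeconds, { onError: 'throw' });
    return ok ? 'acquired' : 'held';
  } catch (e) {
    if (e instanceof CacheUnavailableError) return 'unavailable';
    throw e; // unrelated errors are not swallowed
  }
}
```

The caller can then branch on the outcome and decide per use case whether 'unavailable' means retry, fail, or proceed without the lock.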
Throw Mode for Specific Calls
import { CacheProvider, CacheUnavailableError } from 'resilient-cache';
// In admin action handler where you want to show errors
export async function clearCache() {
const client = CacheProvider.getClient();
try {
// Override to throw - we want to show errors in UI
const keysDeleted = await client.removeByPrefix('myapp:*', { onError: 'throw' });
return { success: true, keysDeleted };
} catch (error) {
if (error instanceof CacheUnavailableError) {
return { success: false, error: 'Cache is unavailable' };
}
throw error;
}
}
Rate Limiting (Fail-Open)
const rateLimitKey = keys.forOrg(orgId).key('ratelimit:chat');
// Graceful: returns defaultValue (30) if cache unavailable
// This means rate limiting is disabled when cache is down (fail-open)
const remaining = await client.decrementOrInit(rateLimitKey, 30, 90);
if (remaining < 0) {
throw new Error('Too many requests');
}
Health Check Endpoint
// GET /api/health/cache
export async function cacheHealthCheck() {
const client = CacheProvider.getClient();
try {
// Throw mode to detect actual connectivity
await client.ping({ onError: 'throw' });
return { status: 'healthy', connected: true };
} catch (error) {
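// catch variables are `unknown` under strict TypeScript, so narrow before reading .message
return { status: 'unhealthy', connected: false, error: error instanceof Error ? error.message : String(error) };
}
}
Checking Key Existence and TTL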
// Check if a key exists before expensive operations
if (await client.exists(lockKey)) {
throw new Error('Operation already in progress');
}
// Check remaining TTL for cache warming decisions
const remaining = await client.ttl(cacheKey);
if (remaining > 0 && remaining < 60) {
// Cache expires soon - trigger background refresh
refreshInBackground(cacheKey);
}
// Note: ttl() returns -1 if key has no TTL, -2 if key doesn't exist
Monitoring Connection Status
// Get detailed connection status for dashboards/monitoring
const status = client.getStatus();
console.log({
state: status.state, // 'connected', 'cooldown', 'failed', etc.
lastSuccessAt: status.lastSuccessAt, // Last successful operation
lastError: status.lastError?.message,
reconnectAttempts: status.reconnectAttempts,
cooldownEndsAt: status.cooldownEndsAt, // When cooldown expires (if in cooldown)
});
// Register for state change notifications
client.onStateChange((status) => {
if (status.state === 'cooldown') {
logger.warn('Cache entered cooldown', { error: status.lastError });
} else if (status.state === 'connected') {
logger.info('Cache reconnected');
}
});
Testing with MockCacheClient
import { MockCacheClient, CacheProvider } from 'resilient-cache';
describe('MyService', () => {
let mockClient: MockCacheClient;
beforeEach(() => {
mockClient = new MockCacheClient();
CacheProvider.reset();
CacheProvider.initialize(mockClient);
});
it('should handle cache failure gracefully', async () => {
mockClient.setSimulateFailure(true);
// Your service should still work
const result = await myService.getData();
expect(result).toBeDefined();
});
it('should use cached value when available', async () => {
await mockClient.set('user:123', { name: 'John' });
const result = await myService.getUser('123');
expect(result.name).toBe('John');
});
it('should verify cache was populated', async () => {
await myService.getUser('123'); // Should cache the result
// Access internal store for assertions
const store = mockClient.getStore();
expect(store.has('user:123')).toBe(true);
});
});
API Reference
ResilientCacheClient
const client = new ResilientCacheClient({
host: string; // Redis/Valkey host
port: number; // Redis/Valkey port
password?: string; // Password (optional)
connectTimeout?: number; // Connection timeout in ms (default: 1000)
commandTimeout?: number; // Command timeout in ms (default: 500)
reconnectDelay?: number; // Delay before reconnect in ms (default: 10000)
maxReconnectAttempts?: number; // Max reconnect attempts (default: Infinity)
enableOfflineQueue?: boolean; // Queue commands when disconnected (default: false)
onError?: 'graceful' | 'throw'; // Error handling mode (default: 'graceful')
autoConnect?: boolean; // Auto-connect on first command (default: true)
});
Methods
| Method | Return (graceful) | Return (throw) |
|--------|-------------------|----------------|
| connect() | Promise<void> | Promise<void> |
| disconnect() | Promise<void> | Promise<void> |
| isReady() | boolean | boolean |
| getStatus() | ConnectionStatus | ConnectionStatus |
| ping(options?) | false | throws CacheUnavailableError |
| get<T>(key, defaultValue?, options?) | defaultValue \| null | throws CacheUnavailableError |
| set<T>(key, value, ttlSeconds?, options?) | false | throws CacheUnavailableError |
| remove(key, options?) | false | throws CacheUnavailableError |
| removeByPrefix(prefix, options?) | -1 | throws CacheUnavailableError |
| removeAll(options?) | false | throws CacheUnavailableError |
| increment(key, amount?, defaultValue?, options?) | defaultValue | throws CacheUnavailableError |
| decrement(key, amount?, defaultValue?, options?) | defaultValue | throws CacheUnavailableError |
| decrementOrInit(key, defaultValue, ttlSeconds, options?) | defaultValue | throws CacheUnavailableError |
| getOrSet<T>(key, factory, ttlSeconds?, options?) | factory result | throws CacheUnavailableError |
| exists(key, options?) | false | throws CacheUnavailableError |
| ttl(key, options?) | -2 | throws CacheUnavailableError |
| setIfNotExists<T>(key, value, ttlSeconds?, options?) | false | throws CacheUnavailableError |
| getMany<T>(keys, options?) | null[] (same length as input) | throws CacheUnavailableError |
| setMany<T>(entries, ttlSeconds?, options?) | false | throws CacheUnavailableError |
| expire(key, ttlSeconds, options?) | false | throws CacheUnavailableError |
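Some graceful-mode defaults in the table overlap with ordinary results. For example, ttl() yields -2 both when the key does not exist and when the cache is unavailable. A small helper can make that ambiguity visible in the type system. This is a sketch only: `TtlInfo`, `interpretTtl`, and `shouldRefreshSoon` are hypothetical names, not part of the library.

```typescript
type TtlInfo =
  | { kind: 'expires'; seconds: number }
  | { kind: 'no-expiry' }
  | { kind: 'missing-or-unavailable' };

// Map the numeric ttl() result onto an explicit shape.
function interpretTtl(ttl: number): TtlInfo {
  if (ttl === -1) return { kind: 'no-expiry' };
  if (ttl < 0) return { kind: 'missing-or-unavailable' }; // -2 in either case
  return { kind: 'expires', seconds: ttl };
}

// Same condition as the cache-warming example: 0 < ttl < threshold.
function shouldRefreshSoon(ttl: number, thresholdSeconds: number): boolean {
  const info = interpretTtl(ttl);
  return info.kind === 'expires' && info.seconds > 0 && info.seconds < thresholdSeconds;
}
```

When the difference between "missing" and "unavailable" matters, the call would need { onError: 'throw' } instead.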
getOrSet options:
- isValid?: (value: T) => boolean | Promise<boolean> - Optional validator called when the cache hits. Return false to treat the value as stale (triggers factory call and cache update).
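The hit, miss, and stale paths described for getOrSet can be sketched over an in-memory Map. This is an illustration of the documented behavior under simplifying assumptions (no TTL, no error handling), not the library's code; `getOrSetSketch`, `Validator`, and `store` are hypothetical names.

```typescript
type Validator<T> = (value: T) => boolean | Promise<boolean>;

// In-memory stand-in for the cache; TTL handling is deliberately omitted.
const store = new Map<string, unknown>();

async function getOrSetSketch<T>(
  key: string,
  factory: () => Promise<T>,
  options?: { isValid?: Validator<T> },
): Promise<T> {
  if (store.has(key)) {
    const cached = store.get(key) as T;
    const valid = options?.isValid ? await options.isValid(cached) : true;
    if (valid) return cached; // fresh hit: factory is not called
  }
  const fresh = await factory(); // miss or stale: recompute
  store.set(key, fresh); // and overwrite the cached entry
  return fresh;
}
```

With a factory that returns increasing version numbers, a second call reuses the first result, while a call whose isValid rejects the cached version triggers exactly one more factory call.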
setIfNotExists note:
- In graceful mode, returns false both when the key exists and when the cache is unavailable. For mutex/lock patterns, use { onError: 'throw' } to distinguish these cases.
CacheKeyBuilder
const keys = new CacheKeyBuilder({ app: 'myapp', env: 'production' });
keys.forTenant('legal'); // Add tenant scope
keys.forOrg('org123'); // Add org scope
keys.forUser('user456'); // Add user scope
keys.build(); // Get prefix string
keys.key('suffix'); // Get full key with suffix
keys.toPattern(); // Get pattern for removeByPrefix
keys.clone(); // Create independent copy
CacheProvider
CacheProvider.initialize(client); // Initialize with client
CacheProvider.getClient(); // Get initialized client
CacheProvider.isInitialized(); // Check if initialized
CacheProvider.reset(); // Reset (for testing)
Error Classes
import { CacheUnavailableError, CacheTimeoutError } from 'resilient-cache';
// CacheUnavailableError - thrown when cache operation fails
error.cause; // Original error
error.operation; // Operation that failed ('get', 'set', etc.)
// CacheTimeoutError extends CacheUnavailableError
error.timeoutMs; // Timeout duration
Design Principles
This library is built around the principle that cache is non-critical infrastructure. Your application should continue working even when Redis/Valkey is completely unavailable.
Key Design Decisions
| Principle | Behavior |
|-----------|----------|
| Always available | All commands work even when Redis is down - they return graceful defaults |
| No manual connect | Auto-connects on first command, auto-reconnects after failures |
| Fail fast | Commands never block waiting for connection - return defaults immediately |
| Command-driven reconnect | Reconnection only happens when commands need it, not via background timers |
| Circuit breaker | After failure, enters cooldown to prevent retry storms |
| Status transparency | getStatus() provides real connection state for monitoring |
Connection Behavior
Scenario: Redis starts down, comes up later
Request 1 → triggers connect attempt → fails → enters cooldown → returns default
Request 2 → cooldown active → returns default immediately (no retry)
Request 3 → cooldown active → returns default immediately (no retry)
...
[cooldown expires after 10s]
Request N → cooldown expired → triggers reconnect → Redis is back! → returns real value
Request N+1 → connected → returns real value
Scenario: Many concurrent requests during connection
Request 1 → triggers connect attempt → waiting...
Request 2 → sees "connecting" state → returns default immediately (fail fast)
Request 3 → sees "connecting" state → returns default immediately (fail fast)
...
Request 1 → connection succeeds → returns real value
Request N → now connected → returns real value
Forcing Reconnection During Cooldown
If you need to bypass the circuit breaker and force an immediate reconnection (e.g., after a deployment or manual intervention), call connect() explicitly:
// Force reconnection even during cooldown
await client.connect();
This clears the cooldown state and attempts to connect immediately. If it fails, the client enters cooldown again.
When to Use Throw Mode
Use { onError: 'throw' } only when you need to know if cache failed:
- Health check endpoints
- Admin cache management UIs
- Critical operations where cache failure should block the operation
For normal application code, use graceful mode (default) - your app keeps working.
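The relationship between the client-wide default and a per-call { onError } override can be sketched as a small wrapper. This is an assumption about how such a switch could be resolved, not the library's internals; `runWithMode` and `OnErrorMode` are hypothetical names.

```typescript
type OnErrorMode = 'graceful' | 'throw';

async function runWithMode<T>(
  operation: () => Promise<T>,
  fallback: T,
  clientDefault: OnErrorMode,
  callOverride?: OnErrorMode,
): Promise<T> {
  const mode = callOverride ?? clientDefault; // per-call option wins
  try {
    return await operation();
  } catch (err) {
    if (mode === 'throw') throw err; // surface the failure to the caller
    return fallback; // graceful: hide the failure behind a default
  }
}
```

A health check would pass 'throw' as the override, while ordinary reads would rely on the graceful default and receive the fallback value.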
Connection State Machine
            ┌─────────────┐
  first     │disconnected │
command ───►└──────┬──────┘
                   │ auto-connect
                   ▼
            ┌─────────────┐
      ┌────►│ connecting  │◄────┐
      │     └──────┬──────┘     │
      │            │            │
      │    success │ failure    │
      │            ▼            │
      │     ┌─────────────┐     │
      │     │  connected  │     │
      │     └──────┬──────┘     │
      │            │            │
      │       lost │            │
      │            ▼            │
      │     ┌─────────────┐     │
      │     │  cooldown   │─────┤ failure
      │     └──────┬──────┘     │
      │            │            │
      │   cooldown │            │
      │    expired │            │
      │   + command│            │
      │            ▼            │
      │     ┌─────────────┐     │
      └─────│reconnecting │─────┘
            └──────┬──────┘
                   │
       max retries │
                   ▼
            ┌─────────────┐
            │   failed    │───► next command resets & retries
            └─────────────┘
This is a simplified circuit breaker pattern:
| Circuit Breaker | This Library | Behavior |
|-----------------|--------------|----------|
| CLOSED | connected | Normal operation |
| OPEN | cooldown | Reject immediately, don't attempt connection |
| HALF-OPEN | reconnecting | Test if service recovered |
Key behaviors:
- No timers: Reconnection is command-driven, not timer-driven. If no commands are issued, no reconnection attempts are made.
- Immediate cooldown: Unlike traditional circuit breakers that trip after N failures, this enters cooldown on any connection failure (appropriate for cache where one timeout means trouble).
- Fail fast during connecting: Concurrent requests during connection don't pile up - they get graceful defaults immediately.
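The three behaviors above can be sketched as a small state holder that is consulted at the start of every command. This is a simplified model under stated assumptions, not the library's implementation: it omits the reconnecting and failed states, `CooldownGate` and its method names are hypothetical, and the clock is injected so the example needs no timers.

```typescript
type GateState = 'disconnected' | 'connecting' | 'connected' | 'cooldown';
type Decision = 'use-cache' | 'try-connect' | 'return-default';

class CooldownGate {
  private state: GateState = 'disconnected';
  private cooldownEndsAt = 0;
  private reconnectDelayMs: number;
  private now: () => number;

  constructor(reconnectDelayMs: number, now: () => number) {
    this.reconnectDelayMs = reconnectDelayMs;
    this.now = now;
  }

  getState(): GateState {
    return this.state;
  }

  // Called by each command; reconnection is driven only from here.
  onCommand(): Decision {
    switch (this.state) {
      case 'connected':
        return 'use-cache';
      case 'connecting':
        return 'return-default'; // fail fast while another command connects
      case 'cooldown':
        if (this.now() < this.cooldownEndsAt) return 'return-default';
        this.state = 'connecting';
        return 'try-connect';
      default: // 'disconnected'
        this.state = 'connecting';
        return 'try-connect';
    }
  }

  connectSucceeded(): void {
    this.state = 'connected';
  }

  // A single failure opens the breaker immediately.
  connectionFailed(): void {
    this.state = 'cooldown';
    this.cooldownEndsAt = this.now() + this.reconnectDelayMs;
  }

  // Mirrors an explicit connect(): clear cooldown and retry right away.
  forceConnect(): Decision {
    this.cooldownEndsAt = 0;
    this.state = 'connecting';
    return 'try-connect';
  }
}
```

With a 10000 ms delay, a command issued 5 seconds after a failure gets 'return-default', and the first command at or after the 10 second mark gets 'try-connect'. If no command arrives, nothing happens at all.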
Development
Prerequisites
- Node.js >= 20.0.0
- Docker (for integration tests)
Setup
# Clone the repository
git clone https://github.com/able-wong/resilient-cache.git
cd resilient-cache
# Install dependencies
npm install
Scripts
| Command | Description |
|---------|-------------|
| npm run build | Compile TypeScript to dist/ |
| npm test | Run unit tests |
| npm run test:watch | Run unit tests in watch mode |
| npm run test:coverage | Run unit tests with coverage |
| npm run test:integration | Run integration tests (starts Docker Redis automatically) |
| npm run test:all | Run both unit and integration tests |
| npm run lint | Check code style |
| npm run lint:fix | Fix code style issues |
Project Structure
resilient-cache/
├── src/
│ ├── index.ts # Public exports
│ ├── types.ts # Type definitions
│ ├── errors.ts # Custom error classes
│ ├── ResilientCacheClient.ts # Main client implementation
│ ├── CacheKeyBuilder.ts # Key prefix utility
│ ├── CacheProvider.ts # Factory/DI pattern
│ └── MockCacheClient.ts # In-memory mock for testing
├── tests/
│ ├── *.test.ts # Unit tests
│ └── integration/ # Integration tests (require Docker)
├── dist/ # Compiled output (generated)
├── docker-compose.yml # Redis for integration tests
├── vitest.config.ts # Unit test config
└── vitest.integration.config.ts # Integration test config
Running Integration Tests
Integration tests run against a real Redis instance via Docker:
npm run test:integration
This command automatically:
- Starts a Redis container (docker compose up -d --wait)
- Waits for Redis health check to pass
- Runs the integration test suite
- Stops and removes the container (docker compose down)
To run Redis manually for development:
# Start Redis
docker compose up -d
# Run tests (multiple times if needed)
npx vitest run --config vitest.integration.config.ts
# Stop Redis when done
docker compose down
Code Style
- ESLint + Prettier for formatting
- Single quotes, semicolons, trailing commas
- Run npm run lint:fix before committing
Contributing
- Fork the repository
- Create a feature branch (git checkout -b feature/my-feature)
- Make your changes
- Run tests (npm run test:all)
- Run linter (npm run lint:fix)
- Commit your changes
- Push to your fork
- Open a Pull Request
Changelog
0.3.1
- Security: patched high-severity vulnerabilities in flatted, minimatch, and rollup
- Security: upgraded vitest from v2 to v4 (and related packages) to patch moderate esbuild vulnerability
0.3.0
- Added batch operations and distributed lock support
0.2.0
- Added manual connect() support to bypass cooldown
License
Apache-2.0
