@nodellmcache/memory
v1.0.0
In-memory LRU storage adapter for NodeLLMCache. The Tier-0 "hot" backend: sub-millisecond reads, byte-budgeted LRU eviction, TTL expiry, and optional per-entry compression.
Implements StorageAdapter<T> from @nodellmcache/core, so any feature cache (prompt, embedding, semantic, …) can use it by injection.
Install
```sh
npm install @nodellmcache/memory @nodellmcache/core
# only if you enable compression and don't inject your own engine:
npm install @nodellmcache/compression
```

Quick start
```ts
import { MemoryAdapter } from '@nodellmcache/memory'

const store = new MemoryAdapter({
  maxSize: 256 * 1024 * 1024, // 256 MB; default is 500 MB
  defaultTTL: 3_600_000, // 1 hour
})

await store.set('key', {
  key: 'key',
  value: { answer: 42 },
  createdAt: Date.now(),
  metadata: { compressed: false, originalSize: 0, cacheType: 'prompt' },
})

const entry = await store.get('key') // { value: { answer: 42 }, ... } or null
```

Typically you don't call the adapter directly; you inject it into a cache manager:
```ts
import { MemoryAdapter } from '@nodellmcache/memory'
import { PromptCache } from '@nodellmcache/prompt-cache'

const cache = new PromptCache({ adapter: new MemoryAdapter() })
```

Behavior
LRU eviction: entries are budgeted by an approximate byte size (`estimateSize`). When a `set` would exceed `maxSize`, least-recently-used entries are evicted (and a `cache.evict` metric is emitted). A read counts as recent use. A single value larger than `maxSize` is not stored.

TTL: `set(key, entry, ttl)` takes a relative TTL in ms; otherwise the entry's own `expiresAt` or the adapter `defaultTTL` applies. Expiry is enforced both by a self-unref'ing timer and a defensive check on read.

Compression (optional): set `compression` to `'auto'` or a specific algorithm to serialize + compress each value (trading CPU for memory). The engine is lazily loaded from `@nodellmcache/compression`, or you can inject one:

```ts
import { CompressionEngine } from '@nodellmcache/compression'
import { MemoryAdapter } from '@nodellmcache/memory'

new MemoryAdapter({ compression: 'auto', compressionEngine: new CompressionEngine() })
```

Compression serializes values (JSON by default), so non-JSON-safe types (e.g. `Float32Array`) round-trip as plain arrays. For binary embeddings, `@nodellmcache/embedding-cache` handles buffers directly.
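The eviction and expiry rules above can be illustrated with a small standalone sketch. It is not the adapter's source: `TinyLru`, `approxSize`, and the explicit `now` parameter are hypothetical names introduced here, size is assumed to be the length of the JSON encoding, and only the check-on-read half of TTL enforcement is modeled (no timer).

```ts
// Hypothetical sketch; none of these names come from @nodellmcache/memory.
interface Slot<T> {
  value: T
  size: number                  // approximate bytes charged against the budget
  expiresAt: number | undefined // absolute epoch ms; undefined means no expiry
}

// Assumption: approximate a value's size by the length of its JSON encoding.
function approxSize(value: unknown): number {
  return (JSON.stringify(value) ?? '').length
}

class TinyLru<T> {
  private slots = new Map<string, Slot<T>>() // a Map iterates in insertion order
  private used = 0
  private evictions = 0
  private readonly maxSize: number
  private readonly defaultTTL: number | undefined

  constructor(maxSize: number, defaultTTL?: number) {
    this.maxSize = maxSize
    this.defaultTTL = defaultTTL
  }

  set(key: string, value: T, ttl?: number, now: number = Date.now()): boolean {
    const size = approxSize(value)
    if (size > this.maxSize) return false // larger than the whole budget: not stored
    this.remove(key) // replacing a key releases its old bytes first
    // Evict from the front of the Map (least recently used) until the value fits.
    while (this.used + size > this.maxSize) {
      const oldest = this.slots.keys().next().value
      if (oldest === undefined) break
      this.remove(oldest)
      this.evictions++
    }
    const life = ttl ?? this.defaultTTL
    const expiresAt = life === undefined ? undefined : now + life
    this.slots.set(key, { value, size, expiresAt })
    this.used += size
    return true
  }

  get(key: string, now: number = Date.now()): T | null {
    const slot = this.slots.get(key)
    if (slot === undefined) return null
    if (slot.expiresAt !== undefined && slot.expiresAt <= now) {
      this.remove(key) // defensive expiry check on read
      return null
    }
    // Re-insert so the key moves to the back of the Map (most recently used).
    this.slots.delete(key)
    this.slots.set(key, slot)
    return slot.value
  }

  stats() {
    return { entryCount: this.slots.size, sizeBytes: this.used, evictions: this.evictions }
  }

  private remove(key: string): void {
    const slot = this.slots.get(key)
    if (slot === undefined) return
    this.used -= slot.size
    this.slots.delete(key)
  }
}

const lru = new TinyLru<string>(10) // budget of 10 "bytes"
lru.set('a', 'xx', undefined, 0)    // '"xx"' is 4 characters, so 4 bytes used
lru.set('b', 'yy', undefined, 0)    // 8 bytes used
lru.get('a', 0)                     // touching 'a' makes 'b' the least recently used
lru.set('c', 'zz', 100, 0)          // does not fit, so 'b' is evicted; 'c' expires at t = 100
```

Passing `now` explicitly keeps the example deterministic. A real adapter would read the clock itself and, as described above, also schedule a timer so expired entries are released without waiting for a read.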
Options
| Option | Default | Description |
|--------|---------|-------------|
| maxSize | 500 MB | Byte budget before LRU eviction |
| defaultTTL | none | Fallback relative TTL (ms) |
| compression | false | 'auto', a CompressionAlgo value, or false/'none' |
| compressionEngine | lazy-loaded | Inject a CompressionEngine to avoid the optional dep |
| serializer | JsonSerializer | Used only when compression is enabled |
| metrics | no-op | Receives cache.evict events |
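The `serializer` option matters because JSON has no typed-array type, as noted under Behavior. The check below uses only the built-in `JSON` object and standard typed-array methods to show what is lost and how a consumer might restore it. It is a sketch of the general effect, not of the package's `JsonSerializer`, whose exact encoding may differ.

```ts
// Standalone check with built-in JSON only; no @nodellmcache imports.
const embedding = new Float32Array([0.5, 0.25, 1])

// Convert to a plain array first so JSON encodes it as a JSON array.
const wire = JSON.stringify(Array.from(embedding))
const decoded: unknown = JSON.parse(wire)

// After the round trip the value is an ordinary array, not a Float32Array.
const isTyped = decoded instanceof Float32Array
const isPlainArray = Array.isArray(decoded)

// Restore the typed array explicitly if the consumer needs one.
const restored = Float32Array.from(decoded as number[])
```

If a cache stores many embeddings, this conversion cost on every read is one reason to keep binary data out of the JSON path.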
stats()
```ts
const { entryCount, sizeBytes, evictions } = await store.stats()
```

License
MIT
