@mukundakatta/agentmemory
v0.2.0
Honest pull-model alternative to Anthropic Dreaming. Time-bucketed episodic store + on-demand summarizer for LLM agents. Reversible deletes, no silent context injection, no background consolidation. Includes Postgres adapter and Claude demo.
agentmemory
Honest pull-model memory for LLM agents. The open-source alternative to background-consolidation systems like Anthropic Dreaming, with a different shape: nothing happens in the background, every retrieval shows its work, and deletes are real deletes.
Why this exists
Anthropic shipped Dreaming on May 6, 2026: a managed background consolidation pass that turns episodic conversation traces into semantic memory the next session can use. The OSS reflex is to clone it next weekend with Llama or Qwen. I sat with that and walked away. Full reasoning in Why I refused to build a Dreaming clone for OSS Claude.
The short version: the consolidator IS the model. If you run a smaller LLM as the summarizer, you get a different feature with the same name and lower quality. Deletion also gets harder once memories are baked into derived summaries.
agentmemory is a different shape that solves the same job: pull-on-demand instead of push-in-background. The latency tax is real (200ms-2s on cold start). In exchange you get full reversibility, no derived artifacts, and the user can see exactly what was retrieved before it goes into the context.
Install
npm install @mukundakatta/agentmemory
Requires Node 20+. Pure ESM, zero runtime dependencies.
Three pieces
1. EpisodicStore
Append-only event log of agent interactions. Embedded at write time when an embedder is configured. Real deletes, no tombstones, no derived artifacts.
import { EpisodicStore } from "@mukundakatta/agentmemory";
const store = new EpisodicStore({
embedder: async (text) => myEmbedder(text), // optional; falls back to keyword overlap
});
await store.append({
sessionId: "user-42",
kind: "user_message",
text: "I prefer Postgres for the new project",
});
const hits = await store.retrieve("which database should I use", {
sessionId: "user-42",
topK: 5,
});
// Real delete: gone, no trace.
const eventId = hits[0].id;
store.deleteEvent(eventId);
// Retention policies are first-class:
store.deleteOlderThan(Date.now() - 30 * 24 * 60 * 60 * 1000);
store.deleteSession("user-42");
2. OnDemandSummarizer
The pull-model context builder. Bring your own LLM. The summary is shown in the trace, never silently injected.
import { OnDemandSummarizer } from "@mukundakatta/agentmemory";
import Anthropic from "@anthropic-ai/sdk";
const claude = new Anthropic();
const summarizer = new OnDemandSummarizer({
llm: async (prompt) => {
const r = await claude.messages.create({
model: "claude-3-5-haiku-latest",
max_tokens: 400,
messages: [{ role: "user", content: prompt }],
});
return r.content[0].text;
},
maxTokens: 300,
});
const events = await store.retrieve("pick a database", { topK: 5 });
const { summary, trace } = await summarizer.summarize(events, "pick a database");
console.log("Summary:", summary);
console.log("Built from event ids:", trace.eventIds);
console.log("Prompt sent to LLM:", trace.prompt);
The trace lets you show the user (or log to your audit trail) exactly which events fed the summary. This is the key honesty property: nothing silent, nothing magical.
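As a rough illustration of the audit-trail use, the sketch below turns a `{ summary, trace }` result into one JSON log line. `toAuditLine` and the record's field names are hypothetical; the only shapes taken from this README are `trace.eventIds` and `trace.prompt`.

```javascript
// Hypothetical helper, not part of agentmemory: serialize one summarizer
// result into a single JSON line suitable for an append-only audit log.
function toAuditLine({ summary, trace }, query, now = Date.now()) {
  return JSON.stringify({
    ts: new Date(now).toISOString(), // when the summary was produced
    query,                           // what the agent asked memory for
    eventIds: trace.eventIds,        // which stored events fed the summary
    promptChars: trace.prompt.length, // size of the prompt sent to the LLM
    summary,                         // the text that will enter the context
  });
}

// Usage with a hand-built result, so the sketch runs without an LLM call.
const auditLine = toAuditLine(
  {
    summary: "User prefers Postgres.",
    trace: { eventIds: ["e1", "e2"], prompt: "Summarize these events..." },
  },
  "pick a database",
  0, // fixed timestamp so the output is reproducible
);
console.log(auditLine);
```

Storing the event ids rather than the event text keeps the log itself free of derived copies, so a later delete in the store is not undermined by the audit trail. Whether to also log the full prompt is a policy choice for your deployment.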
3. MemoryDriftWatcher
Watches retrieval quality over time. If yesterday's "remember when we discussed X" stops returning anything because user intent has drifted, you get a signal instead of a silent regression.
import { MemoryDriftWatcher } from "@mukundakatta/agentmemory";
const watcher = new MemoryDriftWatcher({
windowSize: 20,
dropThreshold: 0.15, // alert when the mean score drops by 15%
});
// After every retrieval call:
watcher.record({ ts: Date.now(), scores: hits.map((h) => h.score) });
const state = watcher.state();
if (state.alert) {
console.warn("Memory drift alert:", state.reason);
}
For the heavy-duty drift math (MMD, sliced Wasserstein, KS, PSI, k-means cluster shift across five dimensions) see the sibling library ragdrift.
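To make the idea of a windowed drop alert concrete, here is a minimal sketch of one way such a check could work: compare the mean score of the most recent window against the window before it. This is an assumed algorithm for illustration, not the library's implementation; `makeDriftSketch` and the `reason` strings are hypothetical, and only the option names `windowSize` and `dropThreshold` come from the example above.

```javascript
// Sketch only: an assumed algorithm, not the MemoryDriftWatcher source.
function makeDriftSketch({ windowSize = 20, dropThreshold = 0.15 } = {}) {
  // An empty result set counts as a score of 0, so "retrieval returns
  // nothing" shows up as a drop rather than being ignored.
  const mean = (xs) =>
    xs.length === 0 ? 0 : xs.reduce((a, b) => a + b, 0) / xs.length;
  const history = []; // one mean score per retrieval, oldest first

  return {
    record({ scores }) {
      history.push(mean(scores));
      // Keep exactly two windows: a baseline and the recent one.
      if (history.length > 2 * windowSize) history.shift();
    },
    state() {
      if (history.length < 2 * windowSize) {
        return { alert: false, reason: "not enough retrievals yet" };
      }
      const baseline = mean(history.slice(0, windowSize));
      const recent = mean(history.slice(windowSize));
      const drop = baseline > 0 ? (baseline - recent) / baseline : 0;
      if (drop > dropThreshold) {
        const pct = (drop * 100).toFixed(1);
        return { alert: true, reason: `mean score fell ${pct}% versus the previous window` };
      }
      return { alert: false, reason: "stable" };
    },
  };
}

// Usage: three healthy retrievals, then three weak ones.
const sketch = makeDriftSketch({ windowSize: 3, dropThreshold: 0.15 });
for (const s of [0.8, 0.8, 0.8]) sketch.record({ scores: [s, s] });
const coldStart = sketch.state(); // still warming up, no alert
for (const s of [0.4, 0.4, 0.4]) sketch.record({ scores: [s] });
const drifted = sketch.state(); // recent window is about half the baseline
console.log(coldStart.alert, drifted.alert, drifted.reason);
```

A relative drop is used here so the threshold means the same thing whether scores are cosine similarities or keyword-overlap ratios. The real watcher may define its windows and baseline differently.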
Design rules
- No background work. Every operation runs inside a call the agent makes and awaits. No cron, no consolidation pass, no "memories are being baked" race conditions.
- Real deletes. No tombstones. No derived artifacts that survive after the source is deleted. If a user asks you to forget something, you can.
- Pull, never push. The summarizer is called explicitly from the agent's main loop. Nothing gets injected without a call.
- Show the trace. Every summary returns the event ids and the exact prompt that produced it.
- BYO LLM. No assumption about which model summarizes. Use Claude, GPT, Gemini, or a local model. The library is the same.
- Zero runtime dependencies. The whole library is under 500 lines, so it is easy to read end to end.
What this is not
- Not a Dreaming clone. Different shape on purpose.
- Not a vector database. The default in-memory store is for tests and small agents. For production, swap for a persistent backend that satisfies the same interface.
- Not a "memory framework." Three small classes you compose into your existing agent loop.
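To see why the in-memory default suits tests and small agents rather than replacing a vector database, consider this sketch of brute-force scoring: cosine similarity when embeddings are present, keyword overlap otherwise. It shows the general technique only. The function names and the exact overlap formula are assumptions, not the library's code.

```javascript
// Illustrative brute-force ranking; every query scans every event.

// Cosine similarity of two equal-length numeric vectors.
function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return na === 0 || nb === 0 ? 0 : dot / Math.sqrt(na * nb);
}

// Lowercased alphanumeric tokens as a set.
function tokens(text) {
  return new Set(text.toLowerCase().match(/[a-z0-9]+/g) ?? []);
}

// Fraction of query tokens that also appear in the event text.
function keywordOverlap(query, text) {
  const q = tokens(query);
  const t = tokens(text);
  if (q.size === 0) return 0;
  let shared = 0;
  for (const w of q) if (t.has(w)) shared++;
  return shared / q.size;
}

// Score every event, sort descending, keep the top K.
function rank(query, events, { topK = 5, queryEmbedding } = {}) {
  return events
    .map((e) => ({
      id: e.id,
      text: e.text,
      score:
        queryEmbedding && e.embedding
          ? cosine(queryEmbedding, e.embedding)
          : keywordOverlap(query, e.text),
    }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}

// Usage without an embedder, so the keyword fallback is exercised.
const ranked = rank(
  "which database should I use",
  [
    { id: "e1", text: "I prefer Postgres for the new project" },
    { id: "e2", text: "Which database did we use last time" },
  ],
  { topK: 1 },
);
console.log(ranked[0].id, ranked[0].score);
```

The usage example also shows the limit of keyword overlap: the event that actually states the preference (`e1`) shares only one token with the query and loses to `e2`. The linear scan is fine for a few thousand events; beyond that, an indexed backend is the better fit.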
Compatibility with the @mukundakatta/agent* reliability stack
agentmemory pairs cleanly with the existing zero-dep agent stack:
| Library | What it does |
|---|---|
| @mukundakatta/agentfit | Token-aware truncation. Use to fit a summary plus the new turn into your context budget. |
| @mukundakatta/agentguard | Network egress allowlist. Use to keep retrieved memories from triggering unrelated tool calls. |
| @mukundakatta/agentsnap | Tool-call trace snapshots. Snapshot the agent's behavior with and without memory. |
| @mukundakatta/agentvet | Tool arg validation before execution. |
| @mukundakatta/agentcast | Structured output enforcer. Use to make the summarizer return JSON when needed. |
End-to-end Claude demo
A small runnable demo wires EpisodicStore, OnDemandSummarizer, and the Anthropic SDK together:
npm install @anthropic-ai/sdk
ANTHROPIC_API_KEY=sk-ant-... node examples/claude-agent.js
The demo shows two sessions, retrieval across them, the summary printed before injection (so you can see exactly what's going into Claude's context), and a real delete that removes a memory with no tombstone left behind. Source: examples/claude-agent.js.
Postgres adapter (production backend)
The default EpisodicStore is in-memory. For production, swap in PostgresEpisodicStore (same interface, real deletes via DELETE):
import pg from "pg";
import { PostgresEpisodicStore } from "@mukundakatta/agentmemory/postgres";
const pool = new pg.Pool({ connectionString: process.env.DATABASE_URL });
const store = new PostgresEpisodicStore({ pool, embedder: myEmbedder });
await store.init(); // creates `agentmemory_events` table + indexes if missing
await store.append({ sessionId: "user-42", kind: "user_message", text: "hi" });
const hits = await store.retrieve("greetings", { sessionId: "user-42", topK: 5 });
await store.deleteEvent(hits[0].id); // real delete, no tombstone
Schema is documented in src/adapters/postgres.js. Works on plain Postgres. If you have pgvector, you can swap the embedding FLOAT8[] column for vector(N) and rewrite the ORDER BY in retrieve to use an indexed cosine-distance operator.
Peer dependency: npm install pg.
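For the pgvector swap mentioned above, the following sketch shows roughly what the migration and the rewritten query could look like. Only the table name `agentmemory_events` and the `embedding` column come from this README; the `id`, `session_id`, and `text` columns, the index name, the dimension, and the helper names are assumptions, so check them against src/adapters/postgres.js before use. The HNSW index type assumes pgvector 0.5.0 or later.

```javascript
// Sketch under stated assumptions; verify column names against the real schema.
const DIM = 1536; // hypothetical embedding size; must match your embedder

// One-time migration: convert the array column and add a cosine index.
const migrationSql = `
CREATE EXTENSION IF NOT EXISTS vector;
ALTER TABLE agentmemory_events
  ALTER COLUMN embedding TYPE vector(${DIM})
  USING embedding::vector(${DIM});
CREATE INDEX IF NOT EXISTS agentmemory_events_embedding_hnsw
  ON agentmemory_events USING hnsw (embedding vector_cosine_ops);
`;

// pgvector accepts a text literal such as '[0.1,0.2,0.3]' cast to vector.
function toVectorLiteral(embedding) {
  return `[${embedding.join(",")}]`;
}

// Build a parameterized query in the { text, values } form that pg accepts.
// `<=>` is pgvector's cosine-distance operator; 1 - distance is similarity.
function buildRetrieveQuery(queryEmbedding, sessionId, topK = 5) {
  return {
    text: `
SELECT id, text, 1 - (embedding <=> $1::vector) AS score
FROM agentmemory_events
WHERE session_id = $2
ORDER BY embedding <=> $1::vector
LIMIT $3`,
    values: [toVectorLiteral(queryEmbedding), sessionId, topK],
  };
}

const retrieveQuery = buildRetrieveQuery([0.1, 0.2, 0.3], "user-42", 5);
console.log(retrieveQuery.values);
// With a pg Pool, this would run as: await pool.query(retrieveQuery)
```

Ordering by the raw distance expression, rather than by the computed `score` alias, is what lets Postgres use the cosine index. Deletes stay plain `DELETE` statements, so the no-tombstone property is unchanged.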
Testing
npm test # in-memory store + summarizer + drift (23 tests)
npm run test:postgres # Postgres adapter (skipped unless DATABASE_URL set)
npm run test:all # everything
23 in-memory tests + 9 Postgres tests, all passing. Tests cover:
- EpisodicStore: append, embed, retrieve (cosine + keyword fallback), filters (session, time, kind), deleteEvent, deleteSession, deleteOlderThan, sessions
- OnDemandSummarizer: requires LLM, empty-events shortcut, prompt structure, summary trim, custom system prompt
- MemoryDriftWatcher: cold-start, stable scores, drop alert, sliding window, reset
- Integration: end-to-end flow + drift watcher catching memory-quality decay
License
MIT. See LICENSE.
Related
- Companion essay: Why I refused to build a Dreaming clone for OSS Claude
- Sibling library for full drift math: ragdrift
- The rest of the agent reliability stack: @mukundakatta on npm
