@mukundakatta/agentmemory

v0.2.0

Published

8 days ago

Honest pull-model alternative to Anthropic Dreaming. Time-bucketed episodic store + on-demand summarizer for LLM agents. Reversible deletes, no silent context injection, no background consolidation. Includes Postgres adapter and Claude demo.

mukundakatta

ai agents llm memory claude anthropic rag episodic summarization dreaming-alternative

agentmemory

Honest pull-model memory for LLM agents. The open-source alternative to background-consolidation systems like Anthropic Dreaming, with a different shape: nothing happens in the background, every retrieval shows its work, and deletes are real deletes.

Why this exists

Anthropic shipped Dreaming on May 6, 2026: a managed background consolidation pass that turns episodic conversation traces into semantic memory the next session can use. The OSS reflex is to clone it next weekend with Llama or Qwen. I sat with that and walked away. The full reasoning is in the companion essay, "Why I refused to build a Dreaming clone for OSS Claude."

The short version: the consolidator IS the model. If you run a smaller LLM to summarize, you get a different feature with the same name and lower quality. Deletion also gets harder once memories are baked.

agentmemory takes a different shape to do the same job: pull-on-demand instead of push-in-background. The latency tax is real (200 ms to 2 s on cold start). In exchange you get full reversibility, no derived artifacts, and visibility into exactly what was retrieved before it goes into the context.

Install

npm install @mukundakatta/agentmemory

Requires Node 20+. Pure ESM, zero runtime dependencies.

Three pieces

1. EpisodicStore

Append-only event log of agent interactions. Events are embedded at write time when an embedder is configured. Real deletes, no tombstones, no derived artifacts.

import { EpisodicStore } from "@mukundakatta/agentmemory";

const store = new EpisodicStore({
  embedder: async (text) => myEmbedder(text), // optional; falls back to keyword overlap
});

await store.append({
  sessionId: "user-42",
  kind: "user_message",
  text: "I prefer Postgres for the new project",
});

const hits = await store.retrieve("which database should I use", {
  sessionId: "user-42",
  topK: 5,
});

// Real delete: gone, no trace.
const eventId = hits[0].id;
store.deleteEvent(eventId);

// Retention policies are first-class:
store.deleteOlderThan(Date.now() - 30 * 24 * 60 * 60 * 1000);
store.deleteSession("user-42");
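The example above mentions two scoring modes: embedding similarity when an embedder is configured, and keyword overlap otherwise. The library's own scoring code is not shown in this README, so what follows is only a minimal sketch of what those two modes could look like. `cosineSimilarity`, `keywordOverlap`, and `scoreEvent` are hypothetical names, not part of the package API.

```javascript
// Hypothetical helpers: illustrative only, not agentmemory's internal implementation.

// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // avoid dividing by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Lowercased set of words; splits on anything that is not a letter or digit.
function tokenize(text) {
  return new Set(text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean));
}

// Fraction of distinct query words that also appear in the event text.
function keywordOverlap(query, text) {
  const queryWords = tokenize(query);
  if (queryWords.size === 0) return 0;
  const textWords = tokenize(text);
  let shared = 0;
  for (const word of queryWords) {
    if (textWords.has(word)) shared++;
  }
  return shared / queryWords.size;
}

// Use embeddings when both sides have one, otherwise fall back to keywords.
function scoreEvent(query, event) {
  if (query.embedding && event.embedding) {
    return cosineSimilarity(query.embedding, event.embedding);
  }
  return keywordOverlap(query.text, event.text);
}
```

Under this sketch, the query "which database should I use" shares only the word "I" with the stored message "I prefer Postgres for the new project", so a literal keyword fallback scores it low. That is one reason to configure an embedder for anything beyond tests.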

2. OnDemandSummarizer

The pull-model context builder. Bring your own LLM. The summary is shown in the trace, never silently injected.

import { OnDemandSummarizer } from "@mukundakatta/agentmemory";
import Anthropic from "@anthropic-ai/sdk";

const claude = new Anthropic();

const summarizer = new OnDemandSummarizer({
  llm: async (prompt) => {
    const r = await claude.messages.create({
      model: "claude-3-5-haiku-latest",
      max_tokens: 400,
      messages: [{ role: "user", content: prompt }],
    });
    return r.content[0].text;
  },
  maxTokens: 300,
});

const events = await store.retrieve("pick a database", { topK: 5 });
const { summary, trace } = await summarizer.summarize(events, "pick a database");

console.log("Summary:", summary);
console.log("Built from event ids:", trace.eventIds);
console.log("Prompt sent to LLM:", trace.prompt);

The trace lets you show the user (or log to your audit trail) exactly which events fed the summary. This is the key honesty property: nothing silent, nothing magical.
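One way to use the trace for auditing is to serialize it next to the summary, one JSON line per summarization. This is a sketch, not a library feature: `toAuditLine` is a hypothetical helper, and it assumes only the `{ summary, trace: { eventIds, prompt } }` shape shown in the example above.

```javascript
// Hypothetical helper: turns a summarizer result into one JSON line for an audit log.
// Assumes the { summary, trace: { eventIds, prompt } } shape shown in the README example.
function toAuditLine(result, query, now = new Date()) {
  const record = {
    at: now.toISOString(),
    query,
    eventIds: [...result.trace.eventIds], // copy so later mutation cannot alter the record
    prompt: result.trace.prompt,
    summary: result.summary,
  };
  return JSON.stringify(record);
}

// Example with stand-in data (no LLM call involved).
const auditLine = toAuditLine(
  {
    summary: "User prefers Postgres.",
    trace: { eventIds: ["evt-1"], prompt: "Summarize: ..." },
  },
  "pick a database",
  new Date(0)
);
console.log(auditLine);
```

Because the record stores event ids rather than event text, deleting an event from the store later does not leave a copy of its content in the ids themselves. The prompt and summary fields may still contain that content, so apply the same retention policy to the audit log.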

3. MemoryDriftWatcher

Watches retrieval quality over time. If yesterday's "remember when we discussed X" stops returning anything because user intent has drifted, you get a signal instead of a silent regression.

import { MemoryDriftWatcher } from "@mukundakatta/agentmemory";

const watcher = new MemoryDriftWatcher({
  windowSize: 20,
  dropThreshold: 0.15, // alert when the mean score drops by 15%
});

// After every retrieval call:
watcher.record({ ts: Date.now(), scores: hits.map((h) => h.score) });

const state = watcher.state();
if (state.alert) {
  console.warn("Memory drift alert:", state.reason);
}

For the heavy-duty drift math (MMD, sliced Wasserstein, KS, PSI, k-means cluster shift across five dimensions) see the sibling library ragdrift.
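To make the idea of a windowed mean-score drop concrete, here is a minimal sketch. It is not the library's `MemoryDriftWatcher`; the README does not document how the baseline is chosen, so this version assumes the baseline is the mean of the first full window and compares it to the mean of the most recent window. `SimpleDriftSketch` and `average` are hypothetical names.

```javascript
// Hypothetical sketch, not the library's MemoryDriftWatcher.
// Assumption: the baseline is the mean of the first full window of retrievals.
function average(values) {
  return values.reduce((sum, x) => sum + x, 0) / values.length;
}

class SimpleDriftSketch {
  constructor({ windowSize = 20, dropThreshold = 0.15 } = {}) {
    this.windowSize = windowSize;
    this.dropThreshold = dropThreshold;
    this.baseline = null; // set once the first window fills up
    this.recent = [];     // per-retrieval mean scores, newest last
  }

  // Record one retrieval's scores; an empty result counts as a mean of 0.
  record(scores) {
    this.recent.push(scores.length === 0 ? 0 : average(scores));
    if (this.recent.length > this.windowSize) this.recent.shift();
    if (this.baseline === null && this.recent.length === this.windowSize) {
      this.baseline = average(this.recent);
    }
  }

  state() {
    if (this.baseline === null) {
      return { alert: false, reason: "cold start: window not full yet" };
    }
    if (this.baseline === 0) {
      return { alert: false, reason: "baseline is zero: relative drop undefined" };
    }
    const drop = (this.baseline - average(this.recent)) / this.baseline;
    if (drop >= this.dropThreshold) {
      return { alert: true, reason: `mean score fell ${(drop * 100).toFixed(1)}% below baseline` };
    }
    return { alert: false, reason: "stable" };
  }
}
```

A fixed baseline is the simplest choice and makes slow decay visible; a rolling baseline would adapt to gradual change and only flag sudden drops. Which one fits depends on whether slow drift counts as a regression for your agent.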

Design rules

No background work. Everything is synchronous-from-the-caller's-perspective. No cron, no consolidation pass, no "memories are being baked" race conditions.
Real deletes. No tombstones. No derived artifacts that survive after the source is deleted. If a user asks you to forget something, you can.
Pull, never push. The summarizer is called explicitly from the agent's main loop. Nothing gets injected without a call.
Show the trace. Every summary returns the event ids and the exact prompt that produced it.
BYO LLM. No assumption about which model summarizes. Use Claude, GPT, Gemini, or a local model. The library is the same.
Zero runtime dependencies. The whole library is < 500 lines. Easy to read end-to-end.
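The "pull, never push" and "show the trace" rules can be sketched as a single explicit step in the agent loop. `buildContext` below is hypothetical glue code, not part of agentmemory; it assumes only that `store.retrieve` and `summarizer.summarize` behave as in the earlier examples, and the `onTrace` callback is an invented hook for displaying or logging the trace.

```javascript
// Hypothetical glue code: `buildContext` is not part of agentmemory.
// `store` needs an async retrieve(query, options); `summarizer` needs an async
// summarize(events, query) resolving to { summary, trace }, as in the README examples.
async function buildContext(store, summarizer, query, { sessionId, topK = 5, onTrace } = {}) {
  const events = await store.retrieve(query, { sessionId, topK });
  if (events.length === 0) {
    return { contextBlock: "", trace: null }; // nothing retrieved, nothing to inject
  }
  const { summary, trace } = await summarizer.summarize(events, query);
  if (onTrace) onTrace(trace, summary); // show or log before the summary is used
  return { contextBlock: `Relevant memory:\n${summary}`, trace };
}
```

The function only returns a string and a trace. Whether `contextBlock` is added to the next model call stays the caller's decision, which keeps memory injection an explicit, inspectable step.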

What this is not

Not a Dreaming clone. Different shape on purpose.
Not a vector database. The default in-memory store is for tests and small agents. For production, swap for a persistent backend that satisfies the same interface.
Not a "memory framework." Three small classes you compose into your existing agent loop.

Compatibility with the @mukundakatta/agent* reliability stack

agentmemory pairs cleanly with the existing zero-dep agent stack:

| Library | What it does |
|---|---|
| @mukundakatta/agentfit | Token-aware truncation. Use to fit a summary plus the new turn into your context budget. |
| @mukundakatta/agentguard | Network egress allowlist. Use to keep retrieved memories from triggering unrelated tool calls. |
| @mukundakatta/agentsnap | Tool-call trace snapshots. Snapshot the agent's behavior with and without memory. |
| @mukundakatta/agentvet | Tool arg validation before execution. |
| @mukundakatta/agentcast | Structured output enforcer. Use to make the summarizer return JSON when needed. |

End-to-end Claude demo

A small runnable demo wires EpisodicStore, OnDemandSummarizer, and the Anthropic SDK together:

npm install @anthropic-ai/sdk
ANTHROPIC_API_KEY=sk-ant-... node examples/claude-agent.js

The demo shows two sessions, retrieval across them, the summary printed before injection (so you can see exactly what's going into Claude's context), and a real delete that removes a memory with no tombstone left behind. Source: examples/claude-agent.js.

Postgres adapter (production backend)

The default EpisodicStore is in-memory. For production, swap in PostgresEpisodicStore (same interface, real deletes via DELETE):

import pg from "pg";
import { PostgresEpisodicStore } from "@mukundakatta/agentmemory/postgres";

const pool = new pg.Pool({ connectionString: process.env.DATABASE_URL });
const store = new PostgresEpisodicStore({ pool, embedder: myEmbedder });
await store.init();  // creates `agentmemory_events` table + indexes if missing

await store.append({ sessionId: "user-42", kind: "user_message", text: "hi" });
const hits = await store.retrieve("greetings", { sessionId: "user-42", topK: 5 });
await store.deleteEvent(hits[0].id);  // real delete, no tombstone

Schema is documented in src/adapters/postgres.js. Works on plain Postgres; if you have pgvector, you can swap the embedding FLOAT8[] column for vector(N) and rewrite the ORDER BY in retrieve to use indexed cosine-distance search.
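As a rough illustration of the pgvector variant, the sketch below builds a parameterized query that orders by pgvector's cosine-distance operator `<=>`. It is not the adapter's actual SQL. The table name `agentmemory_events` comes from this README, but the column names (`id`, `session_id`, `text`, `embedding`) and the `buildPgvectorQuery` helper are assumptions, so check them against src/adapters/postgres.js before use.

```javascript
// Hypothetical sketch. The table name comes from the README; the column names
// (id, session_id, text, embedding) are assumptions, so verify them in
// src/adapters/postgres.js. Requires the pgvector extension and a vector(N) column.
function buildPgvectorQuery(queryEmbedding, { sessionId, topK = 5 }) {
  // pgvector accepts a text literal such as '[0.1,0.2,0.3]' cast to vector.
  const vectorLiteral = `[${queryEmbedding.join(",")}]`;
  // `<=>` is cosine distance, so 1 - distance gives cosine similarity.
  const text = `
    SELECT id, text, 1 - (embedding <=> $1::vector) AS score
    FROM agentmemory_events
    WHERE session_id = $2
    ORDER BY embedding <=> $1::vector
    LIMIT $3`;
  return { text, values: [vectorLiteral, sessionId, topK] };
}
```

The returned `{ text, values }` object can be passed directly to `pool.query` from `pg`. For the ORDER BY to use an index rather than a sequential scan, the column would need a pgvector index built with the `vector_cosine_ops` operator class.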

Peer dependency: npm install pg.

Testing

npm test            # in-memory store + summarizer + drift (23 tests)
npm run test:postgres   # Postgres adapter (skipped unless DATABASE_URL set)
npm run test:all    # everything

23 in-memory tests + 9 Postgres tests, all passing. Tests cover:

EpisodicStore: append, embed, retrieve (cosine + keyword fallback), filters (session, time, kind), deleteEvent, deleteSession, deleteOlderThan, sessions
OnDemandSummarizer: requires LLM, empty-events shortcut, prompt structure, summary trim, custom system prompt
MemoryDriftWatcher: cold-start, stable scores, drop alert, sliding window, reset
Integration: end-to-end flow + drift watcher catching memory-quality decay

License

MIT. See LICENSE.

Companion essay: Why I refused to build a Dreaming clone for OSS Claude
Sibling library for full drift math: ragdrift
The rest of the agent reliability stack: @mukundakatta on npm