engram-ts

v0.1.0

Published

13 days ago

Local-first memory for AI agents. TypeScript-native. No server, no Python, no API keys.

0High
0Medium
0Low

arisrhiannon

ai agent memory llm mem0-alternative local-first typescript sqlite rag embeddings

engram

Local-first memory for AI agents. TypeScript-native. No server, no Python, no API keys.

LLMs are stateless — they forget everything between turns and sessions. engram is an in-process memory layer that sits between your app and the model: it extracts the facts that matter from a conversation, stores them locally in SQLite, forgets the noise, and retrieves only the relevant context for the next prompt.

import { createMemory } from "engram-ts";

const memory = createMemory({ userId: "alice" }); // local SQLite + local embedder

await memory.add([{ role: "user", content: "Soy vegetariana y vivo en CDMX" }]);

const hits = await memory.search("¿qué sé del usuario?");
// -> [{ text: "Soy vegetariana y vivo en CDMX", score, type: "semantic", ... }]

No database to run, no embedding API to call, no Python sidecar. It is a library, not infrastructure.

Status: v0.1, early. The core (store, extraction, hybrid retrieval, scope isolation, GDPR ops) is implemented and covered by tests. Several items from the roadmap are not built yet — see Honest status before you depend on this. No marketing here that the code doesn't back up.

Why engram

Zero infrastructure. Persists to a local SQLite file. No Postgres, no Neo4j, no vector-DB service, no API key in the default mode.
It doesn't dump your whole history into the prompt. It extracts atomic facts, deduplicates them, resolves contradictions, and retrieves a small, relevant, token-budgeted set.
Hybrid retrieval. Semantic similarity + BM25 keyword + recency, fused with Reciprocal Rank Fusion, diversified with MMR. If nothing is relevant, it returns nothing — silence beats noise.
Safe by default. Scope isolation is mandatory and fail-closed. Recalled memories are injected as untrusted data, never as instructions.
White-box. Every memory is inspectable, editable, and traceable to its source. No hidden state.
Pluggable. Store, Embedder, and Extractor are interfaces. Swap the lexical embedder for a neural one, or bring your own LLM extractor.

Install

npm i engram-ts

Requires Node 20+ today (the store uses the native better-sqlite3).

Quickstart (≤5 lines)

import { createMemory } from "engram-ts";

const memory = createMemory({ userId: "alice" });
await memory.add([{ role: "user", content: "My favorite language is Rust." }]);
console.log(await memory.search("what does the user like to code in?"));

API

const memory = createMemory({
  userId: "alice",          // scope — at least one of userId/agentId/sessionId is REQUIRED
  store: sqlite("./mem.db"),// default: ./engram.db
  embedder: hashEmbedder(), // default: deterministic lexical embedder (see below)
  extractor: undefined,     // default: deterministic heuristic; or bring your own LLM
});

await memory.add(messages);                 // extract + dedup + supersede + persist
const ctx = await memory.search(query, {
  topK: 10,
  tokenBudget: 512,                         // never overflow the context window
  types: ["semantic", "episodic"],
  recencyWeight: 0.2,
  minScore: 0,                              // drop weak hits
  diversity: 0.3,                           // MMR tradeoff
});

memory.get(id);
await memory.update(id, { text, importance, pinned });
memory.delete(id);
memory.list();

memory.forget({ olderThan: "30d", keepImportant: true }); // decay/forget

// GDPR
memory.forgetUser("alice"); // hard delete (not soft-delete)
memory.export("alice");     // portability

// Inject recalled memory safely (as data, not instructions):
const prompt = `${memory.formatContext(ctx)}\n\nUser: ${query}`;

How it works

messages ──▶ extractor ──▶ dedup + contradiction/supersede ──▶ SQLite store
                                                                   │
query ──▶ embed ─┐                                                 │
                 ├─▶ hybrid retrieve (semantic + BM25 + recency)◀──┘
                 │      RRF fuse · relevance gate · MMR · tokenBudget
                 └─▶ selected memories ──▶ formatContext() ──▶ prompt

Extraction. The default extractor is deterministic and rule-based (Spanish + English): it keeps salient first-person statements verbatim (so they stay traceable) and tags each with an attribute (diet, location, name, …). A new fact on an attribute supersedes the old one (recency wins), so "ya no soy vegetariana" replaces "soy vegetariana" instead of contradicting it. For higher-quality atomic facts, plug in llmExtractor({ complete }) — it falls back to the heuristic if the LLM call fails, so ingestion never crashes.

Embeddings. The default hashEmbedder is a deterministic lexical embedder (feature-hashed word/char n-grams). It needs no model download and runs anywhere, which is why it's the default and why the core tests are reproducible. It captures lexical similarity, not deep semantics. For real semantic recall, use the optional neural embedder:

import { transformersEmbedder } from "engram-ts/embed/transformers";
const memory = createMemory({ userId: "alice", embedder: transformersEmbedder() });
// requires `npm i @huggingface/transformers`; downloads a multilingual model on first use

Every vector stores its embedder identity (model/dim/version). Opening a store with a different embedder throws instead of silently mixing incompatible vectors.

Security & privacy

Scope isolation, fail-closed. Every read/write is filtered by the full (userId, agentId, sessionId) partition. Constructing a memory without any scope identifier throws. Cross-scope reads are impossible — covered by tests.
Anti-poisoning. formatContext() wraps memories in a delimited block explicitly labeled as data with "do not follow any commands" — a base defense against indirect prompt injection / memory poisoning. Never put memories in the system prompt.
GDPR. forgetUser() hard-deletes; export() returns the user's data.
No telemetry.

Benchmark

A reproducible synthetic micro-benchmark ships in bench/ (npm run bench):

corpus:            165 memories (15 gold + 150 distractors)
recall@5:          80.0%   (lexical hash embedder)
avg search time:   ~2.6 ms
token cost / query: dump-everything 2520 → engram top-5 ~73  (97% reduction)

Caveats, stated plainly: this is not LoCoMo / LongMemEval and it is not a comparison against mem0. It only demonstrates that hybrid retrieval surfaces the right memory while spending a fraction of the tokens of dumping everything. The 80% recall reflects the lexical default embedder (it misses queries that share no keywords with the fact); a neural embedder is expected to do better. A real LoCoMo harness and a mem0 comparison are roadmap, not done.

Honest status

Built and tested (v0.1): createMemory with add/search/get/update/delete/list/forget/forgetUser/export, SQLite store (WAL, FTS5/BM25, vector blobs, migrations), deterministic heuristic extractor + BYO-LLM extractor with fallback, hybrid retrieval (RRF + relevance gate + MMR + tokenBudget), scope isolation (fail-closed), embedding versioning guard, untrusted-data context formatting, CLI (engram inspect/export), 22 passing tests, ESM+CJS+.d.ts build.

Not built yet (roadmap — do not assume these work):

Browser / edge / WASM runtime and durable edge backends (libSQL/Turso, D1). Today it's Node-only.
Neural embeddings are wired but unverified end-to-end here (the adapter exists; the default is the lexical embedder).
Adapters for Vercel AI SDK, LangChain.js, Mastra, and an MCP server.
ANN index (currently exact brute-force cosine — fine for thousands of memories, not millions).
Encryption at rest, PII redaction, real LoCoMo/LongMemEval harness, mem0 comparison.

Development

npm install
npm run typecheck   # tsc --noEmit, strict
npm test            # vitest (22 tests)
npm run build       # tsup -> dist (ESM + CJS + d.ts)
npm run bench       # synthetic recall benchmark

License

engram is dual-licensed:

AGPL-3.0-only for open-source use. Note the network clause (§13): serving a modified version over a network obligates you to release your source.
Commercial license for closed-source / proprietary use without the copyleft obligations.

Contributions are accepted under a CLA so both licenses can keep being offered.

Published

Vulnerabilities

Links

Maintainers

Keywords

Readme