@dex-ai/memory

v0.4.1

Published

7 days ago

LanceDB-backed memory Extension for @dex-ai/sdk — episodic, semantic, and procedural memory in one package.

LanceDB-backed memory Extension for @dex-ai/sdk. Three memory types — episodic, semantic, procedural — in one extension, one LanceDB directory.

Install

npm install @dex-ai/memory

The package includes @xenova/transformers for the default local embedder (ONNX via Transformers.js). You can avoid loading it at runtime by passing your own embed() function.

Usage

import { Agent } from '@dex-ai/sdk';
import { openai } from '@dex-ai/openai';
import { memoryExtension } from '@dex-ai/memory';

const agent = await Agent.create({
  provider: openai({ modelId: 'gpt-4.1' }),
  extensions: [
    memoryExtension({
      path: '~/.dex/memory.lancedb',
      userId: 'alice',

      // Optional: a cheaper model for memory's summarize + extract calls.
      llm: { model: 'gpt-4o-mini' },

      // Optional: bring your own embedder (e.g. a remote endpoint).
      // If omitted, a local Transformers.js embedder (all-MiniLM-L6-v2) is used.
      // embed: async (texts) => await myRemoteEmbedder(texts),
    }),
  ],
});
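
A custom embed function takes a batch of texts and returns one vector per text; the schema stores 384-dimensional vectors, so a replacement embedder should match that size. The sketch below is a hypothetical deterministic stand-in that may be handy for tests or offline runs where downloading the ONNX model is undesirable. hashEmbedOne and hashEmbed are illustrative names, not part of the package, and the vectors carry no real semantic meaning.

```typescript
const DIM = 384; // assumed to match the FLOAT[384] vector columns

// Hypothetical helper: hash character trigrams into a fixed-size vector,
// then L2-normalize so cosine similarity behaves sensibly.
function hashEmbedOne(text: string): number[] {
  const vec: number[] = new Array<number>(DIM).fill(0);
  const s = text.toLowerCase();
  for (let i = 0; i + 3 <= s.length; i++) {
    let h = 2166136261; // FNV-1a style hash of the trigram
    for (let j = i; j < i + 3; j++) {
      h ^= s.charCodeAt(j);
      h = Math.imul(h, 16777619);
    }
    const idx = (h >>> 0) % DIM;
    vec[idx] = (vec[idx] ?? 0) + 1;
  }
  const norm = Math.sqrt(vec.reduce((acc, x) => acc + x * x, 0));
  return norm === 0 ? vec : vec.map((x) => x / norm);
}

// Same shape as the embed option: (texts: string[]) => Promise<number[][]>
const hashEmbed = async (texts: string[]): Promise<number[][]> =>
  texts.map(hashEmbedOne);
```

Passing embed: hashEmbed in the options above would avoid loading the local model at runtime, at the cost of much weaker recall quality.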

What lives where

Episodic memory — past turns, auto-summarized

On every generate() iteration, the extension fires a background task that performs two steps:

Summarize the turn's new messages into 1-3 sentences via the LLM.
Embed the summary and write both to LanceDB.

Background writes are tracked; agent.dispose() awaits them.
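
One common way to track background work so that a later dispose() can wait for it is to keep in-flight promises in a set. The following is a minimal sketch of that pattern, not the extension's actual code; PendingWrites, track, and drain are hypothetical names.

```typescript
// Hypothetical tracker for fire-and-forget writes.
class PendingWrites {
  private readonly pending = new Set<Promise<void>>();

  // Register a background task. Failures are logged so a rejected write
  // never surfaces as an unhandled rejection.
  track(task: Promise<void>): void {
    const tracked: Promise<void> = task
      .catch((err: unknown) => {
        console.warn("memory write failed:", err);
      })
      .then(() => {
        this.pending.delete(tracked);
      });
    this.pending.add(tracked);
  }

  get size(): number {
    return this.pending.size;
  }

  // Wait until every tracked task has settled, including tasks that
  // were registered while we were waiting.
  async drain(): Promise<void> {
    while (this.pending.size > 0) {
      await Promise.all(Array.from(this.pending));
    }
  }
}
```

In this sketch, a dispose() implementation would simply call drain() before closing the database connection.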

Episodic memories are scoped to the active project root so unrelated projects do not share turn history. The extension derives the scope from Dex coding-agent environment state (env.cwd/rootDirs), workspace state, agent metadata (root_cwd/rootCwd), or finally process.cwd(); when possible it resolves cwd to the containing Git root. Stored episodes include metadata.projectRoot and metadata.cwd.
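
The precedence chain and Git-root lookup described above can be sketched roughly as follows. This is an illustration under stated assumptions: ScopeSources, pickCwd, and resolveProjectRoot are hypothetical names, paths are assumed to be POSIX-style, and the Git check is injected as a callback rather than touching the filesystem.

```typescript
// Hypothetical bag of candidate working directories, in precedence order.
interface ScopeSources {
  envCwd?: string;       // from environment state (env.cwd / rootDirs)
  workspaceCwd?: string; // from workspace state
  metadataCwd?: string;  // from agent metadata (root_cwd / rootCwd)
  processCwd: string;    // process.cwd() as the last resort
}

// Pick the first source that is defined.
function pickCwd(s: ScopeSources): string {
  return s.envCwd ?? s.workspaceCwd ?? s.metadataCwd ?? s.processCwd;
}

// Walk upward from cwd until isGitRoot(dir) is true.
// If no ancestor qualifies, fall back to cwd itself.
function resolveProjectRoot(
  cwd: string,
  isGitRoot: (dir: string) => boolean,
): string {
  let dir = cwd.replace(/\/+$/, "") || "/";
  for (;;) {
    if (isGitRoot(dir)) return dir;
    const cut = dir.lastIndexOf("/");
    if (dir === "/" || cut < 0) return cwd;
    dir = cut === 0 ? "/" : dir.slice(0, cut);
  }
}
```

In a Node process, isGitRoot could be implemented by checking whether a .git entry exists in the directory.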

At recall time, the extension fetches:

The most-recent episodes for the user within the current project scope.
The most-similar episodes within the current project scope via LanceDB cosine vector search against the user's last message.
The two result sets are merged, de-duplicated by id, and sorted newest-first.
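
The merge step of recall can be sketched as below. Episode and mergeEpisodes are illustrative names, and the sketch works on in-memory arrays, whereas the extension itself queries LanceDB.

```typescript
// Hypothetical in-memory shape of a stored episode.
interface Episode {
  id: string;
  summary: string;
  createdAt: number; // epoch milliseconds
}

// Combine most-recent and most-similar episodes, keep one entry per id,
// and return them newest-first.
function mergeEpisodes(recent: Episode[], similar: Episode[]): Episode[] {
  const byId = new Map<string, Episode>();
  for (const ep of [...recent, ...similar]) {
    if (!byId.has(ep.id)) byId.set(ep.id, ep);
  }
  return Array.from(byId.values()).sort((a, b) => b.createdAt - a.createdAt);
}
```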

Semantic memory — durable facts

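Facts are (subject, predicate, object) tuples. Each fact is keyed by a deterministic id derived from (userId, subject, predicate), so writing the same subject and predicate again replaces the stored object instead of adding a duplicate. Facts are written in two ways: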

Automatic extraction at each iteration stop: the LLM extracts durable claims from the turn and upserts them.

Model-driven via tools:

memory_remember_fact({ subject, predicate, object }) — upsert by key.
memory_forget_fact({ subject, predicate }) — delete by key.
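
To illustrate the upsert-by-key behavior, here is a small in-memory sketch. The way the package actually derives its ids is not documented here; factId below simply serializes the key tuple, and FactStore is a hypothetical stand-in for the LanceDB table.

```typescript
interface Fact {
  subject: string;
  predicate: string;
  object: string;
}

// Assumed id scheme for illustration only: a stable serialization of the key.
function factId(userId: string, subject: string, predicate: string): string {
  return JSON.stringify([userId, subject, predicate]);
}

// Hypothetical in-memory store mirroring remember/forget semantics.
class FactStore {
  private readonly facts = new Map<string, Fact>();
  private readonly userId: string;

  constructor(userId: string) {
    this.userId = userId;
  }

  // Upsert: the same (subject, predicate) overwrites the previous object.
  remember(subject: string, predicate: string, object: string): void {
    this.facts.set(factId(this.userId, subject, predicate), {
      subject,
      predicate,
      object,
    });
  }

  // Delete by key; returns true if a fact was removed.
  forget(subject: string, predicate: string): boolean {
    return this.facts.delete(factId(this.userId, subject, predicate));
  }

  get(subject: string, predicate: string): Fact | undefined {
    return this.facts.get(factId(this.userId, subject, predicate));
  }

  get size(): number {
    return this.facts.size;
  }
}
```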

At recall time, semantic facts are selected with LanceDB vector similarity when a query embedding is available. Without an embedding, recall falls back to the most-recent facts.
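
The similarity-or-recency fallback can be sketched as follows. StoredFact, cosine, and recallFacts are illustrative names; the sketch ranks in memory, and it assumes that facts without a stored vector are skipped when a query embedding is present.

```typescript
interface StoredFact {
  text: string;
  updatedAt: number;  // epoch milliseconds
  vector?: number[];  // may be absent for facts stored without an embedding
}

// Plain cosine similarity; returns 0 for zero-length vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let na = 0;
  let nb = 0;
  const n = Math.min(a.length, b.length);
  for (let i = 0; i < n; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    na += x * x;
    nb += y * y;
  }
  return na === 0 || nb === 0 ? 0 : dot / Math.sqrt(na * nb);
}

// With a query embedding: rank by similarity. Without one: newest first.
function recallFacts(
  facts: StoredFact[],
  query: number[] | undefined,
  limit = 10,
): StoredFact[] {
  if (query === undefined) {
    return [...facts].sort((a, b) => b.updatedAt - a.updatedAt).slice(0, limit);
  }
  const q: number[] = query;
  return facts
    .filter((f) => f.vector !== undefined)
    .map((f) => ({ fact: f, score: cosine(f.vector ?? [], q) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, limit)
    .map((x) => x.fact);
}
```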

Procedural memory — runbooks

Long-form how-to content. Stored by unique title, tagged, with an embedding over title + body.

Tools:

memory_store_procedure({ title, body, tags? }) — upsert by title.
memory_forget_procedure({ title }) — delete by exact title.
memory_list_procedures({ query?, tag?, limit? }) — with query, returns vector-ranked results; with tag, filters; with neither, returns most-recently-updated.
memory_get_procedure({ title }) — fetch full body.

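Auto-inject: at recall time, the extension runs a vector similarity lookup against the user's last message. Procedures whose similarity is above proceduralThreshold (default 0.5) are listed in the injected memory context (see Injected prompt shape); the model can then call memory_get_procedure to fetch a full body.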
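
The threshold filter amounts to something like the sketch below. ProcedureHit and selectProcedures are hypothetical names, the similarity scores are assumed to come from the vector search, and whether a score exactly equal to the threshold is included is an assumption (this sketch excludes it).

```typescript
// Hypothetical search hit: a procedure title plus its similarity score.
interface ProcedureHit {
  id: string;
  title: string;
  similarity: number;
}

// Keep hits strictly above the threshold, best match first.
function selectProcedures(hits: ProcedureHit[], threshold = 0.5): ProcedureHit[] {
  return hits
    .filter((h) => h.similarity > threshold)
    .sort((a, b) => b.similarity - a.similarity);
}
```

Raising the threshold trades recall for precision: fewer procedures are injected, but the ones that remain are closer matches to the user's message.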

LLM configuration

memoryExtension({
  // ...
  llm: {
    provider: customProvider,   // optional; overrides the agent's provider entirely
    model: 'gpt-4o-mini',       // optional; passed via providerOptions.model per call
  },
});

Resolution rule:

If llm.provider is set, use it.
Otherwise use the agent's provider.
If llm.model is set, it's passed via providerOptions.model on every memory-internal request.
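
Expressed as code, the resolution rule looks roughly like this. Provider here is a minimal stand-in type rather than the SDK's real interface, and resolveLlm is a hypothetical helper name.

```typescript
// Stand-in for the SDK's provider type (assumption for illustration).
interface Provider {
  name: string;
}

interface LlmConfig {
  provider?: Provider;
  model?: string;
}

interface ResolvedLlm {
  provider: Provider;
  providerOptions: { model?: string };
}

// llm.provider wins over the agent's provider; llm.model, when set,
// is forwarded through providerOptions.model.
function resolveLlm(agentProvider: Provider, llm?: LlmConfig): ResolvedLlm {
  const provider = llm?.provider ?? agentProvider;
  const model = llm?.model;
  const providerOptions: { model?: string } = model === undefined ? {} : { model };
  return { provider, providerOptions };
}
```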

Extension options

interface MemoryExtensionOptions {
  path: string;                      // LanceDB directory or URI
  userId: string;                    // owner for episodic + semantic (procedural is global)

  llm?: { provider?: Provider; model?: string };
  embed?: (texts: string[]) => Promise<number[][]>; // 384-dim

  episodicRecent?: number;           // default 5
  episodicSimilar?: number;          // default 3
  semanticLimit?: number;            // default 10
  proceduralThreshold?: number;      // default 0.5
  autoWrite?: boolean;               // default true; set false to disable auto-summarize+extract
  autoWriteMinMessages?: number;     // default 2
}
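
For reference, the documented defaults can be applied with a helper along these lines. TunableOptions and withDefaults are hypothetical names covering only the optional numeric and boolean settings listed above.

```typescript
// Subset of the options that have documented defaults.
interface TunableOptions {
  episodicRecent?: number;
  episodicSimilar?: number;
  semanticLimit?: number;
  proceduralThreshold?: number;
  autoWrite?: boolean;
  autoWriteMinMessages?: number;
}

// Fill in any omitted setting with its documented default.
function withDefaults(o: TunableOptions = {}): Required<TunableOptions> {
  return {
    episodicRecent: o.episodicRecent ?? 5,
    episodicSimilar: o.episodicSimilar ?? 3,
    semanticLimit: o.semanticLimit ?? 10,
    proceduralThreshold: o.proceduralThreshold ?? 0.5,
    autoWrite: o.autoWrite ?? true,
    autoWriteMinMessages: o.autoWriteMinMessages ?? 2,
  };
}
```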

Injected prompt shape

When memory is available, the extension injects a single synthetic memory message before the latest user message. Example:

<memory>
Recent context:
- 5m ago: discussed the auth flow; chose JWT over session cookies.

Relevant facts:
- user prefers TypeScript
- project uses PostgreSQL 15

Relevant procedures (use memory_get_procedure to fetch details):
- [deploy-dex] (id: abc123, similarity: 0.71)
</memory>

This message is not persisted to actx.messages — it's a per-turn request rewrite.
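
A per-turn rewrite of this kind can be sketched as below. Message, renderMemory, and injectMemory are illustrative names, and the role used for the synthetic message ("system") is an assumption; the README does not say which role the extension uses.

```typescript
interface Message {
  role: "system" | "user" | "assistant";
  content: string;
}

// Build one titled section, or nothing if there are no lines.
function section(title: string, lines: string[]): string[] {
  return lines.length === 0
    ? []
    : [[title, ...lines.map((l) => `- ${l}`)].join("\n")];
}

// Render the <memory> block; returns undefined when there is nothing to inject.
function renderMemory(
  recent: string[],
  facts: string[],
  procedures: string[],
): string | undefined {
  const sections = [
    ...section("Recent context:", recent),
    ...section("Relevant facts:", facts),
    ...section(
      "Relevant procedures (use memory_get_procedure to fetch details):",
      procedures,
    ),
  ];
  return sections.length === 0
    ? undefined
    : `<memory>\n${sections.join("\n\n")}\n</memory>`;
}

// Return a copy of messages with the memory block placed just before the
// latest user message. The input array is left untouched.
function injectMemory(messages: Message[], memory: string | undefined): Message[] {
  if (memory === undefined) return messages;
  let idx = -1;
  for (let i = messages.length - 1; i >= 0; i--) {
    if (messages[i]?.role === "user") {
      idx = i;
      break;
    }
  }
  if (idx < 0) return messages;
  const copy = messages.slice();
  copy.splice(idx, 0, { role: "system", content: memory });
  return copy;
}
```

Because injectMemory returns a new array, the stored conversation never contains the memory block, which mirrors the behavior described above.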

Requirements + gotchas

LanceDB native package: @lancedb/lancedb ships platform-specific optional packages. Install dependencies on the deployment platform so the matching native package is present.
Transformers.js first-run download: ~25 MB ONNX model, cached in ~/.cache/huggingface. Initial embed() takes 10-30s; subsequent calls are fast.
Background writes are fire-and-forget: if the process is killed mid-turn, that turn's memory may not be persisted. On a normal shutdown, agent.dispose() awaits in-flight writes.
Auto-extraction quality depends on the configured model. Cheap models work but may produce noisier facts. Review stored facts with the memory tools occasionally and use memory_forget_fact to prune.
Procedural memory is global, not user-scoped.

Schema

LanceDB tables:

episodic   id | user_id | summary | metadata_json | created_at | vector(FLOAT[384])
semantic   id | user_id | subject | predicate | object | source | created_at | updated_at | has_vector | vector(FLOAT[384])
procedural id | title | body | tags[] | created_at | updated_at | vector(FLOAT[384])

Testing

npm test
