@vortex-os/memory-extended

v0.7.5

Published

a month ago

Add-on cluster — extended memory namespaces (sqlite, vector, sessionArchive, recall, consolidate) layered on @vortex-os/base.


dydan77

@vortex-os/memory-extended

Extended memory layer for VortEX — an opt-in add-on that lives on top of @vortex-os/base. Adds derived indexes and a conversation archive while keeping the base's markdown _memory/ layer as the single source of truth.

What it is

memory-extended ships as a single npm package with six cooperating namespaces:

| Namespace | Purpose | Status |
|---|---|---|
| sessionArchive | Append-only JSONL log of agent sessions + SQLite metadata. Four first-party host adapters (Claude Code CLI, Codex CLI, Gemini CLI, Claude Desktop). | Shipped |
| consolidate | Post-session fact-extraction proposer: read past sessionArchive events, propose memory candidates, operator confirms. | Shipped |
| sqlite | Structured store derived from markdown memories: hard-filter queries (byType / byTag / byPrivacy / updatedSince), drift detection (log + skip, non-destructive). | Shipped |
| vector | Dense retrieval over memories and conversation sessions. Default backend: in-process brute-force cosine; vectors in the shared memory.sqlite. Host-injected EmbedFn (multilingual local default). Sessions are vectorized at topic-chunk granularity via embedding-similarity segmentation. | Shipped |
| recall | Two-stage hybrid retrieval engine (SQLite loose hard-filter → cosine rerank). Returns data, not a report. Hits carry source = memory \| session-archive. /recall <query> command lives in session-rituals. | Shipped |
| mcp | MCP server for any MCP host (Claude Desktop, chatbots) over stdio: read tools (recall, list_memories, get_memory) + document tools (suggest_document → write_document / decline_document, a propose-then-write pair), plus a one-line install into the host config. Writes are gated behind an explicit second tool call. SDK is an optional dependency. Bin: vortex-mcp-recall. | Shipped |

The companion module proactive-curator (shipped inside @vortex-os/base) handles the in-session counterpart to consolidate — live "this looks worth capturing" prompts during an ongoing conversation. memory-extended/consolidate emits proposals shaped identically to proactive-curator's Proposal / LLMJudge types (that small surface is inlined under src/internal/, not imported), so a single host UX renders both surfaces.

What it is not

It does not replace @vortex-os/base's memorySystem. The markdown layer remains the source of truth; this package adds derived indexes that can be deleted and regenerated at any time.
It does not write to _memory/ directly. The consolidate namespace proposes; the operator confirms; writes go through base's memorySystem.
It does not require all of its namespaces to be present. Use what you need; drop the rest.

Installation

npm install @vortex-os/memory-extended @vortex-os/base better-sqlite3
# Optional — only needed if you ingest from Claude Desktop:
npm install classic-level

@vortex-os/base is a required peer because this package is an add-on layered onto a base instance: it expects a base-scaffolded data/ layout alongside it. The requirement is not about imports, since memory-extended's own code never imports base. The package is self-contained: its only runtime dependency is yaml, and the proactive-curator types that consolidate mirrors are inlined under src/internal/. better-sqlite3 is required for the SQLite metadata layer. classic-level is optional and needed only when the Claude Desktop adapter is registered.

The local embedder (@huggingface/transformers) is an optional dependency — it installs automatically with this package; nothing to fetch by hand. The default model is Xenova/multilingual-e5-small (384-dim, 50+ languages incl. Korean, 512-token input, ~470 MB) — a retrieval-tuned multilingual model that downloads once on first vector/recall use and is cached. It is asymmetric: a "query: " prefix is applied to search text and "passage: " to indexed text automatically (the store passes the right kind). To use a symmetric model instead (e.g. Xenova/all-MiniLM-L6-v2, ~90 MB) pass { model: "...", prefixes: null }. To skip the local model entirely, pass your own EmbedFn (e.g. an OpenAI/Voyage adapter).
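The asymmetric prefixing described above can be pictured with a small sketch. The names EmbedKind, PrefixConfig, and applyPrefix are hypothetical and are not the package's API; only the "query: " / "passage: " strings and the prefixes: null opt-out come from the text.

```typescript
// Hypothetical names for illustration; not the package's actual API.
type EmbedKind = "query" | "passage";
type PrefixConfig = { query: string; passage: string } | null;

// Defaults for an asymmetric E5-style model, as described above.
const E5_PREFIXES: PrefixConfig = { query: "query: ", passage: "passage: " };

// Prepend the prefix that matches how the text will be used.
// A null config models a symmetric model, which takes the raw text.
function applyPrefix(
  text: string,
  kind: EmbedKind,
  prefixes: PrefixConfig = E5_PREFIXES,
): string {
  if (prefixes === null) return text;
  return prefixes[kind] + text;
}

const searchInput = applyPrefix("tone feedback", "query");
const indexInput = applyPrefix("Keep replies short and direct.", "passage");
const symmetricInput = applyPrefix("tone feedback", "query", null);
```

Search text and indexed text get different prefixes, which is why the store has to tell the embedder which kind of text it is passing.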

Changing the embedder requires a vector rebuild. Vectors are model-specific; even when the dimension matches (384-dim) the embedding spaces differ, so run npx rebuild-memory-vector after switching models.
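A routine rebuild can skip text it has already embedded, while a model switch invalidates everything. The sketch below shows one way such a plan could be computed. MemoryDoc, RebuildPlan, planRebuild, the 2000-character cap, and the FNV-1a hash are all assumptions made for illustration, not the package's implementation.

```typescript
// Hypothetical shapes for illustration; not the package's actual schema.
interface MemoryDoc { id: string; text: string }
interface RebuildPlan { toEmbed: string[]; skipped: string[]; pruned: string[] }

const EMBED_CHAR_CAP = 2000; // assumed cap; the real limit is not stated here

// Small FNV-1a hash as a stand-in for whatever content hash a store uses.
function fnv1a(input: string): string {
  let h = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    h ^= input.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h.toString(16);
}

// Hash the capped text, so edits beyond the cap do not trigger a re-embed.
function embedHash(text: string): string {
  return fnv1a(text.slice(0, EMBED_CHAR_CAP));
}

function planRebuild(
  docs: MemoryDoc[],
  stored: Map<string, string>, // id -> hash recorded at last embed
  force = false,
): RebuildPlan {
  const plan: RebuildPlan = { toEmbed: [], skipped: [], pruned: [] };
  const live = new Set(docs.map((d) => d.id));
  for (const doc of docs) {
    const unchanged = stored.get(doc.id) === embedHash(doc.text);
    if (force || !unchanged) plan.toEmbed.push(doc.id);
    else plan.skipped.push(doc.id);
  }
  // Anything indexed but no longer present gets pruned.
  stored.forEach((_hash, id) => {
    if (!live.has(id)) plan.pruned.push(id);
  });
  return plan;
}

const stored = new Map<string, string>([
  ["a", embedHash("alpha")],
  ["gone", embedHash("old")],
]);
const docs: MemoryDoc[] = [{ id: "a", text: "alpha" }, { id: "b", text: "beta" }];
const plan = planRebuild(docs, stored);
const forcedPlan = planRebuild(docs, stored, true);
```

Because the hash covers only the text, it cannot notice a model change on its own. That is the case the force flag handles.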

Quick usage — `sessionArchive`

import {
  ingest,
  claudeCodeAdapter,
  codexAdapter,
  geminiAdapter,
  claudeDesktopAdapter,
} from "@vortex-os/memory-extended/sessionArchive";

await ingest({
  adapters: [claudeCodeAdapter, codexAdapter, geminiAdapter, claudeDesktopAdapter],
  cwd: process.cwd(),
  dataDir: "./data",
  since: "2026-05-01T00:00:00Z",
});

The orchestrator runs each adapter's detect() in parallel, calls list() on the ones that resolve true, reads each session end-to-end, normalizes events to VortexTranscriptEvent, and writes raw + normalized JSONL into data/_session-archive/ along with SQLite metadata for incremental re-scan.
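The flow above (probe in parallel, list, read, normalize) can be sketched with in-memory stand-ins. SketchAdapter, NormalizedEvent, and ingestSketch are hypothetical. The real adapter interface is described in docs/transcript-adapter-design.md and may differ. Treating a failed probe as "host not present" is also an assumption.

```typescript
// Hypothetical adapter shape for illustration only.
interface SketchAdapter {
  host: string;
  detect(): Promise<boolean>;
  list(): Promise<string[]>; // session ids
  read(sessionId: string): Promise<string[]>; // raw event lines
}

interface NormalizedEvent { host: string; sessionId: string; text: string }

async function ingestSketch(adapters: SketchAdapter[]): Promise<NormalizedEvent[]> {
  // Probe every host at once; a failed probe counts as "not present" (assumption).
  const present = await Promise.all(
    adapters.map((a) => a.detect().catch(() => false)),
  );
  const events: NormalizedEvent[] = [];
  for (let i = 0; i < adapters.length; i++) {
    if (!present[i]) continue;
    const adapter = adapters[i];
    for (const sessionId of await adapter.list()) {
      // Read each session end-to-end, then normalize to one common shape.
      for (const raw of await adapter.read(sessionId)) {
        events.push({ host: adapter.host, sessionId, text: raw.trim() });
      }
    }
  }
  return events;
}

// Fake adapters so the sketch runs without any host installed.
const fake = (host: string, ok: boolean): SketchAdapter => ({
  host,
  detect: async () => ok,
  list: async () => ["s1"],
  read: async () => [" hello ", "world"],
});

const pending = ingestSketch([fake("claude-code", true), fake("codex", false)]);
```

Only the adapter whose detect() resolves true contributes events. The absent host is skipped without error.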

Quick usage — `sqlite`

import { MemorySqliteStore, rebuildFromMemoryDir, driftCheck } from "@vortex-os/memory-extended/sqlite";

const store = new MemorySqliteStore("./data/_indexes/memory.sqlite");

// One-shot rebuild from data/_memory/*.md (or run `npx rebuild-memory-sqlite`).
// Mirrors the directory: upserts live files and prunes the orphan rows a
// deleted/renamed memory leaves behind (FTS + tags pruned in lockstep), in one
// transaction. → { scanned, upserted, skipped, deleted }
await rebuildFromMemoryDir(store, "./data/_memory");

// Four hard-filter helpers.
const rules = store.byType("feedback");
const taggedRules = store.byTag("onboarding");
const visiblePublic = store.byPrivacy("public");
const fresh = store.updatedSince("2026-05-01");

// Composite query (AND across fields, OR within a list).
const recent = store.query({
  type: ["feedback", "project"],
  tags: ["vortex"],
  updatedSinceMs: Date.parse("2026-05-20"),
});

// Drift check — non-destructive. Returns the drifted ids; the operator
// decides whether to re-run rebuild or treat the markdown as new truth.
const report = await driftCheck(store, "./data/_memory");
console.log(`drifted ${report.drifted.length}/${report.rowsScanned}`);

store.close();

The sqlite file lives under data/_indexes/ (gitignored). Markdown remains the source of truth — the sqlite index is derived and rebuildable. See the operator decision table in docs/memory-extended-design.md for the rationale.
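The composite query rule (AND across fields, OR within a list) can be illustrated with a plain in-memory filter. MemoryRow, Filter, and matches are hypothetical names. The field names mirror the query call above, but the store's real types may differ. Treating updatedSinceMs as an inclusive bound is an assumption.

```typescript
// Hypothetical row and filter shapes for illustration only.
interface MemoryRow { id: string; type: string; tags: string[]; updatedMs: number }
interface Filter { type?: string[]; tags?: string[]; updatedSinceMs?: number }

function matches(row: MemoryRow, f: Filter): boolean {
  // OR within a list: any one listed value is enough.
  const typeOk = !f.type || f.type.includes(row.type);
  const tagOk = !f.tags || f.tags.some((t) => row.tags.includes(t));
  // Inclusive lower bound (assumption).
  const freshOk = f.updatedSinceMs === undefined || row.updatedMs >= f.updatedSinceMs;
  // AND across fields: every supplied field must pass.
  return typeOk && tagOk && freshOk;
}

const rows: MemoryRow[] = [
  { id: "a", type: "feedback", tags: ["vortex"], updatedMs: Date.parse("2026-05-25") },
  { id: "b", type: "project", tags: ["other"], updatedMs: Date.parse("2026-05-25") },
  { id: "c", type: "feedback", tags: ["vortex"], updatedMs: Date.parse("2026-05-01") },
  { id: "d", type: "user", tags: ["vortex"], updatedMs: Date.parse("2026-05-25") },
];

const recentIds = rows
  .filter((r) =>
    matches(r, {
      type: ["feedback", "project"],
      tags: ["vortex"],
      updatedSinceMs: Date.parse("2026-05-20"),
    }),
  )
  .map((r) => r.id);
```

Row b fails on tags, c on recency, and d on type, so only a remains.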

Quick usage — `vector` + `recall`

import { MemorySqliteStore } from "@vortex-os/memory-extended/sqlite";
import { MemoryVectorStore, createLocalEmbedder } from "@vortex-os/memory-extended/vector";
import { recall, renderRecallHits } from "@vortex-os/memory-extended/recall";

const dbPath = "./data/_indexes/memory.sqlite";
const sqlite = new MemorySqliteStore(dbPath);
const vector = new MemoryVectorStore({ db: dbPath }); // default brute-force cosine backend
const embed = createLocalEmbedder();                  // or your own EmbedFn (OpenAI/Voyage)

// One-shot rebuild from the sqlite store (or run `npx rebuild-memory-vector`).
// Re-embeds only memories whose embeddable text changed (hash of the CAPPED embed input),
// keeps unchanged ones, and prunes deleted ones. → { indexed, skipped, pruned }
// Pass { force: true } to re-embed everything (the escape hatch after an
// embedder/model change — `npx rebuild-memory-vector` does this).
await vector.rebuild(sqlite, embed);

// The engine returns DATA — list it, or phrase one hit in conversation.
const result = await recall(
  { query: "tone feedback from last May", k: 5 },     // intent: type/tag/month parsed loosely
  { sqlite, vector, embed },
);
console.log(renderRecallHits(result));                // optional compact list render

vector.close();
sqlite.close();

The vector index shares the same memory.sqlite (a memory_vectors table) — one file, one rebuild path. The /recall <query> slash command wraps this engine; register it via createRitualRegistry({ recall: { embed } }) in @vortex-os/session-rituals. The engine returns structured hits rather than a report so the host can either render a list or weave a hit into conversation (operator decision 5). See docs/memory-extended-design.md for the backend/embedder rationale.
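The two-stage shape of the engine, a hard filter followed by a cosine rerank, can be sketched over in-memory data. Candidate, cosine, and twoStageRecall are hypothetical names. The exact-match type filter stands in for the looser SQLite filter, whose rules are not spelled out here.

```typescript
// Hypothetical candidate shape for illustration only.
interface Candidate { id: string; type: string; vector: number[] }
interface Hit { id: string; score: number }

function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let na = 0;
  let nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  // A zero vector has no direction; score it 0 instead of dividing by zero.
  return na === 0 || nb === 0 ? 0 : dot / Math.sqrt(na * nb);
}

function twoStageRecall(
  all: Candidate[],
  queryVec: number[],
  type: string | undefined,
  k: number,
): Hit[] {
  // Stage 1: the hard filter narrows the pool cheaply.
  const pool = type === undefined ? all : all.filter((c) => c.type === type);
  // Stage 2: brute-force cosine over the pool, best first, keep the top k.
  return pool
    .map((c) => ({ id: c.id, score: cosine(queryVec, c.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

const hits = twoStageRecall(
  [
    { id: "a", type: "feedback", vector: [1, 0] },
    { id: "b", type: "feedback", vector: [0, 1] },
    { id: "c", type: "project", vector: [1, 0] },
    { id: "d", type: "feedback", vector: [1, 1] },
  ],
  [1, 0],
  "feedback",
  2,
);
```

The function returns plain hit objects, not formatted text, in the same spirit as the engine's data-first contract. Rendering is left to the caller.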

Conversation sessions are searchable too. vector.rebuildSessions(sessionArchiveStore, embed) vectorizes each archived session and stores chunks under source: "session-archive" plus a session_chunks metadata row for hydration; npx rebuild-memory-vector --sessions does the same from the command line. recall then returns session hits alongside memory hits (pass a SessionChunkStore as sessionChunks to hydrate them; without it session hits are skipped). Filter to one corpus with /recall <q> --source session-archive.

Granularity is "turn" by default: one chunk per user+assistant exchange. A tuning run on real data found that this equals or beats topic segmentation while staying simpler ("segment" mode is available for coarser topic chunks). The bigger quality lever is the content filter: only user/assistant text is indexed; tool output (git/ls/npm dumps) and host system-reminder blocks are stripped so the index reflects what was discussed. Recall precision on short conversational turns is bounded by the embedder. The multilingual-e5-small default reliably ranks the on-topic hit first, fixing a discrimination failure of the earlier MiniLM default while staying in the same weight class. For sharper results still, swap in a stronger EmbedFn (e.g. bge-m3, which is heavier and best hosted on a GPU and called remotely).
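Turn-level chunking combined with the content filter could look roughly like the sketch below. TranscriptEvent and toTurnChunks are hypothetical, and the <system-reminder> tag pattern is an assumed marker, since hosts may tag those blocks differently.

```typescript
// Hypothetical event shape for illustration only.
interface TranscriptEvent {
  role: "user" | "assistant" | "tool" | "system";
  text: string;
}

// Assumed marker for host reminder blocks.
const REMINDER_BLOCK = /<system-reminder>[\s\S]*?<\/system-reminder>/g;

function toTurnChunks(events: TranscriptEvent[]): string[] {
  const chunks: string[] = [];
  let current: string[] = [];
  for (const ev of events) {
    // Content filter: only what the user and assistant actually said.
    if (ev.role !== "user" && ev.role !== "assistant") continue;
    const text = ev.text.replace(REMINDER_BLOCK, "").trim();
    if (text === "") continue;
    // A new user message closes the previous turn and opens the next.
    if (ev.role === "user" && current.length > 0) {
      chunks.push(current.join("\n"));
      current = [];
    }
    current.push(`${ev.role}: ${text}`);
  }
  if (current.length > 0) chunks.push(current.join("\n"));
  return chunks;
}

const chunks = toTurnChunks([
  { role: "user", text: "How do I rebuild? <system-reminder>be brief</system-reminder>" },
  { role: "tool", text: "$ ls\nREADME.md" },
  { role: "assistant", text: "Run the rebuild command." },
  { role: "user", text: "Thanks" },
  { role: "assistant", text: "Welcome." },
]);
```

The tool dump and the reminder block never reach a chunk, so each chunk carries only the exchange itself.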

Quick usage — `mcp` (memory tools over stdio)

The mcp namespace exposes six tools so any MCP-capable host (Claude Desktop, chatbots, other agents) can work with the user's memory:

Read tools — recall (semantic search), list_memories (filtered summaries), get_memory (one entry in full).

Document tools — a deliberate propose-then-write pair so the only write surface is gated behind an explicit second call (honoring "auto the plumbing, propose the prose" — operational records are auto-kept; formal documents are proposed, then written only on acceptance):

suggest_document(topic, name, content, …): returns a preview + target path + fingerprint, writes nothing. Flags if the target exists or was declined.
write_document(…): the explicit second step; creates data/<topic>/<name>.md with frontmatter (never overwrites) and records the acceptance.
decline_document(topic, name): suppresses the same suggestion for ~30 days.
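The propose-then-write split can be sketched with an in-memory stand-in for the filesystem. DocumentGateSketch and its method names are hypothetical. The fingerprint here is just a lowercased topic/name key, which is an assumption about what the real fingerprint covers.

```typescript
// Hypothetical shapes for illustration; not the MCP tools' real signatures.
interface Suggestion {
  path: string;
  fingerprint: string;
  preview: string;
  exists: boolean;
}

class DocumentGateSketch {
  private files = new Map<string, string>(); // stand-in for data/<topic>/<name>.md
  private accepted: string[] = [];

  // Step 1: describe what would be written. Nothing is stored.
  suggest(topic: string, name: string, content: string): Suggestion {
    const path = `data/${topic}/${name}.md`;
    return {
      path,
      fingerprint: `${topic}/${name}`.toLowerCase(), // assumed fingerprint
      preview: content.slice(0, 80),
      exists: this.files.has(path),
    };
  }

  // Step 2: the only write surface. Refuses to overwrite an existing file.
  write(topic: string, name: string, content: string): string {
    const s = this.suggest(topic, name, content);
    if (s.exists) throw new Error(`refusing to overwrite ${s.path}`);
    this.files.set(s.path, content);
    this.accepted.push(s.fingerprint); // record the acceptance
    return s.path;
  }

  fileCount(): number {
    return this.files.size;
  }
}

const gate = new DocumentGateSketch();
const proposal = gate.suggest("guides", "Onboarding", "# Onboarding\nStart here.");
const countAfterSuggest = gate.fileCount();
const writtenPath = gate.write("guides", "Onboarding", "# Onboarding\nStart here.");
```

Because suggest has no side effects, a host can call it freely. Only the second, explicit call can change anything on disk.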

Because these are memory features, they ship inside this package rather than as a separate install. The MCP SDK (@modelcontextprotocol/sdk) is an optional dependency, loaded at runtime only when the server runs — consumers who never run it pay nothing for it.

Run the bundled stdio server (usually launched by the host, not by hand):

npx vortex-mcp-recall --data-dir ./data

It self-resolves the data dir from --data-dir, then VORTEX_DATA_DIR, then the cwd, and reads the recall index at <dataDir>/_indexes/memory.sqlite. The embedder loads once at startup and is reused for every query.
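The resolution order above is simple enough to write down as a pure function. resolveDataDir and recallIndexPath are hypothetical helpers, not exports of the package. Inputs are passed in explicitly so the sketch has no dependency on the process environment, and the path is joined with a plain "/" for brevity.

```typescript
// Hypothetical helper: flag first, then environment variable, then cwd.
function resolveDataDir(
  argv: string[],
  env: Record<string, string | undefined>,
  cwd: string,
): string {
  const flag = argv.indexOf("--data-dir");
  const flagValue = flag === -1 ? undefined : argv[flag + 1];
  if (flagValue !== undefined) return flagValue;
  const fromEnv = env["VORTEX_DATA_DIR"];
  if (fromEnv) return fromEnv;
  return cwd;
}

// The recall index lives at a fixed location under the data dir.
function recallIndexPath(dataDir: string): string {
  return `${dataDir}/_indexes/memory.sqlite`;
}

const fromFlag = resolveDataDir(["--data-dir", "./data"], { VORTEX_DATA_DIR: "/env" }, "/work");
const fromEnvVar = resolveDataDir([], { VORTEX_DATA_DIR: "/env" }, "/work");
const fromCwd = resolveDataDir([], {}, "/work");
```

In a real entry point the arguments would typically be process.argv.slice(2), process.env, and process.cwd().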

Claude Desktop (claude_desktop_config.json, one-time setup):

{
  "mcpServers": {
    "vortex-recall": {
      "command": "npx",
      "args": ["vortex-mcp-recall", "--data-dir", "/absolute/path/to/instance/data"]
    }
  }
}

Programmatic use (attach your own transport):

import { vector, mcp } from "@vortex-os/memory-extended";

const server = await mcp.createRecallServer({
  embed: vector.createLocalEmbedder(),
  dbPath: "./data/_indexes/memory.sqlite",
});
// await server.connect(transport)

Quick usage — `consolidate`

import { SessionArchiveStore } from "@vortex-os/memory-extended/sessionArchive";
import { Consolidator } from "@vortex-os/memory-extended/consolidate";
import { proactiveCurator } from "@vortex-os/base";

const store = new SessionArchiveStore("./data");
const consolidator = new Consolidator(store, process.cwd(), {
  defaultLookbackDays: 7,
});

// Host-supplied LLM adapter; see @vortex-os/base "Opt-in /curate surface"
// for the Claude Code wiring example. invokeMySubAgent is a placeholder for
// your host's own LLM call.
const llm = new proactiveCurator.ClaudeCodeLLMJudge(async ({ prompt }) => {
  return await invokeMySubAgent(prompt);
});

const proposals = await consolidator.propose({}, { llm, now: new Date() });
// Each proposal carries onAccept / onDecline thunks. Surface them to the
// operator; on accept the memory file is written under data/_memory/<name>.md
// via a single create-file action.

The consolidator queries recently-ingested sessions from sessionArchive, asks the LLM for memory candidates with explicit source-event citations, drops candidates whose fingerprint is on the rejected list (30-day expiry), and returns proposals shaped exactly like proactive-curator outputs — so a single /curate-style host UX renders both in-session capture and post-session consolidation.
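The rejected-list step can be sketched as a filter with a time-to-live. CandidateMemory, dropRejected, and the Map of rejection timestamps are hypothetical. Only the 30-day expiry comes from the description above.

```typescript
// Hypothetical candidate shape for illustration only.
interface CandidateMemory { fingerprint: string; summary: string }

const DAY_MS = 24 * 60 * 60 * 1000;
const REJECTION_TTL_MS = 30 * DAY_MS;

// Keep a candidate unless the operator rejected its fingerprint recently.
function dropRejected(
  candidates: CandidateMemory[],
  rejectedAt: Map<string, number>, // fingerprint -> time of rejection (ms)
  nowMs: number,
): CandidateMemory[] {
  return candidates.filter((c) => {
    const at = rejectedAt.get(c.fingerprint);
    // Never rejected, or the rejection has expired: propose it (again).
    return at === undefined || nowMs - at >= REJECTION_TTL_MS;
  });
}

const nowMs = 100 * DAY_MS;
const rejectedAt = new Map<string, number>([
  ["old", nowMs - 40 * DAY_MS],   // expired rejection
  ["recent", nowMs - 5 * DAY_MS], // still suppressed
]);

const kept = dropRejected(
  [
    { fingerprint: "old", summary: "Prefers short replies" },
    { fingerprint: "recent", summary: "Uses tabs" },
    { fingerprint: "fresh", summary: "Works in UTC" },
  ],
  rejectedAt,
  nowMs,
);
```

Passing the current time in as a parameter keeps the filter deterministic, in the same way propose accepts now.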

Privacy & network behavior

Embedding model download. The vector / recall namespaces use the e5-small sentence-embedding model via @huggingface/transformers. On first use the model weights are fetched (HTTP GET, read-only) from huggingface.co and cached locally. To run fully offline: set HF_HUB_OFFLINE=1, pre-seed the Hugging Face cache directory, or pass your own EmbedFn to skip the bundled model entirely. No data is uploaded — the download is one-directional.
Consolidation and the host LLM. The consolidate namespace sends excerpts of your archived session transcripts to whatever LLM adapter the host wires in (the LLMJudge you supply). Those excerpts leave the process only through that adapter; choose one whose data-handling you trust.
Local storage permissions. Archived raw transcripts, the normalized JSONL, and the SQLite indexes are written under data/_session-archive/ with owner-only permissions (0o700 dirs / 0o600 files) on POSIX hosts. This is a no-op on Windows.
Verbatim transcript content + git tracking. The normalized JSONL under data/_session-archive/normalized/ stores verbatim conversation content — every user and assistant message as it was written. If you enable git tracking for that path (the normalized layer is the cross-machine source of truth, so this is a reasonable choice), any secret that appeared in a transcript — an API token pasted into a prompt, a password in a tool result — is committed and synced along with it. To scrub such patterns before the normalized files are written, supply an ingest-time redaction hook via AdapterOptions.redact (passed as options.redact to ingest): it runs on every event after normalization and before storage, so a regex that masks token-shaped strings keeps them out of the archive entirely.
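A redaction hook of the kind described in the last item might look like the sketch below. The exact signature expected by AdapterOptions.redact is not shown in this README, so redactEvent and its { text } event shape are assumptions. The three patterns are illustrative examples of token-shaped strings, not a complete list.

```typescript
// Illustrative token shapes; extend for the secrets your team actually uses.
const TOKEN_PATTERNS: RegExp[] = [
  /\bsk-[A-Za-z0-9_-]{20,}\b/g, // "sk-" style API keys
  /\bghp_[A-Za-z0-9]{36}\b/g,   // GitHub personal access tokens
  /\bAKIA[0-9A-Z]{16}\b/g,      // AWS access key ids
];

// Mask every match of every pattern.
function redactText(text: string): string {
  return TOKEN_PATTERNS.reduce((out, re) => out.replace(re, "[REDACTED]"), text);
}

// Assumed hook shape: takes a normalized event, returns a scrubbed copy.
function redactEvent<T extends { text: string }>(event: T): T {
  return { ...event, text: redactText(event.text) };
}

const scrubbed = redactEvent({
  role: "user",
  text: "use sk-abcdefghijklmnopqrstuvwx for the demo",
});
```

The hook returns a copy instead of mutating its input, so the raw event is left untouched for any layer that still needs it.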

Design documents

docs/memory-extended-design.md — cluster overview, sub-phase breakdown, industry pattern alignment
docs/transcript-adapter-design.md — sessionArchive adapter interface, per-host schema mapping, on-disk storage, SQLite schema, lock-handling UX
docs/proactive-curator-design.md — in-session counterpart (separate module)
docs/architecture.md — multi-layer memory architecture, source-of-truth policy
