@vortex-os/memory-extended
v0.5.3
Add-on cluster — extended memory namespaces (sqlite, vector, sessionArchive, recall, consolidate) layered on @vortex-os/base.
@vortex-os/memory-extended
Extended memory layer for VortEX — an opt-in add-on that lives on top of @vortex-os/base. Adds derived indexes and a conversation archive while keeping the base's markdown _memory/ layer as the single source of truth.
What it is
memory-extended ships as a single npm package with six cooperating namespaces:
| Namespace | Purpose | Status |
|---|---|---|
| sessionArchive | Append-only JSONL log of agent sessions + SQLite metadata. Four first-party host adapters (Claude Code CLI, Codex CLI, Gemini CLI, Claude Desktop). | Shipped |
| consolidate | Post-session fact-extraction proposer — read past sessionArchive events, propose memory candidates, operator confirms. | Shipped |
| sqlite | Structured store derived from markdown memories — hard-filter queries (byType / byTag / byPrivacy / updatedSince), drift detection (log + skip, non-destructive). | Shipped |
| vector | Dense retrieval over memories and conversation sessions. Default backend: in-process brute-force cosine; vectors in the shared memory.sqlite. Host-injected EmbedFn (multilingual local default). Sessions are vectorized per turn by default; coarser topic chunks via embedding-similarity segmentation are available in "segment" mode. | Shipped |
| recall | Two-stage hybrid retrieval engine (SQLite loose hard-filter → cosine rerank). Returns data, not a report. Hits carry source = memory or session-archive. /recall <query> command lives in session-rituals. | Shipped |
| mcp | MCP server for any MCP host (Claude Desktop, chatbots) over stdio: read tools (recall, list_memories, get_memory) + document tools (suggest_document → write_document / decline_document, a propose-then-write pair), plus a one-line install into the host config. Writes are gated behind an explicit second tool call. SDK is an optional dependency. Bin: vortex-mcp-recall. | Shipped |
The companion module proactive-curator (shipped inside @vortex-os/base) handles the in-session counterpart to consolidate — live "this looks worth capturing" prompts during an ongoing conversation. memory-extended/consolidate emits proposals shaped identically to proactive-curator's Proposal / LLMJudge types (that small surface is inlined under src/internal/, not imported), so a single host UX renders both surfaces.
What it is not
- It does not replace @vortex-os/base's memorySystem. The markdown layer remains the source of truth; this package adds derived indexes that can be deleted and regenerated at any time.
- It does not write to _memory/ directly. The consolidate namespace proposes; the operator confirms; writes go through base's memorySystem.
- It does not require all of its namespaces to be present. Use what you need; drop the rest.
Installation
npm install @vortex-os/memory-extended @vortex-os/base better-sqlite3
# Optional — only needed if you ingest from Claude Desktop:
npm install classic-level
@vortex-os/base is a required peer because this package is an add-on layered onto a base instance and expects a base-scaffolded data/ layout alongside it, not because its code imports base. (memory-extended is self-contained: its only runtime dependency is yaml; the proactive-curator types that consolidate mirrors are inlined under src/internal/.) better-sqlite3 is required for the SQLite metadata layer. classic-level is optional and needed only when the Claude Desktop adapter is registered.
The local embedder (@huggingface/transformers) is an optional dependency — it installs automatically with this package; nothing to fetch by hand. The default model is Xenova/multilingual-e5-small (384-dim, 50+ languages incl. Korean, 512-token input, ~470 MB) — a retrieval-tuned multilingual model that downloads once on first vector/recall use and is cached. It is asymmetric: a "query: " prefix is applied to search text and "passage: " to indexed text automatically (the store passes the right kind). To use a symmetric model instead (e.g. Xenova/all-MiniLM-L6-v2, ~90 MB) pass { model: "...", prefixes: null }. To skip the local model entirely, pass your own EmbedFn (e.g. an OpenAI/Voyage adapter).
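As a rough illustration of the asymmetric prefixing described above, the sketch below shows how a prefix table could be applied before text reaches the model. The names TextKind, Prefixes, and withPrefix are hypothetical and chosen for this example; the package's actual EmbedFn signature and option handling may differ.

```typescript
// Hypothetical names for illustration; not the package's actual API.
type TextKind = "query" | "passage";
type Prefixes = Record<TextKind, string> | null;

// Prefixes for an asymmetric e5-style model, as described in the text above.
const E5_PREFIXES: Prefixes = { query: "query: ", passage: "passage: " };

// Returns the string handed to the model: prefixed for asymmetric models,
// unchanged when prefixes is null (symmetric models).
function withPrefix(text: string, kind: TextKind, prefixes: Prefixes): string {
  return prefixes === null ? text : prefixes[kind] + text;
}

const searchText = withPrefix("tone feedback", "query", E5_PREFIXES);
const indexedText = withPrefix("Keep replies concise.", "passage", E5_PREFIXES);
const symmetricText = withPrefix("tone feedback", "query", null);
```

Passing null mirrors the prefixes: null option mentioned above: a symmetric model sees the same raw text for both search and indexing.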
Changing the embedder requires a vector rebuild. Vectors are model-specific; even when the dimension matches (384-dim) the embedding spaces differ, so run npx rebuild-memory-vector after switching models.
Quick usage — sessionArchive
import {
  ingest,
  claudeCodeAdapter,
  codexAdapter,
  geminiAdapter,
  claudeDesktopAdapter,
} from "@vortex-os/memory-extended/sessionArchive";
await ingest({
  adapters: [claudeCodeAdapter, codexAdapter, geminiAdapter, claudeDesktopAdapter],
  cwd: process.cwd(),
  dataDir: "./data",
  since: "2026-05-01T00:00:00Z",
});
The orchestrator runs each adapter's detect() in parallel, calls list() on the ones that resolve true, reads each session end-to-end, normalizes events to VortexTranscriptEvent, and writes raw + normalized JSONL into data/_session-archive/ along with SQLite metadata for incremental re-scan.
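The detect-then-list step of that flow can be sketched as follows. This is a minimal illustration under assumptions: SketchAdapter and listDetectedSessions are hypothetical names, and the real adapter interface (see docs/transcript-adapter-design.md) likely carries more methods and options.

```typescript
// Hypothetical adapter shape, for illustration only.
interface SketchAdapter {
  name: string;
  detect(): Promise<boolean>;
  list(): Promise<string[]>; // session ids found on this host
}

// Run every detect() concurrently, then call list() only on the adapters
// whose detect() resolved true.
async function listDetectedSessions(
  adapters: SketchAdapter[],
): Promise<Map<string, string[]>> {
  const detected = await Promise.all(adapters.map((a) => a.detect()));
  const active = adapters.filter((_, i) => detected[i]);
  const sessions = new Map<string, string[]>();
  for (const adapter of active) {
    sessions.set(adapter.name, await adapter.list());
  }
  return sessions;
}
```

Adapters whose host is not installed never have list() called, so registering all four first-party adapters is cheap in this model.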
Quick usage — sqlite
import { MemorySqliteStore, rebuildFromMemoryDir, driftCheck } from "@vortex-os/memory-extended/sqlite";
const store = new MemorySqliteStore("./data/_indexes/memory.sqlite");
// One-shot rebuild from data/_memory/*.md (or run `npx rebuild-memory-sqlite`).
await rebuildFromMemoryDir(store, "./data/_memory");
// Four hard-filter helpers.
const rules = store.byType("feedback");
const taggedRules = store.byTag("onboarding");
const visiblePublic = store.byPrivacy("public");
const fresh = store.updatedSince("2026-05-01");
// Composite query (AND across fields, OR within a list).
const recent = store.query({
  type: ["feedback", "project"],
  tags: ["vortex"],
  updatedSinceMs: Date.parse("2026-05-20"),
});
// Drift check — non-destructive. Returns the drifted ids; the operator
// decides whether to re-run rebuild or treat the markdown as new truth.
const report = await driftCheck(store, "./data/_memory");
console.log(`drifted ${report.drifted.length}/${report.rowsScanned}`);
store.close();
The sqlite file lives under data/_indexes/ (gitignored). Markdown remains the source of truth; the sqlite index is derived and rebuildable. See the operator decision table in docs/memory-extended-design.md for the rationale.
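The composite query's "AND across fields, OR within a list" rule can be illustrated with a small in-memory predicate. The MemoryRow and Filter shapes below are hypothetical; the field names mirror the query example above, but the store's real types and SQL are not shown in this README.

```typescript
// Hypothetical row and filter shapes for illustration.
interface MemoryRow { id: string; type: string; tags: string[]; updatedMs: number; }
interface Filter { type?: string[]; tags?: string[]; updatedSinceMs?: number; }

// A row must pass every field that is set (AND); a list field passes when
// the row matches any one of the listed values (OR).
function matches(row: MemoryRow, f: Filter): boolean {
  if (f.type && !f.type.includes(row.type)) return false;
  if (f.tags && !f.tags.some((t) => row.tags.includes(t))) return false;
  if (f.updatedSinceMs !== undefined && row.updatedMs < f.updatedSinceMs) return false;
  return true;
}

const rows: MemoryRow[] = [
  { id: "a", type: "feedback", tags: ["vortex"], updatedMs: Date.parse("2026-05-21") },
  { id: "b", type: "project", tags: ["onboarding"], updatedMs: Date.parse("2026-05-25") },
  { id: "c", type: "feedback", tags: ["vortex", "tone"], updatedMs: Date.parse("2026-05-01") },
  { id: "d", type: "user", tags: ["vortex"], updatedMs: Date.parse("2026-05-22") },
  { id: "e", type: "project", tags: ["vortex"], updatedMs: Date.parse("2026-05-23") },
];
const filter: Filter = {
  type: ["feedback", "project"],
  tags: ["vortex"],
  updatedSinceMs: Date.parse("2026-05-20"),
};
const hitIds = rows.filter((r) => matches(r, filter)).map((r) => r.id);
```

Here row b fails on tags, c on the date, and d on type, leaving a and e.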
Quick usage — vector + recall
import { MemorySqliteStore } from "@vortex-os/memory-extended/sqlite";
import { MemoryVectorStore, createLocalEmbedder } from "@vortex-os/memory-extended/vector";
import { recall, renderRecallHits } from "@vortex-os/memory-extended/recall";
const dbPath = "./data/_indexes/memory.sqlite";
const sqlite = new MemorySqliteStore(dbPath);
const vector = new MemoryVectorStore({ db: dbPath }); // default brute-force cosine backend
const embed = createLocalEmbedder(); // or your own EmbedFn (OpenAI/Voyage)
// One-shot rebuild from the sqlite store (or run `npx rebuild-memory-vector`).
await vector.rebuild(sqlite, embed); // → { indexed, skipped, pruned }
// The engine returns DATA — list it, or phrase one hit in conversation.
const result = await recall(
  { query: "tone feedback from last May", k: 5 }, // intent: type/tag/month parsed loosely
  { sqlite, vector, embed },
);
console.log(renderRecallHits(result)); // optional compact list render
vector.close();
sqlite.close();
The vector index shares the same memory.sqlite (a memory_vectors table): one file, one rebuild path. The /recall <query> slash command wraps this engine; register it via createRitualRegistry({ recall: { embed } }) in @vortex-os/session-rituals. The engine returns structured hits rather than a report so the host can either render a list or weave a hit into conversation (operator decision 5). See docs/memory-extended-design.md for the backend/embedder rationale.
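To make the two-stage idea concrete, here is a minimal sketch of a hard filter followed by a brute-force cosine rerank. Candidate, cosine, and recallSketch are hypothetical names; the real engine filters in SQLite, filters loosely, and parses intent from the query, none of which is reproduced here.

```typescript
// Hypothetical shape for illustration; not the package's actual types.
interface Candidate { id: string; type: string; vector: number[]; }

// Cosine similarity; assumes equal-length vectors, returns 0 for zero norms.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  });
  const denom = Math.sqrt(normA * normB);
  return denom === 0 ? 0 : dot / denom;
}

function recallSketch(query: number[], all: Candidate[], type: string | undefined, k: number) {
  // Stage 1: hard filter narrows the candidate pool.
  const pool = type === undefined ? all : all.filter((c) => c.type === type);
  // Stage 2: score every survivor against the query and keep the top k.
  return pool
    .map((c) => ({ id: c.id, score: cosine(query, c.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

const corpus: Candidate[] = [
  { id: "f1", type: "feedback", vector: [0.9, 0.1] },
  { id: "f2", type: "feedback", vector: [0.1, 0.9] },
  { id: "p1", type: "project", vector: [1, 0] },
];
const topIds = recallSketch([1, 0], corpus, "feedback", 2).map((h) => h.id);
```

Note that p1 is the closest vector to the query but never reaches the rerank stage, because the filter removed it first.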
Conversation sessions are searchable too. vector.rebuildSessions(sessionArchiveStore, embed) vectorizes each archived session and stores chunks under source: "session-archive" plus a session_chunks metadata row for hydration. Run npx rebuild-memory-vector --sessions. recall then returns session hits alongside memory hits (pass a SessionChunkStore as sessionChunks to hydrate them; without it session hits are skipped). Filter to one corpus with /recall <q> --source session-archive.
Granularity is "turn" by default: one chunk per user+assistant exchange. A real-data tuning run found that this equals or beats topic segmentation while staying simple ("segment" mode is available for coarser topic chunks). The bigger quality lever is the content filter: only user/assistant text is indexed; tool output (git/ls/npm dumps) and host system-reminder blocks are stripped so the index reflects what was discussed. Recall precision on short conversational turns is bounded by the embedder. The multilingual-e5-small default reliably ranks the on-topic hit first (it fixed a discrimination failure the earlier MiniLM default had) at the same weight class; for sharper results still, swap in a stronger EmbedFn (e.g. bge-m3, which is heavier and best hosted on a GPU and called remotely).
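A minimal sketch of turn-level chunking with a content filter follows. The SketchEvent shape and chunkByTurn are hypothetical, and the sketch assumes tool output and system reminders arrive as separate events with their own role, which may not match how the real normalizer represents them.

```typescript
// Hypothetical event shape for illustration.
interface SketchEvent { role: "user" | "assistant" | "tool" | "system"; text: string; }

// Keep only user/assistant text, and start a new chunk at each user message
// so that every chunk holds one user+assistant exchange.
function chunkByTurn(events: SketchEvent[]): string[] {
  const chunks: string[][] = [];
  let current: string[] | null = null;
  for (const e of events) {
    if (e.role !== "user" && e.role !== "assistant") continue; // content filter
    if (e.role === "user" || current === null) {
      current = [];
      chunks.push(current);
    }
    current.push(`${e.role}: ${e.text}`);
  }
  return chunks.map((lines) => lines.join("\n"));
}

const turnChunks = chunkByTurn([
  { role: "user", text: "How do I rebuild the index?" },
  { role: "tool", text: "npm WARN deprecated ..." },
  { role: "assistant", text: "Run the rebuild command." },
  { role: "system", text: "reminder block" },
  { role: "user", text: "Thanks" },
  { role: "assistant", text: "You are welcome." },
]);
```

The tool and system events never reach a chunk, so the two resulting chunks contain only what the user and assistant said.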
Quick usage — mcp (memory tools over stdio)
The mcp namespace exposes six tools so any MCP-capable host (Claude Desktop,
chatbots, other agents) can work with the user's memory:
Read tools — recall (semantic search), list_memories (filtered
summaries), get_memory (one entry in full).
Document tools — a deliberate propose-then-write pair so the only write surface is gated behind an explicit second call (honoring "auto the plumbing, propose the prose" — operational records are auto-kept; formal documents are proposed, then written only on acceptance):
- suggest_document(topic, name, content, …): returns a preview, the target path, and a fingerprint; writes nothing. Flags if the target exists or was declined.
- write_document(…): the explicit second step; creates data/<topic>/<name>.md with frontmatter (never overwrites) and records the acceptance.
- decline_document(topic, name): suppresses the same suggestion for ~30 days.
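The propose-then-write gating can be pictured with an in-memory stand-in for the filesystem. Everything in this sketch (the files map, suggest, write) is hypothetical and simplified: fingerprints, frontmatter, and the decline window are left out.

```typescript
// In-memory stand-in for the filesystem; all names here are hypothetical.
const files = new Map<string, string>();

const targetPath = (topic: string, name: string): string => `data/${topic}/${name}.md`;

// Step 1: preview only. Nothing is written.
function suggest(topic: string, name: string, content: string) {
  const path = targetPath(topic, name);
  return { path, preview: content.slice(0, 80), exists: files.has(path) };
}

// Step 2: the only write path, and it never overwrites an existing file.
function write(topic: string, name: string, content: string): boolean {
  const path = targetPath(topic, name);
  if (files.has(path)) return false;
  files.set(path, content);
  return true;
}

const proposal = suggest("notes", "kickoff", "# Kickoff\nAgenda and owners.");
const filesAfterSuggest = files.size;
const firstWrite = write("notes", "kickoff", "# Kickoff\nAgenda and owners.");
const secondWrite = write("notes", "kickoff", "# Overwrite attempt");
```

Because suggest has no side effects, a host can call it freely; only the explicit second call changes anything on disk.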
Because these are memory features, they ship inside this package rather than
as a separate install. The MCP SDK (@modelcontextprotocol/sdk) is an
optional dependency, loaded at runtime only when the server runs — consumers
who never run it pay nothing for it.
Run the bundled stdio server (usually launched by the host, not by hand):
npx vortex-mcp-recall --data-dir ./data
It self-resolves the data dir from --data-dir, then VORTEX_DATA_DIR, then the
cwd, and reads the recall index at <dataDir>/_indexes/memory.sqlite. The
embedder loads once at startup and is reused for every query.
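The resolution order can be sketched as a pure function. resolveDataDir and indexPath are hypothetical helpers written for this example; the sketch also assumes that "the cwd" means the working directory itself is used as the data dir, which the text does not spell out.

```typescript
// Resolve the data dir in the documented order: the --data-dir flag, then the
// VORTEX_DATA_DIR environment variable, then the current working directory.
function resolveDataDir(
  argv: string[],
  env: Record<string, string | undefined>,
  cwd: string,
): string {
  const i = argv.indexOf("--data-dir");
  const flag = i >= 0 ? argv[i + 1] : undefined;
  return flag ?? env["VORTEX_DATA_DIR"] ?? cwd;
}

// Location of the recall index relative to the resolved data dir.
const indexPath = (dataDir: string): string => `${dataDir}/_indexes/memory.sqlite`;

const fromFlag = resolveDataDir(["--data-dir", "/a"], { VORTEX_DATA_DIR: "/b" }, "/c");
const fromEnv = resolveDataDir([], { VORTEX_DATA_DIR: "/b" }, "/c");
const fromCwd = resolveDataDir([], {}, "/c");
```

Taking argv, env, and cwd as parameters keeps the function testable without touching the real process state.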
Claude Desktop (claude_desktop_config.json, one-time setup):
{
  "mcpServers": {
    "vortex-recall": {
      "command": "npx",
      "args": ["vortex-mcp-recall", "--data-dir", "/absolute/path/to/instance/data"]
    }
  }
}
Programmatic use (attach your own transport):
import { vector, mcp } from "@vortex-os/memory-extended";
const server = await mcp.createRecallServer({
  embed: vector.createLocalEmbedder(),
  dbPath: "./data/_indexes/memory.sqlite",
});
// await server.connect(transport)
Quick usage — consolidate
import { SessionArchiveStore } from "@vortex-os/memory-extended/sessionArchive";
import { Consolidator } from "@vortex-os/memory-extended/consolidate";
import { proactiveCurator } from "@vortex-os/base";
const store = new SessionArchiveStore("./data");
const consolidator = new Consolidator(store, process.cwd(), {
  defaultLookbackDays: 7,
});
// Host-supplied LLM adapter — see @vortex-os/base "Opt-in /curate surface"
// for the Claude Code wiring example.
const llm = new proactiveCurator.ClaudeCodeLLMJudge(async ({ prompt }) => {
  return await invokeMySubAgent(prompt);
});
const proposals = await consolidator.propose({}, { llm, now: new Date() });
// Each proposal carries onAccept / onDecline thunks. Surface them to the
// operator; on accept the memory file is written under data/_memory/<name>.md
// via a single create-file action.
The consolidator queries recently ingested sessions from sessionArchive, asks the LLM for memory candidates with explicit source-event citations, drops candidates whose fingerprint is on the rejected list (30-day expiry), and returns proposals shaped exactly like proactive-curator outputs, so a single /curate-style host UX renders both in-session capture and post-session consolidation.
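The rejected-list step can be sketched as a filter over candidate texts. The fingerprint function below is a stand-in (lowercased, whitespace-collapsed text); the README does not describe the real fingerprinting scheme, and dropRejected is a hypothetical name.

```typescript
const REJECT_TTL_MS = 30 * 24 * 60 * 60 * 1000; // 30-day expiry

// Stand-in fingerprint: case- and whitespace-insensitive text.
const fingerprint = (text: string): string =>
  text.trim().toLowerCase().replace(/\s+/g, " ");

// Drop candidates rejected within the last 30 days; older rejections have
// expired and no longer block a proposal.
function dropRejected(
  candidates: string[],
  rejectedAt: Map<string, number>,
  now: number,
): string[] {
  return candidates.filter((c) => {
    const at = rejectedAt.get(fingerprint(c));
    return at === undefined || now - at >= REJECT_TTL_MS;
  });
}

const now = Date.parse("2026-06-15T00:00:00Z");
const rejectedAt = new Map<string, number>([
  ["prefers short replies", Date.parse("2026-06-10T00:00:00Z")], // 5 days ago
  ["uses pnpm", Date.parse("2026-05-01T00:00:00Z")], // 45 days ago, expired
]);
const kept = dropRejected(
  ["Prefers  short replies", "Uses pnpm", "Works in UTC"],
  rejectedAt,
  now,
);
```

Only the recently rejected candidate is dropped; the one rejected 45 days earlier is eligible to be proposed again.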
Privacy & network behavior
- Embedding model download. The vector/recall namespaces use the e5-small sentence-embedding model via @huggingface/transformers. On first use the model weights are fetched (HTTP GET, read-only) from huggingface.co and cached locally. To run fully offline: set HF_HUB_OFFLINE=1, pre-seed the Hugging Face cache directory, or pass your own EmbedFn to skip the bundled model entirely. No data is uploaded; the download is one-directional.
- Consolidation and the host LLM. The consolidate namespace sends excerpts of your archived session transcripts to whatever LLM adapter the host wires in (the LLMJudge you supply). Those excerpts leave the process only through that adapter; choose one whose data-handling you trust.
- Local storage permissions. Archived raw transcripts, the normalized JSONL, and the SQLite indexes are written under data/_session-archive/ with owner-only permissions (0o700 dirs / 0o600 files) on POSIX hosts. This is a no-op on Windows.
- Verbatim transcript content + git tracking. The normalized JSONL under data/_session-archive/normalized/ stores verbatim conversation content: every user and assistant message as it was written. If you enable git tracking for that path (the normalized layer is the cross-machine source of truth, so this is a reasonable choice), any secret that appeared in a transcript (an API token pasted into a prompt, a password in a tool result) is committed and synced along with it. To scrub such patterns before the normalized files are written, supply an ingest-time redaction hook via AdapterOptions.redact (passed as options.redact to ingest): it runs on every event after normalization and before storage, so a regex that masks token-shaped strings keeps them out of the archive entirely.
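A redaction hook of the kind described in the last item might look like the sketch below. The SketchEvent shape, the event-in/event-out signature, and the token pattern are all assumptions made for illustration; check the actual AdapterOptions.redact type before wiring this in.

```typescript
// Hypothetical event shape; the real VortexTranscriptEvent has more fields.
interface SketchEvent { role: string; text: string; }

// Illustrative pattern only: a few common token prefixes followed by 16 or
// more token characters. Tune it to the secrets your transcripts may hold.
const TOKEN_PATTERN = /\b(?:sk|ghp|xoxb)[-_][A-Za-z0-9_-]{16,}/g;

// Returns a copy of the event with token-shaped strings masked.
function redact(event: SketchEvent): SketchEvent {
  return { ...event, text: event.text.replace(TOKEN_PATTERN, "[REDACTED]") };
}

const masked = redact({ role: "user", text: "my key is sk-abcdefghijklmnop1234 ok" });
const untouched = redact({ role: "user", text: "no secrets in this message" });
```

A pattern this narrow will miss secrets that do not follow a known prefix, so treat it as a starting point rather than a guarantee.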
Design documents
- docs/memory-extended-design.md: cluster overview, sub-phase breakdown, industry pattern alignment
- docs/transcript-adapter-design.md: sessionArchive adapter interface, per-host schema mapping, on-disk storage, SQLite schema, lock-handling UX
- docs/proactive-curator-design.md: in-session counterpart (separate module)
- docs/architecture.md: multi-layer memory architecture, source-of-truth policy
