@cosmiclasagnadev/zmem
v0.1.1
Local-first hybrid memory with zvec + FTS + MCP
zmem
Local-first hybrid memory for engineering workflows.
zmem is a small POC/experiment inspired by QMD-style document search, built to explore whether we can get strong practical recall by combining:
- dense retrieval via @zvec/zvec
- lexical retrieval via SQLite FTS5/BM25
- a simple fusion pipeline (RRF)
- a local MCP server for coding-agent integration
- graph data for relationships between memories (wip)
Overall, it is a local-first, agent-oriented memory substrate for engineering decisions and evolving project context.
Quickstart
Get zmem running locally in a few minutes.
1. Install
npm install -g @cosmiclasagnadev/zmem
zmem help
Or from this repo:
npm install
npm run build
npm install -g .
2. Initialize a workspace
Create or update a starter config for the workspace you want to index:
zmem init --workspace=default --root=/absolute/path/to/your/project --yes
This creates a config and sets up zmem to use user-scoped storage by default.
3. Check setup
Validate config, storage, and local model settings:
zmem doctor --workspace=default
zmem models status
4. Prepare local query-expansion models
zmem uses local-first query expansion by default.
See the exact model pull commands:
zmem models pull
This will print commands for the default local models:
- primary: hf:mradermacher/qmd-query-expansion-qwen3.5-2B-GGUF:Q4_K_M
- fallback: hf:mradermacher/qmd-query-expansion-qwen3.5-2B-GGUF:Q4_K_S
Run the printed node-llama-cpp pull commands, or let zmem download on first query.
5. Ingest your docs
zmem ingest /absolute/path/to/your/project --workspace=default
6. Query your memory
zmem query "database migration" --workspace=default
zmem query "" --mode=recent --workspace=default
zmem query "" --mode=typed --types=decision,preference --workspace=default
7. Use with MCP / OpenCode
Start the MCP server:
zmem mcp --workspace=default
Then point your MCP client, such as OpenCode, at the zmem command.
First-run notes
- Default storage is outside your repo:
  - macOS/Linux: ~/.local/share/zmem/workspaces/<workspace-slug>/
  - Windows: %APPDATA%/zmem/workspaces/<workspace-slug>/
- Query expansion is enabled by default for hybrid retrieval.
- If the local query-expansion model is unavailable, zmem warns once and falls back to deterministic expansion.
- You can inspect the fully resolved config with:
zmem config show --workspace=default
What this does differently
- explicit memory types and lifecycle (pending, active, archived, deleted) plus supersedes_id
- if no explicit type is defined, it defaults to 'fact'
- if no explicit tags are defined, it defaults to '[]'
- local-first query expansion is enabled by default for hybrid retrieval, with deterministic fallback when the local model is unavailable
- uses @zvec/zvec for dense retrieval when doing vector search
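The defaults and lifecycle states above can be pictured with a small sketch. The lifecycle values, `supersedes_id`, and the `'fact'` / `[]` defaults come from the list above; the interface names, the `normalizeMemory` helper, and the default status of `active` are assumptions made for illustration and may not match zmem's internal types.

```typescript
// Lifecycle states are the ones listed above.
type MemoryStatus = "pending" | "active" | "archived" | "deleted";

// Hypothetical input shape: only `content` is required.
interface MemoryInput {
  content: string;
  type?: string; // e.g. "decision", "preference", "fact"
  tags?: string[];
  status?: MemoryStatus;
  supersedes_id?: string; // id of the memory this one replaces
}

// Hypothetical normalized shape with every field filled in.
interface MemoryItem {
  content: string;
  type: string;
  tags: string[];
  status: MemoryStatus;
  supersedes_id: string | null;
}

// Apply the documented defaults: type falls back to "fact", tags to [].
// Defaulting status to "active" is an assumption, not documented behavior.
function normalizeMemory(input: MemoryInput): MemoryItem {
  return {
    content: input.content,
    type: input.type ?? "fact",
    tags: input.tags ?? [],
    status: input.status ?? "active",
    supersedes_id: input.supersedes_id ?? null,
  };
}

const untyped = normalizeMemory({ content: "Use SQLite FTS5 for lexical search" });
const replacement = normalizeMemory({
  content: "Use FTS5 with BM25 ranking for lexical search",
  type: "decision",
  tags: ["search"],
  supersedes_id: "mem-001", // hypothetical id
});
console.log(untyped.type, untyped.tags, replacement.supersedes_id);
```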
Architecture (high level)
- Language/runtime: TypeScript + Node.js
- Vectors: @zvec/zvec
- Lexical: better-sqlite3 + FTS5/BM25
- Embeddings: node-llama-cpp by default, with explicit provider hooks for Gemini and offline mock testing
- Validation: zod
- Agent integration: @modelcontextprotocol/sdk
Hybrid retrieval flow (follows qmd's style closely):
- lexical search (FTS/BM25)
- vector search (zvec)
- reciprocal rank fusion (RRF)
- bounded query expansion
- heuristic memory-item reranking
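The fusion step in the flow above can be sketched as follows. This is a generic reciprocal rank fusion over ranked id lists, not zmem's actual code: the function name `reciprocalRankFusion`, the constant `k = 60` (a commonly used value), and the sample ids are assumptions.

```typescript
// Reciprocal rank fusion: score(d) = sum over lists of 1 / (k + rank(d)),
// where rank starts at 1 for the top result of each list.
function reciprocalRankFusion(
  rankedLists: string[][],
  k = 60, // assumed smoothing constant
): Array<{ id: string; score: number }> {
  const scores = new Map<string, number>();
  for (const list of rankedLists) {
    list.forEach((id, index) => {
      const rank = index + 1;
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first.
  return [...scores.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}

// Hypothetical result ids from the lexical and vector searches.
const lexicalIds = ["doc-a", "doc-b", "doc-c"];
const vectorIds = ["doc-b", "doc-d", "doc-a"];
const fused = reciprocalRankFusion([lexicalIds, vectorIds]);
console.log(fused.map((r) => r.id)); // [ 'doc-b', 'doc-a', 'doc-d', 'doc-c' ]
```

Documents that appear in both lists accumulate two contributions, which is why `doc-b` and `doc-a` outrank the documents found by only one retriever.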
Indexing pipeline
flowchart LR
A[Markdown/Text Sources] --> B[Ingestion Pipeline]
B --> C[Parse + Chunk]
C --> D[(SQLite + FTS5)]
C --> E[Generate Embeddings]
E --> F[(zvec Vector Index)]
Retrieval pipeline
flowchart LR
Q[User Query] --> M{Mode}
M -->|lexical| L[FTS/BM25 Search]
M -->|vector| V[Embedding + zvec Search]
M -->|hybrid| H[FTS + Vector in parallel]
L --> O[Top Results]
V --> O
H --> R[RRF Fusion]
R --> O
O --> C1[CLI Output]
O --> C2[MCP Tools]
At runtime, the same core API powers both CLI commands and MCP tools, so behavior stays consistent across direct terminal use and agent workflows.
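A minimal sketch of the mode dispatch in the diagram, assuming injected search backends. The names `runQuery`, `Backends`, and `stubBackends` are hypothetical and do not reflect zmem's real signatures; the point is only that a single function can serve both the CLI and the MCP tools, and that the hybrid mode runs both searches in parallel before fusing.

```typescript
type Mode = "lexical" | "vector" | "hybrid";
type Hit = { id: string; score: number };
type Searcher = (query: string) => Promise<Hit[]>;

// Hypothetical backend bundle; each member stands in for a real component.
interface Backends {
  lexical: Searcher; // FTS/BM25 search
  vector: Searcher; // embedding + zvec search
  fuse: (lists: Hit[][]) => Hit[]; // e.g. RRF fusion
}

async function runQuery(
  query: string,
  mode: Mode,
  backends: Backends,
  limit = 10,
): Promise<Hit[]> {
  switch (mode) {
    case "lexical":
      return (await backends.lexical(query)).slice(0, limit);
    case "vector":
      return (await backends.vector(query)).slice(0, limit);
    case "hybrid": {
      // Run both retrievers concurrently, then fuse their ranked lists.
      const [lex, vec] = await Promise.all([
        backends.lexical(query),
        backends.vector(query),
      ]);
      return backends.fuse([lex, vec]).slice(0, limit);
    }
  }
}

// Stub backends so the sketch runs without a database or model.
const stubBackends: Backends = {
  lexical: async () => [{ id: "a", score: 2 }, { id: "b", score: 1 }],
  vector: async () => [{ id: "b", score: 0.9 }, { id: "c", score: 0.5 }],
  fuse: (lists) => lists.flat(), // placeholder: concatenates instead of ranking
};

runQuery("sqlite", "hybrid", stubBackends, 3).then((hits) => {
  console.log(hits.map((h) => h.id));
});
```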
Project layout
- src/ingest/ - discovery, parsing, chunking, orchestration
- src/search/ - lexical/vector retrieval + fusion
- src/core/ - shared API (save, recall, get, list, delete, reindex, status)
- src/mcp/ - MCP server + tool registration
- src/db/ - SQLite handling + migrations
- config.example.json - sample configuration
Getting started
1) Install
npm install
2) Configure
Create your local config file:
cp config.example.json config.json
Then update:
- workspaces[].root to a real absolute path
- any desired patterns for ingestion
- storage paths (storage.dbPath, storage.zvecPath) if needed
- ai.embedding.provider if you want to switch from local llamacpp to gemini
- ai.embedding.apiKey / ZMD_EMBED_API_KEY when using gemini
- storage.baseDir only if you want to override the default XDG-style storage root
- ai.queryExpansion.* if you want to tune or disable default local-first query expansion
If config.json is missing, zmem falls back to defaults.
Default storage is outside your repo:
- macOS/Linux: ~/.local/share/zmem/workspaces/<workspace-slug>/
- Windows: %APPDATA%/zmem/workspaces/<workspace-slug>/
Within each workspace directory, zmem stores:
- memory.db
- vectors/
Advanced overrides still work through storage.baseDir, storage.dbPath, storage.zvecPath, ZMEM_STORAGE_BASE_DIR, ZMEM_DB_PATH, and ZMEM_ZVEC_PATH.
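To make the path layout concrete, here is a rough sketch of how the defaults and environment overrides could resolve. The directory layout, file names, and environment variable names are taken from the text above; the `slugify` rule, the `resolveWorkspacePaths` helper, the assumption that a base-directory override replaces the `.../zmem` root, and the assumption that the direct path variables win over the base directory are illustrative guesses, not zmem's confirmed behavior.

```typescript
import * as os from "node:os";
import * as path from "node:path";

interface StorageEnv {
  ZMEM_STORAGE_BASE_DIR?: string;
  ZMEM_DB_PATH?: string;
  ZMEM_ZVEC_PATH?: string;
  APPDATA?: string;
}

// Hypothetical slug rule: lowercase, runs of other characters become "-".
function slugify(workspace: string): string {
  return workspace
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

function resolveWorkspacePaths(
  workspace: string,
  env: StorageEnv = process.env,
  platform: string = process.platform,
  homeDir: string = os.homedir(),
) {
  // Use the path flavor of the target platform, not of the host.
  const p = platform === "win32" ? path.win32 : path.posix;
  const defaultBase =
    platform === "win32"
      ? p.join(env.APPDATA ?? p.join(homeDir, "AppData", "Roaming"), "zmem")
      : p.join(homeDir, ".local", "share", "zmem");
  const baseDir = env.ZMEM_STORAGE_BASE_DIR ?? defaultBase;
  const workspaceDir = p.join(baseDir, "workspaces", slugify(workspace));
  return {
    workspaceDir,
    // Assumed precedence: direct path overrides beat the derived defaults.
    dbPath: env.ZMEM_DB_PATH ?? p.join(workspaceDir, "memory.db"),
    zvecPath: env.ZMEM_ZVEC_PATH ?? p.join(workspaceDir, "vectors"),
  };
}

console.log(resolveWorkspacePaths("default", {}, "linux", "/home/dev"));
```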
Query expansion can also be tuned with env vars such as:
- ZMEM_QUERY_EXPANSION_ENABLED
- ZMEM_QUERY_EXPANSION_PROVIDER
- ZMEM_QUERY_EXPANSION_MODEL
- ZMEM_QUERY_EXPANSION_FALLBACK_MODEL
- ZMEM_QUERY_EXPANSION_MAX_EXPANSIONS
3) Ingest documents
npm run dev -- ingest ./docs --workspace=default
4) Query memory
npm run dev -- query "database decisions" --mode=hybrid --workspace=default
5) Check status
npm run dev -- status --workspace=default
CLI usage
Run help:
npm run dev -- help
Primary commands:
- ingest <path> [--workspace=<name>] [--logs=true|false]
- query <query> [--workspace=<name>] [--mode=hybrid|lexical|vector|recent|important|typed] [--scopes=a,b] [--types=a,b] [--expansion-mode=off|deterministic|llm] [--logs=true|false]
- status [--workspace=<name>] [--logs=true|false]
- init [--config=./config.json] [--workspace=<name>] [--root=<path>] [--storage-base-dir=<path>] [--enable-query-expansion=true|false] [--yes]
- doctor [--config=./config.json] [--workspace=<name>]
- config show [--config=./config.json] [--workspace=<name>]
- models <status|check|pull> [--config=./config.json]
- mcp [--config=./config.json] [--workspace=<name>] [--verbose=true|false]
Examples:
npm run dev -- ingest ./test-docs/search --workspace=default
npm run dev -- query "sqlite" --mode=lexical --workspace=default
npm run dev -- query "vector embeddings" --mode=vector --workspace=default
npm run dev -- init --workspace=default --root=/absolute/path/to/repo --yes
npm run dev -- doctor --workspace=default
npm run dev -- models pull
Local query expansion
zmem now defaults to local-first query expansion for hybrid retrieval.
- primary model: hf:mradermacher/qmd-query-expansion-qwen3.5-2B-GGUF:Q4_K_M
- fallback model: hf:mradermacher/qmd-query-expansion-qwen3.5-2B-GGUF:Q4_K_S
- expansion is skipped when lexical exact-match signal is already strong
- if the local model is unavailable, zmem warns once and falls back to deterministic expansion
- if the local model is unavailable, zmem warns once and falls back to deterministic expansion
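The skip and fallback behavior described above could look roughly like this. Everything here is a sketch under assumptions: `createExpander`, `deterministicExpand`, the boolean `lexicalSignalStrong` flag, and the token-based deterministic strategy are hypothetical, and the real thresholds and expansion rules in zmem may differ.

```typescript
type Expander = (query: string) => Promise<string[]>;

// Assumed deterministic strategy: a lowercased variant plus individual keywords.
function deterministicExpand(query: string, maxExpansions: number): string[] {
  const tokens = query.toLowerCase().split(/\s+/).filter((t) => t.length > 2);
  const variants = [tokens.join(" "), ...tokens];
  return [...new Set(variants)]
    .filter((v) => v.length > 0 && v !== query)
    .slice(0, maxExpansions);
}

// Wrap a local-model expander with the skip and warn-once fallback rules.
function createExpander(llmExpand: Expander, maxExpansions = 3) {
  let warned = false;
  return async function expand(
    query: string,
    lexicalSignalStrong: boolean,
  ): Promise<string[]> {
    // Skip expansion when the lexical exact-match signal is already strong.
    if (lexicalSignalStrong) return [];
    try {
      return (await llmExpand(query)).slice(0, maxExpansions);
    } catch {
      if (!warned) {
        console.warn("local query-expansion model unavailable; using deterministic expansion");
        warned = true; // warn only once per expander
      }
      return deterministicExpand(query, maxExpansions);
    }
  };
}

// Simulate a missing local model to exercise the fallback path.
const expandWithMissingModel = createExpander(async () => {
  throw new Error("model not downloaded");
});
expandWithMissingModel("Database Migration plan", false).then((variants) => {
  console.log(variants);
});
```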
To inspect or prepare local models:
npm run dev -- models status
npm run dev -- models pull
MCP usage
Start the MCP stdio server:
npm run dev -- mcp --workspace=default
Implemented tools:
- memory_query
- memory_search
- memory_get
- memory_list
- memory_save
- memory_delete
- memory_status
- memory_link
- memory_neighbors
- memory_edge_update
Optional admin tool:
- memory_reindex (enabled with ZMEM_ENABLE_REINDEX_TOOL=true)
Verbose MCP logs:
ZMEM_MCP_VERBOSE=true npm run dev -- mcp --workspace=default
Local development
Useful scripts:
- npm run dev - run CLI via tsx
- npm run build - build TypeScript to dist/
- npm start - run built CLI
- npm run typecheck - type-check without emitting
- npm test - run tests
- npm run smoke - build + smoke script
Typical dev loop:
npm run typecheck
npm test
npm run smoke
Environment variables
- ZMD_EMBED_MODEL - override embedding model
- ZMD_EMBED_DIMENSIONS - override embedding dimensions
- ZMD_EMBED_PROVIDER - override embedding provider (llamacpp, openai, ollama, gemini, mock)
- ZMD_EMBED_API_KEY - embedding API key override for remote providers such as Gemini
- ZMD_EMBED_BASE_URL - optional embedding API base URL override
- ZMD_EMBED_TASK_TYPE - optional Gemini task type override such as RETRIEVAL_DOCUMENT
- ZMEM_STORAGE_BASE_DIR - override the XDG-style storage root
- ZMEM_DB_PATH - override the resolved database path directly
- ZMEM_ZVEC_PATH - override the resolved vector storage path directly
- ZMEM_WORKSPACE - default workspace for MCP resolution
- ZMEM_MCP_VERBOSE=true - verbose MCP logs to stderr
- ZMEM_ENABLE_REINDEX_TOOL=true - expose memory_reindex MCP tool
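As an illustration of how the embedding overrides above might be read and validated, here is a small sketch. The variable names and the provider list come from this section, and `llamacpp` as the default follows the architecture notes; the `readEmbeddingEnv` helper, its return shape, and its error messages are hypothetical.

```typescript
const PROVIDERS = ["llamacpp", "openai", "ollama", "gemini", "mock"] as const;
type Provider = (typeof PROVIDERS)[number];

// Hypothetical shape for the resolved overrides.
interface EmbeddingOverrides {
  provider: Provider;
  model?: string;
  dimensions?: number;
  apiKey?: string;
  baseUrl?: string;
}

function readEmbeddingEnv(env: Record<string, string | undefined>): EmbeddingOverrides {
  // llamacpp is the documented local default provider.
  const rawProvider = env.ZMD_EMBED_PROVIDER ?? "llamacpp";
  if (!(PROVIDERS as readonly string[]).includes(rawProvider)) {
    throw new Error(`Unsupported ZMD_EMBED_PROVIDER: ${rawProvider}`);
  }

  const rawDims = env.ZMD_EMBED_DIMENSIONS;
  const dimensions = rawDims === undefined ? undefined : Number(rawDims);
  if (dimensions !== undefined && (!Number.isInteger(dimensions) || dimensions <= 0)) {
    throw new Error(`ZMD_EMBED_DIMENSIONS must be a positive integer, got: ${rawDims}`);
  }

  return {
    provider: rawProvider as Provider,
    model: env.ZMD_EMBED_MODEL,
    dimensions,
    apiKey: env.ZMD_EMBED_API_KEY,
    baseUrl: env.ZMD_EMBED_BASE_URL,
  };
}

// Example with explicit values instead of the real process environment.
console.log(readEmbeddingEnv({ ZMD_EMBED_PROVIDER: "gemini", ZMD_EMBED_DIMENSIONS: "768" }));
```

Failing fast on an unknown provider or a malformed dimension value keeps a typo in the environment from silently producing an index built with the wrong embedding settings.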
Roadmap
- [ ] graph traversal poc
- [ ] Rust implementation and comparison with metrics
- [ ] policy/compliance/audit/retention layer
- [ ] add unit tests for error paths
- [ ] improve batching controls and recall latency metrics
- [ ] more integration ergonomics and ideas
- [ ] reranking improvements (position-aware blend)
- [ ] query expansion strategies
- [ ] deeper retrieval tuning and eval harnesses
