@aabadin/agent-memory-mcp
v2.0.0
MCP server for persistent agent memory — hybrid BM25 + vector search backed by LanceDB with local ONNX embeddings. No API keys, fully local.
MCP server for persistent agent memory, backed by LanceDB with hybrid BM25 + vector search. Gives AI agents the ability to store, search, and manage memories across sessions using the Model Context Protocol.
All data stays on your machine. Embeddings are generated locally using bge-m3 via ONNX — no API keys, no network dependencies after initial setup.
Features
- Hybrid search — combines BM25 full-text search with cosine vector similarity via Reciprocal Rank Fusion (RRF)
- Local embeddings — runs Xenova/bge-m3 locally via ONNX, no external API calls
- 12 memory categories — structured taxonomy for organising memories
- Batch operations — store multiple memories in a single call
- Hardcopy backup — optional JSON file mirror of all mutations for human-readable backup
- Temporal decay: exponential time-based decay favors recent memories when relevance is similar. Configurable half-life, with evergreen and never-forget tag exemptions
- Fully local: all data stays on disk, no network dependencies after first model download
Install
npm install -g @aabadin/agent-memory-mcp

Warning: the first run downloads Xenova/bge-m3. Keep at least 1 GB free for the local model cache and LanceDB store.
Supported platforms: Linux x64/arm64, macOS x64/arm64, Windows x64.
Then add the server to your MCP client configuration.
Claude Desktop
On macOS, add to ~/Library/Application Support/Claude/claude_desktop_config.json:
{
  "mcpServers": {
    "agent-memory": {
      "command": "agent-memory-mcp",
      "env": {
        "MEMORY_DB_PATH": "/path/to/your/memory-db"
      }
    }
  }
}

Claude Code
Add to your project's .mcp.json:
{
  "agent-memory": {
    "command": "agent-memory-mcp",
    "env": {
      "MEMORY_DB_PATH": "/path/to/your/memory-db"
    }
  }
}

Configuration
| Variable | Required | Description |
|---|---|---|
| MEMORY_DB_PATH | Yes | Path to the LanceDB database directory |
| EMBEDDING_MODEL | No | HuggingFace model ID (default: Xenova/bge-m3) |
| EMBEDDING_DIMENSIONS | No | Expected vector size. Default: inferred from model (1024 for Xenova/bge-m3) |
| EMBEDDING_POOLING | No | Pooling strategy. Default: inferred from model (cls for BGE models, mean otherwise) |
| EMBEDDING_CACHE_PATH | No | Directory for content-addressed binary embedding cache. Speeds up repeated runs by skipping ONNX inference for previously-seen text |
| MEMORY_DECAY_HALF_LIFE | No | Decay half-life in days (default: 30). Set to 0 to disable temporal decay |
| ENABLE_HARDCOPY | No | Set to true to enable JSON file backup |
| HARDCOPY_PATH | If hardcopy enabled | Directory for JSON mirror files |
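The defaults and requirements in the table can be read as a small validation routine. The following is a minimal sketch of that reading, not the package's actual loader: the `MemoryConfig` shape, `loadConfig`, `parsePooling`, and the substring test used to detect a BGE model are all hypothetical, and only a subset of the variables is covered.

```typescript
// Hypothetical config shape: field names are illustrative, not the package's internals.
interface MemoryConfig {
  dbPath: string;
  embeddingModel: string;
  pooling: "cls" | "mean";
  decayHalfLifeDays: number; // 0 disables temporal decay
  hardcopyPath: string | null; // null when the JSON mirror is off
}

type Env = Record<string, string | undefined>;

// Assumption: only the two pooling strategies named in the table are accepted.
function parsePooling(value: string): "cls" | "mean" {
  if (value === "cls" || value === "mean") return value;
  throw new Error(`Unsupported EMBEDDING_POOLING: ${value}`);
}

function loadConfig(env: Env): MemoryConfig {
  const dbPath = env.MEMORY_DB_PATH;
  if (!dbPath) throw new Error("MEMORY_DB_PATH is required");

  const embeddingModel = env.EMBEDDING_MODEL ?? "Xenova/bge-m3";

  // Assumption: a BGE model is recognised by "bge" appearing in its ID.
  const inferredPooling = /bge/i.test(embeddingModel) ? "cls" : "mean";
  const pooling = parsePooling(env.EMBEDDING_POOLING ?? inferredPooling);

  const decayHalfLifeDays = Number(env.MEMORY_DECAY_HALF_LIFE ?? "30");
  if (!Number.isFinite(decayHalfLifeDays) || decayHalfLifeDays < 0) {
    throw new Error("MEMORY_DECAY_HALF_LIFE must be a non-negative number");
  }

  // HARDCOPY_PATH only becomes mandatory once ENABLE_HARDCOPY is "true".
  let hardcopyPath: string | null = null;
  if (env.ENABLE_HARDCOPY === "true") {
    const path = env.HARDCOPY_PATH;
    if (!path) {
      throw new Error("HARDCOPY_PATH is required when ENABLE_HARDCOPY=true");
    }
    hardcopyPath = path;
  }

  return { dbPath, embeddingModel, pooling, decayHalfLifeDays, hardcopyPath };
}

// In Node.js this would typically be called as loadConfig(process.env).
const config = loadConfig({ MEMORY_DB_PATH: "/path/to/your/memory-db" });
console.log(config);
```

Taking the environment as a parameter rather than reading `process.env` directly keeps the sketch free of Node.js typings and easy to exercise with different variable sets.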
Upgrading From MiniLM
If you already have a LanceDB directory created with a smaller embedding model such as Xenova/all-MiniLM-L6-v2 (384 dimensions), you must rebuild that database before using bge-m3 (1024 dimensions). Mixing vector sizes in the same table is not supported.
Typical upgrade flow:
- Stop the MCP server.
- Back up or delete the old MEMORY_DB_PATH directory.
- Start the server again so all memories are re-embedded with the new model.
Tools
| Tool | Description |
|---|---|
| store | Store a single memory with content, category, and tags |
| store_batch | Store multiple memories in one call |
| search | Search memories by meaning and/or keywords. Supports hybrid, keyword, and semantic modes |
| recall | Multi-topic contextual recall — searches multiple topics in parallel and includes recent memories |
| find_related | Find memories similar to a specific memory |
| list_recent | List most recent memories, optionally filtered by category |
| update | Update an existing memory — re-embeds automatically if content changes |
| delete | Permanently remove a memory by ID |
| stats | Get database statistics: total count, breakdown by category, timestamps |
| prune | Preview or prune low-strength and dormant memories. Dry-run by default |
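MCP clients invoke these tools with a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a request for the store tool. The helper `buildToolCall` is hypothetical, and the argument names (`content`, `category`, `tags`) are assumed from the tool description above; the authoritative input schema is whatever the server reports via `tools/list`.

```typescript
// JSON-RPC 2.0 envelope for an MCP tool invocation.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

let nextId = 1;

// Hypothetical helper: wraps a tool name and its arguments in the envelope.
function buildToolCall(name: string, args: Record<string, unknown>): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Assumed argument names; the memory text is made up for illustration.
const storeRequest = buildToolCall("store", {
  content: "Connection pool exhaustion was caused by unclosed cursors",
  category: "bug-fix",
  tags: ["evergreen"],
});

console.log(JSON.stringify(storeRequest, null, 2));
```

In practice an MCP client library handles this framing, so a sketch like this is mainly useful for reading protocol logs or debugging a client integration.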
Search Modes
The search tool supports three modes:
- hybrid (default): combines BM25 keyword scoring with vector similarity using RRF reranking. Falls back to semantic-only if the full-text index is unavailable.
- keyword: BM25 full-text search only.
- semantic: cosine vector similarity only.
All modes support filtering by category, tags, and date range.
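To illustrate how hybrid mode can merge two result lists, here is a minimal sketch of Reciprocal Rank Fusion. It is a generic version of the technique, not the package's code: the function name, the memory IDs, and the constant `k = 60` (a commonly used default) are assumptions, since the README does not state which value is used.

```typescript
interface FusedResult {
  id: string;
  score: number;
}

// Each ranking is a list of memory IDs, best match first.
// Every list contributes 1 / (k + rank) to the IDs it contains.
function reciprocalRankFusion(rankings: string[][], k = 60): FusedResult[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  return Array.from(scores.entries())
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}

// Hypothetical IDs: one list from BM25, one from vector similarity.
const bm25Ranking = ["m2", "m1", "m4"];
const vectorRanking = ["m1", "m3", "m2"];

const fused = reciprocalRankFusion([bm25Ranking, vectorRanking]);
console.log(fused.map((r) => r.id)); // [ 'm1', 'm2', 'm3', 'm4' ]
```

Because RRF uses only rank positions, it needs no normalisation between BM25 scores and cosine similarities, which live on different scales. Memories that appear in both lists (m1 and m2 here) rise above those found by only one method.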
Temporal Decay
Search results are scored with exponential time-based decay so that recent memories surface above older ones when semantic relevance is similar. The decay follows a half-life model: a memory one half-life old has its score halved, a memory two half-lives old has it quartered, and so on.
- Default half-life: 30 days (configurable via MEMORY_DECAY_HALF_LIFE)
- Disable: set MEMORY_DECAY_HALF_LIFE=0
- Exempt tags: memories tagged evergreen or never-forget are never decayed
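The half-life model described above corresponds to multiplying a score by 0.5 raised to the power of age divided by half-life. The sketch below works through that arithmetic. The function name and signature are hypothetical, and how the package combines decay with the relevance score internally is not documented here.

```typescript
// Tags that exempt a memory from decay, as listed above.
const EXEMPT_TAGS = new Set(["evergreen", "never-forget"]);

// Hypothetical helper: returns the relevance score after temporal decay.
function applyDecay(
  score: number,
  ageDays: number,
  tags: string[],
  halfLifeDays = 30,
): number {
  if (halfLifeDays <= 0) return score; // half-life of 0 disables decay
  if (tags.some((tag) => EXEMPT_TAGS.has(tag))) return score; // exempt memory
  const age = Math.max(0, ageDays); // treat future timestamps as age 0
  return score * Math.pow(0.5, age / halfLifeDays);
}

console.log(applyDecay(1, 30, []));            // 0.5  (one half-life)
console.log(applyDecay(1, 60, []));            // 0.25 (two half-lives)
console.log(applyDecay(1, 60, ["evergreen"])); // 1    (exempt tag)
console.log(applyDecay(1, 60, [], 0));         // 1    (decay disabled)
```

With the default 30-day half-life, a memory from three months ago keeps one eighth of its score, so it still surfaces when it is clearly the most relevant match but loses ties against recent memories.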
Memory Categories
code-solution · bug-fix · architecture · learning · tool-usage · debugging · performance · security · observation · personal · relationship · other
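A client that wants to validate a category before calling store could mirror the list above as a literal type. This is a sketch of one way to do so; the names `MEMORY_CATEGORIES`, `MemoryCategory`, and `isMemoryCategory` are illustrative and are not exported by the package.

```typescript
// The twelve categories listed above, as a readonly tuple.
const MEMORY_CATEGORIES = [
  "code-solution", "bug-fix", "architecture", "learning",
  "tool-usage", "debugging", "performance", "security",
  "observation", "personal", "relationship", "other",
] as const;

// Union of the literal strings in the tuple.
type MemoryCategory = (typeof MEMORY_CATEGORIES)[number];

// Type guard: narrows an arbitrary string to a known category.
function isMemoryCategory(value: string): value is MemoryCategory {
  return (MEMORY_CATEGORIES as readonly string[]).indexOf(value) !== -1;
}

const candidate: string = "bug-fix";
if (isMemoryCategory(candidate)) {
  const category: MemoryCategory = candidate; // narrowed by the guard
  console.log(`valid category: ${category}`);
}
```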
Development
git clone https://github.com/adamrdrew/agent-memory-mcp.git
cd agent-memory-mcp
npm install
npm run dev # Run with tsx (no build step)
npm run build # Compile TypeScript to dist/
npm test # Run all tests
npm run test:watch # Run tests in watch mode

License
This project is licensed under the GNU General Public License v3.0.
