@cablate/memory-lancedb-mcp

v2.0.33

Published

2 months ago

MCP server for LanceDB-backed long-term memory with hybrid retrieval (Vector + BM25), cross-encoder rerank, multi-scope isolation, and memory lifecycle management

0High
0Medium
0Low

cablate

mcp mcp-server memory lancedb vector-search bm25 hybrid-retrieval rerank ai-memory long-term-memory

Persistent, intelligent long-term memory for any MCP-compatible AI agent.

English | 繁體中文

Before / After

Without memory, every session starts from zero. With memory-lancedb-mcp, your agent accumulates knowledge across sessions — automatically.

Before — agent has no context:

User: "Use the same animation style as last time"
Agent: "I don't have any context about previous animations. Could you describe what you'd like?"

After — agent recalls past decisions:

<memories>
1. Remotion spring animation: use duration >= 20, damping 12-15 for smooth easing
2. Video export preset: 1080p, 30fps for social, 60fps for demo
</memories>
<refs>#1=6352a7d2 #2=bed148f0</refs>

Store responses are minimal — no noise, just confirmation:

Stored. [topic: remotion]

Quick Start

1. Install

npm install -g @cablate/memory-lancedb-mcp

2. Configure

Add to your MCP client settings (e.g. Claude Desktop claude_desktop_config.json):

{
  "mcpServers": {
    "memory": {
      "command": "npx",
      "args": ["-y", "@cablate/memory-lancedb-mcp"],
      "env": {
        "EMBEDDING_API_KEY": "your-api-key",
        "EMBEDDING_MODEL": "text-embedding-3-small"
      }
    }
  }
}

{
  "mcpServers": {
    "memory": {
      "command": "npx",
      "args": ["-y", "@cablate/memory-lancedb-mcp"],
      "env": {
        "MEMORY_LANCEDB_CONFIG": "/path/to/config.json"
      }
    }
  }
}

See config.example.json for all options.

How It Works

          store                          recall
            │                              │
   ┌────────▼────────┐           ┌────────▼────────┐
   │  Filter junk     │           │ Search by meaning │
   │  Save + embed    │           │   AND keywords    │
   │  Link related    │           │ Re-rank results   │
   │  Flag conflicts  │           │ Fade stale ones   │
   │  Tag topic       │           │ Pull in related   │
   └────────┬────────┘           │ Merge duplicates  │
            │                    └────────┬────────┘
            ▼                             ▼
   ┌─────────────────────────────────────────────┐
   │          LanceDB (local, zero-config)        │
   └─────────────────────────────────────────────┘

Every memory_store saves to a local database, automatically links related memories, flags contradictions, and assigns topic labels — no extra API calls needed. Every memory_recall searches by both meaning and keywords, pulls in related memories the main search might miss, and includes maintenance hints so the agent can keep its own knowledge base clean.

Features

Retrieval

Finds the right memory even when you use different words — searches by meaning and exact keywords simultaneously, then combines the best of both
More precise results, not just surface matches — an optional second pass re-ranks results by actual relevance (6 providers supported)
Search multiple topics at once — pass a queries array to search several keywords in one call; results are deduplicated and memories that match multiple queries rank higher
Finding A automatically surfaces related B — when a memory is found, its linked neighbors are pulled in too, even if they use completely different words
Minimal token overhead — responses use compact XML tags (<memories>, <hints>, <refs>) with short IDs, no category/scope noise

Storage

Related memories link themselves — when you store something new, it automatically creates bidirectional links to similar existing memories
Conflicts get flagged — if a new memory contradicts an existing one, you get a warning so nothing silently overwrites
Topics assigned automatically — each memory gets a topic label inferred from its content and neighbors; you can also set it explicitly
Junk gets filtered out — greetings, refusals, and meta-questions are rejected before they waste storage

Lifecycle

Frequently used memories stay sharp, stale ones fade — a decay model balances how recent, how often accessed, and how important each memory is
Memories earn their keep — three tiers (Peripheral → Working → Core); the more a memory gets used, the faster it promotes
Full version history — when you update a memory, the old version is preserved in a chain you can trace with memory_history

Maintenance

The agent maintains itself — recall results include inline hints about duplicates, dormant memories, and contradictions
Health checks on demand — memory_lint finds orphaned memories, stale entries, and missing links, then fixes what it can
Merge duplicates — memory_merge combines two redundant memories into one; originals are marked as superseded
See your memory space — memory_visualize generates an interactive HTML graph you can open in any browser

Visualization

Run memory_visualize to generate an interactive knowledge graph of your memory space:

Automatic clustering — related memories group together visually
Similarity edges, duplicate detection, importance sizing
Time filter, growth animation, cluster view
Self-contained HTML — open in any browser

Query → embedQuery() ─┐
                       ├─→ RRF Fusion → Rerank → Lifecycle Decay → Length Norm → Filter
Query → BM25 FTS ─────┘

| Stage | Effect | |-------|--------| | RRF Fusion | Combines semantic and exact-match recall | | Cross-Encoder Rerank | Promotes semantically precise hits | | Lifecycle Decay | Weibull freshness + access frequency + importance | | Length Normalization | Prevents long entries from dominating (anchor: 500 chars) | | Hard Min Score | Removes irrelevant results (default: 0.35) | | MMR Diversity | Cosine similarity > 0.85 → demoted |

Configuration

Environment Variables

| Variable | Required | Description | |----------|----------|-------------| | EMBEDDING_API_KEY | Yes | API key for embedding provider | | EMBEDDING_MODEL | No | Model name (default: text-embedding-3-small) | | EMBEDDING_BASE_URL | No | Custom base URL for non-OpenAI providers | | MEMORY_DB_PATH | No | LanceDB storage directory | | MEMORY_LANCEDB_CONFIG | No | Path to JSON config file |

{
  "embedding": {
    "apiKey": "${EMBEDDING_API_KEY}",
    "model": "jina-embeddings-v5-text-small",
    "baseURL": "https://api.jina.ai/v1",
    "dimensions": 1024,
    "taskQuery": "retrieval.query",
    "taskPassage": "retrieval.passage",
    "normalized": true
  },
  "dbPath": "./memory-data",
  "retrieval": {
    "mode": "hybrid",
    "vectorWeight": 0.7,
    "bm25Weight": 0.3,
    "minScore": 0.3,
    "rerank": "cross-encoder",
    "rerankApiKey": "${JINA_API_KEY}",
    "rerankModel": "jina-reranker-v3",
    "rerankEndpoint": "https://api.jina.ai/v1/rerank",
    "rerankProvider": "jina",
    "candidatePoolSize": 20,
    "hardMinScore": 0.35,
    "filterNoise": true
  },
  "enableManagementTools": true,
  "enableSelfImprovementTools": false,
  "enableVisualizationTools": true,
  "scopes": {
    "default": "global",
    "definitions": {
      "global": { "description": "Shared knowledge" },
      "agent:my-bot": { "description": "Private to my-bot" }
    },
    "agentAccess": {
      "my-bot": ["global", "agent:my-bot"]
    }
  },
  "decay": {
    "recencyHalfLifeDays": 30,
    "frequencyWeight": 0.3,
    "intrinsicWeight": 0.3
  }
}

Works with any OpenAI-compatible embedding API:

| Provider | Model | Base URL | Dimensions | |----------|-------|----------|------------| | OpenAI | text-embedding-3-small | https://api.openai.com/v1 | 1536 | | Jina | jina-embeddings-v5-text-small | https://api.jina.ai/v1 | 1024 | | DeepInfra | Qwen/Qwen3-Embedding-8B | https://api.deepinfra.com/v1/openai | 1024 | | Google Gemini | gemini-embedding-001 | https://generativelanguage.googleapis.com/v1beta/openai/ | 3072 | | Ollama (local) | nomic-embed-text | http://localhost:11434/v1 | varies |

| Provider | rerankProvider | Endpoint | Example Model | |----------|-----------------|----------|---------------| | Jina | jina | https://api.jina.ai/v1/rerank | jina-reranker-v3 | | Hugging Face TEI | tei | http://host:8081/rerank | BAAI/bge-reranker-v2-m3 | | SiliconFlow | siliconflow | https://api.siliconflow.com/v1/rerank | BAAI/bge-reranker-v2-m3 | | Voyage AI | voyage | https://api.voyageai.com/v1/rerank | rerank-2.5 | | Pinecone | pinecone | https://api.pinecone.io/rerank | bge-reranker-v2-m3 | | DashScope | dashscope | https://dashscope.aliyuncs.com/api/v1/services/rerank | gte-rerank |

Core Tools

| Tool | Description | |------|-------------| | memory_recall | Search memories — supports batch queries, relation expansion, topic filtering, and inline maintenance hints | | memory_store | Save a memory — auto-links related ones, flags contradictions, infers topic, filters junk | | memory_forget | Delete by ID or search query | | memory_update | Update a memory; the old version is preserved in a version chain | | memory_merge | Merge two memories into one | | memory_history | Trace version history through update/merge chains |

Management Tools (opt-in)

| Tool | Description | |------|-------------| | memory_stats | Usage statistics by scope and category | | memory_list | List recent memories with filtering | | memory_lint | Health checks + auto-fix missing relations |

Enable: "enableManagementTools": true

Self-Improvement Tools (opt-in)

| Tool | Description | |------|-------------| | self_improvement_log | Log structured learning/error entries | | self_improvement_extract_skill | Create skill scaffolds from learnings | | self_improvement_review | Summarize governance backlog |

Enable: "enableSelfImprovementTools": true

Visualization Tools (on by default)

| Tool | Description | |------|-------------| | memory_visualize | Generate interactive HTML memory graph |

Params: output_path, scope, threshold (default: 0.65), max_neighbors (default: 4)

Disable: "enableVisualizationTools": false

LanceDB table memories:

| Field | Type | Description | |-------|------|-------------| | id | string (UUID) | Primary key | | text | string | Memory text (FTS indexed) | | vector | float[] | Embedding vector | | category | string | preference / fact / decision / entity / skill / lesson / other | | scope | string | Scope identifier | | importance | float | Importance score 0-1 | | timestamp | int64 | Creation timestamp (ms) | | metadata | string (JSON) | Extended metadata (tier, access_count, relations, topic, etc.) |

Development

git clone https://github.com/cablate/memory-lancedb-mcp.git
cd memory-lancedb-mcp
npm install
npm test

Run locally:

EMBEDDING_API_KEY=your-key npx tsx server.ts

Credits

Built on CortexReach/memory-lancedb-pro — original work by win4r and contributors.

License

MIT — see LICENSE for details.