# @alex900530/claude-persistent-memory

v1.1.1

Persistent memory system for Claude Code — hybrid BM25 + vector search, LLM-driven structuring, automatic clustering.
## Features

- 🧠 **Hybrid Search** — BM25 full-text (FTS5) + vector semantic similarity (sqlite-vec), with combined ranking
- 📡 **4-Channel Retrieval** — Pull (MCP tools) + Push (auto-inject via hooks on user prompt, pre-tool, post-tool)
- 🏗️ **LLM Structuring** — Memories are auto-structured into `<what>/<when>/<do>/<warn>` XML format
- 📦 **Automatic Clustering** — Similar memories are grouped; mature clusters are promoted to reusable skills
- 📊 **Confidence Scoring** — Memories gain or lose confidence through validation feedback and usage
- ⏳ **Time Decay** — Configurable half-lives per memory type (facts: 90d, context: 30d, skills: never)
- 🔒 **Local-First** — All data is stored locally in SQLite; your memories never leave your machine
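The combined ranking of the hybrid search can be sketched roughly as follows. This is a minimal illustration, not the package's actual scoring code: the normalization scheme and the 50/50 weighting are assumptions.

```javascript
// Hypothetical sketch of hybrid BM25 + vector ranking (not the package's
// actual implementation): normalize each score map to [0, 1], then blend.
function hybridRank(bm25Scores, vectorScores, alpha = 0.5) {
  const normalize = (scores) => {
    const max = Math.max(...Object.values(scores), 1e-9);
    return Object.fromEntries(
      Object.entries(scores).map(([id, s]) => [id, s / max])
    );
  };
  const bm25 = normalize(bm25Scores);
  const vec = normalize(vectorScores);
  const ids = new Set([...Object.keys(bm25), ...Object.keys(vec)]);
  return [...ids]
    .map((id) => ({
      id,
      score: alpha * (bm25[id] ?? 0) + (1 - alpha) * (vec[id] ?? 0),
    }))
    .sort((a, b) => b.score - a.score);
}

// Memory 'm2' scores well on both channels, so it ranks first.
const ranked = hybridRank({ m1: 8.2, m2: 6.5 }, { m1: 0.55, m2: 0.91 });
```

Blending normalized keyword and semantic scores lets an exact identifier match (BM25) and a paraphrased match (vectors) both surface.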
## Architecture

```
┌─────────────────────────────────────────────────────────────┐
│                     Claude Code Session                     │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  Pull Channel (on demand) Push Channels (auto)              │
│  ┌───────────────────┐    ┌──────────────────────────────┐  │
│  │ MCP Server        │    │ UserPromptSubmit Hook        │  │
│  │  memory_search    │    │ PreToolUse Hook              │  │
│  │  memory_save      │    │ PostToolUse Hook             │  │
│  │  memory_validate  │    │ PreCompact Hook (analysis)   │  │
│  │  memory_stats     │    │ SessionEnd Hook (clustering) │  │
│  └────────┬──────────┘    └──────────────┬───────────────┘  │
│           │                              │                  │
│           └──────────┬───────────────────┘                  │
│                      ▼                                      │
│ ┌───────────────────────────────────────────────────────┐   │
│ │               SQLite + FTS5 + sqlite-vec              │   │
│ │                      (memory.db)                      │   │
│ └───────────────────────────────────────────────────────┘   │
│                      ▲                                      │
│           ┌──────────┴───────────────────┐                  │
│           │                              │                  │
│  ┌────────┴──────────┐    ┌──────────────┴───────────────┐  │
│  │ Embedding Server  │    │ LLM Server                   │  │
│  │ TCP :23811        │    │ TCP :23812                   │  │
│  │ bge-m3 (1024d)    │    │ Azure OpenAI GPT-4.1         │  │
│  └───────────────────┘    └──────────────────────────────┘  │
│                                                             │
└─────────────────────────────────────────────────────────────┘
```

## Quick Start
### One-command install

```bash
# Set Azure OpenAI credentials (optional — skip to configure later)
export AZURE_OPENAI_ENDPOINT="https://your-resource.openai.azure.com"
export AZURE_OPENAI_KEY="your-api-key"

# Install in any project
npm install @alex900530/claude-persistent-memory
```

That's it. The postinstall script automatically:

- Configures `.mcp.json` (MCP server)
- Configures `.claude/settings.json` (5 hooks)
- Generates `.claude-memory.config.js` (config)
- Registers background services (embedding + LLM)
- Updates `.gitignore`

Open Claude Code in the project directory — memory is ready.
### Configure later

If you skipped Azure credentials during install:

```bash
npx claude-persistent-memory
```

This runs an interactive setup to configure credentials and start services.
### Manual install (from source)

```bash
git clone https://github.com/MIMI180306/claude-persistent-memory.git
cd claude-persistent-memory
npm install
cp config.default.js config.js
# Edit config.js with your Azure credentials

# Start services
npm run embedding-server   # Terminal 1
npm run llm-server         # Terminal 2
```

Then manually configure `.mcp.json` and `.claude/settings.json` — see Configuration.
## MCP Tools

| Tool | Description |
|------|-------------|
| `memory_search` | Hybrid BM25 + vector search. Params: `query`, `limit?`, `type?`, `domain?` |
| `memory_save` | Save a new memory. Params: `content`, `type?`, `domain?`, `confidence?` |
| `memory_validate` | Feedback loop — helpful (+0.1) or unhelpful (−0.05). Params: `memory_id`, `is_valid` |
| `memory_stats` | System stats: total memories, type/domain distribution, cluster status |
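The `memory_validate` feedback loop amounts to a small confidence update. A sketch of the arithmetic: the +0.1 / −0.05 deltas come from the tool description above, while clamping to [0, 1] is an assumption.

```javascript
// Sketch of the memory_validate confidence update (not the package's
// actual code). Deltas are from the tool table; the [0, 1] clamp is assumed.
function applyValidation(confidence, isValid) {
  const delta = isValid ? 0.1 : -0.05;
  return Math.min(1, Math.max(0, confidence + delta));
}
```

The asymmetry (a miss costs half as much as a hit gains) means a memory that helps even a third of the time still trends upward.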
## Hooks

| Hook | Event | Timeout | What it does |
|------|-------|---------|--------------|
| `user-prompt-hook.js` | UserPromptSubmit | 1500 ms | Embeds the user query → searches → injects top memories via stdout |
| `pre-tool-memory-hook.js` | PreToolUse | 300 ms | Embeds tool context → searches → injects via `additionalContext` |
| `post-tool-memory-hook.js` | PostToolUse | 300 ms | Embeds tool context + result → searches → injects via `additionalContext` |
| `pre-compact-hook.js` | PreCompact | async | Spawns LLM analysis of the full transcript → extracts memories |
| `session-end-hook.js` | SessionEnd | async | Incremental transcript analysis + clustering + mature-cluster merging |
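The injection step of a push-channel hook can be sketched like this. It is illustrative only, not the package's actual hook code: the memory field names mirror the `<what>` structuring, but everything else is an assumption.

```javascript
// Rough sketch of a push-channel hook's injection step: format retrieved
// memories as a context block. The `type`/`what` field names echo the
// structured-memory format; the output wording is illustrative.
function formatInjectedContext(memories) {
  if (memories.length === 0) return '';
  const items = memories.map((m) => `- [${m.type}] ${m.what}`).join('\n');
  return `Relevant memories:\n${items}`;
}

// A UserPromptSubmit hook would print this to stdout within its 1500 ms budget.
process.stdout.write(
  formatInjectedContext([{ type: 'bug', what: 'FTS5 queries need quoting' }])
);
```

Returning an empty string when nothing matches matters: injecting a header with no memories would waste context on every prompt.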
## Memory Types

| Type | Half-life | Use case |
|------|-----------|----------|
| `fact` | 90 days | Stable facts about the codebase |
| `decision` | 90 days | Architectural decisions and rationale |
| `bug` | 60 days | Bug fixes and root causes |
| `pattern` | 90 days | Recurring code patterns |
| `context` | 30 days | Session-specific context |
| `preference` | 60 days | User workflow preferences |
| `skill` | never | Promoted from mature clusters |
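Half-life decay is standard exponential decay. A sketch consistent with the table above; the exact decay law the package uses is an assumption:

```javascript
// Exponential decay with a per-type half-life. The half-lives come from the
// table above; the decay law itself is an assumed (standard) formulation.
// With an infinite half-life (skills), confidence never decays.
function decayedConfidence(confidence, ageDays, halfLifeDays) {
  if (!Number.isFinite(halfLifeDays)) return confidence; // skills: never
  return confidence * Math.pow(0.5, ageDays / halfLifeDays);
}

// A 90-day-old fact (half-life 90 d) retains exactly half its confidence.
```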
## Memory Lifecycle

1. **Save** → `memory_save` or auto-extraction from the transcript
2. **Structure** → LLM converts it to `<what>/<when>/<do>/<warn>` XML
3. **Embed** → bge-m3 generates a 1024-dim vector
4. **Search** → BM25 + vector similarity, combined ranking
5. **Validate** → `memory_validate` adjusts confidence up or down
6. **Cluster** → similar memories are auto-grouped
7. **Promote** → mature clusters become skill memories
8. **Decay** → low-confidence memories fade over time

## Uninstall

```bash
npx claude-persistent-memory-uninstall
```

Or manually: remove the memory entry from `.mcp.json`, remove the memory hooks from `.claude/settings.json`, then run `npm uninstall @alex900530/claude-persistent-memory`. The `.claude-memory/` data directory is preserved — delete it manually if no longer needed.
## Configuration

All settings live in `config.default.js` (auto-loaded; override via `.claude-memory.config.js`):

```js
module.exports = {
  embeddingPort: 23811,        // TCP port for embedding server
  llmPort: 23812,              // TCP port for LLM server
  dataDir: './data',           // memory.db lives here

  azure: {
    endpoint: process.env.AZURE_OPENAI_ENDPOINT || '',
    apiKey: process.env.AZURE_OPENAI_KEY || '',
    deployment: process.env.AZURE_OPENAI_DEPLOYMENT || 'gpt-4.1',
  },

  embedding: {
    model: 'Xenova/bge-m3',    // 1024 dimensions, 8192-token context
    dimensions: 1024,
  },

  search: {
    maxResults: 3,             // top-K results per query
    minSimilarity: 0.6,        // vector similarity threshold
  },

  cluster: {
    similarityThreshold: 0.70, // min similarity to join a cluster
    maturityCount: 5,          // memories needed for a mature cluster
  },
};
```

## Project Structure
```
claude-persistent-memory/
├── bin/                          # CLI scripts
│   ├── setup.js                  # postinstall + interactive setup
│   └── uninstall.js              # cleanup script
├── hooks/                        # Claude Code lifecycle hooks
│   ├── user-prompt-hook.js       # UserPromptSubmit → memory injection
│   ├── pre-tool-memory-hook.js   # PreToolUse → memory injection
│   ├── post-tool-memory-hook.js  # PostToolUse → memory injection
│   ├── pre-compact-hook.js       # PreCompact → transcript analysis
│   └── session-end-hook.js       # SessionEnd → clustering
├── lib/                          # Core libraries
│   ├── memory-db.js              # SQLite + FTS5 + sqlite-vec
│   ├── embedding-client.js       # TCP client for embedding server
│   ├── llm-client.js             # TCP client for LLM server
│   ├── compact-analyzer.js       # Transcript → memory extraction
│   └── utils.js                  # Minimal utilities
├── services/                     # Background services
│   ├── embedding-server.js       # TCP embedding service (bge-m3)
│   ├── llm-server.js             # TCP LLM proxy (Azure OpenAI)
│   └── memory-mcp-server.js      # MCP server for Claude Code
├── tools/
│   └── rebuild-vectors.js        # Rebuild all embeddings
├── config.default.js             # Configuration template
├── CLAUDE.md                     # Claude Code project instructions
└── package.json
```

## Requirements

- Node.js >= 18
- macOS or Linux
- ~2 GB RAM for the embedding model (bge-m3)
- Azure OpenAI API access (for LLM structuring)
## Notes

- **LLM provider**: Currently supports Azure OpenAI only. For standard OpenAI or other providers, modify `services/llm-server.js`.
- **Ports**: The embedding and LLM servers default to TCP 23811 / 23812. Change them in `config.js` if needed.
- **Data**: The `.claude-memory/` directory (containing `memory.db` and logs) is created automatically and gitignored.
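As a starting point for such a provider swap, the main difference is how the chat-completions request is addressed. A hypothetical sketch: the function and config names are illustrative, not the actual code in `services/llm-server.js`; only the URL and header shapes follow the two providers' public APIs.

```javascript
// Hypothetical sketch of the Azure vs. standard-OpenAI request difference.
// Azure addresses a deployment under the resource endpoint with an `api-key`
// header; standard OpenAI uses a fixed URL with a Bearer token.
function buildChatRequest(provider, cfg) {
  if (provider === 'azure') {
    return {
      url: `${cfg.endpoint}/openai/deployments/${cfg.deployment}/chat/completions?api-version=2024-02-01`,
      headers: { 'api-key': cfg.apiKey },
    };
  }
  return {
    url: 'https://api.openai.com/v1/chat/completions',
    headers: { Authorization: `Bearer ${cfg.apiKey}` },
  };
}
```

The request body (`messages`, `temperature`, etc.) is the same shape for both, which is why the swap is largely a matter of URL and auth header.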
## Contributing
Contributions are welcome! Please read the Contributing Guide before submitting a PR.
## License
MIT © MIMI180306
