code-memory
v0.1.1
Persistent memory for AI coding — semantic search, git history analysis, and intelligent context preservation
Code Memory
Persistent memory for AI coding - Never lose context again
Code Memory is an MCP (Model Context Protocol) server that gives AI coding assistants long-term memory of your codebase through semantic search, git history analysis, and intelligent context preservation.
The Problem
AI coding assistants forget:
- 🤔 Context between sessions - "Why did we use JWT instead of sessions?"
- 🔍 Architectural decisions - "What was the rationale for this design?"
- 📚 Historical discussions - "We already tried that approach last month"
- 🕸️ Code relationships - "What else depends on this module?"
The Solution
Code Memory provides:
- 🧠 Semantic search - Find code by meaning, not just keywords
- 📚 Git history analysis - Extract decisions from commit messages and PRs
- 🕸️ Dependency graphs - Understand code relationships across 16 languages
- 📝 Session learning - Learn patterns from your coding sessions
- 💾 Persistent knowledge - Remember important facts across sessions
Quick Start
Installation
npm install -g code-memory
Basic Usage
# Initialize in your project
cd your-project
code-memory init
# Index your codebase
code-memory reindex
# Start the MCP server
code-memory serve
# Search from CLI
code-memory search "authentication flow"
Configure with Claude Code
Add to your ~/.claude/config.json:
{
  "mcpServers": {
    "code-memory": {
      "command": "code-memory",
      "args": ["serve"]
    }
  }
}
Features
🔍 Semantic Search
Find code by meaning, not just keywords:
code-memory search "user authentication logic"
# Finds auth code even if it doesn't contain the word "user"
How it works:
- Uses fastembed for local embeddings (all-MiniLM-L6-v2)
- No API calls, completely offline
- Understands intent, not just text matching
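As a rough illustration of what ranking by meaning involves, the sketch below scores files by cosine similarity between embedding vectors. The three-dimensional vectors and the `rank_by_similarity` helper are invented for this example; real all-MiniLM-L6-v2 embeddings have 384 dimensions, and this is not necessarily how Code Memory ranks results internally.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def rank_by_similarity(query_vec, doc_vecs, limit=10):
    """Return (name, score) pairs sorted from most to least similar."""
    scored = [(name, cosine_similarity(query_vec, vec)) for name, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]

# Toy vectors standing in for real embeddings (hypothetical values).
docs = {
    "auth/login.rs": [0.9, 0.1, 0.0],
    "billing/invoice.rs": [0.0, 0.2, 0.9],
    "auth/token.rs": [0.8, 0.3, 0.1],
}
query = [1.0, 0.2, 0.0]  # pretend embedding of "user authentication logic"

results = rank_by_similarity(query, docs, limit=2)
for name, score in results:
    print(f"{score:.3f}  {name}")
```

Because the score depends on vector direction rather than shared words, a file can rank highly without containing any of the query's terms.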
📝 Full-Text Search
Fast keyword-based search powered by tantivy:
code-memory search --fulltext "async function"
📚 Git History Analysis
Extract architectural decisions from your git history:
code-memory trace-decision "why microservices"
Finds:
- Explicit decisions in commit messages
- Architectural choices from PR descriptions
- Rationale and "why" statements
- Refactoring decisions
- Breaking changes
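One simple way to surface rationale in commit messages is a keyword heuristic. The sketch below is an assumption about how such extraction could work, not a description of Code Memory's extractor; the cue phrases, the `extract_decisions` helper, and the sample commits are all hypothetical.

```python
import re

# Hypothetical cue phrases; the real extractor may use entirely different rules.
DECISION_CUES = re.compile(
    r"\b(because|instead of|decided to|in order to|so that|switch(?:ed)? to|BREAKING CHANGE)\b",
    re.IGNORECASE,
)

def extract_decisions(commits, topic):
    """Return commits that mention the topic and contain a rationale cue."""
    topic_words = [w.lower() for w in topic.split()]
    matches = []
    for sha, message in commits:
        text = message.lower()
        if any(word in text for word in topic_words) and DECISION_CUES.search(message):
            matches.append((sha, message))
    return matches

# Made-up commit history as (sha, message) pairs.
commits = [
    ("a1b2c3d", "Switch to JWT instead of sessions because services must stay stateless"),
    ("d4e5f6a", "Fix typo in README"),
    ("0718293", "Bump JWT library version"),
]

found = extract_decisions(commits, "jwt")
for sha, message in found:
    print(sha, message)
```

Only the first commit is returned: the third mentions JWT but gives no rationale, which is the distinction a decision search needs to make.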
🕸️ Dependency Graphs
Understand what depends on what:
code-memory find-related "UserService"
Analyzes:
- Import relationships
- Co-change patterns (files changed together)
- Dependency chains
- Most coupled modules
Supported languages: Rust, TypeScript, JavaScript, Python, Go, Java, C++, C#, Ruby, PHP, Swift, Kotlin, Scala, Haskell, Elixir, Clojure
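Co-change patterns can be approximated by counting how often two files appear in the same commit. The following is a minimal sketch under that assumption; the `co_change_counts` helper and the sample history are illustrative only. In practice the per-commit file lists could come from something like `git log --name-only`.

```python
from collections import Counter
from itertools import combinations

def co_change_counts(commits):
    """Count how often each pair of files appears in the same commit."""
    pairs = Counter()
    for files in commits:
        # Sort so that (a, b) and (b, a) are counted as the same pair.
        for a, b in combinations(sorted(set(files)), 2):
            pairs[(a, b)] += 1
    return pairs

# Each inner list is the set of files touched by one commit (made-up data).
history = [
    ["src/user_service.rs", "src/auth.rs"],
    ["src/user_service.rs", "src/auth.rs", "src/db.rs"],
    ["src/db.rs", "README.md"],
]

counts = co_change_counts(history)
top_pair, top_count = counts.most_common(1)[0]
print(top_pair, top_count)
```

Pairs with high counts are candidates for "most coupled modules" even when neither file imports the other.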
📊 Session Tracking
Learn from your coding sessions:
code-memory sessions
Extracts:
- Architectural decisions made in sessions
- Error-and-fix patterns
- Refactoring approaches
- Testing strategies
💾 Persistent Knowledge
Remember important facts:
# Via MCP server
remember({
  key: "auth-strategy",
  value: "We use JWT for auth because of scalability requirements"
})
MCP Tools
Code Memory provides 7 MCP tools for AI assistants:
1. search_code
Search codebase with semantic + full-text search:
search_code({
  query: "authentication flow",
  limit: 10
})
2. explain_code
Get detailed explanation of a symbol:
explain_code({
  symbol: "UserService",
  context_lines: 20
})
3. trace_decision
Find why a decision was made:
trace_decision({
  topic: "why microservices",
  max_results: 5
})
4. find_related
Find related code and dependencies:
find_related({
  symbol: "AuthController",
  relationship_type: "both" // "depends_on" | "depended_by" | "both"
})
5. remember
Store persistent knowledge:
remember({
  key: "auth-strategy",
  value: "We use JWT for stateless auth across microservices"
})
6. index_project
Manually trigger reindexing:
index_project({
  force: true
})
7. get_session_patterns
Retrieve learned patterns:
get_session_patterns({
  pattern_type: "architecture", // or "errors", "refactoring", "testing"
  min_confidence: 0.7
})
CLI Commands
init
Initialize Code Memory in a project:
code-memory init
Creates .code-memory/ directory with:
- config.toml - Configuration
- index/ - Search index
- knowledge.json - Persistent knowledge
serve
Start the MCP server:
code-memory serve
reindex
Rebuild the code index:
code-memory reindex          # Incremental
code-memory reindex --force  # Full rebuild
search
Search from the command line:
code-memory search "query"
code-memory search "query" --lang rust
code-memory search "query" --limit 20
stats
Show index statistics:
code-memory stats
sessions
View session patterns:
code-memory sessions
code-memory sessions -n 10 # Top 10
code-memory sessions -c 0.8 # Min confidence 0.8
code-memory sessions -f json # JSON output
export / import
Backup and restore knowledge:
code-memory export knowledge.json
code-memory import knowledge.json
Configuration
Configuration is stored in .code-memory/config.toml:
[indexing]
# File patterns to index
include = ["**/*.rs", "**/*.ts", "**/*.js", "**/*.py"]
# File patterns to ignore
exclude = ["**/node_modules/**", "**/target/**", "**/.git/**"]
# Maximum file size (in bytes)
max_file_size = 1048576 # 1MB
[search]
# Number of results to return by default
default_limit = 10
# Minimum relevance score (0.0-1.0)
min_score = 0.5
[git]
# Analyze git history for decisions
analyze_history = true
# How far back to look (in days)
history_depth = 365
[embedding]
# Embedding model (local, no API calls)
model = "all-MiniLM-L6-v2"
# Embedding dimension
dimension = 384
Architecture
code-memory/
├── src/
│   ├── indexer/           # Code indexing with tantivy
│   │   ├── walker.rs      # File system traversal
│   │   ├── parser.rs      # Symbol extraction
│   │   └── code_index.rs
│   ├── search/            # Search engines
│   │   ├── fulltext.rs    # Tantivy full-text search
│   │   ├── semantic.rs    # Fastembed semantic search
│   │   └── hybrid.rs      # Combined ranking
│   ├── git/               # Git history analysis
│   │   ├── history.rs     # Commit parsing
│   │   └── decisions.rs   # Decision extraction
│   ├── graph/             # Dependency graphs
│   │   ├── imports.rs     # Import parsing
│   │   └── analyzer.rs    # Graph analysis
│   ├── sessions/          # Session tracking
│   │   ├── tracker.rs     # Event extraction
│   │   └── patterns.rs    # Pattern learning
│   ├── mcp/               # MCP server
│   │   ├── server.rs      # JSON-RPC server
│   │   ├── tools.rs       # Tool handlers
│   │   └── protocol.rs    # MCP protocol
│   └── cli.rs             # CLI interface
└── .code-memory/
    ├── config.toml        # Configuration
    ├── index/             # Search index
    └── knowledge.json     # Persistent facts
Performance
- Indexing speed: ~10,000 files/minute
- Search latency: <100ms (full-text), <200ms (semantic)
- Memory usage: ~100MB for 50k files
- Binary size: 8MB (excluding the embedding model, a separate 90MB download)
Pricing
- Free: Up to 5,000 files, basic search
- Pro ($20/month):
- Unlimited files
- Advanced query optimization
- Team knowledge sharing
- Priority support
Supported Languages
Full support (symbol extraction + imports):
- Rust, TypeScript, JavaScript, Python, Go
- Java, C++, C#, Ruby, PHP
- Swift, Kotlin, Scala, Haskell, Elixir, Clojure
Additional languages supported for full-text search only.
Comparison
| Feature | Code Memory | grep/ripgrep | GitHub Copilot |
|---------|-------------|--------------|----------------|
| Semantic search | ✅ | ❌ | ✅ (API) |
| Offline | ✅ | ✅ | ❌ |
| Git history | ✅ | ❌ | ❌ |
| Dependency graphs | ✅ | ❌ | ❌ |
| Session learning | ✅ | ❌ | ✅ |
| MCP integration | ✅ | ❌ | ❌ |
| Cost | Free/$20 | Free | $10+/mo |
FAQ
How is this different from grep/ripgrep?
Code Memory understands meaning, not just text:
- "user auth" finds authentication code even without those exact words
- Extracts architectural decisions from git history
- Understands code relationships across files
- Learns from your coding sessions
Does it send my code to an API?
No. Everything runs locally:
- Embeddings generated on your machine (fastembed)
- Search index stored locally (tantivy)
- No network calls during normal operation
- Your code never leaves your computer
How much disk space does it use?
Approximately:
- ~50MB per 10,000 files indexed
- Embedding model: 90MB (downloaded once)
- Total: ~150-300MB for a typical project
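The figures above can be turned into a quick estimate. This sketch assumes index size grows linearly with file count, which the numbers suggest but do not guarantee; `estimate_disk_mb` is a hypothetical helper, not part of the CLI.

```python
MB_PER_10K_FILES = 50  # index size per 10,000 files, from the estimate above
MODEL_SIZE_MB = 90     # embedding model, downloaded once

def estimate_disk_mb(file_count):
    """Approximate disk usage in MB: index size plus the one-time model download."""
    index_mb = file_count / 10_000 * MB_PER_10K_FILES
    return index_mb + MODEL_SIZE_MB

for files in (10_000, 40_000):
    print(files, "files ->", round(estimate_disk_mb(files)), "MB")
```

Under this assumption, projects of roughly 10,000 to 40,000 files land in the 150-300MB range quoted for a typical project.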
Can I use it with VS Code / other editors?
Yes! Code Memory is an MCP server, so it works with any MCP-compatible tool:
- Claude Code CLI
- Any editor with MCP support
- Custom integrations via MCP protocol
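For a custom integration, MCP tools are invoked with JSON-RPC 2.0 requests using the `tools/call` method. The sketch below only builds such a request for the `search_code` tool; the `build_tool_call` helper is hypothetical, and a real client would also need to perform the MCP initialization handshake and handle the transport, neither of which is shown here.

```python
import json

def build_tool_call(request_id, tool_name, arguments):
    """Build a JSON-RPC 2.0 request that invokes an MCP tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }

request = build_tool_call(1, "search_code", {"query": "authentication flow", "limit": 10})
payload = json.dumps(request)
print(payload)
```

The tool name and arguments mirror the `search_code` example in the MCP Tools section; any other tool is called the same way with its own argument object.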
What about private repositories?
Code Memory only accesses files you explicitly index. It never:
- Uploads code to remote servers
- Shares data with third parties
- Requires authentication or accounts (for free tier)
Troubleshooting
Index not building
# Check for errors
code-memory reindex --verbose
# Force rebuild
code-memory reindex --force
# Check config
cat .code-memory/config.toml
Search returns no results
# Verify index exists
code-memory stats
# Rebuild index
code-memory reindex --force
# Check file patterns in config
MCP server not starting
# Check if port is in use
lsof -i :8080
# Start with verbose logging
code-memory serve --verbose
Development
Building from source
git clone https://github.com/mstuart/code-memory.git
cd code-memory
cargo build --release
Running tests
cargo test                   # Run all tests
cargo test --lib             # Unit tests only
cargo test --test mcp_tools  # Integration tests
Contributing
Contributions welcome! Please:
- Fork the repository
- Create a feature branch
- Add tests for new features
- Ensure all tests pass
- Submit a pull request
License
MIT License - see LICENSE for details
Acknowledgments
Built with:
- tantivy - Full-text search
- fastembed - Local embeddings
- git2 - Git integration
- petgraph - Dependency graphs
- tree-sitter - Code parsing
Give your AI assistant a memory. Never lose context again. 🧠
npm install -g code-memory