CodeVault
AI-powered semantic code search via Model Context Protocol (MCP)
CodeVault is an intelligent code indexing and search system that enables AI assistants to understand and navigate your codebase using semantic search, symbol-aware ranking, and hybrid retrieval techniques.
🌟 Features
- 🔍 Semantic Search: Find code by meaning, not just keywords using vector embeddings
- 🤖 MCP Integration: Native support for Claude Desktop and other MCP clients (search, synthesize, update tools)
- 💬 LLM-Synthesized Answers: Ask questions in natural language, get markdown responses with code citations
- 🗣️ Interactive Chat Mode: Have multi-turn conversations about your codebase with conversation history
- 🎯 Symbol-Aware Ranking: Boost results based on function signatures, parameters, and relationships
- ⚡ Hybrid Retrieval: Combines vector embeddings with BM25 keyword matching via Reciprocal Rank Fusion
- 🚀 Batch Processing: Efficient API usage with configurable batching (50 chunks/batch by default)
- 🧪 Integration Tests: End-to-end coverage for indexing, search, encryption, and watch flows
- 📦 Smart Chunking: Token-aware semantic code splitting with overlap for optimal context
- 🔄 Context Packs: Save and reuse search scopes for different features/modules
- 🏠 Local-First: Works with local models (Ollama) or cloud APIs (OpenAI, Nebius, OpenRouter)
- 🔐 Optional Encryption: AES-256-GCM encryption for indexed code chunks (multi-key/rotation support)
- ⚙️ Global Configuration: One-time setup with interactive wizard for CLI convenience
- 📊 Multi-Language Support: 25+ programming languages via Tree-sitter
- 🔎 File Watching: Real-time index updates with debounced change detection and provider reuse
- ✅ CI/CD: Typecheck + lint + tests on PR/main; auto-publish to npm on version bumps (with NPM_TOKEN)
- ⏱️ Rate Limiting: Intelligent request/token throttling with automatic retry
- 💾 Memory Efficient: LRU caches with automatic cleanup for long-running processes
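The optional encryption feature above uses AES-256-GCM. As a rough, hypothetical sketch of what encrypting an indexed chunk involves, using only Node's built-in crypto module (these function names are illustrative, not CodeVault's API):

```typescript
// Hypothetical sketch of AES-256-GCM chunk encryption with Node's crypto module.
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

function encryptChunk(plaintext: string, key: Buffer): Buffer {
  const iv = randomBytes(12); // unique nonce per chunk
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  // Store iv + auth tag alongside the ciphertext so decryption can verify integrity.
  return Buffer.concat([iv, cipher.getAuthTag(), ciphertext]);
}

function decryptChunk(blob: Buffer, key: Buffer): string {
  const iv = blob.subarray(0, 12);
  const tag = blob.subarray(12, 28); // GCM auth tag is 16 bytes
  const ciphertext = blob.subarray(28);
  const decipher = createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString("utf8");
}

// A 32-byte key, as `openssl rand -base64 32` would produce:
const key = randomBytes(32);
const roundTrip = decryptChunk(encryptChunk("function login() {}", key), key);
```

Because GCM is authenticated, a wrong key or tampered ciphertext fails loudly at decryption rather than yielding garbage.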
📋 Prerequisites
- Node.js: Version 18.0.0 or higher
- npm: Comes with Node.js
- API Key: OpenAI API key or compatible endpoint (Ollama for local models)
🚀 Quick Start
Installation
NPM (Global - Recommended)
# Install latest version
npm install -g codevault
# Interactive configuration setup (one-time)
codevault config init
# Index your project
cd /path/to/your/project
codevault index
From Source
git clone https://github.com/shariqriazz/codevault.git
cd codevault
npm install --legacy-peer-deps
npm run build
npm link
Configuration
CodeVault supports multiple configuration methods with clear priority:
Priority: Environment Variables > Project Config > Global Config > Defaults
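The precedence above can be sketched as a simple merge where later (higher-priority) sources overwrite earlier ones, skipping keys a source leaves unset. This is an illustration of the documented priority order, not CodeVault's internal implementation; the key and model names are placeholders:

```typescript
// Illustrative config merge: Environment > Project > Global > Defaults.
type Config = Record<string, string | undefined>;

// Drop keys a source did not actually set, so they can't mask lower-priority values.
const dropUnset = (c: Config): Config =>
  Object.fromEntries(Object.entries(c).filter(([, v]) => v !== undefined));

function mergeConfig(defaults: Config, globalCfg: Config, project: Config, env: Config): Config {
  // Later spreads win, so sources are listed lowest-priority first.
  return { ...dropUnset(defaults), ...dropUnset(globalCfg), ...dropUnset(project), ...dropUnset(env) };
}

const resolved = mergeConfig(
  { "providers.openai.model": "default-embedding-model" },  // built-in default (placeholder)
  { "providers.openai.model": "Qwen/Qwen3-Embedding-8B" },  // ~/.codevault/config.json
  {},                                                       // project config: nothing set
  { "providers.openai.model": undefined }                   // env var not set in this example
);
// resolved["providers.openai.model"] comes from the global config,
// because no higher-priority source set it.
```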
Option 1: Interactive Setup (Recommended for CLI)
codevault config init
Guides you through:
- Provider selection (OpenAI, Ollama, Custom API)
- API key configuration
- Model selection (preset or custom)
- Advanced settings (rate limits, encryption, reranking)
Configuration saved to ~/.codevault/config.json
Option 2: Quick Setup with Nebius (Qwen Embeddings)
# Set up Nebius for embeddings (Qwen3-Embedding-8B)
codevault config set providers.openai.apiKey your-nebius-api-key
codevault config set providers.openai.baseUrl https://api.studio.nebius.com/v1
codevault config set providers.openai.model Qwen/Qwen3-Embedding-8B
codevault config set providers.openai.dimensions 4096
codevault config set maxTokens 32000
# Set up OpenRouter for chat (Claude Sonnet 4.5)
codevault config set chatLLM.openai.apiKey your-openrouter-api-key
codevault config set chatLLM.openai.baseUrl https://openrouter.ai/api/v1
codevault config set chatLLM.openai.model anthropic/claude-sonnet-4.5
# Optional: Enable reranking with Novita (Qwen3-Reranker)
codevault config set reranker.apiUrl https://api.novita.ai/openai/v1/rerank
codevault config set reranker.apiKey your-novita-api-key
codevault config set reranker.model qwen/qwen3-reranker-8b
Option 3: OpenRouter with Provider Routing
# Use OpenRouter with provider routing to control which providers handle requests
# Force Nebius for embeddings (ZDR + best quality)
codevault config set providers.openai.apiKey your-openrouter-api-key
codevault config set providers.openai.baseUrl https://openrouter.ai/api/v1
codevault config set providers.openai.model qwen/qwen3-embedding-8b
codevault config set providers.openai.dimensions 4096
codevault config set providers.openai.routing.only '["nebius"]'
codevault config set providers.openai.routing.allow_fallbacks false
# Prioritize throughput for chat
codevault config set chatLLM.openai.apiKey your-openrouter-api-key
codevault config set chatLLM.openai.baseUrl https://openrouter.ai/api/v1
codevault config set chatLLM.openai.model anthropic/claude-sonnet-4.5
codevault config set chatLLM.openai.routing.sort throughput
# Optional: Enforce ZDR (Zero Data Retention) + deny data collection
codevault config set providers.openai.routing.zdr true
codevault config set providers.openai.routing.data_collection deny
See Provider Routing Guide for all available options.
Option 4: Environment Variables (MCP / CI/CD)
# Embedding Provider (Nebius + Qwen)
export CODEVAULT_EMBEDDING_API_KEY=your-nebius-api-key
export CODEVAULT_EMBEDDING_BASE_URL=https://api.studio.nebius.com/v1
export CODEVAULT_EMBEDDING_MODEL=Qwen/Qwen3-Embedding-8B
export CODEVAULT_EMBEDDING_DIMENSIONS=4096
export CODEVAULT_EMBEDDING_MAX_TOKENS=32000
# Chat LLM (OpenRouter + Claude)
export CODEVAULT_CHAT_API_KEY=your-openrouter-api-key
export CODEVAULT_CHAT_BASE_URL=https://openrouter.ai/api/v1
export CODEVAULT_CHAT_MODEL=anthropic/claude-sonnet-4.5
# Reranking (Novita + Qwen)
export CODEVAULT_RERANK_API_URL=https://api.novita.ai/openai/v1/rerank
export CODEVAULT_RERANK_API_KEY=your-novita-api-key
export CODEVAULT_RERANK_MODEL=qwen/qwen3-reranker-8b
See Configuration Guide for complete details.
Index Your Project
# Using global config (if set via codevault config init)
codevault index
# Using Nebius + Qwen embeddings
export CODEVAULT_EMBEDDING_API_KEY=your-key
export CODEVAULT_EMBEDDING_BASE_URL=https://api.studio.nebius.com/v1
export CODEVAULT_EMBEDDING_MODEL=Qwen/Qwen3-Embedding-8B
codevault index
# Using local Ollama
export CODEVAULT_EMBEDDING_BASE_URL=http://localhost:11434/v1
export CODEVAULT_EMBEDDING_MODEL=nomic-embed-text
codevault index
# With encryption
export CODEVAULT_ENCRYPTION_KEY=$(openssl rand -base64 32)
codevault index --encrypt on
# Watch for changes (auto-update index)
codevault watch --debounce 500
Search Your Code
# Basic search
codevault search "authentication function"
# Search with filters
codevault search "stripe checkout" --tags stripe --lang php
# Search with full code chunks
codevault search-with-code "database connection" --limit 5
# Ask questions with LLM-synthesized answers
codevault ask "How does authentication work in this codebase?"
codevault ask "How do I add a new payment provider?" --multi-query --stream
# Start interactive chat (NEW!)
codevault chat
# Features:
# - Multi-turn conversations with history
# - Maintains context across questions
# - Commands: /help, /history, /clear, /stats, /exit
# - Configurable history window (--max-history)
# View project stats
codevault info
Use with Claude Desktop
See complete setup guide: MCP Setup Guide
Quick setup - Add to your claude_desktop_config.json:
{
"mcpServers": {
"codevault": {
"command": "npx",
"args": ["-y", "codevault", "mcp"],
"env": {
"CODEVAULT_EMBEDDING_API_KEY": "your-nebius-api-key",
"CODEVAULT_EMBEDDING_BASE_URL": "https://api.studio.nebius.com/v1",
"CODEVAULT_EMBEDDING_MODEL": "Qwen/Qwen3-Embedding-8B",
"CODEVAULT_EMBEDDING_DIMENSIONS": "4096",
"CODEVAULT_CHAT_API_KEY": "your-openrouter-api-key",
"CODEVAULT_CHAT_BASE_URL": "https://openrouter.ai/api/v1",
"CODEVAULT_CHAT_MODEL": "anthropic/claude-sonnet-4.5",
"CODEVAULT_RERANK_API_URL": "https://api.novita.ai/openai/v1/rerank",
"CODEVAULT_RERANK_API_KEY": "your-novita-api-key",
"CODEVAULT_RERANK_MODEL": "qwen/qwen3-reranker-8b"
}
}
}
}
🛠️ Development & CI/CD
- CI runs on PRs and pushes to main: npm run typecheck, npm run lint, and npm test.
- Version bumps on main auto-publish to npm when NPM_TOKEN is configured in repo secrets.
📖 Documentation
- Configuration Guide - Complete configuration options
- MCP Setup Guide - Claude Desktop integration
- Ask Feature Guide - LLM-synthesized Q&A
- CLI Reference - All commands and options
- API Providers - Embedding, chat, and reranking providers
- Advanced Features - Chunking, encryption, context packs
Quick Links
# Configuration Management
codevault config init # Interactive setup wizard
codevault config set <key> <value> # Set global config value
codevault config list # Show merged config
# Indexing
codevault index [path] # Index project
codevault update [path] # Update existing index
codevault watch [path] # Watch for changes
# Searching
codevault search <query> # Search code (metadata only)
codevault search-with-code <query> # Search with full code chunks
codevault ask <question> # Ask questions, get synthesized answers
codevault chat # Interactive conversation mode (NEW!)
# Context Packs
codevault context list # List saved contexts
codevault context use <name> # Activate context pack
# Utilities
codevault info # Project statistics
codevault mcp                      # Start MCP server
🏗️ Architecture
How It Works
Indexing Phase
- Parses source files using Tree-sitter
- Extracts symbols, signatures, and relationships
- Creates semantic chunks (token-aware, with overlap)
- Batch generates embeddings (50 chunks/batch)
- Stores in SQLite + compressed chunks on disk
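The "token-aware chunks with overlap" step can be pictured with a simplified splitter. The real chunker is Tree-sitter-aware and counts model tokens; this sketch just slides a fixed window over a token array so each chunk shares context with its neighbor:

```typescript
// Simplified sketch of chunking with overlap (not CodeVault's actual splitter).
function chunkWithOverlap(tokens: string[], maxTokens: number, overlap: number): string[][] {
  if (overlap >= maxTokens) throw new Error("overlap must be smaller than maxTokens");
  const chunks: string[][] = [];
  const step = maxTokens - overlap; // advance by less than a full window
  for (let start = 0; start < tokens.length; start += step) {
    chunks.push(tokens.slice(start, start + maxTokens));
    if (start + maxTokens >= tokens.length) break; // last window reached the end
  }
  return chunks;
}

const chunks = chunkWithOverlap("a b c d e f g h".split(" "), 4, 2);
// Each chunk repeats the last `overlap` tokens of the previous one,
// so a definition split across a boundary still appears whole in some chunk.
```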
Search Phase
- Generates query embedding
- Performs vector similarity search
- Runs BM25 keyword search (if enabled)
- Applies Reciprocal Rank Fusion
- Boosts results based on symbol matching
- Optionally applies API reranking
- Returns ranked results with metadata
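Reciprocal Rank Fusion, the step that merges the vector and BM25 result lists, scores each document by summing 1 / (k + rank) across the lists (k = 60 is the conventional constant). A minimal sketch, with hypothetical file names as documents:

```typescript
// Reciprocal Rank Fusion: combine multiple rankings into one.
function rrfFuse(rankings: string[][], k = 60): [string, number][] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((doc, i) => {
      // rank is 1-based, so document at index i contributes 1 / (k + i + 1)
      scores.set(doc, (scores.get(doc) ?? 0) + 1 / (k + i + 1));
    });
  }
  return [...scores.entries()].sort((a, b) => b[1] - a[1]);
}

const vectorHits = ["auth.ts", "login.ts", "db.ts"];       // semantic ranking
const keywordHits = ["auth.ts", "session.ts", "login.ts"]; // BM25 ranking
const fused = rrfFuse([vectorHits, keywordHits]);
// Documents ranked highly by both lists float to the top.
```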
LLM Synthesis Phase (Ask Feature)
- Searches for relevant code chunks
- Retrieves full code content
- Builds context prompt with metadata
- Generates natural language answer via chat LLM
- Returns markdown with code citations
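The "context prompt with metadata" step amounts to interleaving retrieved chunks with their file locations so the chat LLM can cite them. A hypothetical sketch (the exact prompt format is CodeVault's internal detail; this only illustrates the shape):

```typescript
// Illustrative context-prompt assembly for the ask feature.
interface Chunk {
  file: string;
  startLine: number;
  code: string;
}

function buildPrompt(question: string, chunks: Chunk[]): string {
  // Number each chunk so the model can cite sources as [n].
  const context = chunks
    .map((c, i) => `[${i + 1}] ${c.file}:${c.startLine}\n${c.code}`)
    .join("\n\n");
  return `Answer using only the code below. Cite sources as [n].\n\n${context}\n\nQuestion: ${question}`;
}

const prompt = buildPrompt("How does login work?", [
  { file: "src/auth.ts", startLine: 12, code: "export function login() { /* ... */ }" },
]);
```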
Interactive Chat Phase (Chat Feature)
- Maintains conversation history (last N turns)
- Performs fresh semantic search for each question
- Combines conversation context + new code chunks
- Generates conversational responses with continuity
- Supports commands: /help, /history, /clear, /stats
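The "last N turns" behavior (what --max-history configures) is essentially a sliding window over the conversation. A trivial sketch:

```typescript
// Bounded conversation history: keep only the most recent turns.
interface Turn {
  role: "user" | "assistant";
  content: string;
}

function windowHistory(history: Turn[], maxTurns: number): Turn[] {
  return history.slice(-maxTurns); // negative slice keeps the tail
}

const history: Turn[] = [
  { role: "user", content: "q1" },
  { role: "assistant", content: "a1" },
  { role: "user", content: "q2" },
  { role: "assistant", content: "a2" },
  { role: "user", content: "q3" },
];
const recent = windowHistory(history, 4); // oldest turn "q1" is dropped
```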
Supported Languages
- Web: JavaScript, TypeScript, TSX, HTML, CSS, JSON, Markdown
- Backend: Python, PHP, Go, Java, Kotlin, C#, Ruby, Scala, Swift
- Systems: C, C++, Rust
- Functional: Haskell, OCaml, Elixir
- Scripting: Bash, Lua
Recommended Providers
| Purpose | Provider | Model | Context | Best For |
|---------|----------|-------|---------|----------|
| Embeddings | Nebius | Qwen/Qwen3-Embedding-8B | 32K | High quality, large context |
| Embeddings | Ollama | nomic-embed-text | 8K | Local, privacy-focused |
| Chat LLM | OpenRouter | anthropic/claude-sonnet-4.5 | 200K | Best code understanding |
| Chat LLM | Ollama | qwen2.5-coder:7b | 32K | Local, code-specialized |
| Reranking | Novita | qwen/qwen3-reranker-8b | 32K | Best for code reranking |
Advanced: Use OpenRouter Provider Routing to control which providers handle your requests (cost, throughput, data retention, etc.)
🔧 Performance & Optimization
Memory Management
- LRU caches with automatic eviction
- Periodic cache cleanup (configurable interval)
- Graceful shutdown handlers for MCP server
- Token counter caching for repeated operations
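An LRU cache with automatic eviction can be sketched using a JavaScript Map, whose iteration order is insertion order. This is a generic illustration, not CodeVault's implementation:

```typescript
// Minimal LRU cache sketch: Map iteration order doubles as recency order.
class LruCache<K, V> {
  private map = new Map<K, V>();
  constructor(private capacity: number) {}

  get(key: K): V | undefined {
    if (!this.map.has(key)) return undefined;
    const value = this.map.get(key)!;
    this.map.delete(key); // re-insert to mark as most recently used
    this.map.set(key, value);
    return value;
  }

  set(key: K, value: V): void {
    if (this.map.has(key)) this.map.delete(key);
    else if (this.map.size >= this.capacity) {
      // Evict the least recently used entry (first key in insertion order).
      this.map.delete(this.map.keys().next().value as K);
    }
    this.map.set(key, value);
  }
}

const cache = new LruCache<string, number>(2);
cache.set("a", 1);
cache.set("b", 2);
cache.get("a");    // "a" becomes most recently used
cache.set("c", 3); // capacity exceeded: "b" is evicted, "a" survives
```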
Rate Limiting
Intelligent throttling prevents API errors:
- Configurable RPM (requests per minute)
- Configurable TPM (tokens per minute)
- Automatic retry with exponential backoff
- Queue size limits prevent memory exhaustion
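The "automatic retry with exponential backoff" behavior looks roughly like the following. This is illustrative only; CodeVault's throttler additionally budgets RPM/TPM and caps queue size, which this sketch omits:

```typescript
// Retry with exponential backoff: delay doubles after each failed attempt.
async function withRetry<T>(fn: () => Promise<T>, maxAttempts = 4, baseMs = 250): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt + 1 >= maxAttempts) throw err; // out of attempts: give up
      const delayMs = baseMs * 2 ** attempt;     // 250ms, 500ms, 1s, ...
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
}

// Simulated flaky call: fails twice with a 429, succeeds on the third try.
let calls = 0;
const result = await withRetry(async () => {
  calls++;
  if (calls < 3) throw new Error("429 Too Many Requests");
  return "ok";
}, 4, 10); // tiny base delay just to keep the demo fast
```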
Batch Efficiency
- 50 chunks per embedding API call (vs 1 per call)
- Reduces API overhead by ~98%
- Automatic fallback for failed batches
- Preserves partial progress on errors
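The arithmetic behind "~98%": sending 50 chunks per request means 1/50 of the calls, i.e. a 98% reduction in request count. Grouping chunks into batches is just array slicing:

```typescript
// Group items into fixed-size batches for embedding API calls.
function toBatches<T>(items: T[], batchSize = 50): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}

const chunks = Array.from({ length: 1040 }, (_, i) => `chunk-${i}`);
const batches = toBatches(chunks);
// 1040 chunks -> 21 API calls instead of 1040; the final batch holds the
// remaining 40 chunks, so nothing is dropped.
```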
🐛 Troubleshooting
Common Issues
"Which config is being used?"
codevault config list --sources
"MCP not using my global config"
This is correct! MCP uses environment variables by design. Global config is for CLI convenience only.
"Rate limit errors"
# Reduce rate limits
codevault config set rateLimit.rpm 100
codevault config set rateLimit.tpm 10000
"Out of memory during indexing"
# Adjust batch size (default 100) via environment
export BATCH_SIZE=50 # Moderate
export BATCH_SIZE=25 # Conservative
codevault index
"Encryption key errors"
# Generate valid key (32 bytes)
export CODEVAULT_ENCRYPTION_KEY=$(openssl rand -base64 32)
🤝 Contributing
Contributions welcome! Please:
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests if applicable
- Submit a pull request
📄 License
MIT License - see LICENSE file for details.
🔗 Links
- GitHub: https://github.com/shariqriazz/codevault
- NPM: https://www.npmjs.com/package/codevault
- Issues: https://github.com/shariqriazz/codevault/issues
🙏 Acknowledgments
Built with:
- Model Context Protocol - AI integration framework
- Tree-sitter - Parsing infrastructure
- OpenAI - Embedding models
- Ollama - Local model support
- Nebius AI Studio - Qwen embeddings
- OpenRouter - LLM access
- Novita AI - Reranking API
Version: 1.8.4
Built by: Shariq Riaz
Last Updated: November 2025
