deepmatch-mcp
v0.1.0
Semantic code search server for MCP, powered by vector embeddings and Qdrant
DeepMatch MCP
A Model Context Protocol (MCP) server for semantic code search using vector embeddings. Index your codebase and search with natural language queries.
Features
- Semantic Code Search: Find code by meaning, not just keywords
- Multiple Embedding Providers: OpenAI, Ollama, Gemini, OpenAI-compatible APIs
- Real-time File Watching: Automatically re-index on file changes
- Multi-repository Support: Index multiple directories simultaneously
- Smart Filtering: Respects .gitignore, skips binary files and common build directories
- MCP Protocol: Works with any MCP-compatible client (Claude Desktop, etc.)
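To make "search by meaning" concrete: both the query and each code chunk are turned into embedding vectors, and results are ranked by how close those vectors are. The sketch below shows cosine similarity, one common closeness measure. It is an illustration only; this README does not state which distance metric the Qdrant collection uses, and `cosineSimilarity` is a hypothetical helper, not part of this package.

```typescript
// Cosine similarity between two embedding vectors of equal length.
// Assumption: cosine is shown as a common choice; the metric actually
// configured in Qdrant by this project is not documented here.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length || a.length === 0) {
    throw new Error("vectors must be non-empty and of equal length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction, so treat it as unrelated.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

Vectors pointing in the same direction score close to 1, and unrelated (orthogonal) vectors score close to 0.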
Architecture
┌─────────────────────────────────────────────────────────────────┐
│                          deepmatch-mcp                          │
├─────────────────────────────────────────────────────────────────┤
│  CLI Entrypoint (src/cli.ts)                                    │
│  - Parses config from CLI flags and environment variables       │
│  - Orchestrates startup: scan → index → watch → serve           │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────────┐  │
│  │   Config    │  │  Providers  │  │      Vector Store       │  │
│  │    (Zod)    │  │ (Embedders) │  │        (Qdrant)         │  │
│  └─────────────┘  └─────────────┘  └─────────────────────────┘  │
│                                                                 │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────────┐  │
│  │   Scanner   │  │   Chunker   │  │      Index Manager      │  │
│  │ (Directory) │  │ (Line-based)│  │   (Batch Processing)    │  │
│  └─────────────┘  └─────────────┘  └─────────────────────────┘  │
│                                                                 │
│  ┌─────────────┐  ┌──────────────────────────────────────────┐  │
│  │   Watcher   │  │                MCP Server                │  │
│  │ (Chokidar)  │  │     (stdio transport, 'search' tool)     │  │
│  └─────────────┘  └──────────────────────────────────────────┘  │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
Module Overview
| Module | Path | Description |
|--------|------|-------------|
| Config | src/config/ | CLI/ENV parsing with Zod validation |
| Providers | src/providers/ | Embedding providers (OpenAI, Ollama, Gemini, OpenAI-compatible) |
| Store | src/store/ | Qdrant vector database wrapper |
| Chunker | src/chunker/ | Line-based text chunking with configurable limits |
| Scanner | src/scanner/ | Directory traversal with .gitignore support |
| Indexer | src/indexer/ | Batch embedding and vector upsert orchestration |
| Watcher | src/watcher/ | File change detection with debouncing |
| MCP | src/mcp/ | MCP stdio server with search tool |
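The Indexer row describes batch embedding. As a rough sketch of what that could look like, the code below splits chunk texts into fixed-size batches and embeds them one batch at a time. The names `toBatches` and `embedInBatches` and the `embed` callback signature are hypothetical; only the default batch size of 60 comes from this README.

```typescript
// Split a list into consecutive batches of at most `batchSize` items.
// The default of 60 mirrors the documented --batch-size default.
function toBatches<T>(items: T[], batchSize = 60): T[][] {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new Error("batchSize must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}

// Hypothetical orchestration: `embed` stands in for any provider call
// that maps a batch of texts to one vector per text.
async function embedInBatches(
  texts: string[],
  embed: (batch: string[]) => Promise<number[][]>,
  batchSize = 60,
): Promise<number[][]> {
  const vectors: number[][] = [];
  for (const batch of toBatches(texts, batchSize)) {
    // Batches run sequentially to keep request volume predictable.
    vectors.push(...(await embed(batch)));
  }
  return vectors;
}
```

Smaller batches mean more requests to the embedding provider; larger ones risk hitting provider payload limits, which is presumably why the size is configurable.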
Installation
# Install dependencies
npm install
# Build
npm run build
Usage
Prerequisites
- Qdrant: Vector database (default: http://localhost:6333)
    # Using Docker
    docker run -p 6333:6333 qdrant/qdrant
- Embedding Provider: One of:
  - OpenAI API key
  - Ollama running locally
  - Gemini API key
  - Any OpenAI-compatible API
CLI Options
npx deepmatch-mcp [options]
Options:
--path <path> Repository path to index (repeatable)
--provider <provider> Embedding provider: openai|ollama|gemini|openai-compatible
--model <model> Embedding model name
--embedding-dim <dim> Embedding dimension (auto-detected if not set)
--batch-size <size> Batch size for embeddings (default: 60)
--max-files <count> Maximum files to index (default: 50000)
--qdrant-url <url> Qdrant server URL (default: http://localhost:6333)
--qdrant-key <key> Qdrant API key
--openai-key <key> OpenAI API key
--ollama-url <url> Ollama server URL
--gemini-key <key> Gemini API key
--openai-compat-base-url OpenAI-compatible base URL
--openai-compat-key OpenAI-compatible API key
Environment Variables
All CLI options can be set via environment variables:
| Variable | CLI Flag |
|----------|----------|
| DEEPMATCH_PATHS | --path (comma-separated) |
| DEEPMATCH_PROVIDER | --provider |
| DEEPMATCH_MODEL | --model |
| DEEPMATCH_EMBEDDING_DIM | --embedding-dim |
| DEEPMATCH_BATCH_SIZE | --batch-size |
| DEEPMATCH_MAX_FILES | --max-files |
| DEEPMATCH_QDRANT_URL | --qdrant-url |
| DEEPMATCH_QDRANT_API_KEY | --qdrant-key |
| DEEPMATCH_OPENAI_API_KEY | --openai-key |
| DEEPMATCH_OLLAMA_URL | --ollama-url |
| DEEPMATCH_GEMINI_API_KEY | --gemini-key |
| DEEPMATCH_OPENAI_COMPAT_BASE_URL | --openai-compat-base-url |
| DEEPMATCH_OPENAI_COMPAT_API_KEY | --openai-compat-key |
CLI flags take precedence over environment variables.
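The precedence rule above can be sketched as a small merge function: a CLI value wins when present, then the environment variable, then the documented default. This is a simplified illustration covering four options only; `resolveConfig` and its types are hypothetical, and the real project validates configuration with Zod rather than by hand.

```typescript
// Hypothetical shapes for illustration; not the project's actual types.
interface CliFlags {
  paths?: string[];
  provider?: string;
  batchSize?: number;
  qdrantUrl?: string;
}
type Env = Record<string, string | undefined>;

interface ResolvedConfig {
  paths: string[];
  provider: string | undefined;
  batchSize: number;
  qdrantUrl: string;
}

function resolveConfig(cli: CliFlags, env: Env): ResolvedConfig {
  // DEEPMATCH_PATHS is documented as comma-separated.
  const envPaths = (env["DEEPMATCH_PATHS"] ?? "")
    .split(",")
    .map((p) => p.trim())
    .filter((p) => p.length > 0);

  // Ignore a non-numeric batch size instead of propagating NaN.
  const rawBatch = Number(env["DEEPMATCH_BATCH_SIZE"]);
  const envBatch = Number.isFinite(rawBatch) && rawBatch > 0 ? rawBatch : undefined;

  return {
    // CLI flag first, then environment variable, then default.
    paths: cli.paths && cli.paths.length > 0 ? cli.paths : envPaths,
    provider: cli.provider ?? env["DEEPMATCH_PROVIDER"],
    batchSize: cli.batchSize ?? envBatch ?? 60,
    qdrantUrl: cli.qdrantUrl ?? env["DEEPMATCH_QDRANT_URL"] ?? "http://localhost:6333",
  };
}
```

For example, passing `--provider ollama` while `DEEPMATCH_PROVIDER=openai` is exported would resolve to `ollama` under this scheme.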
Examples
With OpenAI:
npx deepmatch-mcp \
--path /path/to/your/repo \
--provider openai \
--openai-key sk-xxx
With Ollama:
npx deepmatch-mcp \
--path /path/to/repo1 \
--path /path/to/repo2 \
--provider ollama \
--ollama-url http://localhost:11434 \
--model nomic-embed-text
With environment variables:
export DEEPMATCH_PATHS="/path/to/repo"
export DEEPMATCH_PROVIDER="openai"
export DEEPMATCH_OPENAI_API_KEY="sk-xxx"
npx deepmatch-mcp
MCP Client Configuration
For Claude Desktop, add to your MCP settings:
{
  "mcpServers": {
    "deepmatch": {
      "command": "npx",
      "args": ["deepmatch-mcp", "--path", "/path/to/repo", "--provider", "openai"],
      "env": {
        "DEEPMATCH_OPENAI_API_KEY": "sk-xxx"
      }
    }
  }
}
MCP Tools
search
Search for code using semantic similarity.
Input Schema:
| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| query | string | Yes | Natural language search query |
| limit | number | No | Max results (1-50, default: 10) |
| paths | string[] | No | Filter to specific repository paths |
| minScore | number | No | Minimum similarity score (0-1) |
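For client authors, the input schema above maps naturally onto a typed object with defaults applied. The sketch below is one way to normalize a request before sending it. `SearchInput` and `normalizeSearchInput` are hypothetical names; treating `limit` as an integer, rejecting out-of-range values instead of clamping them, and defaulting `minScore` to 0 are all assumptions not confirmed by this README.

```typescript
// Mirrors the documented parameters of the `search` tool.
interface SearchInput {
  query: string;
  limit?: number;
  paths?: string[];
  minScore?: number;
}

interface NormalizedSearchInput {
  query: string;
  limit: number;
  paths: string[];
  minScore: number;
}

function normalizeSearchInput(input: SearchInput): NormalizedSearchInput {
  const query = input.query.trim();
  if (query.length === 0) {
    throw new Error("query is required");
  }
  // Documented range is 1-50 with a default of 10.
  const limit = input.limit ?? 10;
  if (!Number.isInteger(limit) || limit < 1 || limit > 50) {
    throw new Error("limit must be an integer between 1 and 50");
  }
  // Documented range is 0-1; the default of 0 is an assumption.
  const minScore = input.minScore ?? 0;
  if (minScore < 0 || minScore > 1) {
    throw new Error("minScore must be between 0 and 1");
  }
  // An empty paths list is taken to mean "no repository filter".
  return { query, limit, paths: input.paths ?? [], minScore };
}
```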
Output:
{
  "total_count": 5,
  "items": [
    {
      "filePath": "/repo/src/auth.ts",
      "repoPath": "/repo",
      "startLine": 10,
      "endLine": 25,
      "codeChunk": "function authenticate(token: string) {...}",
      "score": 0.92
    }
  ]
}
Local Development
Setup
# Clone and install
git clone https://github.com/657KB/deepmatch-mcp
cd deepmatch-mcp
npm install
Development Workflow
# Run tests (TDD)
npm test
# Run tests in watch mode
npx vitest
# Build TypeScript
npm run build
# Test the CLI
node dist/cli.js --help
Project Structure
src/
├── cli.ts  # Main entry point
├── config/
│   ├── schema.ts  # Zod schemas and defaults
│   ├── index.ts  # CLI/ENV parsing
│   └── config.test.ts
├── providers/
│   ├── types.ts  # IEmbedder interface
│   ├── embedders.ts  # Provider implementations
│   ├── index.ts
│   └── embedders.test.ts
├── store/
│   ├── types.ts  # IVectorStore interface
│   ├── qdrant.ts  # Qdrant implementation
│   ├── index.ts
│   └── qdrant.test.ts
├── chunker/
│   ├── extensions.ts  # Supported file extensions
│   ├── chunker.ts  # Line-based chunking
│   ├── index.ts
│   └── chunker.test.ts
├── scanner/
│   ├── scanner.ts  # Directory traversal
│   ├── index.ts
│   └── scanner.test.ts
├── indexer/
│   ├── index-manager.ts  # Batch indexing orchestration
│   ├── index.ts
│   └── index-manager.test.ts
├── watcher/
│   ├── file-watcher.ts  # Chokidar file watching
│   ├── index.ts
│   └── file-watcher.test.ts
└── mcp/
    ├── server.ts  # MCP server + search tool
    ├── index.ts
    └── server.test.ts
Running Tests
# Run all tests
npm test
# Run specific test file
npx vitest src/chunker/chunker.test.ts
# Run with coverage
npx vitest --coverage
Configuration Defaults (Roo-Code Aligned)
| Parameter | Default | Description |
|-----------|---------|-------------|
| batchSize | 60 | Embedding batch size |
| maxFiles | 50,000 | Maximum files to index |
| chunkMin | 50 | Minimum chunk size (chars) |
| chunkMax | 1,000 | Maximum chunk size (chars) |
| chunkMaxTolerance | 1.15 | Tolerance factor for max size |
| chunkRebalanceMin | 200 | Minimum remainder to trigger rebalance |
| qdrantUrl | http://localhost:6333 | Qdrant server URL |
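To show how the chunk size defaults might interact, here is a simplified line-based chunker. It is a sketch under stated assumptions, not the project's implementation: `chunkByLines` is a hypothetical name, chunks shorter than `chunkMin` are simply dropped, a single line longer than the limit becomes its own oversized chunk, and the `chunkRebalanceMin` rebalancing step is left out entirely.

```typescript
interface Chunk {
  startLine: number; // 1-based, inclusive
  endLine: number;   // 1-based, inclusive
  text: string;
}

interface ChunkOptions {
  chunkMin: number;
  chunkMax: number;
  chunkMaxTolerance: number;
}

// Values taken from the defaults table above.
const DEFAULT_CHUNK_OPTIONS: ChunkOptions = {
  chunkMin: 50,
  chunkMax: 1000,
  chunkMaxTolerance: 1.15,
};

function chunkByLines(source: string, opts: ChunkOptions = DEFAULT_CHUNK_OPTIONS): Chunk[] {
  // A chunk may grow up to chunkMax * tolerance before it is closed.
  const hardLimit = opts.chunkMax * opts.chunkMaxTolerance;
  const lines = source.split("\n");
  const chunks: Chunk[] = [];
  let buffer: string[] = [];
  let size = 0;
  let startLine = 1;

  const flush = (endLine: number): void => {
    // Assumption: chunks below chunkMin are discarded, not merged.
    if (buffer.length > 0 && size >= opts.chunkMin) {
      chunks.push({ startLine, endLine, text: buffer.join("\n") });
    }
    buffer = [];
    size = 0;
  };

  for (let i = 0; i < lines.length; i++) {
    const lineNo = i + 1;
    const lineSize = lines[i].length + 1; // +1 approximates the newline
    if (buffer.length > 0 && size + lineSize > hardLimit) {
      flush(lineNo - 1);
    }
    if (buffer.length === 0) startLine = lineNo;
    buffer.push(lines[i]);
    size += lineSize;
  }
  flush(lines.length);
  return chunks;
}
```

Splitting only at line boundaries keeps `startLine` and `endLine` exact, which is what lets search results point back to a precise range in the file.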
File Filtering
Supported Extensions
TypeScript, JavaScript, Python, Java, C/C++, C#, Go, Rust, Ruby, PHP, Swift, Kotlin, Scala, Lua, R, Perl, Shell, SQL, HTML, CSS, JSON, YAML, XML, Markdown, Vue, Svelte
Excluded Directories
node_modules, dist, build, target, .git, hidden directories, __pycache__, venv, .next, .nuxt, coverage, vendor, Pods, .gradle, .idea, .vscode
Additional Filters
- Files larger than 1MB are skipped
- .gitignore rules are respected (stacked for nested directories)
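The directory, extension, and size rules above can be combined into a single predicate, sketched below. Several details are assumptions: `shouldIndex` is a hypothetical name, the extension set is a small illustrative subset (this README lists languages, not extensions), 1MB is taken as 1024 * 1024 bytes, paths are assumed to be normalized and relative with `/` separators, and .gitignore handling is omitted.

```typescript
const MAX_FILE_BYTES = 1024 * 1024; // assumed meaning of "1MB"

// Named directories from the exclusion list; dot-directories such as
// .git, .next, or .vscode are caught by the hidden-directory check.
const EXCLUDED_DIRS = new Set([
  "node_modules", "dist", "build", "target",
  "__pycache__", "venv", "coverage", "vendor", "Pods",
]);

// Illustrative subset only; the real list lives in src/chunker/extensions.ts.
const SUPPORTED_EXTENSIONS = new Set([".ts", ".js", ".py", ".go", ".rs", ".md"]);

function shouldIndex(relativePath: string, sizeBytes: number): boolean {
  if (sizeBytes > MAX_FILE_BYTES) return false;

  const segments = relativePath.split("/");
  const fileName = segments[segments.length - 1] ?? "";
  const dirs = segments.slice(0, -1);

  // Skip anything under a hidden or explicitly excluded directory.
  if (dirs.some((d) => d.startsWith(".") || EXCLUDED_DIRS.has(d))) return false;

  // Files without an extension (or dotfiles like ".env") are skipped.
  const dot = fileName.lastIndexOf(".");
  if (dot <= 0) return false;
  return SUPPORTED_EXTENSIONS.has(fileName.slice(dot).toLowerCase());
}
```

Checking size first is cheap and avoids any path parsing for large files.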
License
MIT
