codesearch-mcp
v0.0.17-alpha.1
Local semantic codebase search via MCP — indexes projects into pgvector, exposes search to Claude
Semantic codebase search for AI agents. Indexes your local projects into a PostgreSQL vector database and exposes search via the Model Context Protocol (MCP).
Ask your AI agent things like "where is the payment retry logic?" or "show me how auth tokens are validated" — across all your projects at once.
Alpha — works well locally, API may change before 1.0.
See CHANGELOG.md for version history.
How it works
- rag-index scans your projects, chunks source files, embeds them with bge-m3 via Ollama, and stores vectors in PostgreSQL
- rag-mcp exposes an MCP server; Claude calls it automatically when you ask about code
- Incremental: only changed files are re-indexed on subsequent runs
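How the indexer detects changed files is not documented here. As a minimal sketch, assuming each file is tracked by a content hash stored from the previous run, an incremental pass could be planned like this. The names `FileHashes`, `IndexPlan`, and `planIncrementalRun` are hypothetical and not part of codesearch-mcp.

```typescript
// Hypothetical sketch of incremental indexing: compare content hashes from the
// previous run with the current ones to decide what needs re-embedding.
// The real change-detection strategy of rag-index is an assumption here.

type FileHashes = Map<string, string>; // relative file path -> content hash

interface IndexPlan {
  toEmbed: string[];   // new or modified files
  toRemove: string[];  // files that no longer exist
  unchanged: string[]; // files whose stored vectors can be reused
}

function planIncrementalRun(previous: FileHashes, current: FileHashes): IndexPlan {
  const plan: IndexPlan = { toEmbed: [], toRemove: [], unchanged: [] };
  for (const [path, hash] of current) {
    if (previous.get(path) === hash) {
      plan.unchanged.push(path);
    } else {
      plan.toEmbed.push(path); // missing from the previous run, or hash differs
    }
  }
  for (const path of previous.keys()) {
    if (!current.has(path)) plan.toRemove.push(path);
  }
  return plan;
}

const previousRun: FileHashes = new Map([
  ["src/a.ts", "h1"],
  ["src/b.ts", "h2"],
  ["src/old.ts", "h3"],
]);
const currentRun: FileHashes = new Map([
  ["src/a.ts", "h1"],     // untouched
  ["src/b.ts", "h2-new"], // edited
  ["src/c.ts", "h4"],     // added
]);

const plan = planIncrementalRun(previousRun, currentRun);
console.log(plan);
```

Under this assumption, only `src/b.ts` and `src/c.ts` would be embedded again, which is why a full `--reindex` is needed after changing the embedding model: the file hashes stay the same even though the stored vectors are stale.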
Stack: TypeScript · Drizzle ORM · PostgreSQL + pgvector · Ollama (bge-m3) · LangChain text splitters · ts-morph
Requirements
- Node.js 18+
- Docker (for local DB + Ollama) — or bring your own PostgreSQL with pgvector + Ollama
Installation
npm install -g codesearch-mcp
Setup
rag-setup
Walks you through:
- Database: enter connection details; optionally starts Docker containers automatically
- Ollama: configure API URL
- Logging: choose log level and directory (defaults: error level, ~/.config/codesearch-mcp/logs/)
Config is saved to ~/.config/codesearch-mcp/config.json.
Index your projects
rag-index
workspaceRoot (set by rag-setup or rag-config set workspaceRoot /path) determines which directory is scanned. On first run, or when no projects are selected, the selector launches automatically. Selections are saved in .ragprojects / .ragskip at the workspace root.
rag-index --select # re-run project selection
rag-index --reindex # force re-embed all files (use after model change)
rag-index --select --reindex
Register with Claude Code
claude mcp add codesearch-mcp -s user -- rag-mcp
Then ask Claude anything about your code; it will use the tools below automatically.
MCP Tools
| Tool | Description |
|------|-------------|
| search_codebase | Semantic search across all indexed projects. Accepts include_graph_context to append direct imports/dependents to each result, and include_neighbors for surrounding chunks |
| list_projects | List projects with metadata and chunk counts |
| find_symbol | Find a function, class, or type by exact name — faster than semantic search |
| list_files | Browse the directory tree of a project |
| read_file | Read a complete file by absolute path |
| read_file_range | Read a specific line range from an indexed file |
| get_dependencies | Show what files a given file imports (direct or transitive) |
| get_dependents | Show which files import a given file (direct or transitive) |
| get_callers | Find all call sites of a function or method by exact name |
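The difference between "direct" and "transitive" in get_dependencies and get_dependents can be illustrated with a small breadth-first traversal. This is only a sketch over a hypothetical in-memory import graph; how the server stores and queries the graph is not described in this README, and `ImportGraph`, `collect`, and `invert` are illustrative names.

```typescript
// Hypothetical in-memory import graph: file -> files it imports directly.
type ImportGraph = Map<string, string[]>;

// Breadth-first walk. With transitive = false only the first hop is returned.
function collect(graph: ImportGraph, file: string, transitive = false): string[] {
  const seen = new Set<string>([file]);
  const result: string[] = [];
  const queue: string[] = [file];
  while (queue.length > 0) {
    const current = queue.shift() as string;
    for (const next of graph.get(current) ?? []) {
      if (seen.has(next)) continue; // guards against import cycles
      seen.add(next);
      result.push(next);
      if (transitive) queue.push(next);
    }
  }
  return result;
}

// Dependents are dependencies in the reversed graph.
function invert(graph: ImportGraph): ImportGraph {
  const reversed: ImportGraph = new Map();
  for (const [file, deps] of graph) {
    for (const dep of deps) {
      const importers = reversed.get(dep) ?? [];
      importers.push(file);
      reversed.set(dep, importers);
    }
  }
  return reversed;
}

const controller = "src/controllers/InvoiceController.ts";
const service = "src/services/InvoiceService.ts";
const repo = "src/repositories/InvoiceRepo.ts";
const dto = "src/dto/CreateInvoiceDto.ts";

const graph: ImportGraph = new Map([
  [controller, [service]],
  [service, [repo, dto]],
]);

const directDeps = collect(graph, controller);             // first hop only
const allDeps = collect(graph, controller, true);          // full closure
const allDependents = collect(invert(graph), repo, true);  // who imports repo, at any depth
console.log({ directDeps, allDeps, allDependents });
```

In this toy graph the controller has one direct dependency but three transitive ones, and the repository is reached by both the service and the controller.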
Project metadata
Add .ragmeta.json to a project root to enrich list_projects and generate a visual dependency graph:
{
  "description": "Main API server",
  "stack": "NestJS, GraphQL, PostgreSQL",
  "relations": {
    "consumed_by": ["crm-frontend"],
    "consumes": ["billing-service", "auth-service"]
  }
}
After each indexing run, an interactive project-graph.html is written to the workspace root.
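For editors or scripts that generate .ragmeta.json, a type and a small validator can catch mistakes before indexing. The shape below is inferred only from the example above; treating every field as optional is an assumption, and the real schema may accept more fields. `RagMeta` and `parseRagMeta` are hypothetical names.

```typescript
// Shape inferred from the README example; the actual schema may differ.
interface RagMeta {
  description?: string;
  stack?: string;
  relations?: {
    consumed_by?: string[];
    consumes?: string[];
  };
}

function isPlainObject(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

function isStringArray(value: unknown): value is string[] {
  return Array.isArray(value) && value.every((item) => typeof item === "string");
}

// Parses the file contents and throws a descriptive error on a malformed field.
function parseRagMeta(json: string): RagMeta {
  const data: unknown = JSON.parse(json);
  if (!isPlainObject(data)) {
    throw new Error(".ragmeta.json must contain a JSON object");
  }
  for (const key of ["description", "stack"]) {
    if (data[key] !== undefined && typeof data[key] !== "string") {
      throw new Error(`"${key}" must be a string`);
    }
  }
  const relations = data["relations"];
  if (relations !== undefined) {
    if (!isPlainObject(relations)) {
      throw new Error('"relations" must be an object');
    }
    for (const key of ["consumed_by", "consumes"]) {
      if (relations[key] !== undefined && !isStringArray(relations[key])) {
        throw new Error(`"relations.${key}" must be an array of project names`);
      }
    }
  }
  return data as RagMeta;
}

const meta = parseRagMeta(`{
  "description": "Main API server",
  "stack": "NestJS, GraphQL, PostgreSQL",
  "relations": {
    "consumed_by": ["crm-frontend"],
    "consumes": ["billing-service", "auth-service"]
  }
}`);
console.log(meta.relations?.consumes);
```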
Graph context
Pass include_graph_context: true to search_codebase to see direct imports and dependents inline with each result — without extra get_dependencies / get_dependents calls:
**File:** src/services/InvoiceService.ts **Line:** 45
...code...
**Relevance:** 0.32
**Imports:** src/repositories/InvoiceRepo.ts · src/dto/CreateInvoiceDto.ts
**Used by:** src/controllers/InvoiceController.ts
Useful for immediately understanding a file's role in the dependency graph.
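Claude normally issues the tool call for you, but it can help to see roughly what goes over the wire. MCP tool invocations are JSON-RPC 2.0 requests with the method `tools/call`. The sketch below builds such a request for search_codebase; the argument name `query` is an assumption (this README only documents `include_graph_context` and `include_neighbors`), and `buildSearchRequest` is a hypothetical helper.

```typescript
// JSON-RPC 2.0 envelope used by MCP for tool invocations.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: {
    name: string;
    arguments: Record<string, unknown>;
  };
}

function buildSearchRequest(id: number, query: string): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: {
      name: "search_codebase",
      arguments: {
        query,                       // assumed argument name, not confirmed by the README
        include_graph_context: true, // documented option: append imports and dependents
      },
    },
  };
}

const request = buildSearchRequest(1, "where is the payment retry logic?");
console.log(JSON.stringify(request, null, 2));
```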
Excluding files
Create .ragignore in a project root — same syntax as .gitignore:
src/gql/graphql.ts
**/__snapshots__/**
src/generated/
Logging
Logs are written to ~/.config/codesearch-mcp/logs/ by default (separate subdirectories for indexer and MCP). Only errors are logged unless configured otherwise.
# Read logs
tail -f ~/.config/codesearch-mcp/logs/indexer/current.log
npx pino-pretty < ~/.config/codesearch-mcp/logs/indexer/current.log
# Detailed indexing diagnostics (timing per file: deps / embed / db)
rag-config set logging.level debug
# Disable file logging
rag-config set logging.dir false
Configuration
All set by rag-setup. Stored in ~/.config/codesearch-mcp/config.json.
rag-config show # print current config (password redacted)
rag-config get <key> # print a single value
rag-config set <key> <value> # update a value
rag-config reset <key> # remove key (reverts to default)
| Key | Default | Description |
|-----|---------|-------------|
| db.host | — | PostgreSQL host |
| db.port | 5433 | PostgreSQL port |
| db.user | — | PostgreSQL user |
| db.password | — | PostgreSQL password |
| db.name | rag_<os-username> | Database name — auto-created on first run |
| workspaceRoot | — | Root of your projects directory |
| ollama.host | http://localhost:11434 | Ollama API URL |
| ollama.model | bge-m3 | Embedding model |
| search.similarityThreshold | 0.6 | Max cosine distance — raise if search returns nothing |
| search.autoTranslate | false | Translate non-English queries to English before embedding |
| search.translateModel | llama3.2 | Ollama model for translation (requires ollama pull llama3.2) |
| indexing.astBatchSize | 3000 | Max files per ts-morph batch — lower if indexing large projects causes OOM |
| logging.level | error | Log level written to file: error, warn, info, or debug |
| logging.dir | ~/.config/codesearch-mcp/logs | Base log directory, or "false" to disable |
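Because search.similarityThreshold is a maximum cosine distance, raising it admits more (and weaker) matches. Cosine distance is 1 minus cosine similarity, so it ranges from 0 (same direction) to 2 (opposite). The sketch below shows the filtering on toy 2-D vectors; in practice the comparison presumably happens inside PostgreSQL via pgvector, and `cosineDistance` and `withinThreshold` are illustrative names only.

```typescript
// Cosine distance = 1 - cosine similarity. 0 means identical direction, 2 means opposite.
function cosineDistance(a: number[], b: number[]): number {
  if (a.length === 0 || a.length !== b.length) {
    throw new Error("vectors must be non-empty and of equal length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    throw new Error("cosine distance is undefined for zero vectors");
  }
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

interface Candidate {
  name: string;
  embedding: number[];
}

// Keeps candidates whose distance to the query does not exceed the threshold.
function withinThreshold(query: number[], candidates: Candidate[], threshold: number): string[] {
  return candidates
    .filter((c) => cosineDistance(query, c.embedding) <= threshold)
    .map((c) => c.name);
}

const queryVector = [1, 0];
const candidates: Candidate[] = [
  { name: "exact", embedding: [1, 0] },     // distance 0
  { name: "close", embedding: [1, 1] },     // distance 1 - 1/sqrt(2), about 0.29
  { name: "unrelated", embedding: [0, 1] }, // distance 1
  { name: "opposite", embedding: [-1, 0] }, // distance 2
];

const defaultMatches = withinThreshold(queryVector, candidates, 0.6); // default threshold
const looseMatches = withinThreshold(queryVector, candidates, 1.0);   // raised threshold
console.log({ defaultMatches, looseMatches });
```

With the default of 0.6 only the first two toy candidates pass; raising the threshold to 1.0 also admits the orthogonal one.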
License
MIT
