@tai-io/codesearch
v2026.313.2014
Semantic code search MCP server for Claude Code
Semantic code search MCP server for Claude Code. Index any codebase and search it by meaning, not just keywords.
How it works
- Index a codebase — files are parsed with tree-sitter (AST-aware chunking for 10 languages), embedded via OpenAI, and stored in a local SQLite database
- Search by natural language — queries are embedded and matched using hybrid search (dense vectors + full-text, RRF fusion)
- Results return in a compact table (~20 tokens/result) so Claude can efficiently decide what to read in full
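All index data is stored locally under ~/.codesearch/. No external database or search server is required; only embedding requests go to the configured provider (OpenAI, or Ollama running locally).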
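The "RRF fusion" step above can be illustrated with a minimal sketch of Reciprocal Rank Fusion. This is not taken from the codesearch source: the function name `rrfFuse`, the chunk ids, and the constant `k = 60` (the value commonly used with RRF) are assumptions chosen for illustration.

```typescript
// Reciprocal Rank Fusion: score(d) = sum over rankings of 1 / (k + rank(d)).
// k = 60 is a common default; whether codesearch uses it is an assumption.
function rrfFuse(
  rankings: string[][],
  k = 60
): Array<{ id: string; score: number }> {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first.
  return Array.from(scores.entries())
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}

// Hypothetical chunk ids returned by the two retrievers.
const denseHits = ["auth.ts:login", "session.ts:create", "user.ts:find"];
const textHits = ["session.ts:create", "token.ts:sign", "auth.ts:login"];

const fused = rrfFuse([denseHits, textHits]);
console.log(fused.map((r) => r.id));
```

A chunk that ranks well in both lists (here `session.ts:create`) rises above chunks that appear in only one, without the two retrievers needing comparable raw scores.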
Quick start
Prerequisites
- Node.js >= 20
- An OpenAI API key (for embeddings), or Ollama running locally
Install as a Claude Code MCP server
export OPENAI_API_KEY=sk-...
npm install -g @tai-io/codesearch
claude mcp add -s user -e "OPENAI_API_KEY=$OPENAI_API_KEY" -- codesearch npx @tai-io/codesearch
Or add the MCP config directly:
{
"codesearch": {
"command": "npx",
"args": ["@tai-io/codesearch"],
"env": {
"OPENAI_API_KEY": "sk-..."
}
}
}
Use it
Once configured, Claude Code has 8 new tools:
| Tool | Example | What it does |
|------|---------|--------------|
| index | index(path="/my/project") | Index a codebase (~30s one-time) |
| search | search(query="how does auth work") | Semantic search |
| list | list() | See indexed codebases |
| browse | browse(path="/my/project") | Structural map of classes/functions |
| clear | clear(path="/my/project") | Remove index |
| cleanup | cleanup(path="/my/project") | Remove vectors for deleted files |
| ingest | ingest(content="...", library="react", ...) | Cache external docs |
| lookup | lookup(query="react hooks") | Search cached docs |
Tools
| Tool | Description |
|------|-------------|
| index | Index a codebase for semantic search. Incremental — only re-embeds changed files. |
| search | Search indexed code by natural language. Returns compact results (~20 tokens each). |
| list | List all indexed codebases with status and file/chunk counts. |
| browse | Structural map — classes, functions, methods with signatures, grouped by file. |
| clear | Remove the search index for a codebase. |
| cleanup | Remove orphaned vectors for deleted files. No embedding cost. |
| ingest | Cache external documentation for cheap semantic search later. |
| lookup | Search cached documentation (~20 tokens/result vs ~5K for re-fetching). |
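To make the incremental `index` and the `cleanup` behaviour concrete, here is a rough sketch of how a re-index plan could be derived by comparing stored content hashes with the files currently on disk. The README does not say how codesearch detects changes, so the hash comparison, the `planIncrementalIndex` name, and the data shapes are all assumptions.

```typescript
// path -> content hash (how the hash is computed is left out on purpose).
type FileHashes = Map<string, string>;

interface IndexPlan {
  toEmbed: string[]; // new or changed files that need (re-)embedding
  toRemove: string[]; // indexed files that no longer exist on disk
}

// Hypothetical helper: compares the stored index state with the disk state.
function planIncrementalIndex(indexed: FileHashes, onDisk: FileHashes): IndexPlan {
  const toEmbed: string[] = [];
  const toRemove: string[] = [];
  onDisk.forEach((hash, path) => {
    // Unknown path or different hash: the file must be embedded again.
    if (indexed.get(path) !== hash) toEmbed.push(path);
  });
  indexed.forEach((_hash, path) => {
    // Orphaned entry: removing it needs no embedding call.
    if (!onDisk.has(path)) toRemove.push(path);
  });
  return { toEmbed: toEmbed.sort(), toRemove: toRemove.sort() };
}

const indexedState: FileHashes = new Map([
  ["a.ts", "h1"],
  ["b.ts", "h2"],
  ["c.ts", "h3"],
]);
const diskState: FileHashes = new Map([
  ["a.ts", "h1"], // unchanged
  ["b.ts", "h2-edited"], // changed
  ["d.ts", "h4"], // new
]);

const plan = planIncrementalIndex(indexedState, diskState);
console.log(plan);
```

Under this model, unchanged files cost nothing on a re-index, and removing orphans is a pure database operation, which matches the "No embedding cost" note for `cleanup`.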
Supported languages
AST-aware chunking (via tree-sitter): TypeScript, JavaScript, Python, Go, Java, Rust, C++, C, C#, TSX.
Line-based fallback for all other text files.
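As an illustration of what a line-based fallback might look like, the sketch below splits a file into fixed-size windows of lines with a small overlap. The window size, the overlap, and the `chunkByLines` name are assumptions; the README does not document the actual parameters.

```typescript
interface Chunk {
  startLine: number; // 1-based, inclusive
  endLine: number; // 1-based, inclusive
  text: string;
}

// Hypothetical fallback chunker: fixed windows of lines with overlap so that
// content near a window boundary appears in both neighbouring chunks.
function chunkByLines(source: string, maxLines = 40, overlap = 5): Chunk[] {
  if (maxLines <= 0 || overlap < 0 || overlap >= maxLines) {
    throw new RangeError("require maxLines > 0 and 0 <= overlap < maxLines");
  }
  const lines = source.split("\n");
  const step = maxLines - overlap;
  const chunks: Chunk[] = [];
  for (let start = 0; start < lines.length; start += step) {
    const end = Math.min(start + maxLines, lines.length);
    chunks.push({
      startLine: start + 1,
      endLine: end,
      text: lines.slice(start, end).join("\n"),
    });
    if (end === lines.length) break; // last window reached the end of the file
  }
  return chunks;
}

// A synthetic 100-line file.
const sample = Array.from({ length: 100 }, (_, i) => `line ${i + 1}`).join("\n");
const chunks = chunkByLines(sample);
console.log(chunks.map((c) => `${c.startLine}-${c.endLine}`));
```

Unlike the AST-aware path, this approach can cut through the middle of a function, which is why it is only a fallback for files tree-sitter cannot parse.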
Configuration
All configuration is via environment variables:
| Variable | Default | Description |
|----------|---------|-------------|
| OPENAI_API_KEY | required for the openai provider | OpenAI API key for embeddings |
| EMBEDDING_PROVIDER | openai | openai, ollama, or local |
| EMBEDDING_MODEL | text-embedding-3-small | Embedding model name |
| OPENAI_BASE_URL | — | Override base URL (for proxies or compatible APIs) |
| OLLAMA_BASE_URL | http://localhost:11434/v1 | Ollama server URL |
| EMBEDDING_BATCH_SIZE | 100 | Vectors per API call (1–2048) |
| INDEXING_CONCURRENCY | 8 | Parallel file processing (1–32) |
| CODESEARCH_DATA_DIR | ~/.codesearch | Data directory for indexes and state |
| CUSTOM_EXTENSIONS | [] | Additional file extensions as JSON array |
| CUSTOM_IGNORE_PATTERNS | [] | Additional glob ignore patterns as JSON array |
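The table above can be read as a small parsing routine: numeric values have a default and an allowed range, and the list-valued variables are JSON arrays. The sketch below shows one way to interpret a subset of these variables. It is not the package's own loader; the `loadConfig` name and the choice to clamp out-of-range numbers (rather than reject them) are assumptions.

```typescript
const PROVIDERS = ["openai", "ollama", "local"] as const;
type Provider = (typeof PROVIDERS)[number];

interface CodesearchConfig {
  provider: Provider;
  batchSize: number; // EMBEDDING_BATCH_SIZE, 1-2048, default 100
  concurrency: number; // INDEXING_CONCURRENCY, 1-32, default 8
  customExtensions: string[]; // CUSTOM_EXTENSIONS, JSON array
}

function isProvider(value: string): value is Provider {
  return (PROVIDERS as readonly string[]).indexOf(value) !== -1;
}

// Assumption: out-of-range values are clamped; the real tool may reject them.
function clampInt(raw: string | undefined, fallback: number, min: number, max: number): number {
  const parsed = parseInt(raw ?? "", 10);
  if (isNaN(parsed)) return fallback;
  return Math.min(max, Math.max(min, parsed));
}

function parseJsonStringArray(raw: string | undefined): string[] {
  if (!raw) return [];
  const value: unknown = JSON.parse(raw);
  if (!Array.isArray(value) || !value.every((v) => typeof v === "string")) {
    throw new Error("expected a JSON array of strings");
  }
  return value as string[];
}

// Takes the environment as a plain object so it can be tested without process.env.
function loadConfig(env: Record<string, string | undefined>): CodesearchConfig {
  const provider = env.EMBEDDING_PROVIDER ?? "openai";
  if (!isProvider(provider)) {
    throw new Error(`unknown EMBEDDING_PROVIDER: ${provider}`);
  }
  return {
    provider,
    batchSize: clampInt(env.EMBEDDING_BATCH_SIZE, 100, 1, 2048),
    concurrency: clampInt(env.INDEXING_CONCURRENCY, 8, 1, 32),
    customExtensions: parseJsonStringArray(env.CUSTOM_EXTENSIONS),
  };
}

const config = loadConfig({
  EMBEDDING_BATCH_SIZE: "5000",
  CUSTOM_EXTENSIONS: '[".vue", ".svelte"]',
});
console.log(config);
```

Note that `CUSTOM_EXTENSIONS` and `CUSTOM_IGNORE_PATTERNS` must be valid JSON, so in a shell they are best wrapped in single quotes, as in the example value above.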
Using Ollama (free, local)
# Install and start Ollama with an embedding model
ollama pull nomic-embed-text
# Configure
export EMBEDDING_PROVIDER=ollama
Development
git clone https://github.com/tai-io/codesearch.git
cd codesearch
npm install
npm run build
npm test
License
MIT
