mcp-codebase-index

v0.4.0

MCP server for semantic code search using Tree-sitter, embeddings, and Qdrant vector database

MCP Codebase Index

A powerful Model Context Protocol (MCP) server that enables AI assistants to search and understand your codebase using semantic search. Find code using natural language queries like "authentication logic" or "database connection handling" instead of exact text matching.

Features

  • 🔍 Semantic Code Search - Search using natural language, not just keywords
  • 🌐 Multi-Language Support - TypeScript, JavaScript, Python, Java, Go, Rust, C/C++, C#, Ruby, PHP, and more
  • 🎯 Smart Code Parsing - Understands functions, classes, and code structure using Tree-sitter
  • 🔄 Real-time Updates - Automatically reindexes when files change or branches switch
  • 🚀 Multiple Embedding Providers - Use Google Gemini (free), OpenAI, or local Ollama models
  • ☁️ Flexible Storage - Works with Qdrant Cloud (free tier) or self-hosted instances

Installation

npm install -g mcp-codebase-index

Or use with npx (no installation required):

npx mcp-codebase-index

Quick Start

1. Set Up Qdrant Vector Database

Choose one option:

Option A: Qdrant Cloud (Recommended for beginners)

  • Sign up for free at Qdrant Cloud
  • Create a cluster (free tier available)
  • Get your API URL and key

Option B: Local Docker

docker run -p 6333:6333 qdrant/qdrant

2. Get an Embedding Provider API Key

Choose one option:

Option A: Google Gemini (Free)

  • Create a free API key in Google AI Studio (https://aistudio.google.com/app/apikey)

Option B: OpenAI

  • Create an API key at https://platform.openai.com/api-keys

Option C: Ollama (Local, Free)

  • Install Ollama from https://ollama.com
  • Pull a model: ollama pull nomic-embed-text

3. Configure Your IDE/Editor

Choose your IDE or editor and follow the setup instructions:

Claude Desktop

Add this to your Claude Desktop config file:

  • macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
  • Windows: %APPDATA%/Claude/claude_desktop_config.json
  • Linux: ~/.config/Claude/claude_desktop_config.json

{
  "mcpServers": {
    "codebase-index": {
      "command": "npx",
      "args": ["-y", "mcp-codebase-index"],
      "env": {
        "CODEBASE_PATH": "/absolute/path/to/your/repository",
        "EMBEDDING_PROVIDER": "gemini",
        "GEMINI_API_KEY": "your-api-key-here",
        "QDRANT_URL": "http://localhost:6333"
      }
    }
  }
}

For Qdrant Cloud:

{
  "mcpServers": {
    "codebase-index": {
      "command": "npx",
      "args": ["-y", "mcp-codebase-index"],
      "env": {
        "CODEBASE_PATH": "/absolute/path/to/your/repository",
        "EMBEDDING_PROVIDER": "gemini",
        "GEMINI_API_KEY": "your-api-key-here",
        "QDRANT_URL": "https://your-cluster.cloud.qdrant.io",
        "QDRANT_API_KEY": "your-qdrant-api-key"
      }
    }
  }
}

Restart Claude Desktop after saving the configuration.

Claude Code (VS Code)

  1. Open VS Code Settings (Cmd/Ctrl + ,)
  2. Search for "Claude Code: MCP Servers"
  3. Click "Edit in settings.json"
  4. Add this configuration:
{
  "claude-code.mcpServers": {
    "codebase-index": {
      "command": "npx",
      "args": ["-y", "mcp-codebase-index"],
      "env": {
        "CODEBASE_PATH": "/absolute/path/to/your/repository",
        "EMBEDDING_PROVIDER": "gemini",
        "GEMINI_API_KEY": "your-api-key-here",
        "QDRANT_URL": "http://localhost:6333"
      }
    }
  }
}

Reload VS Code after saving.

Cursor

Cursor supports MCP servers through its AI settings:

  1. Open Cursor Settings (Cmd/Ctrl + ,)

  2. Navigate to "Features" → "AI"

  3. Scroll to "Model Context Protocol"

  4. Click "Edit Config" or locate the config file:

    • macOS: ~/Library/Application Support/Cursor/User/globalStorage/mcp.json
    • Windows: %APPDATA%/Cursor/User/globalStorage/mcp.json
    • Linux: ~/.config/Cursor/User/globalStorage/mcp.json
  5. Add this configuration:

{
  "mcpServers": {
    "codebase-index": {
      "command": "npx",
      "args": ["-y", "mcp-codebase-index"],
      "env": {
        "CODEBASE_PATH": "/absolute/path/to/your/repository",
        "EMBEDDING_PROVIDER": "gemini",
        "GEMINI_API_KEY": "your-api-key-here",
        "QDRANT_URL": "http://localhost:6333"
      }
    }
  }
}

Restart Cursor after saving.

Continue

If you're using the Continue extension:

  1. Open the Continue configuration file:

    • macOS/Linux: ~/.continue/config.json
    • Windows: %USERPROFILE%\.continue\config.json
  2. Add to the experimental.modelContextProtocolServers section:

{
  "experimental": {
    "modelContextProtocolServers": [
      {
        "name": "codebase-index",
        "command": "npx",
        "args": ["-y", "mcp-codebase-index"],
        "env": {
          "CODEBASE_PATH": "/absolute/path/to/your/repository",
          "EMBEDDING_PROVIDER": "gemini",
          "GEMINI_API_KEY": "your-api-key-here",
          "QDRANT_URL": "http://localhost:6333"
        }
      }
    ]
  }
}

Reload VS Code after saving.

Windsurf

Windsurf supports MCP servers natively:

  1. Open Windsurf Settings

  2. Navigate to MCP Settings or locate the config file:

    • macOS: ~/Library/Application Support/Windsurf/mcp_config.json
    • Windows: %APPDATA%/Windsurf/mcp_config.json
    • Linux: ~/.config/Windsurf/mcp_config.json
  3. Add this configuration:

{
  "mcpServers": {
    "codebase-index": {
      "command": "npx",
      "args": ["-y", "mcp-codebase-index"],
      "env": {
        "CODEBASE_PATH": "/absolute/path/to/your/repository",
        "EMBEDDING_PROVIDER": "gemini",
        "GEMINI_API_KEY": "your-api-key-here",
        "QDRANT_URL": "http://localhost:6333"
      }
    }
  }
}

Restart Windsurf after saving.

Zed

Zed has experimental MCP support:

  1. Open Zed Settings (Cmd/Ctrl + ,)
  2. Add to your settings.json:
{
  "experimental": {
    "mcp_servers": {
      "codebase-index": {
        "command": "npx",
        "args": ["-y", "mcp-codebase-index"],
        "env": {
          "CODEBASE_PATH": "/absolute/path/to/your/repository",
          "EMBEDDING_PROVIDER": "gemini",
          "GEMINI_API_KEY": "your-api-key-here",
          "QDRANT_URL": "http://localhost:6333"
        }
      }
    }
  }
}

Restart Zed after saving.

Other MCP Clients

For any MCP-compatible client, use this standard configuration:

{
  "mcpServers": {
    "codebase-index": {
      "command": "npx",
      "args": ["-y", "mcp-codebase-index"],
      "env": {
        "CODEBASE_PATH": "/absolute/path/to/your/repository",
        "EMBEDDING_PROVIDER": "gemini",
        "GEMINI_API_KEY": "your-api-key-here",
        "QDRANT_URL": "http://localhost:6333"
      }
    }
  }
}

Consult your client's documentation for the exact config file location.

4. Restart Your IDE/Editor

After saving the configuration, restart your IDE/editor. The server will automatically start indexing your codebase on first run.

Usage

Once configured, you can ask your AI assistant to search your codebase:

Example queries:

  • "Find the authentication middleware"
  • "Show me database connection code"
  • "Where is the user validation logic?"
  • "Find API endpoint handlers"
  • "Show me error handling utilities"

Advanced usage:

  • "Search for 'rate limiting' in TypeScript files"
  • "Find functions related to payment processing in the /src/api directory"
  • "Show me recent changes (reindex first)"

Available Tools

The MCP server provides these tools to your AI assistant:

codebase_search

Search your codebase using natural language or code queries.

Parameters:

  • query (required) - Your search query
  • limit - Number of results (default: 10)
  • threshold - Similarity threshold 0-1 (default: 0.7)
  • fileTypes - Filter by extensions (e.g., [".ts", ".js"])
  • paths - Filter by paths (e.g., ["src/api"])
  • includeContext - Include surrounding code (default: true)
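
For example, a filtered search might pass arguments like the following (the surrounding tool-call envelope is handled by your MCP client; the field names are the parameters listed above):

```json
{
  "query": "rate limiting middleware",
  "limit": 5,
  "threshold": 0.6,
  "fileTypes": [".ts"],
  "paths": ["src/api"]
}
```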

indexing_status

Check indexing progress and statistics.

reindex

Re-index your codebase (useful after pulling changes).

Parameters:

  • mode - 'full', 'incremental', or 'file'
  • paths - Specific files to reindex (optional)
  • force - Force reindex even if unchanged
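
For example, after pulling new commits an assistant might invoke reindex with:

```json
{
  "mode": "incremental",
  "force": false
}
```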

configure_indexer

Update indexer settings at runtime.

validate_config

Test your configuration and connections.

clear_index

Clear all indexed data and start fresh.

Configuration Reference

Required Settings

| Variable | Description | Example |
|----------|-------------|---------|
| CODEBASE_PATH | Absolute path to your repository | /Users/you/projects/myapp |
| EMBEDDING_PROVIDER | Provider to use | gemini, openai, or ollama |
| QDRANT_URL | Qdrant instance URL | http://localhost:6333 |

Provider-Specific Settings

For Gemini:

GEMINI_API_KEY=your-api-key

For OpenAI:

OPENAI_API_KEY=your-api-key
OPENAI_MODEL=text-embedding-3-small  # or text-embedding-3-large

For Ollama:

OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL=nomic-embed-text

Optional Settings

| Variable | Description | Default |
|----------|-------------|---------|
| COLLECTION_NAME | Qdrant collection name | codebase_index |
| INDEX_BATCH_SIZE | Files per batch | 50 |
| INDEX_CONCURRENCY | Parallel processing limit | 5 |
| LOG_LEVEL | Logging detail | info |
| QDRANT_API_KEY | For Qdrant Cloud | - |

Supported Languages

  • TypeScript/JavaScript
  • Python
  • Java
  • Go
  • Rust
  • C/C++
  • C#
  • Ruby
  • PHP
  • Markdown

Unsupported languages fall back to intelligent text chunking.

Troubleshooting

"Cannot find module" errors

npm install -g mcp-codebase-index

Indexing is slow

Reduce INDEX_CONCURRENCY or increase INDEX_BATCH_SIZE in your configuration.

"Connection refused" to Qdrant

  • Ensure Qdrant is running: docker ps or check Qdrant Cloud status
  • Verify QDRANT_URL is correct
  • For cloud: ensure QDRANT_API_KEY is set

No search results

  • Check indexing status using the indexing_status tool
  • Try reindexing with the reindex tool
  • Lower the similarity threshold in your search

Rate limiting errors

If using a free API tier:

  • Reduce INDEX_CONCURRENCY to 1-2
  • Increase INDEX_BATCH_SIZE to reduce API calls
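
If you drive the embedding API yourself, one way to ride out transient 429s is exponential backoff with jitter around each call. A hypothetical sketch; the package's internal retry behavior may differ:

```python
import random
import time

def with_backoff(call, max_retries: int = 5, base_delay: float = 1.0):
    """Retry a zero-argument callable (e.g. one embedding request) with
    exponential backoff plus jitter; re-raise after the final attempt."""
    for attempt in range(max_retries):
        try:
            return call()
        except Exception:
            if attempt == max_retries - 1:
                raise
            # Delay doubles each attempt; jitter spreads out retries
            time.sleep(base_delay * (2 ** attempt) + random.random() * base_delay)
```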

How It Works

  1. Code Parsing: Uses Tree-sitter to parse code into meaningful chunks (functions, classes, etc.)
  2. Embedding Generation: Converts code into vector embeddings using your chosen AI provider
  3. Vector Storage: Stores embeddings in Qdrant for fast similarity search
  4. Semantic Search: Finds relevant code by comparing query embeddings to stored code embeddings
  5. Real-time Updates: Watches for file changes and automatically reindexes
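
Steps 3 and 4 boil down to nearest-neighbor search by cosine similarity, which Qdrant performs at scale. A toy sketch with hand-made two-dimensional vectors (the real embeddings have hundreds of dimensions, and this is not the server's actual code):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity: dot product of the vectors over the
    product of their magnitudes."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def search(query_vec, index, limit=10, threshold=0.7):
    """index: list of (chunk_id, embedding) pairs. Return the top
    matches whose similarity to the query meets the threshold."""
    scored = [(cid, cosine(query_vec, vec)) for cid, vec in index]
    scored = [s for s in scored if s[1] >= threshold]
    scored.sort(key=lambda s: s[1], reverse=True)
    return scored[:limit]
```

This also shows why lowering `threshold` (see Troubleshooting) surfaces more, but looser, matches.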

Performance Tips

  • For large codebases (1000+ files): Use INDEX_BATCH_SIZE=100 and INDEX_CONCURRENCY=3
  • For fast iteration: Use Ollama with local models (no API calls)
  • For best quality: Use OpenAI's text-embedding-3-large model
  • For free usage: Use Google Gemini (generous free tier)

Privacy & Security

  • All code indexing happens locally or in your chosen infrastructure
  • API keys are only used for embedding generation
  • Code is only sent to your chosen embedding provider; no other third parties receive it
  • Self-hosted Ollama option keeps everything completely local

Examples

Example 1: Find Authentication Code

You: "Find all authentication middleware in the codebase"
AI Assistant: [Uses codebase_search tool with query "authentication middleware"]

Example 2: Reindex After Git Pull

You: "I just pulled new changes, please reindex"
AI Assistant: [Uses reindex tool with mode "incremental"]

Example 3: Search Specific Directory

You: "Search for error handlers in the src/api directory"
AI Assistant: [Uses codebase_search with query "error handlers" and paths ["src/api"]]

Contributing

Contributions are welcome! Please visit our GitHub repository to:

  • Report issues
  • Submit pull requests
  • Request features
  • Ask questions

License

MIT License - see LICENSE file for details

Support

Acknowledgments

Built with: