codesift-mcp

v0.1.0

MCP server for code intelligence — 21 tools for symbol search, call graph analysis, and semantic code retrieval

CodeSift -- Token-efficient code intelligence for AI agents

CodeSift indexes your codebase with tree-sitter AST parsing and gives AI agents 22 search/retrieval tools via CLI or MCP server. In the benchmark below, it uses 20-33% fewer tokens than raw grep/Read workflows for text search and file-structure tasks.

Quick install

npm install -g codesift-mcp

Quick start

# Index a project
codesift index /path/to/project

# Search for a function
codesift symbols local/my-project "createUser" --kind function --include-source

# Semantic search (requires embedding provider)
codesift retrieve local/my-project \
  --queries '[{"type":"semantic","query":"how does caching work?"}]'

Benchmark results

Measured on a real 4,127-file TypeScript codebase (70 tasks, CodeSift CLI vs Bash grep/Read).

| Category | CodeSift | Bash grep | Delta |
|----------|----------|-----------|-------|
| Text Search | 48,930 tok | 72,993 tok | -33% |
| Symbol Search | 63,829 tok | 60,282 tok | +6% |
| File Structure | 36,580 tok | 45,489 tok | -20% |
| Code Retrieval | 57,703 tok | 60,482 tok | -5% |
| Relationships | 52,312 tok | 60,810 tok | -14% |
| Semantic Search | 7.8/10 quality | 6.5/10 | +20% quality |

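CodeSift uses fewer tokens in four of the five token-measured categories and scores higher on semantic search quality. Symbol search is close to parity (+6% tokens; its verbose output is being optimized). Relationship tracing is being rewritten for AST-level accuracy.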
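
The Delta column can be reproduced from the token totals. The following is a minimal Python sketch that assumes each delta is the relative change against the Bash grep baseline, rounded to the nearest whole percent; the rounding rule is an assumption, not something the benchmark states.

```python
# Token totals copied from the benchmark table: (CodeSift, Bash grep).
TOKENS = {
    "Text Search": (48_930, 72_993),
    "Symbol Search": (63_829, 60_282),
    "File Structure": (36_580, 45_489),
    "Code Retrieval": (57_703, 60_482),
    "Relationships": (52_312, 60_810),
}


def delta_percent(codesift: float, baseline: float) -> int:
    """Relative change versus the baseline, rounded to a whole percent (assumed rounding)."""
    return round((codesift - baseline) / baseline * 100)


deltas = {category: delta_percent(*pair) for category, pair in TOKENS.items()}

# Semantic search is scored on quality (out of 10) rather than tokens.
quality_delta = delta_percent(7.8, 6.5)

for category, delta in deltas.items():
    print(f"{category}: {delta:+d}%")
print(f"Semantic Search: {quality_delta:+d}% quality")
```

Negative values mean CodeSift used fewer tokens than the baseline for that category.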

CLI commands

Indexing

| Command | Description |
|---------|-------------|
| codesift index <path> | Index a local folder |
| codesift index-repo <url> | Clone and index a remote git repository |
| codesift repos | List all indexed repositories |
| codesift invalidate <repo> | Clear index cache for a repository |

Search

| Command | Description |
|---------|-------------|
| codesift search <repo> <query> | Full-text search across all files |
| codesift symbols <repo> <query> | Search symbols by name/signature |

Outline

| Command | Description |
|---------|-------------|
| codesift tree <repo> | File tree with symbol counts |
| codesift outline <repo> <file> | Symbol outline of a single file |
| codesift repo-outline <repo> | High-level repository outline |

Symbol retrieval

| Command | Description |
|---------|-------------|
| codesift symbol <repo> <id> | Get a single symbol by ID |
| codesift symbols-batch <repo> <ids...> | Get multiple symbols by ID |
| codesift find <repo> <query> | Find symbol and show source |
| codesift refs <repo> <name> | Find all references to a symbol |

Graph & analysis

| Command | Description |
|---------|-------------|
| codesift trace <repo> <name> | Trace call chain (callers/callees) |
| codesift impact <repo> --since <ref> | Blast radius of git changes |
| codesift context <repo> <query> | Assemble relevant code context |
| codesift knowledge-map <repo> | Module dependency map |

Diff

| Command | Description |
|---------|-------------|
| codesift diff <repo> --since <ref> | Structural diff between git refs |
| codesift changed <repo> --since <ref> | List changed symbols between refs |

Batch & utility

| Command | Description |
|---------|-------------|
| codesift retrieve <repo> --queries <json> | Batch multiple queries in one call |
| codesift stats | Show usage statistics |
| codesift generate-claude-md <repo> | Generate CLAUDE.md project summary |

When to use CodeSift vs grep

| Task | Best tool | Why |
|------|-----------|-----|
| Find text in files | codesift search | 33% fewer tokens, BM25 ranking |
| Find function by name | codesift symbols | Returns signature + body in 1 call |
| File structure | codesift tree | 20% fewer tokens, symbol counts |
| "How does X work?" | codesift retrieve (semantic) | 20% better quality on concept queries |
| Find ALL occurrences | grep -rn | Exhaustive, no top_k cap |
| Count matches | grep -c | Simple exact count |
| Call chain tracing | grep -rn "fn(" | CodeSift trace is being rewritten |

MCP server

CodeSift runs as an MCP server, exposing all 22 tools to AI agents like Claude.

Claude Code (CLI)

Add to ~/.claude.json:

{
  "mcpServers": {
    "codesift": {
      "command": "codesift-mcp"
    }
  }
}

Or from source:

{
  "mcpServers": {
    "codesift": {
      "command": "node",
      "args": ["/path/to/codesift-mcp/dist/server.js"]
    }
  }
}

Claude Desktop

Add to ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows):

{
  "mcpServers": {
    "codesift": {
      "command": "node",
      "args": ["/path/to/codesift-mcp/dist/server.js"]
    }
  }
}
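
If the config file already lists other MCP servers, the codesift entry has to be merged in rather than pasted over them. The following is a rough Python sketch of that merge; the helper names `claude_desktop_config_path` and `add_codesift_server` are hypothetical and not part of CodeSift, and the `other-server` entry is made up for illustration.

```python
import json
import os
import platform
from pathlib import Path


def claude_desktop_config_path() -> Path:
    """Return the config location listed above for macOS or Windows."""
    system = platform.system()
    if system == "Darwin":
        return (Path.home() / "Library" / "Application Support"
                / "Claude" / "claude_desktop_config.json")
    if system == "Windows":
        return Path(os.environ["APPDATA"]) / "Claude" / "claude_desktop_config.json"
    # This README documents no path for other platforms.
    raise NotImplementedError(f"No config path documented here for {system}")


def add_codesift_server(config: dict, server_js: str) -> dict:
    """Return a copy of config with a codesift entry, keeping any other servers."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["codesift"] = {"command": "node", "args": [server_js]}
    merged["mcpServers"] = servers
    return merged


if __name__ == "__main__":
    # Hypothetical existing config with one unrelated server.
    existing = {"mcpServers": {"other-server": {"command": "other"}}}
    updated = add_codesift_server(existing, "/path/to/codesift-mcp/dist/server.js")
    print(json.dumps(updated, indent=2))
```

The sketch only builds the merged dictionary; reading and writing the file at `claude_desktop_config_path()` is left out on purpose so that nothing is overwritten by accident.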

Semantic search

Semantic search uses embeddings to answer concept queries like "how does authentication work?" that keyword search misses.

Setup

Set one of these environment variables:

| Variable | Provider | Model |
|----------|----------|-------|
| CODESIFT_VOYAGE_API_KEY | Voyage AI | voyage-code-3 |
| CODESIFT_OPENAI_API_KEY | OpenAI | text-embedding-3-small |
| CODESIFT_OLLAMA_URL | Ollama (local) | nomic-embed-text |
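
Before running a semantic query, it can help to check which of these variables is actually set in the current environment. The following is a small Python sketch for that check; `configured_providers` is a hypothetical helper, and it makes no claim about which provider CodeSift prefers when more than one variable is set.

```python
import os

# Variable -> (provider, model), copied from the table above.
PROVIDERS = {
    "CODESIFT_VOYAGE_API_KEY": ("Voyage AI", "voyage-code-3"),
    "CODESIFT_OPENAI_API_KEY": ("OpenAI", "text-embedding-3-small"),
    "CODESIFT_OLLAMA_URL": ("Ollama (local)", "nomic-embed-text"),
}


def configured_providers(env=None):
    """List (provider, model) pairs whose variable is set to a non-empty value."""
    env = os.environ if env is None else env
    return [PROVIDERS[name] for name in PROVIDERS if env.get(name)]


if __name__ == "__main__":
    found = configured_providers()
    if not found:
        print("No embedding provider configured; semantic search is unavailable.")
    for provider, model in found:
        print(f"{provider} configured (model: {model})")
```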

Usage

# Pure semantic search
codesift retrieve local/my-project \
  --queries '[{"type":"semantic","query":"error handling and retry logic","top_k":10}]'

# Hybrid search (semantic + BM25 text, RRF-merged)
codesift retrieve local/my-project \
  --queries '[{"type":"hybrid","query":"caching strategy","top_k":10}]'

Semantic and hybrid queries exclude test files by default to maximize token efficiency. To include test files, set "exclude_tests": false in the sub-query or pass --exclude-tests=false on the CLI.
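
Hand-writing the `--queries` JSON inside shell quotes is error-prone, so a script can build it instead. The following is a minimal Python sketch that uses only the sub-query fields shown in this section (`type`, `query`, `top_k`, `exclude_tests`); `build_retrieve_command` is a hypothetical helper name.

```python
import json
import shlex


def build_retrieve_command(repo: str, queries: list[dict]) -> list[str]:
    """Build the argv for `codesift retrieve`, serializing the sub-queries as JSON."""
    return ["codesift", "retrieve", repo, "--queries", json.dumps(queries)]


queries = [
    # Hybrid sub-query that opts back in to test files.
    {"type": "hybrid", "query": "caching strategy", "top_k": 10, "exclude_tests": False},
]
command = build_retrieve_command("local/my-project", queries)

# Print a copy-pasteable, shell-quoted form of the command.
print(shlex.join(command))
```

`json.dumps` turns Python's `False` into JSON `false`, and passing `command` as a list to `subprocess.run` avoids shell quoting altogether.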

MCP example

{
  "mcpServers": {
    "codesift": {
      "command": "codesift-mcp",
      "env": {
        "CODESIFT_OPENAI_API_KEY": "sk-..."
      }
    }
  }
}

Configuration

All configuration is via environment variables.

| Variable | Description | Default |
|----------|-------------|---------|
| CODESIFT_DATA_DIR | Storage directory for indexes | ~/.codesift |
| CODESIFT_WATCH_DEBOUNCE_MS | File watcher debounce interval | 500 |
| CODESIFT_DEFAULT_TOKEN_BUDGET | Default token budget for retrieval | 8000 |
| CODESIFT_DEFAULT_TOP_K | Default max results for search | 20 |
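
To illustrate how these variables and their defaults combine, here is a short Python sketch that resolves the effective settings from an environment mapping. It mirrors the table only; how CodeSift itself parses or validates the values is not documented here, and `load_config` is a hypothetical name.

```python
import os
from pathlib import Path

# Defaults copied from the configuration table.
DEFAULTS = {
    "CODESIFT_DATA_DIR": "~/.codesift",
    "CODESIFT_WATCH_DEBOUNCE_MS": "500",
    "CODESIFT_DEFAULT_TOKEN_BUDGET": "8000",
    "CODESIFT_DEFAULT_TOP_K": "20",
}


def load_config(env=None) -> dict:
    """Resolve each setting from env, falling back to the documented default."""
    env = os.environ if env is None else env
    raw = {name: env.get(name, default) for name, default in DEFAULTS.items()}
    return {
        "data_dir": Path(raw["CODESIFT_DATA_DIR"]).expanduser(),
        "watch_debounce_ms": int(raw["CODESIFT_WATCH_DEBOUNCE_MS"]),
        "default_token_budget": int(raw["CODESIFT_DEFAULT_TOKEN_BUDGET"]),
        "default_top_k": int(raw["CODESIFT_DEFAULT_TOP_K"]),
    }


if __name__ == "__main__":
    for key, value in load_config().items():
        print(f"{key} = {value}")
```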

How it works

  1. Indexing -- Tree-sitter WASM grammars parse source files into ASTs. Symbol extraction produces functions, classes, methods, types, constants, etc. with signatures, docstrings, and source code.

  2. BM25F search -- Symbols are tokenized (camelCase/snake_case splitting) and indexed with field-weighted BM25 scoring. Name matches rank 3x higher than body matches.

  3. Semantic search (optional) -- Source code is chunked and embedded via the configured provider. Queries are embedded at search time and ranked by cosine similarity. A query can be decomposed into multiple sub-queries, whose result lists are merged with Reciprocal Rank Fusion (RRF, k=60).

  4. Hybrid search -- Combines semantic embedding similarity with BM25 text matches via RRF, getting the best of both keyword and concept search.

  5. File watcher -- chokidar watches indexed folders for changes. Modified files are re-parsed and the index is updated incrementally.
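
Step 2 can be sketched as follows. This is a simplified Python illustration, not CodeSift's implementation: it splits identifiers on camelCase and snake_case boundaries and scores two fields with a BM25F-style formula in which weighted term frequencies are summed before saturation. The 3:1 name-to-body weighting comes from the description above; `K1 = 1.2`, the omission of length normalization, and the sample symbols are assumptions.

```python
import math
import re
from collections import Counter

# Name matches weigh 3x body matches, per the description above.
FIELD_WEIGHTS = {"name": 3.0, "body": 1.0}
K1 = 1.2  # common BM25 saturation default, assumed here


def tokenize(text: str) -> list[str]:
    """Split on non-alphanumerics (covers snake_case), then on camelCase boundaries."""
    tokens = []
    for word in re.findall(r"[A-Za-z0-9]+", text):
        parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|[0-9]+", word)
        tokens.extend(part.lower() for part in parts)
    return tokens


def bm25f_scores(query: str, symbols: list[dict]) -> list[float]:
    """Score each symbol; length normalization is left out to keep the sketch short."""
    docs = [{f: Counter(tokenize(s[f])) for f in FIELD_WEIGHTS} for s in symbols]
    n = len(docs)
    scores = [0.0] * n
    for term in set(tokenize(query)):
        df = sum(1 for d in docs if any(d[f][term] for f in FIELD_WEIGHTS))
        if df == 0:
            continue
        idf = math.log(1 + (n - df + 0.5) / (df + 0.5))
        for i, d in enumerate(docs):
            # Combine per-field term frequencies with their weights, then saturate.
            weighted_tf = sum(w * d[f][term] for f, w in FIELD_WEIGHTS.items())
            scores[i] += idf * weighted_tf / (K1 + weighted_tf)
    return scores


# Hypothetical symbols: one defines createUser, the other only calls it.
symbols = [
    {"name": "createUser", "body": "return db.insert(record)"},
    {"name": "saveAccount", "body": "account = createUser(data)"},
]
scores = bm25f_scores("create_user", symbols)
print(scores)
```

With these inputs the symbol named `createUser` scores higher than the one that merely mentions it in its body, which is the effect the name weighting is meant to produce.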
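
Steps 3 and 4 can be illustrated in the same spirit. The following Python sketch ranks chunks by cosine similarity and then merges that ranking with a keyword ranking using Reciprocal Rank Fusion with k=60, as stated above. The two-dimensional vectors, the file names, and the BM25 ranking are made up for illustration; real embeddings come from the configured provider.

```python
import math

RRF_K = 60  # the k value stated above


def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_cosine(query_vec: list[float], chunk_vecs: dict) -> list[str]:
    """Return chunk ids ordered from most to least similar to the query."""
    return sorted(
        chunk_vecs,
        key=lambda cid: cosine_similarity(query_vec, chunk_vecs[cid]),
        reverse=True,
    )


def rrf_merge(rankings: list[list[str]], k: int = RRF_K) -> list[str]:
    """Fuse ranked lists: each item earns 1 / (k + rank) per list it appears in."""
    fused = {}
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            fused[item] = fused.get(item, 0.0) + 1.0 / (k + rank)
    return sorted(fused, key=lambda item: fused[item], reverse=True)


# Hypothetical chunk embeddings and query embedding.
chunk_vecs = {
    "cache.ts": [0.9, 0.1],
    "auth.ts": [0.1, 0.9],
    "retry.ts": [0.6, 0.4],
}
semantic_ranking = rank_by_cosine([1.0, 0.0], chunk_vecs)

# Hypothetical keyword (BM25) ranking for the same query.
bm25_ranking = ["retry.ts", "auth.ts", "cache.ts"]

hybrid_ranking = rrf_merge([semantic_ranking, bm25_ranking])
print(semantic_ranking)
print(hybrid_ranking)
```

RRF works on ranks rather than raw scores, so the cosine similarities and BM25 scores never need to be put on a common scale.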

Supported languages

TypeScript, JavaScript (JSX/TSX), Python, Go, Rust, Java, Ruby, PHP, Markdown, CSS, Prisma.

Development

git clone https://github.com/greglas/codesift-mcp.git
cd codesift-mcp
npm install
npm run download-wasm   # Download tree-sitter WASM grammars
npm run build           # TypeScript compilation
npm test                # Run tests (Vitest)
npm run test:coverage   # Coverage report

License

MIT