semantic-retrieval-mcp

v1.14.0

Semantic code retrieval MCP Server — natural language → relevant code snippets, with incremental sync and file watching

Semantic code retrieval MCP Server — natural language queries → relevant code snippets.

What it does

In any MCP-compatible AI client (Claude Code, Cline, Windsurf, etc.), ask:

"Find the authentication handler in this project"

The AI calls this MCP Server, which performs semantic search across your codebase and returns precisely matched code snippets.

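Core capability: natural language → relevant code. It works across programming languages, keeps the index current through file watching, and matches code by meaning rather than by literal text, which plain grep cannot do.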

Install

npm install -g semantic-retrieval-mcp

Or use directly with npx:

npx -y semantic-retrieval-mcp

Configuration

With npx (recommended)

{
  "mcpServers": {
    "semantic-retrieval": {
      "command": "npx",
      "args": ["-y", "semantic-retrieval-mcp"],
      "env": {
        "SR_TOKEN": "your-access-token",
        "SR_TENANT": "https://your-tenant-url/"
      }
    }
  }
}

With global install

{
  "mcpServers": {
    "semantic-retrieval": {
      "command": "semantic-retrieval-mcp",
      "env": {
        "SR_TOKEN": "your-access-token",
        "SR_TENANT": "https://your-tenant-url/"
      }
    }
  }
}

Auto-detect from editor

If a supported editor (vscode, cursor, windsurf, trae, or kiro) has an active session, the server can read credentials from it automatically. Set SR_EDITOR to the editor's name:

{
  "mcpServers": {
    "semantic-retrieval": {
      "command": "npx",
      "args": ["-y", "semantic-retrieval-mcp"],
      "env": {
        "SR_EDITOR": "cursor"
      }
    }
  }
}

Environment Variables

| Variable | Description |
|---|---|
| SR_TOKEN | Access token (recommended) |
| SR_TENANT | Tenant URL (recommended) |
| SR_EDITOR | Auto-detect from editor: vscode / cursor / windsurf / trae / kiro |
| SR_WATCH | File watching enabled by default; set to 0 to disable |
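
As a rough illustration of how these variables could fit together, here is a minimal sketch. The `readConfig` helper is hypothetical and is not part of the package. Two behaviors in it are assumptions: that an explicit SR_TOKEN and SR_TENANT pair takes precedence over SR_EDITOR, and that an error is raised when neither is set. Only the variable names and the "0 disables watching" rule come from the table.

```javascript
// Hypothetical helper, not the package's actual code.
// Assumption: explicit token + tenant win over editor auto-detection.
function readConfig(env = process.env) {
  // File watching stays on unless SR_WATCH is exactly "0".
  const watch = env.SR_WATCH !== "0";

  if (env.SR_TOKEN && env.SR_TENANT) {
    return { source: "env", token: env.SR_TOKEN, tenant: env.SR_TENANT, watch };
  }
  if (env.SR_EDITOR) {
    // The real server would read credentials from the editor session here.
    return { source: "editor", editor: env.SR_EDITOR.toLowerCase(), watch };
  }
  // Assumption: missing credentials are treated as a configuration error.
  throw new Error("Set SR_TOKEN and SR_TENANT, or SR_EDITOR");
}

console.log(readConfig({ SR_TOKEN: "your-access-token", SR_TENANT: "https://your-tenant-url/" }));
console.log(readConfig({ SR_EDITOR: "cursor", SR_WATCH: "0" }));
```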

Usage Scenarios

Scenario 1: AI auto-invokes (most common)

After configuration, just chat with your AI normally. When it needs to understand code, it calls the tool automatically:

You: "Refactor the login logic from session-based to JWT"

AI internally: → calls codebase-retrieval("authentication and login logic")
              → receives relevant code snippets
              → provides solution based on actual code

You don't need to do anything. On the first query for a directory, the server automatically:

  1. Scans all source files
  2. Computes fingerprints (SHA256)
  3. Uploads new files to the indexing backend
  4. Creates a checkpoint
  5. Performs semantic retrieval

Subsequent queries reuse the cached checkpoint — no re-scanning or re-uploading.
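
The fingerprint step can be sketched as follows. This README describes the fingerprint as a SHA256 over path and content, but it does not say how the two are joined, so hashing the relative path followed by the file bytes is an assumption. The `fingerprint` and `findNew` names are hypothetical, and the `known` set stands in for whatever the backend already holds.

```javascript
const { createHash } = require("node:crypto");

// Assumption: the hash covers the relative path, then the file content.
function fingerprint(relativePath, content) {
  return createHash("sha256").update(relativePath).update(content).digest("hex");
}

// Returns only the files whose fingerprint is not already known,
// i.e. the ones that would need uploading.
function findNew(files, known) {
  return files
    .map((f) => ({ ...f, blobName: fingerprint(f.path, f.content) }))
    .filter((f) => !known.has(f.blobName));
}

const files = [
  { path: "src/auth.js", content: "export function login() {}" },
  { path: "src/db.js", content: "export function connect() {}" },
];
// Pretend src/db.js was uploaded during an earlier sync.
const known = new Set([fingerprint("src/db.js", "export function connect() {}")]);

console.log(findNew(files, known).map((f) => f.path)); // [ 'src/auth.js' ]
```

Because the path is part of the hash, moving a file produces a new fingerprint even when its content is unchanged.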

Scenario 2: Different project directory

Each directory_path is managed independently. New directories get indexed automatically.

Scenario 3: Force re-sync after major changes

You: "I just merged a big branch, re-index the project"
AI: → calls sync-workspace(directory_path)

Scenario 4: Real-time file watching

File watching is enabled by default. When active, chokidar monitors file changes:

  • File edits/creates/deletes → automatic fingerprint update
  • 2-second debounce for batch processing (50 pending changes trigger an immediate flush)
  • Auto upload + checkpoint update
  • Next query always uses the latest index
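
The debounce rule above could look roughly like this. It is a sketch, not the package's implementation: `createBatcher` is a hypothetical name, the timer functions are injectable only to make the logic easy to test, and counting unique paths (rather than raw change events) toward the limit of 50 is an assumption.

```javascript
// Hypothetical batcher: collects changed paths and flushes them together.
function createBatcher(flush, opts = {}) {
  const {
    delayMs = 2000, // quiet period before a flush
    maxBatch = 50, // flush at once when this many changes are pending
    setTimer = setTimeout,
    clearTimer = clearTimeout,
  } = opts;
  const pending = new Set(); // assumption: repeated edits to one file count once
  let timer = null;

  function flushNow() {
    if (timer !== null) {
      clearTimer(timer);
      timer = null;
    }
    if (pending.size === 0) return;
    const batch = [...pending];
    pending.clear();
    flush(batch);
  }

  return {
    add(path) {
      pending.add(path);
      if (pending.size >= maxBatch) return flushNow();
      if (timer !== null) clearTimer(timer); // restart the quiet period
      timer = setTimer(flushNow, delayMs);
    },
    flushNow,
  };
}

const batches = [];
const batcher = createBatcher((batch) => batches.push(batch));
for (let i = 0; i < 50; i++) batcher.add(`src/file${i}.js`);
console.log(batches.length, batches[0].length); // 1 50
```

With chokidar, a batcher like this would typically be fed from the watcher's `all` event, for example `chokidar.watch(dir).on("all", (event, path) => batcher.add(path))`.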

Scenario 5: Token rotation

When the token changes, cached checkpoints auto-invalidate. Next query re-syncs with the new token automatically.

Tools

codebase-retrieval

| Parameter | Type | Description |
|---|---|---|
| information_request | string | Natural language query (e.g. "Find the database connection code") |
| directory_path | string | Absolute path to the project directory to search |

sync-workspace

| Parameter | Type | Description |
|---|---|---|
| directory_path | string | Absolute path to the project directory to re-sync |
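
Your AI client normally builds these calls for you, but it can help to see what goes over the wire. In MCP, tools are invoked with a JSON-RPC 2.0 `tools/call` request whose `params` carry the tool `name` and its `arguments`. The sketch below builds one request per tool; the `toolCall` helper and the example directory are hypothetical.

```javascript
// Builds a JSON-RPC 2.0 "tools/call" request in the shape MCP uses.
let nextId = 1;
function toolCall(name, args) {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

const search = toolCall("codebase-retrieval", {
  information_request: "Find the database connection code",
  directory_path: "/home/me/projects/shop", // hypothetical absolute path
});
const resync = toolCall("sync-workspace", {
  directory_path: "/home/me/projects/shop",
});

console.log(JSON.stringify(search, null, 2));
console.log(JSON.stringify(resync));
```

Over the stdio transport, each message is sent as a single line of JSON.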

How it works

Your AI Client
    │ MCP (stdio)
    ▼
MCP Server (index.js)
    │
    ├─ First query for a dir ──→ BlobManager
    │   ├─ Scan files → SHA256(path+content) = blobName
    │   ├─ POST find-missing     → Which files does the server not have?
    │   ├─ POST batch-upload     → Upload missing files
    │   ├─ POST checkpoint-blobs → Create snapshot, get checkpointId
    │   └─ Persist to ~/.semantic-retrieval-mcp/<hash>.json
    │
    ├─ Subsequent queries ──→ Use cached checkpointId
    │
    └─ POST agents/codebase-retrieval
        → Cloud semantic search
        → Return matching code snippets to AI

File watching (SR_WATCH=1):
    chokidar → file change → debounce → find-missing → upload → checkpoint
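
The three-step sync in the diagram can be sketched as below. Only the endpoint names (find-missing, batch-upload, checkpoint-blobs) come from this README. The request and response field names (`blobNames`, `missing`, `blobs`, `checkpointId` as a response key) are assumptions, and `post` is a hypothetical injected transport, replaced here by an in-memory fake so the example runs without network access.

```javascript
// Sketch of the first-query sync. `post(endpoint, body)` returns a Promise.
// Payload field names are assumptions, not the backend's real schema.
async function syncWorkspace(post, blobs) {
  const names = blobs.map((b) => b.blobName);

  // 1. Ask which fingerprints the backend does not have yet.
  const { missing } = await post("find-missing", { blobNames: names });

  // 2. Upload only those files.
  const toUpload = blobs.filter((b) => missing.includes(b.blobName));
  if (toUpload.length > 0) await post("batch-upload", { blobs: toUpload });

  // 3. Snapshot the full set of fingerprints and keep the returned id.
  const { checkpointId } = await post("checkpoint-blobs", { blobNames: names });
  return { checkpointId, uploaded: toUpload.length };
}

// In-memory stand-in for the backend.
function fakeBackend(stored = new Set()) {
  const calls = [];
  const post = async (endpoint, body) => {
    calls.push(endpoint);
    if (endpoint === "find-missing") {
      return { missing: body.blobNames.filter((n) => !stored.has(n)) };
    }
    if (endpoint === "batch-upload") {
      body.blobs.forEach((b) => stored.add(b.blobName));
      return {};
    }
    if (endpoint === "checkpoint-blobs") {
      return { checkpointId: `cp-${body.blobNames.length}` };
    }
    throw new Error(`unknown endpoint: ${endpoint}`);
  };
  return { post, calls };
}

const backend = fakeBackend(new Set(["aaa"]));
syncWorkspace(backend.post, [
  { blobName: "aaa", path: "src/db.js", content: "export function connect() {}" },
  { blobName: "bbb", path: "src/auth.js", content: "export function login() {}" },
]).then((result) => console.log(result, backend.calls));
```

When nothing is missing, the upload step is skipped and only find-missing and checkpoint-blobs are called.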

Cache

Each workspace directory's state is cached in ~/.semantic-retrieval-mcp/<hash>.json.

  • Survives MCP Server restarts
  • Same tenant, different token → checkpoint reused (content-based SHA256)
  • Different tenant → checkpoint auto-invalidated
  • Delete ~/.semantic-retrieval-mcp/ to clear all caches
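
A minimal sketch of the cache rules in this section follows. This README does not say what `<hash>` is computed from, so deriving it from the absolute workspace path (and truncating it to 16 hex characters) is an assumption. The `cacheFileFor` and `reusableCheckpoint` helpers and the cached object's field names are hypothetical.

```javascript
const { createHash } = require("node:crypto");
const os = require("node:os");
const path = require("node:path");

// Assumption: <hash> is derived from the absolute workspace path.
function cacheFileFor(
  directoryPath,
  cacheRoot = path.join(os.homedir(), ".semantic-retrieval-mcp")
) {
  const hash = createHash("sha256")
    .update(path.resolve(directoryPath))
    .digest("hex")
    .slice(0, 16);
  return path.join(cacheRoot, `${hash}.json`);
}

// A cached checkpoint is reused only when it was created for the same tenant.
function reusableCheckpoint(cached, tenant) {
  if (!cached || cached.tenant !== tenant) return null; // different tenant: re-sync
  return cached.checkpointId;
}

const cached = { tenant: "https://tenant-a.example/", checkpointId: "cp-123" };
console.log(cacheFileFor("/home/me/projects/shop"));
console.log(reusableCheckpoint(cached, "https://tenant-a.example/")); // cp-123
console.log(reusableCheckpoint(cached, "https://tenant-b.example/")); // null
```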

License

MIT