Semantic Code MCP


AI-powered semantic code search for coding agents. An MCP server with non-blocking background indexing, multi-provider embeddings (Gemini, Vertex AI, OpenAI, local), and Milvus / Zilliz Cloud vector storage — designed for multi-agent concurrent access.

Run Claude Code, Codex, Copilot, and Antigravity against the same code index simultaneously. Indexing runs in the background; search works immediately while indexing continues.

Ask "where do we handle authentication?" and find code that uses login, session, verifyCredentials — even when no file contains the word "authentication."

Quick Start

npx -y semantic-code-mcp@latest --workspace /path/to/your/project

MCP config:

{
  "mcpServers": {
    "semantic-code-mcp": {
      "command": "npx",
      "args": ["-y", "semantic-code-mcp@latest", "--workspace", "/path/to/project"]
    }
  }
}

Multiple agents can point at the same shared index:

graph LR
    A["Claude Code"] --> M["Milvus Standalone<br/>(Docker)"]
    B["Codex"] --> M
    C["Copilot"] --> M
    D["Antigravity"] --> M
    M --> V["Shared Vector Index"]

Why

Traditional grep and keyword search break down when you don't know the exact terms used in the codebase. Semantic search bridges that gap:

  • Concept matching — "error handling" finds try/catch, onRejected, fallback patterns
  • Typo-tolerant — "embeding modle" still finds embedding model code
  • Hybrid scoring — semantic similarity (0.7 weight) + lexical exact/partial match boost (up to +1.5)
  • Search dedup — per-file result limiting (default 2) prevents a single large file from dominating results
  • Context-aware chunking — AST-based (Tree-sitter) or smart regex splitting preserves code structure
  • Fast — progressive indexing lets you search while the codebase is still being indexed

Based on Cursor's research showing semantic search improves AI agent performance by 12.5%.

Setup

{
  "mcpServers": {
    "semantic-code-mcp": {
      "command": "npx",
      "args": ["-y", "semantic-code-mcp@latest", "--workspace", "/path/to/project"]
    }
  }
}

Claude Code: add the server under the mcpServers key in ~/.claude/settings.local.json
Claude Desktop: ~/Library/Application Support/Claude/claude_desktop_config.json

VS Code / Cursor: create .vscode/mcp.json in your project root:

{
  "servers": {
    "semantic-code-mcp": {
      "command": "npx",
      "args": ["-y", "semantic-code-mcp@latest", "--workspace", "${workspaceFolder}"]
    }
  }
}

VS Code and Cursor support ${workspaceFolder}. Windsurf requires absolute paths.

Codex: add to ~/.codex/config.toml:

[mcp_servers.semantic-code-mcp]
command = "npx"
args = ["-y", "semantic-code-mcp@latest", "--workspace", "/path/to/project"]

Antigravity: add to ~/.gemini/antigravity/mcp_config.json:

{
  "mcpServers": {
    "semantic-code-mcp": {
      "command": "npx",
      "args": ["-y", "semantic-code-mcp@latest", "--workspace", "/path/to/project"]
    }
  }
}

For monorepos or workspaces with 1000+ files, a shell wrapper script gives you:

  • Real-time logs — see indexing progress, error details, 429 retry status
  • No MCP timeout — long-running index operations won't be killed
  • Environment isolation — pin provider credentials per project

Create start-semantic-code-mcp.sh:

#!/bin/bash
export SMART_CODING_WORKSPACE="/path/to/monorepo"
export SMART_CODING_EMBEDDING_PROVIDER="vertex"
export SMART_CODING_VECTOR_STORE_PROVIDER="milvus"
export SMART_CODING_MILVUS_ADDRESS="http://localhost:19530"
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/service-account.json"
export SMART_CODING_VERTEX_PROJECT="your-gcp-project-id"

cd /path/to/semantic-code-mcp
exec node index.js

Make it executable:

chmod +x start-semantic-code-mcp.sh

Then reference in your MCP config:

{
  "semantic-code-mcp": {
    "command": "/absolute/path/to/start-semantic-code-mcp.sh",
    "args": []
  }
}

When to use shell scripts over npx:

  • Monorepo with multiple sub-projects sharing one index
  • 1000+ files requiring long initial indexing
  • Debugging 429 rate-limit or gRPC errors (need real-time stderr)
  • Pinning specific provider credentials per workspace

Features

Multi-Provider Embeddings

| Provider          | Model                   | Privacy    | Speed         |
| ----------------- | ----------------------- | ---------- | ------------- |
| Local (default)   | nomic-embed-text-v1.5   | 100% local | ~50ms/chunk   |
| Gemini            | gemini-embedding-001    | API call   | Fast, batched |
| OpenAI            | text-embedding-3-small  | API call   | Fast          |
| OpenAI-compatible | Any compatible endpoint | Varies     | Varies        |
| Vertex AI         | Google Cloud models     | GCP        | Fast          |

Flexible Vector Storage

  • SQLite (default) — zero-config, single-file .smart-coding-cache/embeddings.db
  • Milvus — scalable ANN search for large codebases or shared team indexes

Smart Code Chunking

Three modes to match your codebase:

  • smart (default) — regex-based, language-aware splitting
  • ast — Tree-sitter parsing for precise function/class boundaries
  • line — simple fixed-size line chunks
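
For a rough picture of what smart mode does — an illustrative sketch, not the actual implementation — a regex chunker can split at top-level declaration boundaries and fall back to a fixed-size window (the 25-line cap mirrors the SMART_CODING_CHUNK_SIZE default):

// Hypothetical sketch of regex-based, language-aware chunking.
const BOUNDARY = /^(export\s+)?(async\s+)?(function|class)\s+\w+/;

function smartChunk(source, maxLines = 25) {
  const lines = source.split("\n");
  const chunks = [];
  let start = 0;
  for (let i = 1; i <= lines.length; i++) {
    // Close the current chunk at a declaration boundary or at the size cap.
    const atBoundary = i < lines.length && BOUNDARY.test(lines[i]);
    if (i === lines.length || atBoundary || i - start >= maxLines) {
      chunks.push({ startLine: start + 1, text: lines.slice(start, i).join("\n") });
      start = i;
    }
  }
  return chunks;
}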

Resource Throttling

CPU capped at 50% during indexing. Your machine stays responsive.
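
The cap can be pictured like this — a hedged sketch assuming a batch loop; the real throttle lives inside the indexer and is configured via SMART_CODING_MAX_CPU_PERCENT:

// Illustrative throttle: sleep between batches so average CPU stays near the cap.
const MAX_CPU_PERCENT = Number(process.env.SMART_CODING_MAX_CPU_PERCENT ?? 50);

async function runThrottled(batchFn) {
  const wallStart = Date.now();
  const cpuStart = process.cpuUsage();
  await batchFn();
  const { user, system } = process.cpuUsage(cpuStart);
  const cpuMs = (user + system) / 1000;          // µs → ms
  const wallMs = Date.now() - wallStart;
  // Sleep long enough that cpuMs / (wallMs + sleep) ≈ MAX_CPU_PERCENT / 100.
  const sleep = cpuMs / (MAX_CPU_PERCENT / 100) - wallMs;
  if (sleep > 0) await new Promise(r => setTimeout(r, sleep));
}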

Multi-Agent Concurrent Access

Multiple AI agents (Claude Code, Codex, Copilot, Antigravity) can query the same vector index simultaneously via Milvus Standalone (Docker). No file locking, no index corruption.

Milvus Standalone runs 3 containers working together:

graph LR
    A["semantic-code-mcp"] -->|"gRPC :19530"| M["milvus standalone"]
    M -->|"object storage"| S["minio :9000"]
    M -->|"metadata"| E["etcd :2379"]

| Container  | Role                                  | Image           |
| ---------- | ------------------------------------- | --------------- |
| standalone | Vector engine (gRPC :19530)           | milvusdb/milvus |
| etcd       | Metadata store (cluster coordination) | coreos/etcd     |
| minio      | Object storage (index files, logs)    | minio/minio     |

Performance Guidelines

| Resource | Minimum | Recommended                   |
| -------- | ------- | ----------------------------- |
| RAM      | 4 GB    | 8 GB+                         |
| Disk     | 10 GB   | 50 GB+ (scales with codebase) |
| CPU      | 2 cores | 4+ cores                      |
| Docker   | v20+    | Latest                        |

⚠️ RAM is the critical bottleneck. Milvus Standalone idles at ~2.5 GB RAM across the 3 containers. Machines with < 4 GB will experience swap thrashing and gRPC timeouts. Check with docker stats.

1. Install with Docker Compose

# docker-compose.yml
version: '3.5'
services:
  etcd:
    image: coreos/etcd:v3.5.18
    environment:
      ETCD_AUTO_COMPACTION_MODE: revision
      ETCD_AUTO_COMPACTION_RETENTION: "1000"
      ETCD_QUOTA_BACKEND_BYTES: "4294967296"
    command: etcd -advertise-client-urls=http://127.0.0.1:2379 -listen-client-urls http://0.0.0.0:2379 --data-dir /etcd
    volumes:
      - etcd-data:/etcd

  minio:
    image: minio/minio:RELEASE.2023-03-20T20-16-18Z
    environment:
      MINIO_ACCESS_KEY: minioadmin
      MINIO_SECRET_KEY: minioadmin
    command: minio server /minio_data --console-address ":9001"
    ports:
      - "9000:9000"
      - "9001:9001"
    volumes:
      - minio-data:/minio_data

  standalone:
    image: milvusdb/milvus:v2.5.1
    command: ["milvus", "run", "standalone"]
    environment:
      ETCD_ENDPOINTS: etcd:2379
      MINIO_ADDRESS: minio:9000
    ports:
      - "19530:19530"
      - "9091:9091"
    volumes:
      - milvus-data:/var/lib/milvus
    depends_on:
      - etcd
      - minio

volumes:
  etcd-data:
  minio-data:
  milvus-data:

2. Start & Verify

# Start all 3 containers
docker compose up -d

# Verify all 3 containers are running
docker compose ps
# NAME         STATUS
# etcd         running
# minio        running
# standalone   running (healthy)

# Check RAM usage (expect ~2.5 GB total idle)
docker stats --no-stream

3. Configure MCP to use Milvus

{
  "env": {
    "SMART_CODING_VECTOR_STORE_PROVIDER": "milvus",
    "SMART_CODING_MILVUS_ADDRESS": "http://localhost:19530"
  }
}

4. Verify connection

# Should return collection list (may be empty initially)
curl http://localhost:19530/v1/vector/collections

5. Lifecycle Management

# Stop all containers (preserves data)
docker compose stop

# Restart after reboot
docker compose start

# Full reset (removes all indexed vectors)
docker compose down -v

# View logs for debugging
docker compose logs -f standalone

6. Monitoring

  • MinIO Console: http://localhost:9001 (minioadmin / minioadmin)
  • Milvus Health: http://localhost:9091/healthz
  • Container RAM: docker stats --no-stream

Troubleshooting

| Symptom | Cause | Fix |
| ------- | ----- | --- |
| gRPC timeout / connection refused | Milvus not fully started | Wait 30–60s after docker compose up -d; check docker compose logs standalone |
| Swap thrashing, slow queries | < 4 GB RAM | Upgrade RAM or use SQLite for single-agent setups |
| etcd: mvcc: database space exceeded | etcd compaction backlog | docker compose restart etcd |
| Milvus OOM killed | RAM pressure from other apps | Close heavy apps or increase Docker memory limit |

SQLite vs Milvus: SQLite is single-process — only one agent can write at a time. Milvus handles concurrent reads/writes from multiple agents without conflicts. Use Milvus when running 2+ agents on the same codebase.

Tools

| Tool | Description |
| ---- | ----------- |
| a_semantic_search | Find code by meaning. Hybrid semantic + exact match scoring. |
| b_index_codebase | Trigger manual reindex (normally automatic & incremental). |
| c_clear_cache | Reset embeddings cache entirely. |
| d_check_last_version | Look up latest package version from 20+ registries. |
| e_set_workspace | Switch project at runtime without restart. |
| f_get_status | Server health: version, index progress, config. |

IDE Setup

| IDE / App      | Guide | ${workspaceFolder} |
| -------------- | ----- | ------------------ |
| VS Code        | Setup | ✅ |
| Cursor         | Setup | ✅ |
| Windsurf       | Setup | ❌ |
| Claude Desktop | Setup | ❌ |
| OpenCode       | Setup | ❌ |
| Raycast        | Setup | ❌ |
| Antigravity    | Setup | ❌ |

Multi-Project

{
  "mcpServers": {
    "code-frontend": {
      "command": "npx",
      "args": ["-y", "semantic-code-mcp@latest", "--workspace", "/path/to/frontend"]
    },
    "code-backend": {
      "command": "npx",
      "args": ["-y", "semantic-code-mcp@latest", "--workspace", "/path/to/backend"]
    }
  }
}

Configuration

All settings via environment variables. Prefix: SMART_CODING_.

Core

| Variable | Default | Description |
| -------- | ------- | ----------- |
| SMART_CODING_VERBOSE | false | Detailed logging |
| SMART_CODING_MAX_RESULTS | 5 | Search results returned |
| SMART_CODING_BATCH_SIZE | 100 | Files per parallel batch |
| SMART_CODING_MAX_FILE_SIZE | 1048576 | Max file size (1 MB) |
| SMART_CODING_CHUNK_SIZE | 25 | Lines per chunk |
| SMART_CODING_CHUNKING_MODE | smart | smart / ast / line |
| SMART_CODING_WATCH_FILES | false | Auto-reindex on changes |
| SMART_CODING_AUTO_INDEX_DELAY | false | Background index on startup. false = off (multi-agent safe), true = 5s, or a millisecond value. Single-agent only. |
| SMART_CODING_MAX_CPU_PERCENT | 50 | CPU cap during indexing |

Embedding Provider

| Variable | Default | Description |
| -------- | ------- | ----------- |
| SMART_CODING_EMBEDDING_PROVIDER | local | local / gemini / openai / openai-compatible / vertex |
| SMART_CODING_EMBEDDING_MODEL | nomic-ai/nomic-embed-text-v1.5 | Model name |
| SMART_CODING_EMBEDDING_DIMENSION | 128 | MRL dimension (64–768) |
| SMART_CODING_DEVICE | auto | cpu / webgpu / auto |

Gemini

| Variable | Default | Description |
| -------- | ------- | ----------- |
| SMART_CODING_GEMINI_API_KEY | — | API key |
| SMART_CODING_GEMINI_MODEL | gemini-embedding-001 | Model |
| SMART_CODING_GEMINI_DIMENSIONS | 768 | Output dimensions |
| SMART_CODING_GEMINI_BATCH_SIZE | 24 | Micro-batch size |
| SMART_CODING_GEMINI_MAX_RETRIES | 3 | Retry count |

OpenAI / Compatible

| Variable | Default | Description |
| -------- | ------- | ----------- |
| SMART_CODING_EMBEDDING_API_KEY | — | API key |
| SMART_CODING_EMBEDDING_BASE_URL | — | Base URL (compatible only) |

Vertex AI

| Variable | Default | Description |
| -------- | ------- | ----------- |
| SMART_CODING_VERTEX_PROJECT | — | GCP project ID |
| SMART_CODING_VERTEX_LOCATION | us-central1 | Region |

Vector Store

| Variable | Default | Description |
| -------- | ------- | ----------- |
| SMART_CODING_VECTOR_STORE_PROVIDER | sqlite | sqlite / milvus |
| SMART_CODING_MILVUS_ADDRESS | — | Milvus endpoint or Zilliz Cloud URI |
| SMART_CODING_MILVUS_TOKEN | — | Auth token (required for Zilliz Cloud) |
| SMART_CODING_MILVUS_DATABASE | default | Database name |
| SMART_CODING_MILVUS_COLLECTION | smart_coding_embeddings | Collection |

Zilliz Cloud (Managed Milvus)

For teams or serverless deployments, use Zilliz Cloud instead of self-hosted Docker:

{
  "env": {
    "SMART_CODING_VECTOR_STORE_PROVIDER": "milvus",
    "SMART_CODING_MILVUS_ADDRESS": "https://in03-xxxx.api.gcp-us-west1.zillizcloud.com",
    "SMART_CODING_MILVUS_TOKEN": "your-zilliz-api-key"
  }
}

| Feature | Milvus Standalone (Docker) | Zilliz Cloud |
| ------- | -------------------------- | ------------ |
| Setup | Self-hosted, 3 containers | Managed SaaS |
| RAM | ~2.5 GB idle | None (serverless) |
| Multi-agent | ✅ via shared Docker | ✅ via shared endpoint |
| Scaling | Manual | Auto-scaling |
| Free tier | — | 2 collections, 1M vectors |
| Best for | Local dev, single machine | Team use, CI/CD, production |

Get your Zilliz Cloud URI and API key from the Zilliz Console → Cluster → Connect.

Search Tuning

| Variable | Default | Description |
| -------- | ------- | ----------- |
| SMART_CODING_SEMANTIC_WEIGHT | 0.7 | Semantic score weight (ANN similarity × this value) |
| SMART_CODING_EXACT_MATCH_BOOST | 1.5 | Boost added when the query appears verbatim in chunk content |
| SMART_CODING_DEDUP_MAX_PER_FILE | 1 | Max results per file. Ensures maximum source diversity — one chunk per file. 0 = disabled |

Hybrid scoring formula: score = ANN_similarity × semanticWeight + lexicalBoost

| Match type | Boost value |
| ---------- | ----------- |
| Exact match | +exactMatchBoost (default +1.5) |
| Partial match | +(matchedWords / totalWords) × 0.3 |
| No match | +0 |
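
Putting the formula and boost table together — a minimal sketch with illustrative names, not the actual source:

// score = ANN_similarity × semanticWeight + lexicalBoost
function hybridScore(query, chunkText, annSimilarity,
                     semanticWeight = 0.7, exactMatchBoost = 1.5) {
  const haystack = chunkText.toLowerCase();
  let lexicalBoost = 0;
  if (haystack.includes(query.toLowerCase())) {
    lexicalBoost = exactMatchBoost;                      // exact match: +1.5
  } else {
    const words = query.toLowerCase().split(/\s+/).filter(Boolean);
    const matched = words.filter(w => haystack.includes(w)).length;
    lexicalBoost = (matched / words.length) * 0.3;       // partial match
  }
  return annSimilarity * semanticWeight + lexicalBoost;
}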

Example with Gemini + Milvus

{
  "mcpServers": {
    "semantic-code-mcp": {
      "command": "npx",
      "args": ["-y", "semantic-code-mcp@latest", "--workspace", "/path/to/project"],
      "env": {
        "SMART_CODING_EMBEDDING_PROVIDER": "gemini",
        "SMART_CODING_GEMINI_API_KEY": "YOUR_KEY",
        "SMART_CODING_VECTOR_STORE_PROVIDER": "milvus",
        "SMART_CODING_MILVUS_ADDRESS": "http://localhost:19530"
      }
    }
  }
}

Architecture

graph TD
    A["MCP Server — index.js"] --> B["Features"]
    B --> B1["hybrid-search"]
    B --> B2["index-codebase"]
    B --> B3["set-workspace / get-status / clear-cache"]

    B2 --> C["Code Chunking — AST or Smart Regex"]
    C --> D["Embedding — Local / Gemini / Vertex / OpenAI"]
    D --> E["Vector Store — SQLite or Milvus"]

    B1 --> D
    B1 --> E

How It Works

flowchart LR
    A["📁 Source Files"] -->|"glob + .gitignore"| B["✂️ Smart/AST<br/>Chunking"]
    B -->|language-aware| C["🧠 AI Embedding<br/>(Local or API)"]
    C -->|vectors| D["💾 SQLite / Milvus<br/>Storage"]
    D -->|incremental hash| D

    E["🔍 Search Query"] -->|embed| C
    C -->|"k×5 oversample"| F["📊 Hybrid Scoring<br/>semantic × 0.7<br/>+ lexical boost"]
    F --> DD["🔄 Dedup<br/>max 2 per file"]
    DD --> G["🎯 Top N Results<br/>with relevance scores"]

    style A fill:#2d3748,color:#e2e8f0
    style C fill:#553c9a,color:#e9d8fd
    style D fill:#2a4365,color:#bee3f8
    style F fill:#744210,color:#fefcbf
    style DD fill:#553c9a,color:#e9d8fd
    style G fill:#22543d,color:#c6f6d5

Progressive indexing — search works immediately while indexing continues in the background. Only changed files are re-indexed on subsequent runs.

Incremental Indexing & Optimization

Semantic Code MCP uses a hash-based incremental indexing strategy to minimize redundant work:

flowchart TD
    A["File discovered"] --> B{"Hash changed?"}
    B -->|No| C["Skip — use cached vectors"]
    B -->|Yes| D["Re-chunk & re-embed"]
    D --> E["Update vector store"]
    F["Deleted file detected"] --> G["Prune stale vectors"]

    style C fill:#22543d,color:#c6f6d5
    style D fill:#744210,color:#fefcbf
    style G fill:#742a2a,color:#fed7d7

How it works:

  1. File discovery — glob patterns with .gitignore-aware filtering
  2. Change detection — each file's mtime is checked against the cached index; on a mismatch, a content-hash comparison decides (see the sketch after this list)
  3. Delta processing — only changed/new files are chunked and embedded
  4. Stale pruning — deleted files are removed from the vector store automatically
  5. Reconciliation sweep — see below
  6. Progressive search — queries work immediately, even mid-indexing
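
A condensed sketch of that two-phase check (cache shape and helper names are illustrative; the full flow is spelled out under Indexing Architecture Internals):

import { createHash } from "node:crypto";
import fs from "node:fs/promises";

// Cheap mtime check first; content hash only when the mtime differs.
async function hasChanged(file, cache) {
  const cached = cache.get(file);
  const { mtimeMs } = await fs.stat(file);
  if (cached && cached.mtimeMs === mtimeMs) return false;   // definitely unchanged
  const hash = createHash("sha1").update(await fs.readFile(file)).digest("hex");
  cache.set(file, { mtimeMs, hash });
  if (cached && cached.hash === hash) return false;         // only the mtime moved
  return true;                                              // new or modified file
}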

Reconciliation Sweep

Hash-based pruning catches deletions during normal indexing, but can miss ghost vectors when:

  • The hash cache (file-hashes.json) is cleared (e.g., c_clear_cache)
  • Files are moved outside the workspace
  • A previous indexing job was interrupted

The reconciliation sweep runs automatically after each b_index_codebase to catch these edge cases:

flowchart LR
    A["🔍 Query Milvus\n(all file paths)"] --> B{"File exists\non disk?"}
    B -->|Yes| C["✅ Keep"]
    B -->|No| D["🗑️ Delete vectors\nfilter: file == '...'"]
    D --> E["📊 Report via\nf_get_status"]

    style A fill:#2a4365,color:#bee3f8
    style C fill:#22543d,color:#c6f6d5
    style D fill:#742a2a,color:#fed7d7
    style E fill:#744210,color:#fefcbf

Status response (via f_get_status):

{
  "index": {
    "status": "ready",
    "lastReconcile": {
      "orphans": 0,
      "seconds": 0.43
    }
  }
}

Reconciliation is independent of file-hashes.json — it directly compares Milvus ↔ disk.
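
Conceptually the sweep is a set difference between stored paths and the filesystem — a sketch assuming a store API with listFilePaths()/deleteByFile() helpers:

import fs from "node:fs/promises";

// Drop vectors whose source file no longer exists on disk.
async function reconcile(store, workspaceRoot) {
  const stored = await store.listFilePaths();            // assumed helper
  let orphans = 0;
  for (const file of stored) {
    const onDisk = await fs.access(`${workspaceRoot}/${file}`)
      .then(() => true, () => false);
    if (!onDisk) {
      await store.deleteByFile(file);                    // filter: file == '...'
      orphans++;
    }
  }
  return { orphans };                                    // reported via f_get_status
}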

Performance characteristics:

| Scenario | Behavior | Typical Time |
| -------- | -------- | ------------ |
| First run (500 files) | Full index | ~30–60s (API), ~2–5 min (local) |
| Subsequent run (no changes) | Hash check only | < 1s |
| 10 files changed | Incremental delta | ~2–5s |
| Branch switch | Partial re-index | ~5–15s |
| force=true | Full rebuild | Same as first run |

⚠️ Multi-agent warning: Auto-index is disabled by default to prevent concurrent Milvus writes when multiple agents share the same server. Set SMART_CODING_AUTO_INDEX_DELAY=true (5s) only if a single agent connects to this MCP server. Use b_index_codebase for explicit on-demand indexing in multi-agent setups.

MCP tool calls have timeout limits and don't expose real-time logs. For bulk operations (initial setup, full rebuild, migration), use the CLI reindex script directly:

cd /path/to/semantic-code-mcp
node reindex.js /path/to/workspace --force

When to use CLI over MCP tools:

| Scenario | Use |
| -------- | --- |
| Daily incremental updates | MCP b_index_codebase(force=false) |
| Initial workspace setup | CLI node reindex.js /path --force |
| Full rebuild after migration | CLI node reindex.js /path --force |
| 1000+ file bulk update | CLI (timeout-safe, real-time logs) |
| Debugging 429 / gRPC errors | CLI (stderr visible) |

The CLI reindex script uses the same incremental engine under the hood. --force only forces re-embedding; it still uses the same hash-based delta for efficiency.

Non-Blocking Indexing Workflow

All indexing operations run in the background and return immediately. The agent can search while indexing continues.

sequenceDiagram
    participant Agent
    participant MCP as semantic-code-mcp
    participant BG as Background Thread
    participant Store as Milvus / SQLite

    Agent->>MCP: b_index_codebase(force=false)
    MCP->>BG: startBackgroundIndexing()
    MCP-->>Agent: {status: "started", message: "..."}
    Note over Agent: ⚡ Returns instantly

    loop Poll every 2-3s
        Agent->>MCP: f_get_status()
        MCP-->>Agent: {index.status: "indexing", progress: "150/500 files"}
    end

    BG->>Store: upsert vectors
    BG-->>MCP: done

    Agent->>MCP: f_get_status()
    MCP-->>Agent: {index.status: "ready"}

    Agent->>MCP: a_semantic_search(query)
    MCP-->>Agent: [results]

Rules for agents:

  1. Always call f_get_status first — check workspace and indexing status
  2. Use e_set_workspace if the workspace is wrong — before any indexing
  3. Poll f_get_status until index.status: "ready" before relying on complete search results (see the sketch below)
  4. Progressive search is supported — a_semantic_search works during indexing with partial results
  5. SMART_CODING_AUTO_INDEX_DELAY=false by default — use b_index_codebase for explicit on-demand indexing in multi-agent setups
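
In code, the loop from rules 1–3 looks roughly like this, assuming a generic MCP client with a callTool() helper:

// Kick off background indexing, poll until ready, then search.
async function indexThenSearch(client, query) {
  await client.callTool("b_index_codebase", { force: false });  // returns instantly
  for (;;) {
    const { index } = await client.callTool("f_get_status", {});
    if (index.status === "ready") break;
    await new Promise(r => setTimeout(r, 2500));                // poll every 2–3s
  }
  return client.callTool("a_semantic_search", { query });
}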

Indexing Architecture Internals

The sister project markdown-rag (Python/asyncio) wraps every sync operation in asyncio.to_thread() to prevent blocking the event loop. This project doesn't need that — here's why:

| Operation | Python asyncio | Node.js |
| --------- | -------------- | ------- |
| File I/O (stat, readFile) | Sync by default — blocks event loop | Async by default — fs.promises.* runs on libuv thread pool |
| Network I/O (Milvus gRPC) | milvus_client.delete() is sync, blocks | Native async via Promises |
| CPU-bound (embedding) | GIL limits to_thread effectiveness | Worker threads — true multi-core parallelism |
| CPU-bound (chunking) | to_thread offload needed | Event loop yields between await points |

In Python, calling os.stat() or milvus_client.insert() inside an async def function freezes the entire event loop until the call completes. That's why markdown-rag needs 7 separate asyncio.to_thread() wrappers across its pipeline.

In Node.js, await fs.stat(file) dispatches to the libuv thread pool automatically. The event loop stays responsive and can handle other MCP requests (e.g., f_get_status, a_semantic_search) while file I/O executes in the background.

The only CPU-bound bottleneck — embedding computation — is offloaded to Worker threads (worker_threads module) for true multi-core parallelism. See processChunksWithWorkers() in features/index-codebase.js.
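
The worker fan-out can be pictured like this — a sketch, not the actual processChunksWithWorkers(); the embed-worker.js filename is hypothetical:

import { Worker } from "node:worker_threads";
import os from "node:os";

// Split chunks round-robin across one worker per spare core.
function embedInWorkers(chunks) {
  const cores = Math.max(1, os.cpus().length - 1);
  const slices = Array.from({ length: cores }, (_, i) =>
    chunks.filter((_, j) => j % cores === i));
  return Promise.all(slices.map(slice => new Promise((resolve, reject) => {
    const w = new Worker(new URL("./embed-worker.js", import.meta.url),
      { workerData: slice });
    w.once("message", resolve);   // worker posts back its embedded vectors
    w.once("error", reject);
  })));
}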

graph TD
    A["MCP Request: b_index_codebase"] --> B["handleToolCall()"]
    B --> C["startBackgroundIndexing() — fire-and-forget"]
    C --> D["indexAll()"]
    D --> E["discoverFiles() — async fs via libuv"]
    E --> F["sortFilesByPriority() — async stat via libuv"]
    F --> G["Per-file: fs.stat + fs.readFile — async"]
    G --> H{"Workers available?"}
    H -->|Yes| I["processChunksWithWorkers() — multi-core"]
    H -->|No| J["processChunksSingleThreaded() — fallback"]
    I --> K["Batch insert to vector store"]
    J --> K
    K --> L["cache.save()"]

Unlike markdown-rag's explicit 3-way delta (new_files / modified_files / deleted_files), this project uses a 2-phase mtime→hash check that handles new and modified files in a single code path:

For each file:
  1. mtime unchanged?  → skip (definitely unchanged)
  2. mtime changed → read content → compute hash
  3. hash unchanged?   → update cached mtime, skip
  4. hash changed?     → removeFileFromStore() + re-chunk + re-embed
  5. Not in cache?     → removeFileFromStore() + re-chunk + re-embed  (new file)

New files hit removeFileFromStore() (step 5), which is technically a no-op since there are no existing vectors for that file. This differs from markdown-rag, which explicitly skips delete for new files:

| Aspect | semantic-code-mcp | markdown-rag |
| ------ | ----------------- | ------------ |
| Delta classification | 2-way (changed vs unchanged) | 3-way (new / modified / deleted) |
| New file handling | removeFileFromStore() — no-op | Explicit skip — no delete call |
| Delete cost per file | SQLite DELETE WHERE file=? — <1ms | Milvus gRPC delete() — 10–50ms |
| Impact of 1000 new files | <1s total waste (negligible) | 10–50s waste (significant) |
| Deleted file pruning | Batch prune in indexAll() step 1.5 | get_index_delta_detailed() returns explicit list |

Why this design is acceptable: The vector store backend matters. With SQLite, a no-op delete is a sub-millisecond local operation. With Milvus (network gRPC), each no-op delete costs 10–50ms of round-trip time — that's why markdown-rag invested in the 3-way classification to eliminate 1,288 unnecessary gRPC calls.

Future consideration: If the Milvus backend (milvus-cache.js) shows measurable overhead on large new-file batches, a 3-way delta classification can be introduced. Currently, benchmarks show no meaningful difference for typical codebases (<5,000 files).

Privacy

  • Local mode: everything runs on your machine. Code never leaves your system.
  • API mode: code chunks are sent to the embedding API for vectorization. No telemetry beyond provider API calls.

Agent Rules (AGENTS.md Integration)

This server has a mandatory search role defined in AGENTS.md:

## Search Role: semantic-code-mcp

- **Code semantic search**. Use after grep narrows scope, or when grep can't find the logic.
  `a_semantic_search(query, maxResults=5)`. For duplicate detection: `maxResults=10`.

### DUAL-SEARCH MANDATE
You MUST use at least 2 tools per search. Single-tool search is FORBIDDEN.

### Decision Table

| What you need             | 1st tool         | 2nd tool (REQUIRED) | NEVER use |
| ------------------------- | ---------------- | ------------------- | --------- |
| Exact symbol / function   | `grep`           | Code RAG or view    | Doc RAG   |
| Code logic understanding  | Code RAG         | grep → `view_file`  | Doc RAG   |
| Config value across files | `grep --include` | Doc RAG             | —         |

### Parameters
- maxResults: quick=3, general=5, comprehensive/dedup=10.
- scopePath: ALWAYS set when target project is known.
- Query language: English for code search.

### Anti-patterns (FORBIDDEN)
- ❌ Doc RAG to find code locations → ✅ grep or Code RAG
- ❌ Code RAG for Korean workflow docs → ✅ Doc RAG
- ❌ Single-tool search → ✅ Always 2+ tools

Source: AGENTS.md §3 Search, rag/SKILL.md, 07.5-검색-도구-벤치마크 (search-tool benchmark)


License

MIT License

Copyright (c) 2025 Omar Haris (original), bitkyc08 (modifications, 2026)

See LICENSE for full text.

About

This project is a fork of smart-coding-mcp by Omar Haris, heavily extended for production use.

Key additions over upstream:

  • Multi-provider embeddings (Gemini, Vertex AI, OpenAI, OpenAI-compatible)
  • Milvus vector store with ANN search for large codebases
  • Hybrid search scoring (semantic × 0.7 + lexical boost up to +1.5)
  • Per-file dedup in search results for diverse output
  • AST-based code chunking via Tree-sitter
  • Resource throttling (CPU cap at 50%)
  • Runtime workspace switching (e_set_workspace)
  • Package version checker across 20+ registries (d_check_last_version)
  • Comprehensive IDE setup guides (VS Code, Cursor, Windsurf, Claude Desktop, Antigravity)
  • Reconciliation sweep — post-index Milvus↔disk orphan cleanup