graph-indexer

v2.0.0

Air-gapped code-retrieval MCP server: AST chunking, hybrid RRF search and cross-language call graph, with an optional zero-dependency SQLite backend for monorepo scale and optional local-LLM semantic enrichment.

What it does

Graph Indexer is a local Model Context Protocol server that builds an AST-precise index of your repository with Tree-sitter and serves it to AI coding agents. Instead of grepping text or embedding whole files, it indexes semantic chunks — functions, classes, methods — and the call graph and import topology that connect them, so an agent can find the right symbol, see every caller and dependency that would be affected by a change, and resolve references across files. It runs entirely on your machine: the default search path is lexical (BM25 + morphological stemming) and needs no model, no daemon, and no network. Dense vector embeddings, LLM enrichment, and an LLM reranker are all available but off by default — you opt into each one when you have a measured reason to.
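To make "semantic chunks" concrete, here is a minimal sketch of AST-based chunking. It is not the package's code: Python's built-in ast module stands in for Tree-sitter, the function name chunk_source is hypothetical, and the output fields simply mirror the names shown in the search_code result later in this README.

```python
import ast

def chunk_source(source, file_path):
    """Emit one chunk per function, class, or method in a Python source string."""
    chunks = []

    def visit(node, class_context=""):
        for child in ast.iter_child_nodes(node):
            if isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
                # Collect plain-name call sites inside this definition.
                calls = sorted({
                    n.func.id
                    for n in ast.walk(child)
                    if isinstance(n, ast.Call) and isinstance(n.func, ast.Name)
                })
                chunks.append({
                    "id": f"{file_path}:{child.lineno}",
                    "name": child.name,
                    "node_type": type(child).__name__,
                    "start_line": child.lineno,
                    "end_line": child.end_lineno,
                    "calls": calls,
                    "class_context": class_context,
                })
                # Descend into classes so each method becomes its own chunk.
                if isinstance(child, ast.ClassDef):
                    visit(child, class_context=child.name)

    visit(ast.parse(source))
    return chunks

SAMPLE = '''
def get_client_key(req):
    return req["ip"]

class Limiter:
    def allow(self, req):
        return token_bucket(get_client_key(req))
'''

for chunk in chunk_source(SAMPLE, "limiter.py"):
    print(chunk["id"], chunk["name"], chunk["calls"])
```

Each chunk carries its own line range and call sites, which is what lets a retriever return one method instead of a whole file.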

Why it matters for AI coding agents:

  • Up to 98.2% token savings. Delivers the exact AST chunk the agent needs instead of dumping whole files into a 1M-token context.
  • Blast radius before every edit. get_call_graph and find_references surface every caller, subclass, and dependency a change would touch, so the agent reasons about impact on code it never opened.
  • Private by default. Everything runs locally; the default path makes zero network calls and needs no model — your code never leaves your machine.
  • One command, any language. Guided setup wires your editors in seconds, and the zero-dependency lexical engine indexes 14 languages out of the box.

📦 Prerequisites

  • Node.js 18+. Node 22+ is recommended for repos past ~15k chunks, where storage scales automatically to SQLite.
  • OS: any. macOS on Apple Silicon is required only for the optional MLX GPU acceleration.
  • Optional: Ollama for embeddings, enrichment, and reranking (see Ollama setup).
  • Optional: Python 3.10+ for the MLX embedder (macOS Apple Silicon only).

Quick start

1. Run the Interactive Indexer

Point the indexer at your project root (works for Python, Go, Rust, TS/JS, C#, and 9 more languages):

npx graph-indexer /path/to/your/repo

That runs the guided setup against the repo and leaves it ready to use. The setup is idempotent and keeps every generated file in .graph-indexer/ (git-ignored); the Guided setup section covers each step.

2. Connect Your Agent

If you ran the guided setup and selected your agent in step 4, this is already done — skip ahead.

The guided setup writes the MCP config automatically (with absolute paths that survive GUI launches) for Claude Code (.mcp.json), Claude Desktop, Cursor (.cursor/mcp.json), and VS Code (.vscode/mcp.json). Re-run npx graph-indexer /path/to/your/repo at any time to add or refresh a client.

Claude Code — CLI

claude mcp add graph-indexer -- npx -y -p graph-indexer idx-mcp --repo /path/to/your/repo

Cursor — .cursor/mcp.json

{
  "mcpServers": {
    "graph-indexer": {
      "command": "npx",
      "args": ["-y", "-p", "graph-indexer", "idx-mcp", "--repo", "/path/to/your/repo"]
    }
  }
}

VS Code — .vscode/mcp.json

{
  "servers": {
    "graph-indexer": {
      "command": "npx",
      "args": ["-y", "-p", "graph-indexer", "idx-mcp", "--repo", "/path/to/your/repo"]
    }
  }
}

Claude Desktop — claude_desktop_config.json

{
  "mcpServers": {
    "graph-indexer": {
      "command": "npx",
      "args": ["-y", "-p", "graph-indexer", "idx-mcp", "--repo", "/path/to/your/repo"]
    }
  }
}

Note: The -p graph-indexer idx-mcp form is required — npx graph-indexer runs the setup wizard, not the MCP server. If npx isn't on your GUI PATH (common on macOS when launching editors from Finder/Dock), run the guided setup instead — it writes absolute paths that always work.

Once connected, the agent can call search_code. A query like search_code("rate limiting middleware") returns ranked semantic chunks, not whole files:

[
  {
    "score": 8.41,
    "chunk": {
      "id": "src/middleware/rateLimit.ts:14",
      "file_path": "src/middleware/rateLimit.ts",
      "name": "rateLimiter",
      "node_type": "function_declaration",
      "start_line": 14, "end_line": 47,
      "calls": ["tokenBucket", "getClientKey"],
      "class_context": ""
    }
  }
]

The agent can then call get_call_graph("rateLimiter") to see what calls it (the blast radius) before changing it.
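The "blast radius" idea can be sketched by inverting the calls field of each chunk. This is an illustration under assumptions, not the server's implementation: the rateLimiter entry comes from the result above, while apiRouter, createApp, and both function names are hypothetical, and following callers transitively is a choice made for this sketch.

```python
from collections import defaultdict, deque

def build_callers(chunks):
    """Invert each chunk's `calls` list into a callee -> {callers} map."""
    callers = defaultdict(set)
    for chunk in chunks:
        for callee in chunk["calls"]:
            callers[callee].add(chunk["name"])
    return callers

def blast_radius(symbol, callers):
    """Every symbol that reaches `symbol` through one or more calls."""
    seen = set()
    queue = deque([symbol])
    while queue:
        current = queue.popleft()
        for caller in callers.get(current, ()):
            if caller not in seen and caller != symbol:
                seen.add(caller)
                queue.append(caller)
    return sorted(seen)

CHUNKS = [
    {"name": "rateLimiter", "calls": ["tokenBucket", "getClientKey"]},  # from the result above
    {"name": "apiRouter", "calls": ["rateLimiter"]},                    # hypothetical
    {"name": "createApp", "calls": ["apiRouter"]},                      # hypothetical
]

callers = build_callers(CHUNKS)
print(sorted(callers["rateLimiter"]))        # direct callers
print(blast_radius("tokenBucket", callers))  # transitive callers
```

A change to tokenBucket touches rateLimiter directly and, through it, code the agent may never have opened.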

Guided setup

npx graph-indexer <path> runs an interactive wizard that leaves the repo ready to use. It is idempotent — re-run it whenever you add a language or change a setting; it merges into what you already have and never clobbers another tool's config. Every generated file lands in .graph-indexer/ (git-ignored), so your repo root stays clean.

Before the steps, it scans the repo to pre-select your stack: languages from a bounded file walk, frameworks from your manifests (package.json, pyproject.toml, composer.json, pom.xml, Gemfile, *.csproj, …). If it finds a pre-v1.4 layout, it first tidies those stray artifacts into .graph-indexer/. Then seven steps:

| Step | You choose | What happens |
|------|------------|--------------|
| 1 · Languages | Languages to index | Detected languages are pre-checked; press Enter to accept. Selecting none indexes all supported languages. |
| 2 · Frameworks | Prompt add-ons | Filtered to your languages and pre-checked from detection. Sharpens the agent prompt for React, Express/NestJS, FastAPI/Django, Spring, Rails, Laravel/Symfony, ASP.NET, or Android. |
| 3 · Search engine & LLM | Retrieval engine | Press Enter for the recommended default: lexical search, no LLM, no network. Everything heavier is opt-in (see below). |
| 4 · Agents & IDEs | Your coding tools | Multi-select which agents to wire: Claude Code/Desktop, Cursor, VS Code/Copilot, Windsurf, Cline/Roo, JetBrains Junie, Codex (AGENTS.md), Gemini CLI. Pre-checked from your saved choice, else what's detected, else all. Drives steps 5 and 7: deselect a tool and graph-indexer generates nothing for it. |
| 5 · MCP server wiring | (automatic) | Wires the MCP server for each selected editor (VS Code, Cursor, Claude Desktop, Claude Code). Merge-safe: your other MCP servers and keys are preserved. |
| 6 · Project files & daemon | (automatic) | Adds mcp:* npm scripts (index + daemon control), a managed .gitignore block, and .graph-indexer/config.json (which remembers your agent selection). |
| 7 · Agent instructions | (automatic) | Always writes the canonical layered prompt (GRAPH_INDEXER_PROMPT.md) + a GRAPH_INDEXER_DOMAIN.md template for your own rules. Then, for each selected agent: @-imports (no duplication) in CLAUDE.md and GEMINI.md; rule files for Cursor, Windsurf, and Cline/Roo; and managed blocks in .github/copilot-instructions.md, .junie/guidelines.md, and AGENTS.md (Codex/Zed/Jules), preserving anything already there. |

It finishes with a grouped summary (created / updated / kept / skipped / warnings), offers to build the index now, and prints your next steps.
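"Merge-safe" and "idempotent" in step 5 can be read as the following behaviour. This is a sketch of that reading only, not the wizard's code: merge_server is a hypothetical name, and the "other-tool" and "theme" entries are made up to represent config the wizard must leave alone. The two top-level keys, mcpServers and servers, are the ones shown in the client configs above.

```python
import copy
import json

def merge_server(config, name, entry, key="mcpServers"):
    """Return a copy of `config` with one server entry added or refreshed.

    Every other server and every unrelated top-level key is left untouched.
    """
    merged = copy.deepcopy(config)
    merged.setdefault(key, {})[name] = entry
    return merged

ENTRY = {
    "command": "npx",
    "args": ["-y", "-p", "graph-indexer", "idx-mcp", "--repo", "/path/to/your/repo"],
}

# Hypothetical pre-existing config with another tool's server and a setting.
existing = {"mcpServers": {"other-tool": {"command": "other"}}, "theme": "dark"}

once = merge_server(existing, "graph-indexer", ENTRY)
twice = merge_server(once, "graph-indexer", ENTRY)  # re-running changes nothing
print(json.dumps(twice, indent=2))
```

For VS Code the same function would be called with key="servers", matching the .vscode/mcp.json layout.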

Most repos need nothing past step 3's default — the lexical engine has zero dependencies and works in any language. Step 3 only branches when you decline the defaults:

  • Storage: auto (in-memory, promoting to SQLite past ~15k chunks), or force in-memory / SQLite.
  • Embeddings: off (default), auto, Ollama, local (in-process MiniLM, ~25 MB on first index), or Apple Metal (MLX) on macOS. Ollama is probed only when you actually pick it; choosing MLX offers to provision its Python venv on the spot.
  • Enrichment / reranker: off by default. A measured per-language note is shown before the reranker prompt (it helps Go/Python, regresses JS/TS). Model pickers are provider-aware, and if you select MLX the wizard verifies mlx_lm.server end-to-end (installing mlx-lm, starting the server, and pre-loading the model) so your first index can't fail against a server that isn't up.

Non-interactive & preview

| Flag | Effect |
|------|--------|
| --yes, -y | Accept all detected/default selections with no prompts (CI). |
| --dry-run | Print every file action without writing anything. |
| --all-languages | Index every supported language (implies --yes). |
| --help, -h | Show usage. |

Setup also runs non-interactively whenever stdin isn't a TTY, so piping into it behaves like --yes.
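The prompting rule described here can be sketched in a few lines. This is an assumed reading of the behaviour, not the wizard's source: is_interactive and FakeTerminal are hypothetical names, and the flag set is taken from the table above.

```python
import io
import sys

# Flags from the table that suppress prompts (--all-languages implies --yes).
NO_PROMPT_FLAGS = {"--yes", "-y", "--all-languages"}

def is_interactive(argv, stdin=None):
    """Prompt only when no skip flag is given and stdin is a real terminal."""
    stdin = sys.stdin if stdin is None else stdin
    if NO_PROMPT_FLAGS.intersection(argv):
        return False
    return stdin.isatty()

class FakeTerminal(io.StringIO):
    """Stand-in for a TTY so the rule can be exercised without a terminal."""
    def isatty(self):
        return True

print(is_interactive([], io.StringIO("piped input")))  # piped stdin behaves like --yes
print(is_interactive(["--dry-run"], FakeTerminal()))   # --dry-run alone still prompts
```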

How it works

  • AST indexing. Tree-sitter parses each file into a syntax tree, and the indexer extracts one chunk per top-level definition (function, class, method, struct, …) with its name, parameters, line range, call sites, and enclosing class. A "god class" is split so its methods become their own chunks. Supported languages: TypeScript/JavaScript (.ts .tsx .js .jsx .mjs .cjs), Python, Go, Rust, Java, Kotlin, C#, C, Ruby, PHP, Bash, and Swift, plus CSS/SCSS.
  • Retrieval. A hybrid ranker fuses a lexical channel (BM25 with camelCase splitting and language-agnostic Porter stemming) with an optional dense-vector channel (local embeddings) via Reciprocal Rank Fusion. With embeddings off, only the lexical channel runs — and it is the default.
  • Call graph. get_call_graph returns the callers and callees of a symbol — the blast radius of a change — so an agent can see what it might break before editing code it never read.
  • Backend parity. The in-memory and SQLite backends share the same ranking core, so they return identical top-5 results for the same query (enforced by test/sqlite.mjs).
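Two pieces of the retrieval bullet are easy to show in miniature: camelCase splitting for the lexical channel, and Reciprocal Rank Fusion across channels. The sketch below is illustrative rather than the package's ranking core. The function names are hypothetical, and k=60 is the constant commonly used for RRF, assumed here because the README does not state the value.

```python
import re

# Splits camelCase, PascalCase, acronyms, digits, and snake_case parts.
_WORD = re.compile(r"[A-Z]+(?=[A-Z][a-z])|[A-Z]?[a-z]+|[A-Z]+|\d+")

def split_identifier(name):
    """Break an identifier into lowercase terms a lexical index can match."""
    return [part.lower() for part in _WORD.findall(name)]

def rrf_fuse(rankings, k=60):
    """Fuse ranked lists: each list adds 1 / (k + rank) to an item's score."""
    scores = {}
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for determinism.
    return sorted(scores, key=lambda chunk_id: (-scores[chunk_id], chunk_id))

lexical = ["a", "b", "c"]   # hypothetical BM25 order
dense = ["b", "d", "a"]     # hypothetical embedding order

print(split_identifier("getClientKey"))
print(rrf_fuse([lexical, dense]))
print(rrf_fuse([lexical]))  # one channel: order is unchanged
```

RRF uses only rank positions, so BM25 scores and cosine similarities never need to be put on a common scale, and with a single channel the fused order equals the lexical order.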

MCP tools

| Tool | Returns |
|------|---------|
| search_code | Ranked semantic chunks for a natural-language or symbol query. |
| get_chunk | The full source of one chunk by id. |
| get_chunk_summary | A compact summary of a chunk (signature, calls, context). |
| resolve_symbol | Exact, case-insensitive symbol lookup by name. |
| get_file_skeleton | The top-level structure (symbols + signatures) of a file. |
| get_call_graph | Callers and callees of a symbol (the blast radius). |
| find_references | Where a symbol is used: callers, subclasses, and type references. |
| find_routes | HTTP routes mapped to their handler chunks (NestJS, FastAPI/Flask, Spring, Express/Koa). |
| get_subgraph | A bounded connected subgraph around a seed symbol (its callees, high-confidence callers, and type/inheritance users) in one call. |
| get_repo_map | A high-level map of the repository's modules and topology. |
| list_index_stats | Index health: chunk/file/symbol/vector counts and the active config. |

Configuration

Everything beyond the lexical default is opt-in. The server, indexer, and daemon all print their effective configuration at startup (storage backend, model names, which optional features are on), and emit a visible warning whenever an opt-in feature has a known trade-off — nothing is enabled silently.

Headline trade-offs

| Option | Default | When to enable | Cost |
|--------|---------|----------------|------|
| --embeddings | off | Larger repos where recall matters; lifts success@5 | Requires Ollama or the in-process MiniLM model; slower indexing |
| --embed-model qwen3-embedding:4b | nomic-embed-text | Better code recall + symbolic precision | Slower indexing; requires Ollama |
| --enrichment | off | Only useful paired with --rerank; alone it regresses | Slowest indexing |
| --rerank | off | Go/Python repos with weak semantic recall; regresses JS repos | Requires an Ollama 7B model; adds query latency |
| --use-sqlite | auto | Repos past ~15k chunks or memory-constrained environments | Slightly higher query latency; needs Node 22+ |

For most repos, the default (lexical + stemming, no embeddings) is the right starting point. Enable embeddings when you notice the agent missing chunks it should find. Enable the reranker only on Go or Python repos after measuring whether it helps — it is known to regress JavaScript repositories.

All CLI flags

| Flag | Default | Effect |
|------|---------|--------|
| --repo <path> | current directory | Repository to index / serve. |
| --embeddings | off | Enable the dense-vector channel. |
| --embed-model <model> | nomic-embed-text | Ollama embedding model (e.g. qwen3-embedding:4b). |
| --embed-provider <auto\|ollama\|local\|mlx\|off> | auto | Force the embedding backend (see Embedder backends). |
| --mlx-embed-model <model> | mlx-community/all-MiniLM-L6-v2-4bit | Model loaded by the MLX (Apple Metal) embedder. |
| --use-sqlite | auto | Force the disk-backed SQLite backend. |
| --enrichment | off | Enable LLM enrichment of central chunks. |
| --enrich-model <model> | qwen2.5-coder:1.5b | Model used for enrichment. |
| --enrich-max <n> | 500 | Cap on new LLM calls per index run. |
| --enrich-concurrency <n> | 4 | Parallel Ollama requests during enrichment. |
| --rerank | off | Enable the LLM reranker (one call per NL query). |
| --no-git-signals | (signals on) | Skip collecting local git churn/recency/co-change. |
| --git-rank-boost <0..1> | 0 | Opt-in weight for git recency/churn in ranking (0 = ranking unchanged). |
| --llm-provider <ollama\|mlx> | ollama | LLM backend for enrichment, reranking, and HyDE. mlx routes calls to a local mlx_lm.server. |
| --mlx-lm-host <url> | http://localhost:8080 | Endpoint for the mlx_lm.server when --llm-provider mlx is set. |

All environment variables

| Variable | Default | Effect |
|----------|---------|--------|
| MCP_PROJECT_ROOT | current directory | Repository root when --repo is not given. |
| OLLAMA_HOST | http://localhost:11434 | Ollama endpoint for embeddings/enrichment/rerank. |
| INDEXER_EMBEDDINGS | off | on enables embeddings; off always wins over --embeddings. |
| EMBED_MODEL | nomic-embed-text | Ollama embedding model (overrides config; overridden by --embed-model). |
| INDEXER_EMBED_PROVIDER | auto | auto \| ollama \| local \| mlx \| off. |
| INDEXER_MLX_EMBED_MODEL | mlx-community/all-MiniLM-L6-v2-4bit | Model loaded by the MLX embedder (overridden by --mlx-embed-model). |
| INDEXER_MLX_BATCH_SIZE | 32 | Texts per batch sent to the MLX subprocess (raise on large unified memory). |
| INDEXER_STORAGE | auto | auto \| memory \| sqlite. |
| ENRICH_MODEL | (unset) | Naming a model enables enrichment and selects it. |
| RERANK_MODEL | (unset) | Naming a model enables the reranker and selects it. |
| INDEXER_GIT_SIGNALS | (on) | Set to off to skip git-signal collection. |
| INDEXER_GIT_RANK_BOOST | 0 | Opt-in git recency/churn ranking weight (0..1). |
| INDEXER_LLM_PROVIDER | ollama | LLM backend for enrichment, reranking, and HyDE: ollama or mlx. |
| INDEXER_MLX_LM_HOST | http://localhost:8080 | Endpoint for the mlx_lm.server when INDEXER_LLM_PROVIDER=mlx. |
| INDEXER_EMBED_CONCURRENCY | 4 | Parallel embedding batches; lower to 1 for large models on modest hardware. |
| INDEXER_EMBED_TIMEOUT_MS | 120000 | Per-batch embedding timeout; raise for very large models. |
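The precedence notes in the table (INDEXER_EMBEDDINGS=off beats --embeddings; --embed-model beats EMBED_MODEL, which beats config) can be read as the sketch below. It is one interpretation of those two rows, not the package's resolver: the function name and the dictionary keys "embeddings" and "embed_model" are hypothetical, and whether a config file can also switch embeddings on is left out because the README does not say.

```python
def resolve_embedding_settings(cli, env, config):
    """Resolve the embeddings switch and model from CLI flags, env, and config."""
    env_switch = env.get("INDEXER_EMBEDDINGS", "").lower()
    if env_switch == "off":
        enabled = False  # the env "off" always wins over --embeddings
    else:
        enabled = env_switch == "on" or bool(cli.get("embeddings"))

    model = (
        cli.get("embed_model")         # --embed-model has the last word
        or env.get("EMBED_MODEL")      # then the environment
        or config.get("embed_model")   # then the saved config (key name assumed)
        or "nomic-embed-text"          # documented default
    )
    return {"enabled": enabled, "model": model}

print(resolve_embedding_settings(
    cli={"embeddings": True},
    env={"INDEXER_EMBEDDINGS": "off"},
    config={},
))
```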

When embeddings are enabled, the provider is selected in this order, and every fallback is logged (never silent): Ollama with EMBED_MODEL if set and reachable → Ollama with nomic-embed-text → the in-process MiniLM model (optional @huggingface/transformers) → lexical-only with a warning.
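The selection order above is a logged fallback chain. A minimal sketch of that pattern follows; pick_provider, the provider labels, and the probe callables are all hypothetical stand-ins (a real probe would check whether Ollama is reachable or whether the optional package is installed).

```python
def pick_provider(candidates, log=print):
    """Return the first available provider, logging every fallback taken."""
    for name, is_available in candidates:
        if is_available():
            return name
        log(f"[embeddings] {name} unavailable, falling back")
    log("[embeddings] warning: no provider available, running lexical-only")
    return "lexical-only"

# Hypothetical probes: Ollama is down, the in-process model is installed.
chain = [
    ("ollama:EMBED_MODEL", lambda: False),
    ("ollama:nomic-embed-text", lambda: False),
    ("local:MiniLM", lambda: True),
]

messages = []
print(pick_provider(chain, log=messages.append))
print(messages)
```

Routing every skipped candidate through the logger is what keeps a fallback from being silent.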

Embedder backends

Set the backend with --embed-provider (or INDEXER_EMBED_PROVIDER). mlx is a native Python embedder that runs on the Apple Metal GPU as a reused subprocess; it produces vectors faster than the bundled in-process Xenova path and needs no running Ollama daemon. The in-process (local) and mlx backends produce all-MiniLM-L6-v2-family 384-dim vectors; the default auto→Ollama path uses nomic-embed-text (768-dim).

| Backend | --embed-provider | Platform | Throughput¹ | Setup |
|---------|--------------------|----------|-------------|-------|
| Auto (Ollama → local → off) | auto (default) | any | varies | none |
| In-process (Xenova) | local | any | ~18 ch/s | npm i @huggingface/transformers |
| Apple Metal (MLX) | mlx | macOS Apple Silicon | ~42 ch/s | npm run embed:setup:mlx |
| Ollama daemon | ollama | any | ~14 ch/s² | ollama serve + ollama pull |

¹ Throughput is hardware-dependent. These figures are the median of 3 cold builds on an Apple M2 Mac mini (24 GB) indexing the express-js fixture (389 chunks): node bench/cell.mjs express-js <E0|E0_MLX|O0>. Expect different numbers on other chips; reproduce on your own machine. Throughput also varies with system load, model warm state, and corpus size — larger repos amortize cold-start better (e.g. Ollama/nomic reaches ~32 ch/s on the larger gin corpus).

² On Apple Silicon, Ollama already uses the Metal GPU internally (via llama.cpp). The mlx provider's advantage comes from a smaller model (all-MiniLM-L6-v2-4bit, 22M params, 384-dim) and no HTTP round-trip, not from GPU vs CPU.

The index records which provider/model built it (in the code-index.embeddings.bin.meta.json sidecar) and queries with the same one, so vectors never get mixed across models. Switching providers or the MLX model between builds is detected and triggers a clean re-embed.
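The mismatch check can be pictured as a comparison between the sidecar's recorded identity and the one requested for the current build. This is a sketch only: the README names the sidecar file but not its schema, so the "provider" and "model" keys and the needs_reembed function are assumptions.

```python
def needs_reembed(recorded, requested):
    """True when stored vectors were built by a different provider or model."""
    if recorded is None:  # no sidecar yet: nothing has been embedded
        return True
    return (recorded.get("provider"), recorded.get("model")) != (
        requested["provider"],
        requested["model"],
    )

# Hypothetical sidecar contents from an earlier Ollama build (768-dim vectors).
sidecar = {"provider": "ollama", "model": "nomic-embed-text"}

same = {"provider": "ollama", "model": "nomic-embed-text"}
switched = {"provider": "mlx", "model": "mlx-community/all-MiniLM-L6-v2-4bit"}

print(needs_reembed(sidecar, same))      # vectors can be reused
print(needs_reembed(sidecar, switched))  # 384-dim vs 768-dim: rebuild
```

Comparing a 384-dim query vector against 768-dim stored vectors would be meaningless, which is why a switch forces a clean re-embed rather than a partial update.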

macOS Apple Silicon (recommended):

npm run embed:setup:mlx                                  # one-time: creates embedders/venv-mlx + installs MLX
npx idx-index --repo . --embeddings --embed-provider mlx

embed:setup:mlx provisions a dedicated Python virtualenv under embedders/venv-mlx (so MLX's deps never touch your system Python) and is idempotent — re-running it is a no-op once ready. The interactive setup (npx graph-indexer) offers MLX directly and can run this step for you. The first index downloads the model (~90 MB) into the Hugging Face cache; later builds reuse it.

Choose a different MLX model with --mlx-embed-model <id> (or INDEXER_MLX_EMBED_MODEL); it must be an mlx_embeddings-compatible sentence model, and the default mlx-community/all-MiniLM-L6-v2-4bit is the proven option. Tune batch size with INDEXER_MLX_BATCH_SIZE (default 32; lower to 16 on 8 GB machines, raise to 64 on M-series Max/Ultra). mlx is macOS-only; requesting it elsewhere fails fast with the alternative to use.

Benchmarks

Every number here is produced by the eval harness on cold, isolated builds — never hand-edited — with strict scoring (exact symbol match, no file-path fallback) against a held-out split (~20–25% per language, never used to tune). Reproduce it all with npm run test:eval.

What the measurements say, across 18 real-world fixtures spanning every supported language:

  • The zero-dependency lexical default is the right starting point. It wins the held-out metric outright on 2 fixtures, and 6 more need only a cheap in-process embedder — so 8 of 18 run with no Ollama and no network. Most repos never need more.
  • Symbolic lookups are strong (mean rank-1 ≈ 0.70). Behavioural, natural-language queries are harder (≈ 0.30 on the default path) — that's where the optional embedding and reranker channels earn their keep.
  • Optional features are measured, not assumed. Dense embeddings lift recall on larger repos; the LLM reranker is language- and repo-dependent (it lifts 8 fixtures but taxes JavaScript and Spring); enrichment helps only where proven (just rust). Setup surfaces each trade-off before you enable it.
  • Backend parity: the in-memory and SQLite backends return byte-identical top-5 on all 18 fixtures (gated by test/sqlite.mjs).

Per-fixture best configs, 3× spreads, and copy-paste enable flags live in docs/benchmarks/BENCH_PER_FIXTURE.md.
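For readers unfamiliar with the metric names, here is one plausible reading of success@5 and rank-1 under strict scoring. It is not the harness in test/evaluate.mjs: the function name, the query labels, and the sample rankings are invented for illustration, and rank-1 is treated here as success@1.

```python
def success_at_k(results, expected, k=5):
    """Fraction of queries whose expected symbol appears in the top k results.

    Strict scoring: the symbol name must match exactly; a hit in the right
    file under a different symbol does not count.
    """
    hits = sum(
        1 for query, symbol in expected.items()
        if symbol in results.get(query, [])[:k]
    )
    return hits / len(expected)

# Hypothetical ranked symbol names per query.
results = {
    "q1": ["tokenBucket", "rateLimiter"],
    "q2": ["a", "b", "c", "d", "e", "getClientKey"],  # expected symbol is 6th
    "q3": ["createApp"],
}
expected = {"q1": "rateLimiter", "q2": "getClientKey", "q3": "createApp"}

print(success_at_k(results, expected, k=5))  # q1 and q3 hit, q2 misses
print(success_at_k(results, expected, k=1))  # only q3 is ranked first
```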

Structural coverage by language

get_call_graph and find_references are richest for typed languages. Every verdict is confirmed by invocation on the real index, not by reading field counts:

| Capability | Strong | Limited |
|---|---|---|
| Call graph (callers/callees) | resolves on every supported language | Java/Spring is class-granular; SCSS effectively none |
| Caller precision | receiver-aware: TS/JS, Python, C#, Swift, PHP | name-only: Go, Rust, Kotlin, Ruby, Bash, C |
| Typed find_references | precise: TS/JS, Python · field-precise: C# | heuristic: Java/PHP/Kotlin/Swift/Rust/Go/C · empty: Ruby, Bash, SCSS, dynamic JS/Python |

AST chunking and lexical search cover every supported language; only the typed cross-reference channel narrows for dynamic ones.
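The difference between "receiver-aware" and "name-only" caller precision in the table is easiest to see on a method name shared by two classes. The sketch below is a toy model of the distinction, not the resolver itself; the class names, method names, and both functions are hypothetical.

```python
# Hypothetical (class, method) definitions found in an index.
DEFINITIONS = [("User", "save"), ("Order", "save"), ("Order", "cancel")]

def resolve_name_only(method):
    """Name-only: every definition with this method name is a candidate."""
    return [d for d in DEFINITIONS if d[1] == method]

def resolve_receiver_aware(receiver_type, method):
    """Receiver-aware: the receiver's type narrows the call to one definition."""
    return [d for d in DEFINITIONS if d == (receiver_type, method)]

# A call site like `order.save()` where `order` is known to be an Order:
print(resolve_name_only("save"))
print(resolve_receiver_aware("Order", "save"))
```

Name-only resolution over-reports callers for common method names, so on those languages the blast radius is best treated as an upper bound.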

Where it's weakest (honestly)

  • Behavioural, natural-language recall on the default path (rank-1 ≈ 0.30) — closing it needs the embedding/reranker channel, not lexical tuning.
  • The hardest fixtures stay below 0.65 held-out success@5 even with the full stack (rails 0.50, rust 0.61).
  • Java/Spring get_call_graph reports at class granularity (the god-class split fires only at ≥200 lines); SCSS has no meaningful call graph.
  • LLM enrichment without reranking regresses precision — enable it only paired with --rerank.

Reproducing the benchmarks

npm run test:all                   # full unit + integration suite
npm run test:setup                 # index the benchmark fixtures
npm run test:eval                  # default lexical path — prints the HELD-OUT block
npm run test:eval -- --suite css   # any single fixture (18 authored suites)
npm run test:eval -- --embeddings  # hybrid eval (requires Ollama)
npm run test:eval -- --verbose     # per-query breakdown incl. file-rank column
node test/sqlite.mjs               # backend-parity gate (memory ↔ SQLite top-5)
node bench/cell.mjs <fixture> L1   # cold rebuild + score one fixture

The harness is test/evaluate.mjs; deeper reproduction notes live in docs/benchmarks/BENCH_BASELINE.md and docs/benchmarks/BENCH_FULL_SUITE.md.