@pugi/plugin-codegraph

v0.1.0-alpha.2

Pugi codegraph plugin - exposes pugi.code.search / definition / callers / trace / outline / repo_map tools backed by tree-sitter + SQLite FTS5 + PageRank.

Tree-sitter + SQLite FTS5 + PageRank code intelligence for Pugi.

Part of the Pugi 1.0 soft fork sprint (see ADR-0081).

What it does

Builds a local symbol-level index of the active workspace so the model can answer "where is X defined?", "what calls Y?", and "rank these files by relevance" without re-reading source files. Inspired by Aider's PageRank-over-symbols repo map and Cody / NotebookLM-style grounded citations.

Six tools are exposed via tool.register:

| Tool | Purpose |
|---|---|
| pugi.code.search | FTS5 keyword search ranked by BM25 + PageRank |
| pugi.code.definition | Locate symbol declaration(s) |
| pugi.code.callers | List call sites for a symbol |
| pugi.code.trace | Follow a call chain up to N hops |
| pugi.code.outline | Hierarchical symbol tree for a single file |
| pugi.code.repo_map | Top files by PageRank (with optional task bias) |
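To make "up to N hops" concrete for pugi.code.trace, here is a minimal sketch of a bounded breadth-first walk over a call graph. The function name traceCalls, the caller-to-callees map, and the forward (caller to callee) direction are all assumptions for illustration; the plugin's actual traversal may differ.

```typescript
// Hypothetical sketch: `traceCalls` and its data shapes are illustrative,
// not the plugin's real API.
type TraceHit = { symbol: string; hops: number };

function traceCalls(
  callees: Map<string, string[]>, // caller name -> names it calls
  start: string,
  maxHops: number,
): TraceHit[] {
  const visited = new Set<string>([start]);
  const result: TraceHit[] = [];
  let frontier: string[] = [start];

  // Expand one hop per outer iteration and stop at maxHops
  // or when no new symbols are reachable.
  for (let hops = 1; hops <= maxHops && frontier.length > 0; hops++) {
    const next: string[] = [];
    for (const symbol of frontier) {
      for (const callee of callees.get(symbol) ?? []) {
        if (visited.has(callee)) continue; // guards against call cycles
        visited.add(callee);
        result.push({ symbol: callee, hops });
        next.push(callee);
      }
    }
    frontier = next;
  }
  return result;
}
```

The visited set keeps recursive or mutually recursive functions from looping forever, and the hop counter caps the size of the result regardless of graph size.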

In addition, the plugin transparently injects a "relevant symbols" section into the system prompt whenever the user's last turn mentions a PascalCase identifier or function call that the index can resolve (token budget configurable, defaults to 2000).
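As a rough illustration of the detection step, the sketch below pulls PascalCase identifiers and function-call names out of a user message. The function name and both regular expressions are assumptions made for this example; the plugin's real matching rules are not documented here and may be stricter.

```typescript
// Hypothetical sketch: these patterns are illustrative guesses,
// not the plugin's actual matching rules.

// Two or more capitalized "humps", e.g. OrderService. A single capitalized
// word such as "Where" is deliberately not matched.
const PASCAL_CASE = /\b[A-Z][a-z0-9]+(?:[A-Z][a-z0-9]*)+\b/g;

// An identifier immediately followed by "(", e.g. parseConfig(
const CALL_SITE = /\b[A-Za-z_$][\w$]*(?=\s*\()/g;

function extractCandidateSymbols(userText: string): string[] {
  const found = new Set<string>();
  for (const name of userText.match(PASCAL_CASE) ?? []) found.add(name);
  for (const name of userText.match(CALL_SITE) ?? []) found.add(name);
  return Array.from(found); // insertion order, duplicates removed
}
```

Each candidate would then be looked up in the index, and only names that resolve would contribute to the injected section, subject to the token budget.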

Install

pnpm add @pugi/plugin-codegraph

Bring your own tree-sitter grammars - we treat them as peer dependencies so each consumer ships only what their codebase needs:

pnpm add better-sqlite3 chokidar tree-sitter tree-sitter-typescript tree-sitter-javascript tree-sitter-python

Optional language grammars (tree-sitter-rust, tree-sitter-go, tree-sitter-java, tree-sitter-ruby) are loaded lazily; if missing, the plugin logs once per language and skips those files.
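The lazy-load-and-log-once behaviour can be sketched as follows. Everything here is hypothetical: loadOptionalGrammar, the injected load function, and the log message are stand-ins chosen so the example runs without any grammar installed.

```typescript
// Hypothetical sketch: names and message text are illustrative only.
type ModuleLoader = (moduleName: string) => unknown;

// Caches both successful loads and failures (as null), so each language
// is attempted, and logged, at most once.
const grammarCache = new Map<string, unknown>();

function loadOptionalGrammar(
  language: string,
  load: ModuleLoader,
  log: (message: string) => void = console.warn,
): unknown {
  if (grammarCache.has(language)) return grammarCache.get(language);
  let grammar: unknown = null;
  try {
    grammar = load(`tree-sitter-${language}`);
  } catch {
    // First and only failure for this language: report it once.
    log(`codegraph: tree-sitter-${language} is not installed; skipping ${language} files`);
  }
  grammarCache.set(language, grammar);
  return grammar;
}
```

In a CommonJS context the loader could simply be require; passing it in as a parameter keeps the sketch testable and avoids assuming a module system.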

Usage

// pugi.config.ts
export default {
  plugin: ['@pugi/plugin-codegraph'],
};

Or with options:

export default {
  plugin: [['@pugi/plugin-codegraph', {
    dbPath: '/custom/path/codegraph.sqlite',
    languages: ['typescript', 'tsx', 'python'],
    excludePatterns: ['node_modules/**', 'dist/**', '*.min.js'],
    maxFileBytes: 1_000_000,
    embeddingProvider: 'none',
    watchMode: true,
    injectTokenBudget: 2_000,
    injectMaxSymbols: 8,
  }]],
};

Architecture

Single SQLite file at <workspace>/.pugi/codegraph.sqlite (WAL mode so the watcher's writes don't block SDK reads). Schema:

files
  id PK, path UNIQUE, language, size_bytes, mtime_ms, sha256, pagerank

symbols
  id PK, file_id FK CASCADE, name, kind, start/end_line, start/end_col,
  signature, docstring, parent_symbol_id (self-ref FK)

references_table
  id PK, symbol_id FK SET NULL, file_id FK CASCADE, name,
  line, col, context_snippet

symbols_fts (FTS5 virtual table)
  name, signature, docstring  -- BM25-ranked
  triggers keep it in sync with symbols on INSERT / UPDATE / DELETE

references_table carries the _table suffix because references is a SQL reserved word.

Indexing pipeline (per file):

  1. Skip if .gitignore-style excludePatterns match (matches the named directory at any depth).
  2. Skip if extension not in supported language list.
  3. Compute SHA-256 over file contents.
  4. If existing row has matching SHA, return unchanged (~1ms).
  5. Parse via tree-sitter native binding.
  6. Extract symbol declarations + call-site references.
  7. Single transaction: DELETE existing file row (CASCADE) + INSERT new file/symbols/references. FTS5 stays in sync via triggers.
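The hash-based short-circuit in steps 2 to 4 can be sketched with an in-memory Map standing in for the files table. The FileRow shape, the indexFile name, and the extractSymbols callback are assumptions for illustration; step 1 (exclude patterns) and the real tree-sitter parse are omitted.

```typescript
import { createHash } from "node:crypto";

// Hypothetical in-memory stand-in for a row of the `files` table.
interface FileRow {
  path: string;
  sha256: string;
  symbols: string[];
}

type IndexResult = "skipped" | "unchanged" | "indexed";

function indexFile(
  index: Map<string, FileRow>,
  path: string,
  contents: string,
  supportedExtensions: string[],
  extractSymbols: (source: string) => string[], // placeholder for parse + extract
): IndexResult {
  // Step 2: ignore files whose extension is not in the supported list.
  if (!supportedExtensions.some((ext) => path.endsWith(ext))) return "skipped";

  // Step 3: hash the file contents.
  const sha256 = createHash("sha256").update(contents).digest("hex");

  // Step 4: an identical hash means nothing needs re-parsing.
  if (index.get(path)?.sha256 === sha256) return "unchanged";

  // Steps 5-7: parse, extract, then replace the old row in one step.
  index.set(path, { path, sha256, symbols: extractSymbols(contents) });
  return "indexed";
}
```

Replacing the whole row at once mirrors the delete-then-insert transaction described in step 7: a reader never sees a file with half of its old symbols and half of its new ones.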

Cross-file reference resolution: each call site looks up the highest-PageRank declaration of its target name across all files. The order-dependent edge case is handled by resolveDanglingRefs once the initial crawl completes.
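A minimal sketch of the "pick the highest-PageRank declaration" rule follows. The Declaration shape and the resolveReference name are hypothetical. Returning null for an unknown name is an assumption that matches the nullable symbol_id column, leaving such references for a later pass.

```typescript
// Hypothetical sketch: shapes and names are illustrative only.
interface Declaration {
  symbolId: number;
  name: string;
  filePageRank: number; // PageRank of the file that declares the symbol
}

function resolveReference(name: string, declarations: Declaration[]): number | null {
  let best: Declaration | null = null;
  for (const decl of declarations) {
    if (decl.name !== name) continue;
    // Among same-named declarations, prefer the one in the better-ranked file.
    if (best === null || decl.filePageRank > best.filePageRank) best = decl;
  }
  // null means the target is not indexed (yet), so the reference stays dangling.
  return best === null ? null : best.symbolId;
}
```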

PageRank: power iteration over the file-to-file graph (damping 0.85, max 50 iters, epsilon 1e-4). Recomputed after the initial crawl and every 100 file edits during watch mode. Stored in files.pagerank.
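For readers unfamiliar with power iteration, here is a small self-contained sketch using the parameters quoted above (damping 0.85, at most 50 iterations, epsilon 1e-4). The function signature is hypothetical. Using the L1 change between iterations as the convergence test, and spreading the rank of files with no outgoing edges evenly across all files, are common choices assumed here rather than details confirmed by the plugin.

```typescript
// Hypothetical sketch of PageRank by power iteration over a file graph.
function pageRank(
  nodes: string[],
  edges: Array<[string, string]>, // [fromFile, toFile]
  damping = 0.85,
  maxIters = 50,
  epsilon = 1e-4,
): { ranks: Map<string, number>; iterations: number } {
  const n = nodes.length;
  let ranks = new Map<string, number>();
  if (n === 0) return { ranks, iterations: 0 };

  const outLinks = new Map<string, string[]>();
  for (const node of nodes) {
    outLinks.set(node, []);
    ranks.set(node, 1 / n); // start from a uniform distribution
  }
  for (const [from, to] of edges) {
    // Ignore edges that mention files outside the node list.
    if (outLinks.has(from) && outLinks.has(to)) outLinks.get(from)!.push(to);
  }

  let iterations = 0;
  while (iterations < maxIters) {
    iterations++;
    // Assumption: rank held by files with no outgoing edges is shared evenly.
    let danglingMass = 0;
    for (const node of nodes) {
      if (outLinks.get(node)!.length === 0) danglingMass += ranks.get(node)!;
    }
    const base = (1 - damping) / n + (damping * danglingMass) / n;

    const next = new Map<string, number>();
    for (const node of nodes) next.set(node, base);
    for (const node of nodes) {
      const targets = outLinks.get(node)!;
      for (const target of targets) {
        next.set(target, next.get(target)! + (damping * ranks.get(node)!) / targets.length);
      }
    }

    // Assumption: converge when the total absolute change drops below epsilon.
    let delta = 0;
    for (const node of nodes) delta += Math.abs(next.get(node)! - ranks.get(node)!);
    ranks = next;
    if (delta < epsilon) break;
  }
  return { ranks, iterations };
}
```

With this formulation the ranks sum to 1 after every iteration, so values stored per file remain comparable across recomputations.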

Why these choices

  • Native tree-sitter, not WASM. ~2x parse throughput, no WASM bundle, no async init. Trade-off: native ABI per Node major version. Pinned to engines.node >= 20.
  • better-sqlite3 with WAL. Synchronous API keeps the hot path simple; WAL lets the watcher write while queries read.
  • FTS5 with prefix-quoted tokens. unicode61 keeps camelCase as one token. Prefix matching (OrderService*) catches partial matches without trigram indexes.
  • BM25 + PageRank hybrid ranking. Cheap, deterministic, no embedding round-trip. Delivers most of the retrieval value at a fraction of the moving parts.
  • No Effect-TS. Per ADR-0081 containment rule.
  • No cross-plugin imports. @pugi/plugin-codegraph runs standalone.
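To illustrate the "prefix-quoted tokens" idea from the FTS5 bullet above, the sketch below turns free text into an FTS5 MATCH expression in which every token is a quoted prefix term. The helper name toFtsPrefixQuery is hypothetical, and splitting on whitespace is an assumption; the plugin may tokenize its input differently.

```typescript
// Hypothetical sketch: builds an FTS5 query string such as "OrderService"*
function toFtsPrefixQuery(input: string): string {
  return input
    .split(/\s+/)
    .filter((token) => token.length > 0)
    // Quoting makes FTS5 treat the token as a literal string rather than
    // query syntax; an embedded double quote is escaped by doubling it.
    // The trailing * turns the quoted string into a prefix match.
    .map((token) => `"${token.replace(/"/g, '""')}"*`)
    .join(" "); // whitespace between terms is an implicit AND in FTS5
}
```

For example, toFtsPrefixQuery("OrderService") produces "OrderService"* and the resulting string would be bound as the parameter of a MATCH clause instead of being concatenated into SQL.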

Performance budget

Measured on packages/pugi-plugins/ (M-series Mac, 116 source files):

| Operation | Target | Measured |
|---|---|---|
| Cold index | <60s / 10k files | 330ms / 116 files |
| Incremental reindex | <50ms | ~5-15ms |
| pugi.code.definition | <10ms | 0.05ms avg |
| pugi.code.callers | <100ms | 0.07ms avg |
| pugi.code.search | <200ms | 0.11ms avg |
| pugi.code.repo_map | <50ms | 1ms |
| pugi.code.outline | <10ms | 0.03ms avg |
| PageRank recompute | <500ms / 10k files | 1ms / 69 files (7 iters, converged) |
| DB size | ~5MB / 1k files | 4.3MB / 1k files (extrapolated) |

Upstream contract note

@pugi-ai/[email protected] does not expose userMessageText on the experimental.chat.system.transform hook. We capture the last user message via chat.message and look it up by sessionID when the transform fires. The session cache is bounded (256 entries) so a long-running Pugi process cannot leak memory.
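A bounded per-session cache of this kind can be sketched in a few lines. The class name and the evict-oldest policy are assumptions; the text above only states that the cache is capped at 256 entries, not how entries are evicted.

```typescript
// Hypothetical sketch: a size-capped map from sessionID to the last user message.
class BoundedSessionCache {
  private readonly entries = new Map<string, string>();
  private readonly maxEntries: number;

  constructor(maxEntries = 256) {
    this.maxEntries = maxEntries;
  }

  set(sessionID: string, userMessageText: string): void {
    // Delete first so a re-inserted session moves to the newest position.
    this.entries.delete(sessionID);
    this.entries.set(sessionID, userMessageText);
    if (this.entries.size > this.maxEntries) {
      // Map iterates in insertion order, so the first key is the oldest.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
  }

  get(sessionID: string): string | undefined {
    return this.entries.get(sessionID);
  }

  get size(): number {
    return this.entries.size;
  }
}
```

A chat.message handler would call set with the incoming text, and the system-transform hook would call get with the same sessionID when it fires.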

Deferred (P1)

  • Embedding-backed reranking for pugi.code.repo_map (embeddingProvider option is plumbed but only none is implemented).
  • Additional grammars (Rust, Go, Java, Ruby) are loaded lazily but their declaration tables are sketched rather than exhaustive.

License

MIT. See LICENSE.