
pythia-lcs

v3.0.7

Published

Local code search and architectural memory MCP server for AI coding agents. Hybrid retrieval (vector + FTS + reranker) with Tree-sitter AST chunking. Runs entirely on your machine.

Downloads

857

Readme

Pythia

Local Code Search & Architectural Memory for AI Agents


Pythia is an MCP server that gives AI coding agents deep, persistent understanding of any codebase. It runs entirely on your machine — no API keys, no cloud dependencies, no data leaves your laptop.

Human? Skip to the human-readable section. Everything below is written for your AI agent.


Agent Install Guide

You are an AI agent. A human has asked you to install Pythia. This section is written for you. Everything is non-interactive — no user input required at any point. Follow these steps exactly.

Windows Note

If you are running on Windows (PowerShell, cmd, Git Bash), Pythia requires WSL2. The install script is bash-only and Pythia uses Unix domain sockets.

  1. Check if WSL is available: wsl --status
  2. If not installed, tell the user to open an admin PowerShell and run wsl --install, then restart their machine
  3. Run all Pythia commands inside WSL: wsl bash -c 'curl -fsSL https://raw.githubusercontent.com/michaeljboscia/pythia/main/install.sh | bash'
  4. After install, run pythia init and pythia mcp install inside WSL as well

Step 1: Run the Install Script

The install script handles all prerequisites automatically — Node.js, build tools, npm permissions, everything:

curl -fsSL https://raw.githubusercontent.com/michaeljboscia/pythia/main/install.sh | bash

Success signal: output ends with === Pythia installed successfully ===
Failure signal: any line starting with [!] — read the message; it tells you exactly what's wrong and how to fix it.

Step 2: Verify the Binary

pythia --version

Expected: a version number (e.g., 3.0.7).

If command not found, the npm global bin directory may not be in PATH. Fix it:

export PATH="$HOME/.npm-global/bin:$PATH"
pythia --version
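The export above only fixes the current shell session. To make it persist across sessions (a sketch assuming bash; use ~/.zshrc for zsh), append it to the shell profile:

```shell
# Persist the npm global bin directory in PATH for future bash sessions
echo 'export PATH="$HOME/.npm-global/bin:$PATH"' >> "$HOME/.bashrc"
```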

Step 3: Index the User's Project

cd /path/to/user/project
pythia init

First run downloads ~3 GB of GGUF models. This takes 2-10 minutes depending on connection speed. Subsequent runs on other projects reuse the cached models.

Success signal: Output shows an indexing summary with file count and chunk count.

Step 4: Register as MCP Server

pythia mcp install

This writes Pythia's MCP configuration to the agent's config file. After this step, tell the user they need to restart their AI agent session. Pythia's tools become available on the next session start.

Step 5: Confirm MCP Tools Are Available

After the user restarts their session, these MCP tools should be available:

| Tool | What It Does |
|------|--------------|
| lcs_investigate | Semantic, structural, or reasoning-based code search |
| pythia_force_index | Re-index specific files or the entire workspace |
| pythia_corpus_health | Database stats — file count, chunk count, graph edges, model info |
| spawn_oracle | Start a persistent Gemini session loaded with architectural docs |
| ask_oracle | Query the Oracle — responses include numbered source citations |
| oracle_commit_decision | Write a formal Architectural Decision Record (MADR) |
| oracle_decommission | Archive an Oracle session safely (requires TOTP confirmation) |

Agent Troubleshooting

If something fails, use this table to diagnose. Every row is a concrete symptom → fix pair:

| Symptom | Cause | Fix |
|---------|-------|-----|
| command not found: pythia | npm global bin not in PATH | export PATH="$HOME/.npm-global/bin:$PATH" then retry |
| npm ERR! node-gyp rebuild | Missing C++ toolchain | macOS: xcode-select --install / Ubuntu: sudo apt install build-essential python3 |
| EACCES permission error | npm global prefix owned by root | npm config set prefix "$HOME/.npm-global" and update PATH |
| Model download hangs or fails | Insufficient disk or network issue | Need ~5 GB free disk. Check curl -fsSL https://registry.npmjs.org/ works |
| pythia init times out on large repo | Standard tier indexes deeply | Try pythia init --tier light for faster indexing (~2 min vs ~15 min, lower coverage) |
| pythia mcp install — no config written | Unknown MCP client | Manually add Pythia to the agent's MCP config (see Manual MCP Configuration below) |

Manual MCP Configuration

If pythia mcp install doesn't detect your MCP client, add this to your MCP server configuration manually:

{
  "pythia": {
    "command": "pythia",
    "args": ["start"]
  }
}

For Claude Code, this goes in ~/.claude/mcp_config.json. For Gemini CLI, this goes in ~/.gemini/settings.json under mcpServers.
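For the Gemini CLI case, the same entry nests under the mcpServers key. A sketch of ~/.gemini/settings.json (any other keys already in that file are omitted here):

```json
{
  "mcpServers": {
    "pythia": {
      "command": "pythia",
      "args": ["start"]
    }
  }
}
```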

What the Install Script Does (For Agent Context)

The install script is safe to run on any macOS or Linux machine. It performs these checks in order:

  1. curl — installs if missing (apt/dnf/apk)
  2. Disk space — verifies ≥5 GB free, exits with clear error if not
  3. Network — verifies npm registry is reachable, suggests proxy config if not
  4. Node.js 22+ — installs via Homebrew (macOS) or NodeSource (Linux) if missing or outdated
  5. Build tools — installs Xcode CLI tools (macOS, non-interactive) or build-essential (Linux) if missing
  6. npm permissions — redirects global prefix to ~/.npm-global if system-owned (avoids EACCES)
  7. Pythia — npm install -g pythia-lcs
  8. Verification — confirms pythia command exists in PATH

The script does NOT require user interaction at any point. It does NOT modify system files outside of standard package manager operations. It uses sudo only when necessary and only if available (works without sudo in Docker/root contexts).


For Humans

Everything above is written for your AI agent. Here's the context for you.

What Pythia Does

AI agents struggle with large codebases because they can't see everything at once. Pythia gives them a local search engine that understands your code:

  • Hybrid Search: Combines vector search, full-text search, and Reciprocal Rank Fusion — all reranked by a cross-encoder for precision
  • Semantic Chunking: 16 Tree-sitter language grammars parse your code structurally, so search results align with function and class boundaries
  • Architectural Memory: Your agent's reasoning gets persisted alongside the code, building institutional knowledge over time
  • Zero Cost: Three GGUF models run locally on your machine. No API keys, no cloud, no data exfiltration
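The fusion step named above can be sketched as plain Reciprocal Rank Fusion. This is an illustrative implementation, not Pythia's actual code; the function name and the conventional constant k=60 are assumptions.

```javascript
// Reciprocal Rank Fusion: merge several ranked lists into one.
// Each list contributes 1 / (k + rank) per document, so documents
// ranked highly by multiple retrievers accumulate the largest scores.
function reciprocalRankFusion(rankings, k = 60) {
  const scores = new Map();
  for (const ranking of rankings) {
    ranking.forEach((docId, rank) => {
      scores.set(docId, (scores.get(docId) ?? 0) + 1 / (k + rank + 1));
    });
  }
  // Sort by fused score, descending, and return just the ids.
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([docId]) => docId);
}

// Fuse a vector-search ranking with a full-text-search ranking:
const fused = reciprocalRankFusion([
  ['chunk_a', 'chunk_b', 'chunk_c'], // vector search order
  ['chunk_b', 'chunk_d', 'chunk_a'], // FTS order
]);
console.log(fused); // → [ 'chunk_b', 'chunk_a', 'chunk_d', 'chunk_c' ]
```

In the real pipeline the fused list would then go to the cross-encoder reranker for final ordering.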

How to Install

Tell your AI agent:

"Install Pythia from github.com/michaeljboscia/pythia"

Your agent will read this README and handle everything. No manual steps required.

Or install manually:

curl -fsSL https://raw.githubusercontent.com/michaeljboscia/pythia/main/install.sh | bash
cd /your/project
pythia init          # downloads models (~3 GB first time), indexes your code
pythia start         # start the MCP server

Already have Node.js 22+ and a C++ compiler? Skip the script: npm install -g pythia-lcs

System Requirements

| Requirement | Minimum | Notes |
|-------------|---------|-------|
| Node.js | v22+ | Install script handles this |
| RAM | 8 GB | Runs 3 GGUF models simultaneously |
| Storage | ~5 GB | 3 GB models + database (models shared across projects) |
| OS | macOS or Linux | Windows via WSL2 |

Embedding Models

| Tier | Model | Size | Context | File Coverage | Command |
|------|-------|------|---------|---------------|---------|
| Standard | gte-Qwen2-1.5B-instruct Q4_K_M | 1.1 GB | 32K | 100% | pythia init |
| Light | nomic-embed-text-v1.5 Q4 | 80 MB | 8K | ~65% | pythia init --tier light |

We benchmarked 8 models across real codebases. Context window proved more important than model quality — the #1 public benchmark model failed to index 73% of files due to a 2K context limit.

Thought Capture API

Pythia exposes a local HTTP API for agents to persist their reasoning. Thoughts become searchable alongside your code.

  • POST /v1/thoughts — ingest architectural reasoning
  • GET /v1/health — check server readiness

Configuration

Default config at ~/.pythia/config.json:

{
  "embeddings": {
    "mode": "local",
    "tier": "standard"
  }
}

For teams centralizing embedding generation (Ollama, Vertex AI, any OpenAI-compatible endpoint):

{
  "embeddings": {
    "mode": "openai_compatible",
    "base_url": "http://your-gpu-server:11434/v1",
    "api_key": "ollama",
    "model": "nomic-embed-text"
  }
}

Architecture

<your-workspace>/
└── .pythia/
    ├── lcs.db              # SQLite: chunks, embeddings, FTS, graph edges, thoughts
    └── pythia.sock         # Unix domain socket for Thought Capture API

~/.pythia/
├── config.json             # Global configuration
└── models/                 # Cached GGUF models (~3 GB, shared across all projects)
    ├── gte-qwen2-1.5b-Q4_K_M.gguf               (Embedder)
    ├── bge-reranker-v2-m3-Q8_0.gguf             (Cross-encoder reranker)
    └── qwen2.5-coder-1.5b-instruct-q4_k_m.gguf  (Chunk summarizer)

Benchmark Highlights

  1. Context limits matter most: 32K context achieves 100% file coverage vs 27% for 2K-limit models
  2. Chunking normalizes quality: Tree-sitter AST chunking compresses quality spread to 5-9 points
  3. Quantization is practically free: Q4, Q8, and f16 variations within margin of noise
  4. Reranking is essential: Cross-encoder reranking altered final ordering on 90% of queries

Full data and methodology: docs/benchmarks/BENCHMARK-RESULTS.md


Licensed under FSL-1.1-ALv2 (Functional Source License). Converts to Apache 2.0 after two years.