
@13w/local-rag

v1.6.1


Distributed semantic memory + code RAG as an MCP plugin for Claude Code agents


local-rag — Distributed Memory + Code RAG for Claude Code


Semantic memory and code intelligence as an MCP plugin for Claude Code agents. Nine tools give Claude persistent memory, semantic code search, and import graph traversal, all running locally.

What it does

| Tool | Description |
|------|-------------|
| recall(query) | Semantic search across stored memories |
| remember(content) | Store memory with type / scope / tags / importance |
| search_code(query) | Hybrid RAG over indexed codebase |
| get_file_context(file_path) | Read file + list indexed symbols |
| get_dependencies(file_path) | Import graph traversal (forward / reverse / transitive) |
| project_overview() | 3-level directory tree, entry points, top imports |
| forget(memory_id) | Delete a memory permanently |
| consolidate() | Merge semantically similar memories |
| stats() | Memory and index statistics |
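Over MCP's stdio transport, each of these tools is invoked with a standard `tools/call` request. As an illustration, a recall call might look like the following; the request envelope follows the MCP specification, but the exact argument schema is defined by the server, so treat the `arguments` shape as an assumption based on the table above:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "recall",
    "arguments": { "query": "auth token refresh bug" }
  }
}
```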

Stack

  • Qdrant — vector database (Rust, production-ready)
  • Ollama — local embeddings (embeddinggemma:300m)
  • tree-sitter — multi-language code parser (TypeScript, JavaScript, Go, Rust)
  • MCP — Model Context Protocol (stdio transport)

Prerequisites

1. Ollama (local embeddings)

Install: https://ollama.com/download

# Linux
curl -fsSL https://ollama.com/install.sh | sh

# macOS — download the app from:
# https://ollama.com/download/mac

# Windows — download the installer from:
# https://ollama.com/download/windows

Pull the embedding model:

ollama pull embeddinggemma:300m

2. Qdrant (vector database)

Option A — Docker Compose (recommended)

A ready-to-use docker-compose.yml is included in this repo:

docker compose up -d

Exposes ports 6333 (REST) and 6334 (gRPC). Data persists in the named volume qdrant-data.

Option B — Docker run

docker run -d --name qdrant \
  -p 6333:6333 -p 6334:6334 \
  -v qdrant-data:/qdrant/storage \
  qdrant/qdrant

Option C — Qdrant Cloud

https://cloud.qdrant.io/ — set qdrant-url in .memory.json to your cluster endpoint.

3. Node.js 18+

https://nodejs.org/


Installation

From npm (recommended):

npm install -g @13w/local-rag

From source:

git clone https://github.com/13W/local-rag.git
cd local-rag
npm install && npm run build

Claude Code Plugin Setup

Install local-rag

Option A — claude mcp add with npx (no global install needed)

Per-project (stored in .mcp.json, shared with the team):

claude mcp add memory -- npx -y @13w/local-rag serve --config .memory.json

Global — available in all projects on this machine:

claude mcp add memory -s user -- npx -y @13w/local-rag serve --config .memory.json

Option B — .mcp.json directly

{
  "mcpServers": {
    "memory": {
      "type": "stdio",
      "command": "npx",
      "args": ["-y", "@13w/local-rag", "serve", "--config", ".memory.json"]
    }
  }
}

Option C — After global npm install -g

claude mcp add memory -- local-rag serve --config .memory.json

Install Serena (recommended companion)

Serena provides filesystem access and precise symbolic code editing that complements local-rag: local-rag finds code by meaning, Serena reads and edits it surgically.

Repo: https://github.com/oraios/serena

Requirements: Python 3.10+, uv

# Install uv
curl -LsSf https://astral.sh/uv/install.sh | sh

# Register Serena as a Claude Code plugin (per-project)
claude mcp add serena -- uvx --from serena serena-mcp-server --context ide-assistant --project .

Or in .mcp.json:

{
  "mcpServers": {
    "serena": {
      "type": "stdio",
      "command": "uvx",
      "args": ["--from", "serena", "serena-mcp-server", "--context", "ide-assistant", "--project", "."]
    }
  }
}

Combined .mcp.json (both plugins)

{
  "mcpServers": {
    "memory": {
      "type": "stdio",
      "command": "npx",
      "args": ["-y", "@13w/local-rag", "serve", "--config", ".memory.json"]
    },
    "serena": {
      "type": "stdio",
      "command": "uvx",
      "args": ["--from", "serena", "serena-mcp-server", "--context", "ide-assistant", "--project", "."]
    }
  }
}

Agent workflow setup

Run init once in your project root after registering the MCP plugin. It installs hooks that enforce the recall → search → remember protocol on every session and prompt, and writes reference guides into .claude/rules/ so Claude always has the tool conventions at hand.

npx @13w/local-rag init

# If installed globally
local-rag init

Output:

wrote  .claude/hooks/session-start.sh
wrote  .claude/hooks/prompt-reminder.sh
wrote  .claude/settings.json
wrote  .claude/settings.local.json
wrote  .claude/rules/continuous-remember.md
wrote  .claude/rules/memory-protocol-reference.md
wrote  .claude/rules/serena-conventions.md

What each file does:

| File | Purpose |
|------|---------|
| hooks/session-start.sh | Injects the full protocol cheatsheet as a system-reminder at every session start and after context compaction |
| hooks/prompt-reminder.sh | Fires on every user prompt; reminds Claude to recall() before acting and remember() after |
| rules/continuous-remember.md | When and how to call remember() immediately (trigger events, format, anti-patterns) |
| rules/memory-protocol-reference.md | Full tool reference with parameter tables and call examples |
| rules/serena-conventions.md | Serena vs Memory MCP routing guide and end-to-end editing workflow |
| settings.json | Registers the hooks in Claude Code (commit this) |
| settings.local.json | Local hook overrides; add to .gitignore |

Commit .claude/hooks/, .claude/rules/, and .claude/settings.json to share the workflow with your team.


Configuration

Create .memory.json in your project root (auto-discovered if present):

{
  "project-id": "my-project",
  "project-root": ".",
  "qdrant-url": "http://localhost:6333",
  "embed-provider": "ollama",
  "embed-model": "embeddinggemma:300m",
  "ollama-url": "http://localhost:11434"
}

Full config reference

| Key | Default | Description |
|-----|---------|-------------|
| project-id | "default" | Isolates memories and code index per project |
| project-root | config file directory | Root path for code indexing |
| qdrant-url | http://localhost:6333 | Qdrant REST API URL |
| embed-provider | "ollama" | Embedding provider: ollama, openai, voyage |
| embed-model | provider default¹ | Embedding model name |
| embed-dim | 1024 | Embedding vector dimension |
| embed-api-key | "" | API key for OpenAI / Voyage embed providers; falls back to OPENAI_API_KEY / VOYAGE_API_KEY env var |
| embed-url | "" | Custom embedding API endpoint |
| ollama-url | http://localhost:11434 | Ollama API URL |
| agent-id | "default" | Agent identifier (for multi-agent setups) |
| llm-provider | "ollama" | LLM provider: ollama, anthropic, openai |
| llm-model | provider default² | LLM model for reranking / description generation |
| llm-api-key | "" | API key for Anthropic / OpenAI LLM providers; falls back to ANTHROPIC_API_KEY / OPENAI_API_KEY env var |
| llm-url | "" | Custom LLM API endpoint |
| include-paths | [] | Glob patterns to limit indexing scope (monorepos) |
| generate-descriptions | false | Auto-generate LLM descriptions for code chunks (slow) |
| dashboard | true | Enable the live dashboard HTTP server |
| dashboard-port | 0 | Dashboard HTTP port; 0 lets the OS pick a random port |
| collection-prefix | "" | String prepended to all Qdrant collection names (useful on shared Qdrant instances) |
| no-watch | false | Disable automatic file re-indexing when files change (applies during serve) |

¹ embed-model defaults: ollama → embeddinggemma:300m, openai → text-embedding-3-small, voyage → voyage-code-3

² llm-model defaults: ollama → gemma3n:e2b, anthropic → claude-haiku-4-5-20251001, openai → gpt-4o-mini

Resolution order (highest to lowest priority): CLI flag → .memory.json value → environment variable → built-in default.

API key environment variables are provider-specific:

| Provider | embed-api-key env var | llm-api-key env var |
|----------|-----------------------|---------------------|
| openai | OPENAI_API_KEY | OPENAI_API_KEY |
| voyage | VOYAGE_API_KEY | — |
| anthropic | — | ANTHROPIC_API_KEY |

All other keys can also be passed as CLI flags (e.g. --project-id foo). CLI flags override config file values. include-paths is config-file only.
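The resolution order above can be sketched as a simple chained lookup. This is an illustrative sketch, not the actual implementation; the function name, option shapes, and the PROJECT_ID env var are hypothetical:

```typescript
// Hypothetical sketch of the documented resolution order:
// CLI flag → .memory.json value → environment variable → built-in default.
function resolveOption(
  cliFlags: Record<string, string | undefined>,
  configFile: Record<string, string | undefined>,
  env: Record<string, string | undefined>,
  key: string,
  envVar: string,
  builtinDefault: string
): string {
  // Nullish coalescing walks the priority chain left to right.
  return cliFlags[key] ?? configFile[key] ?? env[envVar] ?? builtinDefault;
}

// Example: a value set only in the config file wins over the built-in default.
const projectId = resolveOption(
  {},                          // no CLI flag passed
  { "project-id": "my-project" }, // from .memory.json
  {},                          // no env var set
  "project-id",
  "PROJECT_ID",
  "default"
);
// projectId === "my-project"
```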


Indexing Your Codebase

Before the search_code and get_file_context tools can return results, index the project:

# Index once
npx @13w/local-rag index . --config .memory.json

# Watch mode — re-indexes on file changes
npx @13w/local-rag watch . --config .memory.json

# If installed globally
local-rag index . --config .memory.json
local-rag watch . --config .memory.json

Other indexer commands:

local-rag clear --config .memory.json    # remove all indexed chunks
local-rag stats --config .memory.json    # show collection statistics
local-rag file <abs-path> <root>         # index a single file

Live Dashboard

local-rag serve automatically opens a browser dashboard on a local HTTP port. It displays real-time tool call statistics (calls, bytes, latency, errors per tool), a scrolling request log, a server info bar (project, branch, version, watch status), and an interactive tool playground for testing calls manually.

The port is OS-assigned by default (printed to stderr as [dashboard] http://localhost:PORT). To use a fixed port or disable the dashboard:

{ "dashboard-port": 4242 }
{ "dashboard": false }

Memory Types

| Type | Use for | Decay |
|------|---------|-------|
| episodic | Events, bugs, incidents | Time-decayed |
| semantic | Facts, architecture, decisions | Long-lived |
| procedural | Patterns, conventions, how-to | Long-lived |
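As an illustration, an episodic memory stored via remember might carry arguments like these. The field names mirror the type / scope / tags / importance list in the tool table, but the exact schema, the allowed scope values, and the importance scale are assumptions here, not confirmed by this README:

```json
{
  "content": "Fixed race condition in session refresh: two tabs refreshing the same token simultaneously",
  "type": "episodic",
  "scope": "project",
  "tags": ["auth", "bug"],
  "importance": 0.8
}
```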


Agent Protocol

Run local-rag init (see Agent workflow setup) to install the full RECALL → SEARCH_CODE → THINK → ACT → REMEMBER protocol into your project. The hooks fire automatically — no manual prompting required.