@chiefaia/local-rag

v0.1.0

Local-first RAG over the CAIA monorepo — Ollama embeddings + SQLite vector store, no cloud calls.

Local-first RAG over the CAIA monorepo. Indexes source files into an embedded SQLite vector store, embeds chunks via local Ollama, returns top-K snippets in response to natural-language queries — all without a cloud round-trip.

Used by:

  • @chiefaia/local-llm-router will route "needs context" task types through this layer (LAI-005)
  • @chiefaia/llm-cache reuses the same embedder for semantic prompt cache lookups (LAI-004)

Install requirements

  • Ollama daemon running at 127.0.0.1:11434 (override via OLLAMA_BASE_URL)
  • Embedding model pulled: ollama pull nomic-embed-text
  • Node 20+ (for Float32Array and node:crypto)
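To make the Ollama requirement concrete, here is a minimal sketch of the kind of request an embedder sends. It assumes Ollama's `/api/embeddings` endpoint, which takes `{ model, prompt }` and returns `{ embedding: number[] }`. The names `embedText`, `FetchLike`, and `DEFAULT_BASE_URL` are hypothetical and not part of this package's API; the fetch function is injected so the sketch can run without a live daemon.

```typescript
// Minimal shape of the fetch function this sketch needs (hypothetical type).
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Matches the default address in the requirements list. The real package
// reads OLLAMA_BASE_URL; this sketch takes the base URL as a parameter.
const DEFAULT_BASE_URL = 'http://127.0.0.1:11434';

async function embedText(
  text: string,
  fetchImpl: FetchLike,
  baseUrl: string = DEFAULT_BASE_URL,
  model: string = 'nomic-embed-text',
): Promise<Float32Array> {
  const res = await fetchImpl(`${baseUrl}/api/embeddings`, {
    method: 'POST',
    headers: { 'content-type': 'application/json' },
    body: JSON.stringify({ model, prompt: text }),
  });
  if (!res.ok) {
    throw new Error(`Ollama returned HTTP ${res.status}`);
  }
  const data = (await res.json()) as { embedding?: number[] };
  if (!Array.isArray(data.embedding)) {
    throw new Error('Ollama response had no "embedding" array');
  }
  // Float32 halves storage relative to JS doubles.
  return Float32Array.from(data.embedding);
}
```

On Node 18 or later the global `fetch` can be passed as `fetchImpl`; a stub works for tests.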

Quick start

import { LocalRag } from '@chiefaia/local-rag';

const rag = new LocalRag({ dbPath: '.local-rag.db' });

await rag.indexDirectory('./packages');
const hits = await rag.query('how does the router decide local vs claude', {
  topK: 5,
  minScore: 0.3,
});

for (const hit of hits) {
  console.log(`${hit.score.toFixed(2)}  ${hit.chunk.path}:${hit.chunk.startLine}`);
}

What's in the box

| component | what it does |
|--|--|
| Embedder | POST /api/embeddings to Ollama and return a Float32Array. Defaults to nomic-embed-text (768-dim, 19ms warm on M1 Pro). |
| chunkFile() | Line-window chunker (60 lines, 10 overlap) with a contextual header ([<path> L<a>-<b>]) prepended to each chunk for better retrieval recall. |
| VectorStore | better-sqlite3 with one row per chunk, embedding stored as a BLOB. Brute-force cosine search over all rows on query(). |
| LocalRag | High-level façade: walks a directory, chunks every matching file, embeds and stores, then exposes query(). |
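The line-window chunking in the table can be sketched as follows. This is an illustration under stated assumptions, not the package's `chunkFile()` source: the function name `chunkLines`, the `Chunk` shape, and the handling of the final short window are all hypothetical. It uses the documented defaults (60-line windows, 10 lines of overlap) and the documented header format `[<path> L<a>-<b>]`.

```typescript
interface Chunk {
  path: string;
  startLine: number; // 1-based, inclusive
  endLine: number; // 1-based, inclusive
  text: string; // header line followed by the chunk body
}

function chunkLines(
  path: string,
  source: string,
  windowSize: number = 60,
  overlap: number = 10,
): Chunk[] {
  if (overlap >= windowSize) {
    throw new Error('overlap must be smaller than windowSize');
  }
  const lines = source.split('\n');
  const step = windowSize - overlap; // each window starts 50 lines after the last
  const chunks: Chunk[] = [];
  for (let start = 0; start < lines.length; start += step) {
    const end = Math.min(start + windowSize, lines.length);
    const startLine = start + 1;
    // Contextual header so the embedded text carries its own location.
    const header = `[${path} L${startLine}-${end}]`;
    const body = lines.slice(start, end).join('\n');
    chunks.push({ path, startLine, endLine: end, text: `${header}\n${body}` });
    if (end === lines.length) break; // final window reached the end of the file
  }
  return chunks;
}
```

With these defaults a 130-line file yields three chunks covering lines 1-60, 51-110, and 101-130.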

The store deliberately doesn't use sqlite-vec. For the CAIA monorepo's expected ~50–100k chunks, brute-force cosine completes in well under 100 ms on M1 Pro and avoids a native-binary dependency. LAI-008 can swap in sqlite-vec if the index ever crosses ~1M chunks.
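For reference, a brute-force cosine search of the kind described above can be sketched in a few lines. The types `StoredChunk` and `Hit` and the functions `cosine` and `topK` are hypothetical names for illustration; the real store reads embeddings from SQLite BLOBs, which this sketch replaces with an in-memory array.

```typescript
interface StoredChunk {
  id: number;
  embedding: Float32Array;
}

interface Hit {
  id: number;
  score: number;
}

// Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 for a zero vector.
function cosine(a: Float32Array, b: Float32Array): number {
  if (a.length !== b.length) {
    throw new Error('dimension mismatch');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i];
    const y = b[i];
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Score every row, drop anything under minScore, keep the k best.
function topK(
  query: Float32Array,
  rows: StoredChunk[],
  k: number = 5,
  minScore: number = 0,
): Hit[] {
  return rows
    .map((row) => ({ id: row.id, score: cosine(query, row.embedding) }))
    .filter((hit) => hit.score >= minScore)
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}
```

The scan is linear in the number of rows, which is why it stays attractive at tens of thousands of chunks and less so at millions.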

Embedding-model lock

A meta row stores the embedding model used to build the index. Re-opening the same db with a different Embedder model raises an error before any work happens: different models live in different vector spaces, and silently mixing them breaks retrieval in subtle ways. To recover, delete the db file and re-index, or pin the embedder to the model the index was built with.
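The lock amounts to a check-then-pin step at open time. The sketch below assumes a simple key/value view of the meta row and uses a `Map` as a stand-in for the SQLite table; `MetaStore`, `assertEmbeddingModel`, and the key name `embedding_model` are hypothetical, not the package's actual schema.

```typescript
// Stand-in for the meta table: any key/value store with get/set would do.
type MetaStore = Map<string, string>;

function assertEmbeddingModel(meta: MetaStore, model: string): void {
  const recorded = meta.get('embedding_model');
  if (recorded === undefined) {
    // First open of a fresh index: pin the model for all later opens.
    meta.set('embedding_model', model);
    return;
  }
  if (recorded !== model) {
    // Fail before any embedding or search work is attempted.
    throw new Error(
      `index was built with "${recorded}" but the embedder uses "${model}"; ` +
        'delete the db file or pin the embedder',
    );
  }
}
```

Running the check before any indexing or querying keeps vectors from two different models out of the same table.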

File-walk defaults

  • Includes: .ts .tsx .js .jsx .md .mdx .json .yaml .yml
  • Excludes: node_modules .git dist build .next coverage .turbo
  • Skips files larger than 200 KB (configurable)

Override per-call via indexDirectory(root, { include, exclude, maxFileBytes }).
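As an illustration of how these defaults combine, here is a sketch of a per-file filter. It is not the package's walker: `shouldIndex`, `WalkOptions`, and `WALK_DEFAULTS` are hypothetical names, and the sketch assumes that the exclude list matches directory names anywhere in the path and that 200 KB means 200 * 1024 bytes.

```typescript
interface WalkOptions {
  include: string[]; // file extensions, with leading dot
  exclude: string[]; // directory names to skip
  maxFileBytes: number;
}

const WALK_DEFAULTS: WalkOptions = {
  include: ['.ts', '.tsx', '.js', '.jsx', '.md', '.mdx', '.json', '.yaml', '.yml'],
  exclude: ['node_modules', '.git', 'dist', 'build', '.next', 'coverage', '.turbo'],
  maxFileBytes: 200 * 1024, // assumption: "200 KB" read as KiB
};

function shouldIndex(
  relPath: string,
  sizeBytes: number,
  opts: Partial<WalkOptions> = {},
): boolean {
  const { include, exclude, maxFileBytes } = { ...WALK_DEFAULTS, ...opts };
  const segments = relPath.split('/');
  const fileName = segments[segments.length - 1] ?? '';
  // An excluded directory anywhere above the file rules it out.
  if (segments.slice(0, -1).some((dir) => exclude.includes(dir))) return false;
  if (sizeBytes > maxFileBytes) return false;
  return include.some((ext) => fileName.endsWith(ext));
}
```

Passing a partial options object overrides only the named fields, mirroring the per-call override described above.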

Testing

Unit tests run with mocked fetch (no live Ollama needed):

pnpm --filter @chiefaia/local-rag test

Live smoke script (requires a running Ollama daemon with nomic-embed-text pulled; run pnpm build first):

pnpm --filter @chiefaia/local-rag build
ROOT=./packages QUERY="how does the router decide local vs claude" \
  node packages/local-rag/scripts/smoke.js

Sample output on the local-rag source itself:

[local-rag] indexing src -> /tmp/local-rag-smoke.db
  files=5
  chunks=11, embedding...
[local-rag] indexed 11 chunks

[local-rag] query: "cosine similarity search"

  [0.556] store.ts:1-60
    [store.ts L1-60]
    // Persistent vector store backed by better-sqlite3.
    ...

Performance budget (M1 Pro 16GB, April 2026)

  • nomic-embed-text warm latency: ~19ms / chunk
  • Index 50k chunks: ~16 minutes (sequential — Ollama serializes embedding calls per slot by default)
  • Query (50k chunks indexed): ~50ms (embed prompt + brute-force cosine)
  • Disk: ~600 MB for 50k chunks at 768-dim (Float32 + JSON metadata)

If indexing wall time is too long, set OLLAMA_NUM_PARALLEL=2 on the Ollama daemon to roughly halve embedding time.
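The indexing and disk figures can be sanity-checked with back-of-the-envelope arithmetic. The sketch below uses only the numbers quoted in this section (50k chunks, 768 dimensions, ~19 ms per embedding) and assumes ideal scaling for parallel slots, which real hardware may not deliver. The function names are hypothetical.

```typescript
// Sequential embedding time in minutes; `parallel` assumes perfect scaling.
function estimateIndexMinutes(
  chunks: number,
  msPerChunk: number,
  parallel: number = 1,
): number {
  return (chunks * msPerChunk) / parallel / 1000 / 60;
}

// Raw Float32 vector payload only: 4 bytes per dimension per chunk.
function rawVectorBytes(chunks: number, dim: number): number {
  return chunks * dim * 4;
}

const minutesSequential = estimateIndexMinutes(50000, 19); // about 15.8
const minutesTwoSlots = estimateIndexMinutes(50000, 19, 2); // about 7.9
const vectorMB = rawVectorBytes(50000, 768) / 1e6; // 153.6

console.log(minutesSequential.toFixed(1), minutesTwoSlots.toFixed(1), vectorMB.toFixed(1));
```

The raw vectors come to roughly 154 MB of the quoted ~600 MB; the remainder is presumably chunk text, JSON metadata, and SQLite overhead, which this sketch does not model.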

Roadmap

  • LAI-005 wires this into the router so retrieval-style tasks ("explain this package", "where is X used") get RAG context inserted automatically.
  • Reranking via bge-reranker-v2-m3 is a candidate for v0.2 if precision-at-1 isn't tight enough on the CAIA corpus.
  • AST chunking via tree-sitter is tracked under LAI-007.