
retrieval-kit

v1.0.0


A retrieval search toolkit — vector, BM25, hybrid, and graph-based search with pluggable backends.


retrieval-kit

Retrieval search for Node.js — BM25 keyword, vector semantic, hybrid (RRF), and graph-based retrieval with pluggable backends and a standalone CLI.

Built by FavlinkSoftware / RaggedAI.


Features

| Mode   | Algorithm                          | Best for                              |
|--------|------------------------------------|---------------------------------------|
| bm25   | Okapi BM25                         | Keyword precision, no embedder needed |
| vector | Cosine / dot / Euclidean           | Semantic / meaning-based search       |
| hybrid | BM25 + vector via RRF fusion       | Best of both worlds (recommended)     |
| graph  | Personalized PageRank on doc graph | Relational / context-aware retrieval  |
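The three similarity measures listed for vector mode are standard; a standalone sketch (not the library's internals) of what each computes:

```typescript
// Dot product: raw alignment of two vectors, sensitive to magnitude.
function dot(a: number[], b: number[]): number {
  return a.reduce((sum, x, i) => sum + x * b[i], 0);
}

// Cosine similarity: dot product normalized by magnitudes, in [-1, 1].
function cosine(a: number[], b: number[]): number {
  return dot(a, b) / (Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b)));
}

// Euclidean distance: straight-line distance (lower = more similar).
function euclidean(a: number[], b: number[]): number {
  return Math.sqrt(a.reduce((sum, x, i) => sum + (x - b[i]) ** 2, 0));
}
```

Cosine is the usual default for text embeddings because it ignores vector magnitude; dot product is equivalent to cosine when embeddings are pre-normalized.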

Backends

| Backend   | Class           | Notes                              |
|-----------|-----------------|------------------------------------|
| In-memory | (default)       | No extra deps, perfect for dev/RAG |
| MongoDB   | MongoDBBackend  | Atlas Vector Search                |
| Pinecone  | PineconeBackend | Managed vector DB                  |
| Qdrant    | QdrantBackend   | Open-source vector DB              |
| pgvector  | PgVectorBackend | PostgreSQL + HNSW index            |

Embedders

| Provider | Class          | Default model            |
|----------|----------------|--------------------------|
| OpenAI   | OpenAIEmbedder | text-embedding-3-small   |
| Local    | LocalEmbedder  | all-MiniLM-L6-v2 (ONNX)  |
| Cohere   | CohereEmbedder | embed-english-v3.0       |
| Mock     | MockEmbedder   | Deterministic (testing)  |


Installation

npm install retrieval-kit

Install optional peer dependencies for your chosen backend/embedder:

# OpenAI embeddings
npm install openai

# Local embeddings (no API key needed)
npm install @xenova/transformers

# MongoDB backend
npm install mongodb

# Pinecone backend
npm install @pinecone-database/pinecone

# Qdrant backend
npm install @qdrant/js-client-rest

# pgvector backend
npm install pg

Quick Start

Programmatic API

import { RetrievalKit, OpenAIEmbedder } from "retrieval-kit";

const embedder = new OpenAIEmbedder({ apiKey: process.env.OPENAI_API_KEY });

const kit = new RetrievalKit({ mode: "hybrid", embedder });

await kit.index([
  { id: "1", text: "Retrieval-Augmented Generation improves LLM accuracy." },
  { id: "2", text: "BM25 is a classic keyword-based ranking function." },
  { id: "3", text: "Vector search uses embeddings to find semantic matches." },
]);

const results = await kit.search("what is semantic search?", { topK: 3 });
// results: [{ doc: { id, text, metadata }, score, rank }]

BM25 only (no embedder)

import { BM25Retriever } from "retrieval-kit";

const bm25 = new BM25Retriever({ k1: 1.5, b: 0.75 });
bm25.index(docs);
const results = bm25.search("keyword query", { topK: 5 });
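To see what the `k1` and `b` parameters above control, here is a minimal standalone Okapi BM25 scorer (an illustrative sketch, not the library's implementation — it assumes pre-tokenized documents):

```typescript
// Okapi BM25 score of one document for a query.
// k1 controls term-frequency saturation; b controls document-length normalization.
function bm25Score(
  query: string[],
  doc: string[],
  corpus: string[][],
  k1 = 1.5,
  b = 0.75
): number {
  const N = corpus.length;
  const avgdl = corpus.reduce((sum, d) => sum + d.length, 0) / N;
  let score = 0;
  for (const term of query) {
    const tf = doc.filter((t) => t === term).length;
    if (tf === 0) continue;
    const df = corpus.filter((d) => d.includes(term)).length;
    // Smoothed IDF (Lucene-style +1 inside the log avoids negative values).
    const idf = Math.log((N - df + 0.5) / (df + 0.5) + 1);
    score +=
      (idf * (tf * (k1 + 1))) /
      (tf + k1 * (1 - b + b * (doc.length / avgdl)));
  }
  return score;
}
```

Raising `k1` lets repeated terms keep adding score; raising `b` penalizes long documents more aggressively.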

With a MongoDB backend

import { RetrievalKit, OpenAIEmbedder, MongoDBBackend } from "retrieval-kit";

const backend = new MongoDBBackend({
  uri: process.env.MONGODB_URI,
  dbName: "mydb",
  collectionName: "documents",
  indexName: "vector_index",
  dimensions: 1536,
});

const kit = new RetrievalKit({
  mode: "hybrid",
  embedder: new OpenAIEmbedder({ apiKey: process.env.OPENAI_API_KEY }),
  backend,
});

await kit.index(docs);
const results = await kit.search("semantic query");

Graph retrieval

import { GraphRetriever } from "retrieval-kit";

const graph = new GraphRetriever({ alpha: 0.85, seedTopK: 5 });
await graph.index(docs);
const results = graph.search("context-aware query", { topK: 10 });
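The `alpha` option above is the standard PageRank damping factor. A minimal standalone sketch of personalized PageRank by power iteration (illustrative only — the library's document graph construction is not shown here), where teleportation returns to the seed nodes rather than the uniform distribution:

```typescript
// Personalized PageRank over an adjacency list, by power iteration.
// Mass teleports back to `seeds` with probability (1 - alpha) each step,
// so scores measure proximity to the seed set.
function personalizedPageRank(
  adj: Map<string, string[]>,
  seeds: string[],
  alpha = 0.85,
  iters = 50
): Map<string, number> {
  const nodes = [...adj.keys()];
  const seedMass = 1 / seeds.length;
  let rank = new Map<string, number>(
    nodes.map((n) => [n, seeds.includes(n) ? seedMass : 0])
  );
  for (let i = 0; i < iters; i++) {
    const next = new Map<string, number>(
      nodes.map((n) => [n, seeds.includes(n) ? (1 - alpha) * seedMass : 0])
    );
    for (const [node, outs] of adj) {
      const share = (rank.get(node) ?? 0) / (outs.length || 1);
      for (const out of outs) {
        next.set(out, (next.get(out) ?? 0) + alpha * share);
      }
    }
    rank = next;
  }
  return rank;
}
```

In a retriever, the seeds would typically be the top lexical/vector matches for the query, and the walk pulls in documents connected to them.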

Custom RRF fusion (manual)

import { BM25Retriever, VectorRetriever, RRFFusion, OpenAIEmbedder } from "retrieval-kit";

const bm25 = new BM25Retriever();
const vector = new VectorRetriever({ embedder: new OpenAIEmbedder({ apiKey }) });
const fusion = new RRFFusion(60);

bm25.index(docs);
await vector.index(docs);

const [bm25Results, vectorResults] = await Promise.all([
  bm25.search(query, { topK: 20 }),
  vector.search(query, { topK: 20 }),
]);

const fused = fusion.fuse([vectorResults, bm25Results], {
  weights: [1.2, 0.8],   // boost semantic results
  topK: 10,
});

CLI

# Global install
npm install -g retrieval-kit

# Or use npx
npx retrieval-kit --help

Commands

# Create config
retrieval-kit config init

# Index a file (JSON array or JSONL)
retrieval-kit index docs.json --mode hybrid

# Search
retrieval-kit search "my query" --top-k 5

# Inspect index
retrieval-kit inspect

# Benchmark all modes
retrieval-kit benchmark --query "example query" --runs 10

Document format

[
  { "id": "doc1", "text": "...", "metadata": { "source": "..." } },
  { "id": "doc2", "text": "..." }
]

JSONL (one doc per line) is also supported.
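If your documents live in a JSON array, converting to JSONL is a few lines (file paths here are illustrative):

```typescript
// Convert a JSON array of documents to JSONL: one JSON object per line.
import { readFileSync, writeFileSync } from "node:fs";

function jsonToJsonl(inputPath: string, outputPath: string): void {
  const docs: unknown[] = JSON.parse(readFileSync(inputPath, "utf8"));
  writeFileSync(
    outputPath,
    docs.map((d) => JSON.stringify(d)).join("\n") + "\n"
  );
}
```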

Config file (.retrievalrc.json)

{
  "mode": "hybrid",
  "embedder": {
    "provider": "openai",
    "model": "text-embedding-3-small",
    "apiKey": "${OPENAI_API_KEY}"
  },
  "search": { "topK": 10, "minScore": 0.0 },
  "hybrid": { "vectorWeight": 1.0, "bm25Weight": 1.0, "rrf_k": 60 },
  "bm25": { "k1": 1.5, "b": 0.75 }
}

API Reference

RetrievalKit

| Method | Description |
|--------|-------------|
| new RetrievalKit({ mode, embedder, backend, ...opts }) | Create an instance |
| await kit.index(docs) | Index documents |
| await kit.search(query, { topK, minScore, filter }) | Search |
| kit.size | Number of indexed documents |

Result shape

{
  doc: { id: string, text: string, metadata: object },
  score: number,
  rank: number,
  sources?: Array<{ listIdx, rank, score }>  // hybrid only
}

Integration with RaggedAI

import { RetrievalKit, MongoDBBackend, OpenAIEmbedder } from "retrieval-kit";

// Drop-in retriever for any RAG pipeline
const retriever = new RetrievalKit({
  mode: "hybrid",
  embedder: new OpenAIEmbedder({ apiKey }),
  backend: new MongoDBBackend({ uri, dbName: "raggedai", collectionName: "chunks" }),
});

// In your RAG pipeline:
const context = await retriever.search(userQuery, { topK: 5 });
const contextText = context.map(r => r.doc.text).join("\n\n");
// → pass contextText to your LLM prompt

License

MIT © FavlinkSoftware / RaggedAI