
pgvector-rag

v0.1.0


Lightweight RAG toolkit for pgvector — chunking, hybrid search SQL, MMR, RRF, and more. Zero runtime dependencies.


pgvector-rag

Lightweight RAG toolkit for PostgreSQL + pgvector. Zero runtime dependencies.

Extracted from a production RAG pipeline serving thousands of queries. Provides the algorithms and SQL you need without framework lock-in.

Why this exists

| | pgvector-rag | LangChain | LlamaIndex |
|---|---|---|---|
| Runtime deps | 0 | 50+ | 40+ |
| Bundle size | ~15 KB | ~2 MB | ~1.5 MB |
| DB lock-in | pgvector only | Many adapters | Many adapters |
| Chunking | Built-in | Built-in | Built-in |
| Hybrid search SQL | Yes | No (needs driver) | No |
| MMR | Yes | Yes | Yes |
| RRF fusion | Yes | No | No |
| Bring your own DB client | Yes | No | No |

Install

npm install pgvector-rag

Quick Start

1. Chunk a document

import { chunk } from 'pgvector-rag';

const chunks = chunk(documentText, {
  maxChunkChars: 1200,
  overlapChars: 250,
});

// chunks = [{ index: 0, content: '...', type: 'heading' }, ...]

2. Create the table

import { createChunksTableSQL, createIndexesSQL } from 'pgvector-rag/sql';
import pg from 'pg';

const pool = new pg.Pool({ connectionString: DATABASE_URL });

const { text: createTable } = createChunksTableSQL({ dimensions: 1536 });
await pool.query(createTable);

for (const { text } of createIndexesSQL()) {
  await pool.query(text);
}

3. Upsert chunks with embeddings

import { upsertChunksSQL } from 'pgvector-rag/sql';

const records = chunks.map((c, i) => ({
  id: crypto.randomUUID(),
  documentId: 'doc-123',
  chunkIndex: c.index,
  content: c.content,
  embedding: embeddings[i], // from your embedding API
  metadata: { chunk_type: c.type },
}));

const { text, params } = upsertChunksSQL(records);
await pool.query(text, params);

4. Search with hybrid SQL

import { hybridSearchSQL } from 'pgvector-rag/sql';
import { selectMMR, buildContext, normalizeScores } from 'pgvector-rag';

// Generate the search SQL
const { text, params } = hybridSearchSQL({
  documentId: 'doc-123',
  queryText: 'How does photosynthesis work?',
  embedding: queryEmbedding, // from your embedding API
  limit: 50,
});

// Execute with your DB client
const { rows } = await pool.query(text, params);

// Map to ScoredChunks
const scored = normalizeScores(rows.map(r => ({
  rrfScore: r.rrf_score,
  id: r.id,
  chunkIndex: r.chunk_index,
  content: r.content,
  embedding: r.embedding, // if you fetched it
})));

// Diversify with MMR
const selected = selectMMR(scored, 10, 0.7);

// Build context string for your LLM
const context = buildContext(selected, 5000);

API Reference

Chunking

chunk(text, options?)

Split text into chunks with section-awareness, sentence boundaries, and overlap.

chunk(text: string, options?: {
  maxChunkChars?: number;  // default: 1200
  overlapChars?: number;   // default: 250
  maxChunks?: number;      // default: Infinity
}): Chunk[]

sanitizeText(text)

Strip null bytes and control characters.

detectChunkType(content)

Classify a chunk as 'heading', 'list', or 'paragraph'.

Vector Math

cosineSimilarity(a, b)

Cosine similarity between two vectors. Returns [-1, 1].

l2Normalize(vector)

L2-normalize a vector to unit length. Returns a new array.
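For reference, illustrative implementations of the two helpers (these mirror the documented behavior; the library's own source may differ):

```typescript
// Cosine similarity: dot(a, b) / (|a| * |b|). Result lies in [-1, 1].
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Scale a vector to unit length; returns a new array, input untouched.
function l2Normalize(vector: number[]): number[] {
  const norm = Math.sqrt(vector.reduce((s, x) => s + x * x, 0));
  return vector.map((x) => x / norm);
}
```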

MMR (Maximal Marginal Relevance)

selectMMR(candidates, k, lambda)

Select k chunks balancing relevance and diversity.

  • lambda = 1.0 → pure relevance (no diversity)
  • lambda = 0.0 → pure diversity (ignore scores)
  • lambda = 0.7 → good default for QA
  • lambda = 0.5 → good default for summaries

Falls back to Jaccard token similarity when embeddings are absent.
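For intuition, the greedy MMR loop can be sketched as below. This is a simplified illustration of the lambda trade-off, not the library's implementation (which also handles the Jaccard fallback):

```typescript
interface ScoredChunk { score: number; embedding: number[]; content: string; }

function cosine(a: number[], b: number[]): number {
  const dot = a.reduce((s, x, i) => s + x * b[i], 0);
  const na = Math.sqrt(a.reduce((s, x) => s + x * x, 0));
  const nb = Math.sqrt(b.reduce((s, x) => s + x * x, 0));
  return dot / (na * nb);
}

// Greedy MMR: each step picks the candidate maximizing
// lambda * relevance - (1 - lambda) * (max similarity to already-selected).
function mmrSketch(candidates: ScoredChunk[], k: number, lambda: number): ScoredChunk[] {
  const selected: ScoredChunk[] = [];
  const remaining = [...candidates];
  while (selected.length < k && remaining.length > 0) {
    let bestIdx = 0, bestVal = -Infinity;
    for (let i = 0; i < remaining.length; i++) {
      const maxSim = selected.length
        ? Math.max(...selected.map((s) => cosine(remaining[i].embedding, s.embedding)))
        : 0;
      const val = lambda * remaining[i].score - (1 - lambda) * maxSim;
      if (val > bestVal) { bestVal = val; bestIdx = i; }
    }
    selected.push(remaining.splice(bestIdx, 1)[0]);
  }
  return selected;
}
```

At lambda = 1 this degenerates to top-k by score; lowering lambda penalizes candidates similar to chunks already chosen.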

Context Building

buildContext(chunks, maxChars)

Format chunks into an LLM context string. Sorts by index, adds --- gap separators, respects character budget.
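A sketch of that behavior (illustrative only; exact separator and budget accounting in the library may differ):

```typescript
interface SelectedChunk { chunkIndex: number; content: string; }

// Sort by chunk index, join with "---" separators, and stop adding
// chunks once the character budget would be exceeded.
function buildContextSketch(chunks: SelectedChunk[], maxChars: number): string {
  const sorted = [...chunks].sort((a, b) => a.chunkIndex - b.chunkIndex);
  const parts: string[] = [];
  let used = 0;
  for (const c of sorted) {
    const cost = c.content.length + (parts.length ? 5 : 0); // "\n---\n"
    if (used + cost > maxChars) break;
    parts.push(c.content);
    used += cost;
  }
  return parts.join('\n---\n');
}
```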

Scoring

normalizeScores(chunkRows)

Convert raw ChunkRow objects (from hybrid search) into ScoredChunk objects.

deduplicateByIndex(items)

Keep highest-scoring entry per chunkIndex.
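The semantics are equivalent to this sketch (not the library source):

```typescript
interface Scored { chunkIndex: number; score: number; }

// Keep only the highest-scoring entry for each chunkIndex.
function deduplicateByIndexSketch<T extends Scored>(items: T[]): T[] {
  const best = new Map<number, T>();
  for (const item of items) {
    const prev = best.get(item.chunkIndex);
    if (!prev || item.score > prev.score) best.set(item.chunkIndex, item);
  }
  return [...best.values()];
}
```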

Summary Sampling

selectSummaryRepresentatives(candidates, bucketSize?, maxReps?)

Pick one representative per document section for broad coverage.

Query Classification

classifyQueryType(query)

Regex-based classification: 'instructional', 'informational', or 'definitional'.

isInstructionalQuery(query)

Quick boolean check.

RRF (Reciprocal Rank Fusion)

getRRFWeights(query, queryType?, config?)

Get RRF signal weights tuned for the query type.

getThresholds(queryType?, config?)

Get similarity/BM25 thresholds for the query type.
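These weights feed the standard reciprocal rank fusion formula, where each ranked signal contributes weight / (k + rank). A sketch of that fusion (the constant k defaults to 60 per the Configuration section; this is the textbook formula, not the library's code):

```typescript
// Reciprocal rank fusion: an item's fused score is the sum over ranked
// lists of weight / (k + rank), with rank starting at 1.
function rrfFuse(
  rankings: string[][],  // each list: ids ordered best-first
  weights: number[],     // one weight per list
  k = 60,
): Map<string, number> {
  const scores = new Map<string, number>();
  rankings.forEach((list, li) => {
    list.forEach((id, idx) => {
      const rank = idx + 1;
      scores.set(id, (scores.get(id) ?? 0) + weights[li] / (k + rank));
    });
  });
  return scores;
}
```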

Legacy Reranker

legacyRerank(query, candidates, topN)

Term-frequency + proximity reranker. Use as a fallback when a cross-encoder (Cohere, etc.) is unavailable.

Configuration

createConfig(overrides?)

Create a RAGConfig with sensible production defaults, optionally overriding specific values.

DEFAULT_CONFIG

Frozen default config with 25+ tuning knobs. See src/core/config.ts.

Concurrency

new Semaphore(max)

Counting semaphore for rate-limiting concurrent operations (e.g., embedding API calls).

const sem = new Semaphore(4);
await sem.acquire();
try { /* work */ } finally { sem.release(); }
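For example, capping concurrent embedding calls while mapping over texts. The inline Semaphore here is an illustrative minimal implementation matching the documented acquire/release API, and embedOne is a hypothetical embedding function:

```typescript
// Minimal counting semaphore (illustrative; the library ships its own).
class Semaphore {
  private queue: Array<() => void> = [];
  private count: number;
  constructor(max: number) { this.count = max; }
  async acquire(): Promise<void> {
    if (this.count > 0) { this.count--; return; }
    await new Promise<void>((resolve) => this.queue.push(resolve));
  }
  release(): void {
    const next = this.queue.shift();
    if (next) next();           // hand the slot directly to a waiter
    else this.count++;
  }
}

// Embed all texts with at most 4 calls in flight at once.
async function embedAll(
  texts: string[],
  embedOne: (t: string) => Promise<number[]>,  // hypothetical embedding call
): Promise<number[][]> {
  const sem = new Semaphore(4);
  return Promise.all(texts.map(async (t) => {
    await sem.acquire();
    try { return await embedOne(t); } finally { sem.release(); }
  }));
}
```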

SQL Generators (pgvector-rag/sql)

hybridSearchSQL(options)

3-CTE query combining vector similarity + BM25 + phrase matching via RRF.

createChunksTableSQL(options?)

CREATE TABLE with vector column, tsvector, and unique constraint.

createIndexesSQL(options?)

HNSW vector index + GIN text search index + document_id index.

upsertChunksSQL(chunks, tableName?)

Batch INSERT … ON CONFLICT with vector and jsonb casting.

deleteChunksSQL(documentId, tableName?)

DELETE all chunks for a document.

Using with ORMs

Knex

import { hybridSearchSQL } from 'pgvector-rag/sql';

const { text, params } = hybridSearchSQL({ ... });
const rows = await knex.raw(text, params);

Drizzle

import { sql } from 'drizzle-orm';
import { hybridSearchSQL } from 'pgvector-rag/sql';

const { text, params } = hybridSearchSQL({ ... });
const rows = await db.execute(sql.raw(text, ...params));

Prisma

import { hybridSearchSQL } from 'pgvector-rag/sql';

const { text, params } = hybridSearchSQL({ ... });
const rows = await prisma.$queryRawUnsafe(text, ...params);

Configuration

Every algorithm is configurable via createConfig():

import { createConfig } from 'pgvector-rag';

const config = createConfig({
  rrfK: 100,          // RRF constant (default: 60)
  kQA: 15,            // Final chunks for QA (default: 10)
  kSummary: 20,       // Final chunks for summaries (default: 14)
  mmrLambdaQA: 0.8,   // MMR trade-off for QA (default: 0.7)
  simThreshold: 0.2,  // Minimum cosine similarity (default: 0.15)
});

Pass config to getRRFWeights() and getThresholds().

Coming Soon

  • Pipeline builder (createPipeline({ embedder, db }))
  • Embedder adapters (OpenAI, Cohere, HuggingFace)
  • Reranker adapters (Cohere cross-encoder, BGE)
  • Streaming chunk insertion
  • Chunk overlap deduplication

License

MIT