
vectra-js

v1.0.2

A production-ready, provider-agnostic Node.js SDK for End-to-End RAG pipelines.

Vectra (Node.js)

Vectra is a production-grade, provider-agnostic Node.js SDK for building end-to-end Retrieval-Augmented Generation (RAG) systems. It is designed for teams that need flexibility, extensibility, correctness, and observability across embeddings, vector databases, retrieval strategies, and LLM providers—without locking into a single vendor.

If you find this project useful, consider supporting it: star the project on GitHub, sponsor me on GitHub, or buy me a coffee.

Table of Contents

  1. Overview
  2. Design Goals & Philosophy
  3. Feature Matrix
  4. Installation
  5. Quick Start
  6. Core Concepts
  7. Configuration Reference (Usage‑Driven)
  8. Ingestion Pipeline
  9. Querying & Streaming
  10. Conversation Memory
  11. Evaluation & Quality Measurement
  12. CLI
  13. Observability & Callbacks
  14. Telemetry
  15. Database Schemas & Indexing
  16. Extending Vectra
  17. Architecture Overview
  18. Development & Contribution Guide
  19. Production Best Practices


1. Overview

Vectra provides a fully modular RAG pipeline:

Load → Chunk → Embed → Store → Retrieve → Rerank → Plan → Ground → Generate → Stream

Every stage is explicitly configurable, validated at runtime, and observable.

Key Characteristics

  • Provider‑agnostic LLM & embedding layer
  • Multiple vector backends (Postgres, Chroma, Qdrant, Milvus)
  • Advanced retrieval strategies (HyDE, Multi‑Query, Hybrid RRF, MMR)
  • Unified streaming interface
  • Built‑in evaluation & observability
  • CLI + SDK parity

2. Design Goals & Philosophy

Explicitness over Magic

Vectra avoids hidden defaults. Chunking, retrieval, grounding, memory, and generation behavior are always explicit.

Production‑First

Index helpers, rate limiting, embedding cache, observability, and evaluation are first‑class features.

Provider Neutrality

Swapping OpenAI → Gemini → Anthropic → Ollama requires no application code changes.

Extensibility

Every major subsystem (providers, vector stores, callbacks) is interface‑driven.


3. Feature Matrix

Providers

  • Embeddings: OpenAI, Gemini, Ollama, HuggingFace
  • Generation: OpenAI, Gemini, Anthropic, Ollama, OpenRouter, HuggingFace
  • Streaming: Unified async generator

Vector Stores

  • PostgreSQL (Prisma + pgvector)
  • PostgreSQL (native pg driver)
  • ChromaDB
  • Qdrant
  • Milvus

Retrieval Strategies

  • Naive cosine similarity
  • HyDE (Hypothetical Document Embeddings)
  • Multi‑Query expansion
  • Hybrid semantic + lexical (RRF)
  • MMR diversification

4. Installation

Library

npm install vectra-js
# or
pnpm add vectra-js

Backends:

npm install pg                         # https://node-postgres.com/
npm install @prisma/client             # https://prisma.io/docs
npm install chromadb                   # https://docs.trychroma.com/
npm install @qdrant/js-client-rest     # https://qdrant.tech/documentation/
npm install @zilliz/milvus2-sdk-node   # https://milvus.io/docs/

CLI

npm i -g vectra-js
# or
pnpm add -g vectra-js

5. Quick Start

const { VectraClient, ProviderType } = require('vectra-js');
const { Pool } = require('pg');

const pool = new Pool({
  connectionString: process.env.DATABASE_URL
});

const client = new VectraClient({
  embedding: {
    provider: ProviderType.OPENAI,
    apiKey: process.env.OPENAI_API_KEY,
    modelName: 'text-embedding-3-small'
  },
  llm: {
    provider: ProviderType.GEMINI,
    apiKey: process.env.GOOGLE_API_KEY,
    modelName: 'gemini-2.5-flash'
  },
  database: {
    type: 'postgres',
    clientInstance: pool,
    tableName: 'document',
    columnMap: { content: 'content', metadata: 'metadata', vector: 'vector' }
  }
});

// Top-level await is not available in CommonJS modules, so wrap the calls:
async function main() {
  await client.ingestDocuments('./docs');
  const res = await client.queryRAG('What is the vacation policy?');
  console.log(res.answer);
}

main().catch(console.error);

6. Core Concepts

Providers

Providers implement embeddings, generation, or both. Vectra normalizes outputs and streaming across providers.

Vector Stores

Vector stores persist embeddings and metadata. They are fully swappable via config.

Chunking

  • Recursive: Character‑aware, separator‑aware splitting
  • Agentic: LLM‑driven semantic propositions (best for policies, legal docs)

Retrieval

Controls recall vs precision using multiple strategies.

Reranking

Optional LLM‑based reordering of retrieved chunks.

Metadata Enrichment

Optional per‑chunk summaries, keywords, and hypothetical questions generated at ingestion time.

Query Planning & Grounding

Controls how context is assembled and how strictly answers must be grounded in retrieved text.

Conversation Memory

Persist multi‑turn chat history across sessions.


7. Configuration Reference (Usage‑Driven)

All configuration is validated using Zod at runtime.

Embedding

embedding: {
  provider: ProviderType.OPENAI,
  apiKey: process.env.OPENAI_API_KEY,
  modelName: 'text-embedding-3-small',
  dimensions: 1536
}

Set dimensions when using pgvector so the embedding size matches the vector column definition and avoids runtime mismatches.


LLM

llm: {
  provider: ProviderType.GEMINI,
  apiKey: process.env.GOOGLE_API_KEY,
  modelName: 'gemini-2.5-flash',
  temperature: 0.3,
  maxTokens: 1024
}

Used for:

  • Answer generation
  • HyDE & Multi‑Query
  • Agentic chunking
  • Reranking

Database

Supports Prisma, Postgres (native), Chroma, Qdrant, Milvus.

// PostgreSQL (native pg)
database: {
  type: 'postgres',
  clientInstance: pool, // new Pool(...)
  tableName: 'document',
  columnMap: { content: 'content', metadata: 'metadata', vector: 'vector' }
}
// Prisma
database: {
  type: 'prisma',
  clientInstance: prisma,
  tableName: 'Document',
  columnMap: { content: 'content', metadata: 'metadata', vector: 'embedding' }
}
// ChromaDB
database: {
  type: 'chroma',
  clientInstance: chromaClient,
  collectionName: 'rag_collection'
}
// Qdrant
database: {
  type: 'qdrant',
  clientInstance: qdrantClient,
  collectionName: 'rag_collection'
}
// Milvus
database: {
  type: 'milvus',
  clientInstance: milvusClient,
  collectionName: 'rag_collection'
}

Chunking

chunking: {
  strategy: ChunkingStrategy.RECURSIVE,
  chunkSize: 1000,
  chunkOverlap: 200
}

Agentic chunking:

chunking: {
  strategy: ChunkingStrategy.AGENTIC,
  agenticLlm: {
    provider: ProviderType.OPENAI,
    apiKey: process.env.OPENAI_API_KEY,
    modelName: 'gpt-4o-mini'
  }
}

Retrieval

retrieval: { strategy: RetrievalStrategy.HYBRID }

HYBRID is recommended for production.


Reranking

reranking: {
  enabled: true,
  windowSize: 20,
  topN: 5
}

Memory

memory: { enabled: true, type: 'in-memory', maxMessages: 20 }

Redis and Postgres are supported.

// Redis
memory: {
  enabled: true,
  type: 'redis',
  maxMessages: 20,
  redis: {
    clientInstance: redisClient,
    keyPrefix: 'vectra:chat:'
  }
}
// Postgres
memory: {
  enabled: true,
  type: 'postgres',
  maxMessages: 20,
  postgres: {
    clientInstance: pool, // pg Pool
    tableName: 'ChatMessage',
    columnMap: {
      sessionId: 'sessionId',
      role: 'role',
      content: 'content',
      createdAt: 'createdAt'
    }
  }
}

Observability

observability: {
  enabled: true,
  sqlitePath: 'vectra-observability.db'
}

8. Ingestion Pipeline

await client.ingestDocuments('./documents');

Supports files or directories.

Formats: PDF, DOCX, XLSX, TXT, Markdown


9. Querying & Streaming

const res = await client.queryRAG('Refund policy?');

Streaming:

const stream = await client.queryRAG('Draft email', null, true);
for await (const chunk of stream) process.stdout.write(chunk.delta || '');

10. Conversation Memory

Pass a sessionId to maintain context across turns.
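Building on the call shape in the streaming example above, a minimal sketch of a multi-turn exchange. This assumes queryRAG accepts a session ID as its second parameter (matching the null placeholder shown earlier) and that client is the configured VectraClient from the Quick Start:

```javascript
const crypto = require('crypto');

// One ID per conversation; reuse it on every turn of that conversation.
const sessionId = crypto.randomUUID();

async function chat() {
  // First turn is recorded under sessionId.
  const first = await client.queryRAG('What is our leave policy?', sessionId);
  console.log(first.answer);

  // Follow-up turn: earlier messages for sessionId are loaded as context,
  // so "it" can resolve against the previous exchange.
  const followUp = await client.queryRAG('Does it carry over?', sessionId);
  console.log(followUp.answer);
}
```

Using a fresh UUID per conversation keeps sessions isolated; reusing the same ID resumes the stored history.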


11. Evaluation & Quality Measurement

await client.evaluate([{ question: 'Capital of France?', expectedGroundTruth: 'Paris' }]);

Metrics:

  • Faithfulness
  • Relevance

12. CLI

Ingest & Query

vectra ingest ./docs --config=./config.json
vectra query "What is our leave policy?" --config=./config.json --stream

WebConfig (Config Generator UI)

vectra webconfig

WebConfig launches a local web UI that:

  • Guides you through building a valid vectra.config.json
  • Validates all options interactively
  • Prevents misconfiguration

This is ideal for:

  • First‑time setup
  • Non‑backend users
  • Sharing configs across teams
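The generated file mirrors the SDK options from section 7. The fragment below is a hypothetical sketch of a minimal vectra.config.json; the key names follow the SDK config shown above, but the string spellings of the provider and strategy enums are illustrative, and the exact file shape is defined by the WebConfig tool:

```json
{
  "embedding": {
    "provider": "openai",
    "modelName": "text-embedding-3-small",
    "dimensions": 1536
  },
  "llm": {
    "provider": "gemini",
    "modelName": "gemini-2.5-flash",
    "temperature": 0.3
  },
  "retrieval": { "strategy": "hybrid" },
  "reranking": { "enabled": true, "windowSize": 20, "topN": 5 }
}
```

API keys are best kept in environment variables rather than committed inside a shared config file.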

Observability Dashboard

vectra dashboard

The Observability Dashboard is a local web UI backed by SQLite that visualizes:

  • Ingestion latency
  • Query latency
  • Retrieval & generation traces
  • Chat sessions

It helps you:

  • Debug RAG quality issues
  • Understand latency bottlenecks
  • Monitor production‑like workloads

13. Observability & Callbacks

Observability

Tracks metrics, traces, and sessions automatically when enabled.

Callbacks

Lifecycle hooks:

  • Ingestion
  • Chunking
  • Embedding
  • Retrieval
  • Reranking
  • Generation
  • Errors
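To make the hook list concrete, here is a sketch of how lifecycle callbacks of this kind are typically wired. The hook names and registration shape below are illustrative placeholders, not the library's documented API:

```javascript
// Hypothetical sketch: hook names are placeholders for the stages listed above.
const client = new VectraClient({
  // ...embedding, llm, database config as in the Quick Start
  callbacks: {
    onRetrievalEnd: ({ query, chunks }) =>
      console.log(`retrieved ${chunks.length} chunks for "${query}"`),
    onError: ({ stage, error }) =>
      console.error(`stage ${stage} failed:`, error.message)
  }
});
```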

14. Telemetry

Vectra collects anonymous usage data to help us improve the SDK, prioritize features, and detect broken versions.

What we track

  • Identity: A random UUID (distinct_id) stored locally in ~/.vectra/telemetry.json. No PII, emails, IPs, or hostnames.
  • Events:
    • sdk_initialized: Config shape (providers used), OS/Runtime version, session type (api/cli/chat).
    • ingest_started/completed: Source type, chunking strategy, duration bucket, chunk count bucket.
    • query_executed: Retrieval strategy, query mode (rag), result count, latency bucket.
    • feature_used: WebConfig/Dashboard usage.
    • evaluation_run: Dataset size bucket.
    • error_occurred: Error type and stage (no stack traces).
    • cli_command_used: Command name and flags.

Why we track it

  • Detect broken versions: Spikes in error_occurred help us find bugs.
  • Measure adoption: Helps us understand which providers (OpenAI vs Gemini) and vector stores are most popular.
  • Drop support safely: We can see if anyone is still using Node 18 before dropping it.

How to opt-out

Telemetry is enabled by default. To disable it:

Option 1: Config

const client = new VectraClient({
  // ...
  telemetry: { enabled: false }
});

Option 2: Environment Variable

Set VECTRA_TELEMETRY_DISABLED=1 or DO_NOT_TRACK=1.
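For example, in a shell session or a service's environment file:

```shell
# Either variable disables telemetry; DO_NOT_TRACK is a cross-tool convention.
export VECTRA_TELEMETRY_DISABLED=1
# or
export DO_NOT_TRACK=1
```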


15. Database Schemas & Indexing

model Document {
  id        String   @id @default(uuid())
  content   String
  metadata  Json
  vector    Unsupported("vector")?
  createdAt DateTime @default(now())
}
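For the native-pg path, a hypothetical DDL sketch mirroring the Prisma model above. It assumes the pgvector extension, the default column names from the database config, and a 1536-dimension column to match text-embedding-3-small; the HNSW index is one common choice, not a requirement of the SDK:

```sql
-- Assumes pgvector is installed and PostgreSQL 13+ (for gen_random_uuid()).
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE document (
  id          uuid PRIMARY KEY DEFAULT gen_random_uuid(),
  content     text NOT NULL,
  metadata    jsonb,
  vector      vector(1536),
  "createdAt" timestamptz NOT NULL DEFAULT now()
);

-- Approximate-nearest-neighbour index using cosine distance.
CREATE INDEX ON document USING hnsw (vector vector_cosine_ops);
```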

16. Extending Vectra

Custom Vector Store

// Parameter names below are illustrative; match the VectorStore interface.
class MyStore extends VectorStore {
  // Persist chunk content, metadata, and embeddings to your backend.
  async addDocuments(documents) { /* ... */ }

  // Return the k nearest chunks for a query embedding.
  async similaritySearch(queryVector, k) { /* ... */ return []; }
}

17. Architecture Overview

  • VectraClient: orchestrator
  • Typed config schema
  • Interface‑driven providers & stores
  • Unified streaming abstraction

18. Development & Contribution Guide

  • Node.js 18+
  • pnpm recommended
  • Lint: pnpm run lint

19. Production Best Practices

  • Match embedding dimensions to pgvector
  • Prefer HYBRID retrieval
  • Enable observability in staging
  • Evaluate before changing chunk sizes

Vectra scales cleanly from local prototypes to production‑grade RAG platforms.