beaver-rag

v0.2.0

Lightweight TypeScript RAG system with PostgreSQL, OpenAI & Cohere - Built by beavers, for builders 🦫

Downloads

297

beaver-rag 🦫

Lightweight TypeScript RAG (Retrieval-Augmented Generation) system with PostgreSQL and OpenAI

A production-ready, extensible RAG system built with TypeScript, designed for developers who want to add semantic search and retrieval capabilities to their applications. Built by beavers, for builders.

Features

  • Lightweight & Fast: Minimal dependencies, optimized for performance
  • Type-safe: Full TypeScript support with comprehensive type definitions
  • Extensible: Plugin architecture for embeddings, databases, and chunking strategies
  • Production-ready: Built with best practices, error handling, and validation
  • Multiple Chunking Strategies: Fixed-size and recursive text splitting
  • Powerful Search: Vector similarity search with metadata filtering
  • Developer-friendly: Clean API, detailed error messages, and comprehensive documentation

Installation

# Using Bun
bun add beaver-rag

# Using npm
npm install beaver-rag

# Using pnpm
pnpm add beaver-rag

Requirements

  • Node.js >= 18 or Bun >= 1.0
  • PostgreSQL >= 14 with pgvector extension
  • OpenAI API key

Quick Start

1. Set up your environment

Create a .env file:

OPENAI_API_KEY=sk-your-openai-api-key
DATABASE_URL=postgresql://user:password@localhost:5432/your_database

2. Install pgvector extension

CREATE EXTENSION IF NOT EXISTS vector;

3. Use the RAG system

import { RAGClient, OpenAIEmbeddings, PostgresDatabase } from 'beaver-rag';

// Initialize providers
const embeddings = new OpenAIEmbeddings({
  apiKey: process.env.OPENAI_API_KEY!,
  model: 'text-embedding-3-small',
});

const database = new PostgresDatabase({
  connectionString: process.env.DATABASE_URL!,
});

// Create RAG client
const rag = new RAGClient({
  embeddings,
  database,
});

// Initialize (creates tables, indexes, etc.)
await rag.initialize();

// Add documents
await rag.addDocument(
  'TypeScript is a strongly typed programming language that builds on JavaScript.',
  { category: 'programming', language: 'typescript' }
);

// Search
const results = await rag.search('What is TypeScript?', {
  limit: 5,
  minSimilarity: 0.7,
});

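// results can be empty when nothing meets minSimilarity, so guard before indexing
if (results.length > 0) {
  console.log(results[0].document.content);
  console.log(`Similarity: ${results[0].similarity}`);
}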

// Clean up
await rag.close();

API Documentation

RAGClient

The main entry point for the RAG system.

Constructor

new RAGClient(config: RAGConfig)

Options:

  • embeddings: An embeddings provider (e.g., OpenAIEmbeddings)
  • database: A database provider (e.g., PostgresDatabase)
  • chunking: (Optional) A chunking strategy (defaults to RecursiveChunking)

Methods

initialize(): Promise<void>

Initializes the RAG system, connects to the database, and sets up schema/indexes.

addDocument(content: string, metadata?: Record<string, any>): Promise<string>

Adds a single document to the system. Returns the document ID.

addDocuments(docs: DocumentInput[]): Promise<string[]>

Adds multiple documents in one batch. Returns an array of document IDs.

search(query: string, options?: SearchOptions): Promise<SearchResult[]>

Searches for relevant documents.

Options:

  • limit: Maximum results to return (default: 10)
  • minSimilarity: Minimum similarity score 0-1 (optional)
  • filters: Metadata filters as key-value pairs (optional)

updateDocument(id: string, content: string, metadata?: Record<string, any>): Promise<void>

Updates an existing document.

deleteDocument(id: string): Promise<boolean>

Deletes a document by ID.

close(): Promise<void>

Closes the connection to the database.

OpenAIEmbeddings

Provider for OpenAI embeddings.

new OpenAIEmbeddings({
  apiKey: string,
  model?: 'text-embedding-3-small' | 'text-embedding-3-large' | 'text-embedding-ada-002',
  dimensions?: number, // For text-embedding-3-* models
})

PostgresDatabase

Provider for PostgreSQL with pgvector.

new PostgresDatabase({
  connectionString: string,
  tableName?: string, // Default: 'rag_documents'
  dimensions?: number, // Default: 1536
  indexType?: 'hnsw' | 'ivfflat', // Default: 'hnsw'
})
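The embedding dimensions and the table dimensions have to agree, since a pgvector column has a fixed size. The sketch below shows one way to check this before initializing. `NATIVE_DIMENSIONS`, `resolveDimensions`, and `checkDimensions` are hypothetical helpers, not part of beaver-rag, and whether the library performs a similar check itself is not stated here. The native sizes listed are the OpenAI defaults for each model.

```typescript
// Hypothetical helpers for illustration; not part of the beaver-rag API.
type EmbeddingModel =
  | 'text-embedding-3-small'
  | 'text-embedding-3-large'
  | 'text-embedding-ada-002';

// Default output sizes of the OpenAI embedding models.
const NATIVE_DIMENSIONS: Record<EmbeddingModel, number> = {
  'text-embedding-3-small': 1536,
  'text-embedding-3-large': 3072,
  'text-embedding-ada-002': 1536,
};

// An explicit `dimensions` value overrides the model's native size.
function resolveDimensions(model: EmbeddingModel, dimensions?: number): number {
  return dimensions ?? NATIVE_DIMENSIONS[model];
}

// Throws when the embedding size does not match the table's vector size.
// 1536 mirrors the documented PostgresDatabase default.
function checkDimensions(
  model: EmbeddingModel,
  embeddingDimensions?: number,
  tableDimensions: number = 1536,
): void {
  const actual = resolveDimensions(model, embeddingDimensions);
  if (actual !== tableDimensions) {
    throw new Error(
      `Embedding size ${actual} (${model}) does not match table size ${tableDimensions}`,
    );
  }
}

checkDimensions('text-embedding-3-small'); // passes: 1536 === 1536
checkDimensions('text-embedding-3-large', 1536); // passes: reduced to 1536
```

With these assumptions, switching to text-embedding-3-large without also setting `dimensions` on one side or the other would fail the check, because its native size is 3072.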

Chunking Strategies

RecursiveChunking (Default)

Splits text while preserving semantic meaning using a hierarchy of separators.

import { RecursiveChunking } from 'beaver-rag';

const chunking = new RecursiveChunking({
  chunkSize: 400,
  chunkOverlap: 80,
  separators: ['\n\n', '\n', '. ', ' '], // Optional
});
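To make the "hierarchy of separators" idea concrete, here is a minimal sketch of recursive splitting. It is not the beaver-rag implementation: `recursiveSplit` is a hypothetical function, sizes are counted in characters (the library may count differently), and overlap is left out to keep the logic short. The sketch tries the coarsest separator first and only falls back to finer ones for pieces that are still too large.

```typescript
// Hypothetical sketch of recursive splitting; not the library's code.
// Sizes are measured in characters and chunk overlap is omitted.
function recursiveSplit(text: string, chunkSize: number, separators: string[]): string[] {
  if (text.length <= chunkSize) return text.length > 0 ? [text] : [];

  if (separators.length === 0) {
    // No separators left: fall back to hard cuts of chunkSize characters.
    const cuts: string[] = [];
    for (let i = 0; i < text.length; i += chunkSize) {
      cuts.push(text.slice(i, i + chunkSize));
    }
    return cuts;
  }

  const [sep, ...rest] = separators;
  const chunks: string[] = [];
  let current = '';

  for (const piece of text.split(sep)) {
    // Greedily merge neighbouring pieces while they still fit in one chunk.
    const candidate = current === '' ? piece : current + sep + piece;
    if (candidate.length <= chunkSize) {
      current = candidate;
      continue;
    }
    if (current !== '') chunks.push(current);
    if (piece.length <= chunkSize) {
      current = piece;
    } else {
      // The piece alone is too large: retry with the finer separators.
      chunks.push(...recursiveSplit(piece, chunkSize, rest));
      current = '';
    }
  }
  if (current !== '') chunks.push(current);
  return chunks;
}

const paragraphs = 'aaa bbb\n\nccc ddd eee';
console.log(recursiveSplit(paragraphs, 8, ['\n\n', ' ']));
// → [ 'aaa bbb', 'ccc ddd', 'eee' ]
```

The first paragraph fits in one chunk and stays intact; only the second, longer paragraph is split further on spaces.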

FixedSizeChunking

Splits text into fixed-size chunks with overlap.

import { FixedSizeChunking } from 'beaver-rag';

const chunking = new FixedSizeChunking({
  chunkSize: 500,
  chunkOverlap: 100,
});
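As an illustration of how `chunkSize` and `chunkOverlap` interact, the following sketch slides a fixed window over the text. `fixedSizeChunks` is a hypothetical helper, not the library's implementation, and it assumes sizes are counted in characters.

```typescript
// Hypothetical sketch of fixed-size chunking with overlap; not the library's code.
// Assumes chunkSize and chunkOverlap are measured in characters.
function fixedSizeChunks(text: string, chunkSize: number, chunkOverlap: number): string[] {
  if (chunkSize <= 0) throw new Error('chunkSize must be positive');
  if (chunkOverlap < 0 || chunkOverlap >= chunkSize) {
    throw new Error('chunkOverlap must be at least 0 and smaller than chunkSize');
  }

  // Each window starts (chunkSize - chunkOverlap) characters after the previous one.
  const step = chunkSize - chunkOverlap;
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // this window reached the end
  }
  return chunks;
}

console.log(fixedSizeChunks('abcdefghij', 4, 2));
// → [ 'abcd', 'cdef', 'efgh', 'ghij' ]
```

Each chunk repeats the last two characters of the previous one, which is the overlap that helps keep context across chunk boundaries.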

Examples

Custom Chunking Strategy

import { RAGClient, OpenAIEmbeddings, PostgresDatabase, FixedSizeChunking } from 'beaver-rag';

const rag = new RAGClient({
  embeddings: new OpenAIEmbeddings({ apiKey: process.env.OPENAI_API_KEY! }),
  database: new PostgresDatabase({ connectionString: process.env.DATABASE_URL! }),
  chunking: new FixedSizeChunking({ chunkSize: 500, chunkOverlap: 100 }),
});

Metadata Filtering

// Add documents with metadata
await rag.addDocument('Content about Python', { language: 'python', difficulty: 'beginner' });
await rag.addDocument('Content about TypeScript', { language: 'typescript', difficulty: 'intermediate' });

// Search with filters
const results = await rag.search('programming concepts', {
  filters: { language: 'typescript' },
  limit: 10,
});
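The way `filters`, `minSimilarity`, and `limit` combine can be sketched in memory, without a database. This is an illustration under assumptions, not how beaver-rag runs the query: it assumes filters are exact-match comparisons on metadata keys and that similarity is cosine similarity. `StoredDoc`, `cosineSimilarity`, and `searchInMemory` are hypothetical names; the result shape (`document`, `similarity`) mirrors the Quick Start output.

```typescript
// Hypothetical in-memory sketch of the search options; not the library's query logic.
interface StoredDoc {
  content: string;
  metadata: Record<string, unknown>;
  vector: number[];
}

interface SketchSearchOptions {
  limit?: number;
  minSimilarity?: number;
  filters?: Record<string, unknown>;
}

// Cosine similarity of two equal-length vectors (0 when either is all zeros).
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

function searchInMemory(
  queryVector: number[],
  stored: StoredDoc[],
  options: SketchSearchOptions = {},
): { document: StoredDoc; similarity: number }[] {
  const { limit = 10, minSimilarity, filters = {} } = options;
  return stored
    // 1. Keep only documents whose metadata matches every filter (assumed exact match).
    .filter((doc) => Object.entries(filters).every(([key, value]) => doc.metadata[key] === value))
    // 2. Score the remaining documents against the query vector.
    .map((doc) => ({ document: doc, similarity: cosineSimilarity(queryVector, doc.vector) }))
    // 3. Drop results below the threshold, if one was given.
    .filter((hit) => minSimilarity === undefined || hit.similarity >= minSimilarity)
    // 4. Best matches first, then cut to the requested size.
    .sort((x, y) => y.similarity - x.similarity)
    .slice(0, limit);
}

const storedDocs: StoredDoc[] = [
  { content: 'Content about Python', metadata: { language: 'python' }, vector: [1, 0] },
  { content: 'Content about TypeScript', metadata: { language: 'typescript' }, vector: [0, 1] },
  { content: 'More TypeScript', metadata: { language: 'typescript' }, vector: [0.6, 0.8] },
];

const hits = searchInMemory([0, 1], storedDocs, { filters: { language: 'typescript' }, limit: 10 });
console.log(hits.map((hit) => hit.document.content));
```

In this sketch the Python document is excluded by the filter before any scoring happens, and the two TypeScript documents come back ordered by similarity.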

Batch Operations

const documents = [
  { content: 'Document 1', metadata: { category: 'docs' } },
  { content: 'Document 2', metadata: { category: 'tutorial' } },
  { content: 'Document 3', metadata: { category: 'docs' } },
];

const ids = await rag.addDocuments(documents);
console.log(`Added ${ids.length} documents`);

Architecture

The system uses a plugin architecture with three main components:

  1. Embeddings Provider: Converts text to vectors (currently: OpenAI)
  2. Database Provider: Stores and searches vectors (currently: PostgreSQL + pgvector)
  3. Chunking Strategy: Splits documents into manageable pieces

This design makes it easy to add support for other providers in the future.
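The three components could be described by interfaces along these lines. This is a sketch only: the interface names (`EmbeddingsProvider`, `DatabaseProvider`, `ChunkingStrategy`) and every method signature are assumptions made for illustration, and the actual beaver-rag types may differ. `SentenceChunking` is a toy strategy written against the assumed interface to show what a plug-in might look like.

```typescript
// Hypothetical interfaces; names and signatures are assumptions, not the beaver-rag API.
interface EmbeddingsProvider {
  // Converts each text into a vector.
  embed(texts: string[]): Promise<number[][]>;
}

interface DatabaseProvider {
  // Stores a chunk with its vector and metadata, returning an ID.
  insert(content: string, vector: number[], metadata: Record<string, unknown>): Promise<string>;
  // Returns the stored chunks closest to the query vector.
  search(vector: number[], limit: number): Promise<{ content: string; similarity: number }[]>;
}

interface ChunkingStrategy {
  // Splits a document into pieces small enough to embed.
  chunk(text: string): string[];
}

// Toy strategy: one chunk per sentence, split after '.', '!' or '?'.
class SentenceChunking implements ChunkingStrategy {
  chunk(text: string): string[] {
    return text
      .split(/(?<=[.!?])\s+/)
      .map((sentence) => sentence.trim())
      .filter((sentence) => sentence.length > 0);
  }
}

const sentenceStrategy: ChunkingStrategy = new SentenceChunking();
console.log(sentenceStrategy.chunk('Beavers build dams. Builders build apps!'));
// → [ 'Beavers build dams.', 'Builders build apps!' ]
```

Because each component sits behind its own interface in this sketch, a new chunking strategy can be swapped in without touching the embeddings or database code.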

Error Handling

The system provides specific error types for different failure scenarios:

import {
  RAGError,
  EmbeddingError,
  DatabaseError,
  ValidationError
} from 'beaver-rag';

try {
  await rag.addDocument(content, metadata);
} catch (error) {
  if (error instanceof EmbeddingError) {
    console.error('Embedding failed:', error.message);
  } else if (error instanceof DatabaseError) {
    console.error('Database operation failed:', error.message);
  } else if (error instanceof ValidationError) {
    console.error('Invalid input:', error.field, error.message);
  } else {
    throw error; // not one of the handled error types; let it propagate
  }
}
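To show why `instanceof` checks like the ones above work, here is a self-contained sketch of an error hierarchy. The classes below are local stand-ins that reuse the exported names; that the specific errors extend `RAGError`, and the constructor signatures, are assumptions. Only the `field` property on `ValidationError` is taken from the usage above. `describeError` is a hypothetical helper.

```typescript
// Local stand-ins for illustration; the real beaver-rag classes may be defined differently.
class RAGError extends Error {
  constructor(message: string) {
    super(message);
    this.name = new.target.name;
    // Keeps instanceof working even when compiled to older JavaScript targets.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

class EmbeddingError extends RAGError {}
class DatabaseError extends RAGError {}

class ValidationError extends RAGError {
  constructor(message: string, public readonly field: string) {
    super(message);
  }
}

// Maps a caught value to a log line; rethrows anything that is not a RAG error.
function describeError(error: unknown): string {
  if (error instanceof ValidationError) {
    return `Invalid input (${error.field}): ${error.message}`;
  }
  if (error instanceof EmbeddingError) {
    return `Embedding failed: ${error.message}`;
  }
  if (error instanceof DatabaseError) {
    return `Database operation failed: ${error.message}`;
  }
  if (error instanceof RAGError) {
    return `RAG error: ${error.message}`;
  }
  throw error;
}

console.log(describeError(new ValidationError('content is empty', 'content')));
// → Invalid input (content): content is empty
```

Checking the most specific classes first and the base class last means a shared base type can act as a catch-all for any error the system raises.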

Development

Build

bun run build

Type Check

bun run typecheck

Lint

bun run lint

Format

bun run format

Roadmap

Future Features

  • Additional embedding providers (Cohere, HuggingFace, local models)
  • Additional vector databases (Pinecone, Weaviate, Qdrant, ChromaDB)
  • Hybrid search (BM25 + vector)
  • Reranking support
  • Multi-modal support (images, PDFs)
  • Semantic chunking
  • Caching layer (Redis)
  • Analytics and monitoring

License

MIT

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Support

If you encounter any issues or have questions, please file an issue on GitHub.