@ranavibe/rag

v1.0.0


Advanced RAG (Retrieval Augmented Generation) for the RANA Framework.

Features

  • Intelligent Chunking - Semantic, markdown, code-aware, and recursive chunking
  • Hybrid Retrieval - Vector + keyword search with fusion strategies
  • Re-ranking - Cross-encoder, LLM, and diversity-based re-ranking
  • Synthesis - Refine, tree-summarize, and compact synthesis methods
  • Citations - Automatic citation tracking and source attribution
  • React Hooks - Easy integration with React applications
  • Pipeline Presets - Pre-configured pipelines for common use cases

Installation

npm install @ranavibe/rag

Quick Start

import { RAGPresets } from '@ranavibe/rag';

// Use a preset for quick setup
const pipeline = RAGPresets.balanced();

// Index your documents
await pipeline.index([
  { id: 'doc1', content: 'RANA is an AI development framework...' },
  { id: 'doc2', content: 'RAG enables knowledge-grounded AI...' },
]);

// Query the pipeline
const result = await pipeline.query({
  query: 'What is RANA?',
});

console.log(result.answer);
// "RANA is an AI development framework that..."

console.log(result.citations);
// [{ text: '...', source: 'doc1', score: 0.95 }]

Pipeline Configuration

Custom Pipeline

import { createRAGPipeline } from '@ranavibe/rag';

const pipeline = createRAGPipeline({
  // Chunking strategy
  chunker: {
    type: 'semantic',  // 'semantic' | 'markdown' | 'code' | 'recursive'
    chunkSize: 512,
    overlap: 50,
  },

  // Retrieval strategy
  retriever: {
    type: 'hybrid',  // 'vector' | 'keyword' | 'hybrid'
    topK: 20,
    options: {
      vector: { topK: 20, similarityThreshold: 0.5 },
      keyword: { topK: 10, algorithm: 'bm25' },
      fusion: 'reciprocal-rank-fusion',
    },
  },

  // Re-ranking (optional)
  reranker: {
    type: 'cross-encoder',  // 'cross-encoder' | 'llm' | 'diversity'
    topK: 5,
  },

  // Query transformation (optional)
  queryTransformer: {
    multiQuery: true,
    hypotheticalAnswer: true,  // HyDE
    decompose: true,
  },

  // Synthesis strategy
  synthesizer: {
    type: 'refine',  // 'refine' | 'tree-summarize' | 'compact'
    citations: true,
    streaming: true,
    model: 'claude-sonnet-4',
  },

  // Pipeline options
  config: {
    caching: true,
    metrics: true,
    logging: 'verbose',
  },
});

Presets

import { RAGPresets } from '@ranavibe/rag';

// Fast: Optimized for speed
const fast = RAGPresets.fast();

// Accurate: Optimized for quality
const accurate = RAGPresets.accurate();

// Balanced: Good speed/quality tradeoff
const balanced = RAGPresets.balanced();

// Code: For code search and Q&A
const code = RAGPresets.code('typescript');

// Documentation: For documentation search
const docs = RAGPresets.documentation();

// Research: For research papers
const research = RAGPresets.research();

// Chat: For conversational RAG
const chat = RAGPresets.chat();

Chunking Strategies

Semantic Chunking

Splits text based on semantic boundaries using embedding similarity:

import { SemanticChunker } from '@ranavibe/rag';

const chunker = new SemanticChunker();
const chunks = await chunker.chunk(text, {
  chunkSize: 512,
  overlap: 50,
  similarityThreshold: 0.5,
});
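
To make the idea of a semantic boundary concrete, here is a minimal sketch, not the package's implementation. It assumes the text is already split into sentences and that you supply an embedding function. `Embedder`, `cosineSimilarity`, and `splitOnSemanticBoundaries` are hypothetical names. A new chunk starts whenever the similarity between neighbouring sentences falls below the threshold.

```typescript
// Hypothetical types and helpers for illustration; not exported by @ranavibe/rag.
type Embedder = (sentence: string) => number[];

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Groups consecutive sentences and starts a new chunk when the similarity
// between a sentence and the one before it drops below the threshold.
function splitOnSemanticBoundaries(
  sentences: string[],
  embed: Embedder,
  similarityThreshold: number,
): string[] {
  const chunks: string[] = [];
  let current: string[] = [];
  let previous: number[] | null = null;

  for (const sentence of sentences) {
    const vector = embed(sentence);
    if (previous !== null && cosineSimilarity(previous, vector) < similarityThreshold) {
      chunks.push(current.join(' '));
      current = [];
    }
    current.push(sentence);
    previous = vector;
  }
  if (current.length > 0) chunks.push(current.join(' '));
  return chunks;
}
```

A real chunker would presumably also enforce `chunkSize` and `overlap`. This sketch only shows the boundary detection step.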

Markdown Chunking

Preserves markdown structure (headers, code blocks, lists):

import { MarkdownChunker } from '@ranavibe/rag';

const chunker = new MarkdownChunker();
const chunks = await chunker.chunk(markdown, {
  chunkSize: 512,
  preserveHeaders: true,
  preserveCodeBlocks: true,
});

Code Chunking

Preserves function and class boundaries:

import { CodeChunker } from '@ranavibe/rag';

const chunker = new CodeChunker();
const chunks = await chunker.chunk(code, {
  language: 'typescript',
  chunkSize: 1024,
  preserveFunctions: true,
  preserveClasses: true,
});
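
The chunkers above all take a `chunkSize`, and the semantic chunker and pipeline config also take an `overlap`. As a rough illustration of how the two interact, the sketch below slides a fixed-size window over whitespace-separated words. This is an assumption for illustration only: the README does not say whether the package measures size in tokens, words, or characters, and `slidingWindowChunks` is a hypothetical helper.

```typescript
// Hypothetical helper, not part of @ranavibe/rag. Sizes are counted in words here.
function slidingWindowChunks(text: string, chunkSize: number, overlap: number): string[] {
  if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
    throw new Error('expected chunkSize > 0 and 0 <= overlap < chunkSize');
  }
  const words = text.split(/\s+/).filter((w) => w.length > 0);
  const step = chunkSize - overlap; // how far the window advances each time
  const chunks: string[] = [];

  for (let start = 0; start < words.length; start += step) {
    chunks.push(words.slice(start, start + chunkSize).join(' '));
    if (start + chunkSize >= words.length) break; // this window reached the end
  }
  return chunks;
}
```

With `chunkSize: 512` and `overlap: 50`, each window in this sketch would advance by 462 words, so the last 50 words of one chunk reappear at the start of the next.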

Retrieval Strategies

Hybrid Retrieval

Combines vector and keyword search:

import { HybridRetriever } from '@ranavibe/rag';

const retriever = new HybridRetriever();
await retriever.index(chunks);

const results = await retriever.retrieve(query, {
  vector: { topK: 20 },
  keyword: { topK: 10, algorithm: 'bm25' },
  fusion: 'reciprocal-rank-fusion',  // or 'weighted', 'max'
});
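
The `algorithm: 'bm25'` option above names the Okapi BM25 ranking function. For readers unfamiliar with it, here is a minimal sketch of the scoring formula over pre-tokenized documents. `bm25Scores` is a hypothetical helper, and the `k1 = 1.2` and `b = 0.75` values are common defaults, not values confirmed for this package.

```typescript
// Hypothetical helper, not part of @ranavibe/rag.
// Scores every document (an array of tokens) against the query terms with Okapi BM25.
function bm25Scores(queryTerms: string[], docs: string[][], k1 = 1.2, b = 0.75): number[] {
  const docCount = docs.length;
  const avgLength = docs.reduce((sum, d) => sum + d.length, 0) / Math.max(docCount, 1);
  const uniqueTerms = Array.from(new Set(queryTerms));

  return docs.map((doc) => {
    let score = 0;
    for (const term of uniqueTerms) {
      const termFrequency = doc.filter((t) => t === term).length;
      if (termFrequency === 0) continue;
      // Number of documents that contain the term at least once.
      const docFrequency = docs.filter((d) => d.includes(term)).length;
      // Non-negative IDF variant: ln(1 + (N - df + 0.5) / (df + 0.5)).
      const idf = Math.log(1 + (docCount - docFrequency + 0.5) / (docFrequency + 0.5));
      // Length normalization: longer-than-average documents are penalized.
      const lengthNorm = 1 - b + b * (doc.length / avgLength);
      score += (idf * termFrequency * (k1 + 1)) / (termFrequency + k1 * lengthNorm);
    }
    return score;
  });
}
```

Keyword scores like these reward exact term matches, which is what the vector side of a hybrid retriever tends to miss.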

Fusion Strategies

  • Reciprocal Rank Fusion (RRF): Combines result ranks rather than raw scores, so it works well when sources score on different scales
  • Weighted: Applies configurable weights to the vector and keyword scores
  • Max: Takes the highest score from either method
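
The three strategies above can be sketched in a few lines each. This is an illustration under assumptions, not the package's code: `Ranked` and the function names are hypothetical, `k = 60` is a commonly used RRF constant, and the weighted and max variants assume both score lists are already normalized to the same range.

```typescript
// Hypothetical types and helpers for illustration; not exported by @ranavibe/rag.
type Ranked = { id: string; score: number };

// Turns an id -> score map into a list sorted by descending score.
function sortByScore(scores: Map<string, number>): Ranked[] {
  return Array.from(scores, ([id, score]) => ({ id, score })).sort((a, b) => b.score - a.score);
}

// RRF: each list contributes 1 / (k + rank) per item, with rank starting at 1.
function reciprocalRankFusion(lists: Ranked[][], k = 60): Ranked[] {
  const fused = new Map<string, number>();
  for (const list of lists) {
    list.forEach((item, index) => {
      fused.set(item.id, (fused.get(item.id) ?? 0) + 1 / (k + index + 1));
    });
  }
  return sortByScore(fused);
}

// Weighted: linear blend of the two scores. Items missing from a list contribute 0.
function weightedFusion(vector: Ranked[], keyword: Ranked[], vectorWeight = 0.5): Ranked[] {
  const fused = new Map<string, number>();
  for (const v of vector) {
    fused.set(v.id, (fused.get(v.id) ?? 0) + vectorWeight * v.score);
  }
  for (const kw of keyword) {
    fused.set(kw.id, (fused.get(kw.id) ?? 0) + (1 - vectorWeight) * kw.score);
  }
  return sortByScore(fused);
}

// Max: keep the best score an item received from either method.
function maxFusion(vector: Ranked[], keyword: Ranked[]): Ranked[] {
  const fused = new Map<string, number>();
  for (const item of [...vector, ...keyword]) {
    fused.set(item.id, Math.max(fused.get(item.id) ?? -Infinity, item.score));
  }
  return sortByScore(fused);
}
```

RRF expects each input list to be sorted best-first, since only the position matters. The other two use the scores directly, so they are more sensitive to how each retriever scales its output.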

Re-ranking

Cross-Encoder Re-ranking

Scores each query and document pair jointly, which is more accurate than bi-encoder similarity but slower:

import { CrossEncoderReranker } from '@ranavibe/rag';

const reranker = new CrossEncoderReranker();
const reranked = await reranker.rerank(query, results, {
  topK: 5,
  normalize: true,
});

Diversity Re-ranking (MMR)

Maximizes relevance while maintaining diversity among the selected results:

import { DiversityReranker } from '@ranavibe/rag';

const reranker = new DiversityReranker();
const reranked = await reranker.rerank(query, results, {
  topK: 5,
  lambda: 0.5,  // Balance relevance vs diversity
});
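
For intuition about what `lambda` trades off, here is a minimal sketch of Maximal Marginal Relevance. It assumes the standard convention where `lambda = 1` means pure relevance and lower values favour diversity. Whether the package follows that convention is an assumption, and `Candidate`, `cosine`, and `mmrRerank` are hypothetical names.

```typescript
// Hypothetical types and helpers for illustration; not exported by @ranavibe/rag.
type Candidate = { id: string; relevance: number; embedding: number[] };

// Cosine similarity between two vectors of equal length.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Greedy MMR: repeatedly pick the candidate that maximizes
// lambda * relevance - (1 - lambda) * (max similarity to anything already picked).
function mmrRerank(candidates: Candidate[], topK: number, lambda: number): Candidate[] {
  const remaining = [...candidates];
  const selected: Candidate[] = [];

  while (selected.length < topK && remaining.length > 0) {
    let bestIndex = 0;
    let bestScore = -Infinity;
    remaining.forEach((candidate, index) => {
      const redundancy =
        selected.length === 0
          ? 0
          : Math.max(...selected.map((s) => cosine(candidate.embedding, s.embedding)));
      const score = lambda * candidate.relevance - (1 - lambda) * redundancy;
      if (score > bestScore) {
        bestScore = score;
        bestIndex = index;
      }
    });
    selected.push(remaining.splice(bestIndex, 1)[0]);
  }
  return selected;
}
```

In this sketch a near-duplicate of an already selected result is penalized heavily, so a less relevant but different result can take its place.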

Synthesis Methods

Refine Synthesis

Iteratively refines the answer with each chunk:

// Good for comprehensive answers
synthesizer: {
  type: 'refine',
  citations: true,
}

Tree Summarization

Summarizes chunks hierarchically in a tree structure:

// Good for many chunks
synthesizer: {
  type: 'tree-summarize',
  citations: true,
}

Compact Synthesis

Makes a single LLM call with all retrieved context:

// Fastest, good for small contexts
synthesizer: {
  type: 'compact',
  citations: true,
}
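
The difference between the three methods is mostly in how many LLM calls they make and what each call sees. The sketch below illustrates that under assumptions: `Llm` is a hypothetical function type standing in for whatever model client you use, the prompts are placeholders, and none of these functions are the package's implementation.

```typescript
// Hypothetical stand-in for a model client: takes a prompt, resolves to text.
type Llm = (prompt: string) => Promise<string>;

// Compact: one call that sees every chunk at once.
async function compactSynthesis(llm: Llm, question: string, chunks: string[]): Promise<string> {
  return llm(`Context:\n${chunks.join('\n\n')}\n\nQuestion: ${question}`);
}

// Refine: one call per chunk, each call improving the previous answer.
async function refineSynthesis(llm: Llm, question: string, chunks: string[]): Promise<string> {
  let answer = '';
  for (const chunk of chunks) {
    answer = await llm(
      `Question: ${question}\nCurrent answer: ${answer}\nNew context: ${chunk}\nRefine the answer.`,
    );
  }
  return answer;
}

// Tree summarize: summarize groups of `fanOut` nodes, then summarize the
// summaries, until a single root answer remains.
async function treeSummarize(
  llm: Llm,
  question: string,
  chunks: string[],
  fanOut = 2,
): Promise<string> {
  if (fanOut < 2) throw new Error('fanOut must be at least 2');
  if (chunks.length === 0) return '';
  let nodes = chunks;
  do {
    const groups: string[][] = [];
    for (let i = 0; i < nodes.length; i += fanOut) {
      groups.push(nodes.slice(i, i + fanOut));
    }
    // Groups on the same level are independent, so they can run in parallel.
    nodes = await Promise.all(
      groups.map((group) => llm(`Question: ${question}\nSummarize:\n${group.join('\n\n')}`)),
    );
  } while (nodes.length > 1);
  return nodes[0];
}
```

For four chunks, this sketch makes one call for compact, four sequential calls for refine, and three calls for tree summarization with a fan-out of two, with the first two tree calls running in parallel.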

React Integration

Setup

import { RAGProvider, RAGPresets } from '@ranavibe/rag';

const pipeline = RAGPresets.balanced();

function App() {
  return (
    <RAGProvider pipeline={pipeline}>
      <SearchComponent />
    </RAGProvider>
  );
}

useRAG Hook

import { useRAG } from '@ranavibe/rag';

// Spinner, ErrorMessage, Answer, and Citations stand in for your own components.
function SearchComponent() {
  const { query, answer, citations, isLoading, error } = useRAG();

  const handleSearch = async (q: string) => {
    await query(q);
  };

  return (
    <div>
      <input onKeyDown={e => e.key === 'Enter' && handleSearch(e.currentTarget.value)} />
      {isLoading && <Spinner />}
      {error && <ErrorMessage message={error.message} />}
      {answer && (
        <>
          <Answer content={answer} />
          <Citations items={citations} />
        </>
      )}
    </div>
  );
}

useRAGStream Hook

import { useRAGStream } from '@ranavibe/rag';

// Sources stands in for your own component that renders citations.
function StreamingSearch() {
  const { queryStream, answer, citations, isStreaming, stop } = useRAGStream();

  return (
    <div>
      <button onClick={() => queryStream('Explain RAG')}>
        Search
      </button>
      {isStreaming && <button onClick={stop}>Stop</button>}
      <div>{answer}</div>
      <Sources items={citations} />
    </div>
  );
}

useRAGIndex Hook

import { useRAGIndex } from '@ranavibe/rag';

function DocumentManager() {
  const { index, deleteDocuments, isIndexing, progress, documentCount } = useRAGIndex();

  const handleUpload = async (files: File[]) => {
    const documents = await Promise.all(
      files.map(async f => ({
        id: f.name,
        content: await f.text(),
      }))
    );
    await index(documents);
  };

  return (
    <div>
      <input type="file" multiple onChange={e => handleUpload(Array.from(e.target.files ?? []))} />
      {isIndexing && <Progress value={progress} />}
      <p>Documents indexed: {documentCount}</p>
    </div>
  );
}

API Reference

RAGPipeline

| Method | Description |
|--------|-------------|
| query(options) | Execute RAG query |
| queryStream(options) | Streaming RAG query |
| index(documents) | Index documents |
| delete(ids) | Delete documents |

RAGResult

| Property | Type | Description |
|----------|------|-------------|
| answer | string | Generated answer |
| citations | Citation[] | Source citations |
| sources | Source[] | Unique sources |
| metrics | RAGMetrics | Performance metrics |

RAGMetrics

| Property | Description |
|----------|-------------|
| latency | Total query time (ms) |
| cost | Estimated cost ($) |
| chunks.total | Total indexed chunks |
| chunks.retrieved | Chunks retrieved |
| chunks.used | Chunks used in answer |
| tokens.input | Input tokens |
| tokens.output | Output tokens |
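
As a rough example of how these metrics could be put to use, the sketch below derives a few summary numbers from a metrics object. `MetricsShape` is a hypothetical local type that mirrors the table above. The nested `chunks` and `tokens` objects are an assumption based on the dotted property names, so check the package's exported types before relying on it.

```typescript
// Hypothetical mirror of the RAGMetrics table; the real exported type may differ.
interface MetricsShape {
  latency: number; // total query time in ms
  cost: number; // estimated cost in dollars
  chunks: { total: number; retrieved: number; used: number };
  tokens: { input: number; output: number };
}

// Derives a few numbers that are handy to log or chart per query.
function summarizeMetrics(m: MetricsShape): {
  totalTokens: number;
  chunkUtilization: number;
  costPerThousandTokens: number;
} {
  const totalTokens = m.tokens.input + m.tokens.output;
  return {
    totalTokens,
    // Share of retrieved chunks that ended up in the answer.
    chunkUtilization: m.chunks.retrieved === 0 ? 0 : m.chunks.used / m.chunks.retrieved,
    costPerThousandTokens: totalTokens === 0 ? 0 : (m.cost / totalTokens) * 1000,
  };
}
```

A consistently low chunk utilization may suggest that the retrieval `topK` is higher than it needs to be.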

Best Practices

  1. Choose the right chunker: Use semantic for general text, markdown for docs, code for source files
  2. Tune topK: Start with 10-20 for retrieval, 3-5 after reranking
  3. Use hybrid retrieval: Combines semantic understanding with keyword matching
  4. Enable caching: Reduces cost and latency for repeated queries
  5. Monitor metrics: Track latency, cost, and citation quality
  6. Use streaming: Improves perceived performance for users
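
To illustrate why caching helps with repeated queries, here is a minimal sketch of a query cache that wraps any async query function. It is independent of the pipeline's built-in `caching` option, whose behaviour is not described here. `QueryFn` and `withQueryCache` are hypothetical names, and normalizing queries by trimming and lowercasing is a design choice of this sketch.

```typescript
// Hypothetical helper, not part of @ranavibe/rag.
type QueryFn<T> = (query: string) => Promise<T>;

// Wraps a query function so repeated (normalized) queries reuse the first result.
function withQueryCache<T>(run: QueryFn<T>, maxEntries = 100): QueryFn<T> {
  const cache = new Map<string, Promise<T>>();

  return (query: string): Promise<T> => {
    const key = query.trim().toLowerCase();
    const hit = cache.get(key);
    if (hit !== undefined) return hit;

    const pending = run(query);
    cache.set(key, pending);
    // Failed queries are dropped from the cache so they can be retried.
    pending.catch(() => cache.delete(key));

    // Simple first-in, first-out eviction: Map iterates in insertion order.
    if (cache.size > maxEntries) {
      const oldest = cache.keys().next().value;
      if (oldest !== undefined) cache.delete(oldest);
    }
    return pending;
  };
}
```

Storing the promise rather than the resolved value means two identical queries issued at the same time share one underlying call.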

License

MIT