@ranavibe/rag
Advanced RAG (Retrieval Augmented Generation) for the RANA Framework.
Features
- Intelligent Chunking - Semantic, markdown, code-aware, and recursive chunking
- Hybrid Retrieval - Vector + keyword search with fusion strategies
- Re-ranking - Cross-encoder, LLM, and diversity-based re-ranking
- Synthesis - Refine, tree-summarize, and compact synthesis methods
- Citations - Automatic citation tracking and source attribution
- React Hooks - Easy integration with React applications
- Pipeline Presets - Pre-configured pipelines for common use cases
Installation
npm install @ranavibe/rag
Quick Start
import { createRAGPipeline, RAGPresets } from '@ranavibe/rag';
// Use a preset for quick setup
const pipeline = RAGPresets.balanced();
// Index your documents
await pipeline.index([
{ id: 'doc1', content: 'RANA is an AI development framework...' },
{ id: 'doc2', content: 'RAG enables knowledge-grounded AI...' },
]);
// Query the pipeline
const result = await pipeline.query({
query: 'What is RANA?',
});
console.log(result.answer);
// "RANA is an AI development framework that..."
console.log(result.citations);
// [{ text: '...', source: 'doc1', score: 0.95 }]
Pipeline Configuration
Custom Pipeline
import { createRAGPipeline } from '@ranavibe/rag';
const pipeline = createRAGPipeline({
// Chunking strategy
chunker: {
type: 'semantic', // 'semantic' | 'markdown' | 'code' | 'recursive'
chunkSize: 512,
overlap: 50,
},
// Retrieval strategy
retriever: {
type: 'hybrid', // 'vector' | 'keyword' | 'hybrid'
topK: 20,
options: {
vector: { topK: 20, similarityThreshold: 0.5 },
keyword: { topK: 10, algorithm: 'bm25' },
fusion: 'reciprocal-rank-fusion',
},
},
// Re-ranking (optional)
reranker: {
type: 'cross-encoder', // 'cross-encoder' | 'llm' | 'diversity'
topK: 5,
},
// Query transformation (optional)
queryTransformer: {
multiQuery: true,
hypotheticalAnswer: true, // HyDE
decompose: true,
},
// Synthesis strategy
synthesizer: {
type: 'refine', // 'refine' | 'tree-summarize' | 'compact'
citations: true,
streaming: true,
model: 'claude-sonnet-4',
},
// Pipeline options
config: {
caching: true,
metrics: true,
logging: 'verbose',
},
});
Presets
import { RAGPresets } from '@ranavibe/rag';
// Fast: Optimized for speed
const fast = RAGPresets.fast();
// Accurate: Optimized for quality
const accurate = RAGPresets.accurate();
// Balanced: Good speed/quality tradeoff
const balanced = RAGPresets.balanced();
// Code: For code search and Q&A
const code = RAGPresets.code('typescript');
// Documentation: For documentation search
const docs = RAGPresets.documentation();
// Research: For research papers
const research = RAGPresets.research();
// Chat: For conversational RAG
const chat = RAGPresets.chat();
Chunking Strategies
Semantic Chunking
Splits text based on semantic boundaries using embedding similarity:
import { SemanticChunker } from '@ranavibe/rag';
const chunker = new SemanticChunker();
const chunks = await chunker.chunk(text, {
chunkSize: 512,
overlap: 50,
similarityThreshold: 0.5,
});
Markdown Chunking
Preserves markdown structure (headers, code blocks, lists):
import { MarkdownChunker } from '@ranavibe/rag';
const chunker = new MarkdownChunker();
const chunks = await chunker.chunk(markdown, {
chunkSize: 512,
preserveHeaders: true,
preserveCodeBlocks: true,
});
Code Chunking
Preserves function and class boundaries:
import { CodeChunker } from '@ranavibe/rag';
const chunker = new CodeChunker();
const chunks = await chunker.chunk(code, {
language: 'typescript',
chunkSize: 1024,
preserveFunctions: true,
preserveClasses: true,
});
Retrieval Strategies
Hybrid Retrieval
Combines vector and keyword search:
import { HybridRetriever } from '@ranavibe/rag';
const retriever = new HybridRetriever();
await retriever.index(chunks);
const results = await retriever.retrieve(query, {
vector: { topK: 20 },
keyword: { topK: 10, algorithm: 'bm25' },
fusion: 'reciprocal-rank-fusion', // or 'weighted', 'max'
});
Fusion Strategies
- Reciprocal Rank Fusion (RRF): Combines rankings from both methods; good for diverse sources (see the sketch after this list)
- Weighted: Configurable weights for vector vs keyword
- Max: Takes highest score from either method
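This README doesn't show the fusion internals, so the following is a rough standalone sketch of the idea behind the 'reciprocal-rank-fusion' option rather than the package's actual code: each chunk's fused score is the sum of 1 / (k + rank) over every ranking it appears in, so chunks ranked well by either method surface. The reciprocalRankFusion helper and the k = 60 constant are illustrative only:
```ts
type Ranked = { id: string };

// Fuse several ranked lists: score(id) = sum over lists of 1 / (k + rank),
// where rank is 1-based within each list and k dampens any single list's influence.
function reciprocalRankFusion(rankings: Ranked[][], k = 60): { id: string; score: number }[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((item, i) => {
      scores.set(item.id, (scores.get(item.id) ?? 0) + 1 / (k + i + 1));
    });
  }
  return [...scores.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}

// Chunk C appears in both rankings, so it outranks chunks seen by only one method.
const fused = reciprocalRankFusion([
  [{ id: 'A' }, { id: 'C' }, { id: 'B' }], // vector ranking
  [{ id: 'C' }, { id: 'D' }],              // keyword (BM25) ranking
]);
console.log(fused.map(r => r.id)); // ['C', 'A', 'D', 'B']
```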
Re-ranking
Cross-Encoder Re-ranking
More accurate than bi-encoder but slower:
import { CrossEncoderReranker } from '@ranavibe/rag';
const reranker = new CrossEncoderReranker();
const reranked = await reranker.rerank(query, results, {
topK: 5,
normalize: true,
});
Diversity Re-ranking (MMR)
Maximizes relevance while maintaining diversity:
import { DiversityReranker } from '@ranavibe/rag';
const reranker = new DiversityReranker();
const reranked = await reranker.rerank(query, results, {
topK: 5,
lambda: 0.5, // Balance relevance vs diversity
});
Synthesis Methods
Refine Synthesis
Iteratively refines answer with each chunk:
// Good for comprehensive answers
synthesizer: {
type: 'refine',
citations: true,
}
Tree Summarization
Hierarchically summarizes in a tree structure:
// Good for many chunks
synthesizer: {
type: 'tree-summarize',
citations: true,
}
Compact Synthesis
Single LLM call with all context:
// Fastest, good for small contexts
synthesizer: {
type: 'compact',
citations: true,
}
React Integration
Setup
import { RAGProvider, RAGPresets } from '@ranavibe/rag';
const pipeline = RAGPresets.balanced();
function App() {
return (
<RAGProvider pipeline={pipeline}>
<SearchComponent />
</RAGProvider>
);
}
useRAG Hook
import { useRAG } from '@ranavibe/rag';
function SearchComponent() {
const { query, answer, citations, isLoading, error } = useRAG();
const handleSearch = async (q: string) => {
await query(q);
};
return (
<div>
<input onKeyDown={e => e.key === 'Enter' && handleSearch(e.currentTarget.value)} />
{isLoading && <Spinner />}
{error && <Error message={error.message} />}
{answer && (
<>
<Answer content={answer} />
<Citations items={citations} />
</>
)}
</div>
);
}
useRAGStream Hook
import { useRAGStream } from '@ranavibe/rag';
function StreamingSearch() {
const { queryStream, answer, citations, isStreaming, stop } = useRAGStream();
return (
<div>
<button onClick={() => queryStream('Explain RAG')}>
Search
</button>
{isStreaming && <button onClick={stop}>Stop</button>}
<div>{answer}</div>
<Sources items={citations} />
</div>
);
}
useRAGIndex Hook
import { useRAGIndex } from '@ranavibe/rag';
function DocumentManager() {
const { index, deleteDocuments, isIndexing, progress, documentCount } = useRAGIndex();
const handleUpload = async (files: File[]) => {
const documents = await Promise.all(
files.map(async f => ({
id: f.name,
content: await f.text(),
}))
);
await index(documents);
};
return (
<div>
<input type="file" multiple onChange={e => handleUpload(Array.from(e.target.files ?? []))} />
{isIndexing && <Progress value={progress} />}
<p>Documents indexed: {documentCount}</p>
</div>
);
}
API Reference
RAGPipeline
| Method | Description |
|--------|-------------|
| query(options) | Execute RAG query |
| queryStream(options) | Streaming RAG query |
| index(documents) | Index documents |
| delete(ids) | Delete documents |
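A minimal lifecycle sketch using only the methods listed above; anything beyond what this README shows (for example, options delete might accept besides an array of ids) is an assumption:
```ts
import { RAGPresets } from '@ranavibe/rag';

const pipeline = RAGPresets.balanced();

// Index two documents, query, then delete one by id.
await pipeline.index([
  { id: 'doc1', content: 'RANA is an AI development framework...' },
  { id: 'doc2', content: 'RAG enables knowledge-grounded AI...' },
]);
const result = await pipeline.query({ query: 'What is RANA?' });
console.log(result.answer);
await pipeline.delete(['doc2']);
```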
RAGResult
| Property | Type | Description |
|----------|------|-------------|
| answer | string | Generated answer |
| citations | Citation[] | Source citations |
| sources | Source[] | Unique sources |
| metrics | RAGMetrics | Performance metrics |
RAGMetrics
| Property | Description |
|----------|-------------|
| latency | Total query time (ms) |
| cost | Estimated cost ($) |
| chunks.total | Total indexed chunks |
| chunks.retrieved | Chunks retrieved |
| chunks.used | Chunks used in answer |
| tokens.input | Input tokens |
| tokens.output | Output tokens |
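Assuming metrics: true is enabled in the pipeline config, a result's metrics can be read along the property paths above (the nested chunks/tokens shape is inferred from this table):
```ts
const result = await pipeline.query({ query: 'What is RANA?' });

// Log per-query performance using the property paths from the table above.
const { latency, cost, chunks, tokens } = result.metrics;
console.log(`latency: ${latency} ms, estimated cost: $${cost}`);
console.log(`chunks: ${chunks.retrieved} retrieved, ${chunks.used} used of ${chunks.total} indexed`);
console.log(`tokens: ${tokens.input} in / ${tokens.output} out`);
```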
Best Practices
- Choose the right chunker: Use semantic for general text, markdown for docs, code for source files
- Tune topK: Start with 10-20 for retrieval, 3-5 after reranking
- Use hybrid retrieval: Combines semantic understanding with keyword matching
- Enable caching: Reduces cost and latency for repeated queries
- Monitor metrics: Track latency, cost, and citation quality
- Use streaming: Improves perceived performance for users (see the combined sketch below)
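As a sketch only (preset defaults may already cover some of this), the practices above map onto the pipeline options documented earlier:
```ts
import { createRAGPipeline } from '@ranavibe/rag';

// Markdown-aware chunking for docs, hybrid retrieval with a wide topK,
// a small topK after re-ranking, caching and metrics on, streaming synthesis.
const pipeline = createRAGPipeline({
  chunker: { type: 'markdown', chunkSize: 512, overlap: 50 },
  retriever: {
    type: 'hybrid',
    topK: 20,
    options: {
      vector: { topK: 20 },
      keyword: { topK: 10, algorithm: 'bm25' },
      fusion: 'reciprocal-rank-fusion',
    },
  },
  reranker: { type: 'cross-encoder', topK: 5 },
  synthesizer: { type: 'compact', citations: true, streaming: true },
  config: { caching: true, metrics: true },
});
```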
License
MIT
