@snap-agent/rag-docs

v0.1.0

Published

13 days ago

Documentation RAG plugin for SnapAgent SDK - Semantic search over markdown, code, and technical documentation


nico.pinos.vt

snap-agent rag documentation markdown search embeddings vector-search ai plugin


Features

Smart Chunking - Markdown-aware, paragraph, sentence, or fixed-size strategies
Code-Aware - Extracts and indexes code blocks with language detection
Section Hierarchy - Preserves heading structure for context
Semantic Search - OpenAI embeddings for natural language queries
In-Memory - Fast, zero-config storage
Similarity Filtering - Configurable minimum score threshold

Installation

npm install @snap-agent/rag-docs @snap-agent/core

Quick Start

import { createClient, MemoryStorage } from '@snap-agent/core';
import { DocsRAGPlugin } from '@snap-agent/rag-docs';

const client = createClient({
  storage: new MemoryStorage(),
  providers: {
    openai: { apiKey: process.env.OPENAI_API_KEY! },
  },
});

const agent = await client.createAgent({
  name: 'Docs Assistant',
  instructions: 'You help users understand the documentation.',
  model: 'gpt-4o',
  userId: 'user-123',
  plugins: [
    new DocsRAGPlugin({
      embeddingProviderApiKey: process.env.OPENAI_API_KEY!,
      chunkingStrategy: 'markdown',
    }),
  ],
});

// Ingest documentation
await agent.ingestDocuments([
  {
    id: 'getting-started',
    content: `# Getting Started

Welcome to our platform!

## Installation

\`\`\`bash
npm install our-package
\`\`\`

## Basic Usage

First, initialize the client:

\`\`\`typescript
import { Client } from 'our-package';
const client = new Client();
\`\`\`
`,
    metadata: { title: 'Getting Started Guide' },
  },
]);

// Query the docs (assumes `thread` was created for this agent beforehand; thread creation is not shown here)
const response = await client.chat({
  threadId: thread.id,
  message: 'How do I install the package?',
  useRAG: true,
});

Configuration

const plugin = new DocsRAGPlugin({
  // Required
  embeddingProviderApiKey: process.env.OPENAI_API_KEY!,

  // Embedding Provider (optional)
  embeddingProvider: 'openai',  // 'openai' | 'voyage' (default: 'openai')
  embeddingModel: 'text-embedding-3-small', // Model to use

  // Chunking
  chunkingStrategy: 'markdown', // 'markdown' | 'paragraph' | 'sentence' | 'fixed'
  maxChunkSize: 1000,           // Max characters per chunk
  chunkOverlap: 200,            // Overlap for fixed strategy

  // Search
  limit: 5,                     // Results to return
  minSimilarity: 0.7,           // Minimum similarity score (0-1)

  // Options
  includeCode: true,            // Index code blocks
});
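
To make the `limit` and `minSimilarity` options concrete, here is a minimal sketch of a threshold-then-top-k search over in-memory vectors. It assumes cosine similarity as the score, which this README does not confirm, and `StoredChunk`, `ScoredChunk`, `cosineSimilarity`, and `search` are hypothetical names rather than part of the plugin's API.

```typescript
// Hypothetical shapes for illustration; not the plugin's internal types.
interface StoredChunk {
  id: string;
  embedding: number[];
}

interface ScoredChunk {
  id: string;
  score: number;
}

// Cosine similarity between two equal-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // avoid dividing by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every chunk, drop those below minSimilarity, keep the best `limit`.
function search(
  query: number[],
  chunks: StoredChunk[],
  limit = 5,
  minSimilarity = 0.7,
): ScoredChunk[] {
  return chunks
    .map((c) => ({ id: c.id, score: cosineSimilarity(query, c.embedding) }))
    .filter((r) => r.score >= minSimilarity)
    .sort((x, y) => y.score - x.score)
    .slice(0, limit);
}

const results = search([1, 0], [
  { id: 'a', embedding: [1, 0] },     // same direction as the query
  { id: 'b', embedding: [0, 1] },     // orthogonal, filtered out
  { id: 'c', embedding: [0.9, 0.1] }, // close to the query
]);
console.log(results.map((r) => r.id));
```

Under this reading, raising `minSimilarity` trades recall for precision, while `limit` only caps how many of the surviving chunks are returned.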

Embedding Providers

OpenAI (Default)

Excellent general-purpose embeddings; best suited to English-focused documentation.

const plugin = new DocsRAGPlugin({
  embeddingProviderApiKey: process.env.OPENAI_API_KEY!,
  embeddingProvider: 'openai',
  embeddingModel: 'text-embedding-3-small', // or 'text-embedding-3-large'
});

Voyage AI

Stronger multilingual support; cost-effective for high-volume use cases.

const plugin = new DocsRAGPlugin({
  embeddingProviderApiKey: process.env.VOYAGE_API_KEY!,
  embeddingProvider: 'voyage',
  embeddingModel: 'voyage-3-lite', // or 'voyage-3', 'voyage-multilingual-2'
});

| Provider | Default Model | Best For |
|----------|--------------|----------|
| OpenAI | text-embedding-3-small | English docs, simplicity |
| Voyage | voyage-3-lite | Multilingual, cost optimization |
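
If the provider should follow whichever API key is available, the choice can be made in one place. The sketch below is only an illustration: `pickEmbeddingOptions` is a hypothetical helper, and preferring Voyage when both keys are set is an arbitrary assumption. The option names and default models are the ones listed above.

```typescript
type EmbeddingProvider = 'openai' | 'voyage';

// Subset of the plugin options documented above.
interface EmbeddingOptions {
  embeddingProviderApiKey: string;
  embeddingProvider: EmbeddingProvider;
  embeddingModel: string;
}

// Hypothetical helper: derive embedding options from environment variables.
function pickEmbeddingOptions(
  env: Record<string, string | undefined>,
): EmbeddingOptions {
  const voyageKey = env['VOYAGE_API_KEY'];
  if (voyageKey) {
    // Assumption: prefer Voyage when its key is present.
    return {
      embeddingProviderApiKey: voyageKey,
      embeddingProvider: 'voyage',
      embeddingModel: 'voyage-3-lite',
    };
  }
  const openaiKey = env['OPENAI_API_KEY'];
  if (openaiKey) {
    return {
      embeddingProviderApiKey: openaiKey,
      embeddingProvider: 'openai',
      embeddingModel: 'text-embedding-3-small',
    };
  }
  throw new Error('Set OPENAI_API_KEY or VOYAGE_API_KEY');
}
```

The returned object could then be spread into the `DocsRAGPlugin` constructor options alongside the chunking and search settings.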

Chunking Strategies

`markdown` (Recommended for docs)

Preserves heading hierarchy
Extracts code blocks separately
Maintains section context
Best for technical documentation
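
As a rough illustration of what markdown-aware chunking involves, the sketch below walks a document line by line, tracks the heading path, and emits fenced code blocks as separate chunks. It is not the plugin's implementation: `MarkdownChunk` and `chunkMarkdown` are hypothetical names, the language is simply read from the fence's info string, and `maxChunkSize` is ignored for brevity.

```typescript
// Hypothetical chunk shape for illustration; the plugin's real types may differ.
interface MarkdownChunk {
  type: 'text' | 'code';
  section: string;   // heading path, e.g. "Getting Started > Installation"
  language?: string; // set only for code chunks
  content: string;
}

const FENCE = '`'.repeat(3); // built at runtime to avoid a literal fence here

function chunkMarkdown(markdown: string): MarkdownChunk[] {
  const chunks: MarkdownChunk[] = [];
  const headings: string[] = [];         // heading stack; index = level - 1
  let textLines: string[] = [];
  let codeLines: string[] | null = null; // non-null while inside a fenced block
  let language = '';

  const section = (): string => headings.filter(Boolean).join(' > ');
  const flushText = (): void => {
    const content = textLines.join('\n').trim();
    if (content) chunks.push({ type: 'text', section: section(), content });
    textLines = [];
  };

  for (const line of markdown.split('\n')) {
    const isFence = line.startsWith(FENCE);
    if (codeLines !== null) {
      if (isFence) {
        // Closing fence: emit the code block as its own chunk.
        chunks.push({ type: 'code', section: section(), language, content: codeLines.join('\n') });
        codeLines = null;
      } else {
        codeLines.push(line);
      }
      continue;
    }
    if (isFence) {
      // Opening fence: the info string (e.g. "bash") is taken as the language.
      flushText();
      codeLines = [];
      language = line.slice(FENCE.length).trim() || 'text';
      continue;
    }
    const heading = /^(#{1,6})\s+(.+)$/.exec(line);
    if (heading) {
      flushText();
      const level = heading[1].length;
      headings.length = level - 1;       // keep only the ancestor headings
      headings[level - 1] = heading[2].trim();
      continue;
    }
    textLines.push(line);
  }
  flushText();
  return chunks;
}

const sample = [
  '# Getting Started', '', 'Welcome to our platform!', '',
  '## Installation', '', FENCE + 'bash', 'npm install our-package', FENCE,
].join('\n');
const markdownChunks = chunkMarkdown(sample);
console.log(markdownChunks.map((c) => `${c.type}: ${c.section}`));
```

Keeping the heading path on every chunk is what lets a retrieved snippet carry its surrounding context, and the `type` field is what a filter such as "code only" would match against.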

`paragraph`

Splits on double newlines
Good for prose-heavy content
Maintains natural reading units
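
A minimal sketch of paragraph chunking, assuming that short neighbouring paragraphs are merged until `maxChunkSize` characters would be exceeded. The merging step is an assumption made for illustration, and `chunkByParagraph` is a hypothetical name.

```typescript
// Split on blank lines, then pack adjacent paragraphs into chunks of at most
// maxChunkSize characters. A single oversized paragraph is kept whole.
function chunkByParagraph(text: string, maxChunkSize = 1000): string[] {
  const paragraphs = text
    .split(/\n\s*\n/)
    .map((p) => p.trim())
    .filter((p) => p.length > 0);

  const chunks: string[] = [];
  let current = '';
  for (const paragraph of paragraphs) {
    const candidate = current ? `${current}\n\n${paragraph}` : paragraph;
    if (current === '' || candidate.length <= maxChunkSize) {
      current = candidate;   // still fits, or first paragraph of a new chunk
    } else {
      chunks.push(current);  // close the current chunk and start a new one
      current = paragraph;
    }
  }
  if (current) chunks.push(current);
  return chunks;
}

const paragraphChunks = chunkByParagraph('One.\n\nTwo.\n\n\nThree.', 10);
console.log(paragraphChunks.length);
```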

`sentence`

Splits on sentence boundaries
Best for Q&A style content
Granular retrieval
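
Sentence splitting can be approximated with a regular expression. The sketch below is deliberately naive and only illustrates the idea: `splitSentences` is a hypothetical name, and the pattern will break on abbreviations and version numbers such as "e.g." or "v1.2".

```typescript
// Naive sentence splitter: a sentence is a run of non-terminator characters
// followed by one or more of ". ! ?" (or the end of the input).
function splitSentences(text: string): string[] {
  const matches = text.match(/[^.!?]+(?:[.!?]+|$)/g) ?? [];
  return matches.map((s) => s.trim()).filter((s) => s.length > 0);
}

const sentences = splitSentences('How do I install it? Run npm install. Done!');
console.log(sentences);
```

For Q&A-style content this granularity means a single question or answer can be retrieved on its own, at the cost of losing the surrounding context.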

`fixed`

Fixed-size chunks with overlap
Consistent chunk sizes
Good for uniform content
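
The interplay of `maxChunkSize` and `chunkOverlap` is easiest to see as a sliding window. The following sketch assumes both values are measured in characters and that each window starts `maxChunkSize - chunkOverlap` characters after the previous one; `chunkFixed` is a hypothetical name, not the plugin's function.

```typescript
// Sliding-window chunking: each chunk repeats the last `chunkOverlap`
// characters of the previous one.
function chunkFixed(text: string, maxChunkSize = 1000, chunkOverlap = 200): string[] {
  if (maxChunkSize <= 0) throw new Error('maxChunkSize must be positive');
  if (chunkOverlap < 0 || chunkOverlap >= maxChunkSize) {
    throw new Error('chunkOverlap must be at least 0 and less than maxChunkSize');
  }
  const step = maxChunkSize - chunkOverlap;
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + maxChunkSize));
    if (start + maxChunkSize >= text.length) break; // this window reached the end
  }
  return chunks;
}

const fixedChunks = chunkFixed('abcdefghij', 4, 2);
console.log(fixedChunks); // [ 'abcd', 'cdef', 'efgh', 'ghij' ]
```

The overlap keeps text that straddles a boundary retrievable from at least one chunk, in exchange for storing and embedding some characters twice.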

Ingesting Documents

Single Document

await agent.ingestDocuments([
  {
    id: 'api-reference',
    content: '# API Reference\n\n...',
    metadata: {
      title: 'API Reference',
      category: 'reference',
      version: '1.0.0',
    },
  },
]);

From Files (Example)

import fs from 'fs';
import path from 'path';

const docsDir = './docs';
const files = fs.readdirSync(docsDir);

const documents = files
  .filter(f => f.endsWith('.md'))
  .map(file => ({
    id: path.basename(file, '.md'),
    content: fs.readFileSync(path.join(docsDir, file), 'utf-8'),
    metadata: { filename: file },
  }));

await agent.ingestDocuments(documents);
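
When files carry a top-level heading, it can double as a human-readable title in the metadata. This is a small sketch under that assumption: `toDocument` and `IngestDocument` are hypothetical names, and the `{ id, content, metadata }` shape simply mirrors the examples above.

```typescript
// Mirrors the document shape used in the ingestion examples above.
interface IngestDocument {
  id: string;
  content: string;
  metadata: { filename: string; title: string };
}

// Build a document from a markdown file name and its content, using the first
// "# Heading" line as the title and falling back to the file name.
function toDocument(filename: string, content: string): IngestDocument {
  const id = filename.replace(/\.md$/i, '');
  const h1 = /^#[ \t]+(.+)$/m.exec(content);
  const title = h1 ? h1[1].trim() : id;
  return { id, content, metadata: { filename, title } };
}

const doc = toDocument('getting-started.md', '# Getting Started\n\nWelcome!');
console.log(doc.id, doc.metadata.title);
```

In the file-based example, `toDocument(file, fs.readFileSync(...))` could replace the inline object literal inside `.map(...)`.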

Filtering Results

const response = await client.chat({
  threadId: thread.id,
  message: 'Show me code examples',
  useRAG: true,
  ragFilters: {
    type: 'code',     // Only return code chunks
    section: 'Usage', // Only from "Usage" sections
  },
});
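
Conceptually, a filter like the one above narrows the candidate chunks before or after scoring. The sketch below assumes exact-match semantics with all filter keys combined by AND; whether the plugin matches sections exactly or by substring is not stated here. `ChunkMetadata`, `RagFilters`, and `matchesFilters` are hypothetical names.

```typescript
// Hypothetical chunk metadata; `type` and `section` mirror the filter keys above.
interface ChunkMetadata {
  type: 'text' | 'code';
  section: string;
}

type RagFilters = Partial<ChunkMetadata>;

// A chunk matches when every filter key that is set equals the chunk's value.
function matchesFilters(meta: ChunkMetadata, filters: RagFilters): boolean {
  return (Object.keys(filters) as Array<keyof ChunkMetadata>).every(
    (key) => filters[key] === undefined || meta[key] === filters[key],
  );
}

const candidates: ChunkMetadata[] = [
  { type: 'code', section: 'Usage' },
  { type: 'text', section: 'Usage' },
  { type: 'code', section: 'Installation' },
];
const filtered = candidates.filter((c) =>
  matchesFilters(c, { type: 'code', section: 'Usage' }),
);
console.log(filtered.length);
```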

Response Metadata

const response = await client.chat({
  threadId: thread.id,
  message: 'How do I authenticate?',
  useRAG: true,
});

console.log(response.metadata);
// {
//   count: 3,
//   totalChunks: 45,
//   strategy: 'markdown',
//   avgScore: 0.82,
//   sources: [
//     { id: 'auth-chunk-1', section: 'Authentication', type: 'text', score: 0.91 },
//     { id: 'auth-chunk-2', section: 'Authentication', type: 'code', score: 0.85 },
//     ...
//   ]
// }
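
The metadata can be turned into a short citation list for display or logging. The types below are inferred from the example output above and may not match the SDK's exported types; `RagSource`, `RagMetadata`, and `formatSources` are hypothetical names.

```typescript
// Shapes inferred from the example output above (an assumption, not SDK types).
interface RagSource {
  id: string;
  section: string;
  type: string;
  score: number;
}

interface RagMetadata {
  count: number;
  totalChunks: number;
  strategy: string;
  avgScore: number;
  sources: RagSource[];
}

// One line per source, best match first, with the score as a percentage.
function formatSources(metadata: RagMetadata): string[] {
  return [...metadata.sources]
    .sort((a, b) => b.score - a.score)
    .map((s) => `${s.section} [${s.type}] ${s.id} (${Math.round(s.score * 100)}%)`);
}

const exampleMetadata: RagMetadata = {
  count: 2,
  totalChunks: 45,
  strategy: 'markdown',
  avgScore: 0.88,
  sources: [
    { id: 'auth-chunk-2', section: 'Authentication', type: 'code', score: 0.85 },
    { id: 'auth-chunk-1', section: 'Authentication', type: 'text', score: 0.91 },
  ],
};
const citationLines = formatSources(exampleMetadata);
console.log(citationLines.join('\n'));
```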

API Reference

`DocsRAGPlugin`

Constructor

new DocsRAGPlugin(config: DocsRAGConfig)

Methods

| Method | Description |
|--------|-------------|
| retrieveContext(message, options) | Search documentation |
| ingest(documents, options) | Index documents |
| update(id, document, options) | Update a document |
| delete(ids, options) | Remove documents |
| getStats() | Get indexing statistics |
| clearAgent(agentId) | Clear agent's data |
| clearAll() | Clear all data |

Use Cases

API Documentation - Search endpoints, parameters, examples
User Guides - Natural language queries over tutorials
Knowledge Bases - Company wikis and internal docs
Code References - Search code examples and snippets
FAQs - Question-answer retrieval

License

MIT © ViloTech
