@aid-on/memory-rag
🚀 A lightweight, provider-agnostic in-memory RAG (Retrieval-Augmented Generation) library with seamless Vercel AI SDK integration.
✨ Features
- 🧠 In-Memory Vector Store: Lightning-fast similarity search without external dependencies
- 🔌 Multi-Provider Support: Works with OpenAI, Anthropic, Google, Cohere, and more via Vercel AI SDK
- ⚡ Zero Configuration: Get started with sensible defaults, customize when needed
- 📦 Modular Architecture: Clean separation between vector storage, RAG service, and providers
- 🎯 TypeScript First: Complete type safety with full IntelliSense support
- 🔄 Session Isolation: Manage multiple independent knowledge bases per user/session
- 🤖 Vercel AI SDK Native: Built-in streaming, tools, and edge runtime support
- 📏 Smart Chunking: Automatic document chunking with configurable size and overlap
- 🎨 Flexible API: Use high-level helpers or low-level components directly
📦 Installation
npm install @aid-on/memory-rag
# Install optional peer dependencies based on your needs:
npm install @ai-sdk/anthropic # For Claude models
npm install @ai-sdk/google # For Gemini models
npm install @ai-sdk/cohere # For Cohere models
🚀 Quick Start
Basic Usage
import { createSimpleRAG } from '@aid-on/memory-rag';
// Create a RAG instance with OpenAI (default)
const rag = createSimpleRAG();
// Add documents to the knowledge base
await rag.addDocument('RAG combines retrieval and generation for better AI responses.');
await rag.addDocument('Vector embeddings capture semantic meaning of text.');
// Search and generate an answer
const response = await rag.search('What is RAG?', 3);
console.log(response.answer);
// Output: RAG (Retrieval-Augmented Generation) combines retrieval and generation...
Using Specific Providers
import { InMemoryVectorStore, RAGService } from '@aid-on/memory-rag';
import { openai } from '@ai-sdk/openai';
import { anthropic } from '@ai-sdk/anthropic';
// Mix and match providers for embeddings and LLM
const store = new InMemoryVectorStore(
  openai.embedding('text-embedding-3-large')
);
const service = new RAGService(
  anthropic('claude-3-haiku-20240307')
);
// Add documents with metadata
await store.addDocument(
  'Advanced RAG techniques include hybrid search and reranking.',
  { source: 'docs', topic: 'rag-advanced' }
);
// Search with answer generation
const results = await service.search(store, 'advanced RAG', 5, true);
console.log(results.answer);
Session-Based Knowledge Isolation
import { getStore, RAGService } from '@aid-on/memory-rag';
// Create isolated stores for different users/sessions
const userStore = getStore('user-123');
const adminStore = getStore('admin-456');
// Each session maintains its own knowledge base
await userStore.addDocument('User dashboard shows personal metrics.');
await adminStore.addDocument('Admin panel includes system monitoring.');
// Queries only search within the session's knowledge
const service = new RAGService();
const userResults = await service.search(userStore, 'dashboard features');
// Returns only user-specific results
🛠️ Core API
Factory Functions
createInMemoryRAG(options?)
Factory function for creating a complete RAG system.
const rag = createInMemoryRAG({
  llmProvider: 'openai',       // or 'anthropic', 'google', 'cohere'
  embeddingProvider: 'openai', // or any supported provider
  llmModel: 'gpt-4o-mini',     // optional: specific model
  embeddingModel: 'text-embedding-3-small', // optional
  config: {
    vectorStore: {
      maxDocuments: 1000, // max documents to store
      chunkSize: 500,     // characters per chunk
      chunkOverlap: 50    // overlap between chunks
    },
    search: {
      defaultTopK: 5, // default results to return
      minScore: 0.5   // minimum similarity score
    }
  }
});
createSimpleRAG()
Quick start function with OpenAI defaults.
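For reference, a minimal sketch of what createSimpleRAG() is shorthand for. The explicit defaults shown (gpt-4o-mini and text-embedding-3-small) are assumptions based on the configuration example above, not confirmed internals.
import { createSimpleRAG, createInMemoryRAG } from '@aid-on/memory-rag';
// Zero-config path: OpenAI for both the LLM and embeddings
const rag = createSimpleRAG();
// Assumed explicit equivalent, shown for illustration only
const sameRag = createInMemoryRAG({
  llmProvider: 'openai',
  embeddingProvider: 'openai',
  llmModel: 'gpt-4o-mini',                 // assumption
  embeddingModel: 'text-embedding-3-small' // assumption
});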
Core Classes
InMemoryVectorStore
In-memory vector storage with similarity search.
class InMemoryVectorStore {
  constructor(embeddingProvider?: EmbeddingProvider | EmbeddingModel | string);
  async addDocument(content: string, metadata?: DocumentMetadata): Promise<string>;
  async removeDocument(id: string): Promise<boolean>;
  async search(query: string, topK?: number): Promise<SearchResult[]>;
  clear(): void;
  size(): number;
  getStats(): StoreStats;
}
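A short usage sketch of the store on its own, based on the method signatures above (the no-argument constructor is assumed to fall back to the default embedding provider):
import { InMemoryVectorStore } from '@aid-on/memory-rag';
const store = new InMemoryVectorStore(); // or pass a provider name, embedding model, or custom provider
// addDocument resolves to the new document's id
const id = await store.addDocument('Cosine similarity compares embedding vectors.', { topic: 'search' });
const results = await store.search('how are vectors compared?', 3);
console.log(results.length, store.size(), store.getStats());
// Remove a single document or reset the whole store
await store.removeDocument(id);
store.clear();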
RAGService
Orchestrates RAG operations with LLM integration.
class RAGService {
  constructor(llmProvider?: LLMProvider | LanguageModel | string);
  async search(
    store: IVectorStore,
    query: string,
    topK?: number,
    generateAnswer?: boolean
  ): Promise<RAGSearchResult>;
  async addDocument(
    store: IVectorStore,
    content: string,
    metadata?: DocumentMetadata,
    useChunks?: boolean,
    chunkSize?: number
  ): Promise<AddDocumentResult>;
}
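A brief sketch of the two methods together, based on the signatures above; passing generateAnswer as false is assumed to skip LLM generation and return only the retrieved context:
import { InMemoryVectorStore, RAGService } from '@aid-on/memory-rag';
const store = new InMemoryVectorStore();
const service = new RAGService('anthropic'); // the provider can also be given as a string name
await service.addDocument(store, 'Reranking reorders retrieved chunks by relevance.');
// Retrieval only: top 5 matches, no answer generation (assumed behavior of the flag)
const retrieved = await service.search(store, 'what is reranking?', 5, false);
// Retrieval plus a grounded answer
const withAnswer = await service.search(store, 'what is reranking?', 5, true);
console.log(withAnswer.answer);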
🔗 Vercel AI SDK Integration
Stream RAG Responses
Perfect for chat applications with real-time streaming:
import { streamRAGResponse } from '@aid-on/memory-rag';
// In your API route or server action
const stream = await streamRAGResponse({
  messages: [
    { role: 'user', content: 'Explain vector embeddings' }
  ],
  sessionId: 'user-123',
  enableRAG: true,      // Enable RAG context
  topK: 3,              // Number of documents to retrieve
  model: 'gpt-4o-mini', // LLM model
  temperature: 0.7      // Response creativity
});
// Return stream to client
return new Response(stream);
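On the client, the returned body can be read incrementally. The sketch below assumes the handler above is exposed at /api/chat (a placeholder path) and simply decodes the raw stream; when the stream follows the Vercel AI SDK's data protocol, the SDK's useChat hook is the more typical consumer.
const res = await fetch('/api/chat', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ messages: [{ role: 'user', content: 'Explain vector embeddings' }] })
});
const reader = res.body!.getReader();
const decoder = new TextDecoder();
// Append chunks to the UI as they arrive
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  console.log(decoder.decode(value, { stream: true }));
}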
RAG as AI Tool
Integrate RAG with Vercel AI SDK's tool system:
import { createRAGTool } from '@aid-on/memory-rag';
import { generateText } from 'ai';
import { openai } from '@ai-sdk/openai';
const ragTool = createRAGTool('session-123');
const result = await generateText({
  model: openai('gpt-4'),
  tools: {
    searchKnowledge: ragTool.search,
    addKnowledge: ragTool.add
  },
  prompt: 'Help me understand our documentation'
});
⚙️ Configuration
Environment Variables
Configure defaults via environment variables:
# Provider selection
MEMORY_RAG_LLM_PROVIDER=openai
MEMORY_RAG_EMBEDDING_PROVIDER=openai
# Model selection
MEMORY_RAG_MODEL=gpt-4o-mini
MEMORY_RAG_EMBEDDING_MODEL=text-embedding-3-small
# API Keys (if not set elsewhere)
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
Runtime Configuration
import { setConfig } from '@aid-on/memory-rag';
setConfig({
  defaultProvider: {
    llm: 'anthropic',
    embedding: 'openai'  // Mix providers
  },
  vectorStore: {
    maxDocuments: 10000, // Increase capacity
    chunkSize: 1000,     // Larger chunks
    chunkOverlap: 100    // More context overlap
  },
  search: {
    defaultTopK: 10, // Return more results
    minScore: 0.7    // Higher quality threshold
  }
});
🔧 Advanced Features
Smart Document Chunking
const service = new RAGService();
// Automatically chunks large documents
const result = await service.addDocument(
  store,
  longArticle, // 10,000+ characters
  { source: 'blog', author: 'John' },
  true, // Enable auto-chunking
  1000  // Characters per chunk
);
console.log(`Added ${result.documentIds.length} chunks`);
Bulk Document Import
const documents = [
  { content: 'Getting started guide...', metadata: { type: 'tutorial' } },
  { content: 'API reference...', metadata: { type: 'reference' } },
  { content: 'Best practices...', metadata: { type: 'guide' } }
];
// Efficiently add multiple documents
const results = await service.bulkAddDocuments(store, documents);
console.log(`Imported ${results.documentIds.length} documents`);
Custom Provider Registration
import {
  registerLanguageModelProvider,
  registerEmbeddingModelProvider
} from '@aid-on/memory-rag';
// Register a custom provider
registerLanguageModelProvider('custom-llm', (model) => {
  return {
    async generateText({ messages }) {
      // Your custom implementation
      return 'Generated response';
    }
  };
});
// Use the custom provider
const service = new RAGService('custom-llm');
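registerEmbeddingModelProvider is imported above but not demonstrated. By analogy with the language-model registration, a custom embedding provider might be registered as in the sketch below; the embed method name and its return shape are assumptions, not a documented contract.
// Hypothetical custom embedding provider (shape assumed by analogy)
registerEmbeddingModelProvider('custom-embeddings', (model) => {
  return {
    async embed(text: string) {
      // Your custom implementation: return a numeric vector for the text
      return new Array(384).fill(0);
    }
  };
});
// Assumed usage: pass the registered name to the store
const customStore = new InMemoryVectorStore('custom-embeddings');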
Metadata Filtering
// Add documents with rich metadata
await store.addDocument('Python tutorial', {
  language: 'python',
  level: 'beginner',
  updated: '2024-01'
});
// Future: Query with metadata filters
// const results = await store.search('tutorial', {
//   filter: { language: 'python', level: 'beginner' }
// });
🎯 Real-World Use Cases
💬 Conversational AI
Build chatbots that remember context and provide accurate, grounded responses.
📚 Documentation Assistant
Create an AI that can answer questions about your codebase, API, or product docs (see the sketch at the end of this section).
🔍 Semantic Search Engine
Implement intelligent search that understands intent, not just keywords.
🤖 Customer Support Bot
Deploy AI agents that can access your knowledge base to resolve customer queries.
📝 Content Generation
Generate articles, summaries, or reports augmented with factual information.
🎓 Educational Tutor
Build personalized learning assistants with access to course materials.
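As an illustration of the documentation-assistant use case, a minimal sketch that wires together pieces shown in earlier sections; the session id and document contents are placeholders.
import { getStore, RAGService } from '@aid-on/memory-rag';
// One isolated store per knowledge base (placeholder session id)
const docsStore = getStore('product-docs');
const service = new RAGService();
// Index a few documentation snippets
await service.addDocument(docsStore, 'POST /v1/items creates an item and returns its id.', { source: 'api-reference' });
await service.addDocument(docsStore, 'Rate limits default to 100 requests per minute.', { source: 'limits' });
// Answer a support question grounded in the indexed docs
const { answer } = await service.search(docsStore, 'How do I create an item?', 3, true);
console.log(answer);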
🏗️ Architecture
@aid-on/memory-rag
├── 📁 types/            # TypeScript interfaces & types
├── 📁 providers/        # Provider abstraction layer
│   ├── factory.ts       # Provider factory pattern
│   ├── base.ts          # Base provider classes
│   └── vercel-ai.ts     # Vercel AI SDK adapter
├── 📁 stores/           # Vector storage layer
│   └── in-memory.ts     # In-memory vector store
├── 📁 services/         # Business logic
│   ├── rag-service.ts   # RAG orchestration
│   └── store-manager.ts # Session management
├── 📁 integrations/     # Framework integrations
│   └── vercel-ai.ts     # Vercel AI SDK tools
└── 📄 index.ts          # Public API exports
Design Principles
- Provider Agnostic: Swap LLM/embedding providers without changing code
- Memory Efficient: Optimized for in-memory operations
- Type Safe: Full TypeScript with strict typing
- Modular: Use only what you need
- Edge Ready: Works in serverless and edge environments
🧪 Development
# Install dependencies
npm install
# Run tests
npm test
# Run tests in watch mode
npm run test:watch
# Generate coverage report
npm run test:coverage
# Build the library
npm run build
# Type checking
npm run type-check
# Linting
npm run lint
🚀 Performance
- Fast Embedding: ~50ms per document (varies by provider)
- Instant Search: <10ms for 1000 documents
- Low Memory: ~1MB per 100 documents
- Zero Cold Start: No external services to initialize
🔒 Security
- No data persistence by default
- Session isolation for multi-tenant apps
- Provider API keys stay on your server
- Works in secure edge environments
📄 License
MIT © Aid-On
🤝 Contributing
We welcome contributions! Please see our Contributing Guide for details.
Built with ❤️ by the Aid-On team
