@sf-bot/rag-core
v0.1.0
Core RAG (Retrieval-Augmented Generation) schema with pgvector support for document embeddings.
Overview
This package provides the foundational database schema for storing and managing document embeddings using PostgreSQL and pgvector. It supports:
- Multiple embedding models per collection
- Semantic chunking strategies
- JSON/CSV import and export for migrations
Schema
Tables
| Table | Description |
|-------|-------------|
| rag.embedding_model | Embedding model configurations (OpenAI, Cohere, local, etc.) |
| rag.collection | Document collections with chunking configuration |
| rag.collection_model | Links collections to embedding models (many-to-many) |
| rag.document | Source documents with content and metadata |
| rag.chunk | Document segments/chunks for embedding |
| rag.embedding | Vector embeddings (pgvector) linked to chunks |
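The relationships in the table can be pictured with a small in-memory sketch. This is only an illustration: every field name below (`dimensions`, `model_ids`, `model_id`, and so on) is hypothetical and not taken from the package's actual columns, and the link from an embedding to its model is an assumption made so that one chunk can carry one vector per model.

```python
from dataclasses import dataclass, field


# NOTE: all field names here are hypothetical; the real columns may differ.
@dataclass
class EmbeddingModel:  # stands in for rag.embedding_model
    id: int
    name: str
    dimensions: int


@dataclass
class Collection:  # stands in for rag.collection
    id: int
    name: str
    # Stands in for rows of rag.collection_model (many-to-many link).
    model_ids: set[int] = field(default_factory=set)


@dataclass
class Chunk:  # stands in for rag.chunk
    id: int
    document_id: int
    content: str


@dataclass
class Embedding:  # stands in for rag.embedding
    chunk_id: int
    model_id: int  # assumed link to the model that produced the vector
    vector: list[float]


def embeddings_for_chunk(chunk_id: int, embeddings: list[Embedding]) -> dict[int, list[float]]:
    """Return the vectors stored for one chunk, keyed by model id."""
    return {e.model_id: e.vector for e in embeddings if e.chunk_id == chunk_id}


model_a = EmbeddingModel(id=1, name="model-a", dimensions=3)
model_b = EmbeddingModel(id=2, name="model-b", dimensions=4)
docs = Collection(id=1, name="docs", model_ids={model_a.id, model_b.id})
chunk = Chunk(id=10, document_id=100, content="hello world")
embeddings = [
    Embedding(chunk.id, model_a.id, [0.1, 0.2, 0.3]),
    Embedding(chunk.id, model_b.id, [0.1, 0.2, 0.3, 0.4]),
]
by_model = embeddings_for_chunk(chunk.id, embeddings)
```

In this sketch a collection linked to two models ends up with two vectors per chunk, which is one plausible reading of "multiple embedding models per collection".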
Chunking Configuration
The chunk_config JSONB column on rag.collection supports:
```json
{
  "strategy": "semantic",
  "max_tokens": 512,
  "overlap_tokens": 50
}
```
Supported strategies: semantic, fixed, sentence, paragraph
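As a rough illustration of how these settings might be checked and applied on the client side, here is a minimal Python sketch. The helper names `validate_chunk_config` and `fixed_chunks` are hypothetical and not part of this package. The validation rules (positive `max_tokens`, overlap smaller than `max_tokens`) and the whitespace-based token counting are assumptions, not documented behavior. Only the `fixed` strategy is sketched.

```python
SUPPORTED_STRATEGIES = {"semantic", "fixed", "sentence", "paragraph"}


def validate_chunk_config(config: dict) -> dict:
    """Check a chunk_config dict against assumed rules and return a normalized copy."""
    strategy = config.get("strategy")
    if strategy not in SUPPORTED_STRATEGIES:
        raise ValueError(f"unsupported strategy: {strategy!r}")
    max_tokens = config.get("max_tokens")
    overlap = config.get("overlap_tokens", 0)
    if not isinstance(max_tokens, int) or max_tokens <= 0:
        raise ValueError("max_tokens must be a positive integer")
    if not isinstance(overlap, int) or not 0 <= overlap < max_tokens:
        raise ValueError("overlap_tokens must satisfy 0 <= overlap_tokens < max_tokens")
    return {"strategy": strategy, "max_tokens": max_tokens, "overlap_tokens": overlap}


def fixed_chunks(text: str, max_tokens: int, overlap_tokens: int) -> list[str]:
    """Split text into windows of max_tokens that share overlap_tokens with the previous window."""
    tokens = text.split()  # assumption: whitespace-separated words stand in for model tokens
    step = max_tokens - overlap_tokens
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start:start + max_tokens]))
        if start + max_tokens >= len(tokens):
            break  # the last window already reached the end of the text
    return chunks


config = validate_chunk_config(
    {"strategy": "fixed", "max_tokens": 4, "overlap_tokens": 1}
)
pieces = fixed_chunks(
    "a b c d e f g h i j", config["max_tokens"], config["overlap_tokens"]
)
```

With `max_tokens` of 4 and `overlap_tokens` of 1, each window starts three tokens after the previous one, so neighboring chunks share exactly one token.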
Installation
```sh
pgpm deploy @sf-bot/rag-core
```
Requirements
- PostgreSQL 17+
- pgvector extension
