@quarry-systems/drift-retrieval-sqlite

v0.1.1-alpha.1

Published

a month ago

SQLite FTS5 retrieval adapter for Drift RAG pipelines


brettnye

drift retrieval rag sqlite fts5 search backend node

@quarry-systems/mcg-retrieval-sqlite

SQLite FTS5 retrieval adapter for MCG RAG pipelines.

Features

Full-text search: SQLite FTS5 for keyword-based retrieval
Local-first: Runs in-process against a local file; suited to development and small-scale RAG
Zero-ops: No external search service required
Unified model: Source-agnostic documents and chunks
Fast indexing: Automatic FTS triggers for real-time updates
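
The "automatic FTS triggers" feature most likely refers to the standard SQLite pattern of an external-content FTS5 table kept in sync by triggers. The sketch below shows that pattern as a list of SQL statements; the table and column names (`chunks`, `chunks_fts`, `pk`, `doc_id`) are assumptions for illustration and are not taken from this package's actual schema.

```typescript
// Hypothetical schema: every table and column name here is illustrative,
// not the package's real schema.
const ftsSchema: string[] = [
  // Base table holding the chunk rows.
  `CREATE TABLE chunks (
     pk INTEGER PRIMARY KEY,
     doc_id TEXT NOT NULL,
     sequence INTEGER NOT NULL,
     text TEXT NOT NULL
   );`,
  // External-content FTS5 index over chunks.text.
  `CREATE VIRTUAL TABLE chunks_fts USING fts5(
     text, content='chunks', content_rowid='pk'
   );`,
  // Keep the index in sync on insert.
  `CREATE TRIGGER chunks_ai AFTER INSERT ON chunks BEGIN
     INSERT INTO chunks_fts(rowid, text) VALUES (new.pk, new.text);
   END;`,
  // On delete, issue the special FTS5 'delete' command with the old values.
  `CREATE TRIGGER chunks_ad AFTER DELETE ON chunks BEGIN
     INSERT INTO chunks_fts(chunks_fts, rowid, text)
       VALUES ('delete', old.pk, old.text);
   END;`,
  // An update is a delete of the old row followed by an insert of the new one.
  `CREATE TRIGGER chunks_au AFTER UPDATE ON chunks BEGIN
     INSERT INTO chunks_fts(chunks_fts, rowid, text)
       VALUES ('delete', old.pk, old.text);
     INSERT INTO chunks_fts(rowid, text) VALUES (new.pk, new.text);
   END;`
];
```

With triggers like these, any write to the base table updates the search index in the same transaction, so no separate reindexing step is needed.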

Installation

npm install @quarry-systems/mcg-retrieval-sqlite

Usage

import { createSQLiteRetrieval } from '@quarry-systems/mcg-retrieval-sqlite';

// In-memory (for testing)
const testRetrieval = createSQLiteRetrieval({ filename: ':memory:' });

// File-based (for persistence); used in the rest of this example
const retrieval = createSQLiteRetrieval({ filename: './data/rag.db' });

// Ingest a document
const doc = await retrieval.upsertDocument({
  id: 'doc-1',
  sourceType: 'url',
  sourceRef: 'https://example.com/page',
  title: 'Example Page',
  metadata: { author: 'John Doe' }
});

// Index chunks
await retrieval.upsertChunks(doc.id, [
  { sequence: 0, text: 'First paragraph about AI...' },
  { sequence: 1, text: 'Second paragraph about ML...' }
]);

// Search
const results = await retrieval.search('artificial intelligence', {
  topK: 5,
  sourceTypes: ['url']
});

console.log(results[0]);
// {
//   chunk: { id: '...', text: '...', ... },
//   doc: { id: 'doc-1', title: 'Example Page', ... },
//   score: 2.5,
//   highlights: ['<mark>artificial intelligence</mark> is...']
// }
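
FTS5 has its own query syntax, so user input containing characters such as `"`, `*`, or keywords like `OR` can change the meaning of a query or cause a syntax error. Whether `search` sanitizes its input is not stated here. As a minimal sketch, assuming the query string is passed through to FTS5 as-is, a hypothetical helper `toFts5Query` can quote each term so it is matched literally:

```typescript
// Hypothetical helper, not part of this package.
// Wraps every whitespace-separated term in double quotes so FTS5 treats it as
// a literal string; embedded double quotes are escaped by doubling them.
function toFts5Query(input: string): string {
  return input
    .split(/\s+/)
    .filter((term) => term.length > 0)
    .map((term) => `"${term.replace(/"/g, '""')}"`)
    .join(' ');
}

const safeQuery = toFts5Query('artificial  intelligence');
```

Quoted terms separated by spaces are combined with an implicit AND in FTS5, so this form requires every term to appear in a matching chunk.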

Integration with MCG

import { ManagedCyclicGraph } from '@quarry-systems/managed-cyclic-graph';
import { createSQLiteRetrieval } from '@quarry-systems/mcg-retrieval-sqlite';

// Create retrieval adapter
const retrieval = createSQLiteRetrieval({ filename: './data/rag.db' });

// Build RAG ingest graph
const ingestGraph = new ManagedCyclicGraph()
  .use({ services: { retrieval } })
  .node('fetchUrl', {
    type: 'action',
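    action: async (ctx) => {
      const response = await fetch(ctx.data.url);
      // Fail early so error pages are not indexed as content
      if (!response.ok) {
        throw new Error(`Failed to fetch ${ctx.data.url}: ${response.status}`);
      }
      const text = await response.text();
      return { url: ctx.data.url, content: text };
    }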
  })
  .node('indexDocument', {
    type: 'action',
    action: async (ctx, services) => {
      // Index document
      const doc = await services.retrieval.upsertDocument({
        id: `doc-${Date.now()}`,
        sourceType: 'url',
        sourceRef: ctx.data.url,
        title: ctx.data.url,
        metadata: { indexedAt: Date.now() }
      });
      
      // Split into chunks (simple example); drop empty paragraphs
      const chunks = ctx.data.content
        .split('\n\n')
        .map((text) => text.trim())
        .filter((text) => text.length > 0)
        .map((text, i) => ({ sequence: i, text }));
      
      await services.retrieval.upsertChunks(doc.id, chunks);
      
      return { docId: doc.id, chunks: chunks.length };
    }
  })
  .edge('fetchUrl', 'indexDocument')
  .build();

// Build RAG query graph
const queryGraph = new ManagedCyclicGraph()
  .use({ services: { retrieval } })
  .node('search', {
    type: 'action',
    action: async (ctx, services) => {
      const results = await services.retrieval.search(ctx.data.query, {
        topK: 5
      });
      
      return { 
        query: ctx.data.query,
        results: results.map(r => ({
          text: r.chunk.text,
          score: r.score,
          source: r.doc.sourceRef
        }))
      };
    }
  })
  .build();

// Execute
await ingestGraph.run({ url: 'https://example.com/article' });
await queryGraph.run({ query: 'What is the main topic?' });
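
The paragraph split in the ingest graph is intentionally simple. As a slightly more defensive sketch, a hypothetical `chunkText` helper (not part of this package) can skip blank paragraphs and cap chunk length; the `{ sequence, text }` shape follows the `upsertChunks` calls above, and the `maxChars` default is an arbitrary assumption.

```typescript
// Shape of a chunk as used in the upsertChunks examples.
interface ChunkInput {
  sequence: number;
  text: string;
}

// Hypothetical helper: splits on blank lines, drops empty paragraphs, and
// slices any paragraph longer than maxChars into fixed-size pieces.
function chunkText(content: string, maxChars = 1000): ChunkInput[] {
  const pieces: string[] = [];
  for (const paragraph of content.split(/\n\s*\n/)) {
    const trimmed = paragraph.trim();
    if (trimmed.length === 0) continue;
    for (let start = 0; start < trimmed.length; start += maxChars) {
      pieces.push(trimmed.slice(start, start + maxChars));
    }
  }
  // Number the chunks in document order.
  return pieces.map((text, sequence) => ({ sequence, text }));
}
```

The fixed-size slice can cut a word in half; a real pipeline would probably split on sentence or token boundaries instead.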

API

Implements RetrievalAdapter from @quarry-systems/mcg-contracts:

upsertDocument(doc) - Create or update document
upsertChunks(docId, chunks) - Index chunks (replaces existing)
deleteDocument(docId) - Delete document and chunks
search(query, options) - Full-text search with BM25 scoring
getDocument(docId) - Get document by ID
getChunks(docId) - Get all chunks for document
listDocuments(options) - List documents with filtering
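
To make the BM25 scoring mentioned for `search` more concrete, here is a self-contained sketch of the formula over pre-tokenized documents. It is not this package's code: SQLite computes BM25 internally, and its built-in `bm25()` function returns values where a better match is numerically lower, so the positive `score` in the usage example suggests the adapter flips the sign (an assumption). The sketch uses the common non-negative IDF variant with `k1 = 1.2` and `b = 0.75`.

```typescript
type Tokens = string[];

// Illustrative BM25 score of one document for a query.
// corpus is the full list of tokenized documents, including doc.
function bm25(query: Tokens, doc: Tokens, corpus: Tokens[], k1 = 1.2, b = 0.75): number {
  const totalDocs = corpus.length;
  const avgLen = corpus.reduce((sum, d) => sum + d.length, 0) / totalDocs;
  // Score each distinct query term once.
  const terms = query.filter((t, i) => query.indexOf(t) === i);

  let score = 0;
  for (const term of terms) {
    const tf = doc.filter((t) => t === term).length; // term frequency in doc
    if (tf === 0) continue;
    const docsWithTerm = corpus.filter((d) => d.indexOf(term) !== -1).length;
    // Non-negative IDF variant: rarer terms get a higher weight.
    const idf = Math.log(1 + (totalDocs - docsWithTerm + 0.5) / (docsWithTerm + 0.5));
    // Length normalization: long documents are penalized relative to average.
    const norm = tf + k1 * (1 - b + b * (doc.length / avgLen));
    score += (idf * tf * (k1 + 1)) / norm;
  }
  return score;
}

const corpus: Tokens[] = [
  ['ai', 'ai', 'models'],
  ['ai', 'is', 'here'],
  ['cats', 'and', 'dogs']
];
const scores = corpus.map((doc) => bm25(['ai'], doc, corpus));
```

In this corpus the first document mentions the term twice at the same length as the second, so it scores higher, and the third document scores zero.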

Configuration

interface SQLiteRetrievalConfig {
  /** Path to SQLite database file (use ':memory:' for in-memory) */
  filename: string;
  
  /** Enable WAL mode for better concurrency (default: true) */
  walMode?: boolean;
}
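
As a sketch of how the `walMode` default might be applied, the hypothetical `journalPragma` function below resolves the option and decides whether to request WAL. The function name and the decision to skip WAL for `:memory:` are assumptions about the adapter's behavior; SQLite itself only allows the MEMORY or OFF journal modes for in-memory databases.

```typescript
interface SQLiteRetrievalConfig {
  /** Path to SQLite database file (use ':memory:' for in-memory) */
  filename: string;
  /** Enable WAL mode for better concurrency (default: true) */
  walMode?: boolean;
}

// Hypothetical helper: returns the PRAGMA to run after opening the database,
// or null when WAL is disabled or cannot apply.
function journalPragma(config: SQLiteRetrievalConfig): string | null {
  const walMode = config.walMode ?? true; // documented default
  if (!walMode || config.filename === ':memory:') {
    return null;
  }
  return 'PRAGMA journal_mode = WAL;';
}
```

WAL mode lets readers proceed while a write is in progress, which is why it helps when ingest and query graphs share one database file.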

Testing

npm test

All tests run in-memory with no external dependencies.

License

MIT
