voctar

v0.1.3

Published

10 days ago

TypeScript library with RAG primitives for vector embeddings, chunking, storing and retrieval.

0High
0Medium
0Low

marvinified

rag vector vector-database embeddings chunking semantic-search semantic-retrieval typescript qdrant sqlite

Features

Simple primitives: embed and search
Supports multiple vector stores: SQLite, Qdrant, in-memory, or custom store providers
Automatic chunking for long documents with multiple strategies (fixed, recursive, sentence, paragraph, semantic)
Semantic search with score thresholds and metadata filtering
TypeScript-first.

Quick Start

yarn add voctar

import { Voctar } from 'voctar';

const vector = new Voctar({
  embedding: {
    type: 'openai',
    apiKey: '<your-api-key>',
  },
  store: {
    type: 'sqlite',
    path: 'data/vector.db',
  },
});

const { documentId } = await vector.embed('documents', "Very long text...", {
  metadata: { author: 'Alice' },
});

const results = await vector.search('documents', 'Some query');

Primitives API

`embed(collection, text, options?)`

Embeds a document into a collection.
If the text exceeds model limits, Voctar auto-chunks and stores chunk vectors.

const { documentId, chunkIds } = await vector.embed('documents', longText, {
  documentId: 'doc-1',                 // optional; auto-generated if omitted
  metadata: { source: 'guide' },       // optional user metadata
  chunkSize: 1000,                     // optional
  chunkStrategy: 'recursive',          // fixed | recursive | sentence | paragraph | semantic
  chunkOverlap: 200,                   // optional
  autoChunk: true,                     // optional override
});

Returns:

documentId: stable parent id for the document
chunkIds: stored ids (single id for unchunked docs, multiple for chunked docs)

`search(collection, query, options?)`

Retrieves semantically similar text from a collection.

const results = await vector.search('documents', 'how does chunking work', {
  limit: 5,                            // optional, default provider behavior
  scoreThreshold: 0,                   // optional
  filter: { source: 'guide' },         // optional metadata filter
  includeSystem: false,                // optional; include internal metadata when true
});

Each result includes:

id
text
score
createdAt
metadata (and optional system when includeSystem: true)

Published

Vulnerabilities

Links

Maintainers

Keywords

Readme

Features

Quick Start

Primitives API

embed(collection, text, options?)

search(collection, query, options?)

Documentation

`embed(collection, text, options?)`

`search(collection, query, options?)`