voctar
v0.1.3
A TypeScript library of RAG primitives for vector embedding, chunking, storage, and retrieval.
Features
- Simple primitives: embed and search
- Supports multiple vector stores: SQLite, Qdrant, in-memory, or custom store providers
- Automatic chunking for long documents with multiple strategies (fixed, recursive, sentence, paragraph, semantic)
- Semantic search with score thresholds and metadata filtering
- TypeScript-first.
Quick Start
yarn add voctar

import { Voctar } from 'voctar';

const vector = new Voctar({
  embedding: {
    type: 'openai',
    apiKey: '<your-api-key>',
  },
  store: {
    type: 'sqlite',
    path: 'data/vector.db',
  },
});

const { documentId } = await vector.embed('documents', "Very long text...", {
  metadata: { author: 'Alice' },
});

const results = await vector.search('documents', 'Some query');

Primitives API
embed(collection, text, options?)
Embeds a document into a collection.
If the text exceeds model limits, Voctar auto-chunks and stores chunk vectors.
const { documentId, chunkIds } = await vector.embed('documents', longText, {
  documentId: 'doc-1',            // optional; auto-generated if omitted
  metadata: { source: 'guide' },  // optional user metadata
  chunkSize: 1000,                // optional
  chunkStrategy: 'recursive',     // fixed | recursive | sentence | paragraph | semantic
  chunkOverlap: 200,              // optional
  autoChunk: true,                // optional override
});

Returns:
- documentId: stable parent id for the document
- chunkIds: stored ids (single id for unchunked docs, multiple for chunked docs)
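
To make the chunkSize and chunkOverlap options more concrete, here is a minimal sketch of what a fixed-size strategy with overlap might look like. This is not Voctar's implementation: the helper name `chunkFixed` is hypothetical, and it assumes sizes are measured in characters, which the README does not state (the library may well count tokens instead).

```typescript
// Hypothetical helper, not part of Voctar's API.
// Splits text into windows of `chunkSize` characters, where each window
// shares its first `chunkOverlap` characters with the end of the previous one.
function chunkFixed(text: string, chunkSize: number, chunkOverlap: number): string[] {
  if (chunkSize <= 0) {
    throw new Error('chunkSize must be positive');
  }
  if (chunkOverlap < 0 || chunkOverlap >= chunkSize) {
    throw new Error('chunkOverlap must be in the range [0, chunkSize)');
  }

  const step = chunkSize - chunkOverlap; // how far the window advances each time
  const chunks: string[] = [];

  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    // Stop once the current window has reached the end of the text,
    // so no trailing chunk made up purely of overlap is emitted.
    if (start + chunkSize >= text.length) break;
  }
  return chunks;
}

const sampleChunks = chunkFixed('abcdefghij', 4, 2);
console.log(sampleChunks); // [ 'abcd', 'cdef', 'efgh', 'ghij' ]
```

Overlap trades extra storage for context: a sentence that straddles a chunk boundary still appears whole in at least one chunk, which tends to help retrieval.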
search(collection, query, options?)
Retrieves semantically similar text from a collection.
const results = await vector.search('documents', 'how does chunking work', {
  limit: 5,                     // optional; defaults to the provider's behavior
  scoreThreshold: 0,            // optional
  filter: { source: 'guide' },  // optional metadata filter
  includeSystem: false,         // optional; include internal metadata when true
});

Each result includes:
- id
- text
- score
- createdAt
- metadata (and optional system when includeSystem: true)
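
As a rough illustration of how limit, scoreThreshold, and filter could interact, the sketch below ranks a small in-memory set of vectors by cosine similarity. It is an assumption-laden toy, not Voctar's code: the names `StoredItem`, `cosineSimilarity`, and `searchInMemory` are hypothetical, the filter is assumed to be an exact match on every key, and real stores such as SQLite or Qdrant may score and filter differently.

```typescript
// Hypothetical in-memory sketch; names and semantics are assumptions.
interface StoredItem {
  id: string;
  text: string;
  vector: number[];
  metadata: Record<string, unknown>;
}

interface ScoredItem {
  id: string;
  text: string;
  score: number;
  metadata: Record<string, unknown>;
}

interface SearchOptions {
  limit?: number;
  scoreThreshold?: number;
  filter?: Record<string, unknown>;
}

// Cosine similarity: dot product divided by the product of the vector norms.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('dimension mismatch');
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

function searchInMemory(
  items: StoredItem[],
  queryVector: number[],
  options: SearchOptions = {},
): ScoredItem[] {
  const { limit = 10, scoreThreshold = 0, filter = {} } = options;
  return items
    // Assumed filter semantics: every filter key must match exactly.
    .filter((item) =>
      Object.entries(filter).every(([key, value]) => item.metadata[key] === value),
    )
    .map((item) => ({
      id: item.id,
      text: item.text,
      score: cosineSimilarity(item.vector, queryVector),
      metadata: item.metadata,
    }))
    .filter((hit) => hit.score >= scoreThreshold) // drop weak matches
    .sort((x, y) => y.score - x.score) // best match first
    .slice(0, limit);
}

const sampleItems: StoredItem[] = [
  { id: 'a', text: 'chunking guide', vector: [1, 0], metadata: { source: 'guide' } },
  { id: 'b', text: 'unrelated guide', vector: [0, 1], metadata: { source: 'guide' } },
  { id: 'c', text: 'chunking blog', vector: [1, 0], metadata: { source: 'blog' } },
];

const sampleHits = searchInMemory(sampleItems, [1, 0], {
  limit: 5,
  scoreThreshold: 0.5,
  filter: { source: 'guide' },
});
console.log(sampleHits.map((hit) => hit.id)); // [ 'a' ]
```

Here item `c` is removed by the metadata filter and item `b` by the score threshold, so only `a` is returned.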
