vectlite
v0.1.11
Embedded vector store for local-first AI applications.
vectlite is a single-file, zero-dependency vector database written in Rust with Node.js bindings. It gives you dense + sparse hybrid search, HNSW indexing, metadata filtering, transactions, and crash-safe persistence in a single .vdb file -- no server, no Docker, no network calls.
Installation
npm install vectlite
Requires Node.js 18+. Pre-built binaries are available for macOS (x86_64, arm64), Linux (x86_64), and Windows (x86_64). Other platforms fall back to compiling from source (requires Rust/Cargo).
Quick Start
const vectlite = require('vectlite')
// Create or open a database
const db = vectlite.open('knowledge.vdb', { dimension: 384 })
// Insert records with vectors and metadata
// (embedding and embedding2 are 384-dimensional vectors produced elsewhere)
db.upsert('doc1', embedding, { source: 'blog', title: 'Auth Guide' })
db.upsert('doc2', embedding2, { source: 'notes', title: 'Billing' })
// Search with filters
const results = db.search(embeddingQuery, { k: 5, filter: { source: 'blog' } })
// Query-free inspection
console.log(db.count({ filter: { source: 'blog' } }))
// Clean up
db.close()
Features
Core
- Single-file storage -- one .vdb file per database, portable and easy to back up
- Dense vectors -- cosine similarity with automatic HNSW indexing for large collections
- Sparse vectors -- BM25-scored inverted index for keyword retrieval
- Hybrid search -- dense + sparse fusion with linear or RRF strategies
- Rich metadata -- string, number, boolean, null, array, and nested object values
- Crash-safe WAL -- writes land in a write-ahead log first, then checkpoint with compact()
- Transactions -- atomic batched writes with db.transaction()
- File locking -- advisory locks prevent corruption from concurrent access
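As a point of reference for the dense-vector feature above, here is a minimal sketch of cosine similarity and an exhaustive top-k scan, which is the exact result that an approximate index such as HNSW tries to reproduce faster. The names `cosineSimilarity` and `bruteForceTopK` are illustrative and are not part of the vectlite API; this is not how vectlite is implemented internally.

```javascript
// Cosine similarity between two equal-length numeric arrays.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) throw new Error('dimension mismatch')
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  // Treat a zero vector as having no similarity to anything.
  if (normA === 0 || normB === 0) return 0
  return dot / (Math.sqrt(normA) * Math.sqrt(normB))
}

// Exhaustive search: score every record, sort by score, keep the best k.
function bruteForceTopK(query, records, k) {
  return records
    .map(({ id, vector }) => ({ id, score: cosineSimilarity(query, vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
}

const records = [
  { id: 'a', vector: [1, 0] },
  { id: 'b', vector: [0, 1] },
  { id: 'c', vector: [1, 1] },
]
const top = bruteForceTopK([1, 0.1], records, 2)
console.log(top.map((r) => r.id)) // [ 'a', 'c' ]
```

The scan is linear in the number of records, which is why an index becomes worthwhile as a collection grows.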
Search & Retrieval
- Metadata filters -- MongoDB-style operators: $eq, $ne, $gt, $gte, $lt, $lte, $in, $nin, $contains, $exists, $and, $or, $not
- Nested filters -- dot-path traversal (author.name), $elemMatch, $size on arrays and objects
- Named vectors -- multiple vector spaces per record (vectors: { title: [...], body: [...] })
- Multi-vector queries -- weighted search across vector spaces in a single call
- MMR diversification -- mmrLambda controls relevance vs. diversity trade-off
- Namespaces -- logical isolation with per-namespace or cross-namespace search
- Observability -- searchWithStats() returns timings, BM25 term scores, ANN stats, and per-result explain payloads
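To make the relevance vs. diversity trade-off concrete, the sketch below implements textbook maximal marginal relevance (MMR) over a small candidate list. It assumes the common convention that a lambda of 1 means pure relevance and lower values favor diversity; whether vectlite's mmrLambda follows exactly this convention is not stated in this README. `mmrSelect` and `cosine` are hypothetical names, not vectlite functions.

```javascript
function cosine(a, b) {
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB)
}

// candidates: [{ id, vector, score }], where score is relevance to the query.
// Greedily picks the item maximizing lambda * relevance - (1 - lambda) * redundancy.
function mmrSelect(candidates, k, lambda) {
  const selected = []
  const remaining = [...candidates]
  while (selected.length < k && remaining.length > 0) {
    let bestIndex = 0
    let bestValue = -Infinity
    remaining.forEach((cand, i) => {
      // Redundancy: highest similarity to anything already selected.
      const redundancy =
        selected.length === 0
          ? 0
          : Math.max(...selected.map((s) => cosine(cand.vector, s.vector)))
      const value = lambda * cand.score - (1 - lambda) * redundancy
      if (value > bestValue) {
        bestValue = value
        bestIndex = i
      }
    })
    selected.push(remaining.splice(bestIndex, 1)[0])
  }
  return selected
}

const candidates = [
  { id: 'a', vector: [1, 0], score: 0.9 },
  { id: 'a2', vector: [1, 0.01], score: 0.89 }, // near-duplicate of 'a'
  { id: 'b', vector: [0, 1], score: 0.5 },
]
console.log(mmrSelect(candidates, 2, 1.0).map((c) => c.id)) // [ 'a', 'a2' ]
console.log(mmrSelect(candidates, 2, 0.5).map((c) => c.id)) // [ 'a', 'b' ]
```

With lambda at 1.0 the near-duplicate wins on raw score; at 0.5 its similarity to the first pick is penalized enough that the less relevant but distinct record is chosen instead.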
Data Management
- Physical collections -- vectlite.openStore() manages a directory of independent databases
- Bulk ingestion -- bulkIngest() with deferred index rebuilds for fast imports
- Listing & filtered counts -- list() and count({ namespace, filter }) without a vector query
- Delete by filter -- deleteByFilter() for bulk deletion by metadata filter
- Snapshots -- db.snapshot(path) creates a self-contained copy
- Backup / Restore -- db.backup(dir) and vectlite.restore(dir, path) for full roundtrips
- Read-only mode -- vectlite.open(path, { readOnly: true }) for safe concurrent readers
- Explicit close -- db.close() to release locks deterministically
- Lock timeouts -- lockTimeout for bounded lock acquisition waits
Usage
Hybrid Search
const vectlite = require('vectlite')
const db = vectlite.open('knowledge.vdb', { dimension: 384 })
// Upsert with dense + sparse vectors
db.upsert(
  'doc1',
  denseEmbedding,
  { source: 'docs', title: 'Auth Setup', text: 'How to configure SSO...' },
  { sparse: vectlite.sparseTerms('How to configure SSO authentication') },
)
// Hybrid search
const results = db.search(queryEmbedding, {
  k: 10,
  sparse: vectlite.sparseTerms('SSO authentication'),
  fusion: 'rrf',
  filter: { source: 'docs' },
  explain: true,
})
for (const result of results) {
  console.log(result.id, result.score)
}
Collections
const store = vectlite.openStore('./my_collections')
const products = store.createCollection('products', 384)
products.upsert('p1', embedding, { name: 'Widget', price: 9.99 })
const logs = store.openOrCreateCollection('logs', 128)
console.log(store.collections()) // ["logs", "products"]
Transactions
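The commit-or-rollback pattern used in this section can be wrapped in a small helper so callers cannot forget the rollback. The sketch below relies only on the calls this README documents (db.transaction(), tx.commit(), tx.rollback()); `withTransaction` and `makeFakeDb` are hypothetical names, and the fake database exists only so the example runs without vectlite installed.

```javascript
// Hypothetical helper (not part of vectlite): run work(tx), then commit,
// or roll back and rethrow if anything fails.
function withTransaction(db, work) {
  const tx = db.transaction()
  try {
    const result = work(tx)
    tx.commit()
    return result
  } catch (err) {
    tx.rollback()
    throw err
  }
}

// Stand-in object that mimics only the calls used above and records them.
function makeFakeDb(log) {
  return {
    transaction() {
      return {
        upsert: (id) => log.push(`upsert ${id}`),
        commit: () => log.push('commit'),
        rollback: () => log.push('rollback'),
      }
    },
  }
}

const log = []
withTransaction(makeFakeDb(log), (tx) => {
  tx.upsert('doc1')
  tx.upsert('doc2')
})
console.log(log) // [ 'upsert doc1', 'upsert doc2', 'commit' ]
```

With a real database handle, the call would look like `withTransaction(db, (tx) => { tx.upsert('doc1', emb1, { source: 'a' }) })`.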
const tx = db.transaction()
try {
  tx.upsert('doc1', emb1, { source: 'a' })
  tx.upsert('doc2', emb2, { source: 'b' })
  tx.delete('old_doc')
  tx.commit() // All operations commit atomically
} catch (err) {
  tx.rollback() // Roll back on error
  throw err
}
Text Helpers
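The text helpers in this section take an embedFn callback. For wiring tests without a real model, a deterministic stand-in can be useful. The sketch below assumes embedFn receives a string and returns a plain array of numbers whose length equals the database dimension; this README does not spell out the exact signature, so treat that as an assumption. `makeHashEmbedder` is a hypothetical helper, and its vectors carry no real semantic meaning.

```javascript
// Hypothetical stand-in for an embedding model: hashes character trigrams
// into a fixed-size vector and L2-normalizes the result.
function makeHashEmbedder(dimension) {
  return function embedFn(text) {
    const vector = new Array(dimension).fill(0)
    const normalized = text.toLowerCase()
    for (let i = 0; i + 3 <= normalized.length; i++) {
      const gram = normalized.slice(i, i + 3)
      // FNV-1a style 32-bit hash of the trigram.
      let hash = 2166136261
      for (let j = 0; j < gram.length; j++) {
        hash ^= gram.charCodeAt(j)
        hash = Math.imul(hash, 16777619)
      }
      // Convert to an unsigned value before choosing a bucket.
      vector[(hash >>> 0) % dimension] += 1
    }
    const norm = Math.sqrt(vector.reduce((sum, x) => sum + x * x, 0))
    return norm === 0 ? vector : vector.map((x) => x / norm)
  }
}

const embedFn = makeHashEmbedder(384)
const sample = embedFn('Auth setup guide')
console.log(sample.length) // 384
```

The same input always produces the same vector, which keeps tests repeatable; swap in a real model before judging search quality.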
async function run() {
  // embedFn can be sync or async
  await vectlite.upsertText(db, 'doc1', 'Auth setup guide', embedFn, { source: 'docs' })
  const results = await vectlite.searchText(db, 'how to authenticate', embedFn, { k: 5 })
}
Snapshots & Backup
db.snapshot('/backups/knowledge_2024.vdb') // Self-contained copy
db.backup('/backups/full/') // Full backup with ANN sidecars
const restored = vectlite.restore('/backups/full/', 'restored.vdb')
Read-Only Mode
const ro = vectlite.open('knowledge.vdb', { readOnly: true, lockTimeout: 5 })
const results = ro.search(query, { k: 5 }) // Reads work
ro.upsert(...) // Throws VectLiteError
Listing, Counting, and Lifecycle
const db = vectlite.open('knowledge.vdb', { dimension: 384, lockTimeout: 5 })
const records = db.list({ namespace: 'docs', filter: { stale: false }, limit: 20 })
const count = db.count({ namespace: 'docs', filter: { source: 'blog' } })
const deleted = db.deleteByFilter({ stale: true }, { namespace: 'docs' })
db.close()
Search Diagnostics
const outcome = db.searchWithStats(query, {
  k: 5,
  sparse: terms,
  explain: true,
})
console.log(outcome.stats.timings) // { dense_us: 120, sparse_us: 45, ... }
console.log(outcome.stats.used_ann) // true
console.log(outcome.results[0].explain) // Detailed scoring breakdown
Database Methods Reference
Write Methods
| Method | Description |
|---|---|
| db.upsert(id, vector, metadata, options) | Insert or update a single record |
| db.insert(id, vector, metadata, options) | Insert a record (throws on duplicate id) |
| db.upsertMany(records, { namespace }) | Upsert a batch of records |
| db.insertMany(records, { namespace }) | Insert a batch |
| db.bulkIngest(records, { namespace, batchSize }) | Fastest bulk import with batched WAL writes |
| db.delete(id, { namespace }) | Delete a single record |
| db.deleteMany(ids, { namespace }) | Delete multiple records by id |
| db.deleteByFilter(filter, { namespace }) | Delete all records matching a filter |
Read Methods
| Method | Description |
|---|---|
| db.get(id, { namespace }) | Get a single record by id |
| db.search(query, options) | Search and return a list of results |
| db.searchWithStats(query, options) | Search with detailed performance stats |
| db.count({ namespace, filter }) | Count records, optionally scoped by namespace/filter |
| db.list({ namespace, filter, limit, offset }) | List records without issuing a vector query |
| db.namespaces() | List all namespaces |
| db.dimension | Vector dimension (property) |
| db.path | Database file path (property) |
| db.readOnly | Whether the database is read-only (property) |
Maintenance Methods
| Method | Description |
|---|---|
| db.compact() | Fold WAL into snapshot and persist ANN indexes |
| db.flush() | Alias for compact() |
| db.snapshot(dest) | Create a self-contained .vdb copy |
| db.backup(destDir) | Full backup including ANN sidecar files |
| db.transaction() | Begin an atomic transaction |
| db.close() | Flush pending state, release the file lock, and invalidate the handle |
Filter Operators
| Operator | Example | Description |
|---|---|---|
| $eq | { field: { $eq: 'value' } } | Equal (also { field: 'value' }) |
| $ne | { field: { $ne: 'value' } } | Not equal |
| $gt / $gte | { field: { $gt: 5 } } | Greater than (or equal) |
| $lt / $lte | { field: { $lt: 20 } } | Less than (or equal) |
| $in / $nin | { field: { $in: ['a', 'b'] } } | In / not in set |
| $contains | { field: { $contains: 'auth' } } | Substring match |
| $exists | { field: { $exists: true } } | Field presence |
| $and / $or | { $and: [{...}, {...}] } | Logical combinators |
| $not | { $not: {...} } | Logical negation |
| $elemMatch | { tags: { $elemMatch: { $eq: 'rust' } } } | Match array elements |
| $size | { tags: { $size: 3 } } | Array length |
| dot-path | { 'author.name': 'Alice' } | Nested field access |
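To illustrate how filters like those in the table compose, here is a small in-memory matcher covering a subset of the operators (implicit equality, comparisons, $in/$nin, $exists, the logical combinators, and dot-paths). It follows MongoDB-style conventions as an assumption; vectlite's own evaluator may treat edge cases such as missing fields or type mismatches differently. `matches` and `getPath` are hypothetical names.

```javascript
// Resolve a dot-path such as 'author.name' against nested metadata.
function getPath(obj, path) {
  return path.split('.').reduce((cur, key) => (cur == null ? undefined : cur[key]), obj)
}

const comparators = {
  $eq: (v, arg) => v === arg,
  $ne: (v, arg) => v !== arg,
  $gt: (v, arg) => v > arg,
  $gte: (v, arg) => v >= arg,
  $lt: (v, arg) => v < arg,
  $lte: (v, arg) => v <= arg,
  $in: (v, arg) => arg.includes(v),
  $nin: (v, arg) => !arg.includes(v),
  $exists: (v, arg) => (v !== undefined) === arg,
}

// Every top-level key must match (implicit AND).
function matches(metadata, filter) {
  return Object.entries(filter).every(([key, cond]) => {
    if (key === '$and') return cond.every((f) => matches(metadata, f))
    if (key === '$or') return cond.some((f) => matches(metadata, f))
    if (key === '$not') return !matches(metadata, cond)
    const value = getPath(metadata, key)
    // A scalar condition is shorthand for $eq.
    if (cond === null || typeof cond !== 'object') return value === cond
    return Object.entries(cond).every(([op, arg]) => {
      const compare = comparators[op]
      if (!compare) throw new Error(`operator not covered by this sketch: ${op}`)
      return compare(value, arg)
    })
  })
}

const doc = { source: 'blog', views: 12, author: { name: 'Alice' } }
console.log(matches(doc, { source: 'blog', views: { $gt: 5, $lt: 20 } })) // true
console.log(matches(doc, { 'author.name': 'Alice' })) // true
console.log(matches(doc, { $or: [{ source: 'notes' }, { views: { $gte: 100 } }] })) // false
```

Multiple operators on one field, as in `{ $gt: 5, $lt: 20 }`, are combined with AND in this sketch.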
How It Works
- Records are stored in a compact binary .vdb snapshot file
- Writes go through a crash-safe WAL (.wal) before being applied in memory
- compact() folds the WAL into the snapshot and persists HNSW sidecar files
- Dense search uses HNSW indexes (auto-built for collections above ~128 records)
- Sparse search uses an inverted index with BM25 scoring
- Hybrid fusion combines dense + sparse via linear combination or reciprocal rank fusion
- Advisory file locks (flock) prevent concurrent write corruption
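The two fusion strategies named above can be sketched in a few lines. This is a generic illustration, not vectlite's code: the RRF constant of 60 is the value commonly used in the literature, the linear weight `alpha` is a made-up parameter name, and how vectlite normalizes scores before linear fusion is not described in this README.

```javascript
// Reciprocal rank fusion: each ranked list contributes 1 / (k + rank),
// with rank starting at 1. Only positions matter, not raw scores.
function rrfFuse(rankedLists, k = 60) {
  const scores = new Map()
  for (const list of rankedLists) {
    list.forEach((id, index) => {
      scores.set(id, (scores.get(id) || 0) + 1 / (k + index + 1))
    })
  }
  return [...scores.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score)
}

// Linear fusion: weighted sum of per-document scores.
// A document missing from one side contributes 0 for that side.
function linearFuse(denseScores, sparseScores, alpha = 0.5) {
  const ids = new Set([...Object.keys(denseScores), ...Object.keys(sparseScores)])
  return [...ids]
    .map((id) => ({
      id,
      score: alpha * (denseScores[id] || 0) + (1 - alpha) * (sparseScores[id] || 0),
    }))
    .sort((a, b) => b.score - a.score)
}

const denseRanking = ['doc1', 'doc2', 'doc3']
const sparseRanking = ['doc2', 'doc4', 'doc1']
console.log(rrfFuse([denseRanking, sparseRanking]).map((r) => r.id))
// [ 'doc2', 'doc1', 'doc4', 'doc3' ]

const linear = linearFuse({ doc1: 0.9, doc2: 0.8 }, { doc2: 0.7, doc3: 0.6 })
console.log(linear.map((r) => r.id)) // [ 'doc2', 'doc1', 'doc3' ]
```

RRF needs no score normalization because it uses ranks alone, while linear fusion only behaves sensibly when dense and sparse scores are on comparable scales.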
License
MIT
