verso-db
v0.3.0
High-performance vector search with HNSW indexing for Bun, Node.js, and Browser. 100% recall, 4x memory reduction with Int8 quantization.
Verso
High-performance vector search with HNSW indexing for Bun, Node.js, and Browser.
Performance
| Metric | Value |
|--------|-------|
| Recall@10 | 100% on 768D Wikipedia embeddings |
| Query Performance | 95.5% improvement from baseline |
| Memory Reduction | 4x with Int8 quantization |
Features
- HNSW Algorithm - Hierarchical Navigable Small World for fast approximate nearest neighbor search
- Multiple Distance Metrics - Cosine similarity, Euclidean, dot product
- Int8 Quantization - 4x memory reduction with minimal recall loss
- Multi-Platform - Bun/Node.js (file system) and Browser (OPFS)
- Parameter Presets - Pre-tuned configurations for different use cases
- Batch Queries - Efficient batch processing for throughput
- Metadata Filtering - MongoDB-style query operators
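To make the three distance metrics concrete, here is a minimal brute-force sketch. It is not Verso's implementation and does not use HNSW. The helper names (`dot`, `distance`, `bruteForceQuery`) are hypothetical, and two conventions are assumed rather than taken from the library: cosine distance is computed as 1 minus cosine similarity, and the dot product is negated so that a smaller value always means a closer vector.

```typescript
type Metric = 'cosine' | 'euclidean' | 'dot_product';

// Plain dot product of two equal-length vectors.
function dot(a: Float32Array, b: Float32Array): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

// Smaller return value means "closer" for every metric (assumed convention).
function distance(metric: Metric, a: Float32Array, b: Float32Array): number {
  switch (metric) {
    case 'cosine':
      // 1 - cosine similarity
      return 1 - dot(a, b) / (Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b)));
    case 'euclidean': {
      let sum = 0;
      for (let i = 0; i < a.length; i++) sum += (a[i] - b[i]) ** 2;
      return Math.sqrt(sum);
    }
    case 'dot_product':
      // Negated so that a larger dot product ranks first.
      return -dot(a, b);
  }
}

// Exact nearest-neighbor search by scanning every vector.
function bruteForceQuery(
  metric: Metric,
  ids: string[],
  vectors: Float32Array[],
  query: Float32Array,
  k: number
): string[] {
  return ids
    .map((id, i) => ({ id, d: distance(metric, vectors[i], query) }))
    .sort((x, y) => x.d - y.d)
    .slice(0, k)
    .map((r) => r.id);
}

const ids = ['a', 'b', 'c'];
const vectors = [
  new Float32Array([1, 0, 0]),
  new Float32Array([0, 1, 0]),
  new Float32Array([0, 0, 1])
];
const top2 = bruteForceQuery('cosine', ids, vectors, new Float32Array([1, 0.1, 0]), 2);
console.log(top2); // ['a', 'b']
```

A scan like this is exact but costs one distance computation per stored vector. An HNSW index trades a small chance of missing a true neighbor for visiting far fewer vectors per query.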
Installation
# Bun
bun add verso-db
# npm / Node.js
npm install verso-db
Quick Start
import { VectorDB } from 'verso-db';
const db = new VectorDB({ storagePath: './my_vectors' });
const collection = await db.createCollection('docs', { dimension: 3 });
await collection.add({
ids: ['a', 'b', 'c'],
vectors: [new Float32Array([1, 0, 0]), new Float32Array([0, 1, 0]), new Float32Array([0, 0, 1])]
});
const results = await collection.query({ queryVector: new Float32Array([1, 0.1, 0]), k: 2 });
console.log(results.ids); // ['a', 'b']
await db.close();
Only dimension is required. The defaults are metric: 'cosine', M: 16, and efConstruction: 200. Use presets to tune these for your use case.
With All Options
import { VectorDB, getRecommendedPreset } from 'verso-db';
const db = new VectorDB({ storagePath: './my_vectors' });
const preset = getRecommendedPreset(768);
const collection = await db.createCollection('documents', {
dimension: 768,
metric: 'cosine', // 'cosine' | 'euclidean' | 'dot_product'
M: preset.M, // max connections per node
efConstruction: preset.efConstruction // build-time search depth
});
await collection.add({
ids: ['doc1', 'doc2', 'doc3'],
vectors: [
new Float32Array(768).fill(0.1),
new Float32Array(768).fill(0.2),
new Float32Array(768).fill(0.3)
],
metadata: [
{ title: 'Document 1', category: 'tech' },
{ title: 'Document 2', category: 'science' },
{ title: 'Document 3', category: 'tech' }
]
});
const results = await collection.query({
queryVector: new Float32Array(768).fill(0.15),
k: 10,
efSearch: preset.efSearch, // query-time search depth
filter: { category: 'tech' } // MongoDB-style metadata filter
});
console.log(results.ids); // ['doc1', 'doc3']
console.log(results.distances); // illustrative values, e.g. [0.01, 0.12]
console.log(results.metadata); // [{ title: 'Document 1', ... }, ...]
await db.close();
API Reference
VectorDB
Main database class for managing collections.
const db = new VectorDB({ storagePath: './vectors' });
// Create collection
const collection = await db.createCollection('name', {
dimension: 768,
metric: 'cosine', // 'cosine' | 'euclidean' | 'dot_product'
M: 16, // Max connections per node
efConstruction: 200 // Build-time search depth
});
// Get existing collection (returns undefined if not found)
const existing = await db.getCollection('name');
// List collections
const names = await db.listCollections();
// Delete collection
await db.deleteCollection('name');
// Close database
await db.close();
Collection
// Add vectors
await collection.add({
ids: ['id1', 'id2'],
vectors: [new Float32Array([...]), new Float32Array([...])],
metadata: [{ key: 'value' }, { key: 'value' }] // optional
});
// Upsert vectors (update existing, add new)
await collection.upsert({
ids: ['id1', 'id3'], // id1 updated, id3 added
vectors: [new Float32Array([...]), new Float32Array([...])],
metadata: [{ key: 'new_value' }, { key: 'value' }]
});
// Query — returns { ids: string[], distances: number[], metadata: object[] }
const results = await collection.query({
queryVector: new Float32Array([...]),
k: 10,
efSearch: 200, // optional
filter: { category: 'tech' } // optional
});
// Batch query
const batchResults = await collection.queryBatch([
{ queryVector: vec1, k: 10 },
{ queryVector: vec2, k: 5, filter: { type: 'article' } }
]);
// Delete a single vector
await collection.delete('id1');
// Delete multiple vectors
await collection.deleteBatch(['id1', 'id2']);
// Check existence
collection.has('id1'); // true if active (not deleted)
collection.count(); // number of active vectors
// Flush pending writes when using Collection directly
await collection.flush();
// Compact — permanently remove deleted vectors and reclaim space
await collection.compact();
Metadata Filtering
const results = await collection.query({
queryVector: queryVec,
k: 10,
filter: {
category: 'tech', // Exact match
score: { $gt: 0.5 }, // Greater than
tags: { $in: ['ai', 'ml'] } // In array
}
});
Supported operators: $eq, $ne, $gt, $gte, $lt, $lte, $in, $nin
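To illustrate how the operators combine, the sketch below evaluates a filter against plain metadata objects. It is a hypothetical matcher, not Verso's code: `matchesCondition` and `matchesFilter` are invented names, and several behaviors are assumptions borrowed from MongoDB conventions. Specifically, all keys must match (implicit AND), comparison operators apply to numbers only, and `$in` / `$nin` test each element when the field holds an array. The `score` and `tags` fields are illustrative additions to the earlier example documents.

```typescript
type Scalar = string | number | boolean;
type Operators = {
  $eq?: Scalar; $ne?: Scalar;
  $gt?: number; $gte?: number; $lt?: number; $lte?: number;
  $in?: Scalar[]; $nin?: Scalar[];
};
type Filter = Record<string, Scalar | Operators>;
type Metadata = Record<string, unknown>;

// Check one field value against one condition.
function matchesCondition(value: unknown, cond: Scalar | Operators): boolean {
  // A bare value means exact match.
  if (typeof cond !== 'object') return value === cond;
  // Non-numeric values become NaN, so every numeric comparison fails.
  const num = typeof value === 'number' ? value : NaN;
  // Treat array fields element-wise for $in / $nin (assumed behavior).
  const values: unknown[] = Array.isArray(value) ? value : [value];
  const inList = (list: Scalar[]) => values.some((v) => list.includes(v as Scalar));
  return (
    (cond.$eq === undefined || value === cond.$eq) &&
    (cond.$ne === undefined || value !== cond.$ne) &&
    (cond.$gt === undefined || num > cond.$gt) &&
    (cond.$gte === undefined || num >= cond.$gte) &&
    (cond.$lt === undefined || num < cond.$lt) &&
    (cond.$lte === undefined || num <= cond.$lte) &&
    (cond.$in === undefined || inList(cond.$in)) &&
    (cond.$nin === undefined || !inList(cond.$nin))
  );
}

// A document matches when every key in the filter matches.
function matchesFilter(meta: Metadata, filter: Filter): boolean {
  return Object.entries(filter).every(([key, cond]) => matchesCondition(meta[key], cond));
}

const docs: Metadata[] = [
  { title: 'Document 1', category: 'tech', score: 0.9, tags: ['ai'] },
  { title: 'Document 2', category: 'science', score: 0.7, tags: ['ml'] },
  { title: 'Document 3', category: 'tech', score: 0.3, tags: ['ai', 'db'] }
];
const techFilter: Filter = {
  category: 'tech',
  score: { $gt: 0.5 },
  tags: { $in: ['ai', 'ml'] }
};
const matched = docs.filter((m) => matchesFilter(m, techFilter)).map((m) => m.title);
console.log(matched); // ['Document 1']
```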
Parameter Presets
import { getRecommendedPreset, getRAGPreset, PRESETS } from 'verso-db';
// Automatic preset based on dimensions
const preset = getRecommendedPreset(768);
// RAG-optimized preset (high recall) by embedding model name
const ragPreset = getRAGPreset('text-embedding-3-large');
// Named presets (accessed by string key)
PRESETS['low-dim'] // <= 128 dimensions
PRESETS['medium-dim'] // 256-512 dimensions
PRESETS['high-dim'] // 768+ dimensions
PRESETS['very-high-dim'] // 1536+ dimensions
PRESETS['small-dataset'] // < 10k vectors
PRESETS['large-dataset'] // 100k+ vectors
PRESETS['max-recall'] // Prioritize accuracy
PRESETS['low-latency'] // Prioritize speed
Int8 Quantization
Reduce memory usage by 4x with minimal recall loss:
import { ScalarQuantizer, QuantizedVectorStore } from 'verso-db';
// Create quantizer
const quantizer = new ScalarQuantizer(768);
// Train on sample vectors
quantizer.train(sampleVectors);
// Quantize vectors
const quantized = quantizer.quantize(vector);
// Use QuantizedVectorStore for compact in-memory storage
const store = new QuantizedVectorStore(768);
store.addVectors(sampleVectors);
console.log(store.size(), store.memoryUsage());
Storage Backends
Verso automatically selects the appropriate storage backend:
| Environment | Backend | Storage |
|-------------|---------|---------|
| Bun / Node.js | BunStorageBackend | File system |
| Browser | OPFSBackend | Origin Private File System |
| Fallback | MemoryBackend | In-memory (no persistence) |
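The selection in the table above could work along the following lines. This is only a sketch of runtime feature detection under stated assumptions, not Verso's actual logic: `detectStorageType` is a hypothetical helper, and it takes the global object as a parameter purely so the behavior can be exercised outside a browser. It relies on two standard facts: Bun and Node.js both expose `process.versions`, and browsers with OPFS support expose `navigator.storage.getDirectory`.

```typescript
type StorageType = 'bun' | 'opfs' | 'memory';

// Hypothetical helper: decide which backend type fits the current runtime.
function detectStorageType(g: any = globalThis): StorageType {
  // Bun sets process.versions.bun; Node.js sets process.versions.node.
  if (g.process?.versions?.bun || g.process?.versions?.node) return 'bun';
  // navigator.storage.getDirectory() is the entry point to OPFS.
  if (typeof g.navigator?.storage?.getDirectory === 'function') return 'opfs';
  // Nothing persistent is available, so fall back to memory.
  return 'memory';
}

// Simulated environments for illustration.
const onNode = detectStorageType({ process: { versions: { node: '20.0.0' } } });
const inBrowser = detectStorageType({ navigator: { storage: { getDirectory: async () => ({}) } } });
const fallback = detectStorageType({});
console.log(onNode, inBrowser, fallback); // bun opfs memory
```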
import { createStorageBackend, getRecommendedStorageType } from 'verso-db';
// Automatic detection
const backend = await createStorageBackend({
path: './vectors'
});
// Check available types
const type = getRecommendedStorageType(); // 'bun' | 'opfs' | 'memory'
Benchmarks
See docs/BENCHMARKS.md for detailed performance analysis.
Quick Summary (768D vectors, Cohere Wikipedia dataset):
- 100% Recall@10 with optimized parameters
- 95.5% query performance improvement through optimizations
- 4x memory reduction with Int8 quantization
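The 4x figure follows from element size: a 768D Float32Array takes 768 × 4 = 3072 bytes, while the same vector stored as Int8 takes 768 bytes. The sketch below shows one common way to do scalar quantization, using per-dimension min/max ranges learned from sample vectors. It is an assumption about the general technique, not a description of how Verso's ScalarQuantizer works internally, and `trainRanges`, `quantize`, and `dequantize` are hypothetical names.

```typescript
interface Ranges {
  min: Float32Array;   // per-dimension minimum seen in training
  scale: Float32Array; // per-dimension step size between adjacent levels
}

// Learn per-dimension ranges from sample vectors.
function trainRanges(samples: Float32Array[], dim: number): Ranges {
  const min = new Float32Array(dim).fill(Infinity);
  const max = new Float32Array(dim).fill(-Infinity);
  for (const v of samples) {
    for (let i = 0; i < dim; i++) {
      if (v[i] < min[i]) min[i] = v[i];
      if (v[i] > max[i]) max[i] = v[i];
    }
  }
  const scale = new Float32Array(dim);
  // 256 levels span the range; use 1 for constant dimensions to avoid dividing by zero.
  for (let i = 0; i < dim; i++) scale[i] = (max[i] - min[i]) / 255 || 1;
  return { min, scale };
}

// Map each float to one of 256 levels, stored as a signed byte.
function quantize(v: Float32Array, r: Ranges): Int8Array {
  const out = new Int8Array(v.length);
  for (let i = 0; i < v.length; i++) {
    const level = Math.round((v[i] - r.min[i]) / r.scale[i]);
    out[i] = Math.min(255, Math.max(0, level)) - 128; // clamp, then shift to -128..127
  }
  return out;
}

// Approximate reconstruction of the original floats.
function dequantize(q: Int8Array, r: Ranges): Float32Array {
  const out = new Float32Array(q.length);
  for (let i = 0; i < q.length; i++) out[i] = (q[i] + 128) * r.scale[i] + r.min[i];
  return out;
}

const dim = 768;
// Synthetic sample data with values in [-1, 1].
const samples = Array.from({ length: 100 }, (_, s) =>
  new Float32Array(dim).map((_, i) => Math.sin(s * 31 + i))
);
const ranges = trainRanges(samples, dim);
const quantized = quantize(samples[0], ranges);
const restored = dequantize(quantized, ranges);
const maxError = Math.max(...Array.from(restored, (x, i) => Math.abs(x - samples[0][i])));
console.log(samples[0].byteLength, quantized.byteLength); // 3072 768
console.log(maxError < 0.01); // true
```

The per-dimension ranges are shared by every vector, so their overhead becomes negligible as the collection grows. The reconstruction error is bounded by half a quantization step per element, which is why recall typically drops only slightly.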
Development
# Install dependencies
bun install
# Run tests
bun test
# Run browser tests
bun run test:browser
# Build
bun run build
# Run benchmarks
bun run bench
# Recall benchmark
bun run benchmark:recall
# Storage benchmark
bun run benchmark:storage
# Comprehensive benchmark suite
bun run benchmark:comprehensive
License
MIT
