sqlite-hybrid
v0.1.3
Published
Vector search and FTS5 hybrid search on a user-provided better-sqlite3 connection
Downloads
591
Maintainers
Readme
sqlite-hybrid
Vector search + FTS5 hybrid search on a user-provided better-sqlite3 (or better-sqlite3-multiple-ciphers) connection. Wraps sqlite-vec for vector operations and SQLite's built-in FTS5 for keyword search, then fuses both via Reciprocal Rank Fusion (RRF).
import Database from 'better-sqlite3-multiple-ciphers';
import { SqliteHybrid } from 'sqlite-hybrid';
import { Embedder } from 'hf-embedder';
const embedder = Embedder.createSync({
model: 'Xenova/multilingual-e5-small',
dtype: 'q4',
pooling: 'mean',
normalize: true,
});
const db = new Database(':memory:');
const hybrid = new SqliteHybrid(db, {
vectorSize: 384,
onEmbed: (text) => embedder.embedSync(text),
});
db.exec(`CREATE TABLE reviews (
id INTEGER PRIMARY KEY,
product TEXT NOT NULL,
data TEXT NOT NULL
)`);
// Index values inside JSON columns via json_extract
hybrid.createVectorIndex('reviews', "json_extract(data, '$.review')");
hybrid.createFTS5('reviews', "json_extract(data, '$.review')");
hybrid.createVectorIndex('reviews', "json_extract(data, '$.category')");
hybrid.createFTS5('reviews', "json_extract(data, '$.category')");
// Triggers compute embeddings and sync FTS5 immediately
const ins = db.prepare(`INSERT INTO reviews (id, product, data) VALUES (?, ?, ?)`);
ins.run(1, 'Noise-cancelling headphones',
JSON.stringify({ review: 'Amazing sound quality and battery life', category: 'audio, wireless' }));
ins.run(2, 'Mechanical keyboard',
JSON.stringify({ review: 'Great tactile feedback for programming', category: 'keyboard, wired' }));
ins.run(3, 'USB-C hub',
JSON.stringify({ review: 'Reliable dock for laptop connectivity', category: 'accessories, wired' }));
ins.run(4, 'Wireless mouse',
JSON.stringify({ review: 'Comfortable grip for long work sessions', category: 'mouse, wireless' }));
// Scoped hybrid search (single table)
const results = hybrid.hybridSearch('reviews', 'keyboard');
console.log(results[0].product); // "Mechanical keyboard"
// Global search across all indexed tables and fields
const global = hybrid.hybridSearch('wireless');
console.log(global[0]._table); // "reviews"
console.log(global[0]._field); // "json_extract(data, '$.category')"Features
- Vector indexes — creates
vec0companion tables via sqlite-vec. Sync triggers call youronEmbedcallback immediately during INSERT/UPDATE and write embeddings directly. - FTS5 indexes — external content (plain columns) or contentless (JSON path) FTS5 virtual tables, kept in sync via triggers.
- JSON path indexes — index values inside JSON columns via
json_extract(col, 'path'). - Hybrid search — runs vector and keyword retrievers independently, fuses via RRF:
score = 1/(k + rank_vec) + 1/(k + rank_fts). - Global search — single-argument forms discover all indexed tables/fields from the internal registry and search across them.
- Scoped search — two-argument forms restrict to a single source table.
- Encrypted databases — works with
better-sqlite3-multiple-ciphersand SQLCipher.
Installation
npm install sqlite-hybrid better-sqlite3better-sqlite3 (or better-sqlite3-multiple-ciphers) is a peer dependency.
API
new SqliteHybrid(db, options)
| Option | Type | Description |
|--------|------|-------------|
| vectorSize | number | Embedding dimension (required) |
| onEmbed | (text: string) => Float32Array \| number[] | Sync embedder for writes and search queries (required) |
Methods
| Method | Description |
|--------|-------------|
| createVectorIndex(table, field) | Create a vector index on table.field |
| createFTS5(table, field) | Create an FTS5 index on table.field |
| dropVectorIndex(table, field) | Drop a vector index |
| dropFTS5(table, field) | Drop an FTS5 index |
| vectorSearch(table, text, limit?) | Scoped vector search |
| vectorSearch(text, limit?) | Global vector search |
| keySearch(table, text, limit?) | Scoped keyword search |
| keySearch(text, limit?) | Global keyword search |
| hybridSearch(table, text, limit?) | Scoped hybrid search (vector + FTS5, RRF-fused) |
| hybridSearch(text, limit?) | Global hybrid search |
Pass json_extract(col, 'path') as field to index a value inside a JSON column. The same expression is used for drops and searches.
See docs/api.md for full reference.
How it works
createVectorIndexcreates avec0virtual table and installs AFTER INSERT/UPDATE/DELETE triggers. The INSERT and UPDATE triggers call_sqlite_hybrid_embed(a SQL function registered on construction that wrapsonEmbed) and write the result directly to the vec0 table.createFTS5creates an FTS5 virtual table (external content for plain columns, contentless for JSON path) with sync triggers.- For JSON path indexes, the field is parsed to extract the base column and JSON path. Companion table names include a sanitized path suffix, and triggers use
json_extract(NEW.col, 'path')automatically. - Search queries embed the query text via
onEmbed, join companion tables back to the source table by rowid, and return full records with_score. - Hybrid search runs vector and keyword queries against every indexed field on the target table, then fuses results via RRF:
score = 1/(k + rank_vec) + 1/(k + rank_fts).
License
MIT
