@omendb/omendb

v0.0.37

Published

a month ago

Fast embedded vector database with HNSW + ACORN-1 filtered search

Downloads

320


nijaru

vector database embeddings hnsw similarity-search ai machine-learning napi-rs rust

omendb

Fast embedded vector database with HNSW indexing for Node.js and Bun.

Installation

npm install @omendb/omendb

Quick Start

import { create, open } from "@omendb/omendb";

// Create a new database
const db = create("./vectors", { dense: { dim: 3 } });

// Insert vectors
db.set([
  {
    id: "doc1",
    vector: new Float32Array([0.1, 0.1, 0.1]),
    metadata: { title: "Hello" },
  },
  {
    id: "doc2",
    vector: new Float32Array([0.2, 0.2, 0.2]),
    metadata: { category: "news" },
  },
]);

// Search
const results = db.search(new Float32Array([0.15, 0.15, 0.15]), 5);
// [{ id: 'doc1', distance: 0.05, metadata: { title: 'Hello' } }, ...]

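// Batch search (async, parallel)
// `queries` is assumed here to be an array of Float32Array query vectors
const queries = [
  new Float32Array([0.1, 0.1, 0.1]),
  new Float32Array([0.2, 0.2, 0.2]),
];
const batchResults = await db.searchBatch(queries, 10);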

// Close when done (releases file locks)
db.close();

Features

HNSW indexing for fast approximate nearest neighbor search
ACORN-1 filtered search
SQ8 quantization (4x compression, ~99% recall)
Hybrid search (vector + BM25 text)
Collections for multi-tenancy
Persistent storage with auto-save
Works with Node.js 18+ and Bun
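
The 4x figure for SQ8 follows from storing one byte per dimension instead of a 4-byte float. The sketch below shows generic scalar quantization to 8 bits with a per-vector minimum and step size. It is an illustration only: `quantizeSq8`, `dequantizeSq8`, and `QuantizedVector` are hypothetical names, and omendb's internal SQ8 scheme may differ.

```typescript
// Generic 8-bit scalar quantization sketch (not omendb's internal code).
interface QuantizedVector {
  codes: Uint8Array; // one byte per dimension
  min: number; // offset used to reconstruct values
  scale: number; // step size between adjacent codes
}

function quantizeSq8(vector: Float32Array): QuantizedVector {
  let min = Infinity;
  let max = -Infinity;
  for (let i = 0; i < vector.length; i++) {
    if (vector[i] < min) min = vector[i];
    if (vector[i] > max) max = vector[i];
  }
  // Fall back to a step of 1 for constant (or empty) vectors to avoid dividing by zero.
  const scale = max > min ? (max - min) / 255 : 1;
  const codes = new Uint8Array(vector.length);
  for (let i = 0; i < vector.length; i++) {
    codes[i] = Math.round((vector[i] - min) / scale);
  }
  return { codes, min, scale };
}

function dequantizeSq8(q: QuantizedVector): Float32Array {
  const out = new Float32Array(q.codes.length);
  for (let i = 0; i < q.codes.length; i++) {
    out[i] = q.codes[i] * q.scale + q.min;
  }
  return out;
}
```

Each reconstructed value is off by at most about half a step (`scale / 2`), which is why recall usually stays close to the unquantized index.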

API

Opening a Database

import { create, open } from "@omendb/omendb";

// Basic
const db = create("./vectors", { dense: { dim: 384 } });

// Reopen an existing store
const reopened = open("./vectors");

// In-memory
const memDb = create(":memory:", { dense: { dim: 128 } });

// Full options
const tunedDb = create("./vectors", {
  metric: "cosine", // "l2", "euclidean", "cosine", "dot", or "ip"
  dense: { dim: 768, quantization: "sq8" }, // SQ8 4x compression, ~99% recall
});

// Multi-vector stores use explicit schema
const mvdb = create(":memory:", {
  metric: "dot",
  multi: { tokenDim: 128 },
});
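
For reference, the three metric families named in the `metric` option can be sketched in plain TypeScript as follows. This shows the textbook definitions only. The helper names are hypothetical, and whether omendb reports squared L2, or negates the dot product so that smaller means closer, is not stated here and should be checked against actual results.

```typescript
// Textbook definitions of the metrics; omendb's exact conventions are an assumption.
function dot(a: Float32Array, b: Float32Array): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

// Euclidean ("l2" / "euclidean") distance.
function l2Distance(a: Float32Array, b: Float32Array): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    sum += d * d;
  }
  return Math.sqrt(sum);
}

// Cosine distance: 1 - cosine similarity. Undefined for zero-length vectors.
function cosineDistance(a: Float32Array, b: Float32Array): number {
  const norm = Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b));
  return 1 - dot(a, b) / norm;
}
```

Cosine distance ignores vector magnitude, so it suits embeddings that are not normalized. For unit-length vectors, cosine and dot product produce the same ranking.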

Core Operations

`db.set(items)`

Insert or update vectors.

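// Each item: { id: string, vector: Float32Array, metadata?: object }
db.set([
  { id: "doc1", vector: new Float32Array([0.1, 0.2, 0.3]), metadata: { title: "Hello" } },
  { id: "doc2", vector: new Float32Array([0.4, 0.5, 0.6]) }, // metadata is optional
]);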

`db.get(id)`

Get a vector by ID.

const item = db.get("doc1");
// { id: "doc1", vector: Float32Array, metadata: {...} } or null

`db.getBatch(ids)`

Get multiple vectors by ID.

const items = db.getBatch(["doc1", "doc2"]);
// [{ id, vector, metadata } | null, ...]

`db.update(id, options)`

Update a vector's data and/or metadata.

db.update("doc1", {
  vector: newVector, // Optional
  metadata: { title: "New" }, // Optional
});

`db.delete(ids)`

Delete vectors by ID.

const deleted = db.delete(["doc1", "doc2"]);
// Returns number deleted

`db.deleteByFilter(filter)`

Delete vectors matching a filter.

const deletedOld = db.deleteByFilter({ category: "old" });
const deletedDrafts = db.deleteByFilter({
  $and: [{ type: "draft" }, { age: { $gt: 30 } }],
});

Search

`db.search(query, k, options?)`

Search for k nearest neighbors (sync).

const results = db.search(queryVector, 10); // Basic
const tunedResults = db.search(queryVector, 10, {
  ef: 200, // Search quality (higher = better recall)
  filter: { category: "news" }, // Metadata filter
  maxDistance: 0.5, // Distance threshold
});
// [{ id, distance, metadata }, ...]
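
To make the roles of `k`, `filter`, and `maxDistance` concrete, here is a minimal exact (brute-force) search over an in-memory array. It is a reference sketch, not how omendb searches: the real index uses HNSW, and this version assumes Euclidean distance and takes the filter as a predicate function rather than a filter object. `exactSearch`, `Item`, and `Hit` are hypothetical names.

```typescript
interface Item {
  id: string;
  vector: Float32Array;
  metadata?: Record<string, unknown>;
}

interface Hit {
  id: string;
  distance: number;
  metadata?: Record<string, unknown>;
}

interface ExactSearchOptions {
  filter?: (metadata: Record<string, unknown>) => boolean; // predicate, unlike omendb's filter objects
  maxDistance?: number;
}

// Scan every item, keep those passing the filter and threshold, return the k closest.
function exactSearch(
  items: Item[],
  query: Float32Array,
  k: number,
  options: ExactSearchOptions = {},
): Hit[] {
  const hits: Hit[] = [];
  for (const item of items) {
    if (options.filter && !options.filter(item.metadata ?? {})) continue;
    let sum = 0;
    for (let i = 0; i < query.length; i++) {
      const d = item.vector[i] - query[i];
      sum += d * d;
    }
    const distance = Math.sqrt(sum);
    if (options.maxDistance !== undefined && distance > options.maxDistance) continue;
    hits.push({ id: item.id, distance, metadata: item.metadata });
  }
  hits.sort((a, b) => a.distance - b.distance);
  return hits.slice(0, k);
}
```

An exact scan like this is also handy as ground truth when checking the recall of approximate results on a small sample.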

`db.searchBatch(queries, k, ef?)`

Batch search with parallel execution (async).

const results = await db.searchBatch(queries, 10, 100);
// [[{ id, distance, metadata }, ...], ...]

Text & Hybrid Search

`db.enableTextSearch(config?)`

Enable text indexing for hybrid search.

db.enableTextSearch(); // Default buffer + tokenizer
db.enableTextSearch({ bufferMb: 128 }); // Custom buffer size
db.enableTextSearch({ tokenizer: "code" }); // Code-aware tokenization
db.enableTextSearch({ writerBufferMb: 64, tokenizer: "raw" });

`db.hasTextSearch`

Check if text search is enabled.

if (db.hasTextSearch) { ... }

`db.setWithText(items)`

Insert vectors with text content.

db.setWithText([
  { id: "doc1", vector: vec, text: "Machine learning tutorial", metadata: {...} }
]);

`db.searchText(query, k)`

BM25 text-only search.

const results = db.searchText("machine learning", 10);
// [{ id, score, metadata }, ...]

`db.searchHybrid(queryVector, queryText, k, options?)`

Combined vector + text search using Reciprocal Rank Fusion.

// Basic
const results = db.searchHybrid(queryVector, "machine learning", 10);

// With options
const weightedResults = db.searchHybrid(queryVector, "machine learning", 10, {
  alpha: 0.7, // 0=text only, 1=vector only (default: 0.5)
  rrfK: 60, // RRF constant (default: 60)
  filter: { category: "ml" },
  subscores: true, // Include separate scores
});
// [{ id, score, metadata, keywordScore?, semanticScore? }, ...]
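
Reciprocal Rank Fusion scores each document by summing `1 / (rrfK + rank)` over the rankings it appears in. The sketch below is one plausible way to fold in `alpha`: it weights the vector ranking by `alpha` and the text ranking by `1 - alpha`. That weighting is an assumption rather than omendb's documented formula, and `reciprocalRankFusion` is a hypothetical helper name.

```typescript
interface FusedHit {
  id: string;
  score: number;
}

// Weighted RRF over two ranked ID lists (best match first in each list).
function reciprocalRankFusion(
  vectorRanking: string[],
  textRanking: string[],
  alpha = 0.5,
  rrfK = 60,
): FusedHit[] {
  const scores = new Map<string, number>();
  const accumulate = (ranking: string[], weight: number): void => {
    ranking.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + weight / (rrfK + rank));
    });
  };
  accumulate(vectorRanking, alpha);
  accumulate(textRanking, 1 - alpha);

  const fused: FusedHit[] = [];
  scores.forEach((score, id) => fused.push({ id, score }));
  return fused.sort((a, b) => b.score - a.score);
}
```

Because RRF uses ranks rather than raw scores, it needs no normalization between BM25 scores and vector distances. A larger `rrfK` flattens the gap between top-ranked and lower-ranked results.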

Collections

`db.collection(name)`

Get or create a named collection.

const users = db.collection("users");
users.set([...]);
users.search(query, 5);

`db.collections()`

List all collections.

const names = db.collections();
// ["users", "products", ...]

`db.deleteCollection(name)`

Delete a collection.

db.deleteCollection("old_collection");

Properties

db.length; // Number of vectors
db.dimensions; // Vector dimensionality
db.efSearch; // Get/set search quality parameter

db.efSearch = 200; // Tune for better recall
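
When tuning `efSearch`, it helps to measure recall rather than guess. A common approach is to compare the IDs returned by the approximate search against exact nearest neighbors for the same query. The helper below is a minimal sketch of that comparison. `recallAtK` is a hypothetical name, and it assumes each list contains unique IDs.

```typescript
// Fraction of the true top-k IDs that the approximate search also returned.
function recallAtK(approxIds: string[], exactIds: string[]): number {
  if (exactIds.length === 0) return 1; // nothing to find
  const truth = new Set(exactIds);
  let found = 0;
  for (let i = 0; i < approxIds.length; i++) {
    if (truth.has(approxIds[i])) found++;
  }
  return found / exactIds.length;
}
```

Averaging this value over a sample of queries at several `efSearch` settings shows where extra search effort stops improving recall.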

Utility Methods

`db.count(filter?)`

Count vectors, optionally with filter.

const total = db.count();
const filtered = db.count({ category: "news" });

`db.isEmpty()`

Check if database is empty.

`db.exists(id)`

Check if an ID exists.

if (db.exists("doc1")) { ... }

`db.ids()`

Get all vector IDs.

const allIds = db.ids();

`db.items()`

Get all vectors with metadata.

const allItems = db.items();
// [{ id, vector, metadata }, ...]

`db.stats()`

Get index statistics.

const stats = db.stats();
// { numVectors, dimensions, maxLevel, avgNeighborsL0, ... }

Persistence

`db.flush()`

Force write pending changes to disk.

db.flush();

`db.compact()`

Remove deleted records and reclaim space.

const removed = db.compact();

`db.optimize()`

Reorder graph for better cache locality (6-40% speedup).

const reordered = db.optimize();

`db.close()`

Close database and release file locks.

db.close();
// Can now reopen the same path

`db.mergeFrom(other)`

Merge another database into this one.

const merged = db.mergeFrom(otherDb);

Filter Operators

// Equality
{ field: "value" }                    // Shorthand
{ field: { $eq: "value" } }           // Explicit

// Comparison
{ field: { $ne: "value" } }           // Not equal
{ field: { $gt: 10 } }                // Greater than
{ field: { $gte: 10 } }               // Greater or equal
{ field: { $lt: 10 } }                // Less than
{ field: { $lte: 10 } }               // Less or equal

// Membership
{ field: { $in: ["a", "b"] } }        // In list
{ field: { $nin: ["a", "b"] } }       // Not in list

// Logical
{ $and: [{...}, {...}] }              // AND
{ $or: [{...}, {...}] }               // OR
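
The semantics of these operators can be sketched as a small evaluator over a metadata object. This is an illustration of how such filters are typically interpreted, not omendb's implementation: `matches` and `matchesCondition` are hypothetical names, comparisons are assumed to apply to numbers only, and a missing field is assumed to satisfy `$ne` and `$nin`.

```typescript
type Metadata = Record<string, unknown>;
type Filter = Record<string, unknown>;

// Evaluate one field condition: a bare value (shorthand equality) or an operator object.
function matchesCondition(value: unknown, condition: unknown): boolean {
  if (condition === null || typeof condition !== "object" || Array.isArray(condition)) {
    return value === condition;
  }
  const ops = condition as Record<string, unknown>;
  return Object.keys(ops).every((op) => {
    const operand = ops[op];
    const bothNumbers = typeof value === "number" && typeof operand === "number";
    switch (op) {
      case "$eq": return value === operand;
      case "$ne": return value !== operand;
      case "$gt": return bothNumbers && (value as number) > (operand as number);
      case "$gte": return bothNumbers && (value as number) >= (operand as number);
      case "$lt": return bothNumbers && (value as number) < (operand as number);
      case "$lte": return bothNumbers && (value as number) <= (operand as number);
      case "$in": return Array.isArray(operand) && operand.indexOf(value) !== -1;
      case "$nin": return Array.isArray(operand) && operand.indexOf(value) === -1;
      default: throw new Error("Unsupported operator: " + op);
    }
  });
}

// A filter matches when every top-level key matches (implicit AND).
function matches(metadata: Metadata, filter: Filter): boolean {
  return Object.keys(filter).every((key) => {
    const condition = filter[key];
    if (key === "$and") return (condition as Filter[]).every((f) => matches(metadata, f));
    if (key === "$or") return (condition as Filter[]).some((f) => matches(metadata, f));
    return matchesCondition(metadata[key], condition);
  });
}
```

For example, `matches({ type: "draft", age: 31 }, { $and: [{ type: "draft" }, { age: { $gt: 30 } }] })` evaluates to `true` under these assumptions.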

Performance

The Node bindings call the same Rust core as the Rust and Python APIs, so authoritative ANN benchmark numbers are tracked once, at the repository root. They come from the shared SIFT benchmark and are reported as medians measured on Fedora Linux.

Use the shared benchmark for comparable performance claims:

cd python && uv run python benchmark.py
cd python && uv run python benchmark.py --publish

Older Apple M3 Max wrapper numbers were local reference points only and are no longer treated as the authoritative baseline.

License

Elastic License 2.0
