eigen-db
v5.0.0
Eigen DB
High-performance vector database for the web, powered by WebAssembly.
eigen-db stores and queries embedding vectors in-browser, using:
- Pluggable storage backends (in-memory by default, OPFS for browser persistence)
- WASM SIMD for fast compute when available
- JavaScript fallback when WASM SIMD is unavailable
Install
npm install eigen-db
Guide: Set up and query
Open a database
import { DB } from "eigen-db";
// In-memory (default) — no persistence, great for ephemeral sessions
const db = await DB.open({
  dimensions: 1536, // required
  normalize: true, // optional, defaults to true
});
For browser persistence, mount an OPFS storage backend:
import { DB, OPFSStorageProvider } from "eigen-db";
const db = await DB.open({
  dimensions: 1536,
  storage: new OPFSStorageProvider("my-index"), // persistent OPFS directory
});
Insert vectors
db.set("doc:1", embedding1);
db.set("doc:2", embedding2);
db.setMany([
  ["doc:3", embedding3],
  ["doc:4", embedding4],
]);
Notes:
- Each vector must be a number[] (or Float32Array) with exactly dimensions elements.
- Duplicate keys use last-write-wins semantics.
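The two notes above can be illustrated outside the library. This is a minimal sketch: assertDimensions is a hypothetical helper, not part of eigen-db (the library performs its own check and throws on mismatch), and a plain Map stands in for the database to show last-write-wins behavior.

```typescript
// Hypothetical helper, not part of eigen-db: checks a vector's length before insertion.
type VectorLike = number[] | Float32Array;

function assertDimensions(vector: VectorLike, dimensions: number): void {
  if (vector.length !== dimensions) {
    throw new Error(`expected ${dimensions} elements, got ${vector.length}`);
  }
}

// Last-write-wins, illustrated with a plain Map standing in for the database.
const store = new Map<string, VectorLike>();
store.set("doc:1", [1, 0, 0]);
store.set("doc:1", [0, 1, 0]); // overwrites the previous value for "doc:1"
```

Running a check like this before set() lets an application report a bad embedding early with its own error message.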
Look up, check, and remove vectors
db.get("doc:1"); // number[] | undefined
db.has("doc:1"); // true
db.delete("doc:1"); // true (removed), false (not found)
db.dimensions; // configured vector dimensions
db.size; // number of entries
Iterate over the database
// Iterate over all keys
for (const key of db.keys()) {
  console.log(key);
}
// Iterate over all [key, value] pairs
for (const [key, vector] of db.entries()) {
  console.log(key, vector);
}
// Spread into an array (uses Symbol.iterator, same as entries())
const all = [...db];
Query nearest vectors
const queryVector = embeddingQuery;
// Returns a plain array of { key, similarity } sorted by descending similarity
const results = db.query(queryVector, { limit: 10 });
for (const { key, similarity } of results) {
  console.log(key, similarity);
}
For lazy iteration (useful for pagination or early stopping):
const results = db.query(queryVector, { limit: 100, iterable: true });
// Iterate and break early; keys are resolved on demand
for (const { key, similarity } of results) {
  if (similarity < 0.5) break;
  console.log(key, similarity);
}
// Or spread into an array when you need all results
const all = [...results];
Use minSimilarity and maxSimilarity to filter results by a similarity range:
// Only return results with similarity ≥ 0.7 (inclusive)
const atLeast = db.query(queryVector, { minSimilarity: 0.7 });
// Only return results with similarity ≤ 0.5 (inclusive)
const atMost = db.query(queryVector, { maxSimilarity: 0.5 });
// Combine both for a range
const inRange = db.query(queryVector, { minSimilarity: 0.3, maxSimilarity: 0.8 });
Use order: "ascend" to get the least similar results first (bottom-K):
// Least similar results first
const bottomK = db.query(queryVector, { order: "ascend", limit: 10 });
Persist and lifecycle
await db.flush(); // persist current state
await db.close(); // flush + mark closed
To delete all vectors and storage:
await db.clear();
Export and import
Export the entire database as a streaming binary file:
const stream = await db.export(); // ReadableStream<Uint8Array>
// In a browser — download as a file
const response = new Response(stream);
const blob = await response.blob();
const url = URL.createObjectURL(blob);
const a = document.createElement("a");
a.href = url;
a.download = "database.bin";
a.click();
Import from a stream, replacing all existing data:
// From a File (e.g., <input type="file">)
await db.import(file.stream());
// From a fetch response
const res = await fetch("/path/to/database.bin");
await db.import(res.body!);
Notes:
- import() replaces all existing data in the target database.
- A dimension check is performed on import: the stream must contain data exported from a database with the same dimensions setting.
- Both methods use the Web Streams API to avoid large heap allocations; vectors are streamed in 64KB chunks.
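To illustrate why chunked streaming keeps allocations small, the sketch below slices a buffer into 64 KiB views with a generator. The name chunkBytes and the sample data are assumptions for illustration only; eigen-db's actual binary format and chunking code are not described in this README.

```typescript
const CHUNK_SIZE = 64 * 1024; // 64 KiB, the chunk size mentioned in the notes above

// Yields views over the source buffer without copying it.
function* chunkBytes(
  bytes: Uint8Array,
  chunkSize: number = CHUNK_SIZE,
): Generator<Uint8Array> {
  for (let offset = 0; offset < bytes.length; offset += chunkSize) {
    yield bytes.subarray(offset, Math.min(offset + chunkSize, bytes.length));
  }
}

// 100 vectors of 1536 float32 values = 614,400 bytes
const vectors = new Float32Array(100 * 1536);
const raw = new Uint8Array(vectors.buffer);
const chunkSizes: number[] = [];
for (const chunk of chunkBytes(raw)) {
  chunkSizes.push(chunk.length);
}
// chunkSizes holds nine full 65,536-byte chunks followed by one 24,576-byte chunk
```

Because each chunk is a view rather than a copy, a consumer only ever handles one 64 KiB window at a time.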
Similarity metric
Similarity is the dot product of the query and stored vectors.
- With normalization enabled (the default): vectors are L2-normalized before storage and query, so the dot product equals cosine similarity. Similarity ranges from 1 (identical) to -1 (opposite), with 0 indicating orthogonal vectors.
- With normalization disabled (normalize: false): the dot product is computed on raw vectors. The range depends on the magnitude of your vectors. Use this mode when your vectors are already normalized or when you want raw dot-product semantics.
When to normalize:
| Scenario | Normalize? | Notes |
| ------------------------------------------ | ---------------- | --------------------------------------------------------------------------- |
| Using embeddings from OpenAI, Cohere, etc. | true (default) | Embeddings may not be unit-length; normalization ensures cosine similarity. |
| Vectors are already unit-length | Either | Setting false avoids redundant work. |
| You need raw dot-product semantics | false | Similarity will be the raw dot product; range depends on vector magnitudes. |
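The relationship between the two modes can be checked with a few lines of arithmetic. This is a standalone sketch of the math, not the library's implementation; dot and l2Normalize are illustrative helper names.

```typescript
function dot(a: number[], b: number[]): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

// Scales a vector to unit length; a zero vector is returned unchanged.
function l2Normalize(v: number[]): number[] {
  const norm = Math.sqrt(dot(v, v));
  return norm === 0 ? v.slice() : v.map((x) => x / norm);
}

const a = [3, 4]; // length 5
const b = [4, 3]; // length 5

const rawDot = dot(a, b); // 24: depends on the vectors' magnitudes
const cosine = dot(l2Normalize(a), l2Normalize(b)); // 24 / (5 * 5), approximately 0.96
```

With normalization the score stays within [-1, 1] regardless of input scale, which is what makes thresholds such as minSimilarity: 0.7 portable across embedding models.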
Full API Reference
Exports
export { DB };
export type { ResultItem };
export { VectorCapacityExceededError };
export type { OpenOptions, OpenOptionsInternal, SetOptions, QueryOptions, VectorInput };
export { InMemoryStorageProvider, OPFSStorageProvider };
export type { StorageProvider };
DB
DB.open(options)
static open(options: OpenOptions): Promise<DB>
static open(options: OpenOptionsInternal): Promise<DB>
Opens (or creates) a database instance and loads persisted data.
Properties
- size: number (current number of key-vector pairs)
- dimensions: number (number of dimensions per vector)
Methods
- set(key: string, value: VectorInput, options?: SetOptions): void
  - Inserts or overwrites a vector.
  - Throws on dimension mismatch.
- get(key: string): number[] | undefined
  - Returns a copy of the stored vector.
- has(key: string): boolean
  - Returns true if the key exists, false otherwise. O(1) lookup.
- delete(key: string): boolean
  - Removes the entry for the given key. Returns true if the key existed, false otherwise.
- setMany(entries: [string, VectorInput][]): void
  - Batch insert/update.
- getMany(keys: string[]): (number[] | undefined)[]
  - Batch lookup.
- keys(): IterableIterator<string>
  - Returns an iterable of all keys.
- entries(): IterableIterator<[string, number[]]>
  - Returns an iterable of [key, value] pairs. Values are plain number array copies.
- [Symbol.iterator](): IterableIterator<[string, number[]]>
  - Same as entries(). Enables [...db] and for...of iteration.
- query(value: VectorInput, options?: QueryOptions): ResultItem[]
  - Returns results sorted by descending similarity as a plain array.
  - Throws on dimension mismatch.
- query(value: VectorInput, options: QueryOptions & { iterable: true }): Iterable<ResultItem>
  - With { iterable: true }, returns a lazy iterable. Keys are resolved only as each item is consumed, enabling early stopping and pagination.
  - Throws on dimension mismatch.
- flush(): Promise<void>
  - Persists in-memory state to storage.
- close(): Promise<void>
  - Flushes and closes the instance.
  - Subsequent operations throw.
- clear(): Promise<void>
  - Clears in-memory state and destroys storage for this DB.
- export(): Promise<ReadableStream<Uint8Array>>
  - Exports the entire database as a streaming binary. Vectors are streamed in 64KB chunks.
- import(stream: ReadableStream<Uint8Array>): Promise<void>
  - Imports data from a stream, replacing all existing data.
  - Throws on dimension mismatch between the stream data and the database.
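The lazy form of query() can be pictured as a generator over pre-sorted scores that looks up each key only when the consumer asks for the next item. The sketch below shows that general technique under stated assumptions; it is not eigen-db's internal code, and lazyResults, slotKeys, and the resolved counter are hypothetical names.

```typescript
interface ResultItem {
  key: string;
  similarity: number;
}

let resolved = 0; // counts how many keys were actually looked up

// Hypothetical: scores are [slot, similarity] pairs already sorted by descending similarity.
function* lazyResults(
  scores: Array<[number, number]>,
  slotKeys: string[],
): Generator<ResultItem> {
  for (const [slot, similarity] of scores) {
    resolved++;
    yield { key: slotKeys[slot], similarity };
  }
}

const slotKeys = ["doc:1", "doc:2", "doc:3", "doc:4"];
const scores: Array<[number, number]> = [[2, 0.9], [0, 0.7], [3, 0.4], [1, 0.1]];

const kept: string[] = [];
for (const { key, similarity } of lazyResults(scores, slotKeys)) {
  if (similarity < 0.5) break; // early stop: the remaining key is never resolved
  kept.push(key);
}
```

Here three of the four keys are resolved before the loop breaks; with a large result set and an early cutoff, the saving grows accordingly.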
ResultItem
interface ResultItem {
  key: string;
  similarity: number;
}
similarity: the dot product of query and stored vectors. With normalization (default), this is cosine similarity: 1 = identical, -1 = opposite.
Option types
OpenOptions
interface OpenOptions {
  dimensions: number; // vector size
  normalize?: boolean; // default: true
  storage?: StorageProvider; // default: InMemoryStorageProvider
}
OpenOptionsInternal
Advanced/testing override options.
interface OpenOptionsInternal extends OpenOptions {
  wasmBinary?: Uint8Array | null;
}
wasmBinary:
- Uint8Array: use provided precompiled WASM
- null: force JavaScript-only compute
- omitted: use embedded SIMD binary
SetOptions
interface SetOptions {
  normalize?: boolean;
}
QueryOptions
interface QueryOptions {
  limit?: number; // default: Infinity (all results)
  order?: "ascend" | "descend"; // default: "descend" (most similar first)
  minSimilarity?: number; // inclusive lower bound on similarity; results below this are excluded
  maxSimilarity?: number; // inclusive upper bound on similarity; results above this are excluded
  normalize?: boolean;
  iterable?: boolean; // when true, returns Iterable<ResultItem> instead of ResultItem[]
}
Storage
StorageProvider
interface StorageProvider {
  readAll(fileName: string): Promise<Uint8Array>;
  append(fileName: string, data: Uint8Array): Promise<void>;
  write(fileName: string, data: Uint8Array): Promise<void>;
  destroy(): Promise<void>;
}
OPFSStorageProvider
Browser persistence provider backed by OPFS.
new OPFSStorageProvider(dirName: string)
InMemoryStorageProvider
Non-persistent in-memory provider, useful for tests or ephemeral sessions.
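For readers who want to plug in their own backend, a provider of this kind can be sketched against the StorageProvider interface. The class below is a hypothetical illustration named MapStorageProvider, not the library's InMemoryStorageProvider, and its handling of missing files (reading as empty) is an assumption.

```typescript
// Interface copied from the StorageProvider definition in this README.
interface StorageProvider {
  readAll(fileName: string): Promise<Uint8Array>;
  append(fileName: string, data: Uint8Array): Promise<void>;
  write(fileName: string, data: Uint8Array): Promise<void>;
  destroy(): Promise<void>;
}

// Hypothetical provider for illustration only.
class MapStorageProvider implements StorageProvider {
  private files = new Map<string, Uint8Array>();

  async readAll(fileName: string): Promise<Uint8Array> {
    // Assumption: a missing file reads as empty rather than throwing.
    return this.files.get(fileName) ?? new Uint8Array(0);
  }

  async append(fileName: string, data: Uint8Array): Promise<void> {
    const existing = this.files.get(fileName) ?? new Uint8Array(0);
    const merged = new Uint8Array(existing.length + data.length);
    merged.set(existing, 0);
    merged.set(data, existing.length);
    this.files.set(fileName, merged);
  }

  async write(fileName: string, data: Uint8Array): Promise<void> {
    // Copy so that later mutations by the caller do not affect stored bytes.
    this.files.set(fileName, data.slice());
  }

  async destroy(): Promise<void> {
    this.files.clear();
  }
}
```

An instance of such a class could be passed as the storage option to DB.open(), provided it honors whatever file semantics the library expects.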
new InMemoryStorageProvider();
Errors
VectorCapacityExceededError
Thrown when memory growth would exceed WASM 32-bit memory limits for the configured dimension size.
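A back-of-the-envelope estimate shows where this limit sits. The sketch assumes vectors are stored as float32 values (4 bytes each) in a 4 GiB linear memory; the real capacity is lower because other data shares that memory, and maxVectorsUpperBound is an illustrative name rather than a library function.

```typescript
// Rough upper bound on stored vectors under the assumptions stated above.
const WASM32_MAX_BYTES = 2 ** 32; // 4 GiB addressable with 32-bit offsets
const BYTES_PER_FLOAT32 = 4;

function maxVectorsUpperBound(dimensions: number): number {
  return Math.floor(WASM32_MAX_BYTES / (dimensions * BYTES_PER_FLOAT32));
}

const bound1536 = maxVectorsUpperBound(1536); // 699,050
const bound384 = maxVectorsUpperBound(384); // 2,796,202
```

Smaller embeddings therefore leave room for proportionally more entries before the error can occur.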
Benchmark results
WASM SIMD vs pure JavaScript performance on 1536-dimensional vectors (OpenAI embedding size), measured with vitest bench (Node.js):
| Operation                              | JS (ops/s) | WASM SIMD (ops/s) | Speedup |
| -------------------------------------- | ---------- | ----------------- | ------- |
| normalize (1536 dims)                  | 223,117    | 2,226,734         | ~10×    |
| searchAll (100 vectors × 1536 dims)    | 3,429      | 77,130            | ~22×    |
| searchAll (1,000 vectors × 1536 dims)  | 344        | 8,009             | ~23×    |
| searchAll (10,000 vectors × 1536 dims) | 34         | 398               | ~12×    |
The WASM SIMD layer uses 2-vector outer loop unrolling (halving query memory reads) and 4× inner loop unrolling with multiple independent accumulators.
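The inner-loop part of that description can be shown in scalar form. This is a plain TypeScript sketch of 4× unrolling with independent accumulators, not the library's WASM kernel; dotUnrolled is an illustrative name, and the 2-vector outer unrolling is not covered here.

```typescript
// 4x unrolled dot product with four independent accumulators.
// Separate accumulators break the dependency chain between consecutive additions.
function dotUnrolled(a: Float32Array, b: Float32Array): number {
  const n = a.length;
  let s0 = 0, s1 = 0, s2 = 0, s3 = 0;
  let i = 0;
  for (; i + 4 <= n; i += 4) {
    s0 += a[i] * b[i];
    s1 += a[i + 1] * b[i + 1];
    s2 += a[i + 2] * b[i + 2];
    s3 += a[i + 3] * b[i + 3];
  }
  let sum = (s0 + s1) + (s2 + s3);
  // Handle the tail when the length is not a multiple of 4.
  for (; i < n; i++) sum += a[i] * b[i];
  return sum;
}

const x = Float32Array.from([1, 2, 3, 4, 5, 6]);
const y = Float32Array.from([6, 5, 4, 3, 2, 1]);
const unrolledResult = dotUnrolled(x, y); // 6 + 10 + 12 + 12 + 10 + 6 = 56
```

In a SIMD kernel each accumulator would be a vector register holding several lanes, so the same structure processes many more elements per iteration.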
Running benchmarks
Node.js (via vitest):
npm run bench
Browser: start the dev server and navigate to the benchmark page:
npm run dev
# Open http://localhost:5173/bench.html
Practical notes
- Similarity is the dot product of query and stored vectors; with normalization enabled (default), this behaves like cosine similarity (1 = identical, -1 = opposite).
- limit defaults to Infinity, returning all stored vectors sorted by similarity. Use minSimilarity and maxSimilarity to filter results by proximity range.
- order defaults to "descend" (most similar first). Use "ascend" to get least similar first.
- Querying an empty database returns an empty array ([]).
- flush() writes deduplicated state, and reopen preserves key-to-slot mapping.
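The query defaults described in these notes can be mirrored by a small brute-force reference, which is handy for checking expectations in tests. The sketch assumes unit-length input vectors; bruteForceQuery and QueryOptionsSketch are hypothetical names and do not reflect how eigen-db computes results internally.

```typescript
interface ResultItem {
  key: string;
  similarity: number;
}

interface QueryOptionsSketch {
  limit?: number;
  order?: "ascend" | "descend";
  minSimilarity?: number;
  maxSimilarity?: number;
}

// Hypothetical brute-force reference; vectors are assumed to be unit-length already.
function bruteForceQuery(
  entries: Array<[string, number[]]>,
  query: number[],
  options: QueryOptionsSketch = {},
): ResultItem[] {
  const {
    limit = Infinity,
    order = "descend",
    minSimilarity = -Infinity,
    maxSimilarity = Infinity,
  } = options;

  const results: ResultItem[] = [];
  for (const [key, vector] of entries) {
    let similarity = 0;
    for (let i = 0; i < query.length; i++) similarity += query[i] * vector[i];
    // Both bounds are inclusive.
    if (similarity >= minSimilarity && similarity <= maxSimilarity) {
      results.push({ key, similarity });
    }
  }
  results.sort((p, q) =>
    order === "descend" ? q.similarity - p.similarity : p.similarity - q.similarity,
  );
  return results.slice(0, limit);
}

const sample: Array<[string, number[]]> = [
  ["east", [1, 0]],
  ["north", [0, 1]],
  ["west", [-1, 0]],
];
const top = bruteForceQuery(sample, [1, 0], { limit: 2 }); // east, then north
const bottom = bruteForceQuery(sample, [1, 0], { order: "ascend", limit: 1 }); // west
const none = bruteForceQuery([], [1, 0]); // empty input gives an empty array
```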
Related
- Just need cosine similarity? Try fast-theta.
