Persistent HNSW
Pure-TypeScript HNSW vector search. No native dependencies, no WASM - just TypeScript.
Scales to tens of thousands of vectors in-memory with sub-millisecond search. Supports persistence via pluggable storage backends, automatic sharding with LRU eviction, and three distance metrics.
Install
```sh
npm i persistent-hnsw
```

```ts
import {
  FileSystemStorage,
  HNSWIndex,
  InMemoryStorage,
  VectorStore,
} from "persistent-hnsw";
```

Quick Start
```ts
import { VectorStore } from "persistent-hnsw";
const store = VectorStore.create({
hnsw: { dimensions: 128 },
});
// Insert vectors
await store.insert([
{ id: "doc-1", vector: new Float32Array(128).fill(0.1) },
{ id: "doc-2", vector: new Float32Array(128).fill(0.5) },
{ id: "doc-3", vector: new Float32Array(128).fill(0.9) },
]);
// Search
const results = await store.search(new Float32Array(128).fill(0.5), 2);
console.log(results);
// [{ id: "doc-2", distance: 0 }, { id: "doc-1", distance: 20.48 }]
// Delete
await store.delete("doc-1");
```

API
VectorStore
The high-level API. Handles sharding, persistence, and configuration.
```ts
// Create a new store
const store = VectorStore.create({
hnsw: {
dimensions: 128, // required
M: 16, // max neighbors per node (default: 16)
efConstruction: 200, // build-time beam width (default: 200)
efSearch: 50, // search-time beam width (default: 50)
metric: "euclidean", // "euclidean" | "cosine" | "inner_product"
},
sharding: {
maxVectorsPerShard: 100_000,
maxLoadedShards: 4,
},
storage: new FileSystemStorage("./data"),
});
// Insert one or many
await store.insert({ id: "a", vector: [1, 2, 3] });
await store.insert([
{ id: "b", vector: [4, 5, 6] },
{ id: "c", vector: [7, 8, 9] },
]);
// Search — returns { id, distance }[]
const results = await store.search([1, 2, 3], 10);
// Search with options
const filtered = await store.search([1, 2, 3], 10, {
efSearch: 200, // override beam width
filter: (id) => id !== "b", // pre-filter by ID
includeVectors: true, // attach vector data to results
});
// Delete
await store.delete("a");
// Persistence
await store.flush(); // write dirty shards to storage
await store.close(); // flush + release resources
// Reopen from storage
const reopened = await VectorStore.open({
hnsw: { dimensions: 128 },
storage: new FileSystemStorage("./data"),
});
```

HNSWIndex
The low-level index. Synchronous, zero async overhead. Use this when you don't need sharding or persistence.
```ts
import { defaultHNSWConfig, HNSWIndex } from "persistent-hnsw";
const config = defaultHNSWConfig(128, {
M: 16,
efConstruction: 200,
metric: "cosine",
seed: 42, // deterministic builds
});
const index = new HNSWIndex(config);
// Insert
index.insert("vec-1", new Float32Array([/* 128 values */]));
index.insert("vec-2", [0.1, 0.2 /* ... */]); // number[] also works
// Search — returns { id, distance }[]
const results = index.search([0.1, 0.2 /* ... */], 10);
// Search with custom efSearch
const precise = index.search(query, 10, 400);
// Search with filter
const filtered = index.search(
query,
10,
undefined,
(id) => id.startsWith("doc-"),
);
// Delete (lazy tombstone)
index.delete("vec-1");
// Rebuild without deleted nodes
const compacted = index.compact();
// Reclaim unused memory after bulk insert
index.shrinkToFit();
// Check state
index.size; // number of live vectors
index.has("vec-2"); // true
index.getVector("vec-2"); // Float32Array | null
index.memoryUsage(); // bytes
```

Storage Backends
```ts
import { FileSystemStorage, InMemoryStorage } from "persistent-hnsw";
// In-memory — works in browsers, no filesystem needed
const mem = new InMemoryStorage();
// Filesystem — uses Deno.readFile/writeFile
const fs = new FileSystemStorage("./hnsw-data");
```

Both implement StorageBackend:
```ts
interface StorageBackend {
write(key: string, data: Uint8Array): Promise<void>;
read(key: string): Promise<Uint8Array | null>;
delete(key: string): Promise<boolean>;
list(): Promise<string[]>;
exists(key: string): Promise<boolean>;
}
```

You can implement your own for S3, Redis, SQLite, etc.
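For instance, here is a minimal sketch of a custom backend built on node:fs/promises. The NodeFsStorage name is hypothetical, and it assumes the StorageBackend type is exported by the package; treat it as an illustration of the contract rather than production code.

```ts
// Hypothetical custom backend: one file per key under a base directory.
// Assumes the package exports the StorageBackend type shown above.
import { mkdir, readFile, readdir, rm, writeFile } from "node:fs/promises";
import { join } from "node:path";
import type { StorageBackend } from "persistent-hnsw";

class NodeFsStorage implements StorageBackend {
  constructor(private dir: string) {}

  async write(key: string, data: Uint8Array): Promise<void> {
    await mkdir(this.dir, { recursive: true }); // make sure the directory exists
    await writeFile(join(this.dir, key), data);
  }

  async read(key: string): Promise<Uint8Array | null> {
    try {
      return new Uint8Array(await readFile(join(this.dir, key)));
    } catch {
      return null; // missing key resolves to null, per the interface
    }
  }

  async delete(key: string): Promise<boolean> {
    try {
      await rm(join(this.dir, key));
      return true;
    } catch {
      return false;
    }
  }

  async list(): Promise<string[]> {
    try {
      return await readdir(this.dir); // keys are plain file names
    } catch {
      return [];
    }
  }

  async exists(key: string): Promise<boolean> {
    return (await this.read(key)) !== null;
  }
}
```

A custom backend plugs in exactly like the built-in ones, e.g. VectorStore.create({ hnsw: { dimensions: 128 }, storage: new NodeFsStorage("./data") }).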
Distance Metrics
| Metric | Use case | Notes |
| ----------------- | ----------------------------------- | ------------------------------------------ |
| "euclidean" | General purpose | Squared L2, no sqrt (lower = closer) |
| "cosine" | Text embeddings, normalized vectors | 1 - cos(a,b) (0 = identical, 2 = opposite) |
| "inner_product" | Recommendation, MIP search | Negated dot product (lower = more similar) |
Configuration Guide
Choosing M
M controls how many neighbors each node maintains. Higher M = better recall,
slower insert, more memory.
| M   | Recall@10 | Search QPS | Memory (10K/128d) |
| --- | --------- | ---------- | ----------------- |
| 4   | 61%       | 2.9K       | 6.2 MB            |
| 8   | 84%       | 1.8K       | 6.8 MB            |
| 16  | 97%       | 1.2K       | 8.0 MB            |
| 32  | 99%       | 840        | 9.8 MB            |
Default: 16. Good balance for most use cases. Use 32 if you need >99% recall.
Choosing efSearch
efSearch controls search accuracy at query time. Higher = better recall,
slower search.
| efSearch | Recall@10 | Latency | QPS   |
| -------- | --------- | ------- | ----- |
| 10       | 34%       | 87µs    | 11.5K |
| 50       | 74%       | 322µs   | 3.1K  |
| 100      | 88%       | 523µs   | 1.9K  |
| 200      | 97%       | 950µs   | 1.1K  |
| 400      | 99.7%     | 1.5ms   | 652   |
Default: 50. Increase to 200+ for high-recall workloads.
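Because the sweet spot is data-dependent, it is worth measuring recall on your own vectors. The sketch below is a hypothetical tuning helper (recallAtK is not part of the library) that estimates recall@k by comparing index.search against an exact brute-force scan, using only documented calls:

```ts
// Hypothetical helper: estimate recall@k for a given efSearch by comparing
// HNSW results against an exact linear scan over the same vectors.
import { HNSWIndex } from "persistent-hnsw";

function recallAtK(
  index: HNSWIndex,
  ids: string[], // every id that was inserted
  queries: Float32Array[],
  k: number,
  efSearch: number,
): number {
  let hits = 0;
  for (const q of queries) {
    // Exact top-k by brute force (squared L2, matching the "euclidean" metric).
    const exact = ids
      .map((id) => {
        const v = index.getVector(id)!;
        let d = 0;
        for (let i = 0; i < q.length; i++) d += (q[i] - v[i]) ** 2;
        return { id, d };
      })
      .sort((a, b) => a.d - b.d)
      .slice(0, k)
      .map((r) => r.id);

    // Approximate top-k from the graph at the efSearch value under test.
    const approx = new Set(index.search(q, k, efSearch).map((r) => r.id));
    hits += exact.filter((id) => approx.has(id)).length;
  }
  return hits / (queries.length * k); // fraction of true neighbors recovered
}
```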
Choosing efConstruction
efConstruction controls graph quality at build time. Higher = better graph,
slower insert. Only affects index building, not search.
Default: 200. Rarely needs changing. Lower to 100 for faster builds if recall is acceptable.
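Putting the three knobs together, a high-recall setup following the recommendations above might look like the sketch below (the values are the suggestions from this guide, not library defaults):

```ts
import { VectorStore } from "persistent-hnsw";

// High-recall profile: M = 32 and efSearch = 200 per the tables above;
// efConstruction stays at its default of 200.
const store = VectorStore.create({
  hnsw: {
    dimensions: 128,
    M: 32, // >99% recall, at the cost of slower inserts and more memory
    efConstruction: 200,
    efSearch: 200, // store-wide default, still overridable per query
  },
});

// For an occasional very-high-recall query, raise the beam width for that call only.
const query = new Float32Array(128).fill(0.5);
const results = await store.search(query, 10, { efSearch: 400 });
```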
Benchmarks
All benchmarks on Apple M2 Pro, Deno 1.36.4. Run with deno task bench:full.
Scaling (128d, M=16, efConstruction=200)
| Vectors | Insert ops/s | Build Time | Search (ef=200) | QPS  | Recall@ef200 | Recall@ef500 | Memory  | On Disk |
| ------- | ------------ | ---------- | --------------- | ---- | ------------ | ------------ | ------- | ------- |
| 1K      | 591          | 1.7s       | 313µs           | 3.2K | 100%         | 100%         | 0.8 MB  | 0.8 MB  |
| 5K      | 371          | 13.5s      | 695µs           | 1.4K | 99%          | 100%         | 4.0 MB  | 4.0 MB  |
| 10K     | 313          | 31.9s      | 877µs           | 1.1K | 97%          | 100%         | 8.0 MB  | 8.1 MB  |
| 25K     | 267          | 93.8s      | 1095µs          | 913  | 89%          | 98%          | 20.1 MB | 20.2 MB |
| 50K     | 242          | 206.4s     | 1376µs          | 727  | 79%          | 93%          | 49.4 MB | 49.7 MB |
Search Latency (median)
```
1.0K vectors |######### 317 us
5.0K vectors |#################### 695 us
10.0K vectors |######################### 877 us
25.0K vectors |################################ 1.1K us
50.0K vectors |######################################## 1.4K us
```

Dimension Scaling (5K vectors, M=16)
| Dimensions | Insert ops/s | Search ops/s | Search Median | Recall@10 | Memory  | On Disk |
| ---------- | ------------ | ------------ | ------------- | --------- | ------- | ------- |
| 32         | 759          | 4.6K         | 216µs         | 99.7%     | 2.2 MB  | 2.2 MB  |
| 64         | 569          | 3.4K         | 295µs         | 98.0%     | 2.8 MB  | 2.8 MB  |
| 128        | 391          | 2.3K         | 443µs         | 92.7%     | 4.0 MB  | 4.0 MB  |
| 256        | 246          | 1.2K         | 824µs         | 88.3%     | 6.5 MB  | 6.5 MB  |
| 384        | 170          | 963          | 1039µs        | 89.7%     | 8.9 MB  | 8.9 MB  |
| 512        | 140          | 723          | 1384µs        | 88.7%     | 11.3 MB | 11.4 MB |
| 768        | 97           | 538          | 1859µs        | 79.3%     | 16.2 MB | 16.2 MB |
| 1024       | 77           | 431          | 2319µs        | 83.0%     | 21.1 MB | 21.1 MB |
| 1536       | 52           | 292          | 3427µs        | 76.3%     | 30.9 MB | 30.9 MB |
Distance Function Throughput
Euclidean (L2 squared) ops/sec
```
128d |######################################## 10.5M ops/s
256d |#################### 5.3M ops/s
768d |####### 1.8M ops/s
1536d |### 914K ops/s
```

Serialization
| Vectors | Size    | Encode    | Decode    |
| ------- | ------- | --------- | --------- |
| 1K      | 0.8 MB  | 1592 MB/s | 450 MB/s  |
| 10K     | 8.1 MB  | 801 MB/s  | 1570 MB/s |
| 25K     | 20.2 MB | 2061 MB/s | 1634 MB/s |
Full benchmark results with per-dimension breakdowns are in
bench/RESULTS.md.
Architecture
```
VectorStore            High-level async API
 └─ ShardManager       Multi-shard orchestrator, LRU eviction
     └─ HNSWIndex      Core HNSW graph — synchronous, zero async overhead
```

The hot path (distance calculations, graph traversal) is fully synchronous. Async only happens at storage/shard boundaries.
Key internals:
- Flat typed arrays for vectors and adjacency lists (cache-friendly, trivial serialization)
- Bitset visited set (Uint32Array-backed, replaces Set)
- Pooled heaps and bitset reused across search calls (no per-query allocation)
- 4-wide unrolled loops in distance functions for V8 optimization (see the sketch after this list)
- 1.5x growth factor with shrinkToFit() to reclaim unused capacity
- Lazy deletion with tombstones; compact() rebuilds the graph
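To illustrate the unrolled-loop item above, here is a sketch of a 4-wide unrolled squared-L2 kernel; it is illustrative, and the library's actual kernels may differ in details such as tail handling:

```ts
// Illustrative 4-wide unrolled squared-L2 distance. Four independent
// accumulators reduce loop overhead and give V8 a tight, predictable body;
// a scalar tail covers dimensions not divisible by 4.
function l2Squared(a: Float32Array, b: Float32Array): number {
  let s0 = 0, s1 = 0, s2 = 0, s3 = 0;
  const n = a.length;
  const n4 = n - (n % 4);
  for (let i = 0; i < n4; i += 4) {
    const d0 = a[i] - b[i];
    const d1 = a[i + 1] - b[i + 1];
    const d2 = a[i + 2] - b[i + 2];
    const d3 = a[i + 3] - b[i + 3];
    s0 += d0 * d0;
    s1 += d1 * d1;
    s2 += d2 * d2;
    s3 += d3 * d3;
  }
  let tail = 0;
  for (let i = n4; i < n; i++) {
    const d = a[i] - b[i];
    tail += d * d;
  }
  return s0 + s1 + s2 + s3 + tail;
}
```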
Development
```sh
deno task test        # run all tests
deno task bench       # micro-benchmarks
deno task bench:full  # full benchmark suite → bench/RESULTS.md
deno task check       # type-check
deno task lint        # lint
deno task fmt         # format
```