probe-filters
v1.0.1
Published
Dynamic approximate membership filters for point, range, spatial, and temporal queries (Aleph, Aeris, Zeno engines)
Maintainers
Readme
probe-filters
probe-filters — zero-false-negative, bounded-false-positive point, range, spatial, and temporal approximate membership filters with typed-array performance and expansion without key retention.
npm install probe-filtersimport { PointFilter, RangeFilter, SpatialFilter, TemporalFilter, MultiFilter } from 'probe-filters';MultiFilter provides four approximate membership filters and a composite facade. Every filter guarantees:
| Property | Meaning |
|----------|---------|
| no false negatives | insert(k) ⇒ query(k) always returns true |
| bounded false positives | tunable fingerprint size controls FPR |
| no key storage | all filters use fixed-size structural storage, independent of key count |
| online expansion | capacity doubles without access to original keys |
Filter types
PointFilter — Aleph quotient filter (SIGMOD 2025). 32-bit mother-hash slots with Zeno fractional-growth expansion (Kim et al., SIGMOD 2026). Handles deletions via void-entry tombstoning with queued duplicate removal on expansion. Packed RSQF metadata at 2.125 bits/slot.
RangeFilter — Aeris keepsake-box partition-count filter (Chesetti et al., SIGMOD 2026). Keys fall into fixed-width partitions. Each partition tracks a fingerprint and insertion count. Range queries check partition-level fingerprints; adaptation splits keepsake boxes and extends fingerprint length. No per-key storage — 1000 keys in the same partition occupy 1 entry.
SpatialFilter — Morton/Z-order curve backed by RangeFilter. 2D coordinates encode to 1D Morton codes. Bounding-box queries decompose into contiguous Morton intervals via recursive quad walk, then delegate to the RangeFilter backend. No cells, no per-coordinate storage.
TemporalFilter — ring-bucket PointFilters. Events inserted into time-aligned buckets indexed by floor(timestampMs / bucketDurationMs). Aging buckets fall out of the retention horizon. Sparsely serialized — only populated buckets are written.
MultiFilter facade
Composes all four filters behind a single interface with set/get/has/remove aliases and mergeFrom UNION merge across all enabled dimensions.
CONFIGURATION
PointFilter(options)
| Option | Type | Default | Description |
|--------|------|---------|-------------|
| initialCapacity | number | 256 | Initial slot count (power of 2) |
| fingerprintSize | number | 12 | Fingerprint bits per slot |
| expansionThreshold | number | 0.95 | Load factor triggering expansion |
const pf = new PointFilter({ initialCapacity: 1024, fingerprintSize: 14 });RangeFilter(options)
| Option | Type | Default | Description |
|--------|------|---------|-------------|
| partitionSize | number | 32 | Keys per partition |
| fingerprintBits | number | 8 | Initial fingerprint bits per partition |
| maxFingerprintBits | number | 24 | Maximum fingerprint bits after adaptation |
const rf = new RangeFilter({ partitionSize: 64, fingerprintBits: 10 });SpatialFilter(options)
| Option | Type | Default | Description |
|--------|------|---------|-------------|
| bitsPerCoordinate | number | 16 | Precision bits per axis (grid size = 2^bits) |
| coordinateSystem | string | 'integer' | 'integer', 'float', 'latlon', or custom |
| bounds | array | — | Float coordinate bounds: [[minX, minY], [maxX, maxY]] |
| coordinateCodec | object | — | Custom { normalizePoint, decodePoint, bounds } |
| rangeOptions | object | — | Passed through to internal RangeFilter |
const sf = new SpatialFilter({ bitsPerCoordinate: 16 });
const sfFloat = new SpatialFilter({
coordinateSystem: 'float', bounds: [[0, 0], [1, 1]], bitsPerCoordinate: 16,
});
const sfGPS = new SpatialFilter({ coordinateSystem: 'latlon', bitsPerCoordinate: 16 });TemporalFilter(options)
| Option | Type | Default | Description |
|--------|------|---------|-------------|
| bucketDurationMs | number | 60_000 | Width of each time bucket (ms) |
| retentionDurationMs | number | 86_400_000 | How far back events are retained (ms) |
| filterOptions | object | — | Passed through to internal PointFilter per bucket |
const tf = new TemporalFilter({
bucketDurationMs: 60_000, // 1 minute buckets
retentionDurationMs: 7 * 24 * 60_60_000, // 7 day retention
filterOptions: { initialCapacity: 64, fingerprintSize: 8 },
});MultiFilter(options)
| Option | Type | Default | Description |
|--------|------|---------|-------------|
| point | boolean | true | Enable PointFilter; false to disable |
| range | boolean | true | Enable RangeFilter; false to disable |
| spatial | boolean | true | Enable SpatialFilter; false to disable |
| temporal | boolean | true | Enable TemporalFilter; false to disable |
| pointOptions | object | — | Options forwarded to PointFilter |
| rangeOptions | object | — | Options forwarded to RangeFilter |
| spatialOptions | object | — | Options forwarded to SpatialFilter |
| temporalOptions | object | — | Options forwarded to TemporalFilter |
Top-level options without a prefix are forwarded to PointFilter (backward compatibility):
const mf = new MultiFilter({ fingerprintSize: 10 }); // → pointOptions.fingerprintSizeAPI
PointFilter
insert(key) → void Insert a key
query(key) → boolean Test membership (no false negatives)
delete(key) → boolean Remove a key (tombstone)
rejuvenate(key) → boolean Move key to newer fingerprint to reduce persistent FPs
expand() → void Double capacity (Zeno fractional-growth)
getStats() → object { capacity, loadFactor, expansions, ... }
serialize() → ArrayBuffer Compact binary with CRC32 integrity
static deserialize(buffer[, options])RangeFilter
insert(key) → void Insert a key (increments partition count)
delete(key) → boolean Decrement count; remove partition at 0
queryRange(start, end) → boolean Test any key in [start, end]
adaptFalsePositive(start, end) → boolean Split partitions, extend fingerprints to fix FPs
expand() → void Shrink fingerprints, rejuvenate all partitions
getStats() → object { partitions, keepsakeBoxes, expansionLevel, ... }
serialize() → ArrayBuffer Compact binary with CRC32 integrity
static deserialize(buffer)SpatialFilter
insert(point) → void [x, y] point (integer, float, or lat/lon)
insertBox(min, max[, id]) → void Bounding box shape
insertCircle(center, radius[, id]) → void Circle shape
query(point) → boolean Single-point membership
queryBox(min, max) → boolean Bounding-box membership
adaptFalsePositiveBox(min, max) → boolean Decompose box to Morton intervals, adapt
getStats() → object { partitions, bitsPerCoordinate, gridSize, ... }
serialize() → ArrayBuffer Compact binary with CRC32 integrity (wraps RangeFilter)
static deserialize(buffer)TemporalFilter
(time units: ms, timestamps: epoch ms)
insertAt(key, timestampMs) → void
queryWithinLast(key, durationMs[, nowMs]) → boolean
queryAgo(key, ageMs, toleranceMs[, nowMs]) → boolean
queryBetweenAges(key, minAgeMs, maxAgeMs[, nowMs]) → boolean
adaptFalsePositiveWithinLast(key, durationMs[, nowMs]) → boolean
adaptFalsePositiveAgo(key, ageMs, toleranceMs[, nowMs]) → boolean
adaptFalsePositiveBetweenAges(key, minAgeMs, maxAgeMs[, nowMs]) → boolean
getStats([nowMs]) → object { bucketDurationMs, retentionDurationMs, activeBuckets }
serialize() → ArrayBuffer Compact binary with CRC32 integrity (populated-bucket-only)
static deserialize(buffer)MultiFilter
MultiFilter exposes the same methods as each sub-filter, prefixed where needed:
set(key) / get(key) / has(key) / remove(key) Point aliases
setRangeKey(key) / hasRange(start, end) / removeRangeKey(key) Range aliases
setSpatialPoint(point) / hasBox(min, max) Spatial aliases
setTemporalKey(key, ts) / queryWithinLast(key, dur[, now]) Temporal aliases
adaptFalsePositive(query, kind) kind ∈ {'point','range','spatial','temporal'}
mergeFrom(other, operator) 'union'
serialize() / static deserialize(buffer)Spatial shape insertion: insertSpatialBox(min, max) / insertSpatialCircle(center, radius).
Temporal adaptation: adaptTemporalFalsePositiveWithinLast/Ago/BetweenAges.
SERIALIZATION
All filters implement serialize() → ArrayBuffer and static deserialize(buffer[, options]). Every buffer is magic-number-prefixed and carries a 4-byte IEEE 802.3 CRC32 trailing checksum. Corrupted buffers throw on deserialization.
Low-level BinaryWriter/BinaryReader primitives and wrapCRC32/unwrapCRC32 are available via probe-filters/serialization.
PERFORMANCE
Mean per-operation latency from npm run benchmark (Node.js, 5000-element point/range, 2500 spatial, 4000 temporal):
| Operation | Mean | p50 | p95 | |-----------|------|-----|-----| | PointFilter query (hit) | 10.9 μs | 1.5 μs | 50.4 μs | | PointFilter query (miss) | 7.0 μs | 6.3 μs | 11.4 μs | | RangeFilter queryRange (hit) | 5.8 μs | 5.2 μs | 10.0 μs | | RangeFilter queryRange (miss) | 4.1 μs | 3.7 μs | 5.9 μs | | SpatialFilter queryBox | 9.0 μs | 6.1 μs | 16.4 μs | | TemporalFilter queryWithinLast | 43.6 μs | 41.7 μs | 62.0 μs | | PointFilter serialize | 1.0 ms | 0.4 ms | 2.8 ms | | PointFilter deserialize | 1.2 ms | 0.8 ms | 3.4 ms | | RangeFilter serialize | 1.0 ms | 0.8 ms | 2.0 ms | | RangeFilter deserialize | 5.3 ms | 4.2 ms | 9.7 ms | | SpatialFilter serialize | 1.0 ms | 0.6 ms | 1.7 ms | | SpatialFilter deserialize | 4.2 ms | 3.0 ms | 10.4 ms | | TemporalFilter serialize | 3.2 ms | 2.7 ms | 6.2 ms | | TemporalFilter deserialize | 5.5 ms | 4.4 ms | 12.7 ms | | MultiFilter serialize | 6.9 ms | 4.6 ms | 14.8 ms | | MultiFilter deserialize | 12.9 ms | 10.4 ms | 20.3 ms | | mergeFrom UNION (2×2000) | 32.9 ms | 30.6 ms | 45.9 ms |
Serialized buffer includes 4-byte CRC32 integrity check. All operations are batch-free, single-key, directly comparable across filter types.
EXAMPLES
LSM-tree SSTable filter
const sstableFilter = new MultiFilter({
pointOptions: { initialCapacity: 1_000_000, fingerprintSize: 14 },
rangeOptions: { partitionSize: 256, fingerprintBits: 10 },
});
sstable.set('row:42');
sstable.setRangeKey(42);
sstable.has('row:42'); // true — skip disk read
sstable.hasRange(40, 50); // true — interval coveredDistributed cache summary exchange
const local = new MultiFilter({ pointOptions: { fingerprintSize: 12 } });
local.set('key-a'); local.set('key-b');
const remote = new MultiFilter({ pointOptions: { fingerprintSize: 12 } });
remote.set('key-c');
// Merge remote summary into local
local.mergeFrom(remote, 'union');
local.has('key-c'); // true
// Transmit over wire
const wire = local.serialize();
// ... send wire ...
const peer = MultiFilter.deserialize(wire);Spatial pre-filter for geometry queries
const spatial = new SpatialFilter({
coordinateSystem: 'latlon', bitsPerCoordinate: 16,
});
spatial.insert([34.0522, -118.2437]); // Los Angeles
spatial.insert([40.7128, -74.0060]); // New York
spatial.queryBox([33.0, -120.0], [35.0, -117.0]); // true — LA in box
spatial.adaptFalsePositiveBox([37.0, -123.0], [38.0, -121.0]); // fix San Francisco FPEvent recency tracking
const temporal = new TemporalFilter({
bucketDurationMs: 10_000, // 10s buckets
retentionDurationMs: 3_600_000, // 1 hour retention
});
temporal.insertAt('sensor:42', Date.now());
temporal.insertAt('sensor:42', Date.now() - 45_000);
temporal.queryWithinLast('sensor:42', 60_000); // true — seen in last minute
temporal.queryBetweenAges('sensor:42', 30_000, 90_000); // true — seen 30-90s agoGame engine — spatial interest management
Avoid O(n²) entity distance checks. Insert positions into a SpatialFilter; query a box around the player to find candidate entities before running expensive exact checks. Remove despawned entities and adapt stale keepsake boxes to keep false positives low as entities move.
const world = new SpatialFilter({ bitsPerCoordinate: 14, coordinateSystem: 'integer' });
// Server tick: insert entity positions
world.insert([1523, 871]);
world.insert([1600, 900]);
// Entity teleports — remove old, insert new
world.delete([1523, 871]);
world.insert([1530, 875]);
// Interest query: any entity near player?
const range = 100;
if (world.queryBox([1525 - range, 870 - range],
[1525 + range, 870 + range])) {
// Candidate found — run exact distance checks
}
// Entity despawns — unconditionally remove
world.delete([1530, 875]);
// Stale keepsake boxes may leave false positives behind
world.adaptFalsePositiveBox([1525 - range, 870 - range],
[1525 + range, 870 + range]);
// Anti-cheat: has this player jumped in the last 500ms?
const actions = new TemporalFilter({
bucketDurationMs: 100, retentionDurationMs: 2_000,
});
actions.insertAt('player:42:jump', Date.now());
actions.queryWithinLast('player:42:jump', 500); // true → flag excessive input
// Merge snapshot from another server shard
const neighbor = MultiFilter.deserialize(wire);
world.mergeFrom(neighbor, 'union');SEE ALSO
probe-maplets — key-value maplet extensions providing value aggregation on top of probe-filters filters.
Papers:
- Yuvaraj Chesetti, Navid Eslami, Huanchen Zhang, Niv Dayan, and Prashant Pandey. 2026. Aeris Filter: A Strongly and Monotonically Adaptive Range Filter. Proc. ACM Manag. Data 4, 1 (SIGMOD), Article 7. https://doi.org/10.1145/3786621 — RangeFilter and SpatialFilter keepsake-box engine
- Hyuhng Min Kim, Navid Eslami, and Niv Dayan. 2026. Zeno Filter: To Infinity in Tiny Steps. Proc. ACM Manag. Data 4, 3 (SIGMOD), Article 251. https://doi.org/10.1145/3802128 — PointFilter fractional-growth expansion policy
- Aleph Filter: A Dynamically Expanding Quotient Filter (SIGMOD 2025) — PointFilter quotient-filter engine
- Time To Replace Your Filter: How Maplets Simplify System Design (arXiv:2510.05518) — companion key-value layer
Liberated from Tenere Labs Monolith, May 2026.
