ribbit-riblt
v1.1.1
Rateless Invertible Bloom Lookup Table (RIBLT) for set reconciliation
ribbit-RIBLT 🐸
Rateless Invertible Bloom Lookup Table (RIBLT) for set reconciliation. Zero dependencies.
Two peers each hold a set of elements. This library lets them figure out which elements differ — without exchanging the sets themselves. The sender streams IBLT cells; the receiver compares against their own cells and peels the symmetric difference. If the first batch isn't enough, ask for more — the protocol is rateless, so there's no up-front size estimate needed.
Based on the scheme described by Yang et al., using gap-based cell mapping, which yields ~1.5–1.7× overhead relative to the symmetric difference size.
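To make "peeling the symmetric difference" concrete, here is a toy sketch of the underlying idea. It is not this library's implementation: it uses a fixed-size table rather than a rateless stream, plain hashing rather than gap-based mapping, and 32-bit unsigned integers as elements. Every name in it (`ToyCell`, `mix32`, `encode`, `subtract`, `peel`) is made up for illustration, and the table is sized generously so that peeling such tiny sets is overwhelmingly likely to succeed.

```typescript
// Toy fixed-size IBLT over 32-bit unsigned integers (illustrative only).
interface ToyCell {
  idSum: number;   // XOR of all element ids mapped to this cell
  hashSum: number; // XOR of the checksums of those elements
  count: number;   // how many elements map here (signed after subtraction)
}

const PARTITIONS = 3;      // each element lands in one cell per partition
const PARTITION_SIZE = 64; // deliberately oversized for a toy

// 32-bit integer mixer (murmur3 finalizer), used for checksums and placement.
function mix32(x: number): number {
  x = Math.imul(x ^ (x >>> 16), 0x85ebca6b);
  x = Math.imul(x ^ (x >>> 13), 0xc2b2ae35);
  return (x ^ (x >>> 16)) >>> 0;
}

// One cell index per partition, so an element's cells are always distinct.
function cellIndexes(id: number): number[] {
  const indexes: number[] = [];
  for (let p = 0; p < PARTITIONS; p++) {
    const h = mix32(id ^ Math.imul(p + 1, 0x9e3779b9));
    indexes.push(p * PARTITION_SIZE + (h % PARTITION_SIZE));
  }
  return indexes;
}

// XOR the element into each of its cells and adjust the counts by `sign`.
function applyElement(table: ToyCell[], id: number, sign: number): void {
  for (const i of cellIndexes(id)) {
    const cell = table[i];
    cell.idSum = (cell.idSum ^ id) >>> 0;
    cell.hashSum = (cell.hashSum ^ mix32(id)) >>> 0;
    cell.count += sign;
  }
}

function encode(ids: number[]): ToyCell[] {
  const table: ToyCell[] = Array.from(
    { length: PARTITIONS * PARTITION_SIZE },
    () => ({ idSum: 0, hashSum: 0, count: 0 }),
  );
  for (const id of ids) applyElement(table, id, 1);
  return table;
}

// Cell-wise difference: shared elements cancel out, leaving only the diff.
function subtract(a: ToyCell[], b: ToyCell[]): ToyCell[] {
  return a.map((cell, i) => ({
    idSum: (cell.idSum ^ b[i].idSum) >>> 0,
    hashSum: (cell.hashSum ^ b[i].hashSum) >>> 0,
    count: cell.count - b[i].count,
  }));
}

// Repeatedly extract "pure" cells (exactly one element) until none remain.
// Mutates `diff` in place.
function peel(diff: ToyCell[]): { left: number[]; right: number[]; resolved: boolean } {
  const left: number[] = [];
  const right: number[] = [];
  let progress = true;
  while (progress) {
    progress = false;
    for (const cell of diff) {
      const pure = Math.abs(cell.count) === 1 && mix32(cell.idSum) === cell.hashSum;
      if (!pure) continue;
      const id = cell.idSum;
      const fromLeft = cell.count === 1;
      (fromLeft ? left : right).push(id);
      applyElement(diff, id, fromLeft ? -1 : 1); // cancel it in all its cells
      progress = true;
    }
  }
  const resolved = diff.every((c) => c.count === 0 && c.idSum === 0 && c.hashSum === 0);
  return { left, right, resolved };
}

const result = peel(subtract(encode([1, 2, 3, 10]), encode([2, 3, 20, 30])));
console.log(result.resolved, result.left, result.right);
```

The shared elements (2 and 3) vanish during subtraction, so the work done by `peel` depends on the size of the difference rather than the size of either set.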
Install
npm install ribbit-riblt
Usage
import { BloomStream, BloomDiff } from "ribbit-riblt";
// Sender has ["cat", "dog"]
const sender = new BloomStream(["cat", "dog"]);
const senderCells = [...sender.next(20)];
// Receiver has ["dog", "bird", "fish"]
const receiver = new BloomStream(["dog", "bird", "fish"]);
const receiverCells = [...receiver.next(20)];
// Compute the diff
const diff = new BloomDiff();
diff.addCells(senderCells, receiverCells);
diff.status; // "peelable"
diff.left; // ["cat"] — only in sender
diff.right; // ["bird", "fish"] — only in receiver
Streaming (rateless)
If the first batch doesn't resolve, feed more cells. The diff and streams remember their position:
const a = new BloomStream(["alpha", "beta", "gamma"]);
const b = new BloomStream(["delta", "epsilon"]);
const diff = new BloomDiff();
diff.addCells([...a.next(2)], [...b.next(2)]);
diff.status; // "needs more data"
// Pull more cells and extend the diff
diff.addCells([...a.next(30)], [...b.next(30)]);
diff.status; // "peelable"
Custom types
Pass encode/decode for non-string elements:
const stream = new BloomStream([42, 99], {
  encode: (n) => new Uint8Array(new Uint32Array([n]).buffer),
  // slice() copies the bytes, so this also works when b is a view into a larger buffer
  decode: (b) => new Uint32Array(b.slice().buffer)[0],
});
This works with any structured type. Here's an example reconciling product catalogs:
import { BloomStream, BloomDiff } from "ribbit-riblt";
interface Product {
  id: number;
  name: string;
  tags: string[];
}
const encoder = new TextEncoder();
const decoder = new TextDecoder();
const codec = {
  encode: (p: Product): Uint8Array =>
    encoder.encode(JSON.stringify([p.id, p.name, p.tags])),
  decode: (b: Uint8Array): Product => {
    const [id, name, tags] = JSON.parse(decoder.decode(b));
    return { id, name, tags };
  },
};
const warehouse = [
  { id: 1, name: "Widget", tags: ["sale"] },
  { id: 2, name: "Gadget", tags: ["new"] },
  { id: 3, name: "Doohickey", tags: ["clearance"] },
];
const store = [
  { id: 1, name: "Widget", tags: ["sale"] },
  { id: 3, name: "Doohickey", tags: ["clearance"] },
  { id: 4, name: "Thingamajig", tags: ["exclusive"] },
];
const a = new BloomStream<Product>(warehouse, codec);
const b = new BloomStream<Product>(store, codec);
const diff = new BloomDiff<Product>(codec);
diff.addCells([...a.next(20)], [...b.next(20)]);
diff.status; // "peelable"
diff.left; // [{ id: 2, name: "Gadget", tags: ["new"] }]
diff.right; // [{ id: 4, name: "Thingamajig", tags: ["exclusive"] }]
Keep the encoded size of each element small. It is better to reconcile a list of 64-bit IDs than a list of 50 MB byte streams.
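One way to follow this advice, sketched under assumptions: reconcile compact IDs, then look up the full records locally. The codec below has the `encode`/`decode` shape used throughout this README, but `idCodec`, `lookup`, and the 8-byte big-endian layout are illustrative choices, and IDs are assumed to be non-negative safe integers.

```typescript
// Hypothetical codec for integer IDs in the range 0..Number.MAX_SAFE_INTEGER.
// Layout (an illustrative choice): 8 bytes, big-endian, high word first.
const idCodec = {
  encode: (id: number): Uint8Array => {
    const bytes = new Uint8Array(8);
    const view = new DataView(bytes.buffer);
    view.setUint32(0, Math.floor(id / 0x100000000)); // high 32 bits
    view.setUint32(4, id >>> 0);                      // low 32 bits
    return bytes;
  },
  decode: (b: Uint8Array): number => {
    // Honor byteOffset in case b is a view into a larger buffer.
    const view = new DataView(b.buffer, b.byteOffset, b.byteLength);
    return view.getUint32(0) * 0x100000000 + view.getUint32(4);
  },
};

// After reconciling IDs, resolve the ones you hold back to full records.
function lookup<R>(ids: number[], byId: Map<number, R>): R[] {
  const records: R[] = [];
  for (const id of ids) {
    const record = byId.get(id);
    if (record !== undefined) records.push(record);
  }
  return records;
}

// Example: IDs reported as "only on this side" become records to send.
const catalog = new Map([[2, { id: 2, name: "Gadget" }]]);
const missingOnPeer = lookup([2], catalog);
console.log(missingOnPeer);
```

Such a codec could be passed wherever this README passes `codec`, so each element costs 8 bytes regardless of how large the record behind it is.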
Wire format
Cells can be serialized for transmission using a protobuf-compatible binary format:
import { serializeCellBatch, deserializeCellBatch } from "ribbit-riblt";
const bytes = serializeCellBatch(cells); // Uint8Array
const { cellOffset, cells: decodedCells } = deserializeCellBatch(bytes);
Individual cells can also be serialized with serializeCell / deserializeCell.
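The meaning of `cellOffset` is not spelled out here. Assuming it is the index of a batch's first cell within the sender's stream, a receiver could use it to check that batches arrive in order and without gaps before feeding them to a diff. The sketch below relies only on the `{ cellOffset, cells }` shape shown above; `DecodedBatch` and `BatchReassembler` are hypothetical names, not part of the library.

```typescript
// Shape of a decoded batch, as returned in the example above.
interface DecodedBatch<C> {
  cellOffset: number;
  cells: C[];
}

// Hypothetical helper: accepts batches only when they continue the stream.
// ASSUMPTION: cellOffset is the stream index of the first cell in the batch.
class BatchReassembler<C> {
  private expectedOffset = 0;

  // Returns the batch's cells, or throws if a batch was lost or reordered.
  accept(batch: DecodedBatch<C>): C[] {
    if (batch.cellOffset !== this.expectedOffset) {
      throw new Error(
        "expected batch at offset " + this.expectedOffset + ", got " + batch.cellOffset,
      );
    }
    this.expectedOffset += batch.cells.length;
    return batch.cells;
  }
}
```

A transport that can reorder or drop messages might buffer out-of-order batches instead of throwing; that policy is left out to keep the sketch short.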
API
CodecOptions<T>
Both constructors accept an optional CodecOptions<T> object:
| Option | Type | Default | Description |
| -------- | -------------------------- | --------------------- | ------------------------------------- |
| encode | (value: T) => Uint8Array | UTF-8 string encoding | Serializes an element to bytes |
| decode | (bytes: Uint8Array) => T | UTF-8 string decoding | Deserializes bytes back to an element |
When T is string (the default), both can be omitted.
BloomStream<T>
new BloomStream(collection, options?)
Creates a cell stream from a collection. options is a CodecOptions<T> — both encode and decode are used.
.next(n): Generator<Cell> — yields the next n cells. Successive calls continue where the last left off.
BloomDiff<T>
new BloomDiff(options?)
Creates an empty diff. options is a CodecOptions<T> — only decode is used (to reconstruct elements from peeled cells).
.addCells(leftCells, rightCells) — appends paired cell batches and re-peels. leftCells and rightCells must have equal length. Can be called multiple times to incrementally extend the diff.
.status — "peelable" if the diff resolved, "needs more data" otherwise.
.left — elements present only in the left set.
.right — elements present only in the right set.
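Because `addCells` can be called repeatedly, the "ask for more until it resolves" pattern can be wrapped in a small loop. This is a sketch rather than part of the library: `reconcile`, `CellSource`, `DiffSink`, and the `batchSize` and `maxCells` defaults are hypothetical, and the interfaces mirror only the members documented above (`next`, `addCells`, `status`, `left`, `right`).

```typescript
// Structural types covering only the documented members.
interface CellSource<C> {
  next(n: number): Iterable<C>;
}
interface DiffSink<C, T> {
  addCells(leftCells: C[], rightCells: C[]): unknown;
  readonly status: string;
  readonly left: T[];
  readonly right: T[];
}

// Pull equal-sized batches from both streams until the diff peels,
// giving up after maxCells cells per side (an arbitrary safety cap).
function reconcile<C, T>(
  left: CellSource<C>,
  right: CellSource<C>,
  diff: DiffSink<C, T>,
  batchSize = 16,
  maxCells = 4096,
): { left: T[]; right: T[] } {
  let pulled = 0;
  while (pulled < maxCells) {
    // Both batches have batchSize cells, as addCells requires equal lengths.
    diff.addCells(Array.from(left.next(batchSize)), Array.from(right.next(batchSize)));
    pulled += batchSize;
    if (diff.status === "peelable") {
      return { left: diff.left, right: diff.right };
    }
  }
  throw new Error("diff did not resolve within " + maxCells + " cells per side");
}
```

With the classes from this package, a call might look like `reconcile(new BloomStream(a), new BloomStream(b), new BloomDiff())`. In a networked setting the left stream would be fed by cells received from the remote peer instead of a local `BloomStream`.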
Cell
interface Cell {
  idSum: Uint8Array;
  hashSum: bigint;
  count: number;
}
Wire serialization
serializeCell(cell: Cell): Uint8Array
deserializeCell(buf: Uint8Array): Cell
serializeCellBatch(cells: Cell[], cellOffset?: number): Uint8Array
deserializeCellBatch(buf: Uint8Array): { cellOffset: number; cells: Cell[] }
License
ISC
