@zakkster/lite-stats-math

v1.0.0

Published

7 days ago

Single-pass avg/min/max/quantile aggregation over a RingBuffer. NaN-filtering, zero per-frame allocation.

Downloads

0High
0Medium
0Low

zakkster

stats statistics quantile percentile telemetry zero-gc single-pass quickselect lite no-allocation performance

@zakkster/lite-stats-math

$npm version$ $npm bundle size$ $npm downloads$ $npm total downloads$

Single-pass avg / min / max / quantile aggregation over a RingBuffer. NaN-filtering. Caller-owned output object. Zero per-frame allocation.

import { RingBuffer } from '@zakkster/lite-ring-buffer';
import { StatsMath }   from '@zakkster/lite-stats-math';

const rb    = new RingBuffer(1024);
const stats = new StatsMath(rb.capacity);
const out   = { avg: 0, min: 0, max: 0, p01: 0, p99: 0 };  // hoist & reuse

function frame(dt) {
  rb.push(dt);
  stats.compute(rb, out);
  // out.avg, out.min, out.max, out.p01, out.p99 are all up to date.
  // Zero allocation per frame.
}

Why

You have a sliding window of telemetry — frame times, audio peaks, mouse velocity — and you want a HUD that shows mean, min, max, and tail percentiles, every frame, without GC pauses.

The naive version returns a fresh object:

// One object + four boxed numbers per frame, ~6 KB/sec of garbage at 60 fps
function statsOf(window) {
  const sorted = [...window].sort((a, b) => a - b);
  return {
    avg: window.reduce((a, b) => a + b, 0) / window.length,
    min: sorted[0],
    max: sorted[sorted.length - 1],
    p01: sorted[Math.floor(sorted.length * 0.01)],
    p99: sorted[Math.floor(sorted.length * 0.99)],
  };
}

There are four allocation sources hiding in there: the spread, the sort, the reduce closure, and the result object. At 60 fps × dozens of metrics they add up.

@zakkster/lite-stats-math collapses all four:

The window is read directly from a RingBuffer via copyTo into a caller-owned scratchpad that lives for the lifetime of the StatsMath instance.
NaN values are filtered in-place by compaction during the same single pass that computes mean/min/max.
The two quantiles share that compacted scratchpad — no second copy.
Results are written into a caller-owned out object that you allocate once and reuse forever.

The hot path: one copyTo, one pass that compacts + accumulates, two quantile lookups. No allocations, no closures, no return-value churn.

Install

npm i @zakkster/lite-ring-buffer @zakkster/lite-stats-math

ESM-only. Zero dependencies (the RingBuffer is a peer — StatsMath only relies on the copyTo(dst, dstOffset) → number contract).

import { StatsMath } from '@zakkster/lite-stats-math';

Quick start

import { RingBuffer } from '@zakkster/lite-ring-buffer';
import { StatsMath }   from '@zakkster/lite-stats-math';

// Allocate once, at startup.
const rb    = new RingBuffer(1024);                                  // sliding window
const stats = new StatsMath(rb.capacity);                            // scratchpad
const out   = { avg: 0, min: 0, max: 0, p01: 0, p99: 0 };           // result object

// Per frame.
function tick(dtMs) {
  rb.push(dtMs);
  stats.compute(rb, out);
  hud.set(out.avg, out.p99);  // your renderer reads from out
}

Single-quantile mode

If you only need one tail (e.g. a TP99 latency display), skip compute and call quantile directly. It still does the NaN-compaction pass — the difference is that it skips the avg/min/max accumulation and writes only one output field.

const out = { p99: 0 };
stats.quantile(rb, 0.99, out, 'p99');

How it works

Single-pass NaN-filtering compaction

flowchart LR
    subgraph IN["RingBuffer.copyTo → scratchpad"]
        direction LR
        S0[5.0] --- S1[NaN] --- S2[3.0] --- S3[NaN] --- S4[7.0] --- S5[2.0]
    end
    subgraph PASS["Single pass: compact + accumulate"]
        direction LR
        P[("for i in 0..rawCount:<br/>if v == v:<br/>  scratchpad[count++] = v<br/>  sum += v<br/>  min = ..; max = ..")]
    end
    subgraph OUT["scratchpad after pass"]
        direction LR
        O0[5.0] --- O1[3.0] --- O2[7.0] --- O3[2.0] --- ON[stale: NaN/7.0]
    end
    IN --> PASS --> OUT

v !== v is the canonical NaN test (NaN is the only value that's not equal to itself in IEEE-754). It costs a single comparison per sample.

After the pass, count is the number of finite samples, and scratchpad[0..count-1] is contiguous valid data. Indices count..rawCount-1 are stale and never read again that frame.

Quantile selection

For n ≤ 8 (typical for tiny startup windows): insertion sort. It's actually faster than quickselect for small n, and as a side-benefit it leaves the scratchpad sorted — so a second quantile call within the same compute reads the same array without re-sorting.

For n > 8: quickselect with median-of-three pivot. This avoids the O(n²) worst case on already-sorted or reverse-sorted input — the common case for telemetry where samples drift monotonically. The two compute calls (p01 then p99) operate on increasingly-rearranged data; that's intentional and harmless because each call only needs the value at index k, not a fully sorted array.

flowchart TB
    Q{count ≤ 8?} -->|yes| IS["insertion sort<br/>O(n²) but tiny n<br/>+ leaves array sorted"]
    Q -->|no|  QS["quickselect with<br/>median-of-three pivot<br/>O(n) average"]
    IS --> R[scratchpad idx]
    QS --> R

Index calculation

Quantiles use the nearest-rank method with rounding:

idx = round((n - 1) * p)

For n = 100, p = 0.99: idx = round(99 * 0.99) = round(98.01) = 98 → the sample at sorted-index 98, i.e. the 99th-percentile bucket. For n < 100 the percentile values collapse toward the extremes, which is why a 1024-sample window is a good default for stable tails.

API reference

`new StatsMath(capacity)`

| Arg | Type | Description | |---|---|---| | capacity | number | Scratchpad size. Must be ≥ the largest count any RingBuffer passed to compute() will hold. Typically pass ringBuffer.capacity. |

Throws RangeError if capacity is not a finite positive number.

Methods

| Method | Returns | Description | |---|---|---| | compute(ringBuffer, out) | the out reference | Single-pass avg / min / max + p01 / p99. Mutates out. | | quantile(ringBuffer, p, out, key?) | the out reference | Single-quantile mode. Writes to out[key] (default 'quantile'). | | destroy() | void | Drop the scratchpad reference. Subsequent calls are undefined behavior. |

`out` shape

Pass an object with all five fields pre-allocated for compute:

{ avg: number, min: number, max: number, p01: number, p99: number }

For quantile, pass any object — the implementation only writes out[key].

Edge cases & guarantees

Empty or all-NaN window → all output fields are zeroed. compute will not leave stale values from a previous call. quantile zeroes only out[key].
NaN is silently filtered. ±Infinity passes through and will appear in min / max as expected.
compute is single-pass over the data. The NaN-compaction and avg/min/max accumulation share one loop; the two quantiles run on the already-compacted scratchpad without re-copying.
Quantile clamping. p < 0 clamps to 0; p > 1 clamps to 1. The library never throws on out-of-range p.
Median-of-three quickselect. The pivot strategy guarantees average-case O(n) on sorted, reverse-sorted, and random input. The test suite verifies sub-50ms on a 1000-element sorted window.
Scratchpad capacity is fixed at construction. It is never resized. If you pass a RingBuffer whose count exceeds the scratchpad size, the typed-array set will throw a RangeError. Always size the scratchpad to ringBuffer.capacity.
out is mutated and returned. The return value is a convenience for chaining; the same reference is mutated in-place.
Hot-path zero-allocation. compute and quantile allocate nothing on a steady-state call. The test suite verifies sub-MB heap growth across 1000 compute calls under --expose-gc.

FAQ

Why does the input have to be a RingBuffer? It doesn't, technically — StatsMath only depends on the copyTo(dst, dstOffset) → number contract. Any object that implements it (a Float32Array wrapper, a custom buffer) works. The peer-dep is conventional, not enforced.

Why not use Math.min(...arr) / Math.max(...arr)? The spread allocates an arguments array, and Math.min chokes on windows >~10k elements (stack-overflow on most engines). The single explicit loop is both faster and bounded.

Why round in the nearest-rank quantile? Don't most libraries interpolate? Linear interpolation between bucket boundaries is more "correct" by some statistical definitions, but it requires two memory reads instead of one and gives values that don't necessarily appear in your data. For HUD displays, "the 99th-percentile sample" is more useful than "a number computed between two samples". If you need linear-interpolation quantiles, fork the _calcQuantile method and replace the index lookup.

Can I get more than two quantiles per compute call? Only if you call quantile separately afterwards. compute is hard-coded to p01/p99 because those are the standard "tail" displays. For arbitrary multi-quantile output, call compute then quantile(rb, 0.5, out, 'p50') etc. — each quantile does its own NaN-compaction pass (cheap on a 1k-sample buffer).

What about stddev? Not included — it requires a second pass over the compacted data. Add it locally:

let sumSq = 0;
for (let i = 0; i < count; i++) {
  const d = scratchpad[i] - out.avg;
  sumSq += d * d;
}
out.std = Math.sqrt(sumSq / count);

Companion libraries

@zakkster/lite-ring-buffer — the sliding-window source.
@zakkster/lite-canvas-graph — pair this with a stats overlay rendered via the labelBitmapHook.

Published

Vulnerabilities

Links

Maintainers

Keywords

Readme

@zakkster/lite-stats-math

Contents

Why

Install

Quick start

Single-quantile mode

How it works

Single-pass NaN-filtering compaction

Quantile selection

Index calculation

API reference

new StatsMath(capacity)

Methods

out shape

Edge cases & guarantees

FAQ

Companion libraries

License

`new StatsMath(capacity)`

`out` shape