@delusional/midpoint-proximity

v0.1.8

Published

a month ago

Hyper-optimised midpoint proximity ranking for point pairs on a spherical surface

0High
0Medium
0Low

@delusional/midpoint-proximity

Given two lists of geographic points (A and B) and a target, find the (A, B) pairs whose geographic midpoint is closest to the target. The brute-force approach checks all |A| x |B| pairs — this crate uses a spatial index to avoid that quadratic blowup.

Installation

npm install @delusional/midpoint-proximity

Usage

import init, {
  initThreadPool,
  find_top_midpoints,
  find_best_pair,
  compute_all_midpoints,
} from "@delusional/midpoint-proximity";

await init();
await initThreadPool(navigator.hardwareConcurrency);

// Flat [lat, lng, lat, lng, ...] in degrees
const pointsA = new Float64Array([51.5074, -0.1278, 48.8566, 2.3522]);
const pointsB = new Float64Array([40.7128, -74.0060, 35.6762, 139.6503]);

// Top 5 pairs whose midpoint is closest to (50°N, 0°E), no minimum endpoint distance
const results = find_top_midpoints(pointsA, pointsB, 50.0, 0.0, 5, 0.0);
// Float64Array: [mid_lat, mid_lng, dist_to_target_km, idx_a, idx_b, dist_ab_km, ...]
//               (6 values per result, sorted closest-first)

// Single best pair
const best = find_best_pair(pointsA, pointsB, 50.0, 0.0, 0.0);
// Float64Array: [mid_lat, mid_lng, dist_to_target_km, idx_a, idx_b, dist_ab_km]

// All midpoints (no target, no ranking)
const all = compute_all_midpoints(pointsA, pointsB);
// Float64Array: [mid_lat, mid_lng, idx_a, idx_b, dist_ab_km, ...]
//               (5 values per pair)

Note: initThreadPool requires SharedArrayBuffer, which needs these response headers:

Cross-Origin-Opener-Policy: same-origin
Cross-Origin-Embedder-Policy: require-corp

A should be the smaller set — see Quick overview below for why.

Quick overview

The algorithm builds a spatial index on B, then loops over every A point. For each A, it computes where the ideal B partner would be (the "reflection point"), searches the index near that location, and scores a small candidate set instead of all of B.

A should be the smaller set. The outer loop iterates over A, and the index absorbs the size of B. Building the index is O(|B|) — done once. Each A point queries it in O(C) where C is a small constant (256-4096 candidates). So total work is O(|B| + |A| x C). Putting the smaller list as A minimises the number of loop iterations, which is where the per-point work (reflection, grid lookup, flood-fill, scoring) happens.

architectural diagram

Three public functions:

find_top_midpoints — the sub-quadratic indexed search (main entry point)
find_best_pair — optimised wrapper for N=1
compute_all_midpoints — brute-force all pairs (no target, no ranking)

Key insights

The reflection trick

For a given A point, which B point would produce a midpoint exactly on the target? The geographic midpoint of two unit vectors a and b is normalize(a + b). For this to equal the target t, we need a + b parallel to t, which gives us b* = 2(a . t)t - a.

This b* is the reflection of a about the target axis. Instead of checking all B points, we search the grid in a small neighbourhood around b*. This turns the search from O(|B|) per A to O(C) where C is typically 256-4096 candidates.

The octahedral grid

B points are projected onto the unit sphere and then onto a 2D square using the octahedral equal-area projection. This projection unfolds the sphere onto the faces of an octahedron, giving roughly uniform cell sizes (unlike equirectangular, which bunches cells at the poles).

The 2D square is divided into a res x res grid (256, 512, or 1024 depending on |B|). Each B point lands in one cell. The grid uses a dense offset array (CSR format — Compressed Sparse Row) so looking up which B points are in cell (u, v) is a direct array index — O(1), no binary search.

Three grids are built with different axis permutations (XYZ, YZX, ZXY). Each permutation puts a different axis as the "primary" axis, so cells near that axis have the best resolution. For any query direction, we pick the single best table — the one where the query direction is closest to the primary axis — giving the tightest candidate set without redundant work.

The flood-fill

Once we know which cell b* lands in, we need to find all nearby cells that could contain promising B points. The algorithm:

Center cell: always included.
Four axis spines: walk north/south/east/west from the center. For each cell, check if a midpoint formed from that cell's center could beat the current threshold. Stop when it can't. Uses SIMD f32x4 to test 4 consecutive cells per instruction.
Four quadrant sweeps: fill in the diagonal regions bounded by the spines. Each quadrant sweeps column-by-column, and the column height monotonically decreases (convexity of the threshold boundary). Every cell is visited exactly once — no visited bitset needed.

The threshold check is: "if I placed A's partner at this cell's center, would the resulting midpoint be closer to the target than my current N-th best result?" This is computed as dot^2 > threshold * len_sq — multiplications only, no division.

The ranking metric

The true geodesic distance requires asin and sqrt, which are expensive. Instead, we rank by -(dot * |dot|) / |mid|^2 where dot = midpoint . target and |mid|^2 is the squared length of the unnormalised midpoint. This is monotonic with geodesic distance — it gives the exact same ranking but uses only multiplies and one division (and the division only happens for candidates that pass the threshold check).

Why single-table selection works

The octahedral projection stretches cells near the "equator" of each permutation's primary axis. Three tables (XYZ, YZX, ZXY) guarantee that for any direction, at least one table has tight cells. But querying all three tables triples the work for little benefit — the best table already gives a tight candidate set. Selecting the best table per query reduces gather time significantly.

The selection is trivial: compare which component of b* is largest in absolute value, and pick the table whose primary axis matches. Three comparisons, O(1).

Implementation details

Memory layout

B point coordinates are stored in Structure of Arrays (SoA) format: separate Vec<f32> for x, y, and z. This means consecutive cells along the v-axis have their x-coordinates contiguous in memory, then y-coordinates contiguous, etc. The SIMD flood-fill exploits this — f32x4::from_slice loads 4 consecutive cell centers with a single instruction, no gather needed.

Cell offsets are a dense Vec<u32> of size res*res + 1. Cell (u, v) contains B indices indices[cell_offsets[u*res+v] .. cell_offsets[u*res+v+1]]. Empty cells have equal consecutive offsets. This is the standard CSR (Compressed Sparse Row) format used in sparse matrices.

Memory cost at res=512: ~1 MB for cell offsets + ~3 MB for cell centers = ~4 MB per table.

Parallelism

The outer loop over A points is parallelised with Rayon. Each thread maintains:

A local results buffer (top-N candidates found so far)
A local pruning threshold (tightens as better results are found)
A reusable scratch buffer for candidates

When a thread's buffer exceeds a size limit, it runs select_nth_unstable to compact down to the top-N and updates its threshold. This batched compaction amortises the O(n) selection cost across many insertions.

After all threads finish, results are merged with a reduce step that keeps the global top-N.

Target-neighbor safety net

When a . t <= 0 (A is on the opposite hemisphere from the target), the reflection b* points away from the target. The flood-fill near b* might miss B points that are actually near the target. To handle this, we pre-gather a set of B points near the target direction and merge them into each A's candidate set.

This merge is conditionally skipped per A point when the pruning threshold is tight enough. The best possible metric from a target neighbor is approximately -(a.t + 1) / 2. When this can't beat the current threshold, the merge is skipped entirely — saving work for the majority of A points once early iterations have established good results.

SIMD details

The crate uses Rust's portable_simd (#![feature(portable_simd)]) which compiles to WASM SIMD128 instructions when targeting wasm32-unknown-unknown with -C target-feature=+simd128.

The threshold check on a cell center is:

mx = ax + cell_center_x
my = ay + cell_center_y
mz = az + cell_center_z
dot = mx*tx + my*ty + mz*tz
len_sq = mx*mx + my*my + mz*mz
passes = dot*dot > threshold * len_sq

With f32x4, this tests 4 cells in ~12 SIMD instructions. North/south spines use contiguous f32x4::from_slice loads (cells along v are contiguous). East/west spines use f32x4::from_array with manual gather (stride = res).

Index construction

Building the grid is O(|B|) per table:

Compute each B point's cell index via octahedral projection — O(|B|)
Count points per cell — O(|B|)
Prefix sum to build the dense offset array — O(res^2)
Scatter B indices into position — O(|B|)
Precompute cell-center unit vectors — O(res^2)

No sorting is needed. The prefix-sum + scatter pattern is the standard CSR construction used in sparse matrix libraries.

Grid resolution

| B size | Resolution | Cells | Cell offsets | Cell centers | |--------|-----------|-------|-------------|-------------| | < 50K | 256 | 65K | 256 KB | 768 KB | | 50K-500K | 512 | 262K | 1 MB | 3 MB | | > 500K | 1024 | 1M | 4 MB | 12 MB |

Complexity

| Function | Time | Space | |----------|------|-------| | find_top_midpoints | O(|B| + |A| · C + k log k) | O(|B|) index + O(k) per thread | | find_best_pair | O(|B| + |A| · C) | O(|B|) index | | compute_all_midpoints | O(|A| · |B|) | O(|A| · |B|) |

Where C is the per-A candidate count (typically 256–4096, controlled by grid resolution and pruning threshold), and k = top_n. Candidate selection uses select_nth_unstable (introselect, O(n)) with batched compaction. The final k winners are sorted in O(k log k). Index construction is O(|B|) per table (3 tables). The outer loop over A is parallelised with Rayon. compute_all_midpoints is the brute-force baseline with no indexing — quadratic in both time and space.

Published

Vulnerabilities

Links

Maintainers

Keywords

Readme

@delusional/midpoint-proximity

Installation

Usage

Quick overview

Key insights

The reflection trick

The octahedral grid

The flood-fill

The ranking metric

Why single-table selection works

Implementation details

Memory layout

Parallelism

Target-neighbor safety net

SIMD details

Index construction

Grid resolution

Complexity