@delusional/midpoint-proximity
v0.1.1
Published
Hyper-optimised midpoint proximity ranking for point pairs on a spherical surface
Readme
@delusional/midpoint-proximity
Given two lists of geographic points (A and B) and a target, find the (A, B) pairs whose
geographic midpoint is closest to the target. The brute-force approach checks all |A| x |B|
pairs — this crate uses a spatial index to avoid that quadratic blowup.
Installation
npm install @delusional/midpoint-proximityUsage
import init, {
initThreadPool,
find_top_midpoints,
find_best_pair,
compute_all_midpoints,
} from "@delusional/midpoint-proximity";
await init();
await initThreadPool(navigator.hardwareConcurrency);
// Flat [lat, lng, lat, lng, ...] in degrees
const pointsA = new Float64Array([51.5074, -0.1278, 48.8566, 2.3522]);
const pointsB = new Float64Array([40.7128, -74.0060, 35.6762, 139.6503]);
// Top 5 pairs whose midpoint is closest to (50°N, 0°E), no minimum endpoint distance
const results = find_top_midpoints(pointsA, pointsB, 50.0, 0.0, 5, 0.0);
// Float64Array: [mid_lat, mid_lng, dist_to_target_km, idx_a, idx_b, dist_ab_km, ...]
// (6 values per result, sorted closest-first)
// Single best pair
const best = find_best_pair(pointsA, pointsB, 50.0, 0.0, 0.0);
// Float64Array: [mid_lat, mid_lng, dist_to_target_km, idx_a, idx_b, dist_ab_km]
// All midpoints (no target, no ranking)
const all = compute_all_midpoints(pointsA, pointsB);
// Float64Array: [mid_lat, mid_lng, idx_a, idx_b, dist_ab_km, ...]
// (5 values per pair)Note: initThreadPool requires SharedArrayBuffer, which needs these response headers:
Cross-Origin-Opener-Policy: same-origin
Cross-Origin-Embedder-Policy: require-corpA should be the smaller set — see Quick overview below for why.
Quick overview
The algorithm builds a spatial index on B, then loops over every A point. For each A, it computes where the ideal B partner would be (the "reflection point"), searches the index near that location, and scores a small candidate set instead of all of B.
A should be the smaller set. The outer loop iterates over A, and the index absorbs the size of B. Building the index is O(|B|) — done once. Each A point queries it in O(C) where C is a small constant (256-4096 candidates). So total work is O(|B| + |A| x C). Putting the smaller list as A minimises the number of loop iterations, which is where the per-point work (reflection, grid lookup, flood-fill, scoring) happens.

Three public functions:
find_top_midpoints— the sub-quadratic indexed search (main entry point)find_best_pair— optimised wrapper for N=1compute_all_midpoints— brute-force all pairs (no target, no ranking)
Key insights
The reflection trick
For a given A point, which B point would produce a midpoint exactly on the target? The
geographic midpoint of two unit vectors a and b is normalize(a + b). For this to equal
the target t, we need a + b parallel to t, which gives us b* = 2(a . t)t - a.
This b* is the reflection of a about the target axis. Instead of checking all B points, we search the grid in a small neighbourhood around b*. This turns the search from O(|B|) per A to O(C) where C is typically 256-4096 candidates.
The octahedral grid
B points are projected onto the unit sphere and then onto a 2D square using the octahedral equal-area projection. This projection unfolds the sphere onto the faces of an octahedron, giving roughly uniform cell sizes (unlike equirectangular, which bunches cells at the poles).
The 2D square is divided into a res x res grid (256, 512, or 1024 depending on |B|).
Each B point lands in one cell. The grid uses a dense offset array (CSR format —
Compressed Sparse Row) so looking up which B points are in cell (u, v) is a direct
array index — O(1), no binary search.
Three grids are built with different axis permutations (XYZ, YZX, ZXY). Each permutation puts a different axis as the "primary" axis, so cells near that axis have the best resolution. For any query direction, we pick the single best table — the one where the query direction is closest to the primary axis — giving the tightest candidate set without redundant work.
The flood-fill
Once we know which cell b* lands in, we need to find all nearby cells that could contain promising B points. The algorithm:
- Center cell: always included.
- Four axis spines: walk north/south/east/west from the center. For each cell, check if a midpoint formed from that cell's center could beat the current threshold. Stop when it can't. Uses SIMD f32x4 to test 4 consecutive cells per instruction.
- Four quadrant sweeps: fill in the diagonal regions bounded by the spines. Each quadrant sweeps column-by-column, and the column height monotonically decreases (convexity of the threshold boundary). Every cell is visited exactly once — no visited bitset needed.
The threshold check is: "if I placed A's partner at this cell's center, would the
resulting midpoint be closer to the target than my current N-th best result?" This is
computed as dot^2 > threshold * len_sq — multiplications only, no division.
The ranking metric
The true geodesic distance requires asin and sqrt, which are expensive. Instead, we
rank by -(dot * |dot|) / |mid|^2 where dot = midpoint . target and |mid|^2 is the
squared length of the unnormalised midpoint. This is monotonic with geodesic distance —
it gives the exact same ranking but uses only multiplies and one division (and the division
only happens for candidates that pass the threshold check).
Why single-table selection works
The octahedral projection stretches cells near the "equator" of each permutation's primary axis. Three tables (XYZ, YZX, ZXY) guarantee that for any direction, at least one table has tight cells. But querying all three tables triples the work for little benefit — the best table already gives a tight candidate set. Selecting the best table per query reduces gather time significantly.
The selection is trivial: compare which component of b* is largest in absolute value, and pick the table whose primary axis matches. Three comparisons, O(1).
Implementation details
Memory layout
B point coordinates are stored in Structure of Arrays (SoA) format: separate Vec<f32>
for x, y, and z. This means consecutive cells along the v-axis have their x-coordinates
contiguous in memory, then y-coordinates contiguous, etc. The SIMD flood-fill exploits
this — f32x4::from_slice loads 4 consecutive cell centers with a single instruction,
no gather needed.
Cell offsets are a dense Vec<u32> of size res*res + 1. Cell (u, v) contains B
indices indices[cell_offsets[u*res+v] .. cell_offsets[u*res+v+1]]. Empty cells have
equal consecutive offsets. This is the standard CSR (Compressed Sparse Row) format used
in sparse matrices.
Memory cost at res=512: ~1 MB for cell offsets + ~3 MB for cell centers = ~4 MB per table.
Parallelism
The outer loop over A points is parallelised with Rayon. Each thread maintains:
- A local results buffer (top-N candidates found so far)
- A local pruning threshold (tightens as better results are found)
- A reusable scratch buffer for candidates
When a thread's buffer exceeds a size limit, it runs select_nth_unstable to compact
down to the top-N and updates its threshold. This batched compaction amortises the O(n)
selection cost across many insertions.
After all threads finish, results are merged with a reduce step that keeps the global top-N.
Target-neighbor safety net
When a . t <= 0 (A is on the opposite hemisphere from the target), the reflection b*
points away from the target. The flood-fill near b* might miss B points that are
actually near the target. To handle this, we pre-gather a set of B points near the target
direction and merge them into each A's candidate set.
This merge is conditionally skipped per A point when the pruning threshold is tight
enough. The best possible metric from a target neighbor is approximately -(a.t + 1) / 2.
When this can't beat the current threshold, the merge is skipped entirely — saving work
for the majority of A points once early iterations have established good results.
SIMD details
The crate uses Rust's portable_simd (#![feature(portable_simd)]) which compiles to
WASM SIMD128 instructions when targeting wasm32-unknown-unknown with
-C target-feature=+simd128.
The threshold check on a cell center is:
mx = ax + cell_center_x
my = ay + cell_center_y
mz = az + cell_center_z
dot = mx*tx + my*ty + mz*tz
len_sq = mx*mx + my*my + mz*mz
passes = dot*dot > threshold * len_sqWith f32x4, this tests 4 cells in ~12 SIMD instructions. North/south spines use
contiguous f32x4::from_slice loads (cells along v are contiguous). East/west spines
use f32x4::from_array with manual gather (stride = res).
Index construction
Building the grid is O(|B|) per table:
- Compute each B point's cell index via octahedral projection — O(|B|)
- Count points per cell — O(|B|)
- Prefix sum to build the dense offset array — O(res^2)
- Scatter B indices into position — O(|B|)
- Precompute cell-center unit vectors — O(res^2)
No sorting is needed. The prefix-sum + scatter pattern is the standard CSR construction used in sparse matrix libraries.
Grid resolution
| B size | Resolution | Cells | Cell offsets | Cell centers | |--------|-----------|-------|-------------|-------------| | < 50K | 256 | 65K | 256 KB | 768 KB | | 50K-500K | 512 | 262K | 1 MB | 3 MB | | > 500K | 1024 | 1M | 4 MB | 12 MB |
Complexity
| Function | Time | Space |
|----------|------|-------|
| find_top_midpoints | O(|B| + |A| · C + k log k) | O(|B|) index + O(k) per thread |
| find_best_pair | O(|B| + |A| · C) | O(|B|) index |
| compute_all_midpoints | O(|A| · |B|) | O(|A| · |B|) |
Where C is the per-A candidate count (typically 256–4096, controlled by grid resolution
and pruning threshold), and k = top_n. Candidate selection uses select_nth_unstable
(introselect, O(n)) with batched compaction. The final k winners are sorted in O(k log k).
Index construction is O(|B|) per table (3 tables). The outer loop over A is parallelised
with Rayon. compute_all_midpoints is the brute-force baseline with no indexing —
quadratic in both time and space.
