lightning-levenshtein
v0.0.5
Published
Blazing-fast Levenshtein distance with bit-parallel Myers algorithm and early exits.
Maintainers
Readme
⚡ lightning-levenshtein
Fast Levenshtein distance in pure JavaScript. Compact default API for general-purpose edit distance, plus opt-in subpaths for maximum throughput and Unicode-width character tables.
Features
- Specialized kernels for very short strings
- Precompiled bit-parallel dispatch for short inputs
- Fixed-width Myers variants for medium inputs
- Generalized Myers fallback for large inputs
- Zero runtime dependencies
- Works in Node.js and browsers
Install
pnpm add lightning-levenshteinQuick start
import { distance, distanceMax, closest } from "lightning-levenshtein";
distance("kitten", "sitting"); // 3
distanceMax("kitten", "sitting", 2); // -1 (over threshold)
closest("kitten", ["kitchen", "sitting"]); // "kitten"-nearest entryAPI
The package exposes three logical entrypoints, each with an ESM-source path for bundlers and a pre-built minified path for unbundled <script type="module"> use:
{
"exports": {
".": { "import": "./src/index.js", "default": "./dist/lightning-levenshtein.min.js" },
"./min": { "default": "./dist/lightning-levenshtein.min.js" },
"./v2": { "default": "./dist/lightning-levenshtein-v2.min.js" },
"./unicode": { "import": "./src/unicode.js", "default": "./dist/lightning-levenshtein-unicode.min.js" },
"./unicode/min": { "default": "./dist/lightning-levenshtein-unicode.min.js" }
}
}As of 0.0.4, the . and ./unicode entries route bundlers (esbuild, vite, rollup, webpack 5) at ESM source so unused exports can be tree-shaken. Single-function consumers of distance drop ~26% gzipped vs the pre-bundled .min.js. The ./v2 entry still ships only the pre-built bundle pending a source-folder refactor.
Default API
import { distance, distanceMax, closest } from "lightning-levenshtein";The stable core API. Bundlers receive ESM source; unbundled consumers can use lightning-levenshtein/min to point at the pre-built bundle directly.
Max-throughput API
The /v2 subpath exposes the larger max-throughput runtime:
import { levenshteinLightning } from "lightning-levenshtein/v2";This resolves to dist/lightning-levenshtein-v2.min.js. The v2 build is a separate optimized runtime that uses more aggressive length-based dispatch, tiny-string fast paths, precompiled 32-bit kernels, fixed-width Myers variants, and a generalized large-input fallback. Choose it when throughput matters more than the extra JavaScript payload.
Unicode API
Unicode-correct UTF-16 code-unit distance:
import { distanceUnicode } from "lightning-levenshtein/unicode";Use this path when your strings can contain code units above 255, such as many BMP characters. It is intentionally separate from the default entrypoint so the default distance hot path does not pay for wider PEQ tables or per-call character-set routing. Use lightning-levenshtein/unicode/min for the pre-built bundle.
Notes
Which one should I pick?
Use the default package entrypoint if you want the stable general-purpose API with the smallest production build:
import { distance, distanceMax, closest } from "lightning-levenshtein";Use the v2 subpath if you specifically want the specialized levenshteinLightning runtime and are comfortable with the larger payload:
import { levenshteinLightning } from "lightning-levenshtein/v2";Use the unicode subpath if you need UTF-16 code-unit correctness beyond the default low-memory PEQ table:
import { distanceUnicode } from "lightning-levenshtein/unicode";Character table strategy
The default API is tuned for the hottest low-memory path and assumes ASCII/Latin-1-style input for its PEQ tables.
The unicode subpath exposes distanceUnicode, which is backed by full UTF-16 code-unit PEQ tables. That path is intentionally separate so callers can opt into wider character support without adding per-call detection to the default distance hot path.
The design direction is:
- keep
distancefast and pre-routed for common ASCII/Latin-1 workloads - expose Unicode-correct behavior through the deliberate
/unicodesubpath - share kernel implementations by binding the PEQ table before entering the hot function
- avoid naive automatic routing that scans both strings on every call
Dispatch strategy
The runtime selects the cheapest correct kernel for the current input size.
- 1–32 chars: precompiled bit-parallel kernels
- 33–64 chars: fixed-width Myers specialization
- 65–96 chars: fixed-width Myers specialization
- 97–128 chars: fixed-width Myers specialization
- 129–224 chars: generalized macro-block Myers dispatch
- 225–256 chars: fixed-width Myers specialization
- 257–512 chars: generalized macro-block Myers dispatch
- 513+ chars: large-input generalized Myers dispatch
This keeps tiny inputs fast without sacrificing larger-input performance.
Benchmark
The benchmark harness generates the same string pairs for every library at each tested length and seed. Benchmarks were run in Node.js v24.11.0.
Methodology:
- 500 random equal-length string pairs per test size
- 3 seeds:
1337,7331,20250321 - 500 ms measurement window per seed
- 3 warm-up rounds before timing
- alphabet:
A-Z,a-z,0-9 - reported table values: mean ops/ms across 3 seeds
Mean ops/sec:
| Test Target | N=1 | N=2 | N=4 | N=8 | N=16 | N=32 | N=64 | N=128 | N=256 | N=512 | N=1024 | |---|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:| | lightning-levenshtein-v2 | 180629 | 91386 | 50257 | 32369 | 11949 | 6514 | 1805 | 581.9 | 158.0 | 35.92 | 9.327 | | lightning-levenshtein-v1 | 94625 | 61719 | 40905 | 25507 | 8577 | 4938 | 1224 | 466.0 | 134.5 | 35.25 | 9.117 | | fastest-levenshtein | 100634 | 74359 | 43909 | 23150 | 8307 | 4240 | 1087 | 290.2 | 78.41 | 19.55 | 5.092 | | js-levenshtein | 115773 | 87076 | 23668 | 11343 | 3373 | 924.0 | 241.7 | 61.35 | 15.92 | 3.985 | 1.006 | | leven | 82490 | 41685 | 21482 | 8188 | 1785 | 420.2 | 114.2 | 29.90 | 7.607 | 1.913 | 0.481 | | levenshtein-edit-distance | 114674 | 55327 | 24957 | 8463 | 1792 | 396.6 | 106.6 | 27.38 | 6.992 | 1.754 | 0.438 |
Relative throughput vs fastest-levenshtein. This chart normalizes fastest-levenshtein to 100% at each string length and shows every other library relative to that baseline. Use it for an apples-to-apples comparison against the package most people already know. Values above 100% mean faster than fastest-levenshtein; values below 100% mean slower.
Throughput across input sizes. Mean ops/sec shown on a log-scaled Y axis across the full tested range.
Relative throughput, lightning-levenshtein only. Same baseline as above, narrowed to just lightning-levenshtein for a clearer read.
Rank by input length. Where each library ranks at each tested string length. Useful because raw throughput can be noisy to read at a glance, while rank makes the ordering obvious. If a library is consistently ranked first across the range, you can see that immediately without squinting at the absolute numbers.
Results
lightning-levenshteinis the fastest library in this benchmark set at every tested length.- It leads at
N=1,N=2,N=4,N=8,N=16,N=32,N=64,N=128,N=256,N=512, andN=1024. - At
N=1024, mean throughput is 9.36 ops/ms versus 4.98 ops/ms forfastest-levenshtein. - At
N=32, mean throughput is 6568 ops/ms versus 4240 ops/ms forfastest-levenshtein. - At
N=8, mean throughput is 33126 ops/ms versus 23288 ops/ms forfastest-levenshtein.
Reproducing the benchmark
pnpm run bench:packages
pnpm run bench:packages:table
pnpm run bench:packages:chartGenerated files are written to bench/packages/.
Project layout
bench/bolt/
lev-dispatch.js
levenshtein-lightning-v2.js
levenshtein_Direct_Matrix.js
myers32_v4.js
myers_64.js
myers_96.js
myers_128.js
myers_256.js
myers_x.js
myers_x64.js
bench/packages/
run-bench.js
render-readme-table.js
render-readme-line-chart.js
render-readme-rank-chart.js
render-relative-bar-chart.js
render-readme-relative-fastest-chart.js
results.jsonLicense
Licensed under AGPL-3.0 with WATT3D Additional Terms. See LICENSE and ADDITIONAL_TERMS.md. Commercial AI/model-training use requires compliance with those terms or a separate WATT3D license. © WATT3D.
