@ieviev/resharp
v0.1.1
Published
WebAssembly bindings for the resharp regex engine.
Downloads
267
Maintainers
Readme
resharp-wasm
A regex library for JavaScript, built on resharp. It supports intersection, complement, and lookarounds, and always runs in time proportional to the input length (no catastrophic backtracking).
For pattern syntax and supported features see the resharp README.
Pre-1.0: the API may change between minor versions.
Install
npm install @ieviev/resharpWorks in Node and with bundlers (Vite, webpack, rollup, esbuild).
Usage
import { Regex } from "@ieviev/resharp";
const r = new Regex("\\d+");
r.isMatch("abc 42"); // true
Array.from(r.findAll("a12 b34 c5")); // [1, 3, 5, 7, 9, 10] string positions
Array.from(r.findAllBytes("αX")); // [2, 3] byte positionsEach match is reported as a pair of numbers: where it starts and where
it ends. findAll counts in string positions (the same units JavaScript
strings use). findAllBytes counts in bytes of the UTF-8 encoding,
which is useful when working with Uint8Array data.
Inputs can be a string or a Uint8Array. By default \w, \d, \s,
and . follow JavaScript's built-in regex (without the u flag).
Options
The constructor accepts an optional RegexOptions object.
new Regex("hello", { caseInsensitive: true });
new Regex("a.b", { dotMatchesNewLine: true });
new Regex(pat, {
mode: "js", // "js" | "resharp" | "full"
caseInsensitive: false,
dotMatchesNewLine: false,
ignoreWhitespace: false,
hardened: false, // prevents adversarial blowup
dfaThreshold: 0, // states to eagerly precompile
maxDfaCapacity: 65535, // cached DFA states
lookaheadContextMax: 800,
});Unknown keys and wrong-typed values are rejected by the constructor.
Cleanup
Each Regex holds memory outside of JavaScript's garbage collector.
Release it with r.free(), or let using do it for you:
{
using r = new Regex("\\d+");
r.isMatch("42");
}API
class Regex {
constructor(pattern: string, options?: RegexOptions);
isMatch(input: string | Uint8Array): boolean;
findAll(input: string | Uint8Array): Uint32Array; // string positions
findAllBytes(input: string | Uint8Array): Uint32Array; // byte positions
findAnchored(input: string | Uint8Array): Uint32Array | undefined; // string positions
free(): void;
[Symbol.dispose](): void;
}Limitations
No capture groups. No replace or split. Results give you only the positions of each match. The constructor throws on invalid patterns; matching can throw if the engine hits an internal size or number limit.
The compiled bundle is around 900 KB, mostly Unicode tables. A smaller ASCII-only build is planned.
Build
./build.sh # build via nix shell
nix build # pinned toolchain, ./result is the npm layout
nix develop # devshellLicense
MIT
