@sarfarajey/fuzzy-match
v1.0.0
Token-based fuzzy string matching with Levenshtein distance fallback. Zero dependencies.
Designed for entity deduplication and voice/text input matching — mapping raw user input to a known list of canonical candidates.
Install
npm install @sarfarajey/fuzzy-match
Usage
import { findBestMatchId, scoreTerm } from '@sarfarajey/fuzzy-match';
const candidates = [
{ id: 'acme', label: 'Acme Corp', aliases: ['Acme', 'ACME Corporation'] },
{ id: 'globex', label: 'Globex', aliases: ['Globex Corp', 'GlobEx'] },
];
findBestMatchId('acme corp', candidates); // → 'acme'
findBestMatchId('globex corporation', candidates); // → 'globex'
findBestMatchId('xyz unknown', candidates); // → null (below threshold)
// With custom threshold (0–100, default 58)
findBestMatchId('acmee', candidates, 70); // → 'acme' (typo tolerance)
findBestMatchId('acmee', candidates, 95); // → null (strict)
// Score a single term pair
scoreTerm('acme', 'Acme Corp'); // → 86 (substring match)
scoreTerm('acme', 'Acme'); // → 100 (exact after normalization)
scoreTerm('akme', 'Acme'); // → 75 (Levenshtein)
API
findBestMatchId(input, candidates, threshold?)
Returns the id of the best-matching candidate, or null if no candidate meets the threshold.
| Param | Type | Default | Description |
|-------|------|---------|-------------|
| input | string | — | Raw input to match |
| candidates | MatchCandidate[] | — | Known entity list |
| threshold | number | 58 | Minimum score (0–100) |
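The selection step described above can be sketched roughly as follows. This is an illustration, not the package's source: `sketchFindBestMatchId`, the `Scorer` type, and the `exactOnly` toy scorer are hypothetical names, and the sketch assumes that the label and every alias are scored and the highest score wins, with ties going to the earliest candidate. The scorer is passed in as a parameter only to keep the example self-contained.

```typescript
// Hypothetical sketch of best-match selection; not the library's actual implementation.
interface MatchCandidate {
  id: string;
  label: string;
  aliases: string[];
}

// Any function that returns a 0–100 score for an (input, term) pair.
type Scorer = (input: string, term: string) => number;

function sketchFindBestMatchId(
  input: string,
  candidates: MatchCandidate[],
  score: Scorer,
  threshold: number = 58, // default taken from the parameter table
): string | null {
  let bestId: string | null = null;
  let bestScore = -1;
  for (const candidate of candidates) {
    // Assumption: the label and all aliases are scored, and the maximum counts.
    for (const term of [candidate.label, ...candidate.aliases]) {
      const s = score(input, term);
      if (s > bestScore) {
        bestScore = s;
        bestId = candidate.id;
      }
    }
  }
  // Return null when even the best candidate falls below the threshold.
  return bestScore >= threshold ? bestId : null;
}

// Toy scorer for demonstration only: 100 on a case-insensitive exact match, else 0.
const exactOnly: Scorer = (a, b) =>
  a.trim().toLowerCase() === b.trim().toLowerCase() ? 100 : 0;

const demoCandidates: MatchCandidate[] = [
  { id: 'acme', label: 'Acme Corp', aliases: ['Acme', 'ACME Corporation'] },
  { id: 'globex', label: 'Globex', aliases: ['Globex Corp', 'GlobEx'] },
];

const hit = sketchFindBestMatchId('ACME', demoCandidates, exactOnly);
const miss = sketchFindBestMatchId('xyz unknown', demoCandidates, exactOnly);
```

Here `hit` matches through the `'Acme'` alias, while `miss` is `null` because no term reaches the threshold.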
scoreTerm(input, term)
Score a single input string against a single candidate term. Returns 0–100.
MatchCandidate
interface MatchCandidate {
  id: string;        // returned on match
  label: string;     // primary display label
  aliases: string[]; // alternate names / abbreviations
}
Scoring
| Condition | Score |
|-----------|-------|
| Exact match (after normalization) | 100 |
| Substring containment (either direction) | 86 |
| Levenshtein similarity | 0–85 |
Normalization lowercases input, strips non-alphanumeric characters (except spaces), and collapses whitespace.
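To make the tiers concrete, here is a minimal sketch of how such a scorer could be put together. It is a hypothetical re-implementation, not the package's code: the names `normalize`, `levenshtein`, and `sketchScore` are invented for illustration. It assumes ASCII-only alphanumerics, that empty strings score 0, and that the Levenshtein tier is `round(100 * (1 - distance / longerLength))` capped at 85. That formula is a guess, though it does reproduce the `75` shown for `'akme'` vs `'Acme'` in the usage section.

```typescript
// Hypothetical sketch of the scoring tiers; the real library may differ in detail.

// Lowercase, strip anything that is not a letter, digit, or whitespace,
// then collapse whitespace runs to single spaces and trim the ends.
function normalize(raw: string): string {
  return raw
    .toLowerCase()
    .replace(/[^a-z0-9\s]+/g, '') // assumption: ASCII alphanumerics only
    .replace(/\s+/g, ' ')
    .trim();
}

// Classic dynamic-programming edit distance, keeping only two rows in memory.
function levenshtein(a: string, b: string): number {
  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,        // deletion
        curr[j - 1] + 1,    // insertion
        prev[j - 1] + cost, // substitution (or match)
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

function sketchScore(input: string, term: string): number {
  const a = normalize(input);
  const b = normalize(term);
  if (a.length === 0 || b.length === 0) return 0; // assumption: empty never matches
  if (a === b) return 100;                          // exact match tier
  if (a.includes(b) || b.includes(a)) return 86;    // substring tier, either direction
  // Assumed similarity formula for the fallback tier, capped at 85.
  const similarity = 1 - levenshtein(a, b) / Math.max(a.length, b.length);
  return Math.min(85, Math.round(similarity * 100));
}
```

Under these assumptions, `sketchScore('acme', 'Acme')`, `sketchScore('acme', 'Acme Corp')`, and `sketchScore('akme', 'Acme')` land in the exact, substring, and Levenshtein tiers respectively.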
License
MIT
