similar-routes
v1.0.0
Framework-agnostic fuzzy path matcher for typo-tolerant URL redirects and suggestions.
Turn /blogs/ → /blog/, /articles/ → /blog/, /guide/getting-startd/ → /guide/getting-started/, and render "Did you mean …?" lists on 404 — all from your existing data, no hand-authored synonym tables.
Why
Unmatched content URLs normally dead-end. similar-routes gives them a useful response:
- Typos / plurals (blogs, produt, abuot) redirect with high confidence.
- Synonyms derived from your own copy (nav labels, tags, titles) become aliases automatically.
- Slug drift within a section is recovered segment-by-segment.
Framework-agnostic: the core is pure TypeScript with zero runtime dependencies. Plug it into Express, Fastify, Hono, Astro, Next, or anything that produces a 404.
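Because the matcher works on segment arrays rather than URLs, most framework glue is plain string handling. A minimal sketch of that glue, assuming two hypothetical helpers (toSegments and toPath are illustrative names, not part of the package) and the trailing-slash URL style used in the examples above:

```typescript
// Hypothetical glue helpers; names and behavior are illustrative assumptions,
// not part of the similar-routes API.

/** Split a URL pathname into decoded, non-empty segments. */
function toSegments(pathname: string): string[] {
  return pathname
    .split('/')
    .filter((part) => part.length > 0)
    .map((part) => {
      try {
        return decodeURIComponent(part);
      } catch {
        // Malformed percent-encoding: keep the raw text instead of throwing.
        return part;
      }
    });
}

/** Join segments back into a path with leading and trailing slashes. */
function toPath(segments: string[]): string {
  if (segments.length === 0) return '/';
  return '/' + segments.map((s) => encodeURIComponent(s)).join('/') + '/';
}
```

In a 404 handler, the output of toSegments would be what gets passed to the matcher, and toPath would turn a replacement back into a redirect target.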
Install
npm install similar-routes
Quick start
import { buildIndex, findSimilar } from 'similar-routes';
const index = buildIndex({
  sections: [
    { token: 'guide', aliases: ['docs', 'documentation'], children: [
      { token: 'getting-started', children: [
        { token: 'step-1' }, { token: 'step-2' }
      ]},
      { token: 'configuration' }
    ]},
    { token: 'blog', aliases: ['articles', 'posts', 'news'], children: [
      { token: 'release-notes' }
    ]}
  ]
});
findSimilar(['blogs'], index);
// → { confidence: 'high', replacement: ['blog'] }
findSimilar(['articles'], index);
// → { confidence: 'high', replacement: ['blog'] }
findSimilar(['guide', 'getting-started', 'stpe-1'], index);
// → { confidence: 'high', replacement: ['guide', 'getting-started', 'step-1'] }
findSimilar(['bananas'], index);
// → { confidence: 'none' }
findSimilar(['guide', 'banana'], index);
// → { confidence: 'none', suggestions: [['guide']] }
API
findSimilar(segments, index, tuning?)
Matches a list of path segments against an index. Returns:
{
  confidence: 'high' | 'medium' | 'low' | 'none';
  replacement?: string[];    // present for high/medium/low
  suggestions?: string[][];  // alternatives; up to maxSuggestions
}
Scoring (per segment):
| Tier | Examples | How it's detected |
|---|---|---|
| EXACT | guide | segment === node.token |
| HIGH | blogs, produt, articles (alias), doc (prefix) | plural/stem equality, alias hit, prefix within prefixLenDelta, or edit distance ≤ 1 (OSA — Levenshtein + transpositions) |
| MEDIUM | chnglg (two-char typo) | edit distance ≤ 2 |
| NONE | bananas | no signal |
Multi-segment paths downgrade: two HIGH segments in a row → MEDIUM overall (compound uncertainty).
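The edit-distance rows of the table can be illustrated with a standard optimal string alignment (OSA) implementation. This is a sketch of the textbook algorithm, not the package's source; osaDistance and tierByDistance are hypothetical names, and the plural/stem, alias, and prefix rules are deliberately left out:

```typescript
type Tier = 'EXACT' | 'HIGH' | 'MEDIUM' | 'NONE';

/** Optimal string alignment distance: Levenshtein plus adjacent transpositions. */
function osaDistance(a: string, b: string): number {
  const rows = a.length + 1;
  const cols = b.length + 1;
  // d[i][j] = distance between the first i chars of a and the first j chars of b.
  const d: number[][] = Array.from({ length: rows }, (_row, i) =>
    Array.from({ length: cols }, (_col, j) => (i === 0 ? j : j === 0 ? i : 0))
  );
  for (let i = 1; i < rows; i++) {
    for (let j = 1; j < cols; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      d[i][j] = Math.min(
        d[i - 1][j] + 1, // deletion
        d[i][j - 1] + 1, // insertion
        d[i - 1][j - 1] + cost // substitution or match
      );
      // Adjacent transposition, e.g. "stpe" -> "step".
      if (i > 1 && j > 1 && a[i - 1] === b[j - 2] && a[i - 2] === b[j - 1]) {
        d[i][j] = Math.min(d[i][j], d[i - 2][j - 2] + 1);
      }
    }
  }
  return d[a.length][b.length];
}

/** Tier from edit distance alone (assumed thresholds taken from the table). */
function tierByDistance(segment: string, token: string): Tier {
  if (segment === token) return 'EXACT';
  const distance = osaDistance(segment, token);
  if (distance <= 1) return 'HIGH';
  if (distance <= 2) return 'MEDIUM';
  return 'NONE';
}
```

With these thresholds, blogs against blog and stpe-1 against step-1 both land in HIGH, while bananas against blog has no signal.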
buildIndex(input)
Builds a SimilarIndex from a declarative { sections: SectionDef[] } tree. Each SectionDef / NodeDef has a token, optional aliases, and optional children. The builder trusts its input, so run crossFilterAliases first if your alias sources overlap section names.
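The input tree can be pictured with the following types. They are inferred from the description above and may not match the package's real declarations; listPaths is a hypothetical helper, handy for eyeballing which paths a tree defines before handing it to the builder:

```typescript
// Assumed shapes, inferred from the README text; the package's actual
// type declarations may differ.
interface NodeDef {
  token: string;
  aliases?: string[];
  children?: NodeDef[];
}
type SectionDef = NodeDef;

/** Collect every path (as a segment array) that a section tree defines. */
function listPaths(nodes: NodeDef[], prefix: string[] = []): string[][] {
  const paths: string[][] = [];
  for (const node of nodes) {
    const path = [...prefix, node.token];
    paths.push(path);
    // Recurse into children, carrying the path built so far.
    paths.push(...listPaths(node.children ?? [], path));
  }
  return paths;
}
```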
crossFilterAliases(sections, options?)
Strips alias tokens that would let one section claim another's name (or its naive singular/plural variant). Essential when aliases come from free-text content where the same word appears across sections.
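To make the rule concrete, here is a rough re-implementation of the idea. It is a sketch under assumptions, not the package's code: filterAliasesSketch and naiveVariants are hypothetical names, and the singular/plural handling is just "add or strip a trailing s":

```typescript
interface AliasCandidates {
  token: string;
  candidates: string[];
}

/** Naive English variants: the word itself plus its singular or plural form. */
function naiveVariants(word: string): string[] {
  return word.endsWith('s') ? [word, word.slice(0, -1)] : [word, word + 's'];
}

/** Drop any candidate that names another section (or a naive variant of it). */
function filterAliasesSketch(sections: AliasCandidates[]): Map<string, Set<string>> {
  const result = new Map<string, Set<string>>();
  for (const section of sections) {
    // Names owned by the other sections are off limits for this one.
    const reserved = new Set<string>();
    for (const other of sections) {
      if (other === section) continue;
      for (const variant of naiveVariants(other.token)) reserved.add(variant);
    }
    const kept = section.candidates.filter((c) => !reserved.has(c));
    result.set(section.token, new Set(kept));
  }
  return result;
}
```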
import { crossFilterAliases } from 'similar-routes';
const filtered = crossFilterAliases([
  { token: 'product', candidates: ['items', 'blog', 'catalog'] },
  { token: 'blog', candidates: ['articles', 'product', 'posts'] }
]);
// filtered.get('product') → Set { 'items', 'catalog' }
// filtered.get('blog') → Set { 'articles', 'posts' }
Pass options.variants to swap in a language-aware inflector or stemmer. Default is English singular/plural.
tokenize(value, options?)
Unicode-aware: lowercases, splits on any non-letter/non-number, drops tokens shorter than minLen (default 3). Handy for turning nav labels or titles into alias candidates.
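The behavior described here can be approximated with Unicode property escapes. This is a sketch of the technique rather than the package's implementation; tokenizeSketch is a hypothetical name, and the NFC normalization step is an added assumption so that decomposed accents are not treated as separators:

```typescript
interface TokenizeOptions {
  minLen?: number;
}

// Anything that is not a Unicode letter or number acts as a separator.
const SEPARATORS = new RegExp('[^\\p{L}\\p{N}]+', 'u');

function tokenizeSketch(value: string, options: TokenizeOptions = {}): string[] {
  const minLen = options.minLen ?? 3;
  return value
    .normalize('NFC') // assumption: compose accents before splitting
    .toLowerCase()
    .split(SEPARATORS)
    // Length is measured in UTF-16 code units; empty pieces are dropped.
    .filter((token) => token.length > 0 && token.length >= minLen);
}
```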
tokenize('Docs & Guides'); // ['docs', 'guides']
tokenize('Şükran günü'); // ['şükran', 'günü']
tokenize('编程 学习', { minLen: 2 }); // ['编程', '学习']
Tuning
All tunables are optional per-call:
findSimilar(segments, index, {
  maxSegmentLen: 64,   // reject pathological inputs (DoS bound on edit distance)
  maxSuggestions: 3,   // cap on suggestions[] length
  prefixLenDelta: 3    // longer.length - shorter.length ≤ N counts as prefix match
});
Exported as DEFAULT_TUNING for reference.
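As a rough picture of what the length-related tunables guard against, the sketch below shows a length bound and a prefix check. The helper names are hypothetical, and it assumes the values shown above are the defaults; how the package applies them internally is not documented here:

```typescript
interface Tuning {
  maxSegmentLen: number;
  maxSuggestions: number;
  prefixLenDelta: number;
}

// Assumed defaults, copied from the example values above.
const ASSUMED_DEFAULTS: Tuning = { maxSegmentLen: 64, maxSuggestions: 3, prefixLenDelta: 3 };

/** Merge per-call overrides over the assumed defaults. */
function resolveTuning(overrides: Partial<Tuning> = {}): Tuning {
  return { ...ASSUMED_DEFAULTS, ...overrides };
}

/** Bound the work done by edit distance: skip matching if any segment is too long. */
function withinLengthBound(segments: string[], tuning: Tuning): boolean {
  return segments.every((segment) => segment.length <= tuning.maxSegmentLen);
}

/** True when the shorter string is a prefix of the longer, within prefixLenDelta. */
function isPrefixMatch(a: string, b: string, tuning: Tuning): boolean {
  const [shorter, longer] = a.length <= b.length ? [a, b] : [b, a];
  return (
    shorter.length > 0 &&
    longer.startsWith(shorter) &&
    longer.length - shorter.length <= tuning.prefixLenDelta
  );
}
```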
HTTP header contract — similar-routes/http
Optional sub-export for setups where a producer (e.g. a reverse proxy or edge function) hands off to a consumer (e.g. an SSR 404 page) via headers.
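On the consumer side, the header is untrusted input, so it has to be validated before any value is rendered as a link. The sketch below shows the kind of checks involved; it is an illustration, not the package's parser. parseSuggestionsSketch and isSafeLocalPath are hypothetical names, and dropping offending entries (rather than rejecting the whole header) is an assumption:

```typescript
/** Accept only same-origin absolute paths such as "/blog/". */
function isSafeLocalPath(value: unknown): value is string {
  return (
    typeof value === 'string' &&
    value.startsWith('/') && // rejects "javascript:..." and absolute URLs
    !value.startsWith('//') && // rejects protocol-relative "//evil.com"
    !value.includes('\\') // rejects backslash tricks such as "/\evil.com"
  );
}

/** Parse a JSON-array header value, keeping only safe local paths. */
function parseSuggestionsSketch(header: string | string[] | undefined): string[] {
  if (typeof header !== 'string') return [];
  let parsed: unknown;
  try {
    parsed = JSON.parse(header);
  } catch {
    return []; // malformed JSON
  }
  if (!Array.isArray(parsed)) return []; // non-array payloads
  return parsed.filter(isSafeLocalPath);
}
```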
import { SIMILAR_HEADERS, parseSuggestionsHeader } from 'similar-routes/http';
// Producer:
req.headers[SIMILAR_HEADERS.ORIGINAL_PATH] = '/blogs/getting-started/';
req.headers[SIMILAR_HEADERS.SUGGESTIONS] = JSON.stringify(['/blog/']);
req.headers[SIMILAR_HEADERS.FALLBACK_MARKER] = 'yes';
// Consumer (inside SSR page):
const suggestions = parseSuggestionsHeader(req.headers[SIMILAR_HEADERS.SUGGESTIONS]);
// Safe: rejects //evil.com, /\backslash, javascript:, non-arrays, malformed JSON.
Design choices
- OSA edit distance over plain Levenshtein. stpe → step is 1 edit under OSA (transposition), 2 under Levenshtein. Matches user intuition without the full Damerau–Levenshtein machinery.
- Segment-by-segment matching, not whole-path. A typo in segment 2 shouldn't lose the correct segment 1.
- Compound uncertainty. Two HIGH segments in a row downgrades to MEDIUM. Prevents silent drift on paths with several near-misses.
- No I/O, no framework deps. The core is ~5KB minified+gzipped and works in any JS runtime.
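The compound-uncertainty rule can be sketched as a fold over per-segment tiers. Only the "two HIGH in a row becomes MEDIUM" rule comes from this README; the rest (weakest segment wins, empty input gives none, no 'low' outcome) are assumptions made for illustration, and combineTiers is a hypothetical name:

```typescript
type Tier = 'EXACT' | 'HIGH' | 'MEDIUM' | 'NONE';
type Confidence = 'high' | 'medium' | 'none';

/** Combine per-segment tiers into one path-level confidence. */
function combineTiers(tiers: Tier[]): Confidence {
  // Assumption: one unmatched segment sinks the whole path.
  if (tiers.length === 0 || tiers.indexOf('NONE') !== -1) return 'none';
  // Assumption: the weakest matched segment caps the result.
  if (tiers.indexOf('MEDIUM') !== -1) return 'medium';
  // From the README: two HIGH segments in a row downgrade to MEDIUM.
  const consecutiveHigh = tiers.some(
    (tier, i) => i > 0 && tier === 'HIGH' && tiers[i - 1] === 'HIGH'
  );
  return consecutiveHigh ? 'medium' : 'high';
}
```

Under these assumptions, a single near-miss after exact segments stays high, while two adjacent near-misses drop to medium.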
License
MIT
