@oomfware/safe-idn

v0.1.2

Published

a month ago

detect IDN homograph spoofing

npm install @oomfware/safe-idn

browsers and other user agents must decide whether to display internationalized domain names (IDNs) in their decoded Unicode form or keep them as punycode. showing Unicode unconditionally enables phishing: for example, xn--80ak6aa92e.com decodes to аррӏе.com, which uses Cyrillic letters to impersonate apple.com.

this library implements Chromium's IDN display algorithm in TypeScript. it checks each label of a domain for script mixing, confusable characters, dangerous patterns, and other spoofing vectors, then returns either the safe Unicode form or the original punycode.
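the decoding step mentioned above (punycode label to Unicode) can be reproduced without the library. the following is a minimal sketch of the RFC 3492 decoder, not this package's implementation; `decodePunycode` and `toUnicode` are hypothetical names, and the sketch skips the overflow checks and IDNA validation a production decoder needs.

```typescript
// RFC 3492 Bootstring parameters for Punycode.
const BASE = 36;
const T_MIN = 1;
const T_MAX = 26;
const SKEW = 38;
const DAMP = 700;
const INITIAL_BIAS = 72;
const INITIAL_N = 128;

// Bias adaptation (RFC 3492, section 6.1).
function adapt(delta: number, numPoints: number, firstTime: boolean): number {
  delta = firstTime ? Math.floor(delta / DAMP) : Math.floor(delta / 2);
  delta += Math.floor(delta / numPoints);
  let k = 0;
  while (delta > Math.floor(((BASE - T_MIN) * T_MAX) / 2)) {
    delta = Math.floor(delta / (BASE - T_MIN));
    k += BASE;
  }
  return k + Math.floor(((BASE - T_MIN + 1) * delta) / (delta + SKEW));
}

// Digit values: a-z -> 0..25, 0-9 -> 26..35.
function digitValue(ch: string): number {
  const c = ch.toLowerCase().charCodeAt(0);
  if (c >= 0x61 && c <= 0x7a) return c - 0x61;
  if (c >= 0x30 && c <= 0x39) return c - 0x30 + 26;
  throw new RangeError(`invalid punycode digit: ${ch}`);
}

// Hypothetical helper: decodes one label body (without the "xn--" prefix).
function decodePunycode(input: string): string {
  const b = Math.max(input.lastIndexOf('-'), 0);
  const output: number[] = [];
  // Basic (ASCII) code points before the last delimiter are copied verbatim.
  for (let j = 0; j < b; j++) output.push(input.charCodeAt(j));

  let n = INITIAL_N;
  let i = 0;
  let bias = INITIAL_BIAS;
  let pos = b > 0 ? b + 1 : 0;

  while (pos < input.length) {
    const oldI = i;
    let w = 1;
    // Read one variable-length integer (a "delta").
    for (let k = BASE; ; k += BASE) {
      if (pos >= input.length) throw new RangeError('truncated punycode input');
      const digit = digitValue(input.charAt(pos++));
      i += digit * w;
      const t = k <= bias ? T_MIN : k >= bias + T_MAX ? T_MAX : k - bias;
      if (digit < t) break;
      w *= BASE - t;
    }
    const len = output.length + 1;
    bias = adapt(i - oldI, len, oldI === 0);
    n += Math.floor(i / len);
    i %= len;
    output.splice(i, 0, n); // insert code point n at position i
    i++;
  }
  return String.fromCodePoint(...output);
}

// Hypothetical helper: decodes every "xn--" label of a domain.
function toUnicode(domain: string): string {
  return domain
    .split('.')
    .map((label) =>
      label.toLowerCase().startsWith('xn--') ? decodePunycode(label.slice(4)) : label,
    )
    .join('.');
}

console.log(toUnicode('xn--80ak6aa92e.com'));
// -> "аррӏе.com"
```

decoding alone says nothing about safety: the output above is exactly the spoofed form, which is why a display policy has to run on top of it.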

usage

quick check

use safeDisplay() to get the safe display form of a domain:

import { safeDisplay } from '@oomfware/safe-idn';

// safe domains decode to Unicode
safeDisplay('xn--nxasmq6b.com');
// -> "βόλοσ.com"

// spoofed domains stay as punycode
safeDisplay('xn--80ak6aa92e.com');
// -> "xn--80ak6aa92e.com" (Cyrillic "аррӏе" impersonating "apple")

detailed results

use checkDomain() for per-label verdicts:

import { checkDomain } from '@oomfware/safe-idn';

const result = checkDomain('xn--80ak6aa92e.com');

console.log(result.display);
// -> "xn--80ak6aa92e.com"

console.log(result.labels[0]);
// -> { input: 'xn--80ak6aa92e', unicode: 'аррӏе', result: 'unsafe' }

each label result contains:

input — the original label as it appeared in the domain
unicode — the decoded Unicode form (even if unsafe)
result — 'safe', 'unsafe', or 'invalid'
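the fields above suggest a result shape along the following lines. this is a sketch reconstructed from the example output, not the package's exported types: `LabelVerdict`, `LabelResult`, `DomainResult`, `flaggedLabels`, and `explain` are all hypothetical names, and the sample `com` entry is an assumption about what the second label would look like.

```typescript
// Assumed shapes, inferred from the README output; the real exported
// types may be named or structured differently.
type LabelVerdict = 'safe' | 'unsafe' | 'invalid';

interface LabelResult {
  input: string; // the original label as it appeared in the domain
  unicode: string; // the decoded Unicode form (even if unsafe)
  result: LabelVerdict;
}

interface DomainResult {
  display: string;
  labels: LabelResult[];
}

// Hypothetical helper: returns the labels that were not judged safe.
function flaggedLabels(result: DomainResult): LabelResult[] {
  return result.labels.filter((label) => label.result !== 'safe');
}

// Hypothetical helper: builds a one-line summary for logs or tooltips.
function explain(result: DomainResult): string {
  const flagged = flaggedLabels(result);
  if (flagged.length === 0) return `${result.display}: all labels safe`;
  const details = flagged.map((l) => `${l.input} (${l.result})`).join(', ');
  return `${result.display}: flagged ${details}`;
}

// Hand-written sample mirroring the checkDomain() output shown above;
// the 'com' entry is assumed, since the README only prints labels[0].
const sample: DomainResult = {
  display: 'xn--80ak6aa92e.com',
  labels: [
    { input: 'xn--80ak6aa92e', unicode: 'аррӏе', result: 'unsafe' },
    { input: 'com', unicode: 'com', result: 'safe' },
  ],
};

console.log(explain(sample));
// -> "xn--80ak6aa92e.com: flagged xn--80ak6aa92e (unsafe)"
```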

what it checks

the following safety checks are performed, matching Chromium's behavior:

script mixing — blocks unsafe combinations of Unicode scripts (e.g., Latin + Cyrillic)
whole-script confusables — detects labels written entirely in one non-Latin script where every character has a Latin lookalike (e.g., Cyrillic "аррӏе" for Latin "apple")
skeleton confusables — compares UTS #39 skeletons against a list of top domains to catch near-lookalikes
character blocklist — blocks characters known to cause confusion (symbols, ligatures, IPA extensions, etc.)
dangerous patterns — catches combining mark abuse, dot-after-i/j tricks, and RTL mark misplacement
mixed digit systems — prevents mixing digits from different scripts
digit lookalikes — detects non-digit characters that resemble ASCII digits
kana confusables — catches Hiragana/Katakana interchange and context violations
TLD-specific rules — restricts characters like þ, ð, ə, and · to their appropriate TLDs
deviation characters — blocks ZWNJ and ZWJ
IDN TLD spoofing — detects punycode TLDs whose skeletons match common ASCII TLDs
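to make the first two checks in the list concrete, here is a toy sketch of script-mixing detection and a whole-script lookalike test. it is not the library's or Chromium's implementation: it knows only three scripts, treats any mix of them as unsafe (the real rules allow some combinations), and uses a tiny hand-picked lookalike table. `scriptsIn`, `mixesScripts`, and `latinLookalike` are hypothetical names.

```typescript
// Scripts this sketch knows about; a real implementation covers all of
// Unicode and permits specific combinations.
const SCRIPTS: Array<[string, RegExp]> = [
  ['Latin', /\p{Script=Latin}/u],
  ['Cyrillic', /\p{Script=Cyrillic}/u],
  ['Greek', /\p{Script=Greek}/u],
];

// Collects the known scripts used by a label. Digits and hyphens belong
// to Script=Common, so they match none of the patterns and are ignored.
function scriptsIn(label: string): Set<string> {
  const found = new Set<string>();
  for (const ch of label) {
    for (const [name, pattern] of SCRIPTS) {
      if (pattern.test(ch)) found.add(name);
    }
  }
  return found;
}

// Simplification: any mix of two or more known scripts counts as unsafe.
function mixesScripts(label: string): boolean {
  return scriptsIn(label).size > 1;
}

// Illustrative subset of Cyrillic letters with Latin lookalikes.
const CYRILLIC_TO_LATIN = new Map<string, string>([
  ['\u0430', 'a'], // а
  ['\u0435', 'e'], // е
  ['\u043e', 'o'], // о
  ['\u0440', 'p'], // р
  ['\u0441', 'c'], // с
  ['\u0443', 'y'], // у
  ['\u0445', 'x'], // х
  ['\u04cf', 'l'], // ӏ (palochka)
]);

// Returns the Latin string a label imitates when every character is in
// the lookalike table, otherwise null.
function latinLookalike(label: string): string | null {
  const chars = Array.from(label);
  if (chars.length === 0) return null;
  if (!chars.every((ch) => CYRILLIC_TO_LATIN.has(ch))) return null;
  return chars.map((ch) => CYRILLIC_TO_LATIN.get(ch) ?? '').join('');
}

console.log(mixesScripts('p\u0430ypal')); // Latin with a Cyrillic "а"
// -> true

console.log(latinLookalike('\u0430\u0440\u0440\u04cf\u0435')); // аррӏе
// -> "apple"
```

the two checks are complementary: a label that passes the mixing check because it is written in a single script can still be a whole-script lookalike, as the second call shows.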
