khmer-username-filter
v1.1.1
Detect inappropriate Khmer usernames, impersonation, spam patterns, and obfuscated Khmer-Latin slang
khmer-username-filter
A dependency-free Node.js package for detecting inappropriate Khmer usernames, Khmer-Latin slang, impersonation names, and spam-like account patterns.
It is built for signup forms, account moderation, admin dashboards, bots, and APIs that need a simple check(username) call but still want enough detail to explain why a username was rejected.
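As a rough illustration of using that detail, the sketch below reduces a result object to a short explanation that could be logged or shown to a moderator. The sample object only mirrors fields from this README's example result; `summarizeRejection` is a hypothetical helper and not part of the package.

```javascript
// Sample object mirroring fields of this README's "Example result".
// In real code it would come from check(username).
const sampleResult = {
  isClean: false,
  username: "អាឆ្កែ_admin_user",
  matches: [
    { word: "អាឆ្កែ", type: "profanity", severity: "high", score: 70 },
    { word: "admin", type: "impersonation", severity: "medium", score: 50 },
  ],
  score: 100,
  riskLevel: "high",
};

// Hypothetical helper (not a package API): summarize why a username was
// rejected without echoing the offending words back to the user.
function summarizeRejection(result) {
  if (result.isClean) return null;
  // Unique match types, in the order they were reported.
  const types = [...new Set(result.matches.map((m) => m.type))];
  // Treat the highest-scoring match as the primary reason.
  const primary = result.matches.reduce(
    (best, m) => (best === null || m.score > best.score ? m : best),
    null
  );
  return {
    riskLevel: result.riskLevel,
    types,
    primaryType: primary ? primary.type : null,
  };
}

// Yields riskLevel "high", types profanity and impersonation, primaryType "profanity".
console.log(summarizeRejection(sampleResult));
```

Returning only types and a risk level keeps the matched words out of user-facing responses, which is usually preferable for profanity matches.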
Features
- Khmer Unicode and Khmer-Latin transliteration matching
- Obfuscation detection for separators, repeated punctuation, and common leet text like 4dm1n
- Profanity, impersonation, spam, URL, email, phone, contact-handle, scam-term, and bot-like pattern checks
- Risk score, risk level, match severity, highlighted output, and machine-readable match metadata
- Runtime wordlist extension plus isolated checker instances for multi-tenant apps
- TypeScript declarations and a small CLI included
- No external dependencies
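To make the obfuscation bullet concrete, here is a minimal sketch of how leet and separator normalization can work in general. It is not the package's internal algorithm; the character map and the `normalizeUsername` name are assumptions chosen for illustration.

```javascript
// Hypothetical sketch, NOT the package's implementation.
// Assumed mapping from common leet characters to letters.
const LEET_MAP = {
  "4": "a", "@": "a", "3": "e", "1": "i", "!": "i",
  "0": "o", "5": "s", "$": "s", "7": "t",
};

function normalizeUsername(input) {
  return input
    .toLowerCase()
    // Replace each leet character with its assumed letter equivalent.
    .replace(/[4@31!05$7]/g, (ch) => LEET_MAP[ch])
    // Drop separators often used to split a word: dot, underscore, hyphen, whitespace.
    .replace(/[._\-\s]+/g, "");
}

console.log(normalizeUsername("4dm1n"));     // "admin"
console.log(normalizeUsername("a.d_m-i n")); // "admin"
```

A naive map like this also rewrites legitimate digits (for example `user123`), so a real checker would likely compare both the raw and the normalized form rather than rely on the normalized one alone.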
Installation
npm install khmer-username-filter
Basic Usage
const { check } = require("khmer-username-filter");
const result = check("អាឆ្កែ_admin_user");
console.log(result.isClean); // false
console.log(result.riskLevel); // "high"
console.log(result.score); // 100
console.log(result.highlighted); // "**អាឆ្កែ**_**admin**_user"
console.log(result.matches);
Example result:
{
  isClean: false,
  username: "អាឆ្កែ_admin_user",
  matches: [
    {
      word: "អាឆ្កែ",
      type: "profanity",
      description: "Contains offensive/profane Khmer content",
      severity: "high",
      score: 70,
      evidence: "direct"
    },
    {
      word: "admin",
      type: "impersonation",
      description: "Contains reserved or impersonation term",
      severity: "medium",
      score: 50,
      evidence: "direct"
    }
  ],
  highlighted: "**អាឆ្កែ**_**admin**_user",
  score: 100,
  riskLevel: "high"
}
API
check(username, options?)
const result = check("official.support", {
  checkProfanity: true,
  checkImpersonation: true,
  checkSpam: true,
  checkObfuscation: true,
  includeNormalized: false,
  allowlist: ["trusted_username"]
});
Options:
| Option | Type | Default | Description |
| --- | --- | --- | --- |
| checkProfanity | boolean | true | Check built-in and custom profanity words |
| checkImpersonation | boolean | true | Check reserved, admin, official, government, media, and brand-like terms |
| checkSpam | boolean | true | Check bot, URL, email, phone, repeated character, and scam patterns |
| checkObfuscation | boolean | true | Detect separator and leet obfuscation |
| includeNormalized | boolean | false | Include normalized username in the result |
| allowlist | Array<string \| RegExp> | [] | Exact username or regex allowlist |
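The table describes `allowlist` as an exact username or regex list. How the package compares entries internally is not stated here, so the following is only a sketch of those semantics under the assumption that strings match exactly (case-sensitive) and `RegExp` entries are tested against the whole input. `isAllowlisted` is a hypothetical name.

```javascript
// Hypothetical sketch of "exact username or regex" allowlist semantics.
// Assumption: string entries must equal the username exactly;
// RegExp entries are tested against the raw username.
function isAllowlisted(username, allowlist = []) {
  return allowlist.some((entry) =>
    entry instanceof RegExp ? entry.test(username) : entry === username
  );
}

console.log(isAllowlisted("trusted_username", ["trusted_username"]));   // true
console.log(isAllowlisted("trusted_username2", ["trusted_username"]));  // false
console.log(isAllowlisted("staff_dara", [/^staff_[a-z]+$/]));           // true
```

If you pass regexes, anchoring them with `^` and `$` avoids allowlisting every username that merely contains the pattern, and avoiding the `g` flag sidesteps the stateful `lastIndex` behaviour of `RegExp.prototype.test`.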
isClean(username, options?)
const { isClean } = require("khmer-username-filter");
if (!isClean("4dm1n")) {
  throw new Error("Username is not allowed");
}
checkMany(usernames, options?)
const { checkMany } = require("khmer-username-filter");
const results = checkMany(["សុខា_123", "john_official", "spam.com"]);
createChecker(customWordlists?)
Use this when you want custom words without mutating the package-level default checker.
const { createChecker } = require("khmer-username-filter");
const checker = createChecker({
  profanity: ["custom_bad_word", /very_bad_\d+/i],
  impersonation: ["my_app", "my_brand"],
  spam: [
    {
      pattern: /^bot_\d+$/i,
      label: "bot_pattern",
      description: "Looks like a generated bot account"
    }
  ]
});
const result = checker.check("my_brand_support");
Runtime Wordlist Extension
const {
  addProfanityWord,
  addProfanityPattern,
  addImpersonationWord,
  addSpamPattern,
  loadCustomWordlists,
} = require("khmer-username-filter");
addProfanityWord("yourBadWord");
addProfanityPattern(/bad_word_\d+/i, "Custom regex profanity");
addImpersonationWord("yourBrandName");
addSpamPattern(/^test_bot_\d+$/i, "test_bot", "Test bot pattern");
loadCustomWordlists({
  profanity: ["word1", /word_\d+/i],
  impersonation: ["brand1"],
  spam: [{ pattern: /spam/i, label: "spam", description: "Spam word" }]
});
Express Example
const express = require("express");
const { check } = require("khmer-username-filter");
const app = express();
app.use(express.json());
app.post("/signup", (req, res) => {
  const username = String(req.body.username || "");
  const result = check(username);
  if (!result.isClean) {
    return res.status(400).json({
      message: "Username is not allowed",
      riskLevel: result.riskLevel,
      matches: result.matches.map(({ type, label, description }) => ({
        type,
        label,
        description,
      })),
    });
  }
  return res.status(204).send();
});
CLI
npx khmer-username-filter "4dm1n"
npx khmer-username-filter --json "អាឆ្កែ_admin_user"
echo "john_official" | npx khmer-username-filter
The CLI exits with code 1 when any username is flagged.
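Because the exit code signals a flagged username, the CLI can be scripted from Node. The sketch below wraps `spawnSync` from Node's built-in `child_process` module; `runUsernameCli` is a hypothetical helper, and treating exit code 0 as "clean" is an assumption, since only code 1 is documented above.

```javascript
const { spawnSync } = require("node:child_process");

// Hypothetical helper: run a command and interpret exit code 1 as "flagged".
// Assumption: exit code 0 means no username was flagged.
// Any other outcome is surfaced as an error rather than guessed at.
function runUsernameCli(command, args) {
  const proc = spawnSync(command, args, { encoding: "utf8" });
  if (proc.error) throw proc.error; // e.g. the command could not be started
  if (proc.status === 0) return { flagged: false, output: proc.stdout };
  if (proc.status === 1) return { flagged: true, output: proc.stdout };
  throw new Error(`Unexpected exit code ${proc.status}`);
}

// Intended usage (requires the package to be resolvable by npx):
// const { flagged, output } = runUsernameCli("npx", ["khmer-username-filter", "--json", "4dm1n"]);
```

In a CI or import script this lets a flagged username fail the step while still distinguishing it from a crash or a missing binary.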
TypeScript
Types are included.
import { check, type CheckResult } from "khmer-username-filter";
const result: CheckResult = check("official.support");
Contributing
Contributions are welcome, especially improvements to Khmer bad-word coverage, transliteration variants, impersonation terms, and spam patterns. If you notice missing words or false positives, please open an issue or pull request with clear examples so the wordlists can improve over time.
