khmer-username-filter

v1.1.1

Published

3 days ago

Detect inappropriate Khmer usernames, impersonation, spam patterns, and obfuscated Khmer-Latin slang

khmer cambodia username username-validation profanity profanity-filter filter moderation abuse-prevention khmer-unicode khmer-slang badwords content-moderation

A dependency-free Node.js package for detecting inappropriate Khmer usernames, Khmer-Latin slang, impersonation names, and spam-like account patterns.

It is built for signup forms, account moderation, admin dashboards, bots, and APIs that need a simple `check(username)` call but still want enough detail to explain why a username was rejected.

Features

Khmer Unicode and Khmer-Latin transliteration matching
Obfuscation detection for separators, repeated punctuation, and common leet text like 4dm1n
Profanity, impersonation, spam, URL, email, phone, contact-handle, scam-term, and bot-like pattern checks
Risk score, risk level, match severity, highlighted output, and machine-readable match metadata
Runtime wordlist extension plus isolated checker instances for multi-tenant apps
TypeScript declarations and a small CLI included
No external dependencies
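
To make the obfuscation feature concrete, here is a minimal sketch of how separator and leet normalization could work in principle. It is not the package's internal implementation: the `normalizeUsername` name and the substitution table are assumptions chosen for illustration.

```javascript
// Hypothetical sketch, NOT the package's internals.
// The substitution table is an assumption chosen for illustration.
const LEET_MAP = {
  "0": "o", "1": "i", "3": "e", "4": "a",
  "5": "s", "7": "t", "@": "a", "$": "s",
};

function normalizeUsername(input) {
  return input
    .toLowerCase()
    // Drop separators often used to split a word: dot, underscore, hyphen, whitespace.
    .replace(/[._\-\s]+/g, "")
    // Map leet characters back to the letters they imitate.
    .replace(/[013457@$]/g, (ch) => LEET_MAP[ch]);
}

console.log(normalizeUsername("4dm1n"));     // "admin"
console.log(normalizeUsername("a.d_m-i n")); // "admin"
```

A naive normalizer like this also rewrites harmless digits (for example a trailing `123`), so a real filter would likely compare both the raw and the normalized form before flagging anything.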

Installation

npm install khmer-username-filter

Basic Usage

const { check } = require("khmer-username-filter");

const result = check("អាឆ្កែ_admin_user");

console.log(result.isClean); // false
console.log(result.riskLevel); // "high"
console.log(result.score); // 100
console.log(result.highlighted); // "**អាឆ្កែ**_**admin**_user"
console.log(result.matches);

Example result:

{
  isClean: false,
  username: "អាឆ្កែ_admin_user",
  matches: [
    {
      word: "អាឆ្កែ",
      type: "profanity",
      description: "Contains offensive/profane Khmer content",
      severity: "high",
      score: 70,
      evidence: "direct"
    },
    {
      word: "admin",
      type: "impersonation",
      description: "Contains reserved or impersonation term",
      severity: "medium",
      score: 50,
      evidence: "direct"
    }
  ],
  highlighted: "**អាឆ្កែ**_**admin**_user",
  score: 100,
  riskLevel: "high"
}
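
Given a result in the shape shown above, a signup form usually needs a single message rather than the full match list. The sketch below picks the most severe match and builds that message. `summarizeResult` and `SEVERITY_RANK` are hypothetical helpers, not part of the package, and the `"low"` severity value is an assumption (the example only shows `"high"` and `"medium"`).

```javascript
// Hypothetical helper, not part of khmer-username-filter.
// "low" is assumed to exist alongside the "medium" and "high" values shown above.
const SEVERITY_RANK = { low: 1, medium: 2, high: 3 };

function summarizeResult(result) {
  if (result.isClean || result.matches.length === 0) {
    return "Username accepted";
  }
  // Keep the match with the highest severity; the earliest match wins ties.
  const worst = result.matches.reduce((a, b) =>
    (SEVERITY_RANK[b.severity] || 0) > (SEVERITY_RANK[a.severity] || 0) ? b : a
  );
  return `Rejected (${result.riskLevel} risk): ${worst.description}`;
}

// Sample data copied from the example result above (trimmed to the fields used).
const sample = {
  isClean: false,
  riskLevel: "high",
  matches: [
    { type: "profanity", description: "Contains offensive/profane Khmer content", severity: "high" },
    { type: "impersonation", description: "Contains reserved or impersonation term", severity: "medium" },
  ],
};

console.log(summarizeResult(sample));
// "Rejected (high risk): Contains offensive/profane Khmer content"
```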

API

`check(username, options?)`

const result = check("official.support", {
  checkProfanity: true,
  checkImpersonation: true,
  checkSpam: true,
  checkObfuscation: true,
  includeNormalized: false,
  allowlist: ["trusted_username"]
});

Options:

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| checkProfanity | boolean | true | Check built-in and custom profanity words |
| checkImpersonation | boolean | true | Check reserved, admin, official, government, media, and brand-like terms |
| checkSpam | boolean | true | Check bot, URL, email, phone, repeated character, and scam patterns |
| checkObfuscation | boolean | true | Detect separator and leet obfuscation |
| includeNormalized | boolean | false | Include normalized username in the result |
| allowlist | Array<string \| RegExp> | [] | Exact username or regex allowlist |

`isClean(username, options?)`

const { isClean } = require("khmer-username-filter");

if (!isClean("4dm1n")) {
  throw new Error("Username is not allowed");
}

`checkMany(usernames, options?)`

const { checkMany } = require("khmer-username-filter");

const results = checkMany(["សុខា_123", "john_official", "spam.com"]);
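
A common follow-up to a batch check is splitting the usernames into accepted and rejected groups. The sketch below assumes `checkMany` returns an array of result objects with the same `username` and `isClean` fields that `check` returns; the source does not spell out the return shape, so treat that as an assumption. `partitionResults` is a hypothetical helper, and the sample data stands in for real output.

```javascript
// Hypothetical helper, not part of khmer-username-filter.
// Assumes each entry has the `username` and `isClean` fields shown for check().
function partitionResults(results) {
  const accepted = [];
  const rejected = [];
  for (const result of results) {
    (result.isClean ? accepted : rejected).push(result.username);
  }
  return { accepted, rejected };
}

// Stand-in data; with the package installed this would be the checkMany(...) output.
const sampleResults = [
  { username: "user_one", isClean: true },
  { username: "user_two", isClean: false },
  { username: "user_three", isClean: true },
];

console.log(partitionResults(sampleResults));
// { accepted: [ 'user_one', 'user_three' ], rejected: [ 'user_two' ] }
```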

`createChecker(customWordlists?)`

Use this when you want custom words without mutating the package-level default checker.

const { createChecker } = require("khmer-username-filter");

const checker = createChecker({
  profanity: ["custom_bad_word", /very_bad_\d+/i],
  impersonation: ["my_app", "my_brand"],
  spam: [
    {
      pattern: /^bot_\d+$/i,
      label: "bot_pattern",
      description: "Looks like a generated bot account"
    }
  ]
});

const result = checker.check("my_brand_support");
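
For multi-tenant apps, one way to use isolated checkers is to build one per tenant on first use and cache it. The sketch below takes the factory as a parameter so that it runs without the package installed; in a real app you would pass the package's `createChecker`. `createTenantCheckers`, the tenant IDs, and the stand-in factory are all hypothetical.

```javascript
// Hypothetical helper, not part of khmer-username-filter.
// `createChecker` is injected; pass require("khmer-username-filter").createChecker in a real app.
function createTenantCheckers(createChecker, wordlistsByTenant) {
  const cache = new Map();
  return function getChecker(tenantId) {
    if (!cache.has(tenantId)) {
      // Only read own properties so IDs like "constructor" do not hit the prototype.
      const hasLists = Object.prototype.hasOwnProperty.call(wordlistsByTenant, tenantId);
      const lists = hasLists ? wordlistsByTenant[tenantId] : undefined;
      cache.set(tenantId, createChecker(lists));
    }
    return cache.get(tenantId);
  };
}

// Stand-in factory for demonstration only: it flags usernames containing a listed term.
const fakeCreateChecker = (lists = {}) => ({
  check: (name) => ({
    isClean: !(lists.impersonation || []).some((term) => name.includes(term)),
  }),
});

const getChecker = createTenantCheckers(fakeCreateChecker, {
  acme: { impersonation: ["acme_support"] },
});

console.log(getChecker("acme").check("acme_support_1").isClean);  // false
console.log(getChecker("other").check("acme_support_1").isClean); // true
```

Because each tenant gets its own instance, words added for one tenant never affect checks made for another.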

Runtime Wordlist Extension

const {
  addProfanityWord,
  addProfanityPattern,
  addImpersonationWord,
  addSpamPattern,
  loadCustomWordlists,
} = require("khmer-username-filter");

addProfanityWord("yourBadWord");
addProfanityPattern(/bad_word_\d+/i, "Custom regex profanity");
addImpersonationWord("yourBrandName");
addSpamPattern(/^test_bot_\d+$/i, "test_bot", "Test bot pattern");

loadCustomWordlists({
  profanity: ["word1", /word_\d+/i],
  impersonation: ["brand1"],
  spam: [{ pattern: /spam/i, label: "spam", description: "Spam word" }]
});

Express Example

const express = require("express");
const { check } = require("khmer-username-filter");

const app = express();
app.use(express.json());

app.post("/signup", (req, res) => {
  // Guard against a missing body (for example, a non-JSON request).
  const username = String(req.body?.username ?? "");
  const result = check(username);

  if (!result.isClean) {
    return res.status(400).json({
      message: "Username is not allowed",
      riskLevel: result.riskLevel,
      matches: result.matches.map(({ type, label, description }) => ({
        type,
        label,
        description,
      })),
    });
  }

  return res.status(204).send();
});

CLI

npx khmer-username-filter "4dm1n"
npx khmer-username-filter --json "អាឆ្កែ_admin_user"
echo "john_official" | npx khmer-username-filter

The CLI exits with code 1 when any username is flagged.
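
The exit code makes the CLI easy to drive from a script. The sketch below wraps Node's `child_process.spawnSync` and treats exit status 1 as "flagged", which is the only exit code the text above documents; the meaning of any other status is an assumption. `runUsernameCli` is a hypothetical helper, and the demo runs Node itself as a stand-in command so that it works without the package installed.

```javascript
const { spawnSync } = require("node:child_process");

// Hypothetical helper: runs a command and reports whether it exited with status 1.
function runUsernameCli(command, args) {
  const proc = spawnSync(command, args, { encoding: "utf8" });
  if (proc.error) throw proc.error; // e.g. the command was not found
  return { flagged: proc.status === 1, status: proc.status, output: proc.stdout };
}

// Stand-in demo: a Node one-liner that exits with status 1, as the CLI does for a flagged name.
const demo = runUsernameCli(process.execPath, ["-e", "process.exit(1)"]);
console.log(demo.flagged); // true

// With the package installed, the call would look like:
// runUsernameCli("npx", ["khmer-username-filter", "--json", "4dm1n"]);
```

On Windows, spawning `npx` this way may additionally need the `shell: true` option.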

TypeScript

Types are included.

import { check, type CheckResult } from "khmer-username-filter";

const result: CheckResult = check("official.support");

Contributing

Contributions are welcome, especially improvements to Khmer bad-word coverage, transliteration variants, impersonation terms, and spam patterns. If you notice missing words or false positives, please open an issue or pull request with clear examples so the wordlists can improve over time.
