anti-trojan-source

v1.11.2

Detect trojan source attacks that employ unicode bidi attacks to inject malicious code

About

Detects cases of trojan source attacks that employ unicode bidi attacks to inject malicious code, as well as other attacks that use confusable characters (such as glassworm attacks). The tool uses both an explicit list of dangerous Unicode characters and category-based detection to catch invisible characters by their Unicode category (Format and Control categories).

https://github.com/user-attachments/assets/8f10628f-3746-469e-a296-01523beeaa42

If you're using ESLint, see eslint-plugin-anti-trojan-source.

Detection Capabilities

anti-trojan-source provides comprehensive protection by detecting:

  • 281 explicit confusable scalars — bidirectional controls, zero-width characters, BMP variation selectors, a small set of non-Cf/Cc invisibles (Hangul fillers, U+034F), plus 240 supplementary variation selectors (U+E0100–U+E01EF)
  • All Unicode Format characters (Cf category) — invisible formatting characters by category (including Unicode tag letters used for ASCII smuggling / hidden payloads)
  • All Unicode Control characters (Cc category) — except commonly-used whitespace (TAB, LF, CR)

Category-based Cf/Cc detection keeps the tool future-proof as Unicode adds new format or control code points. The explicit list covers characters that matter for security but are not Cf/Cc (e.g. BMP variation selectors are Mn, not Cf).
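To make the category-based idea concrete, here is a minimal sketch that approximates it with JavaScript's Unicode property escapes. It is not the package's implementation (which uses its own category tables plus the explicit list); the function name `findFormatAndControlChars` is hypothetical, and columns are counted per code point, which is an assumption about how positions are reported.

```javascript
// Sketch only: approximates Cf/Cc detection with regex property escapes.
// The explicit (non-Cf/Cc) list described above is NOT covered here.
const ALLOWED_CONTROLS = new Set(['\t', '\n', '\r'])

function findFormatAndControlChars(sourceText) {
  const findings = []
  sourceText.split('\n').forEach((lineText, lineIndex) => {
    let column = 1
    // for...of walks the string by code point, so astral characters count once
    for (const ch of lineText) {
      const isFormat = /\p{Cf}/u.test(ch)
      const isControl = /\p{Cc}/u.test(ch) && !ALLOWED_CONTROLS.has(ch)
      if (isFormat || isControl) {
        const hex = ch.codePointAt(0).toString(16).toUpperCase().padStart(4, '0')
        findings.push({
          line: lineIndex + 1,
          column,
          codePoint: `U+${hex}`,
          category: isFormat ? 'Cf (Format)' : 'Cc (Control)'
        })
      }
      column += 1
    }
  })
  return findings
}

console.log(findFormatAndControlChars('const value\u200b = 123'))
```

Because the check is driven by the category rather than a fixed list, any code point the JavaScript engine's Unicode data classifies as Cf or Cc is flagged without a code change.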

Scope

This project scans decoded Unicode text — the string you get after reading a UTF‑8 (or other Unicode encoding) file the usual way. It does not inspect raw bytes, URLs, or tokenizer-specific behavior.

In scope

| Topic | Detection approach |
| ----- | ------------------ |
| Trojan Source (bidi embeddings, overrides, isolates, PDF, etc.) | Cf ranges + explicit list |
| Zero-width / word joiner / BOM / soft hyphen (where Cf or listed) | Cf + explicit list |
| Unicode Tags block (U+E0001, U+E0020–U+E007F): invisible ASCII-shaped payloads | Cf |
| Variation selectors (BMP U+FE00–U+FE0F + supplement U+E0100–U+E01EF) | Explicit list (Mn in Unicode, not Cf) |
| Strict explicit blocklist of a few non-Cf/Cc scalars that often render invisibly (U+034F, U+115F, U+1160, U+3164) | Explicit list only |
| Any other Format (Cf) or Control (Cc) code point | Category tables (Cc minus TAB/LF/CR) |
| Dangerous confusables on the maintained explicit list (e.g. NO-BREAK SPACE) | Explicit list |
| Optional extended blocklist (CLI --extended / --all, library extended: true): small set of ASCII-lookalike homoglyphs and extra invisible letters | src/extended-blocklist.js; findings use severity: "low" vs default "high" |

Out of scope

| Topic | Reason |
| ----- | ------ |
| Full homoglyph / mixed-script confusable-IDN databases (“every Cyrillic lookalike of Latin”) | Default scan still uses a small curated set; --extended adds more lookalikes but not a complete IDN/confusables database |
| UTF‑8 “sneaky” byte patterns, overlong encodings, non-Unicode steganography | Needs byte-level analysis, not scalar-by-scalar Unicode |
| URL / percent-encoded layers, HTML entities | Decode/normalize elsewhere first |
| Full rendering, grapheme clusters, locale-specific display rules | Tooling is scalar-based and intentionally simple |
| Whether a finding is malicious | High-signal alert for human review |

Invisible Characters Support Matrix

The following table summarizes attack styles versus what this tool flags:

| Attack Type | Supported | Notes |
| ----------- | :-------: | ----- |
| Trojan Source | ✅ | Bidi / format controls per trojansource.codes. |
| Glassworm / confusable identifiers | ✅ (partial) | Flags explicit confusables and all Cf/Cc; not a complete homoglyph alphabet. |
| Unicode tag / “ASCII smuggling” | ✅ | Tag letters are Cf; see Embrace The Red. |
| Extended variation selectors | ✅ | U+E0100–U+E01EF on explicit list. |
| Category-based Cf / Cc | ✅ | Future-proof for new format/control code points. |
| Invisible letters (strict list) | ✅ | U+034F, Hangul fillers; explicit blocklist only. |
| Extended homoglyphs / extra invisibles | ✅ (opt-in) | Use CLI --extended or --all, or hasConfusables({ extended: true }). Noisier; see src/extended-blocklist.js. |
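As an illustration of the “ASCII smuggling” row, the sketch below shows why tag characters matter: code points U+E0020–U+E007E mirror printable ASCII at an offset of 0xE0000, so a string can carry readable text that most editors do not display. The helper names `encodeAsTags` and `decodeTagPayload` are hypothetical and are not part of this package.

```javascript
// Tag characters U+E0020..U+E007E mirror ASCII 0x20..0x7E, shifted by 0xE0000.
const TAG_OFFSET = 0xe0000

// Recover any ASCII text hidden as tag characters inside a string
function decodeTagPayload(text) {
  let hidden = ''
  for (const ch of text) {
    const cp = ch.codePointAt(0)
    if (cp >= 0xe0020 && cp <= 0xe007e) {
      hidden += String.fromCodePoint(cp - TAG_OFFSET)
    }
  }
  return hidden
}

// Demo helper: turn printable ASCII into its invisible tag-character form
function encodeAsTags(ascii) {
  return Array.from(ascii, (ch) =>
    String.fromCodePoint(ch.codePointAt(0) + TAG_OFFSET)
  ).join('')
}

const sample = 'hello' + encodeAsTags('hidden note')
console.log(decodeTagPayload(sample)) // prints "hidden note"
```

Since tag characters are in the Cf category, the default scan already flags them; a decoder like this is only useful for showing a reviewer what the payload says.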

Why is Confusable Unicode Character detection important?

The publication Trojan Source: Invisible Vulnerabilities raised concern about supply chain attacks in which adversaries inject malicious code into a project's source code and slip it past code review unseen. This project builds on that research to detect other forms of confusable characters that can be used in similar attacks.

For more information, see the official website trojansource.codes and the source code repository that accompanies the publication.


Use as a CLI

anti-trojan-source is an npm package that detects files containing confusable Unicode characters, as described in the research.

Detect confusable characters using file globbing

The following command scans every file matching the given glob pattern for confusable Unicode characters:

npx anti-trojan-source --files='src/**/*.js'

If it doesn't find anything it will return with a 0 exit code and print to stdout:

[✓] No confusable characters detected

Detect confusable characters using file paths

npx anti-trojan-source '/src/index.js' '/src/helper.js'

If it finds any confusable Unicode characters, it exits with code 1 and prints to stderr:

[x] Detected cases of confusable characters in the following files:
|
 - /src/index.js
 - /src/helper.js

Note: For backward compatibility, `hasTrojanSource({...})` is still exported as an alias to `hasConfusables({...})`. It is deprecated and will be removed in a future major version. Prefer `hasConfusables` going forward.

Detect confusable characters by piping input

If you run npx anti-trojan-source on its own and pipe in a file's contents, it scans that input for confusable Unicode characters:

cat /src/index.js | npx anti-trojan-source

Verbose output mode

Use the --verbose (or -v) flag to get detailed information about each detected character, including line and column numbers, character names, and Unicode code points:

npx anti-trojan-source --files='src/**/*.js' --verbose

Example output:

[x] Detected cases of trojan source in the following files:
| 
 - src/utils.js

   Line 12:34 - U+200B ZERO WIDTH SPACE [Cf (Format)]
   Snippet: const value = getUserInput()
   Line 45:10 - U+202E RIGHT-TO-LEFT OVERRIDE [Cf (Format)]
   Snippet: if (isAdmin) { // Check permissions

This mode is particularly useful for:

  • Code reviews: Quickly identify where invisible characters are located
  • Debugging: Understand which specific characters are causing issues
  • Security audits: Get detailed reports of all suspicious characters

Extended scan (--extended / --all)

By default, the tool only reports high-severity matches: all Cf/Cc (except TAB/LF/CR) plus the core explicit list in src/constants.js. To also flag a curated set of ASCII-lookalike homoglyphs and a few extra invisible letters (see src/extended-blocklist.js), pass --extended or --all. Those findings are labeled severity: "low" in JSON and in verbose CLI output (still exit code 1 when any finding is present).

Low-severity hits can appear in legitimate localized text; treat them as review prompts, not automatic malice.

npx anti-trojan-source --files='src/**/*.js' --extended
npx anti-trojan-source --files='src/**/*.js' --extended --verbose
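To show why low-severity hits call for review rather than automatic rejection, here is a minimal sketch of homoglyph matching. The three-entry `LOOKALIKES` map and the `findLookalikes` function are hypothetical and exist only for illustration; the real opt-in list lives in src/extended-blocklist.js and may differ.

```javascript
// Hypothetical lookalike map, for illustration only.
// The package's actual opt-in list is in src/extended-blocklist.js.
const LOOKALIKES = new Map([
  ['\u0430', 'a'], // CYRILLIC SMALL LETTER A
  ['\u0435', 'e'], // CYRILLIC SMALL LETTER IE
  ['\u043e', 'o'] // CYRILLIC SMALL LETTER O
])

// Scan a single line and report characters that imitate ASCII letters
function findLookalikes(lineText) {
  const findings = []
  let column = 1
  for (const ch of lineText) {
    if (LOOKALIKES.has(ch)) {
      findings.push({
        column,
        char: ch,
        looksLike: LOOKALIKES.get(ch),
        severity: 'low'
      })
    }
    column += 1
  }
  return findings
}

// "p\u0430ssword" renders like "password" but contains a Cyrillic letter
console.log(findLookalikes('const p\u0430ssword = 1'))
```

The same Cyrillic letter is perfectly legitimate inside a Russian string literal or comment, which is why such matches are best treated as prompts for a closer look.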

JSON output mode

Use the --json (or -j) flag to get machine-readable JSON output, perfect for CI/CD integration and automated processing:

npx anti-trojan-source --files='src/**/*.js' --json

Example output:

[
  {
    "file": "src/utils.js",
    "findings": [
      {
        "line": 12,
        "column": 34,
        "codePoint": "U+200B",
        "name": "ZERO WIDTH SPACE",
        "category": "Cf (Format)",
        "severity": "high",
        "snippet": "const value = getUserInput()"
      }
    ]
  }
]

This mode enables:

  • CI/CD integration: Parse results programmatically in your pipeline
  • Custom reporting: Build your own reporting tools on top of the detection
  • Automated workflows: Trigger specific actions based on findings
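As one possible way to consume the report in a pipeline, the sketch below counts findings by severity and applies a gating policy. It assumes only the JSON shape shown in the example output; `summarizeReport` and `shouldFailBuild` are hypothetical names, and the "fail only on high severity" policy is an assumption, not the CLI's behavior (the CLI exits with code 1 on any finding).

```javascript
// Sketch: summarize a --json report (shape as in the example output above)
function summarizeReport(jsonText) {
  const report = JSON.parse(jsonText)
  const summary = { files: report.length, high: 0, low: 0 }
  for (const entry of report) {
    for (const finding of entry.findings) {
      if (finding.severity === 'low') summary.low += 1
      else summary.high += 1
    }
  }
  return summary
}

// Policy assumption: only high-severity findings block the pipeline
function shouldFailBuild(summary) {
  return summary.high > 0
}

const exampleReport = JSON.stringify([
  {
    file: 'src/utils.js',
    findings: [
      {
        line: 12,
        column: 34,
        codePoint: 'U+200B',
        name: 'ZERO WIDTH SPACE',
        category: 'Cf (Format)',
        severity: 'high',
        snippet: 'const value = getUserInput()'
      }
    ]
  }
])

const summary = summarizeReport(exampleReport)
console.log(summary) // { files: 1, high: 1, low: 0 }
console.log(shouldFailBuild(summary)) // true
```

How the JSON text reaches this function (redirected output, a temporary file, a child process) depends on your pipeline and is left out here.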

Use as an eslint plugin

To run this detection from ESLint, see eslint-plugin-anti-trojan-source; the README in that repository explains how to set it up.

Use as a library

Simple boolean check

To use it as a library, pass the file contents to hasConfusables as sourceText (the boolean return is kept for backward compatibility):

import { hasConfusables } from 'anti-trojan-source'

const isDangerous = hasConfusables({
  sourceText: 'if (accessLevel != "user‮ ⁦// Check if admin⁩ ⁦") {'
})

console.log(isDangerous) // true or false

hasConfusables returns a boolean when called without the detailed option.

Detailed findings

Get comprehensive information about detected characters including their location, names, and categories:

import { hasConfusables } from 'anti-trojan-source'

const findings = hasConfusables({
  sourceText: 'const value\u200b = 123', // ZERO WIDTH SPACE
  detailed: true
})

// Optional: pass extended: true to include homoglyphs / extra invisibles (severity "low").

console.log(findings)
// [
//   {
//     line: 1,
//     column: 12,
//     codePoint: "U+200B",
//     name: "ZERO WIDTH SPACE",
//     category: "Cf (Format)",
//     severity: "high",
//     snippet: "const value = 123"
//   }
// ]

Each finding includes:

  • line: Line number where the character was found
  • column: Column number where the character was found
  • codePoint: Unicode code point (e.g., "U+200B")
  • name: Descriptive name of the character
  • category: The Unicode category (e.g. "Cf (Format)"), or one of Confusable, Variation Selector, or Extended blocklist (the last only when severity is low)
  • severity: "high" (default scan: Cf/Cc + core explicit list) or "low" (extended blocklist only, when extended: true)
  • snippet: Context from the line (up to 80 characters)
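If you want to render findings yourself, a small formatter is enough. The sketch below mimics the layout of the verbose CLI output shown earlier; `formatFinding` is a hypothetical helper, not an export of this package, and it relies only on the fields listed above.

```javascript
// Sketch: turn one finding object into a two-line, human-readable report entry
function formatFinding(finding) {
  const location = `Line ${finding.line}:${finding.column}`
  const character = `${finding.codePoint} ${finding.name} [${finding.category}]`
  return `${location} - ${character}\nSnippet: ${finding.snippet}`
}

const exampleFinding = {
  line: 1,
  column: 12,
  codePoint: 'U+200B',
  name: 'ZERO WIDTH SPACE',
  category: 'Cf (Format)',
  severity: 'high',
  snippet: 'const value = 123'
}

console.log(formatFinding(exampleFinding))
// Line 1:12 - U+200B ZERO WIDTH SPACE [Cf (Format)]
// Snippet: const value = 123
```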

The package also exports extendedConfusableChars if you need to introspect the opt-in list.

You can also check multiple files at once:

import { hasConfusablesInFiles } from 'anti-trojan-source'

const results = hasConfusablesInFiles({
  filePaths: ['src/index.js', 'src/utils.js'],
  detailed: true // Optional: get detailed findings
})

console.log(results)
// [
//   {
//     file: "src/index.js",
//     findings: [ /* array of findings */ ]
//   }
// ]

Use as a pre-commit hook

To add this tool to your project as a pre-commit hook, try this sample configuration in .pre-commit-config.yaml:

repos:
  - repo: https://github.com/lirantal/anti-trojan-source
    rev: v1.8.1  # choose the release you want
    hooks:
      - id: anti-trojan-source

Contributing

Please consult CONTRIBUTING for guidelines on contributing to this project.

Author

anti-trojan-source © Liran Tal, Released under the Apache-2.0 License.