
deghost

v0.0.1

Strip invisible Unicode characters and normalize whitespace. Chainable, typesafe, zero dependencies.

Downloads

94

npm install deghost

Why

Text from binary formats, APIs, and user input is full of invisible Unicode characters — non-breaking spaces, zero-width joiners, directional marks, BOM, control characters. They break string comparison, corrupt search indexes, and produce garbled output.

Existing tools either strip everything indiscriminately or miss entire character categories. deghost gives you category-level control with a chainable API that distinguishes between stripping (remove entirely) and normalizing (replace with a visible substitute).

Quick start

import { deghost } from 'deghost'

// Sensible defaults — handles the common cases
;`${deghost('Plant\u00a064\u00a0-\u00a0Woodbridge')}`
// → 'Plant 64 - Woodbridge'

`${deghost('hello\u200Bworld')}`
// → 'helloworld'

// Also works as a tagged template literal
`${deghost`Plant\u00a064\u00a0-\u00a0Woodbridge`}`
// → 'Plant 64 - Woodbridge'
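One way a single function can serve both as a normal call and as a template tag is to check whether its first argument is a plain string or a template strings array. The sketch below illustrates that dispatch only; `stripZeroWidth` is a hypothetical helper that returns a plain string, not deghost's actual implementation (which returns a chain).

```typescript
// Hypothetical helper for illustration; not part of deghost.
function stripZeroWidth(
  input: string | TemplateStringsArray,
  ...values: unknown[]
): string {
  // A plain call passes a string; a tagged call passes the literal's string
  // parts first, followed by the interpolated values.
  const text =
    typeof input === 'string'
      ? input
      : input.reduce(
          (acc, part, i) => acc + part + (i < values.length ? String(values[i]) : ''),
          ''
        )
  // Remove ZERO WIDTH SPACE, NON-JOINER and JOINER (U+200B to U+200D).
  return text.replace(/[\u200B-\u200D]/g, '')
}

const plain = stripZeroWidth('hello\u200Bworld')
const tagged = stripZeroWidth`hello\u200B${'big'}\u200Cworld`

console.log(plain) // 'helloworld'
console.log(tagged) // 'hellobigworld'
```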

Chainable API

Fine-grained control over what gets stripped vs. normalized:

import { deghost } from 'deghost'

deghost('text\u200B\u00a0here')
  .strip('format') // zero-width joiners, directional marks, soft hyphens
  .strip('control') // C0/C1 control characters
  .normalize('spaces') // NBSP, en/em space → regular space
  .trim()
  .toString()
// → 'text here'

The chain is immutable — each method returns a new instance, so you can branch without side effects.
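The immutability pattern itself can be sketched in a few lines. `MiniChain` below is a hypothetical stand-in, not deghost's `DeghostChain`; it only shows why returning a fresh instance from every method makes branching safe.

```typescript
// Hypothetical illustration of an immutable chain; not deghost's implementation.
class MiniChain {
  private readonly value: string

  constructor(value: string) {
    this.value = value
  }

  // Each method builds a new instance and leaves `this` untouched.
  stripFormat(): MiniChain {
    return new MiniChain(this.value.replace(/\p{Cf}/gu, ''))
  }

  normalizeSpaces(): MiniChain {
    return new MiniChain(this.value.replace(/\p{Zs}/gu, ' '))
  }

  toString(): string {
    return this.value
  }
}

const base = new MiniChain('a\u200Bb\u00a0c')
const stripped = base.stripFormat() // branch 1: only the zero-width space is removed
const normalized = base.normalizeSpaces() // branch 2: only the NBSP is replaced

console.log(stripped.toString() === 'ab\u00a0c') // true
console.log(normalized.toString() === 'a\u200Bb c') // true
console.log(base.toString() === 'a\u200Bb\u00a0c') // true, the base is unchanged
```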

Chain methods

| Method | Returns | Description |
| ------------------------------------ | -------------- | -------------------------------------------------------------------- |
| .strip(category) | DeghostChain | Remove all characters in a category |
| .normalize(category, replacement?) | DeghostChain | Replace characters with a substitute (default: ' ') |
| .replace(category, mapper) | DeghostChain | Replace characters using a function that receives detection metadata |
| .highlight(category?, formatter?) | DeghostChain | Replace ghosts with visible markers like [U+200B] |
| .collapse() | DeghostChain | Collapse runs of whitespace into a single space |
| .trim() | DeghostChain | Trim leading/trailing whitespace |
| .clean() | DeghostChain | Apply the default preset |
| .detect(categories?) | Detection[] | Return detections for the current value |
| .hasGhosts(categories?) | boolean | Check if invisible characters remain |
| .isClean(categories?) | boolean | Inverse of .hasGhosts() |
| .count(categories?) | Record | Count ghosts by category |
| .summary(categories?) | string | Human-readable report of ghosts found |
| .toString() | string | Extract the string |

Categories:

| Category | What it matches | Default behavior |
| --------- | ------------------------------------------------------------ | ------------------ |
| format | Zero-width joiners, directional marks, soft hyphens (\p{Cf}) | Strip |
| control | C0/C1 control characters (\p{Cc}) | Strip |
| spaces | NBSP, en/em space, thin space, ideographic space (\p{Zs}) | Normalize to ' ' |
| bom | Byte order mark (U+FEFF) | Strip |
| tag | Unicode tag characters (U+E0001–U+E007F) | — |
| fillers | Hangul, Khmer, Mongolian, Ogham fillers | — |
| math | Invisible math operators (U+2061–U+2064) | — |

Reusable cleaners

Build a cleaning pipeline once, apply it to many strings with no per-call chain allocation:

import { cleaner } from 'deghost'

const clean = cleaner().strip('format').strip('control').normalize('spaces').trim().build()

clean('dirty\u00a0string') // 'dirty string'
clean('another\u200Bone') // 'anotherone'
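The idea behind a reusable cleaner can be sketched as plain function composition: the steps are captured once, and each call only runs them in order. `buildCleaner` and the step names below are hypothetical and say nothing about how deghost's `cleaner()` is implemented internally.

```typescript
// Hypothetical sketch of a prebuilt cleaning pipeline; not deghost's internals.
type Step = (s: string) => string

function buildCleaner(...steps: Step[]): Step {
  // The step list is fixed at build time; calling the result allocates no builder.
  return (input) => steps.reduce((acc, step) => step(acc), input)
}

const stripFormat: Step = (s) => s.replace(/\p{Cf}/gu, '')
const normalizeSpaces: Step = (s) => s.replace(/\p{Zs}/gu, ' ')
const trimEnds: Step = (s) => s.trim()

const cleanSketch = buildCleaner(stripFormat, normalizeSpaces, trimEnds)

console.log(cleanSketch(' dirty\u00a0string ')) // 'dirty string'
console.log(cleanSketch('another\u200Bone')) // 'anotherone'
```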

Cleaners also support .replace() and .highlight() for dynamic transformations:

const annotate = cleaner().highlight('format').normalize('spaces').build()

annotate('a\u200Bb\u00a0c') // 'a[U+200B]b c'

Detection

Find out what's hiding in your strings:

import { detect, hasGhosts, isClean, count, first, scan } from 'deghost'

detect('sneaky\u200Btext')
// [{
//   char: '\u200B',
//   codepoint: 'U+200B',
//   name: 'ZERO WIDTH SPACE',
//   category: 'format',
//   offset: 6
// }]

hasGhosts('hello\u200Bworld') // true
isClean('hello world') // true

count('a\u00a0b\u200Bc\u200Bd')
// { spaces: 1, format: 2 }

// Get just the first detection (stops early)
first('a\u200Bb\u00a0c')
// { char: '\u200B', codepoint: 'U+200B', ... }

// Lazy iterator for large strings
for (const d of scan(largeString)) {
  if (d.category === 'format') break
}
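Early exit is cheap when detection is written as a generator, because no work happens past the point where the consumer stops. The sketch below covers only the `format` category and uses hypothetical names (`scanFormat`, `firstHit`, `Hit`); deghost's own `scan` and `first` may be built differently.

```typescript
// Hypothetical lazy scanner for \p{Cf} characters; not deghost's implementation.
interface Hit {
  char: string
  codepoint: string
  offset: number
}

function* scanFormat(text: string): Generator<Hit> {
  const re = /\p{Cf}/gu
  let m: RegExpExecArray | null
  // exec() resumes from re.lastIndex, so each next() finds exactly one more match.
  while ((m = re.exec(text)) !== null) {
    const cp = m[0].codePointAt(0) as number
    yield {
      char: m[0],
      codepoint: 'U+' + cp.toString(16).toUpperCase().padStart(4, '0'),
      offset: m.index,
    }
  }
}

// Returns after the first match; the rest of the string is never scanned.
function firstHit(text: string): Hit | undefined {
  for (const hit of scanFormat(text)) return hit
  return undefined
}

console.log(firstHit('a\u200Bb\u200Dc')?.codepoint) // 'U+200B'
console.log(Array.from(scanFormat('a\u200Bb\u200Dc')).length) // 2
```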

All detection functions accept an optional categories array to filter:

detect('a\u200Bb\u00a0c', ['spaces'])
// Only returns the NBSP detection

Highlighting

Make invisible characters visible for debugging:

import { highlight } from 'deghost'

highlight('hello\u200Bworld')
// 'hello[U+200B]world'

// Custom formatter
highlight('a\u200Bb', (d) => `{${d.name}}`)
// 'a{ZERO WIDTH SPACE}b'

// Filter by category
highlight('a\u00a0b\u200Bc', { categories: ['format'] })
// 'a\u00a0b[U+200B]c'
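For readers curious how such markers can be produced, a replace callback is enough. This is a minimal sketch restricted to the `format` category, with a hypothetical `highlightFormat` function whose formatter receives only the codepoint label rather than deghost's full detection object.

```typescript
// Hypothetical highlighter for \p{Cf} characters; not deghost's implementation.
function highlightFormat(
  text: string,
  format: (codepoint: string) => string = (cp) => `[${cp}]`
): string {
  return text.replace(/\p{Cf}/gu, (ch) => {
    const hex = (ch.codePointAt(0) as number).toString(16).toUpperCase().padStart(4, '0')
    return format(`U+${hex}`)
  })
}

console.log(highlightFormat('hello\u200Bworld')) // 'hello[U+200B]world'
console.log(highlightFormat('a\u200Bb', (cp) => `<${cp}>`)) // 'a<U+200B>b'
// The NBSP is in \p{Zs}, not \p{Cf}, so it is left as-is here.
console.log(highlightFormat('a\u00a0b\u200Bc') === 'a\u00a0b[U+200B]c') // true
```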

Summary

Get a human-readable report of all invisible characters:

import { summary } from 'deghost'

summary('hello\u200Bworld\u00a0here')
// 2 invisible characters found.
//
// By category:
//   format: 1
//   spaces: 1
//
// Details:
//   U+200B  ZERO WIDTH SPACE  (format, offset 5)
//   U+00A0  NO-BREAK SPACE  (spaces, offset 11)

Character lookup

Identify a single character or codepoint:

import { identify } from 'deghost'

identify('\u200B')
// { codepoint: 'U+200B', name: 'ZERO WIDTH SPACE', category: 'format' }

identify(0x00a0)
// { codepoint: 'U+00A0', name: 'NO-BREAK SPACE', category: 'spaces' }

identify('a') // undefined — not a ghost

Presets

import { presets } from 'deghost'

// Default: strip format + control + BOM, normalize spaces
presets.clean('text\u00a0with\u200Bghosts')
// → 'text withghosts' (the NBSP becomes a space; the zero-width space is removed)

// Aggressive: strip everything invisible
presets.aggressive('text\u2061with\u200Bghosts')
// → 'textwithghosts'

// Spaces only: just normalize whitespace
presets.spaces('text\u00a0here')
// → 'text here'

How it works

deghost uses ES2018 Unicode property escapes (\p{Cf}, \p{Cc}, \p{Zs}) for broad category matching, plus curated codepoint sets for categories not covered by a single Unicode general category (tag characters, script-specific fillers, invisible math operators).
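A rough sketch of how curated sets and property escapes might be combined is shown below. The byte order mark, the tag characters and the invisible math operators all carry the general category Cf, so a classifier that wants to report them separately has to test the curated ranges before falling back to `\p{Cf}`. The `classify` function, its check order, and the exclusion of the ordinary space are assumptions made for illustration, not deghost's actual logic; the `fillers` category is omitted because its codepoints are not listed here.

```typescript
// Hypothetical classifier; ranges are taken from the category table, the rest is assumed.
type Category = 'bom' | 'tag' | 'math' | 'format' | 'control' | 'spaces'

function classify(ch: string): Category | undefined {
  const cp = ch.codePointAt(0)
  if (cp === undefined) return undefined

  // Curated sets first: these codepoints would otherwise match \p{Cf}.
  if (cp === 0xfeff) return 'bom'
  if (cp >= 0xe0001 && cp <= 0xe007f) return 'tag'
  if (cp >= 0x2061 && cp <= 0x2064) return 'math'

  // Broad general-category matching via Unicode property escapes.
  if (/^\p{Cf}$/u.test(ch)) return 'format'
  if (/^\p{Cc}$/u.test(ch)) return 'control'
  // Assumption: a regular space (U+0020) is in \p{Zs} but is not a ghost.
  if (cp !== 0x20 && /^\p{Zs}$/u.test(ch)) return 'spaces'

  return undefined
}

console.log(classify('\u200B')) // 'format'
console.log(classify('\uFEFF')) // 'bom'
console.log(classify('\u2061')) // 'math'
console.log(classify('\u00a0')) // 'spaces'
console.log(classify('a')) // undefined
```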

The key design choice: strip vs. normalize. A non-breaking space (U+00A0) should become a regular space, not disappear — otherwise "Plant\u00a064" becomes "Plant64". deghost handles this by default; out-of-character does not.
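The difference is easy to reproduce with two regular expressions. The snippet below is a standalone illustration using hypothetical variable names and a hypothetical `defaultLike` function; it approximates the default behavior described in the category table and is not deghost's code.

```typescript
const input = 'Plant\u00a064'

// Stripping the space separator glues the two tokens together.
const strippedResult = input.replace(/\p{Zs}/gu, '')
// Normalizing keeps the word boundary by substituting a regular space.
const normalizedResult = input.replace(/\p{Zs}/gu, ' ')

console.log(strippedResult) // 'Plant64'
console.log(normalizedResult) // 'Plant 64'

// Hypothetical approximation of the defaults: strip format and control
// characters, normalize space separators. Note that \p{Cc} also matches
// tabs and newlines, so this sketch removes those as well.
function defaultLike(text: string): string {
  return text.replace(/[\p{Cf}\p{Cc}]/gu, '').replace(/\p{Zs}/gu, ' ')
}

console.log(defaultLike('Plant\u00a064\u200B\u00a0-\u00a0Woodbridge')) // 'Plant 64 - Woodbridge'
```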

Comparison

| Feature | deghost | out-of-character |
| ------------------------------- | ------- | ---------------- |
| Strip invisible chars | yes | yes |
| Normalize spaces (NBSP → space) | yes | no (strips) |
| Chainable API | yes | no |
| Reusable cleaners | yes | no |
| Detection with metadata | yes | yes |
| Category-level control | yes | no |
| Highlighting / debugging | yes | no |
| Tagged template literal | yes | no |
| TypeScript-native | yes | no |
| Presets | yes | no |
| CLI | not yet | yes |
| Zero dependencies | yes | yes |

Requirements

Node.js >= 18. Uses ES2018 Unicode property escapes (supported in all modern runtimes).

License

MIT