unicode-to-plain-text

v0.0.6

Published

16 hours ago

Convert fancy Unicode text to plain ASCII with smart language preservation

Vulnerabilities: 0 High, 0 Medium, 0 Low

mafintosh

chocky335

unicode ascii text-processing normalization unicode-to-ascii string-manipulation text-sanitization text-cleaning fancy-text plain-text text-normalization unicode-normalization

Install

npm i unicode-to-plain-text

Usage

Basic usage:

import { toPlainText } from 'unicode-to-plain-text'

// Mathematical styles
toPlainText('𝐇𝐞𝐥𝐥𝐨 𝐖𝐨𝐫𝐥𝐝') // => 'Hello World'

// Enclosed characters
toPlainText('🅣🅔🅢🅣') // => 'TEST'

// Fullwidth forms
toPlainText('ＨＥＬＬＯ') // => 'HELLO'
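
The README does not describe how these conversions are implemented. As a rough sketch only, and not the library's actual code, the standard `String.prototype.normalize('NFKC')` folds mathematical alphanumerics and fullwidth forms to ASCII, while the negative circled letters (U+1F150 to U+1F169) have no compatibility mapping and need an explicit offset. The helper name `foldStyledLetters` is hypothetical.

```javascript
// Hypothetical helper for illustration; not part of unicode-to-plain-text.
const NEG_CIRCLED_A = 0x1f150; // NEGATIVE CIRCLED LATIN CAPITAL LETTER A
const NEG_CIRCLED_Z = 0x1f169; // NEGATIVE CIRCLED LATIN CAPITAL LETTER Z

function foldStyledLetters(text) {
  // NFKC applies Unicode compatibility mappings (math bold, fullwidth, ...).
  const folded = text.normalize('NFKC');
  let out = '';
  // for...of iterates by code point, so astral characters are not split.
  for (const ch of folded) {
    const cp = ch.codePointAt(0);
    if (cp >= NEG_CIRCLED_A && cp <= NEG_CIRCLED_Z) {
      // Map the enclosed letter onto 'A'..'Z' by its offset in the block.
      out += String.fromCharCode(0x41 + (cp - NEG_CIRCLED_A));
    } else {
      out += ch;
    }
  }
  return out;
}

foldStyledLetters('ＨＥＬＬＯ'); // 'HELLO'
```

NFKC also rewrites other compatibility characters such as ligatures and superscripts, so a real implementation may prefer explicit mapping tables.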

Language preservation:

// Real languages are automatically preserved
toPlainText('Hello Γεια σας')  // => 'Hello Γεια σας' (Greek preserved)
toPlainText('Test Привет')     // => 'Test Привет' (Cyrillic preserved)

// But lookalike characters are converted
toPlainText('Α test')  // => 'A test' (Greek Alpha → Latin A)
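
How the library tells real Greek or Cyrillic text apart from lookalike substitutions is not stated here. One plausible heuristic, shown purely as an assumption, converts a word only when every character in it is either ASCII or has a Latin lookalike. The `CONFUSABLES` table and the `foldLookalikes` name are hypothetical, and the table covers only a handful of characters.

```javascript
// Hypothetical sketch; the table is a tiny sample, not the library's data.
const CONFUSABLES = new Map([
  ['\u0391', 'A'], // Greek capital Alpha
  ['\u0392', 'B'], // Greek capital Beta
  ['\u0395', 'E'], // Greek capital Epsilon
  ['\u039F', 'O'], // Greek capital Omicron
  ['\u0410', 'A'], // Cyrillic capital A
  ['\u0415', 'E'], // Cyrillic capital Ie
  ['\u041E', 'O'], // Cyrillic capital O
  ['\u0435', 'e'], // Cyrillic small ie
  ['\u043E', 'o'], // Cyrillic small o
]);

function foldLookalikes(text) {
  // Split on whitespace but keep the separators so spacing is untouched.
  return text
    .split(/(\s+)/)
    .map((word) => {
      const chars = [...word];
      // A word with any non-ASCII character that has no Latin lookalike
      // is treated as real language and left alone.
      const convertible = chars.every(
        (ch) => ch.codePointAt(0) < 0x80 || CONFUSABLES.has(ch)
      );
      return convertible
        ? chars.map((ch) => CONFUSABLES.get(ch) ?? ch).join('')
        : word;
    })
    .join('');
}
```

A word-level rule like this would misfire on genuine words made up entirely of lookalike letters, so a production heuristic would need more context than this sketch uses.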

Custom pipelines:

import {
  pipe,
  handleUpsideDown,
  mapCharacters,
  normalizeUnicode,
  removeDecorations,
  normalizeWhitespace,
  normalizeCasing
} from 'unicode-to-plain-text'

// Create a custom pipeline; steps left out (here, normalizeCasing) are skipped
const customTransform = pipe(
  handleUpsideDown,
  mapCharacters,
  normalizeUnicode,
  removeDecorations,
  normalizeWhitespace
)

const result = customTransform('𝐓𝐄𝐒𝐓')
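
For readers unfamiliar with the pattern, a pipeline helper of this kind is usually a left-to-right function composition. The following is a minimal sketch under that assumption, not the library's source. `pipeSketch` and the three stand-in steps are hypothetical names.

```javascript
// Hypothetical re-implementation for illustration only.
function pipeSketch(...fns) {
  // Feed the input through each function in order, left to right.
  return (input) => fns.reduce((value, fn) => fn(value), input);
}

// Stand-in steps (hypothetical) to show that order is preserved.
const trim = (s) => s.trim();
const collapseSpaces = (s) => s.replace(/\s+/g, ' ');
const shout = (s) => s.toUpperCase();

const tidy = pipeSketch(trim, collapseSpaces, shout);
tidy('  a   b '); // 'A B'
```

Because each step takes a string and returns a string, steps can be reordered or dropped without changing the others.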

API

toPlainText(text, options?)

Converts fancy Unicode text to plain ASCII

| Parameter | Type | Description |
| --------- | ------ | ---------------------------------- |
| text | string | Input text with Unicode characters |
| options | object | Optional configuration object |

Options

| Option | Type | Default | Description |
| ---------------- | ------- | ------- | ------------------------------------------------------------------------------------ |
| normalizeSpaces | boolean | true | Collapse multiple spaces and trim whitespace |
| skipEmoji | boolean | false | Preserve emoji characters (still removes other decorations like box drawing, arrows) |

Examples

// Default behavior - emojis removed
toPlainText('Hello 🎉 World') // => 'Hello World'

// Preserve emojis
toPlainText('Hello 🎉 World', { skipEmoji: true }) // => 'Hello 🎉 World'

// Preserve spacing
toPlainText('Hello   World', { normalizeSpaces: false }) // => 'Hello   World'

// Combined options
toPlainText('𝐇𝐞𝐥𝐥𝐨  🎉  𝐖𝐨𝐫𝐥𝐝', { skipEmoji: true, normalizeSpaces: false })
// => 'Hello  🎉  World'

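Returns the converted string: ASCII apart from preserved real-language text (for example Greek or Cyrillic) and any emoji kept by skipEmoji, with casing normalized and, unless normalizeSpaces is false, whitespace collapsed and trimmed.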
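
To make the two options concrete, here is a small sketch of how such flags could gate processing steps. It is an assumption about the general shape, not the library's implementation. `cleanSketch` is a hypothetical name, and emoji detection here relies on the `Extended_Pictographic` Unicode property supported by JavaScript regular expressions.

```javascript
// Hypothetical sketch of option handling; not the library's code.
function cleanSketch(text, options = {}) {
  // Defaults mirror the table above.
  const { normalizeSpaces = true, skipEmoji = false } = options;
  let out = text;
  if (!skipEmoji) {
    // Drop pictographic characters (emoji and similar symbols).
    out = out.replace(/\p{Extended_Pictographic}/gu, '');
  }
  if (normalizeSpaces) {
    // Collapse whitespace runs to one space and trim both ends.
    out = out.replace(/\s+/g, ' ').trim();
  }
  return out;
}
```

This simple pattern does not remove variation selectors or zero-width joiners left behind by multi-code-point emoji, and `Extended_Pictographic` also matches a few non-emoji symbols such as the copyright sign.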

Individual Functions

handleUpsideDown(text) - Reverses upside-down text
mapCharacters(text) - Maps Unicode to ASCII equivalents
normalizeUnicode(text) - Removes diacritics from Latin text
removeDecorations(text) - Removes emojis and decorations
normalizeWhitespace(text) - Normalizes and trims whitespace
normalizeCasing(text) - Normalizes inconsistent casing
pipe(...fns) - Composes functions into a pipeline
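
Two of these steps correspond to well-known string techniques. The sketch below shows one common way to strip diacritics and to normalize whitespace in JavaScript. The names `stripDiacritics` and `collapseWhitespace` are hypothetical, and the library's own functions may behave differently.

```javascript
// Hypothetical sketches; not the library's implementation.
function stripDiacritics(text) {
  // NFD splits accented letters into a base letter plus combining marks,
  // then every nonspacing mark (Unicode category Mn) is removed.
  return text.normalize('NFD').replace(/\p{Mn}/gu, '');
}

function collapseWhitespace(text) {
  // Replace each run of whitespace with one space and trim both ends.
  return text.replace(/\s+/g, ' ').trim();
}

stripDiacritics('café'); // 'cafe'
collapseWhitespace('  Hello   World '); // 'Hello World'
```

Unlike the description of normalizeUnicode above, this `stripDiacritics` is not restricted to Latin text: it would also strip accents from Greek letters, so a language-preserving version would need an extra script check.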

License

Apache-2.0
