unicode-escaper

v1.0.1

A robust Unicode escape/unescape library supporting multiple formats with streaming support

A robust, zero-dependency Unicode escape/unescape library for JavaScript and TypeScript. Supports multiple escape formats, bidirectional conversion, and streaming for large files.

Features

  • Multiple escape formats: \uXXXX, \u{XXXXX}, \xNN, &#xNNNN;, &#NNNN;, U+XXXX
  • Bidirectional: Both escape and unescape in one package
  • Streaming support: Process large files efficiently with Node.js and Web Streams
  • Full Unicode support: Handles BMP, supplementary planes, surrogate pairs, and emoji
  • Zero dependencies: Lightweight and fast
  • TypeScript-first: Written in TypeScript with strict types
  • Dual ESM/CJS: Works with both module systems
  • Customizable filters: Control exactly which characters to escape

Installation

npm install unicode-escaper
# or
pnpm add unicode-escaper
# or
yarn add unicode-escaper

Quick Start

import { escape, unescape } from "unicode-escaper";

// Escape non-ASCII characters
escape("Hello 世界");
// => 'Hello \u4E16\u754C'

// Unescape back to original
unescape("Hello \\u4E16\\u754C");
// => 'Hello 世界'

Escape Formats

| Format       | Example  | Description                                  |
| ------------ | -------- | -------------------------------------------- |
| unicode      | \u4E16   | Standard JavaScript Unicode escape (default) |
| unicode-es6  | \u{4E16} | ES6 Unicode escape (supports full range)     |
| hex          | \xE9     | Hex escape (0x00-0xFF only)                  |
| html-hex     | &#x4E16; | HTML hexadecimal entity                      |
| html-decimal | &#19990; | HTML decimal entity                          |
| codepoint    | U+4E16   | Unicode code point notation                  |
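
To make the table concrete, here is a minimal sketch of how a single code point could be rendered in each notation. It is not the library's implementation: `renderCodePoint` and `SketchFormat` are hypothetical names, and the zero-padding choices are assumptions inferred from the examples in the table.

```typescript
// Hypothetical sketch; none of these names come from unicode-escaper.
type SketchFormat =
  | "unicode"
  | "unicode-es6"
  | "hex"
  | "html-hex"
  | "html-decimal"
  | "codepoint";

function renderCodePoint(codePoint: number, format: SketchFormat): string {
  const hex = codePoint.toString(16).toUpperCase();
  switch (format) {
    case "unicode":
      // \uXXXX encodes one UTF-16 code unit, so a supplementary-plane
      // code point becomes two escapes (its surrogate pair).
      return String.fromCodePoint(codePoint)
        .split("") // split("") cuts by UTF-16 code unit
        .map(
          (unit) =>
            "\\u" + unit.charCodeAt(0).toString(16).toUpperCase().padStart(4, "0")
        )
        .join("");
    case "unicode-es6":
      return "\\u{" + hex + "}";
    case "hex":
      if (codePoint > 0xff) {
        throw new RangeError("\\xNN can only represent 0x00-0xFF");
      }
      return "\\x" + hex.padStart(2, "0");
    case "html-hex":
      return "&#x" + hex + ";";
    case "html-decimal":
      return "&#" + codePoint + ";";
    case "codepoint":
      return "U+" + hex.padStart(4, "0");
  }
}

renderCodePoint(0x4e16, "unicode"); // => '\u4E16'
renderCodePoint(0x1f600, "unicode"); // => '\uD83D\uDE00'
renderCodePoint(0x1f600, "unicode-es6"); // => '\u{1F600}'
renderCodePoint(0x4e16, "html-decimal"); // => '&#19990;'
```

Only the `unicode` format has to split supplementary-plane characters; every other notation in the table can carry the full code point in one token.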

API Reference

Core Functions

escape(input, options?)

Escapes Unicode characters in a string.

import { escape } from "unicode-escaper";

// Default: preserve ASCII, escape everything else
escape("Café 世界 😀");
// => 'Caf\u00E9 \u4E16\u754C \uD83D\uDE00'

// Use ES6 format for emoji (cleaner output)
escape("Hello 😀", { format: "unicode-es6" });
// => 'Hello \u{1F600}'

// HTML entities
escape("Café", { format: "html-hex" });
// => 'Caf&#xE9;'

escape("Café", { format: "html-decimal" });
// => 'Caf&#233;'

// Escape everything (including ASCII)
escape("Hi", { preserveAscii: false });
// => '\u0048\u0069'

// Preserve Latin-1 characters
escape("Café 世界", { preserveLatin1: true });
// => 'Café \u4E16\u754C'

// Lowercase hex digits
escape("世", { uppercase: false });
// => '\u4e16'
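
The interaction between `preserveAscii`, `preserveLatin1` and `uppercase` can be pictured with a small stand-alone sketch. This is an illustration under assumptions, not the library's code: `sketchEscape` and `SketchEscapeOptions` are hypothetical names, only the default `\uXXXX` format is covered, and the precedence between the two `preserve` flags is a guess.

```typescript
// Hypothetical sketch of the option handling; not unicode-escaper's source.
interface SketchEscapeOptions {
  preserveAscii?: boolean; // assumed default: true
  preserveLatin1?: boolean; // assumed default: false
  uppercase?: boolean; // assumed default: true
}

function sketchEscape(text: string, options: SketchEscapeOptions = {}): string {
  const { preserveAscii = true, preserveLatin1 = false, uppercase = true } = options;
  let out = "";
  // Walk UTF-16 code units: \uXXXX escapes exactly one unit each, so an
  // emoji naturally comes out as two escapes (its surrogate pair).
  for (let i = 0; i < text.length; i++) {
    const unit = text.charCodeAt(i);
    const keep =
      (preserveLatin1 && unit <= 0xff) || (preserveAscii && unit <= 0x7f);
    if (keep) {
      out += text.charAt(i);
      continue;
    }
    const hex = unit.toString(16).padStart(4, "0");
    out += "\\u" + (uppercase ? hex.toUpperCase() : hex);
  }
  return out;
}

sketchEscape("Café 世界 😀"); // => 'Caf\u00E9 \u4E16\u754C \uD83D\uDE00'
sketchEscape("Hi", { preserveAscii: false }); // => '\u0048\u0069'
sketchEscape("世", { uppercase: false }); // => '\u4e16'
```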

unescape(input, options?)

Unescapes Unicode sequences back to characters.

import { unescape } from "unicode-escaper";

// Automatically detects and unescapes all formats
unescape("\\u4E16"); // => '世'
unescape("\\u{1F600}"); // => '😀'
unescape("\\xE9"); // => 'é'
unescape("&#x4E16;"); // => '世'
unescape("&#19990;"); // => '世'
unescape("U+4E16"); // => '世'

// Handle surrogate pairs
unescape("\\uD83D\\uDE00"); // => '😀'

// Only unescape specific formats
unescape("\\u4E16 &#x4E16;", { formats: ["unicode"] });
// => '世 &#x4E16;'

// Strict mode (throws on invalid sequences)
unescape("\\uZZZZ", { lenient: false });
// => throws Error
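
For readers curious how lenient and strict decoding can differ, the following sketch decodes only the two JavaScript `\u` forms with a single regular expression. It is a rough illustration, not the library's algorithm: `sketchUnescape` is a hypothetical name, and leaving invalid sequences untouched in lenient mode is an assumption about what "lenient" means.

```typescript
// Hypothetical sketch, not the library's unescape().
function sketchUnescape(text: string, lenient: boolean = true): string {
  // Alternatives are tried in order at each position: \u{H...}, \uHHHH,
  // then a bare \u that does not start a valid sequence.
  const pattern = /\\u\{([0-9A-Fa-f]{1,6})\}|\\u([0-9A-Fa-f]{4})|\\u/g;
  return text.replace(
    pattern,
    (match: string, braced: string | undefined, plain: string | undefined) => {
      if (braced !== undefined) {
        const codePoint = parseInt(braced, 16);
        if (codePoint <= 0x10ffff) return String.fromCodePoint(codePoint);
      } else if (plain !== undefined) {
        // One UTF-16 code unit. Two adjacent surrogate escapes rejoin into
        // a single character once both have been decoded.
        return String.fromCharCode(parseInt(plain, 16));
      }
      if (!lenient) throw new SyntaxError("Invalid Unicode escape: " + match);
      return match; // lenient: leave the text untouched
    }
  );
}

sketchUnescape("\\u4E16"); // => '世'
sketchUnescape("\\uD83D\\uDE00"); // => '😀'
sketchUnescape("\\uZZZZ"); // => '\uZZZZ' (unchanged)
```

Calling `sketchUnescape("\\uZZZZ", false)` throws a `SyntaxError` instead of passing the text through.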

Convenience Functions

import {
  escapeToUnicode, // \uXXXX format
  escapeToUnicodeES6, // \u{XXXXX} format
  escapeToHex, // \xNN format
  escapeToHtmlHex, // &#xNNNN; format
  escapeToHtmlDecimal, // &#NNNN; format
  escapeToCodePoint, // U+XXXX format
  escapeAll, // Escape all characters
  escapeNonPrintable, // Escape control chars and non-ASCII
} from "unicode-escaper";

escapeToUnicodeES6("😀"); // => '\u{1F600}'
escapeToHtmlHex("世"); // => '&#x4E16;'
escapeAll("Hi"); // => '\u0048\u0069'

import {
  unescapeUnicode, // Only \uXXXX
  unescapeUnicodeES6, // Only \u{XXXXX}
  unescapeHex, // Only \xNN
  unescapeHtmlHex, // Only &#xNNNN;
  unescapeHtmlDecimal, // Only &#NNNN;
  unescapeCodePoint, // Only U+XXXX
  unescapeHtml, // Both HTML formats
  unescapeJs, // All JavaScript formats
} from "unicode-escaper";

Custom Filters

Control which characters to escape using filter functions:

import { escape, isNotAscii, isNotBmp, and, or, oneOf } from "unicode-escaper";

// Escape only non-ASCII (default behavior)
escape("Hello 世界", { filter: isNotAscii });
// => 'Hello \u4E16\u754C'

// Escape only emoji (non-BMP characters)
escape("Hello 世界 😀", { filter: isNotBmp });
// => 'Hello 世界 \uD83D\uDE00'

// Escape vowels
escape("Hello", { filter: oneOf("aeiouAEIOU") });
// => 'H\u0065ll\u006F'

// Combine filters
escape("Test", { filter: and(isNotAscii, isNotBmp) });

Available filters:

  • isAscii / isNotAscii - ASCII range (0x00-0x7F)
  • isLatin1 / isNotLatin1 - Latin-1 range (0x00-0xFF)
  • isBmp / isNotBmp - Basic Multilingual Plane (0x0000-0xFFFF)
  • isPrintableAscii / isNotPrintableAscii - Printable ASCII (0x20-0x7E)
  • isControl - Control characters
  • isWhitespace - Whitespace characters
  • isSurrogate / isHighSurrogate / isLowSurrogate - Surrogate code points
  • inRange(start, end) / notInRange(start, end) - Custom range
  • oneOf(chars) / noneOf(chars) - Character set
  • and(...filters) / or(...filters) / not(filter) - Combinators
  • all / none - Always true/false
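
The filters listed above are plain predicates, so the combinators are easy to picture. Below is a minimal sketch of how range, set and combinator filters could be built. The names are hypothetical stand-ins prefixed with `sketch`; the `(char, codePoint)` signature mirrors the `FilterFunction` example in the TypeScript section, and treating range ends as inclusive is an assumption.

```typescript
// Hypothetical re-creations for illustration; not imported from the package.
type SketchFilter = (char: string, codePoint: number) => boolean;

const sketchInRange = (start: number, end: number): SketchFilter =>
  (_char, codePoint) => codePoint >= start && codePoint <= end;

const sketchOneOf = (chars: string): SketchFilter => {
  const allowed = new Set(Array.from(chars)); // Array.from splits by code point
  return (char) => allowed.has(char);
};

const sketchNot = (filter: SketchFilter): SketchFilter =>
  (char, codePoint) => !filter(char, codePoint);

const sketchAnd = (...filters: SketchFilter[]): SketchFilter =>
  (char, codePoint) => filters.every((f) => f(char, codePoint));

const sketchOr = (...filters: SketchFilter[]): SketchFilter =>
  (char, codePoint) => filters.some((f) => f(char, codePoint));

// Composition: escape vowels, plus anything outside the BMP.
const sketchIsNotBmp = sketchNot(sketchInRange(0x0000, 0xffff));
const sketchVowelOrAstral = sketchOr(sketchOneOf("aeiou"), sketchIsNotBmp);

sketchVowelOrAstral("e", 0x65); // => true
sketchVowelOrAstral("世", 0x4e16); // => false
sketchVowelOrAstral("😀", 0x1f600); // => true
```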

Utility Functions

import {
  getCodePoint, // Get code point of a character
  fromCodePoint, // Create character from code point
  getCharInfo, // Get detailed character information
  toCodePoints, // Convert string to code point array
  fromCodePoints, // Convert code point array to string
  codePointLength, // Get length in code points (not UTF-16)
  toHex, // Convert code point to hex string
  parseHex, // Parse hex string to code point
  isValidUnicode, // Check for unpaired surrogates
  normalizeNFC, // Normalize to NFC
  normalizeNFD, // Normalize to NFD
  unicodeEquals, // Compare Unicode equivalence
} from "unicode-escaper";

// Get code point
getCodePoint("😀"); // => 128512 (0x1F600)

// Character info
getCharInfo("😀");
// => {
//   char: '😀',
//   codePoint: 128512,
//   hex: '1F600',
//   isAscii: false,
//   isBmp: false,
//   isLatin1: false,
//   isHighSurrogate: false,
//   isLowSurrogate: false,
//   utf16Length: 2
// }

// Code point length (differs from string.length for emoji)
"😀".length; // => 2 (UTF-16 code units)
codePointLength("😀"); // => 1 (code points)

// Parse various formats
parseHex("U+1F600"); // => 128512
parseHex("0x4E16"); // => 19990
parseHex("\\u{4E16}"); // => 19990
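
Several of these helpers have close equivalents in standard JavaScript, which can help when reasoning about what they return. The sketch below uses only built-in string methods. The function names are hypothetical and the behavior shown is that of the built-ins, not a guarantee about the package's own functions.

```typescript
// Hypothetical stand-ins built on standard string APIs.
const sketchToCodePoints = (text: string): number[] =>
  Array.from(text, (char) => char.codePointAt(0)!); // Array.from iterates by code point

const sketchCodePointLength = (text: string): number => Array.from(text).length;

// True when the string contains a surrogate without its partner.
function sketchHasLoneSurrogate(text: string): boolean {
  for (let i = 0; i < text.length; i++) {
    const unit = text.charCodeAt(i);
    if (unit >= 0xd800 && unit <= 0xdbff) {
      const next = text.charCodeAt(i + 1); // NaN past the end of the string
      if (!(next >= 0xdc00 && next <= 0xdfff)) return true;
      i++; // skip the low half of a well-formed pair
    } else if (unit >= 0xdc00 && unit <= 0xdfff) {
      return true; // low surrogate with no preceding high surrogate
    }
  }
  return false;
}

sketchToCodePoints("A😀"); // => [65, 128512]
sketchCodePointLength("A😀"); // => 2
sketchHasLoneSurrogate("\uD83D"); // => true
sketchHasLoneSurrogate("😀"); // => false
```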

Streaming Support

Process large files efficiently without loading everything into memory:

Node.js Streams

import { createReadStream, createWriteStream } from "fs";
import { pipeline } from "stream/promises";
import { EscapeStream, UnescapeStream } from "unicode-escaper";

// Escape a file
await pipeline(
  createReadStream("input.txt", "utf8"),
  new EscapeStream({ escapeOptions: { format: "unicode-es6" } }),
  createWriteStream("escaped.txt")
);

// Unescape a file
await pipeline(
  createReadStream("escaped.txt", "utf8"),
  new UnescapeStream(),
  createWriteStream("output.txt")
);

Web Streams API

import {
  createWebEscapeStream,
  createWebUnescapeStream,
} from "unicode-escaper";

// Works in browsers and modern Node.js
const response = await fetch("data.txt");
if (!response.body) throw new Error("Response has no body"); // body can be null
const escaped = response.body
  .pipeThrough(new TextDecoderStream())
  .pipeThrough(createWebEscapeStream({ format: "html-hex" }))
  .pipeThrough(new TextEncoderStream());
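
One detail any chunked escaper has to cope with is a chunk boundary that falls between the two halves of a surrogate pair. The sketch below shows one way to handle that by holding back a trailing high surrogate until the next chunk arrives. It is a simplified illustration with hypothetical names (`SketchChunkEscaper`, `sketchEscapeEs6`) and says nothing about how `EscapeStream` or `createWebEscapeStream` are actually implemented.

```typescript
// Hypothetical helper: escape non-ASCII code points as \u{...}.
function sketchEscapeEs6(text: string): string {
  let out = "";
  for (let i = 0; i < text.length; ) {
    const codePoint = text.codePointAt(i)!;
    out +=
      codePoint <= 0x7f
        ? text.charAt(i)
        : "\\u{" + codePoint.toString(16).toUpperCase() + "}";
    i += codePoint > 0xffff ? 2 : 1; // astral code points span two code units
  }
  return out;
}

// Hypothetical stateful wrapper that is safe across chunk boundaries.
class SketchChunkEscaper {
  private pending = "";

  push(chunk: string): string {
    let text = this.pending + chunk;
    this.pending = "";
    const last = text.charCodeAt(text.length - 1); // NaN for an empty string
    if (last >= 0xd800 && last <= 0xdbff) {
      // Trailing high surrogate: wait for its low half in the next chunk.
      this.pending = text.slice(-1);
      text = text.slice(0, -1);
    }
    return sketchEscapeEs6(text);
  }

  flush(): string {
    const rest = this.pending;
    this.pending = "";
    return sketchEscapeEs6(rest); // a leftover lone surrogate is escaped as-is
  }
}

const chunkEscaper = new SketchChunkEscaper();
const grin = "😀"; // two UTF-16 code units
chunkEscaper.push("Hi " + grin.charAt(0)); // => 'Hi '
chunkEscaper.push(grin.charAt(1) + "!"); // => '\u{1F600}!'
chunkEscaper.flush(); // => ''
```

The `push` and `flush` pair maps naturally onto the `transform` and `flush` callbacks of a Node.js `Transform` or a Web `TransformStream`.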

Detection Utilities

import { hasEscapeSequences, countEscapeSequences } from "unicode-escaper";

hasEscapeSequences("\\u4E16"); // => true
hasEscapeSequences("Hello"); // => false

countEscapeSequences("\\u4E16\\u754C"); // => 2

// Filter by format
hasEscapeSequences("\\u4E16", ["unicode"]); // => true
hasEscapeSequences("\\u4E16", ["html-hex"]); // => false
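
Detection of this kind usually comes down to one pattern per format. The following sketch counts matches with a table of regular expressions. The patterns and the names `SKETCH_PATTERNS` and `sketchCountEscapes` are assumptions made for illustration; the library may recognize sequences differently (for example, with other digit-count limits).

```typescript
// Hypothetical pattern table; the real library's patterns may differ.
const SKETCH_PATTERNS: Record<string, RegExp> = {
  "unicode": /\\u[0-9A-Fa-f]{4}/g,
  "unicode-es6": /\\u\{[0-9A-Fa-f]{1,6}\}/g,
  "hex": /\\x[0-9A-Fa-f]{2}/g,
  "html-hex": /&#x[0-9A-Fa-f]+;/g,
  "html-decimal": /&#[0-9]+;/g,
  "codepoint": /U\+[0-9A-Fa-f]{4,6}/g,
};

function sketchCountEscapes(
  text: string,
  formats: string[] = Object.keys(SKETCH_PATTERNS)
): number {
  let total = 0;
  for (const format of formats) {
    const pattern = SKETCH_PATTERNS[format];
    if (!pattern) throw new Error("Unknown format: " + format);
    // match() with a global pattern returns every match, or null for none.
    total += (text.match(pattern) || []).length;
  }
  return total;
}

sketchCountEscapes("\\u4E16\\u754C"); // => 2
sketchCountEscapes("\\u4E16", ["html-hex"]); // => 0
sketchCountEscapes("&#x4E16; and &#19990;"); // => 2
```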

TypeScript Support

Full TypeScript support with strict types:

import type {
  EscapeFormat,
  EscapeOptions,
  UnescapeOptions,
  FilterFunction,
  CharacterInfo,
  EscapeResult,
} from "unicode-escaper";

// Type-safe options
const options: EscapeOptions = {
  format: "unicode-es6",
  preserveAscii: true,
  uppercase: true,
};

// Custom filter with proper typing
const myFilter: FilterFunction = (char, codePoint) => {
  return codePoint > 0x7f;
};

Comparison with escape-unicode

| Feature         | escape-unicode   | unicode-escaper |
| --------------- | ---------------- | --------------- |
| Escape formats  | \uXXXX only      | 6 formats       |
| Unescape        | Separate package | Built-in        |
| Streaming       | No               | Yes             |
| Web Streams     | No               | Yes             |
| ESM + CJS       | CJS only         | Both            |
| Browser support | Node only        | Both            |
| TypeScript      | Yes              | Yes (strict)    |
| Zero deps       | Yes              | Yes             |

International Language Support

Fully tested with diverse Unicode scripts:

| Language   | Script                  | Example    | Escaped                              |
| ---------- | ----------------------- | ---------- | ------------------------------------ |
| Korean     | Hangul                  | 안녕하세요 | \uC548\uB155\uD558\uC138\uC694       |
| Japanese   | Hiragana/Katakana/Kanji | こんにちは | \u3053\u3093\u306B\u3061\u306F       |
| Arabic     | Arabic                  | مرحبا      | \u0645\u0631\u062D\u0628\u0627       |
| Thai       | Thai                    | สวัสดี     | \u0E2A\u0E27\u0E31\u0E2A\u0E14\u0E35 |
| Russian    | Cyrillic                | Привет     | \u041F\u0440\u0438\u0432\u0435\u0442 |
| Hindi      | Devanagari              | नमस्ते     | \u0928\u092E\u0938\u094D\u0924\u0947 |
| Chinese    | Han                     | 你好       | \u4F60\u597D                         |
| Vietnamese | Latin Extended          | Xin chào   | Xin ch\u00E0o                        |
| French     | Latin Extended          | Café       | Caf\u00E9                            |
| Turkish    | Latin Extended          | Türkçe     | T\u00FCrk\u00E7e                     |
| Spanish    | Latin Extended          | ¡Hola!     | \u00A1Hola!                          |
| Portuguese | Latin Extended          | São Paulo  | S\u00E3o Paulo                       |

import { escape, unescape } from "unicode-escaper";

// Korean
escape("안녕하세요"); // => '\uC548\uB155\uD558\uC138\uC694'

// Japanese (mixed scripts)
escape("東京 とうきょう トウキョウ");

// Arabic (RTL)
escape("مرحبا"); // => '\u0645\u0631\u062D\u0628\u0627'

// Thai (with combining vowel signs)
escape("สวัสดี"); // => '\u0E2A\u0E27\u0E31\u0E2A\u0E14\u0E35'

// Russian
escape("Привет"); // => '\u041F\u0440\u0438\u0432\u0435\u0442'

// Hindi (with combining marks)
escape("नमस्ते"); // => '\u0928\u092E\u0938\u094D\u0924\u0947'

// Chinese
escape("你好世界"); // => '\u4F60\u597D\u4E16\u754C'

// Vietnamese (with diacritics)
escape("Xin chào"); // => 'Xin ch\u00E0o'

// Turkish (special i variants)
escape("İstanbul"); // => '\u0130stanbul'

// Spanish (inverted punctuation)
escape("¡Hola!"); // => '\u00A1Hola!'

// Portuguese (tilde)
escape("São Paulo"); // => 'S\u00E3o Paulo'

// Mixed multi-language content
const mixed = "Hello 안녕 こんにちは 你好 مرحبا สวัสดี Привет नमस्ते";
unescape(escape(mixed)) === mixed; // => true

Supported Features

  • Combining characters: Thai tone marks, Arabic diacritics, Hindi matras/virama, Vietnamese diacritics
  • Bidirectional text: RTL markers, mixed LTR/RTL content
  • Native numerals: Thai ๒๐๒๔, Arabic ٢٠٢٤, Devanagari २०२४
  • Conjunct consonants: Hindi samyuktakshar (क्ष, त्र, ज्ञ)
  • Supplementary planes: Emoji, ancient scripts, mathematical symbols
  • Normalization: Handles NFC/NFD forms correctly
  • Extended Latin: French accents, Turkish special i (ı İ), Spanish ñ, Portuguese ã/õ
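
Normalization matters because the same visible text can escape to different sequences depending on its form. The sketch below shows this with the built-in `String.prototype.normalize` and a tiny hypothetical `\uXXXX` escaper (`sketchEscapeNonAscii`); it illustrates the general Unicode behavior rather than anything specific to this package.

```typescript
// Hypothetical minimal escaper: every non-ASCII UTF-16 code unit -> \uXXXX.
function sketchEscapeNonAscii(text: string): string {
  let out = "";
  for (let i = 0; i < text.length; i++) {
    const unit = text.charCodeAt(i);
    out +=
      unit <= 0x7f
        ? text.charAt(i)
        : "\\u" + unit.toString(16).toUpperCase().padStart(4, "0");
  }
  return out;
}

const composedCafe = "Café".normalize("NFC"); // é as one code point, U+00E9
const decomposedCafe = "Café".normalize("NFD"); // e followed by U+0301

sketchEscapeNonAscii(composedCafe); // => 'Caf\u00E9'
sketchEscapeNonAscii(decomposedCafe); // => 'Cafe\u0301'

// Both forms render identically and compare equal once normalized.
composedCafe === decomposedCafe.normalize("NFC"); // => true
```

If escaped output needs to be stable across inputs, normalizing to a single form before escaping is one option.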

Browser Support

Works in all modern browsers that support ES2022. For older browsers, you may need polyfills for:

  • String.prototype.codePointAt
  • String.fromCodePoint
  • Web Streams API (if using streaming)

License

Apache-2.0