escape-unicode

v0.3.0

Published

6 months ago

Library to escape Unicode characters

0High
0Medium
0Low

neocotic

converter unicode escape

escape-unicode

escape-unicode is a Node.js package for converting Unicode characters into their corresponding Unicode escapes ("\uxxxx" notation).

Install

Install using npm:

npm install --save escape-unicode

Usage

`escapeUnicode(input[, options])`

Converts characters within input to Unicode escapes.

The filter option can be specified to control which characters are converted, and the replacer option can be specified for more granular control over how specific characters are escaped.

Characters within the Basic Multilingual Plane (BMP) as well as surrogate pairs for characters outside BMP are supported.

Options

| Option | Type | Default | Description | |------------|------------|---------|----------------------------------------------------------------------------------------------------------------------------------------| | filter | Filter | None | A function used to determine which Unicode code points should be converted to Unicode escapes. | | replacer | Replacer | None | A function that returns a replacement string for an individual Unicode character represented by a specific Unicode code point, if any. |

Examples

import { escapeUnicode, isNotAscii, replaceChars } from "escape-unicode";

escapeUnicode("I love Unicode!");
//=> "\\u0049\\u0020\\u006c\\u006f\\u0076\\u0065\\u0020\\u0055\\u006e\\u0069\\u0063\\u006f\\u0064\\u0065\\u0021"
escapeUnicode("I ♥ Unicode!");
//=> "\\u0049\\u0020\\u2665\\u0020\\u0055\\u006e\\u0069\\u0063\\u006f\\u0064\\u0065\\u0021"
escapeUnicode("I ♥ Unicode!", { filter: isNotAscii });
//=> "I \\u2665 Unicode!"
escapeUnicode("I	♥	Unicode!", { filter: isNotAscii, replacer: replaceChars({ "\t": "\\t" }) });
//=> "I\\t\\u2665\\tUnicode!"
escapeUnicode("𠮷𠮾");
//=> "\\ud842\\udfb7\\ud842\\udfbe"

`Filter(code, char)`

A function that returns whether the specified Unicode code point should be converted to a Unicode escape.

There are a several built-in Filter functions provided.

`composeFilter(...filters)`

Returns a Filter composed of the specified filters that returns true only if any of the filters provided return a truthy value.

import { composeFilter, escapeUnicode, isNotAscii } from "escape-unicode";

const filter = composeFilter(isNotAscii, (code) => code === 0x0020);
escapeUnicode("I ♥ Unicode!", { filter });
//=> "I\\u0020\\u2665\\u0020Unicode!"

`isAscii(code, char)`

A Filter that returns whether the specified Unicode code point is valid in ASCII encoding.

ASCII covers code points 0x00-0x7F (0-127).

`isNotAscii(code, char)`

A Filter that returns whether the specified Unicode code point is not valid in ASCII encoding.

ASCII covers code points 0x00-0x7F (0-127).

`isBmp(code, char)`

A Filter that returns whether the specified Unicode code point is in the Basic Multilingual Plane (BMP).

BMP covers code points 0x0000-0xFFFF (0-65535) and represents characters that can be encoded in a single UTF-16 code unit.

`isNotBmp(code, char)`

A Filter that returns whether the specified Unicode code point is not in the Basic Multilingual Plane (BMP).

BMP covers code points 0x0000-0xFFFF (0-65535) and represents characters that can be encoded in a single UTF-16 code unit.

`isLatin1(code, char)`

A Filter that returns whether the specified Unicode code point is valid in Latin-1 (ISO 8859-1) encoding.

Latin-1 covers code points 0x00-0xFF (0-255).

`isNotLatin1(code, char)`

A Filter that returns whether the specified Unicode code point is not valid in Latin-1 (ISO 8859-1) encoding.

Latin-1 covers code points 0x00-0xFF (0-255).

`Replacer(code, char)`

A function that returns a replacement string for an individual Unicode character represented by a specific Unicode code point, if any.

If a non-empty string is returned, the Unicode character will be replaced by that string in the returned string instead of its Unicode escape.

If an empty string is returned, the Unicode character will be removed from the returned string. If either null or undefined are returned, the Unicode character will be replaced with its Unicode escape in the returned string.

There are a several built-in Replacer functions provided.

`composeReplacer(...replacers)`

Returns a Replacer composed of the specified replacers that returns the replacement string returned from the first Replacer to return a string, where possible.

import { composeReplacer, escapeUnicode, replaceChars, replaceCodes } from "escape-unicode";

const replacer = composeReplacer(
  replaceChars({ "\f": "\\f", "\n": "\\n", "\r": "\\r" }),
  replaceCodes({ 0x009: "\\t" }),
);
escapeUnicode("I	♥	Unicode!", { replacer });
//=> "\\u0049\\t\\u2665\\t\\u0055\\u006e\\u0069\\u0063\\u006f\\u0064\\u0065\\u0021"

`replaceChar(char, replacement)`

Returns a Replacer that returns the specified replacement string for the individual Unicode character provided.

import { escapeUnicode, replaceChar } from "escape-unicode";

escapeUnicode("I	♥	Unicode!", { replacer: replaceChar("\t", "\\t") });
//=> "\\u0049\\t\\u2665\\t\\u0055\\u006e\\u0069\\u0063\\u006f\\u0064\\u0065\\u0021"

`replaceChars(...replacers)`

Returns a Replacer that returns replacement strings looked up from the specified replacements, where possible.

The keys within replacements are expected to be the individual Unicode character.

import { escapeUnicode, replaceChars } from "escape-unicode";

const replacements = new Map([
  ["\f", "\\f"],
  ["\n", "\\n"],
  ["\r", "\\r"],
  ["\t", "\\t"],
]);
escapeUnicode("I	♥	Unicode!", { replacer: replaceChars(replacements) });
//=> "\\u0049\\t\\u2665\\t\\u0055\\u006e\\u0069\\u0063\\u006f\\u0064\\u0065\\u0021"

`replaceCode(code, replacement)`

Returns a Replacer that returns the specified replacement string for the Unicode code point representing the individual Unicode character provided.

import { escapeUnicode, replaceCode } from "escape-unicode";

escapeUnicode("I	♥	Unicode!", { replacer: replaceCode(0x0009, "\\t") });
//=> "\\u0049\\t\\u2665\\t\\u0055\\u006e\\u0069\\u0063\\u006f\\u0064\\u0065\\u0021"

`replaceCodes(...replacers)`

Returns a Replacer that returns replacement strings looked up from the specified replacements, where possible.

The keys within replacements are expected to be Unicode code points representing the Unicode characters.

import { escapeUnicode, replaceCodes } from "escape-unicode";

const replacements = {
  0x000c: "\\f",
  0x000a: "\\n",
  0x000d: "\\r",
  0x0009: "\\t",
};
escapeUnicode("I	♥	Unicode!", { replacer: replaceCodes(replacements) });
//=> "\\u0049\\t\\u2665\\t\\u0055\\u006e\\u0069\\u0063\\u006f\\u0064\\u0065\\u0021"

Bugs

If you have any problems with this package or would like to see changes currently in development, you can do so here.

Contributors

If you want to contribute, you're a legend! Information on how you can do so can be found in CONTRIBUTING.md. We want your suggestions and pull requests!

A list of all contributors can be found in AUTHORS.md.

License

See LICENSE.md for more information on our MIT license.

Published

Vulnerabilities

Links

Maintainers

Keywords

Readme

escape-unicode

Install

Usage

escapeUnicode(input[, options])

Options

Examples

Filter(code, char)

composeFilter(...filters)

isAscii(code, char)

isNotAscii(code, char)

isBmp(code, char)

isNotBmp(code, char)

isLatin1(code, char)

isNotLatin1(code, char)

Replacer(code, char)

composeReplacer(...replacers)

replaceChar(char, replacement)

replaceChars(...replacers)

replaceCode(code, replacement)

replaceCodes(...replacers)

Related

Bugs

Contributors

License

`escapeUnicode(input[, options])`

`Filter(code, char)`

`composeFilter(...filters)`

`isAscii(code, char)`

`isNotAscii(code, char)`

`isBmp(code, char)`

`isNotBmp(code, char)`

`isLatin1(code, char)`

`isNotLatin1(code, char)`

`Replacer(code, char)`

`composeReplacer(...replacers)`

`replaceChar(char, replacement)`

`replaceChars(...replacers)`

`replaceCode(code, replacement)`

`replaceCodes(...replacers)`