@chr33s/pdf-unicode-properties

v5.0.11, published 5 months ago

Provides fast access to Unicode character properties.

Vulnerabilities: 0 High, 0 Medium, 0 Low

Maintainers: chr33s

Keywords: unicode, metadata, character, codepoint

@chr33s/pdf-unicode-properties

Fast lookup of Unicode character metadata packaged as modern ES modules.

@chr33s/pdf-unicode-properties is part of the chr33s/pdf monorepo and continues the Hopding/unicode-properties fork of the original foliojs project. This edition:

ships native ES modules with NodeNext resolution (Node.js 18+ or a modern bundler required),
is authored in TypeScript with generated declaration files, and
keeps the compressed trie assets embedded for seamless usage across Node.js, browsers, and React Native.

unicode-properties

Provides fast access to Unicode character properties. Uses @chr33s/pdf-unicode-trie to compress the properties for all code points into just 12 KB.

Usage

import unicodeProperties, {
	getCategory,
	getNumericValue,
} from "@chr33s/pdf-unicode-properties";

getCategory("2".codePointAt(0) ?? 0); //=> 'Nd'
getNumericValue("2".codePointAt(0) ?? 0); //=> 2

// The default export bundles all helpers together when that is convenient.
unicodeProperties.isDigit("9".codePointAt(0) ?? 0); //=> true
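Every function in this package takes a numeric code point, not a string. The sketch below shows one way to turn a whole string into code points before calling the lookups. The helper name toCodePoints is hypothetical and not part of the package.

```typescript
// Convert a string to its Unicode code points (not UTF-16 code units).
// Hypothetical helper for illustration; not exported by the package.
function toCodePoints(text: string): number[] {
	// Array.from walks the string by code point, so surrogate pairs stay together.
	return Array.from(text, (ch) => ch.codePointAt(0) ?? 0);
}

// "a" followed by U+1F600, which occupies two UTF-16 code units.
const points = toCodePoints("a\u{1F600}");
console.log(points.length); // 2
console.log(points.map((cp) => cp.toString(16))); // [ '61', '1f600' ]
```

Each resulting number can then be passed to getCategory, getNumericValue, and the other lookups.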

Installation

npm install @chr33s/pdf-unicode-properties

The package is distributed as native ES modules. Use Node.js 18+ or configure your bundler to resolve NodeNext-style imports.

API

getCategory(codePoint)

Returns the Unicode general category for the given code point.
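As an illustration, a category lookup can be used to tally which general categories occur in a string. This is a minimal sketch: countCategories and CategoryLookup are hypothetical names, and the lookup is passed in as a parameter so the block does not depend on how the package is imported.

```typescript
// Any function mapping a code point to a general category string, e.g. "Nd".
type CategoryLookup = (codePoint: number) => string;

// Hypothetical helper: count how often each category occurs in the text.
function countCategories(text: string, getCategory: CategoryLookup): Map<string, number> {
	const counts = new Map<string, number>();
	for (const cp of Array.from(text, (ch) => ch.codePointAt(0) ?? 0)) {
		const category = getCategory(cp);
		counts.set(category, (counts.get(category) ?? 0) + 1);
	}
	return counts;
}
```

With the package installed, its getCategory function can be passed as the second argument.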

getScript(codePoint)

Returns the script for the given code point.

getCombiningClass(codePoint)

Returns the canonical combining class for the given code point.

getEastAsianWidth(codePoint)

Returns the East Asian width for the given code point.
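A common use of this property is estimating how many terminal columns a string occupies. The sketch below assumes the returned values are the short names from Unicode Standard Annex #11 ("W" for wide, "F" for fullwidth, and so on); this page does not list the return values, so check them before relying on the comparison. displayWidth is a hypothetical helper, and the lookup is injected as a parameter.

```typescript
// Any function mapping a code point to an East Asian width value.
type WidthLookup = (codePoint: number) => string;

// Hypothetical helper: wide and fullwidth characters count as two columns,
// everything else as one. Assumes UAX #11 short names ("W", "F", ...).
function displayWidth(text: string, getEastAsianWidth: WidthLookup): number {
	let width = 0;
	for (const cp of Array.from(text, (ch) => ch.codePointAt(0) ?? 0)) {
		const w = getEastAsianWidth(cp);
		width += w === "W" || w === "F" ? 2 : 1;
	}
	return width;
}
```

This ignores combining marks and control characters, which a real width calculation would also need to handle.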

getNumericValue(codePoint)

Returns the numeric value for the given code point, or null if there is no numeric value for that code point.
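Because the result can be null, callers need to handle code points without a numeric value. The sketch below parses a run of decimal digits from any script and returns null when any character is not a digit from 0 to 9. parseDecimal and NumericLookup are hypothetical names, and the lookup is injected as a parameter.

```typescript
// Any function mapping a code point to its numeric value, or null if none.
type NumericLookup = (codePoint: number) => number | null;

// Hypothetical helper: parse decimal digits, rejecting anything that is
// not a single integer digit (null, fractions, or values above 9).
function parseDecimal(text: string, getNumericValue: NumericLookup): number | null {
	if (text.length === 0) return null;
	let result = 0;
	for (const cp of Array.from(text, (ch) => ch.codePointAt(0) ?? 0)) {
		const value = getNumericValue(cp);
		if (value === null || !Number.isInteger(value) || value < 0 || value > 9) {
			return null;
		}
		result = result * 10 + value;
	}
	return result;
}
```

With the package installed, its getNumericValue export can be passed as the second argument.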

isAlphabetic(codePoint)

Returns whether the code point is an alphabetic character.

isDigit(codePoint)

Returns whether the code point is a digit.

isPunctuation(codePoint)

Returns whether the code point is a punctuation character.

isLowerCase(codePoint)

Returns whether the code point is lower case.

isUpperCase(codePoint)

Returns whether the code point is upper case.

isTitleCase(codePoint)

Returns whether the code point is title case.

isWhiteSpace(codePoint)

Returns whether the code point is whitespace: specifically, whether the category is one of Zs, Zl, or Zp.
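The definition above can be written directly in terms of the general category. This is a sketch of the stated rule, not the package's actual source; isSeparator is a hypothetical name and the category lookup is injected as a parameter.

```typescript
// The three separator categories named in the description above.
const SEPARATOR_CATEGORIES = new Set(["Zs", "Zl", "Zp"]);

// Hypothetical helper: true when the code point's category is Zs, Zl, or Zp.
function isSeparator(codePoint: number, getCategory: (cp: number) => string): boolean {
	return SEPARATOR_CATEGORIES.has(getCategory(codePoint));
}
```

If the description is exact, control characters such as tab (U+0009) and line feed (U+000A), whose general category is Cc, would not count as whitespace under this rule.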

isBaseForm(codePoint)

Returns whether the code point is a base form. A base-form code point does not graphically combine with preceding characters.

isMark(codePoint)

Returns whether the code point is a mark character (e.g. accent).
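One typical use of a mark test is removing accents from text. The sketch below decomposes the string with the standard String.prototype.normalize, drops every code point the predicate reports as a mark, and recomposes the rest. stripMarks is a hypothetical helper, and the predicate is injected as a parameter.

```typescript
// Hypothetical helper: remove mark characters (e.g. accents) from text.
function stripMarks(text: string, isMark: (codePoint: number) => boolean): string {
	// NFD splits precomposed characters into a base character plus marks.
	const kept = Array.from(text.normalize("NFD")).filter(
		(ch) => !isMark(ch.codePointAt(0) ?? 0),
	);
	// Recompose whatever remains into NFC form.
	return kept.join("").normalize("NFC");
}
```

With the package installed, its isMark function can be passed as the second argument.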

License

MIT
