@vermaysha/windows-1252
v1.0.0
Published
a Fast way to encode/decode windows-1252 encoding on Javascript/Typescript work on NodeJS/Bun/Browser
Readme
@vermaysha/windows-1252
The version I created for my personal use. If you want to try it and find that it suits you, congratulations! You can use this library.
Technically, this can be used throughout the JavaScript/Typescript/EcmaScript ecosystem or whatever it is.
Features
- Encode from blabla string to windows-1252 encoding
- Decode from windows-1252 encoding to blablabla
It's as simple as encoding and decoding, just that simple! Don't expect any other features.
Installation
# npm
npm install @vermaysha/windows-1252
# yarn
yarn add @vermaysha/windows-1252
# pnpm
pnpm add @vermaysha/windows-1252
# bun
bun add @vermaysha/windows-1252CDN Usage (esm.sh)
You can also use this package directly from a CDN without installation:
// Import everything
import { encode, decode } from "https://esm.sh/@vermaysha/windows-1252";
// Import only encode function (tree-shakeable)
import { encode } from "https://esm.sh/@vermaysha/windows-1252/encode";
// Import only decode function (tree-shakeable)
import { decode } from "https://esm.sh/@vermaysha/windows-1252/decode";This package exports modular functions, so you can import only what you need to reduce bundle size. Each export path (/encode and /decode) contains only the specific functionality without importing the other.
Usage
Basic Encoding
import { encode } from "@vermaysha/windows-1252";
// Encode a string to Windows-1252 bytes
const bytes = encode("Hello, world!");
console.log(bytes); // Uint8Array [72, 101, 108, 108, 111, 44, 32, 119, 111, 114, 108, 100, 33]
// Encode with special characters
const euroBytes = encode("Price: €100");
console.log(euroBytes); // Uint8Array with € encoded as 0x80Basic Decoding
import { decode } from "@vermaysha/windows-1252";
// Decode Windows-1252 bytes to a string
const bytes = new Uint8Array([72, 101, 108, 108, 111]);
const text = decode(bytes);
console.log(text); // "Hello"
// Decode with Windows-1252 specific characters
const euroBytes = new Uint8Array([0x80]); // € symbol
const euroText = decode(euroBytes);
console.log(euroText); // "€"Encoding Options
The encode function accepts an optional options parameter:
import { encode } from "@vermaysha/windows-1252";
// Fatal mode (default) - throws error for unencodable characters
try {
encode("Hello 世界"); // Contains Chinese characters not in Windows-1252
} catch (error) {
console.error(error.message); // "Character '世' (U+4E16) cannot be encoded in Windows-1252"
}
// Replacement mode - replaces unencodable characters with '?'
const bytes = encode("Hello 世界", { mode: "replacement" });
console.log(decode(bytes)); // "Hello ??"Decoding Options
The decode function also accepts an optional options parameter:
import { decode } from "@vermaysha/windows-1252";
// Replacement mode (default) - replaces invalid bytes with �
const invalidBytes = new Uint8Array([0x48, 0x81, 0x65]); // 0x81 is undefined in Windows-1252
const text = decode(invalidBytes);
console.log(text); // "H�e"
// Fatal mode - throws error for invalid bytes
try {
decode(invalidBytes, { mode: "fatal" });
} catch (error) {
console.error(error.message); // "Invalid Windows-1252 byte: 0x81"
}Complete Example
import { encode, decode } from "@vermaysha/windows-1252";
// Round-trip encoding/decoding
const original = "Café — €100";
const encoded = encode(original);
const decoded = decode(encoded);
console.log(original === decoded); // true
// Working with binary data
const buffer = encode("Hello");
// Send buffer over network, save to file, etc.
const restored = decode(buffer);
console.log(restored); // "Hello"API Reference
encode(str: string, options?: EncodeOptions): Uint8Array
Encodes a UTF-16 string to Windows-1252 byte array.
Parameters:
str- The string to encodeoptions(optional):mode:'fatal'(default) |'replacement''fatal': Throws an error if a character cannot be encoded'replacement': Replaces unencodable characters with?
Returns: Uint8Array of bytes in Windows-1252 encoding
Throws: Error if the string contains characters not representable in Windows-1252 and mode is 'fatal'
decode(bytes: Uint8Array | number[], options?: DecodeOptions): string
Decodes a Windows-1252 encoded byte array to a UTF-16 string.
Parameters:
bytes- Array of bytes in Windows-1252 encodingoptions(optional):mode:'replacement'(default) |'fatal''replacement': Replaces invalid bytes with � (U+FFFD)'fatal': Throws an error if an invalid byte is encountered
Returns: Decoded string
Throws: Error if bytes contain invalid values and mode is 'fatal'
Windows-1252 Character Set
Windows-1252 is a single-byte character encoding that includes:
- 0x00-0x7F: ASCII characters (direct mapping)
- 0x80-0x9F: Windows-1252 specific characters (€, ', ", —, etc.)
- 0xA0-0xFF: Latin-1 supplement (direct mapping)
Note: Bytes 0x81, 0x8D, 0x8F, 0x90, and 0x9D are undefined in Windows-1252 and will be replaced with � (U+FFFD) in replacement mode or throw an error in fatal mode.
Special Characters
The library correctly handles Windows-1252 specific characters in the 0x80-0x9F range:
| Byte | Character | Unicode | Description | | ---- | --------- | ------- | ------------------------------------------ | | 0x80 | € | U+20AC | Euro sign | | 0x82 | ‚ | U+201A | Single low-9 quotation mark | | 0x83 | ƒ | U+0192 | Latin small letter f with hook | | 0x84 | „ | U+201E | Double low-9 quotation mark | | 0x85 | … | U+2026 | Horizontal ellipsis | | 0x86 | † | U+2020 | Dagger | | 0x87 | ‡ | U+2021 | Double dagger | | 0x88 | ˆ | U+02C6 | Modifier letter circumflex accent | | 0x89 | ‰ | U+2030 | Per mille sign | | 0x8A | Š | U+0160 | Latin capital letter S with caron | | 0x8B | ‹ | U+2039 | Single left-pointing angle quotation mark | | 0x8C | Œ | U+0152 | Latin capital ligature OE | | 0x8E | Ž | U+017D | Latin capital letter Z with caron | | 0x91 | ' | U+2018 | Left single quotation mark | | 0x92 | ' | U+2019 | Right single quotation mark | | 0x93 | " | U+201C | Left double quotation mark | | 0x94 | " | U+201D | Right double quotation mark | | 0x95 | • | U+2022 | Bullet | | 0x96 | – | U+2013 | En dash | | 0x97 | — | U+2014 | Em dash | | 0x98 | ˜ | U+02DC | Small tilde | | 0x99 | ™ | U+2122 | Trade mark sign | | 0x9A | š | U+0161 | Latin small letter s with caron | | 0x9B | › | U+203A | Single right-pointing angle quotation mark | | 0x9C | œ | U+0153 | Latin small ligature oe | | 0x9E | ž | U+017E | Latin small letter z with caron | | 0x9F | Ÿ | U+0178 | Latin capital letter Y with diaeresis |
Browser Support
Works in all modern browsers and environments that support:
Uint8ArrayString.fromCodePoint()String.prototype.codePointAt()
Development
Install Dependencies
bun installRun Tests
bun testBuild
bun run build.tsLicense
MIT © Ashary Vermaysha
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
