@chr33s/pdf-restructure
v5.0.6
Declaratively encode and decode binary data
Declarative binary encoding and decoding primitives packaged as modern ES modules.
@chr33s/pdf-restructure lives in the chr33s/pdf monorepo and carries forward the
Hopding/restructure fork of Devon Govett’s original project. This edition provides:
- native ES modules targeting NodeNext (Node.js 18+ or a modern bundler required),
- TypeScript sources with generated declaration files for first-class editor support, and
- updated test coverage powered by Vitest.
Installation
npm install @chr33s/pdf-restructure
Use Node.js 18+ or configure your bundler to resolve NodeNext-style ES modules.
Overview
Restructure allows you to declaratively encode and decode binary data. It supports a wide variety of types to enable you to express a multitude of binary formats without writing any parsing code.
Some of the supported features are C-like structures, versioned structures, pointers, arrays of any type, strings of a large number of encodings, enums, bitfields, and more. See the documentation below for more details.
Example
This is just a small example of what Restructure can do. Check out the API documentation below for more information.
import fs from "node:fs";
import {
DecodeStream,
EncodeStream,
Struct,
String,
uint8,
} from "@chr33s/pdf-restructure";
const Person = new Struct({
name: new String(uint8, "utf8"),
age: uint8,
});
// decode a person from a buffer (`buffer` is assumed to already hold the encoded bytes)
const decodeStream = new DecodeStream(buffer);
const person = Person.decode(decodeStream);
// encode a person back to a stream
const encodeStream = new EncodeStream();
encodeStream.pipe(fs.createWriteStream("out.bin"));
Person.encode(encodeStream, person);
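encodeStream.end();
API
The examples in this section refer to the library through a namespace object r, which assumes an import such as import * as r from "@chr33s/pdf-restructure".
All of the following types support three standard methods:
- decode(stream): decodes an instance of the type from the given DecodeStream
- size(value): returns the amount of space the value would take if encoded
- encode(stream, value): encodes the given value into the given EncodeStream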
Restructure supports a wide variety of types, but if you need to write your own for some custom use that cannot be represented by them, you can do so by just implementing the above methods. Then you can use your type just as you would any other type, in structures and whatnot.
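As a rough illustration of a custom type, the sketch below implements the three methods for an RGB colour stored as three bytes. The TinyReader and TinyWriter classes are hypothetical stand-ins so the snippet runs without the library; the method names readUInt8 and writeUInt8 are assumptions and may not match the real DecodeStream and EncodeStream.

```javascript
// Hypothetical stand-in streams (not part of the library) so the sketch is runnable.
class TinyReader {
  constructor(bytes) {
    this.bytes = bytes;
    this.pos = 0;
  }
  readUInt8() {
    return this.bytes[this.pos++];
  }
}

class TinyWriter {
  constructor() {
    this.bytes = [];
  }
  writeUInt8(value) {
    this.bytes.push(value & 0xff);
  }
}

// Custom type: an RGB colour stored as three consecutive bytes.
const Rgb = {
  decode(stream) {
    return { r: stream.readUInt8(), g: stream.readUInt8(), b: stream.readUInt8() };
  },
  size(_value) {
    return 3; // always three bytes, regardless of the value
  },
  encode(stream, value) {
    stream.writeUInt8(value.r);
    stream.writeUInt8(value.g);
    stream.writeUInt8(value.b);
  },
};

// Round trip: encode a colour, then decode it from the written bytes.
const writer = new TinyWriter();
Rgb.encode(writer, { r: 255, g: 128, b: 0 });
const roundTrip = Rgb.decode(new TinyReader(writer.bytes));
```

An object shaped like Rgb is what the paragraph above means by "implementing the above methods"; whether the library requires anything beyond these three methods is not stated here.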
Number Types
The following built-in number types are available:
uint8, uint16, uint24, uint32, int8, int16, int24, int32, float, double, fixed16, fixed32
Numbers are big-endian (network order) by default, but little-endian is supported, too:
uint16le, uint24le, uint32le, int16le, int24le, int32le, floatle, doublele, fixed16le, fixed32le
To avoid ambiguity, big-endian may be used explicitly:
uint16be, uint24be, uint32be, int16be, int24be, int32be, floatbe, doublebe, fixed16be, fixed32be
Boolean
Booleans are encoded as 0 or 1 using one of the above number types.
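To make the byte layout concrete, here is a library-free sketch that writes true as a 32-bit unsigned integer in both byte orders using the standard DataView. It only illustrates the wire format; treating any non-zero value as true when decoding is an assumption of the sketch.

```javascript
// Write `true` (as the number 1) into two 4-byte buffers.
const big = new DataView(new ArrayBuffer(4));
big.setUint32(0, Number(true)); // DataView defaults to big-endian

const little = new DataView(new ArrayBuffer(4));
little.setUint32(0, Number(true), true); // third argument selects little-endian

const bigBytes = Array.from(new Uint8Array(big.buffer)); // [0, 0, 0, 1]
const littleBytes = Array.from(new Uint8Array(little.buffer)); // [1, 0, 0, 0]

// Decoding: any non-zero value is treated as true (assumption of this sketch).
const decodedBool = big.getUint32(0) !== 0;
```

In the library itself, the corresponding declaration is: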
let bool = new r.Boolean(r.uint32);
Reserved
The Reserved type simply skips data in a structure, where there are reserved fields.
Encoding produces zeros.
// 10 reserved uint8s (default is 1)
let reserved = new r.Reserved(r.uint8, 10);
Optional
The Optional type only encodes or decodes when the given condition is truthy.
// includes field
let optional = new r.Optional(r.uint8, true);
// excludes field
let optional = new r.Optional(r.uint8, false);
// determine whether field is to be included at runtime with a function
let optional = new r.Optional(r.uint8, function() {
return this.flags & 0x50;
});
Enum
The Enum type maps a number to the value at that index in an array.
let color = new r.Enum(r.uint8, ['red', 'orange', 'yellow', 'green', 'blue', 'purple']);
Bitfield
The Bitfield type maps a number to an object with boolean keys mapping to each bit in that number,
as defined in an array.
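The mapping can be sketched without the library as plain bit arithmetic. The helper names decodeBits and encodeBits are hypothetical, and the sketch assumes that array entry i corresponds to bit i, counting from the least significant bit.

```javascript
const flags = ['Jack', 'Kack', 'Lack', 'Mack', 'Nack', 'Oack', 'Pack', 'Quack'];

// Turn a number into an object of booleans, one key per bit.
function decodeBits(value, names) {
  const result = {};
  names.forEach((name, i) => {
    result[name] = (value & (1 << i)) !== 0;
  });
  return result;
}

// Turn an object of booleans back into a number.
function encodeBits(obj, names) {
  return names.reduce((value, name, i) => (obj[name] ? value | (1 << i) : value), 0);
}

const decodedFlags = decodeBits(0b11011001, flags);
const reencoded = encodeBits(decodedFlags, flags); // 217, the same value again
```

Under that bit-order assumption, the byte 0b11011001 yields the same object as in the library example that follows: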
let bitfield = new r.Bitfield(r.uint8, ['Jack', 'Kack', 'Lack', 'Mack', 'Nack', 'Oack', 'Pack', 'Quack']);
bitfield.decode(stream);
let result = {
Jack: true,
Kack: false,
Lack: false,
Mack: true,
Nack: true,
Oack: false,
Pack: true,
Quack: true
};
bitfield.encode(stream, result);
Buffer
Extracts a slice of the buffer to a Node Buffer. The length can be a constant, or taken from
a previous field in the parent structure.
// fixed length
let buf = new r.Buffer(2);
// length from parent structure
let struct = new r.Struct({
bufLen: r.uint8,
buf: new r.Buffer('bufLen')
});
String
A String maps a JavaScript string to and from binary encodings. The length can be a constant, taken
from a previous field in the parent structure, or encoded using a number type immediately before the string.
Supported encodings include 'ascii', 'utf8', 'ucs2', 'utf16le', 'utf16be', and if you also install
iconv-lite, many other legacy codecs.
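For the length-prefixed case, a library-free sketch of the byte layout may help. The helpers encodePrefixed and decodePrefixed are hypothetical, and the sketch assumes the uint8 prefix counts encoded bytes rather than characters.

```javascript
// Encode a string as: one length byte, followed by its UTF-8 bytes.
function encodePrefixed(text) {
  const body = new TextEncoder().encode(text); // TextEncoder always produces UTF-8
  if (body.length > 0xff) {
    throw new RangeError('string too long for a uint8 length prefix');
  }
  const out = new Uint8Array(1 + body.length);
  out[0] = body.length;
  out.set(body, 1);
  return out;
}

// Read the length byte, then decode that many bytes as UTF-8.
function decodePrefixed(bytes, offset = 0) {
  const length = bytes[offset];
  const body = bytes.subarray(offset + 1, offset + 1 + length);
  return new TextDecoder('utf-8').decode(body);
}

// "Zo\u00eb" has three characters but four UTF-8 bytes, so the prefix is 4.
const encodedName = encodePrefixed('Zo\u00eb');
const decodedName = decodePrefixed(encodedName);
```

The library's constructors for each length style: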
// fixed length, ascii encoding by default
let str = new r.String(2);
// length encoded as number before the string, utf8 encoding
let str = new r.String(r.uint8, 'utf8');
// length from parent structure
let struct = new r.Struct({
len: r.uint8,
str: new r.String('len', 'utf16be')
});
// null-terminated string (also known as C string)
let str = new r.String(null, 'utf8');
Array
An Array maps to and from a JavaScript array containing instances of a sub-type. The length can be a constant,
taken from a previous field in the parent structure, encoded using a number type immediately
before the array, or computed by a function.
// fixed length, containing numbers
let arr = new r.Array(r.uint16, 2);
// length encoded as number before the array containing strings
let arr = new r.Array(new r.String(10), r.uint8);
// length computed by a function
let arr = new r.Array(r.uint8, function() { return 5 });
// length from parent structure
let struct = new r.Struct({
len: r.uint8,
arr: new r.Array(r.uint8, 'len')
});
// treat as amount of bytes instead (may be used in all the above scenarios)
let arr = new r.Array(r.uint16, 6, 'bytes');
LazyArray
The LazyArray type extends the Array type, and is useful for large arrays that you do not need to access sequentially.
It avoids decoding the entire array upfront, and instead only decodes and caches individual items as needed. It only works when
the elements inside the array have a fixed size.
Instead of returning a JavaScript array, the LazyArray type returns a custom object that can be used to access the elements.
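The fixed-size requirement is what makes random access possible: element i sits at byte offset i times the element size. A minimal library-free sketch of that idea for big-endian uint16 elements follows; the class LazyUint16Array and its decodeCount counter are hypothetical and exist only for illustration.

```javascript
class LazyUint16Array {
  constructor(bytes, length) {
    this.view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
    this.length = length;
    this.cache = new Map();
    this.decodeCount = 0; // only here to make the laziness observable
  }

  // Decode a single element on first access and cache it.
  get(index) {
    if (index < 0 || index >= this.length) return undefined;
    if (!this.cache.has(index)) {
      this.cache.set(index, this.view.getUint16(index * 2)); // 2 bytes per element
      this.decodeCount++;
    }
    return this.cache.get(index);
  }

  // Decode everything that has not been decoded yet.
  toArray() {
    return Array.from({ length: this.length }, (_, i) => this.get(i));
  }
}

const lazy = new LazyUint16Array(new Uint8Array([0, 1, 0, 2, 1, 0]), 3);
const third = lazy.get(2); // decodes only the bytes of element 2
lazy.get(2); // served from the cache, nothing is decoded again
```

With the library: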
let arr = new r.LazyArray(r.uint16, 2048);
let res = arr.decode(stream);
// get a single element
let el = res.get(2);
// convert to a normal array (decode all elements)
let array = res.toArray();
Struct
A Struct maps to and from JavaScript objects, containing keys of various previously discussed types. Substructures,
arrays of structures, and pointers to other types (discussed below) are supported.
let Person = new r.Struct({
name: new r.String(r.uint8, 'utf8'),
age: r.uint8
});
VersionedStruct
A VersionedStruct is a Struct that has multiple versions. The version is typically encoded at
the beginning of the structure, or as a field in a parent structure. There is an optional header
common to all versions, and separate fields listed for each version number.
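The decoding order can be sketched by hand, without the library. The function decodePerson is hypothetical and assumes the version byte comes first, followed by the shared header and then the version-specific fields; the field order and the presence of a version key in the result are assumptions of the sketch.

```javascript
const hairColors = ['black', 'brown', 'blonde'];

function decodePerson(bytes) {
  let pos = 0;
  const version = bytes[pos++]; // version field (uint8)

  // Header common to all versions: a uint8 length followed by a UTF-8 name.
  const nameLength = bytes[pos++];
  const name = new TextDecoder('utf-8').decode(bytes.subarray(pos, pos + nameLength));
  pos += nameLength;

  const person = { version, name };

  // Version-specific fields.
  if (version === 0) {
    person.age = bytes[pos++];
  } else if (version === 1) {
    person.hairColor = hairColors[bytes[pos++]];
  } else {
    throw new Error(`Unknown version ${version}`);
  }
  return person;
}

// 0x42 0x6f 0x62 is "Bob" in UTF-8.
const v0 = decodePerson(new Uint8Array([0, 3, 0x42, 0x6f, 0x62, 30]));
const v1 = decodePerson(new Uint8Array([1, 3, 0x42, 0x6f, 0x62, 2]));
```

The same layout, declared with the library: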
// the version is read as a uint8 in this example
// you could also get the version from a key on the parent struct
let Person = new r.VersionedStruct(r.uint8, {
// optional header common to all versions
header: {
name: new r.String(r.uint8, 'utf8')
},
0: {
age: r.uint8
},
1: {
hairColor: new r.Enum(r.uint8, ['black', 'brown', 'blonde'])
}
});
Pointer
Pointers map an address or offset encoded as a number, to a value encoded elsewhere in the buffer.
There are a few options you can use: type, relativeTo, allowNull, nullValue, and lazy.
The type option has these possible values:
- local (default): the encoded offset is relative to the start of the containing structure
- immediate: the encoded offset is relative to the position of the pointer itself
- parent: the encoded offset is relative to the parent structure of the immediate container
- global: the encoded offset is global to the start of the file
The relativeTo option specifies that the encoded offset is relative to a field on the containing structure.
By default, pointers are relative to the start of the containing structure (local).
The allowNull option lets you specify whether zero offsets are allowed or should produce null. This is
set to true by default. The nullValue option is related, and lets you override the encoded value that
represents null. By default, the nullValue is zero.
The lazy option allows lazy decoding of the pointer's value by defining a getter on the parent object.
This only works when the pointer is contained within a Struct, but can be used to speed up decoding
quite a bit when not all of the data is needed right away.
let Address = new r.Struct({
street: new r.String(r.uint8),
zip: new r.String(5)
});
let Person = new r.Struct({
name: new r.String(r.uint8, 'utf8'),
age: r.uint8,
ptrStart: r.uint8,
address: new r.Pointer(r.uint8, Address)
});
If the type of a pointer is set to 'void', it is not decoded and the computed address in the buffer
is simply returned. To encode a void pointer, create a new r.VoidPointer(type, value).
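The four values of the type option described earlier differ only in the base address that the encoded offset is added to. The library-free sketch below shows that arithmetic together with the allowNull and nullValue defaults. The function resolvePointer and its parameter names are hypothetical and are not the library's API.

```javascript
// Compute the absolute address a pointer refers to, or null for a null pointer.
function resolvePointer({
  type = 'local',
  offset, // the number stored in the pointer field
  structStart, // where the containing structure begins
  pointerPos, // where the pointer field itself begins
  parentStart, // where the parent of the containing structure begins
  allowNull = true,
  nullValue = 0,
}) {
  if (allowNull && offset === nullValue) return null;
  switch (type) {
    case 'local':
      return structStart + offset;
    case 'immediate':
      return pointerPos + offset;
    case 'parent':
      return parentStart + offset;
    case 'global':
      return offset;
    default:
      throw new Error(`Unknown pointer type: ${type}`);
  }
}

// Made-up layout: parent at byte 4, struct at byte 16, pointer field at byte 20.
const layout = { offset: 8, structStart: 16, pointerPos: 20, parentStart: 4 };
const localAddr = resolvePointer({ ...layout }); // 16 + 8
const immediateAddr = resolvePointer({ ...layout, type: 'immediate' }); // 20 + 8
const parentAddr = resolvePointer({ ...layout, type: 'parent' }); // 4 + 8
const globalAddr = resolvePointer({ ...layout, type: 'global' }); // 8
const nullAddr = resolvePointer({ ...layout, offset: 0 }); // null
```

A void pointer, as described above, would simply return this computed address instead of decoding a value at it.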
