isobmff-inspector
v0.5.0
Simple ISOBMFF parser, compatible with JavaScript and Node.js
The ISOBMFF-inspector is a simple module compatible with Node.js and JavaScript to facilitate ISOBMFF file parsing.
It is mostly useful for debugging purposes.
You can see it working online through the demo page of AISOBMFFWVDFBUTFAII, available here. AISOBMFFWVDFBUTFAII is an online ISOBMFF visualizer based on this parser.
Usage
You can install it through npm:
npm install isobmff-inspector
Then you can directly use the inspector in your JavaScript or Node file:
import inspectISOBMFF from "isobmff-inspector";
const parsed = await inspectISOBMFF(MY_ISOBMFF_FILE);
console.log(parsed);
The same entry point can also progressively parse usual byte sources:
import inspectISOBMFF from "isobmff-inspector";
const parsedFile = await inspectISOBMFF(fileInput.files[0]);
const response = await fetch("https://example.com/video.mp4");
const parsedResponse = await inspectISOBMFF(response);
The inspector only buffers the bytes it needs to parse a box. Boxes without a
parser or children, including mdat, are skipped progressively when their size
is known, so large media payloads do not have to be kept in memory.
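To illustrate why a known box size is enough to skip a payload, here is a minimal sketch that walks the top-level boxes of an in-memory buffer by reading only the headers. It is not the inspector's actual implementation: the `listTopLevelBoxes` name and the returned shape are assumptions made for this example, loosely mirroring the `size`, `actualSize`, `headerSize` and `sizeField` properties described later in this document.

```javascript
// Illustrative sketch only: this is not the inspector's actual implementation.
// It reads each box header, then jumps over the payload without touching it.
function listTopLevelBoxes(bytes) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const boxes = [];
  let offset = 0;
  while (offset + 8 <= bytes.byteLength) {
    const remaining = bytes.byteLength - offset;
    let size = view.getUint32(offset); // big-endian 32-bit size field
    const type = String.fromCharCode(...bytes.subarray(offset + 4, offset + 8));
    let headerSize = 8;
    let sizeField = "size";
    if (size === 1) {
      // A size of 1 announces a 64-bit size right after the type
      if (remaining < 16) break; // truncated header
      size = Number(view.getBigUint64(offset + 8));
      headerSize = 16;
      sizeField = "largeSize";
    } else if (size === 0) {
      // A size of 0 means the box extends to the end of the input
      sizeField = "extendsToEnd";
    }
    if (sizeField !== "extendsToEnd" && size < headerSize) break; // invalid size
    const actualSize =
      sizeField === "extendsToEnd" ? remaining : Math.min(size, remaining);
    boxes.push({ type, offset, size, actualSize, headerSize, sizeField });
    offset += actualSize; // skip the payload entirely
  }
  return boxes;
}

// Usage: a 16-byte "free" box followed by an "mdat" box declared with size 0.
const sample = new Uint8Array(32);
sample.set([0, 0, 0, 16, 0x66, 0x72, 0x65, 0x65], 0); // size=16, "free"
sample.set([0, 0, 0, 0, 0x6d, 0x64, 0x61, 0x74], 16); // size=0, "mdat"
const topLevelBoxes = listTopLevelBoxes(sample);
console.log(topLevelBoxes);
```

With a streamed input the same idea applies, except that the skipped bytes are discarded as they arrive instead of being jumped over.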
If you want parsed metadata as it becomes available, use the event iterator:
import { parseEvents } from "isobmff-inspector";
for await (const event of parseEvents(response)) {
  if (event.event === "box-complete") {
    console.log("box parsed", event.path.join("/"), event.box);
  }
}
You can also opt in to raw payload chunks for selected boxes while the parser is consuming them:
import { parseEvents } from "isobmff-inspector";
for await (const event of parseEvents(response, {
  payloads: {
    include: ["mdat"],
    onChunk(info, chunk) {
      console.log(
        "mdat bytes",
        info.payloadAbsoluteOffset,
        info.payloadAbsoluteOffset + chunk.length,
      );
    },
  },
})) {
  // metadata events are still yielded as usual
}
Payload callbacks are forward-only and do not retain media data in memory. If the offsets you need are only known after the relevant payload bytes have already passed, you need to re-open the resource and parse again.
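When the wanted offsets are known in advance, the bytes can be picked out of the forwarded chunks as they pass. The helper below is a minimal sketch of that idea. `sliceWantedRange` is a hypothetical name that is not part of this package, and the sketch assumes that `chunk` is a `Uint8Array` and that `info.payloadAbsoluteOffset` is the absolute position of its first byte.

```javascript
// Hypothetical helper, not part of isobmff-inspector.
// Returns the part of `chunk` overlapping the absolute byte range
// [wantedStart, wantedEnd), or null when there is no overlap.
function sliceWantedRange(info, chunk, wantedStart, wantedEnd) {
  const chunkStart = info.payloadAbsoluteOffset;
  const chunkEnd = chunkStart + chunk.length;
  const start = Math.max(chunkStart, wantedStart);
  const end = Math.min(chunkEnd, wantedEnd);
  if (start >= end) {
    return null;
  }
  // slice() copies the bytes of a plain Uint8Array, so the result stays
  // usable after the callback has returned.
  return chunk.slice(start - chunkStart, end - chunkStart);
}

// Usage with a made-up 10-byte chunk starting at absolute offset 1544:
const chunkInfo = { payloadAbsoluteOffset: 1544 };
const chunkBytes = Uint8Array.from({ length: 10 }, (_, i) => i);
const wanted = sliceWantedRange(chunkInfo, chunkBytes, 1548, 1552);
console.log(wanted); // the bytes 4, 5, 6 and 7 of the chunk
```

Such a helper would typically be called from `onChunk`, accumulating the returned pieces until the wanted range is complete.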
Command line
You can also run the inspector directly from npm:
npx isobmff-inspector myfile.mp4
This prints the parsed box tree as formatted JSON.
Use --format simple to print a lighter JSON tree intended for quick
inspection:
npx isobmff-inspector --format simple myfile.mp4
The default is --format full.
The command reads the input file progressively, so large media payloads do not have to be loaded fully in memory before parsing. The current output is emitted once the parse is complete.
API
inspectISOBMFF(input, options)
import inspectISOBMFF from "isobmff-inspector";
Parses an ISOBMFF input.
The same function is also available as a named export:
import { parse } from "isobmff-inspector";
Supported inputs:
- ArrayBuffer
- any TypedArray, such as Uint8Array
- Blob or File
- Request or Response
- Web ReadableStream
- Node.js readable streams
- sync or async iterables of byte chunks
Return value:
Promise<ParsedBox[]>
Options:
{
  format: "full" // or "simple"
}
"full" is the default and returns the rich ParsedBox[] structure documented
below. "simple" returns a SimpleParsedBox[] structure with parsed field
values projected to plain JavaScript values for console and CLI inspection.
parseBuffer(input, options)
import { parseBuffer } from "isobmff-inspector";
Synchronously parses an ArrayBuffer or TypedArray input. It accepts the same
format option as inspectISOBMFF.
The default return value is:
ParsedBox[]
parseEvents(input, options)
import { parseEvents } from "isobmff-inspector";
Progressively parses an ISOBMFF input and yields metadata events as soon as they are available.
for await (const event of parseEvents(input)) {
  // event.event is "box-start" or "box-complete"
}
Options:
{
  payloads: {
    include: ["mdat"],
    onChunk(info, chunk) {
      // called with forward-only raw payload chunks for the selected boxes
    }
  }
}
payloads is optional. When provided, include selects the box types whose raw
payload should be forwarded as the parser consumes them. This is especially
useful for large payload boxes such as mdat.
Events:
{
  event: "box-start",
  path: ["moov", "trak"],
  type: "tkhd",
  offset: 140,
  size: 92,
  headerSize: 8,
  sizeField: "size"
}
box-start events only expose metadata known from the header. The actual
number of bytes available for a box is exposed on the final ParsedBox sent by
the matching box-complete event.
{
  event: "box-complete",
  path: ["ftyp"],
  box: ParsedBox
}
Payload callback info:
{
  // Path of box types from the top level down to the current box (included)
  path: ["mdat"],
  type: "mdat", // FourCC of this box, as a string
  // Absolute start position of the box (including size and name) in the whole given
  // resource, in bytes
  boxOffset: 1024,
  boxSize: 4096, // announced box size, in bytes; stays 0 for extends-to-end boxes
  headerSize: 8, // size of the box header, in bytes
  // indicates how the box declared its size: `"size"` for the normal
  // 32-bit size field, `"largeSize"` for a 64-bit large-size field, or
  // `"extendsToEnd"` for boxes declared with size `0`.
  sizeField: "size",
  // Start position, in bytes, of the payload chunk communicated here, relative
  // to the beginning of this box's payload.
  // This is 0 for the first payload chunk of a box, then increases for later
  // chunks of the same box.
  payloadOffset: 512,
  // Absolute start position, in bytes, of the payload chunk communicated here
  // in the whole given resource.
  payloadAbsoluteOffset: 1544
}
The payload callback is invoked between the matching box-start and
box-complete events. It is a zero-retention, forward-only stream of bytes: the
parser does not keep those chunks after delivering them.
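The offsets in the payload callback info relate to each other through simple arithmetic. The sketch below derives `payloadAbsoluteOffset` from the other properties and computes how many payload bytes remain from the start of the current chunk. Both helper names are made up for this example, and the relationship is inferred from the property descriptions above rather than guaranteed by the package.

```javascript
// Hypothetical helpers, not part of isobmff-inspector.

// The payload starts right after the header, so a chunk's absolute position
// is the box start, plus the header, plus the chunk's offset in the payload.
function expectedPayloadAbsoluteOffset(info) {
  return info.boxOffset + info.headerSize + info.payloadOffset;
}

// Payload bytes left in the box, counted from the start of the current chunk.
// Unknown (null) for extends-to-end boxes, whose announced size stays 0.
function remainingPayloadBytes(info) {
  if (info.sizeField === "extendsToEnd") {
    return null;
  }
  return info.boxSize - info.headerSize - info.payloadOffset;
}

// Values taken from the payload callback info example of this section:
const payloadInfo = {
  boxOffset: 1024,
  boxSize: 4096,
  headerSize: 8,
  sizeField: "size",
  payloadOffset: 512,
  payloadAbsoluteOffset: 1544,
};
console.log(expectedPayloadAbsoluteOffset(payloadInfo)); // 1544
console.log(remainingPayloadBytes(payloadInfo)); // 3576
```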
ParsedBox
The parsed result is an array of boxes, in the order they are encountered.
In the previous example, parsed will have something like the following
structure:
[ // boxes, in the order they are encountered
  // A simple parsed styp leaf box at the root:
  {
    type: "styp", // 4-character box type
    name: "Segment Type Box", // Optional. More human-readable name for the box
    offset: 0, // offset from the beginning of the input, in bytes
    size: 24, // announced box size, in bytes; stays 0 for extends-to-end boxes
    actualSize: 24, // bytes actually present in the input for that box
    headerSize: 8, // size of the box header, in bytes
    // Optional box human-readable description
    description: "Identifies the brands and compatibility of a media segment.",
    // indicates how the box declared its size: `"size"` for the normal
    // 32-bit size field, `"largeSize"` for a 64-bit large-size field, or
    // `"extendsToEnd"` for boxes declared with size `0`.
    sizeField: "size",
    // values in the box, in the order they are encountered
    values: [
      {
        key: "major_brand", // stable key for the value
        kind: "string", // kind of parsed field
        value: "iso6" // the parsed value. Displayable ones are JS strings
      },
      {
        key: "minor_version",
        kind: "number",
        value: 0 // Number values are usually JS Numbers
      },
      {
        key: "compatible_brands",
        kind: "string",
        value: "iso6, msdh", // here brands are separated by a comma
      }
    ],
    issues: [ // issues detected while parsing this box. Empty when there are none
      {
        severity: "error",
        message: "Truncated box: declared 24 byte(s), only 20 available."
      }
    ]
  },
  // Another example for a container box: it has a `children` key but no `values`:
  {
    type: "moof",
    name: "Movie Fragment Box",
    size: 788,
    children: [ // children boxes, in the order they are encountered
      {
        type: "mfhd",
        name: "Movie Fragment Header Box",
        values: [
          {
            key: "version",
            kind: "number",
            value: 0
          },
          {
            key: "flags",
            kind: "number",
            value: 0
          },
          {
            key: "sequence_number",
            kind: "number",
            value: 2
          }
        ]
      }
    ]
  }
  // ...
]
A uuid property is also present, but only on uuid boxes. It contains
the user-defined box UUID as an uppercase hexadecimal string.
actualSize is always present on parsed boxes. It equals size for complete
fixed-size boxes, is lower when the input is truncated, and carries the
observed extent for extendsToEnd boxes whose declared size stays 0.
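As a minimal sketch of how those three cases can be told apart from `size`, `actualSize` and `sizeField` alone, consider the helper below. `describeExtent` and its return values are made up for this example and are not part of the package.

```javascript
// Hypothetical helper, not part of isobmff-inspector.
// Classifies a parsed box from its size-related properties only.
function describeExtent(box) {
  if (box.sizeField === "extendsToEnd") {
    // The declared size stays 0: only actualSize tells how far the box went.
    return { kind: "extends-to-end", missingBytes: 0 };
  }
  if (box.actualSize < box.size) {
    return { kind: "truncated", missingBytes: box.size - box.actualSize };
  }
  return { kind: "complete", missingBytes: 0 };
}

// Made-up sample boxes, reduced to the relevant properties:
const completeBox = { type: "styp", size: 24, actualSize: 24, sizeField: "size" };
const truncatedBox = { type: "moof", size: 788, actualSize: 700, sizeField: "size" };
const openEndedBox = { type: "mdat", size: 0, actualSize: 4096, sizeField: "extendsToEnd" };

console.log(describeExtent(completeBox).kind); // "complete"
console.log(describeExtent(truncatedBox)); // { kind: "truncated", missingBytes: 88 }
console.log(describeExtent(openEndedBox).kind); // "extends-to-end"
```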
When possible, the inspector keeps parsing after an error to return as much
information as it can. Parsing issues are reported on the corresponding box
through an issues array.
Each issue entry has:
- severity: either "warning" or "error"
- message: a human-readable description of the issue
severity: "error" means the inspector could not reliably parse part of the
file, for example because a box is truncated, has an invalid size, or a field
could not be read. severity: "warning" means parsing could continue, but the
parsed result may be incomplete or suspicious, for example when a known box
parser left unread bytes.
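Because issues are attached to the box they concern, getting an overview of a whole file means walking the tree. The following sketch gathers every issue along with the path of the box reporting it. `collectIssues` is a hypothetical helper, and the sample data (including the warning message) is made up for the example.

```javascript
// Hypothetical helper, not part of isobmff-inspector.
// Recursively gathers the issues of all boxes, with the path of each box.
function collectIssues(boxes, parentPath = []) {
  const found = [];
  for (const box of boxes) {
    const path = [...parentPath, box.type];
    for (const issue of box.issues ?? []) {
      found.push({ path: path.join("/"), ...issue });
    }
    if (box.children) {
      found.push(...collectIssues(box.children, path));
    }
  }
  return found;
}

// Made-up parse result, reduced to the relevant properties:
const parsedBoxes = [
  {
    type: "moov",
    issues: [],
    children: [
      {
        type: "trak",
        issues: [{ severity: "warning", message: "Example warning message." }],
      },
    ],
  },
  {
    type: "styp",
    issues: [
      {
        severity: "error",
        message: "Truncated box: declared 24 byte(s), only 20 available.",
      },
    ],
  },
];

const allIssues = collectIssues(parsedBoxes);
const errorCount = allIssues.filter((i) => i.severity === "error").length;
console.log(allIssues.map((i) => `${i.severity} in ${i.path}`), errorCount);
```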
Field values
Each parsed field in ParsedBox.values has a stable key, a kind, and
kind-specific data. description is optional and is present when the parser has
extra human-readable context for that field. offset and byteLength are also
available when the parser knows which input bytes produced that field.
Most scalar fields follow this shape:
{
  key: "sequence_number",
  kind: "number",
  value: 2,
  offset: 12,
  byteLength: 4,
  description: "Movie fragment sequence number." // Optional
}
Applications should switch on kind when reading fields:
- number: Used for 8-bit to 32-bit integer fields. value is a JavaScript number.
- bigint: Used for 64-bit integer fields. value is a JavaScript bigint.
- string: Used for string fields. value is a JavaScript string.
- bytes: Used for binary data fields. value is a JavaScript string of its hex-encoded value (uppercase, with no prefix).
- boolean: value is a JavaScript boolean.
- null: value is a parsed null value.
- fixed-point: For most ISOBMFF floating numbers. value is a JavaScript number. More advanced info is also available (described below).
- date: For what are semantically dates. value is either a number or bigint depending on its size. More advanced info is also available (described below).
- bits: a packed integer split into named bit ranges. value is a JavaScript number. More advanced info is also available (described below).
- flags: a packed integer interpreted as named boolean flags. value is a JavaScript number. More advanced info is also available (described below).
- array: an ordered list of parsed fields. This list is in an items array property. It has no value property. More advanced info is also available (described below).
- struct: a named group of parsed fields. Those fields are defined in a fields array property. It has no value property. More advanced info is also available (described below).
For simple use cases, you can just read the value property of every kind
except array (which instead holds further field objects in an items array)
and struct (which similarly holds them in a fields array).
More advanced usage is described below.
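As a minimal sketch of that simple approach, the helper below flattens a `values` array into `[path, value]` pairs, descending into `items` for arrays and into `fields` for structs. `flattenValues` and the path notation are assumptions made for this example, and the sample data is made up.

```javascript
// Hypothetical helpers, not part of isobmff-inspector.
function flattenField(field, path, out) {
  switch (field.kind) {
    case "array":
      // Array items have no key: they are addressed by index.
      field.items.forEach((item, i) => flattenField(item, `${path}[${i}]`, out));
      break;
    case "struct":
      for (const sub of field.fields) {
        flattenField(sub, `${path}.${sub.key}`, out);
      }
      break;
    default:
      // Every other kind exposes a directly readable `value`.
      out.push([path, field.value]);
  }
  return out;
}

function flattenValues(values) {
  const out = [];
  for (const field of values) {
    flattenField(field, field.key, out);
  }
  return out;
}

// Made-up sample mixing a scalar with an array of structs:
const sampleValues = [
  { key: "sequence_number", kind: "number", value: 2 },
  {
    key: "sequenceParameterSets",
    kind: "array",
    items: [
      {
        kind: "struct",
        fields: [
          { key: "length", kind: "number", value: 4 },
          { key: "data", kind: "bytes", value: "6742C00D" },
        ],
      },
    ],
  },
];
const flattened = flattenValues(sampleValues);
console.log(flattened);
```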
fixed-point
fixed-point fields correspond to cases where the ISOBMFF format encodes
fixed-point numbers explicitly (the format often prefers them to the less
precise IEEE 754 float values, such as number in JavaScript).
Such a field exposes the decoded number through value, plus the raw integer
and its declared format:
{
  key: "horizontal_resolution",
  kind: "fixed-point",
  value: 72,
  raw: 4718592,
  format: "16.16",
  signed: false,
  bits: 32
}
The bits property is the size of the raw fixed-point integer before fractional scaling.
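The link between `raw`, `format` and `value` can be sketched as a division by a power of two. The helper below assumes that a format such as `"16.16"` reads as "integer bits, then fractional bits" and that `raw` has already been interpreted with the right sign. `decodeFixedPoint` is a made-up name, not a function of this package.

```javascript
// Hypothetical helper, not part of isobmff-inspector.
// Assumes `format` is "<integer bits>.<fractional bits>", e.g. "16.16".
function decodeFixedPoint(raw, format) {
  const fractionalBits = Number(format.split(".")[1]);
  // Dividing by 2^fractionalBits moves the binary point to its place.
  return raw / 2 ** fractionalBits;
}

console.log(decodeFixedPoint(4718592, "16.16")); // 72
console.log(decodeFixedPoint(65536, "16.16")); // 1
console.log(decodeFixedPoint(384, "8.8")); // 1.5
```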
date
date fields correspond to ISOBMFF properties encoding a date.
Their value is the raw ISOBMFF timestamp. Be careful: those timestamps
generally do not rely on the Unix epoch. The epoch actually used is given, as
an ISO-8601 string, through the epoch property. When the timestamp can be
represented by JavaScript's Date, the date property exposes it as an ISO-8601 string:
{
  key: "creation_time",
  kind: "date",
  value: 3846096077n,
  date: "2025-11-16T00:01:17.000Z",
  epoch: "1904-01-01T00:00:00.000Z",
  unit: "seconds"
}
date is null when the corresponding Unix timestamp cannot be converted to a
finite, valid JavaScript Date. For bigint values, this also happens when the
timestamp is outside JavaScript's safe integer range.
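The conversion from the raw timestamp to an ISO-8601 string can be sketched from the `value`, `epoch` and `unit` properties. The helper below is a hypothetical example, not the package's code: it only handles the `"seconds"` unit shown above and returns `null` in the cases where no valid `Date` can be produced.

```javascript
// Hypothetical helper, not part of isobmff-inspector.
// Rebuilds an ISO-8601 string from a date field's value, epoch and unit.
function dateFieldToIso(field) {
  if (field.unit !== "seconds") {
    return null; // only the unit shown in this document is handled here
  }
  // Milliseconds between the Unix epoch and the field's own epoch
  // (negative for 1904-01-01, which predates 1970).
  const epochMs = BigInt(Date.parse(field.epoch));
  const unixMs = epochMs + BigInt(field.value) * 1000n;
  if (
    unixMs > BigInt(Number.MAX_SAFE_INTEGER) ||
    unixMs < BigInt(Number.MIN_SAFE_INTEGER)
  ) {
    return null;
  }
  const date = new Date(Number(unixMs));
  return Number.isNaN(date.getTime()) ? null : date.toISOString();
}

const epoch1904 = "1904-01-01T00:00:00.000Z";
// 2082844800 seconds separate 1904-01-01 from 1970-01-01.
console.log(dateFieldToIso({ value: 2082844800n, epoch: epoch1904, unit: "seconds" }));
// "1970-01-01T00:00:00.000Z"
console.log(dateFieldToIso({ value: 86400, epoch: epoch1904, unit: "seconds" }));
// "1904-01-02T00:00:00.000Z"
console.log(dateFieldToIso({ value: 2n ** 63n, epoch: epoch1904, unit: "seconds" }));
// null
```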
bits
bits fields keep the original integer in raw and describe each named
sub-field in fields. value is a convenience for minimal consumers: it is the
most meaningful decoded value when the parser identifies one, or the raw integer
otherwise. Consumers that need precise bit-level meaning should read fields.
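The `mask` and `shift` properties of each sub-field are enough to recompute its value from `raw`, as this minimal sketch shows with the numbers of the lengthSizeMinusOne entry of this section. The same masking idea applies to the entries of flags fields, which also carry a `mask`. Both helper names are made up for this example.

```javascript
// Hypothetical helpers, not part of isobmff-inspector.

// Keeps only the bits selected by `mask`, then aligns them to bit 0.
// `>>>` keeps the result unsigned even when bit 31 is set.
function extractBits(raw, mask, shift) {
  return (raw & mask) >>> shift;
}

// A named flag is set when at least one bit of its mask is set in `raw`.
function isFlagSet(raw, mask) {
  return (raw & mask) !== 0;
}

// raw = 255 (0b11111111): six reserved bits, then a 2-bit value.
console.log(extractBits(255, 252, 2)); // 63
console.log(extractBits(255, 3, 0)); // 3
// raw = 131072 (0x020000) with the 0x020000 flag mask.
console.log(isFlagSet(131072, 131072)); // true
console.log(isFlagSet(131072, 1)); // false
```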
{
  key: "lengthSizeMinusOne",
  kind: "bits",
  value: 3,
  raw: 255,
  bits: 8,
  fields: [
    { key: "reserved", value: 63, bits: 6, shift: 2, mask: 252 },
    { key: "value", value: 3, bits: 2, shift: 0, mask: 3 }
  ]
}
flags
flags fields keep the original integer in both value and raw, then expose
the named flags as booleans:
{
  key: "flags",
  kind: "flags",
  value: 131072,
  raw: 131072,
  bits: 24,
  flags: [
    { key: "default-base-is-moof", value: true, mask: 131072 }
  ]
}
array and struct
array and struct fields are recursive. Array items contain parsed fields
without a key; struct fields contain normal keyed ParsedBoxValue entries.
For example, an array of AVC parameter-set objects is represented as an array of struct fields:
{
  key: "sequenceParameterSets",
  kind: "array",
  items: [
    {
      kind: "struct",
      fields: [
        { key: "length", kind: "number", value: 24 },
        { key: "data", kind: "string", value: "6742c00d..." }
      ]
    }
  ]
}
A struct may also expose a layout hint when the parser knows how the fields
should be displayed. Current layout values are:
- "matrix-3x3": a 3 by 3 transformation matrix.
- "iso-639-2-t": an ISO 639-2/T language code plus its packed raw value.
- "cenc-pattern": Common Encryption crypt/skip byte-block pattern fields.
{
  key: "matrix",
  kind: "struct",
  layout: "matrix-3x3",
  fields: [
    {
      key: "a",
      kind: "fixed-point",
      value: 1,
      raw: 65536,
      format: "16.16",
      signed: true,
      bits: 32
    }
  ]
}
Simple format
The "simple" format keeps box-level metadata but replaces values with a
plain fields object:
const parsed = await inspectISOBMFF(input, { format: "simple" });
{
  type: "ftyp",
  offset: 0,
  size: 24,
  actualSize: 24,
  headerSize: 8,
  sizeField: "size",
  fields: {
    major_brand: "iso6",
    minor_version: 0,
    compatible_brands: "iso6, msdh"
  }
}
Simple boxes use the following shape:
{
  type: "moov",
  offset: 24,
  size: 1024,
  actualSize: 1024,
  headerSize: 8,
  sizeField: "size",
  uuid: "001122...", // only for uuid boxes
  fields: {},
  children: [
    // SimpleParsedBox
  ],
  issues: [
    // only present when non-empty
  ]
}
Field keys are kept unchanged. Packed bits and flags fields become plain
objects containing the decoded named entries plus the original integer in
$raw:
{
  fields: {
    lengthSizeMinusOne: {
      $raw: 255,
      reserved: 63,
      value: 3
    },
    flags: {
      $raw: 131072,
      "default-base-is-moof": true
    }
  }
}
fixed-point fields become their decoded number. date fields become their
ISO-8601 string when available, otherwise their raw value. array and struct
fields are recursively simplified.
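To make those projection rules concrete, here is a rough sketch of how a full-format `values` array could be turned into a simple `fields` object. It only illustrates the rules listed above and is not the package's implementation. In particular, mapping array fields to JavaScript arrays and struct fields to nested objects is an assumption about what "recursively simplified" produces.

```javascript
// Illustrative sketch only, not the package's implementation.
function simplifyField(field) {
  switch (field.kind) {
    case "bits": {
      const out = { $raw: field.raw };
      for (const sub of field.fields) out[sub.key] = sub.value;
      return out;
    }
    case "flags": {
      const out = { $raw: field.raw };
      for (const flag of field.flags) out[flag.key] = flag.value;
      return out;
    }
    case "date":
      // ISO-8601 string when available, raw value otherwise
      return field.date ?? field.value;
    case "array": // assumption: becomes a plain array
      return field.items.map(simplifyField);
    case "struct": // assumption: becomes a nested plain object
      return simplifyFields(field.fields);
    default:
      // number, bigint, string, bytes, boolean, null and fixed-point
      return field.value;
  }
}

function simplifyFields(values) {
  const fields = {};
  for (const field of values) fields[field.key] = simplifyField(field);
  return fields;
}

// The bits and flags examples of this document, in full format:
const fullValues = [
  {
    key: "lengthSizeMinusOne", kind: "bits", value: 3, raw: 255, bits: 8,
    fields: [
      { key: "reserved", value: 63, bits: 6, shift: 2, mask: 252 },
      { key: "value", value: 3, bits: 2, shift: 0, mask: 3 },
    ],
  },
  {
    key: "flags", kind: "flags", value: 131072, raw: 131072, bits: 24,
    flags: [{ key: "default-base-is-moof", value: true, mask: 131072 }],
  },
];
const simpleFields = simplifyFields(fullValues);
console.log(simpleFields);
```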
Integer types
Parsed integer values follow a fixed rule:
- 8-bit to 32-bit integers are returned as JavaScript number values
- 64-bit integers are returned as JavaScript bigint values
This means 64-bit ISOBMFF fields are always exact and never depend on the parsed value's magnitude.
It also means that applications have to check for bigint when handling
numeric values, as bigint and number values cannot be mixed in most operations.
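One possible way to deal with both types in application code is to narrow bigint values to number only when doing so loses nothing. The helper below is a minimal sketch of that approach. `toNumberIfSafe` is a made-up name, not part of this package.

```javascript
// Hypothetical helper, not part of isobmff-inspector.
// Converts a bigint to a number when it fits in the safe integer range,
// and leaves every other value untouched.
function toNumberIfSafe(value) {
  if (typeof value !== "bigint") {
    return value;
  }
  const fits =
    value >= BigInt(Number.MIN_SAFE_INTEGER) &&
    value <= BigInt(Number.MAX_SAFE_INTEGER);
  return fits ? Number(value) : value;
}

console.log(toNumberIfSafe(42n)); // 42
console.log(toNumberIfSafe(7)); // 7
console.log(typeof toNumberIfSafe(2n ** 60n)); // "bigint"

// Mixing both types directly throws a TypeError:
let mixingThrows = false;
try {
  // eslint-disable-next-line no-unused-expressions
  1n + 1;
} catch (err) {
  mixingThrows = err instanceof TypeError;
}
console.log(mixingThrows); // true
```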
Parsed boxes
The inspector only parses the following ISOBMFF boxes for now:
- ac-3
- av01
- avc1
- avc3
- avcC
- btrt
- cdsc
- co64
- colr
- cslg
- ctts
- dac3
- data
- dec3
- dOps
- dinf
- dref
- ec-3
- edts
- elng
- elst (and sub-boxes)
- emsg
- enca
- encv
- esds
- font
- free
- frma
- ftyp
- hdlr
- hev1
- hind
- hint
- hmhd
- hvc1
- hvcC
- keys
- ID32
- ilst
- iods
- leva
- mdat
- mdhd
- mdia
- mehd
- meta
- mfhd
- mfra
- mfro
- minf
- moof
- moov
- mp4a
- mvex
- mvhd
- nmhd
- Opus
- padb
- pasp
- pdin
- prft
- pssh
- saio
- saiz
- sbgp
- schi
- schm
- sdtp
- senc
- sgpd
- sidx
- sinf
- skip
- smhd
- stbl
- stco
- stdp
- sthd
- stsc
- stsd
- stsh
- stss
- stz2
- stsz
- stts
- styp
- subt
- tenc
- tfdt
- tfhd
- tfra
- tkhd
- traf
- trak
- tref
- trep
- trex
- trun
- udta
- url
- urn
- uuid
- vdep
- vmhd
- vplx
I plan to support every ISOBMFF box except UUID boxes (though I may add support for some of those in the future, for example the Smooth Streaming ones).
Contribute
You can help me add parsing logic for other boxes by updating the
src/boxes directory.
You can use the already-defined boxes as a starting point. Each of the parser
functions there receives a BoxReader object.
They should emit parsed fields directly through that BoxReader.
This can e.g. be done through a method like
reader.fieldUint("version", 1, "The box version").
Note that each of those calls advances the BoxReader's internal cursor, so
consecutive calls will progress through the file.
If a parser uses read* helpers plus addField, it can retrieve the current
cursor through reader.getCurrentOffset() and pass offset / byteLength
explicitly to addField(...) when that extra metadata is meaningful.
