@aoede/tamper

v1.0.0

Tamper (ESM)

ESM encoder/decoder for Tamper - a compact format for bulk categorical datasets.

This repository contains an ESM-native implementation of the encoder and decoder for the Tamper format, originally developed at the New York Times, plus strict parity tooling that checks its output is identical to that of the frozen legacy implementation.

This project is an independent ESM implementation of the Tamper format. It does not define a new format and is not affiliated with the original NYT repository.

Tamper is a column-oriented packer for tabular categorical data (low-cardinality enums, booleans, bucketed integers) where JSON + compression becomes inefficient.

When to use this

Tamper is a good fit when your data is:

  • Tabular (many rows with the same attributes)
  • Categorical-heavy (enums, booleans, small integers)
  • Bulk (transferred or stored as snapshots)
  • Read-mostly / immutable
  • Required to match legacy Tamper output exactly

Use cases:

  • Analytics extracts for dashboards
  • Lookup / reference tables
  • ML-style categorical feature matrices shipped to JS or WASM

When not to use this

Do not use Tamper for:

  • Nested or hierarchical objects
  • General APIs or CRUD payloads
  • Arbitrary graphs
  • Free-form documents or HTML

If your data is not mostly categorical and tabular, JSON + Brotli/Zstd or a schema-based format (e.g. Protobuf, Arrow) will likely be a better fit.


Overview

Tamper is a data serialisation protocol originally developed at the New York Times to efficiently transfer large categorical datasets from server to browser.

This repository provides a modern ESM implementation of the original CommonJS codebase, with:

  • identical encoded output
  • identical decoded results
  • strict, automated parity checks against the frozen legacy implementation

Core encoding approach

Tamper packs categorical columns using bitwise encodings, automatically selecting the most efficient strategy per attribute:

  • Integer packing - sparse or bounded integer values
  • Bitmap packing - dense categorical values
  • Existence packing - tracks presence using run-length encoding

These strategies are chosen automatically by the encoder based on observed data characteristics.
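
To make the idea of bit-level packing concrete, here is a minimal, self-contained sketch of two of the underlying techniques: fixed-width integer packing and run-length encoding of presence flags. It is illustrative only. The function names (`bitsNeeded`, `packInts`, `unpackInts`, `runLengths`) and the bit layout are assumptions made for this example and are not Tamper's actual wire format or API.

```typescript
// Illustrative sketch only: NOT Tamper's wire format. All names are hypothetical.

/** Smallest bit width that can represent every value in [0, maxValue]. */
function bitsNeeded(maxValue: number): number {
  let bits = 1;
  while (2 ** bits <= maxValue) bits++;
  return bits;
}

/** Pack non-negative integers into bytes, `width` bits each, most significant bit first. */
function packInts(values: readonly number[], width: number): Uint8Array {
  const out = new Uint8Array(Math.ceil((values.length * width) / 8));
  let bitPos = 0;
  for (const v of values) {
    for (let b = width - 1; b >= 0; b--) {
      if ((v >> b) & 1) {
        const i = bitPos >> 3;
        out[i] = (out[i] ?? 0) | (0x80 >> (bitPos & 7));
      }
      bitPos++;
    }
  }
  return out;
}

/** Inverse of packInts: read `count` integers of `width` bits each. */
function unpackInts(bytes: Uint8Array, width: number, count: number): number[] {
  const values: number[] = [];
  let bitPos = 0;
  for (let n = 0; n < count; n++) {
    let v = 0;
    for (let b = 0; b < width; b++) {
      const bit = ((bytes[bitPos >> 3] ?? 0) >> (7 - (bitPos & 7))) & 1;
      v = (v << 1) | bit;
      bitPos++;
    }
    values.push(v);
  }
  return values;
}

/** Alternating run lengths of absent/present flags, starting with an "absent" run. */
function runLengths(present: readonly boolean[]): number[] {
  const runs: number[] = [];
  let current = false;
  let length = 0;
  for (const flag of present) {
    if (flag === current) {
      length++;
    } else {
      runs.push(length);
      current = flag;
      length = 1;
    }
  }
  runs.push(length);
  return runs;
}

// Eight values drawn from four categories fit in 2 bits each, so 2 bytes in total.
const column = [0, 3, 1, 2, 3, 0, 1, 1];
const width = bitsNeeded(Math.max(...column));
const packed = packInts(column, width);
console.log(width, packed.length, unpackInts(packed, width, column.length));
```

In this sketch the width is derived from the largest observed value, which mirrors the idea of choosing an encoding from the data itself rather than from a declared schema.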


Performance

Tamper achieves significant compression for categorical tabular data:

  • Sparse datasets: 10-15x compression (e.g., 500 events across 10K IDs)
  • Dense multi-value attributes: 20-30x compression (bitmap encoding)
  • Very sparse datasets: 4-5x compression at scale (existence encoding with RLE)

The compression ratio improves with dataset size because the fixed header overhead is amortised over more rows. The size comparison script shows real examples:

npm run example

This script demonstrates four scenarios showing Tamper vs plain JSON size, compression ratios, and the impact of:

  • Existence encoding for sparse data
  • Integer encoding for categorical values
  • Bitmap encoding for multi-value attributes
  • Fixed overhead on small vs large datasets

Note: These compression ratios are measured before any transport-level compression. Tamper packs can be compressed further with gzip or Brotli, and the result is often smaller than gzip- or Brotli-compressed plain JSON, because Tamper removes repeated field names and uses bit-packed encodings.
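
The effect of fixed overhead on small versus large datasets can be shown with some back-of-the-envelope arithmetic. The sketch below is not the project's example script and does not reproduce the ratios quoted above. It assumes a hypothetical 64-byte header and a single four-valued column stored at 2 bits per row, and compares that against the same rows as plain JSON.

```typescript
// Assumption: a fixed 64-byte header. The real Tamper header size is not stated in this README.
const ASSUMED_HEADER_BYTES = 64;
const CATEGORIES = ["red", "green", "blue", "grey"] as const;

/** Size in bytes of the rows as plain JSON (the text is ASCII, so length equals bytes). */
function jsonSize(rowCount: number): number {
  const rows = Array.from({ length: rowCount }, (_, i) => ({
    color: CATEGORIES[i % CATEGORIES.length],
  }));
  return JSON.stringify(rows).length;
}

/** Size in bytes of a hypothetical pack: fixed header plus `bitsPerValue` bits per row. */
function packedSize(rowCount: number, bitsPerValue = 2): number {
  return ASSUMED_HEADER_BYTES + Math.ceil((rowCount * bitsPerValue) / 8);
}

/** Plain JSON size divided by hypothetical packed size. */
function ratio(rowCount: number): number {
  return jsonSize(rowCount) / packedSize(rowCount);
}

// The ratio rises with row count as the header becomes a smaller share of the pack.
for (const n of [10, 1_000, 100_000]) {
  console.log(`${n} rows: ${ratio(n).toFixed(1)}x`);
}
```

With very few rows the header dominates the packed size, which is why small datasets see little benefit in this model.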


Repository structure

├── clients/js/src/         # ESM decoder (browser-side)
├── encoders/js/
│   ├── core/               # Environment-agnostic encoder logic
│   └── env/                # Node.js & browser adapters
├── legacy/                 # Frozen legacy implementation (reference only)
├── vendor/bitsy/           # Vendored bitset library (no npm deps)
├── scripts/                # Parity verification tools
└── test/                   # Test datasets & canonical outputs

Requirements

  • Node.js (ESM-capable; tested with current LTS)
  • npm (for installing dev tooling)
  • Encoder runtime uses a local vendor/bitsy shim (no network installs)

Install dev dependencies for TSX-driven scripts:

npm install

Usage

Decoder (ESM)

Exports:

  • createTamper() - decoder factory
  • Tamper - decoder methods
  • default export - alias of createTamper

import createTamper from "./clients/js/src/tamper.ts";
import fs from "node:fs/promises";

const tamper = createTamper();
const pack = JSON.parse(await fs.readFile("pack.json", "utf8"));
const items = tamper.unpackData(pack);

Encoder (ESM)

Entry points:

  • Node / standard ESM: encoders/js/index.ts
  • Browser / edge: compose core + environment adapter

Exports:

  • createPackSet, PackSet
  • Pack, IntegerPack, BitmapPack, ExistencePack

import { createPackSet } from "./encoders/js/index.ts";

const tamp = createPackSet();
// configure attributes + pack data...
const json = tamp.toJSON();

Browser / edge example:

import createEncoder from "./encoders/js/core/createEncoder.ts";
import browserEnv from "./encoders/js/env/browser.ts";

const { createPackSet } = createEncoder(browserEnv);

const tamp = createPackSet();
// configure attributes + pack data...
const json = tamp.toJSON();

Parity verification (strict)

Decoder parity compares decoded output from the legacy and ESM implementations:

tsx scripts/compare-decoders.ts

Encoder parity builds packs from test datasets and compares full JSON output against canonical fixtures:

tsx scripts/compare-encoders.ts

Parity holds only when the ESM implementation's output matches every canonical fixture byte-for-byte.
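
As an illustration of what a byte-for-byte check involves, the following sketch compares two serialised outputs and reports the offset of the first mismatch. It is not the code in `scripts/compare-encoders.ts`, whose internals are not shown here. The helper name `firstDifference` and the sample strings are hypothetical.

```typescript
/**
 * Return the index of the first differing byte, or -1 if the inputs are identical.
 * If one input is a prefix of the other, the shorter length is returned.
 */
function firstDifference(a: Uint8Array, b: Uint8Array): number {
  const shared = Math.min(a.length, b.length);
  for (let i = 0; i < shared; i++) {
    if (a[i] !== b[i]) return i;
  }
  return a.length === b.length ? -1 : shared;
}

// Hypothetical stand-ins for a canonical fixture and freshly encoded output.
const encoder = new TextEncoder();
const canonical = encoder.encode('{"version":"2.1","packs":[]}');
const produced = encoder.encode('{"version":"2.1","packs":[]}');

const offset = firstDifference(canonical, produced);
console.log(offset === -1 ? "PASS" : `FAIL at byte ${offset}`);
```

Comparing raw bytes rather than parsed objects means that differences in key order or whitespace also count as failures, which is what a strict parity check requires.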


Notes

  • Encoder output is tuned to exactly match canonical JSON fixtures (including legacy fields such as max_guid and existence metadata).
  • The legacy implementation is retained only for parity verification and reference; it is not used at runtime.
  • The browser encoder uses Uint8Array and DataView and does not depend on Node.js Buffer.
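
For readers unfamiliar with writing binary data without Node.js Buffer, here is a minimal sketch using only `Uint8Array` and `DataView`, which are available in browsers, edge runtimes and Node.js alike. The five-byte layout (a 32-bit big-endian count followed by a one-byte bit width) is invented for this example and is not part of the Tamper format.

```typescript
// Hypothetical 5-byte layout for illustration: uint32 count (big-endian) + uint8 bit width.

function writeHeader(count: number, width: number): Uint8Array {
  const bytes = new Uint8Array(5);
  const view = new DataView(bytes.buffer);
  view.setUint32(0, count, false); // false = big-endian
  view.setUint8(4, width);
  return bytes;
}

function readHeader(bytes: Uint8Array): { count: number; width: number } {
  // Respect byteOffset so this also works on a view into a larger buffer.
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  return { count: view.getUint32(0, false), width: view.getUint8(4) };
}

const header = writeHeader(500, 2);
console.log(Array.from(header), readHeader(header));
```

Passing the endianness explicitly to `DataView` keeps the byte order the same on every platform.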

Expected output of the parity scripts

PASS large.json
PASS run.json
PASS run2.json
PASS small.json
PASS small2.json
PASS sparse.json
PASS spstart.json

All 7 file(s) passed parity checks.
...
All 7 file(s) passed encoder parity checks.