json-kiln

A streaming JSON array generator that hits a target file size (in bytes) with constant memory usage.
Features: exact final file size, per-item minimum size, deterministic output, EJSON _id.$oid, a progress bar, a Docker image, and CLI-friendly defaults.

Bake JSON arrays to an exact size with streaming heat and zero OOM.


Why

Most mock-data tools optimize for shape (schemas), not for exact file size.
json-kiln focuses on byte budget and streaming:

  • Generate 100 MB, 500 MB, 5 GB JSON without keeping the full array in RAM.
  • Guarantee per-item minimum size (e.g., each element ≥ 200 MB for edge-case testing).
  • Make the whole file match the exact target size via a smart pad field on the last element (see the element sketch below).
  • Deterministic output with --seed.
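
To make that concrete, a generated element looks roughly like the sketch below (the payload field is illustrative; the actual id format and the pad field are documented in the CLI reference and Limitations sections):

[
  {
    "id": "cfg_0_5f9c1aab12cd34ef56ab78cd",
    "payload": { "note": "random nested content, up to --depth levels" },
    "pad": "aaaaaaaaaaaaaaaaaaaaaaaa"
  }
]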

Features

  • Target file size in bytes: --size 500MB
  • Streaming output (no full-array buffering, backpressure-aware)
  • Per-item minimum size: --minItemSize 200MB
  • Exact final size: --exact pads the last element to match the target
  • Deterministic via --seed
  • EJSON _id: { "$oid": "…" } via --oid
  • TTY progress bar: --progress
  • Pretty-print option: --pretty
  • Docker image and npm CLI

Install

npm (CLI)

npm i -g json-kiln
# or locally
npm i -D json-kiln

You now have a json-kiln CLI (short alias jk, if enabled).
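
With a local (dev) install the binary is not on your PATH; run it through npx or a package.json script instead:

npx json-kiln --size 50MB --out sample.json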

Docker

docker pull shyrokyivladislav/json-kiln:latest

Quick start

Generate a 200 MB file with a single element:

json-kiln --size 200MB --items 1 --minItemSize 200MB --exact --progress --out one-200mb.json

Generate a 500 MB file with two elements, each ≥ 200 MB:

json-kiln --size 500MB --items 2 --minItemSize 200MB --exact --progress --out two-250mb.json

Stream to gzip (no temporary files):

json-kiln --size 2GB --depth 5 --progress | gzip > data.json.gz

Generate with EJSON ObjectId and deterministic output:

json-kiln --size 300MB --oid --seed 1337 --progress --out dataset.json

CLI

json-kiln [options]

Options:
  --size=VALUE         Target file size (bytes). Supports B|KB|MB|GB (e.g., 500MB, 1GB).
  --depth=N            Max nesting depth (default: 3, range: 0..12).
  --items=N            Fixed number of array elements. If omitted, the generator fills by size.
  --minItemSize=VALUE  Per-item minimum size in bytes (B|KB|MB|GB). Each element is padded
                       to be at least this long. With --items, helps split the budget.
  --exact              Force exact final file size by padding the last element's "pad" field.
  --oid                Use EJSON-style id: { "_id": { "$oid": "<24-hex>" } }.
                       Default is string "id": "cfg_<i>_<oid>".
  --seed=N             Deterministic RNG seed (number).
  --pretty             Pretty-print JSON (more readable, larger output).
  --progress           Show single-line progress bar on TTY, log lines on non-TTY.
  --out=PATH           Write to file. Default is stdout.

Short alias (if enabled): `jk`

Exit codes

  • 0 success
  • 1 invalid arguments or I/O error
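
This makes failures straightforward to catch in shell scripts, for example:

json-kiln --size 100MB --exact --out data.json || { echo "generation failed" >&2; exit 1; }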

Typical recipes

1) Single element ≈ 200 MB

json-kiln --size 200MB --items 1 --minItemSize 200MB --exact --out one-200mb.json

2) 500 MB total, two elements, each ≥ 200 MB

json-kiln --size 500MB --items 2 --minItemSize 200MB --exact --out twox250mb.json

3) Big file with streaming and gzip

json-kiln --size 5GB --depth 5 --progress | gzip > five-gigs.json.gz

4) Deterministic dataset with EJSON and pretty-print

json-kiln --size 120MB --depth 4 --oid --seed 42 --pretty --out demo.json

Docker

Run with output file mounted to a volume:

docker run --rm -it \
  -v "$PWD:/out" \
  shyrokyivladislav/json-kiln:latest \
  --size 500MB --items 2 --minItemSize 200MB --exact --progress --out /out/twox250mb.json

Pipe to gzip in container:

docker run --rm -i shyrokyivladislav/json-kiln:latest \
  --size 2GB --depth 5 --progress | gzip > data.json.gz

Library API (Node.js)

import { generateStream } from 'json-kiln';
import { createWriteStream } from 'node:fs';

const stream = createWriteStream('data.json', { encoding: 'utf8' });
await generateStream(stream, {
  size: '500MB', // string or number (bytes)
  depth: 4,
  items: 2,
  minItemSize: '200MB', // string or number (bytes)
  exact: true,
  oid: true,
  seed: 1337,
  pretty: false,
  progress: true, // progress to stderr
});
// resolves when stream is finished

Options (library) mirror CLI flags. String sizes ("200MB") or numeric bytes are accepted. generateStream respects backpressure and never buffers the full array.
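
Because generateStream writes to any Writable, it composes with other Node streams. A minimal sketch of on-the-fly compression (it assumes generateStream ends the stream it is given once generation completes, as the example above implies):

import { generateStream } from 'json-kiln';
import { createWriteStream } from 'node:fs';
import { createGzip } from 'node:zlib';
import { pipeline } from 'node:stream/promises';

// Gzip is a Transform (Writable + Readable): json-kiln writes into it
// while it drains into the file, so backpressure propagates end to end.
const gzip = createGzip();
const done = pipeline(gzip, createWriteStream('data.json.gz'));

await generateStream(gzip, { size: '1GB', seed: 7 });
await done; // wait until the compressed bytes are flushed to disk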


Performance notes

  • Streaming: output is written incrementally. Memory stays near the size of a single element string plus small buffers.
  • Backpressure-aware: waits for drain when writing to slow destinations (stdout, filesystems, pipes).
  • Pretty-print (--pretty) increases file size and CPU due to whitespace; disable for tighter packing and maximum throughput.
  • Depth increases natural content size and variety; for larger elements with less padding, raise --depth.
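
For a rough throughput number, pipe the output through pv (assumes the pv utility is installed):

json-kiln --size 1GB --seed 1 | pv > /dev/null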

Validation & tests

Recommended quick checks:

# Validate JSON shape
jq -e type < data.json > /dev/null

# Verify array length (count elements)
jq length < data.json

# Check file size (exactness)
stat -f%z data.json   # macOS
stat --format=%s data.json  # Linux
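
To assert exactness in a script, compare against the expected byte count (a sketch; it assumes json-kiln parses MB as 1024 * 1024 bytes, so adjust the arithmetic if the tool uses decimal units):

expected=$((500 * 1024 * 1024))
actual=$(stat --format=%s data.json)   # on macOS use: stat -f%z data.json
[ "$actual" -eq "$expected" ] || echo "size off by $((actual - expected)) bytes"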

CI suggestions:

  • Unit tests: id format, deterministic seed, pad logic, exact-size contract.
  • Integration: generate a big file (e.g., 2 GB) under an RSS cap check (e.g., < 200 MB); see the sketch below.
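
A possible shape for that integration check, using GNU time on Linux (the -v flag and its output format are GNU-time specific):

/usr/bin/time -v json-kiln --size 2GB --out /dev/null 2> time.log
rss_kb=$(awk '/Maximum resident set size/ {print $NF}' time.log)
[ "$rss_kb" -lt 204800 ] || { echo "peak RSS ${rss_kb} KB exceeds 200 MB cap" >&2; exit 1; }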

Limitations

  • Padding is implemented as a "pad" string field on elements. This is valid JSON and is easy to strip if needed (see the jq one-liner after this list).
  • EJSON _id.$oid is syntactic, not a real MongoDB ObjectId type. For test data it is sufficient.
  • Exact-size fitting uses binary search plus a small linear adjustment; under extreme constraints, or with --pretty, it may miss the exact size on intermediate elements. With --exact, final exactness is always enforced via the last element.
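
Stripping the padding with jq looks like this (note that jq loads the whole document into memory, so this suits small and medium files only):

jq 'map(del(.pad))' data.json > stripped.json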

FAQ

Q: How is json-kiln different from schema faker tools?

A: We focus on size and streaming, not on schema shape. Needing "exactly 500 MB" or "each item ≥ 200 MB" is our sweet spot.

Q: Can the "pad" part be reduced?

A: Yes. Increase --depth to make the content naturally heavier, or raise --minItemSize and split the budget via --items.

Q: Is it safe for CI?

A: Yes. The process never holds the entire array in memory, respects backpressure, and sends progress and log output to stderr, keeping stdout clean for data.


Contributing

  1. Fork & clone
  2. npm i
  3. npm run build
  4. npm run test
  5. PR with a short description of motivation

Please follow the existing code style and add tests for any public-facing changes.


License

MIT © You. See LICENSE.