
stream-data-converter

v0.1.4


Node.js stream transforms for processing large data files

Readme

This library provides tools for stream-processing large, "iterable" data files (e.g. text, CSV, DSV, JSONP, array-like JSON, GeoJSON, Shapefile) that cannot be loaded into memory all at once.

It is best for tasks that can be performed independently on each iteration, such as:

  • Inspect data and generate column-wise statistics
  • Output a subset of the input data based on a filter
  • Convert input data to another format
  • Download and save data from remote API in batches

Install

npm install stream-data-converter

Usage

The core exports implement Node.js' stream.Transform class.

const fs = require('fs');
const {JSONReader, Transform, CSVWriter} = require('stream-data-converter');

/*
 * input.geojson
   {"type": "FeatureCollection", "features": [
     {"type":"Feature","properties":{"COUNT":1,"DATESTR":"2020-01-01 16:37:03Z"},"geometry":{"type":"Point","coordinates":[-122.45,37.78]}},
     ...
   ]}

 * output.tsv
   count  timestamp longitude latitude
   1  1577896623  -122.45 37.78
   ...
 */

fs.createReadStream('path/to/input.geojson')
  .pipe(new JSONReader())
  .pipe(new Transform(processFeature))
  .pipe(new CSVWriter({delimiter: '\t'}))
  .pipe(fs.createWriteStream('path/to/output.tsv'));

function processFeature(feature, index) {
  if (feature.properties.COUNT === 0) {
    return null; // discard
  }
  return {
    count: feature.properties.COUNT,
    timestamp: Date.parse(feature.properties.DATESTR) / 1000,
    longitude: feature.geometry.coordinates[0],
    latitude: feature.geometry.coordinates[1]
  };
}

API Reference

Readers

The reader classes take a text-based input and output a semantically parsed iterable down the stream.

LineReader

Outputs an iterable where each item is a line of text.

const {LineReader} = require('stream-data-converter');
const reader = new LineReader(options);

Options

  • id (String, optional): identifier of this reader instance.
  • delimiter (String, optional): delimiter (line break) of the input. Default '\n'.
  • skipEmptyLines (Boolean, optional): if true, empty lines are omitted. Default true.
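The defaults above can be illustrated with a plain-JavaScript sketch. This mimics how a LineReader splits a chunk of text; it is illustrative only, not the library's implementation:

```javascript
// Illustrative only: splitting behavior with the LineReader defaults
// (delimiter '\n', skipEmptyLines true).
function splitLines(text, {delimiter = '\n', skipEmptyLines = true} = {}) {
  const lines = text.split(delimiter);
  return skipEmptyLines ? lines.filter(line => line.length > 0) : lines;
}

console.log(splitLines('a\n\nb\nc\n'));
// → ['a', 'b', 'c']
```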

CSVReader

Parses csv/dsv formatted input and outputs an iterable where each item is a JSON object describing a row.

const {CSVReader} = require('stream-data-converter');
const reader = new CSVReader(options);

Options

  • id (String, optional): identifier of this reader instance.
  • All options that are supported by Papaparse.parse. The following defaults are used instead of the original default config:
    • header: true
    • dynamicTyping: true
    • skipEmptyLines: true
    • newline: '\n'
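To show what the header and dynamicTyping defaults mean in practice, here is a hand-rolled sketch of the equivalent parsing (illustrative only; the library delegates the real work to Papaparse):

```javascript
// Illustrative only: with header: true, the first line names the fields;
// with dynamicTyping: true, numeric strings become numbers.
const input = 'id,name,score\n1,alice,9.5\n2,bob,7\n';

const [headerLine, ...rows] = input.trim().split('\n');
const fields = headerLine.split(',');
const records = rows.map(row =>
  Object.fromEntries(row.split(',').map((cell, i) => {
    const n = Number(cell);
    return [fields[i], Number.isNaN(n) ? cell : n]; // dynamicTyping
  }))
);

console.log(records);
// → [{id: 1, name: 'alice', score: 9.5}, {id: 2, name: 'bob', score: 7}]
```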

JSONReader

Parses JSON input and outputs an iterable where each item is an element in the first object array (array of objects, in the form of [{}]).

If the input JSON is an array, then each iteration returns an item in this array.

If the input JSON is a nested object, the reader looks for the first object array it encounters and returns its elements one per iteration. For example, a typical GeoJSON has the shape {type: 'FeatureCollection', features: [...]}, and this reader will iterate through each feature in the features array.

const {JSONReader} = require('stream-data-converter');
const reader = new JSONReader(options);

Options:

  • id (String, optional): identifier of this reader instance.
  • metadata (Boolean, optional): if false (default), just the array element itself is returned. If true, the element is wrapped in an object in the shape of {type, data}:
    • type: 'arrayitem': data is an element of the object array
    • type: 'container': data is the outer JSON object, with the iterated object array empty.
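The "first object array" lookup described above can be sketched as a depth-first search over the parsed JSON. This is illustrative only, not the library's actual implementation:

```javascript
// Illustrative only: locate the first array of objects in a nested
// JSON value, in document order.
function findFirstObjectArray(value) {
  if (Array.isArray(value)) {
    if (value.length > 0 && typeof value[0] === 'object') return value;
    return null;
  }
  if (value && typeof value === 'object') {
    for (const key of Object.keys(value)) {
      const found = findFirstObjectArray(value[key]);
      if (found) return found;
    }
  }
  return null;
}

const geojson = {
  type: 'FeatureCollection',
  features: [{type: 'Feature', properties: {id: 1}}]
};
console.log(findFirstObjectArray(geojson));
// → the features array; each feature becomes one iteration
```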

Transform

Transforms an iterable from upstream to another iterable. Usually consumes the output of a Reader.

For each incoming iteration, Transform calls a callback function. This callback is expected to return one of the following:

  • null or undefined: ignore this item
  • Buffer or Uint8Array: push directly to the output iterable
  • Object or string: serialize and push a single item to the output iterable
  • Array: push multiple items to the output iterable

const {Transform} = require('stream-data-converter');
const transform = new Transform(options);

Options:

  • If options is a function, it is used as the iteration callback.
  • If options is an object, it can contain the following fields:
    • transform (Function): the iteration callback.
    • id (String, optional): identifier of this transform instance.
    • inputType (String, optional): 'json' or 'text'. Forces the incoming item to be parsed as JSON or passed through as text. Default 'auto'.
    • maxConcurrency (Number, optional): if transform is an async function, use this option to limit how many can be running concurrently.
    • minTime (Number, optional): if transform is an async function, use this option to throttle the calls. Default 0 (no throttling).
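The four callback return types listed above can be exercised in one sketch. The record fields used here (skip, raw, children, value) are hypothetical, chosen only to trigger each branch:

```javascript
// Illustrative callback showing each return type a Transform accepts.
function transformRecord(record, index) {
  if (record.skip) return null;                   // null/undefined: ignore this item
  if (record.raw) return Buffer.from(record.raw); // Buffer: push bytes directly
  if (record.children) return record.children;    // Array: push multiple items
  return {index, value: record.value};            // Object: push a single item
}

console.log(transformRecord({value: 42}, 0));
// → {index: 0, value: 42}
console.log(transformRecord({children: [{a: 1}, {a: 2}]}, 1));
// → [{a: 1}, {a: 2}]
```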

Writers

Writers consume an iterable and output a formatted text stream that can be written to a file.

LineWriter

Writes each item to a line of text.

const {LineWriter} = require('stream-data-converter');
const writer = new LineWriter(options);

Options:

  • id (String, optional): identifier of this writer instance.
  • delimiter (String, optional): delimiter (line break) of the output. Default '\n'.
  • endEmptyLine (Boolean, optional): if true, always append an empty line at the end. Default true.
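A quick sketch of what these two options mean for the serialized output (illustrative only):

```javascript
// Illustrative only: LineWriter joins items with the delimiter and,
// with endEmptyLine true, terminates the output with a final delimiter.
function writeLines(items, {delimiter = '\n', endEmptyLine = true} = {}) {
  const text = items.join(delimiter);
  return endEmptyLine ? text + delimiter : text;
}

console.log(JSON.stringify(writeLines(['a', 'b', 'c'])));
// → "a\nb\nc\n"
```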

CSVWriter

Writes each item to a row in csv/dsv formatted text.

const {CSVWriter} = require('stream-data-converter');
const writer = new CSVWriter(options);

Options:

  • id (String, optional): identifier of this writer instance.
  • All options that are supported by Papaparse.unparse. The following defaults are used instead of the original default config:
    • newline: '\n'
  • batchSize (Number, optional): the number of items to encode in a batch. Default 100.

JSONWriter

Writes each item to an array element in JSON-formatted text.

const {JSONWriter} = require('stream-data-converter');
const writer = new JSONWriter(options);

Options:

  • id (String, optional): identifier of this writer instance.
  • container (Object, optional): the container object. Default [].
  • path (String, optional): path to the parent array in the container object, with keys joined by '.'. Default null.
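How a container plus a dot-joined path determine where items land can be sketched as follows. The path walk below is an assumption about the intended semantics, not the library's implementation:

```javascript
// Illustrative only: copy the container template, walk the dot path to
// the target array, and append the items there.
function buildOutput(items, {container = [], path = null} = {}) {
  const out = JSON.parse(JSON.stringify(container)); // copy the template
  const target = path
    ? path.split('.').reduce((obj, key) => obj[key], out)
    : out;
  target.push(...items);
  return out;
}

console.log(buildOutput([{id: 1}], {
  container: {result: {rows: []}},
  path: 'result.rows'
}));
// → {result: {rows: [{id: 1}]}}
```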

GeoJSONWriter

Writes each item as a feature in GeoJSON FeatureCollection.

const {GeoJSONWriter} = require('stream-data-converter');
const writer = new GeoJSONWriter();
// This is equivalent to
const writer = new JSONWriter({
  container: {type: 'FeatureCollection', features: []},
  path: 'features'
});

Helper Utilities

logProgress

Logs streaming progress to the console. Useful when the process takes a long time to run.


const {JSONReader, Transform, GeoJSONWriter, logProgress} = require('stream-data-converter');

const reader = new JSONReader();
// Only keep features in California
const transform = new Transform(f => f.properties.state === 'CA' ? f : null);
const writer = new GeoJSONWriter();

logProgress(reader, transform, writer);

fs.createReadStream('path/to/input.geojson')
  .pipe(reader)
  .pipe(transform)
  .pipe(writer)
  .pipe(fs.createWriteStream('path/to/output.geojson'));

// The console will print a growing counter that looks like this:
// jsonReader: 9999 | transform: 9999 | jsonWriter: 7213