@veksa/big-json
v1.0.4
Published
A stream based implementation of JSON.parse and JSON.stringify for big data
Downloads
35
Maintainers
Readme
@veksa/big-json
Optimized fork of https://github.com/DonutEspresso/big-json
A stream based implementation of JSON.parse and JSON.stringify for big data
There exist many stream based implementations of JSON parsing or stringifying for large data sets. These implementations typically target time series data, new line delimited data or other array-like data, e.g., logging records or other continuous flowing data.
This module hopes to fill a gap in the ecosystem: parsing large JSON objects that are just really big objects. With large in-memory objects, it is possible to run up against the V8 string length limitation, which is currently limited to 512MB. Thus, if your large object has enough keys or values, it is possible to exceed the string length limit when calling JSON.stringify.
Similarly, when retrieving stored JSON from disk or over the network, if the JSON stringified representation of the object exceeds the string length limit, the process will throw when attempting to convert the Buffer into a string.
The only way to work with such large objects is to use a streaming
implementation of both JSON.parse and JSON.stringify. This module does just
that by normalizing the APIs for different modules that have previously
published, combining both parse and stringify functions into a single module.
These underlying modules are subject to change at anytime.
The major caveat is that the reconstructed POJO must be able to fit in memory. If the reconstructed POJO cannot be stored in memory, then it may be time to reconsider the way these large objects are being transported and processed.
This module currently uses JSONStream for parsing, and json-stream-stringify for stringification.
Getting Started
Install the module with: npm install @veksa/big-json
Usage
To parse a big JSON coming from an external source:
import fs from 'fs';
import {createParseStream} from '@veksa/big-json';
const readStream = fs.createReadStream('big.json');
const parseStream = createParseStream();
parseStream.on('data', function(data) {
// => receive reconstructed data
});
readStream.pipe(parseStream);To stringify JSON:
import {createStringifyStream} from '@veksa/big-json';
const stringifyStream = createStringifyStream({
body: BIG_DATA
});
stringifyStream.on('data', function(strChunk) {
// => BIG_DATA will be sent out in JSON chunks as the object is traversed
});Promise-based API for parsing:
import {parse} from '@veksa/big-json';
const result = await parse({
body: jsonString
});
// => result contains the parsed JSON object/arrayPromise-based API for stringifying:
import {stringify} from '@veksa/big-json';
const jsonString = await stringify({
body: bigObject
});
// => jsonString contains the stringified representation of bigObjectAPI
createParseStream()
Create a JSON.parse that uses a stream interface. The underlying implementation is handled by JSONStream. This is merely a thin wrapper for convenience that handles the reconstruction/accumulation of each individually parsed field.
The advantage of this approach is that by also using a streams interface, any JSON parsing or stringification of large objects won't block the CPU.
Returns: Stream - A stream interface for parsing JSON
createStringifyStream(params)
Create a JSON.stringify readable stream.
Parameters:
paramsObject - An options objectparams.bodyObject - The JS object to JSON.stringifyparams.spacenumber (optional) - Number of spaces for indentationparams.bufferSizenumber (optional) - Size of the internal buffer (default: 512)
Returns: Stream - A readable stream with JSON string chunks
parse(params)
Stream based JSON.parse with promise-based API.
Parameters:
paramsObject - Options to pass to parse streamparams.bodyString|Buffer - String or buffer to parse
Returns: Promise<Object|Array> - The parsed JSON
stringify(params)
Stream based JSON.stringify with promise-based API.
Parameters:
paramsObject - Options to pass to stringify streamparams.bodyObject - The object to be stringifiedparams.spacenumber (optional) - Number of spaces for indentation
Returns: Promise<string> - The stringified JSON
Contributing
This project welcomes contributions and suggestions. To contribute:
- Fork and clone the repository
- Install dependencies:
yarn install - Make your changes
- Run linting:
yarn lint - Run tests:
yarn test - Submit a pull request
Available Scripts
yarn clean- Remove the dist directoryyarn build- Clean and build the projectyarn compile- Run TypeScript compiler without emitting filesyarn lint- Run ESLintyarn test- Run tests
