
@streamparser/json-whatwg


Fast, dependency-free library to parse a JSON stream using UTF-8 encoding in Node.js, Deno or any modern browser. Fully compliant with the JSON spec and JSON.parse(...).

tldr;

import { JSONParser } from '@streamparser/json-whatwg';

const inputStream = new ReadableStream({
  async start(controller) {
    controller.enqueue('{ "test": ["a"] }');
    controller.close();
  },
});

const parser = new JSONParser();

// Pipe the parsed values into another stream:
//   await inputStream.pipeThrough(parser).pipeTo(destinationStream);
// Or manually read the values:
const reader = inputStream.pipeThrough(parser).getReader();
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  processValue(value);
  // Three values will be emitted:
  // "a"
  // ["a"]
  // { test: ["a"] }
}

@streamparser/json ecosystem

There are multiple flavours of @streamparser:

  • @streamparser/json: the base library, with a plain write/callback API.
  • @streamparser/json-whatwg: WHATWG stream wrappers around the base library (this package).
  • @streamparser/json-node: Node.js stream wrappers around the base library.

Dependencies / Polyfilling

@streamparser/json requires a few ES6 classes:

  • Uint8Array
  • TextEncoder
  • TextDecoder

If you are targeting browsers or systems in which these might be missing, you need to polyfill them.
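For example, on an older Node.js runtime you could expose the built-in `util` implementations globally. This is a minimal sketch, not part of this library; in a browser you would load a polyfill bundle instead:

// Expose TextEncoder/TextDecoder globally on runtimes where they are
// available in `util` but not in the global scope.
import { TextEncoder, TextDecoder } from 'util';

if (typeof globalThis.TextEncoder === 'undefined') {
  globalThis.TextEncoder = TextEncoder;
}
if (typeof globalThis.TextDecoder === 'undefined') {
  globalThis.TextDecoder = TextDecoder;
}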

Components

Tokenizer

A JSON-compliant tokenizer that parses a UTF-8 stream into JSON tokens.

import { Tokenizer } from '@streamparser/json-whatwg';

const tokenizer = new Tokenizer(opts, writableStrategy, readableStrategy);

Writable and readable strategies are standard WHATWG Stream settings (see MDN).

The available options are:

{
  stringBufferSize: <number>, // set to 0 to disable buffering. Min valid value is 4.
  numberBufferSize: <number>, // set to 0 to disable buffering.
  separator: <string>, // separator between objects. For example `\n` for NDJSON.
  emitPartialTokens: <boolean> // whether to emit tokens mid-parsing.
}

If buffer sizes are set to anything other than zero, instead of using a string to append the data as it comes in, the data is buffered using a TypedArray. A reasonable size could be 64 * 1024 (64 KB).
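A minimal sketch of running the tokenizer on its own (the input and the token logging are illustrative; the exact token shape is defined by the library):

import { Tokenizer } from '@streamparser/json-whatwg';

const tokenizer = new Tokenizer({ separator: '\n' });

const source = new ReadableStream({
  start(controller) {
    controller.enqueue('{"a":1}\n');
    controller.close();
  },
});

// Each read yields one JSON token (punctuation, strings, numbers, ...).
const reader = source.pipeThrough(tokenizer).getReader();
while (true) {
  const { done, value: token } = await reader.read();
  if (done) break;
  console.log(token);
}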

Buffering

When parsing strings or numbers, the parser needs to gather the data in-memory until the whole value is ready.

Strings are immutable in JavaScript, so every string operation creates a new string. The V8 engine, behind Node, Deno and most modern browsers, performs many different types of optimizations. One of these optimizations is to over-allocate memory when it detects many string concatenations. This significantly increases memory consumption and can easily exhaust your memory when parsing JSON containing very large strings or numbers. For those cases, the parser can buffer the characters using a TypedArray. This requires encoding/decoding from/to the buffer into an actual string once the value is ready. This is done using the TextEncoder and TextDecoder APIs. Unfortunately, these APIs create significant overhead when the strings are small, so they should be used only when strictly necessary.
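Enabling the buffer is just a constructor option; a minimal sketch using the 64 KB figure suggested above:

import { JSONParser } from '@streamparser/json-whatwg';

// Gather large string values in a fixed TypedArray instead of concatenating
// JavaScript strings while the value is still being parsed.
const parser = new JSONParser({ stringBufferSize: 64 * 1024 });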

TokenParser

A token parser that processes JSON tokens as emitted by the Tokenizer and emits JSON values/objects.

import { TokenParser } from '@streamparser/json-whatwg';

const tokenParser = new TokenParser(opts, writableStrategy, readableStrategy);

Writable and readable strategies are standard WHATWG Stream settings (see MDN).

The available options are:

{
  paths: <string[]>,
  keepStack: <boolean>, // whether to keep all the properties in the stack
  separator: <string>, // separator between objects. For example `\n` for NDJSON. If left empty or set to undefined, the token parser will end after parsing the first object. To parse multiple objects without any delimiter, set it to the empty string `''`.
  emitPartialValues: <boolean>, // whether to emit values mid-parsing.
}
  • paths: Array of paths to emit. Defaults to undefined, which emits everything. The paths are intended to support JSONPath, although at the moment only the root object selector ($) and subproperty selectors, including wildcards ($.a, $.*, $.a.b, $.*.b, etc.), are supported. See the sketch after this list.
  • keepStack: Whether to keep full objects on the stack even if they won't be emitted. Defaults to true. When set to false, properties are not preserved in the parent object if some ancestor will be emitted. This means that the parent object emitted alongside each value will be empty, which doesn't reflect the actual data, but it is more memory-efficient.
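As a sketch of how `paths` narrows what gets emitted (the input literal is illustrative):

import { JSONParser } from '@streamparser/json-whatwg';

// Only values matching `$.a` are emitted; `b` never reaches the consumer.
const parser = new JSONParser({ paths: ['$.a'] });

const source = new ReadableStream({
  start(controller) {
    controller.enqueue('{ "a": [1, 2], "b": 3 }');
    controller.close();
  },
});

const reader = source.pipeThrough(parser).getReader();
while (true) {
  const { done, value: parsedElementInfo } = await reader.read();
  if (done) break;
  console.log(parsedElementInfo.value); // [1, 2]
}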

JSONParser

The full-blown JSON parser. It basically chains a Tokenizer and a TokenParser.

import { JSONParser } from '@streamparser/json-whatwg';

const parser = new JSONParser();

Usage

You can use both components independently, chaining them like any other WHATWG transform streams:

const tokenizer = new Tokenizer(opts);
const tokenParser = new TokenParser();

const jsonStream = inputStream.pipeThrough(tokenizer).pipeThrough(tokenParser);

You can subscribe to the resulting data by reading from the parser's output stream:

import { JSONParser } from '@streamparser/json-whatwg';

const parser = new JSONParser({ stringBufferSize: undefined, paths: ['$'] });

const inputStream = new ReadableStream({
  start(controller) {
    // The document can be enqueued in a single chunk:
    //   controller.enqueue('"Hello world!"');
    // Or split across several chunks:
    controller.enqueue('"');
    controller.enqueue('Hello');
    controller.enqueue(' ');
    controller.enqueue('world!');
    controller.enqueue('"');
    controller.close();
  },
});

const reader = inputStream.pipeThrough(parser).getReader();
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  console.log(value); // logs "Hello world!"
}

Parsing errors are propagated through the stream, so they are thrown when reading from it. After an error, the parser can't continue parsing.

import { JSONParser } from '@streamparser/json-whatwg';

const parser = new JSONParser({ stringBufferSize: undefined });

const inputStream = new ReadableStream({
  start(controller) {
    controller.enqueue('"""'); // malformed JSON
    controller.close();
  },
});

try {
  const reader = inputStream.pipeThrough(parser).getReader();
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    console.log(value);
  }
} catch (err) {
  console.log(err); // logs the parsing error
}

Examples

Stream-parsing a fetch request returning a JSON stream

Imagine an endpoint that sends a large number of JSON objects one after the other ({"id":1}{"id":2}{"id":3}...).

  import { JSONParser} from '@streamparser/json-whatwg';

  const parser = new JSONParser({ separator: '' }); // objects arrive back-to-back, with no delimiter

  const response = await fetch('http://example.com/');
  const reader = response.body.pipeThrough(parser).getReader();
  while(true) {
    const { done, value } = await reader.read();
    if (done) break;
    // TODO process element
  }

Stream-parsing a fetch request returning a JSON array

Imagine an endpoint that sends a single large JSON array ([{"id":1},{"id":2},{"id":3},...]).

  import { JSONParser } from '@streamparser/json-whatwg';

  const parser = new JSONParser({ stringBufferSize: undefined, paths: ['$.*'], keepStack: false });

  const response = await fetch('http://example.com/');

  const reader = response.body.pipeThrough(parser).getReader();
  while(true) {
    const { done, value: parsedElementInfo } = await reader.read();
    if (done) break;

    const { value, key, parent, stack } = parsedElementInfo;
    // TODO process element
  }

Stream-parsing a fetch request returning a very long string and getting previews of the string

Imagine an endpoint that sends a very long JSON string ("Once upon a midnight <...>").

  import { JSONParser } from '@streamparser/json-whatwg';

  const parser = new JSONParser({ emitPartialValues: true, paths: ['$'] }); // partial previews require emitPartialValues

  const response = await fetch('http://example.com/');

  const reader = response.body.pipeThrough(parser).getReader();
  while(true) {
    const { done, value: parsedElementInfo } = await reader.read();
    if (done) break;

    const { value, key, parent, stack, partial } = parsedElementInfo;
    if (partial) {
      console.log(`Parsing value: ${value}... (still parsing)`);
    } else {
      console.log(`Value parsed: ${value}`);
    }
  }

License

See LICENSE.md.