crack-json

v1.3.0

Extracts all JSON objects from an arbitrary text document.

Downloads

45,626

Readme

crack-json 🥊

Extracts all JSON objects from an arbitrary text document.

Use case

The primary use case is extracting structured data from unstructured documents. For example, when scraping websites, it is common for HTML to embed JSON or JSON-like data structures:

<script>
$(document).on('BookingApp:SeatingPlan:Ready', () => {
  $(document).trigger('BookingApp:StartSeatingPlanOnly', {
    "sessionId": "438a8373-5fab-4d36-ac92-053ae2d04e9c"
  });
});
</script>

crack-json is intended to be used as follows: the scraper first narrows the document down to the HTML containing the target JSON data, and crack-json then extracts all JSON-like objects from it. In the example above, to extract the sessionId it is sufficient to get the innerHTML of the script tag, use crack-json to extract all JSON-like objects, and search for the matching object, e.g.

const session = extractJson(document.querySelector('script').innerHTML)
  .find((maybeTargetSubject) => {
    return maybeTargetSubject.sessionId;
  });

session;
// {
//   "sessionId": "438a8373-5fab-4d36-ac92-053ae2d04e9c"
// }

Implementation

crack-json iterates through the input text, searching for characters that indicate the start of a JSON object, array, or string. For each one, it attempts to match the closing character and parse the resulting string. It continues through the document this way until it has found all text entities that can be parsed as JSON.
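The scan-and-parse idea described above can be sketched in a few lines. This is a simplified illustration, not crack-json's actual source: the name scanForJson and the exact matching strategy (try each possible closing character in turn, skip past a successful match) are assumptions made for the example.

```javascript
// Simplified sketch of the scan-and-parse idea; NOT crack-json's actual source.
// `scanForJson` is a hypothetical name used only for this illustration.
const CLOSERS = new Map([
  ['{', '}'],
  ['[', ']'],
  ['"', '"'],
]);

const scanForJson = (text) => {
  const found = [];
  let offset = 0;

  while (offset < text.length) {
    const closer = CLOSERS.get(text[offset]);

    // Not an opening character: move on to the next one.
    if (closer === undefined) {
      offset += 1;
      continue;
    }

    let matched = false;
    let end = text.indexOf(closer, offset + 1);

    // Try each possible closing character until a candidate parses.
    while (end !== -1) {
      try {
        found.push(JSON.parse(text.slice(offset, end + 1)));
        matched = true;
        break;
      } catch {
        end = text.indexOf(closer, end + 1);
      }
    }

    // Skip past a successful match; otherwise advance by one character.
    offset = matched ? end + 1 : offset + 1;
  }

  return found;
};

const result = scanForJson('a {"foo": {"n": 1}} b [1, 2] c "quux" d');

console.log(result);
```

Note that the first closing brace does not yield valid JSON for the nested object, so the sketch keeps extending the candidate until JSON.parse succeeds. This brute-force approach can be slow on large inputs, which is one reason to narrow the document down before extracting.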

API

crack-json exports a single function: extractJson.

import {
  extractJson
} from 'crack-json';

extractJson API

/**
 * @property filter Used to filter out strings before attempting to decode them.
 * @property parser A parser used to extract JSON from the suspected strings. Default: `JSON.parse`.
 */
type ExtractJsonConfigurationType = {|
  +filter?: (input: string) => boolean,
  +parser?: (input: string) => any,
|};

type ExtractJsonType = (subject: string, configuration?: ExtractJsonConfigurationType) => any;

extractJson: ExtractJsonType;
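As an illustration of the configuration shape, here is a minimal sketch of a filter and a custom parser that could be supplied together. The helper names looksLikeObject and parseObjectOnly are hypothetical and not part of crack-json. The sketch deliberately does not import the package, so it only demonstrates the two callbacks in isolation.

```javascript
// Hypothetical helpers for illustration; neither is part of crack-json itself.

// Cheap pre-check: only consider candidates that look like JSON objects.
const looksLikeObject = (input) => {
  return input.startsWith('{') && input.endsWith('}');
};

// Stricter parser: delegates to JSON.parse, then rejects anything that is
// not a plain object (arrays, strings, null) by throwing.
const parseObjectOnly = (input) => {
  const value = JSON.parse(input);

  if (value === null || typeof value !== 'object' || Array.isArray(value)) {
    throw new TypeError('Expected a JSON object.');
  }

  return value;
};

// Shape matches ExtractJsonConfigurationType: both properties are optional.
const configuration = {
  filter: looksLikeObject,
  parser: parseObjectOnly,
};

console.log(configuration.filter('{"foo": "bar"}'));
console.log(configuration.parser('{"foo": "bar"}'));
```

With crack-json installed, this object would be passed as the second argument: extractJson(payload, configuration). This assumes that a parser which throws causes the candidate to be skipped, as with the default JSON.parse; that behaviour is inferred from the Implementation section rather than confirmed.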

Usage

import {
  extractJson
} from 'crack-json';

const payload = `
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Phasellus ultricies laoreet malesuada. In feugiat augue non tristique pharetra. Duis nisl odio, vulputate maximus suscipit sit amet, ultrices vel lacus.

{"foo": "bar"}

Suspendisse volutpat risus id nibh lacinia, in placerat urna luctus. Phasellus condimentum nec ipsum ut tincidunt. Nullam aliquam euismod ante, vitae accumsan leo egestas a. Aliquam sed lacus nisl. Pellentesque nec hendrerit sem.

[{"baz": "qux"}]

Phasellus iaculis dui nec purus imperdiet placerat non sit amet odio. Donec pretium, arcu ac suscipit imperdiet, tellus orci convallis leo, non laoreet tortor lectus at dolor. Aenean tellus diam, imperdiet nec eleifend at, fermentum sit amet tellus. Vestibulum id purus ac mauris eleifend iaculis.

"quux"

Vestibulum sit amet quam tellus. Nulla facilisi.

`;

console.log(extractJson(payload));

Output:

[
  {
    foo: 'bar'
  },
  [
    {
      baz: 'qux'
    }
  ],
  'quux'
]

Filtering out matches

You can use filter to exclude candidate strings, based on an arbitrary condition, before they are parsed. This improves performance and limits the output to the objects you want, e.g.

import {
  extractJson
} from 'crack-json';

const payload = `
  <script>
  const foo = {
    "cinemaId": "1"
  };
  const bar = {
    "venueId": "1"
  };
  const baz = {
    "userId": "1"
  };
  </script>
`;

console.log(extractJson(payload, {
  filter: (input) => {
    return input.includes('userId')
  },
}));

Output:

[
  {
    userId: '1',
  },
]