census-csv-parser

v1.0.7

tool for parsing census csvs into .json objects

census-csv-parser

v 1.0.5

https://github.com/klm127/census-csv-parser

Description

census-csv-parser aims to ease the cleaning of csv data, with a focus on data gathered from census.gov, by providing utility functions and objects for rapidly manipulating csv files by operating on them as two dimensional arrays.

The usual goal is to convert .csv data into .json objects, whereby column or header rows become nested properties of those objects, for use in a variety of applications.
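As a rough illustration of what "operating on a csv as a two dimensional array" means, here is a minimal sketch that turns csv text into an array of rows. The helper name csvTextToRows is hypothetical and is not part of this package's API; the package's own util.csvArray may behave differently.

```javascript
// Hypothetical helper for illustration only; not census-csv-parser's API.
// Splits csv text into rows of cells, honoring double-quoted fields.
function csvTextToRows(text) {
  const rows = [];
  let row = [];
  let cell = '';
  let inQuotes = false;
  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (inQuotes) {
      if (ch === '"' && text[i + 1] === '"') {
        cell += '"'; // doubled quote inside a quoted field
        i++;
      } else if (ch === '"') {
        inQuotes = false; // closing quote
      } else {
        cell += ch;
      }
    } else if (ch === '"') {
      inQuotes = true; // opening quote
    } else if (ch === ',') {
      row.push(cell);
      cell = '';
    } else if (ch === '\n') {
      row.push(cell);
      rows.push(row);
      row = [];
      cell = '';
    } else if (ch !== '\r') {
      cell += ch;
    }
  }
  // Flush the last row when the text does not end with a newline.
  if (cell !== '' || row.length > 0) {
    row.push(cell);
    rows.push(row);
  }
  return rows;
}

const sample = [
  '"GEO_ID","NAME","S1201_C01_001E"',
  '"id","Geographic Area Name","Estimate!!Total!!Population 15 years and over"',
  '"0400000US01","Alabama","4004468"',
].join('\n');

const rows = csvTextToRows(sample);
console.log(rows[2][1]); // the "NAME" cell of the first data row
```

Once the data is in this shape, cleaning steps such as dropping a column or a row become ordinary array operations.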

Dependencies

node, npm

Documentation

This code aims to be thoroughly documented. If you are viewing this readme on NPM, navigate to the github repo to see the link.

Available here

To create the documentation from the GitHub repo, first run 'npm install' to install the theme dependency. Then execute, e.g., npm run document-minami for the minami themed documentation.

Running

  • Use Node.js
  • Create your project folder
  • Run npm init
  • Run npm install census-csv-parser
  • To access the parser, use const Parser = require('census-csv-parser/parser')
  • To access the utility functions, use const util = require('census-csv-parser/util')
  • To create a new parser object, use const parser = new Parser()
  • To call a util function, use, e.g., util.csvArray(mycsvtext)

Testing

Run npm run test to run unit tests

What it does

A section of geographic data downloaded from the census website might look like this:

| "GEO_ID" | "NAME" | "S1201_C01_001E" | "S1201_C01_001M" | "S1201_C01_002E" | "S1201_C01_002M" |
| --- | --- | --- | --- | --- | --- |
| "id" | "Geographic Area Name" | "Estimate!!Total!!Population 15 years and over" | "Margin of Error!!Total!!Population 15 years and over" | "Estimate!!Total!!Population 15 years and over!!AGE AND SEX!!Males 15 years and over" | "Margin of Error!!Total!!Population 15 years and over!!AGE AND SEX!!Males 15 years and over" |
| "0400000US01" | "Alabama" | "4004468" | "3955" | "1909410" | "4332" |
| "0400000US02" | "Alaska" | "578225" | "856" | "301975" | "1721" |
| "0400000US04" | "Arizona" | "5919085" | "1614" | "2926106" | "2405" |
| "0400000US05" | "Arkansas" | "2439812" | "2922" | "1181259" | "3972" |
| "0400000US06" | "California" | "32124112" | "4693" | "15865087" | "6043" |
| "0400000US08" | "Colorado" | "4720810" | "2634" | "2372402" | "4047" |
| "0400000US09" | "Connecticut" | "2975029" | "1595" | "1441236" | "2275" |

And we want the output to look like this:

{
    "Alabama": {
        "Total": {
            "Population 15 years and over": {
                "Population 15 years and over": 4004468,
                "AGE AND SEX": {
                    "Males 15 years and over": 1909410
                }
            }
        }
    },
    "Alaska": {
        "Total": {
            "Population 15 years and over": {
                "Population 15 years and over": 578225,
                "AGE AND SEX": {
                    "Males 15 years and over": 301975
                }
            }
        }
    },
    "Arizona": {
        "Total": {
            "Population 15 years and over": {
                "Population 15 years and over": 5919085,
                "AGE AND SEX": {
                    "Males 15 years and over": 2926106
                }
            }
        }
    },
    "Arkansas": {
        "Total": {
            "Population 15 years and over": {
                "Population 15 years and over": 2439812,
                "AGE AND SEX": {
                    "Males 15 years and over": 1181259
                }
            }
        }
    },
    "California": {
        "Total": {
            "Population 15 years and over": {
                "Population 15 years and over": 32124112,
                "AGE AND SEX": {
                    "Males 15 years and over": 15865087
                }
            }
        }
    },
    "Colorado": {
        "Total": {
            "Population 15 years and over": {
                "Population 15 years and over": 4720810,
                "AGE AND SEX": {
                    "Males 15 years and over": 2372402
                }
            }
        }
    },
    "Connecticut": {
        "Total": {
            "Population 15 years and over": {
                "Population 15 years and over": 2975029,
                "AGE AND SEX": {
                    "Males 15 years and over": 1441236
                }
            }
        }
    }
}
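Comparing the input and output above, the "!!"-delimited column labels become nested property paths, the leading "Estimate" segment is dropped, the "Margin of Error" columns are left out, and numeric strings become numbers. A minimal sketch of that cleaning step follows. The helper names labelToPath and toNumberIfNumeric are hypothetical, and this is only one way to do it; in the package itself these steps are performed with its own utility functions.

```javascript
// Hypothetical helpers for illustration only; not census-csv-parser's API.

// Turns a census column label into a property path, or returns null
// for columns that should be skipped (anything not starting with "Estimate").
function labelToPath(label) {
  const parts = label.split('!!');
  if (parts[0] !== 'Estimate') {
    return null;
  }
  return parts.slice(1);
}

// Converts a cell to a number when it holds a numeric string.
function toNumberIfNumeric(cell) {
  const n = Number(cell);
  return cell.trim() !== '' && Number.isFinite(n) ? n : cell;
}

const labels = [
  'Estimate!!Total!!Population 15 years and over',
  'Margin of Error!!Total!!Population 15 years and over',
  'Estimate!!Total!!Population 15 years and over!!AGE AND SEX!!Males 15 years and over',
];

const paths = labels.map(labelToPath).filter((p) => p !== null);
console.log(paths);
console.log(toNumberIfNumeric('4004468'));
```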

census-csv-parser accomplishes this by using utility functions to clean the data. Properties are nested in the output .json by defining a property column and a header row in the Parser object. The property column is an array of arrays, each sub-array describing a path of nested properties. The header row names the objects that those properties will be mapped onto. As Parser converts the data array to an object, it maps values from the property column onto objects created from the header row and places them all in a wrapper object.

This mapping, defined in the chain and chainMultiple functions in the util namespace, has some interesting features. Here is how it operates.

Chain procedure

Definitions

| Name | Definition | Example |
| --- | --- | --- |
| Property Array | An array corresponding to the "property" column, where each value is another array representing properties to be nested. | ["Total", "Population", "15 and Older"], ["Total", "Population", "15-20"], ["Total", "Population", "20-25"]... |
| Header Array | An array corresponding to the "header" row, where each value is the first property of the final json object. | ["Alabama","Arkansas",....] |
| Data Array | The data to be mapped | [ 100000, 150000, ....], [200000, 40000...] |
| Wrapper Object | The final object generated, stringable as a JSON | {"Alabama": {"Total": {"Population":{"15-20":100000, "20-25":150000}}}},{"Arkansas":...}....} |

  1. Get the property array, header array, and 2-d data array for the intersecting area.
  2. Create a wrapper object.
  3. Iterate through the header array.
    1. Create a new object in the wrapper object where the key is equal to the current header array iteration.
    2. Iterate through the property array.
      1. For each row of the property array, iterate through that sub-array.
        • Check the next value of the property array sub-array to see if it already exists in the object as a property.
          • If it does, check to see if it's a final value (not a sub-object).
            • If it's a final value, check whether there are more properties to iterate.
              • If there are no more properties to iterate, overwrite the existing value with the value in the data array at the intersection of the current property array value and current header array value. (duplicate property mapping)
              • If there are more properties to map, create a new object at that key and map the existing final value to a key with the same name as the parent object. This accounts for "total"-type values. Then continue the mapping process recursively with the new object and the unused portion of the properties sub-array.
            • If it's not a final value (it's an object), continue the mapping process recursively with that object and the unused portion of the properties array.
          • If the property does not exist yet in the object, check whether it is the final property.
            • If it is a final property, create that property in the object and set its value to the data point at the intersection of the property array and header array in the data array.
            • If it is not the final property, create a new object for that property and continue the mapping process recursively with the new object and the unused portion of the properties sub-array.
      2. When finished with a property sub-array, move to the next property sub-array.
    3. When finished with a data array column, move to the next header array value and repeat the process.
  4. When all data has been mapped into the wrapper object, return that object.
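
The steps above can be sketched in a few lines of JavaScript. This is an illustrative stand-in, not the package's actual chain or chainMultiple code: the names mapPath and chainAll are hypothetical, and it assumes data[i][j] holds the value for property path i and header j. The case where a path ends on a key that already holds an object is not covered by the steps above, so the sketch stores the value under a same-named key inside that object, mirroring the "total" handling.

```javascript
// Illustrative stand-in only; not census-csv-parser's util.chain.

function isObject(value) {
  return value !== null && typeof value === 'object';
}

// Maps one value into `target` along `path`, following the steps above.
function mapPath(target, path, value) {
  if (path.length === 0) return;
  const [key, ...rest] = path;
  const isLast = rest.length === 0;
  const exists = Object.prototype.hasOwnProperty.call(target, key);

  if (!exists) {
    if (isLast) {
      target[key] = value; // final property: store the data point
    } else {
      target[key] = {}; // intermediate property: create and descend
      mapPath(target[key], rest, value);
    }
    return;
  }

  const existing = target[key];
  if (isObject(existing)) {
    if (isLast) {
      // Assumption: not specified in the steps above.
      existing[key] = value;
    } else {
      mapPath(existing, rest, value);
    }
  } else if (isLast) {
    target[key] = value; // duplicate property mapping: overwrite
  } else {
    // Existing "total"-type value: move it under a same-named key.
    target[key] = { [key]: existing };
    mapPath(target[key], rest, value);
  }
}

// Builds the wrapper object from the property, header, and data arrays.
function chainAll(propertyPaths, headers, data) {
  const wrapper = {};
  headers.forEach((header, col) => {
    wrapper[header] = {};
    propertyPaths.forEach((path, row) => {
      mapPath(wrapper[header], path, data[row][col]);
    });
  });
  return wrapper;
}

const propertyPaths = [
  ['Total', 'Population 15 years and over'],
  ['Total', 'Population 15 years and over', 'AGE AND SEX', 'Males 15 years and over'],
];
const headers = ['Alabama', 'Alaska'];
const data = [
  [4004468, 578225],
  [1909410, 301975],
];

const result = chainAll(propertyPaths, headers, data);
console.log(JSON.stringify(result, null, 4));
```

With the Alabama and Alaska figures from the sample table, the first path stores the total as a plain value, and the second path then moves it under a same-named key, which is why "Population 15 years and over" appears twice in the nested output.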

Other

See the Example namespace for more example usage and the Tutorial for a stepwise breakdown.