
elastic-indexer

v1.0.0


msa-elastic-indexer

Installation

Install the package.

npm install elastic-indexer

Run Kafka

First, start ZooKeeper.

bin/zookeeper-server-start.sh config/zookeeper.properties 

Then start Kafka.

bin/kafka-server-start.sh config/server.properties

Parser

Require the module (the examples below assume the file is named app.js).

const oodebe = require('elastic-indexer');

Create the parser configuration. See the example below for reference.

let consumerConfig = {
  "source": {
    "npm": {
      parser: {
        'description': {
          html: 'body > div.container.content > div.content-column > p',
          method: 'html'
        },
        'github': {
          html: 'body > div.container.content > div.sidebar > ul:nth-child(3) > li:nth-child(3) > a',
          method: 'html'
        }
      },
      partition: 0,
      offset: 0,
      readFromLastOffset: false
    }
  }
};

Each message in Kafka is associated with a unique numeric ID called an offset. If readFromLastOffset is set to true, the consumer resumes reading from the last offset it has read; if set to false, it starts reading from 0 or from the number specified in the offset key.
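
For example, a consumer that should resume from wherever it last stopped, rather than replaying from the beginning, could be configured like this (a minimal sketch based on the consumerConfig shape above; only the offset-related keys differ):

let resumingConsumerConfig = {
  "source": {
    "npm": {
      parser: {
        'description': {
          html: 'body > div.container.content > div.content-column > p',
          method: 'html'
        }
      },
      partition: 0,
      // per the description above, offset is only used when readFromLastOffset is false
      offset: 0,
      readFromLastOffset: true
    }
  }
};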

Initialize the Parser.

let parser = new oodebe.Parser();
parser.initializeConsumer(consumerConfig);

Create the indexer configuration. See the example below for reference.

let indexerConfig = {
  "source": {
    "npm": {
      "indexer" : {
        index: 'npmjs',
        type: 'repos',
        fields: {
          doc: 'doc'
        }
      }
    }
  },
  "elastic" : {
    host: "localhost:9200"
  }
};

Initialize the indexer.

let indexer = new oodebe.Indexer();
indexer.initializeIndexer(indexerConfig);

Parsing and Indexing

While reading messages from Kafka, a message event is emitted for each message received.

parser.on('message',function(topic,html,$) {
  // Parse the HTML
  let parseData = parser.parseHTML(topic,$);
  // Index it in ElasticSearch
  indexer.index(topic,html,parseData);
});

Events

The Parser and Indexer emit the following events.

parser.on('error', function(topic, error) {
  // handle parser/consumer errors for a topic
});

parser.on('offsetOutOfRange', function(error) {
  // handle the consumer's offsetOutOfRange error
});

indexer.on('error', function(error) {
  // handle indexing errors
});

indexer.on('success', function() {
  // called after a document has been indexed successfully
});

Finding the HTML selector path

One of the easiest ways is to use the Chrome developer tools. For example, if you want to crawl the GitHub link of every npm package, follow the steps below; a sketch showing where the copied selector goes in the parser configuration appears after the list.

  • Visit any npm package page and open the Chrome inspector.
  • Traverse to the target HTML element.
  • Right-click the element, choose Copy, then Copy selector.

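Once copied, the selector is dropped into the parser section of the consumer configuration as a new field. The sketch below assumes a hypothetical 'license' field; the selector string is illustrative, not copied from a real page:

let consumerConfigWithLicense = {
  "source": {
    "npm": {
      parser: {
        // the existing 'description' and 'github' fields stay as before
        'license': {
          // selector pasted from "Copy selector" in the Chrome inspector
          html: 'body > div.container.content > div.sidebar > ul:nth-child(3) > li:nth-child(2) > a',
          method: 'html'
        }
      },
      partition: 0,
      offset: 0,
      readFromLastOffset: false
    }
  }
};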

Complete code

const oodebe = require('elastic-indexer');
let consumerConfig = {
  "source": {
    "npm": {
      parser: {
        'description': {
          html: 'body > div.container.content > div.content-column > p',
          method: 'html'
        },
        'github': {
          html: 'body > div.container.content > div.sidebar > ul:nth-child(3) > li:nth-child(3) > a',
          method: 'html'
        }
      },
      partition: 0,
      offset: 0,
      readFromLastOffset: false
    }
  }
};

let indexerConfig = {
  "source": {
    "npm": {
      "indexer" : {
        index: 'npmjs',
        type: 'repos',
        fields: {
          doc: 'doc'
        }
      }
    }
  },
  "elastic" : {
    host: "localhost:9200"
  }
};

let parser = new oodebe.Parser();
let indexer = new oodebe.Indexer();

// Initialize the consumer and the indexer
parser.initializeConsumer(consumerConfig);
indexer.initializeIndexer(indexerConfig);

// Register events

parser.on('message',function(topic,html,$) {
  let parseData = parser.parseHTML(topic,$);
  indexer.index(topic,html,parseData);
});

parser.on('error',function(topic,error) {
  console.log(topic);
  console.log(error);
});

parser.on('offsetOutOfRange',function(error) {
  console.log(error);
});

indexer.on('error',function(error) {
  console.log(error);
});

indexer.on('success',function() {
  console.log("indexed");
});

Running the app

Start the app.

node app.js

This starts the app, which consumes messages from Kafka, parses the HTML, and indexes the results in Elasticsearch.
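
To spot-check that documents are actually landing in Elasticsearch, you can query the npmjs index configured above. Below is a minimal sketch using Node's built-in http module, assuming Elasticsearch is running on localhost:9200 as in indexerConfig; save it as, say, check.js and run it with node check.js.

// check.js: fetch a few documents from the 'npmjs' index
const http = require('http');

http.get('http://localhost:9200/npmjs/_search?size=5', function(res) {
  let body = '';
  res.on('data', function(chunk) { body += chunk; });
  res.on('end', function() {
    // hits.hits in the response contains the indexed documents
    console.log(body);
  });
}).on('error', function(err) {
  console.log(err);
});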