
bluemango-scraper

v1.0.0


Scraper

This library is a helper for creating a script that scrapes values of a page

var scraper = new Scraper({
  title: {
    type:'text',
    selector:'h1.the-title' // this is a jquery selector that will look for the first h1 with the class the-title
  }
})
var results = scraper.getResults() // {title:'hello world'}

The config

The config is an object in which every key stands for a field that will be returned in the results. The value of a config item can be a string, an object, or a function.

whitelistedDomain

Only allow the library to work when the value matches the current domain. The value should be a regular expression written as a string.
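
As a rough sketch of what such a domain check could look like: the helper below compiles the string with the built-in `RegExp` and tests it against a hostname. `isDomainAllowed` is a hypothetical name, not part of the library, and the assumption that the string holds the bare pattern (without surrounding slashes) is not confirmed by this README.

```javascript
// Hypothetical helper: not part of bluemango-scraper.
// Assumes the whitelistedDomain string holds the regex source without slashes.
function isDomainAllowed(whitelistedDomain, hostname) {
  var pattern = new RegExp(whitelistedDomain)
  return pattern.test(hostname)
}

var allowedOnShop = isDomainAllowed('^(www\\.)?example\\.com$', 'www.example.com')
var allowedElsewhere = isDomainAllowed('^(www\\.)?example\\.com$', 'evil.test')
```

In a browser the hostname would typically come from `window.location.hostname`.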

using an object

There are several types you can select when configuring a field with an object. The available types are listed below.

text

Use text to extract text from a page

var scraper = new Scraper({
  title: {
    type:'text', // default value
    selector: '.title span',
    test: '/[0-9]*/g' // this will test if the result only contains digits
  }
})
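
One thing to watch with a pattern like the one above: in plain JavaScript, an unanchored `[0-9]*` also matches the empty string, so it succeeds on any input. The sketch below shows the difference once the pattern is anchored. It uses only the built-in `RegExp`; how the library itself applies `test` is an assumption not confirmed by this README.

```javascript
var loose = /[0-9]*/      // zero or more digits anywhere, so it always succeeds
var strict = /^[0-9]+$/   // the whole string must be one or more digits

var looseOnWord = loose.test('abc')        // true
var strictOnWord = strict.test('abc')      // false
var strictOnDigits = strict.test('12345')  // true
```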

url

var scraper = new Scraper({
  clickUrl: {
    type:'url', // will return the current page URL
    prefix: 'http://yourredirect.com/url=', // (optional) use when you want to prefix your url
    query: {myparam:''} // (optional) returns the url with only myparam appended to the url
  }
})
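
The README does not spell out exactly how `prefix` and `query` are combined, so the following is only one possible reading: keep the query parameters named in `query`, drop the rest, and put the prefix in front. `buildClickUrl` is a hypothetical helper built on the standard `URL` API, not the library's implementation.

```javascript
// Hypothetical helper: an interpretation of the prefix and query options,
// not the library's actual implementation.
function buildClickUrl(pageUrl, options) {
  var url = new URL(pageUrl)
  if (options.query) {
    var keep = Object.keys(options.query)
    // Copy the key list first, because deleting while iterating skips entries.
    Array.from(url.searchParams.keys()).forEach(function (key) {
      if (keep.indexOf(key) === -1) url.searchParams.delete(key)
    })
  }
  return (options.prefix || '') + url.toString()
}

var clickUrl = buildClickUrl('https://shop.example.com/item?myparam=1&utm=abc', {
  prefix: 'http://yourredirect.com/url=',
  query: { myparam: '' }
})
```

This sketch concatenates the prefix as-is; a real redirect service may expect the target wrapped in `encodeURIComponent`.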

image

var scraper = new Scraper({
  imageUrl: {
    type:'image', // will return the src of an image
    selector: 'img#myImage'
  }
})

regex

var scraper = new Scraper({
  title: {
    type:'regex',
    selector: '.title span',
    test: '/€([0-9]*)/g' // will return the first regex group
  }
})
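
To illustrate what "the first regex group" means, here is a minimal sketch in plain JavaScript. It assumes the string form `'/pattern/flags'` is split at its last slash and compiled with `RegExp`; `parseRegexString` and `firstGroup` are hypothetical helpers, and the library may parse the string differently.

```javascript
// Hypothetical helpers: not part of bluemango-scraper.
// Turns a string such as '/€([0-9]*)/g' into a RegExp object.
function parseRegexString(str) {
  var lastSlash = str.lastIndexOf('/')
  return new RegExp(str.slice(1, lastSlash), str.slice(lastSlash + 1))
}

// Returns the first capture group of the first match, or null when nothing matches.
function firstGroup(regexString, text) {
  var match = parseRegexString(regexString).exec(text)
  return match ? match[1] : null
}

var price = firstGroup('/€([0-9]*)/g', 'Now only €199 per month') // '199'
var none = firstGroup('/€([0-9]*)/g', 'No price here')            // null
```

`exec` is used rather than `String.prototype.match` because `match` with the `g` flag returns whole matches only and drops the capture groups.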

template

var scraper = new Scraper({
  title: {
    type:'template',
    template: 'hello {{name}}' // {{name}} is replaced with the value of the name field
  },
  name: {
    type: 'text',
    selector: '.profile .name'
  }
})
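
The placeholder substitution can be pictured roughly as follows. This is a sketch under the assumption that `{{field}}` is replaced by the value of the field with that name; `renderTemplate` is a hypothetical helper, not the library's template engine.

```javascript
// Hypothetical helper: not part of bluemango-scraper.
// Replaces each {{field}} placeholder with the matching value from `fields`.
function renderTemplate(template, fields) {
  return template.replace(/\{\{\s*(\w+)\s*\}\}/g, function (whole, key) {
    // Unknown placeholders are left untouched (a choice made for this sketch).
    return Object.prototype.hasOwnProperty.call(fields, key) ? String(fields[key]) : whole
  })
}

var greeting = renderTemplate('hello {{name}}', { name: 'Ada' }) // 'hello Ada'
```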

dictionary

var scraper = new Scraper({
  custom1:{
    type:'dictionary',
    selector:'#yourdealCompareBlock > div > div > img',
    dictionary:{
      'VODAFONE': '/vodafone/g', // result is VODAFONE when vodafone is found in the selector text
      'TELFORT': '/telfort/g', // result is TELFORT when telfort is found in the selector text
      'T-MOBILE': '/tmobile/g', // result is T-MOBILE when tmobile is found in the selector text
      'TELE2': '/tele2/g',
      'BEN': '/ben/g',
      'KPN': '/kpn/g',
      'HI': '/hi/g'
    }
  },
})
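
A dictionary lookup of this kind might work along the following lines. The sketch assumes the entries are tried in the order they are written and that the first matching pattern wins; neither detail is stated in this README, and `lookupDictionary` is a hypothetical helper.

```javascript
// Hypothetical helper: not part of bluemango-scraper.
// Returns the first dictionary key whose pattern matches `text`, or null.
function lookupDictionary(dictionary, text) {
  var keys = Object.keys(dictionary)
  for (var i = 0; i < keys.length; i++) {
    var raw = dictionary[keys[i]] // e.g. '/vodafone/g'
    var lastSlash = raw.lastIndexOf('/')
    var pattern = new RegExp(raw.slice(1, lastSlash), raw.slice(lastSlash + 1))
    if (pattern.test(text)) return keys[i]
  }
  return null
}

var providers = { 'VODAFONE': '/vodafone/g', 'KPN': '/kpn/g', 'HI': '/hi/g' }
var provider = lookupDictionary(providers, 'logo vodafone')    // 'VODAFONE'
var shortMatch = lookupDictionary(providers, 'something else') // 'HI', because "something" contains "hi"
```

Under these assumptions, short patterns such as `/hi/g` match many unrelated strings, so it may be safer to list them last or to make them more specific.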

using a string

A string is shorthand for a selector with the type text (see the example below).

var scraper1 = new Scraper({
  title: 'h1.the-title'
})

var scraper2 = new Scraper({
  title: {
    type:'text',
    selector:'h1.the-title' // this is a jquery selector that will look for the first h1 with the class the-title
  }
})
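
Put differently, the string form can be thought of as being expanded into the object form before use. The sketch below shows that expansion; `normalizeField` is a hypothetical helper used for illustration, not a function exported by the library.

```javascript
// Hypothetical helper: not part of bluemango-scraper.
// Expands the string shorthand into the equivalent object form.
function normalizeField(value) {
  if (typeof value === 'string') {
    return { type: 'text', selector: value }
  }
  return value // objects and functions are passed through unchanged
}

var expanded = normalizeField('h1.the-title')
// expanded is { type: 'text', selector: 'h1.the-title' }
```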

using a function

var scraper = new Scraper({
  number1: {type:'template', template:'1'},
  number2: {type:'template', template:'3'},
  sum: function(scraper){
    var n1 = Number(scraper.getField('number1'))
    var n2 = Number(scraper.getField('number2'))
    return n1 + n2 // = 4
  }
})

Default fields

By default, the scraper only returns the following fields: ['id', 'available', 'title', 'imageUrl', 'clickUrl', 'category', 'basket', 'description', 'priceNormal', 'priceDiscount', 'logoUrl', 'stickerText', 'custom1', 'custom2', 'custom3', 'custom4']
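
If the filter works as described, a field with any other name would be left out of the results. The sketch below mimics that behaviour with a plain object filter; `DEFAULT_FIELDS` and `pickDefaultFields` are hypothetical names, and the library's real filtering code may differ.

```javascript
var DEFAULT_FIELDS = ['id', 'available', 'title', 'imageUrl', 'clickUrl', 'category',
  'basket', 'description', 'priceNormal', 'priceDiscount', 'logoUrl', 'stickerText',
  'custom1', 'custom2', 'custom3', 'custom4']

// Hypothetical helper: keeps only the keys listed in DEFAULT_FIELDS.
function pickDefaultFields(results) {
  var picked = {}
  DEFAULT_FIELDS.forEach(function (field) {
    if (Object.prototype.hasOwnProperty.call(results, field)) picked[field] = results[field]
  })
  return picked
}

var filtered = pickDefaultFields({ title: 'hello world', sum: 4, custom1: 'KPN' })
// filtered is { title: 'hello world', custom1: 'KPN' }
```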