icrawl

Crawl pages and generate HTML files corresponding to each URL path.

Features

  • With nginx, you can serve the crawled HTML to enable SEO for front-end rendered pages.
  • Built-in static server, so you can crawl pages directly from your build output folder
  • The HTML save path mirrors the URL path
  • Does not depend on any front-end framework
  • Provides both a Node API and a command-line interface

Examples

Node API

const path = require('path')
const Crawl = require('icrawl')

const crawl = new Crawl({
  requestTimeout: 10000,              // abort a page request after 10 seconds
  isNormalizeSourceURL: true,         // rewrite relative URLs to absolute ones in the saved HTML
  routes: [
    'https://nodejs.org/api/path.html',
    'https://nodejs.org/api/url.html'
  ],
  path: path.resolve(__dirname, 'static')  // directory where the crawled HTML is saved
})

crawl.start()

Configuration

.icrawlrc.js in your project root

const path = require('path')

module.exports = {
  isNormalizeSourceURL: true,
  routes: [
    'https://nodejs.org/api/path.html',  
    'https://nodejs.org/api/url.html'
  ],
  path: path.resolve(__dirname, 'static')
}

package.json

"scripts": {
  "build": "icrawl"
}
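
If your pages need to be built before crawling, one option (a sketch; "webpack" here stands in for whatever your real build command is) is to let npm's postbuild hook run the crawl automatically after every build:

"scripts": {
  "build": "webpack",
  "postbuild": "icrawl"
}

With this setup, npm run build builds the site and then runs icrawl against the fresh output.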

options

  • options <Object>
    • viewport <Object> viewport size
      • width <Number>
      • height <Number>
    • maxPageCount <Number> Number of pages that can be opened in parallel, default: 10
    • isNormalizeSourceURL <Boolean | Object> Whether to convert the relative paths of images, anchors, links, and scripts to absolute paths in the crawled HTML. For example, when the crawled page's URL is http://www.example.com/example, /favicon.ico becomes http://www.example.com/favicon.ico. Each type can also be enabled individually. default: false
      • links <Boolean>
      • images <Boolean>
      • scripts <Boolean>
      • anchors <Boolean>
    • requestTimeout <Number> Number of milliseconds for request timeout, default: 30000ms, set to 0 to wait indefinitely
    • host <String> default: ''
    • routes <Array<String>> The list of routes to crawl; relative paths require the host option to be set
    • outputPath <String> Directory where the crawled HTML is saved
    • saveHTML <Boolean> Whether to save the crawled page as HTML, default: true
    • depth <Number | Object> Crawl depth. If page A is listed in routes (depth: 0), A contains a link to page B (depth: 1), and B contains a link to page C (depth: 2), then a depth of 2 crawls all three pages. default: 0
      • value <Number> page depth
      • include <RegExp> Only follow links matching this pattern, default: null
      • exclude <RegExp> Skip links matching this pattern, default: null
      • after <Function(Array<PageRoute>)> Callback invoked after page link collection completes, default: null
    • serverConfig <String | Object> If the pages to crawl are not already served, specify this option to start a local server. If it is a String, it is treated as the directory containing the pages. default: null (see the combined example after this list)
      • path <String> The directory containing the pages, for example your build output directory, so you can run icrawl after your build command or chain the two commands in scripts
      • port <Number> default: 3333
      • public <String> Required when isNormalizeSourceURL is also set to true; relative paths are resolved against this value
      • isFallback <Boolean> For SPAs, always rewrite the requested location to index.html
    • requestInterception <Object> Filter network requests; used judiciously this speeds up crawling. For example, there is usually no need to wait for images, CSS, fonts, or third-party scripts to load, since only the rendered HTML needs to be saved most of the time
      • include <RegExp>
      • exclude <RegExp>
    • progressBarStyle <Object> Progress bar style
      • prefix <String> default: ''
      • suffix <String> default: ''
      • remaining <String> default: '░'
      • completed <String> default: '█'
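
As a combined illustration, here is a sketch of a fuller .icrawlrc.js that exercises the depth, serverConfig, and requestInterception options described above. The directory names, port, and patterns are hypothetical; adjust them to your project.

const path = require('path')

module.exports = {
  host: 'http://localhost:3333',            // routes below are relative, so host is required
  routes: ['/'],                            // start crawling from the app root
  outputPath: path.resolve(__dirname, 'static'),
  depth: {
    value: 2,                               // follow links up to two levels deep from the routes
    include: /\/docs\//                     // only follow links under /docs/ (hypothetical pattern)
  },
  serverConfig: {
    path: path.resolve(__dirname, 'dist'),  // serve the build output locally (hypothetical dir)
    port: 3333,
    isFallback: true                        // SPA: fall back to index.html for unknown locations
  },
  requestInterception: {
    exclude: /\.(png|jpe?g|gif|woff2?|css)$/  // skip assets not needed for the rendered HTML
  }
}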

crawl.start()

return: Promise
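
Since start() returns a Promise, you can wait for the crawl to finish before running any follow-up step (a minimal sketch):

crawl.start()
  .then(() => console.log('crawl finished'))
  .catch((err) => console.error('crawl failed:', err))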

PageRoute

  • url <String> The URL of the page to crawl
  • root <PageRoute> The root PageRoute of the crawl chain
  • referer <PageRoute> The PageRoute that linked to this URL (its parent)
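
For example, since the depth.after callback documented above receives an Array<PageRoute>, you can use these fields to trace where each collected URL was discovered (a sketch; entry routes are assumed to have no referer):

depth: {
  value: 1,
  after: (pageRoutes) => {
    pageRoutes.forEach((route) => {
      const from = route.referer ? route.referer.url : '(entry route)'
      console.log(`${route.url} <- ${from}`)
    })
  }
}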

Tips

  • By configuring nginx to serve the crawled HTML, you can enable SEO for front-end rendered pages.
  • If you use nginx, you will need to install the set-misc-nginx-module, or install OpenResty directly.

License

MIT licensed.