
jsilk

v1.0.4

Published

A web scraping library for Node.js

Readme

JSilk

A web scraping library for Node.js with intelligent page loading. JSilk automatically detects whether a page requires JavaScript rendering and chooses the optimal loading strategy — fast HTTP requests for static pages, and headless browser rendering for SPAs.

Installation

npm install jsilk

For dynamic page loading, Playwright's Chromium browser must also be installed:

npx playwright install chromium

Quick Start

import JSilk from "jsilk";
const { Spider } = JSilk;

const spider = new Spider();
spider.addToQueue(["https://example.com"]);
await spider.start();

Page Loading Strategies

JSilk provides three page loading strategies controlled by the dynamic parameter on Spider:

| Strategy | dynamic value       | Engine                                | Best for                          |
| -------- | ------------------- | ------------------------------------- | --------------------------------- |
| Default  | undefined (default) | Static first, then dynamic if needed  | Unknown pages                     |
| Static   | false               | Axios HTTP requests                   | Server-rendered HTML              |
| Dynamic  | true                | Playwright Chromium                   | SPAs (React, Vue, Angular, etc.)  |

The default strategy loads the page over HTTP first, then analyzes the HTML using a heuristic scoring system. If the content appears to be a JavaScript-heavy SPA (score >= 7), it automatically re-fetches the page with a headless browser.

Heuristic signals include: low visible text content, SPA root containers (#app, #root, #__next), framework markers (React, Vue, Angular), heavy script presence, dynamic data fetching patterns, and JS-only navigation.
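To make the scoring idea concrete, here is a minimal sketch of such a heuristic. The function name, the exact signals, and the weights are illustrative assumptions for this example, not JSilk's actual implementation; only the threshold of 7 comes from the description above.

```javascript
// spaScore is a hypothetical sketch of SPA detection; the weights and
// signal set are assumptions, not JSilk's real scoring code.
function spaScore(html) {
  let score = 0;

  // Low visible text content: strip scripts and tags, then measure what's left.
  const text = html
    .replace(/<script[\s\S]*?<\/script>/gi, "")
    .replace(/<[^>]+>/g, "")
    .trim();
  if (text.length < 200) score += 3;

  // SPA root containers such as #app, #root, #__next.
  if (/id=["'](app|root|__next)["']/.test(html)) score += 3;

  // Framework markers (React, Vue, Angular).
  if (/data-reactroot|data-v-|ng-version/.test(html)) score += 2;

  // Heavy script presence.
  const scriptCount = (html.match(/<script\b/gi) || []).length;
  if (scriptCount >= 5) score += 2;

  return score;
}

// A near-empty page built around an #app container and many scripts
// scores 3 + 3 + 2 = 8, above the threshold of 7, so a real detector
// with this shape would re-fetch it with a headless browser.
const shellPage =
  '<html><body><div id="app"></div>' +
  '<script src="/a.js"></script><script src="/b.js"></script>' +
  '<script src="/c.js"></script><script src="/d.js"></script>' +
  '<script src="/e.js"></script></body></html>';
console.log(spaScore(shellPage)); // 8
```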

Usage

Basic Scraping

import JSilk from "jsilk";
const { Spider } = JSilk;

const spider = new Spider();
spider.addToQueue(["https://example.com", "https://example.com/about"]);
await spider.start();

Custom Callback

By default, loaded pages are logged to the console. Pass a custom callback to handle pages yourself:

const onSuccess = (page) => {
  console.log(page.url);      // The page URL
  console.log(page.content);  // HTML content
  console.log(page.status);   // HTTP status code
  console.log(page.lastLoaded); // Date timestamp
};

const spider = new Spider([], onSuccess);
spider.addToQueue(["https://example.com"]);
await spider.start();

Force Static or Dynamic Loading

// Static only (fast, no browser overhead)
const staticSpider = new Spider([], undefined, false);

// Dynamic only (full JS rendering via Chromium)
const dynamicSpider = new Spider([], undefined, true);

Proxy Support

import JSilk from "jsilk";
const { Spider, Proxy } = JSilk;

const proxy = new Proxy("host:port:username:password");
const spider = new Spider([proxy]);
spider.addToQueue(["https://example.com"]);
await spider.start();

When multiple proxies are provided, one is selected at random for each request.
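The rotation behavior described above amounts to a uniform random pick per request. The sketch below shows that idea; pickProxy is a hypothetical helper for illustration, not part of JSilk's API.

```javascript
// Hypothetical helper illustrating per-request proxy rotation;
// not JSilk's internal code.
function pickProxy(proxies) {
  if (proxies.length === 0) return null; // no proxy configured
  return proxies[Math.floor(Math.random() * proxies.length)];
}

const proxies = ["p1.example:8080", "p2.example:8080", "p3.example:8080"];
const chosen = pickProxy(proxies); // one of the three, chosen independently per request
```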

Using Page Objects

You can enqueue Page objects directly instead of URL strings:

import JSilk from "jsilk";
const { Spider, Page } = JSilk;

const page = new Page("https://example.com");
const spider = new Spider();
spider.addToQueue([page]);
await spider.start();

Stopping the Spider

const spider = new Spider();
spider.addToQueue(urls);
spider.start(); // don't await — start in background
// ...
await spider.stop(); // stops after the current page finishes

API

Spider(proxies?, onSuccess?, dynamic?)

The main entry point for scraping.

  • proxies Proxy[] — Array of proxy objects. Default: []
  • onSuccess Function — Callback called with the loaded Page. Default: logs to console
  • dynamic boolean | undefined — Loading strategy. undefined = auto, false = static, true = dynamic

Methods:

  • addToQueue(pages) — Add URL strings, Page objects, or an array of either to the queue
  • start() — Process the queue. Returns a Promise that resolves when the queue is empty or stop() is called
  • stop() — Stop processing after the current page finishes

Page(url, content?)

Represents a web page.

  • url string — Absolute URL (automatically normalized)
  • content string | null — HTML content (populated after loading)
  • status number | null — HTTP status code
  • lastLoaded Date | null — Timestamp of last load

Proxy(proxy)

Proxy configuration.

  • proxy string — Format: "host:port:username:password" (username and password are optional)
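Since username and password are optional, the string can be either two or four colon-separated fields. The following is a sketch of how such a string could be parsed; parseProxy is a hypothetical function written for this example, not JSilk's internal parser.

```javascript
// Hypothetical parser for the "host:port:username:password" format,
// showing that the credential fields are optional.
function parseProxy(str) {
  const [host, port, username, password] = str.split(":");
  return {
    host,
    port: Number(port),
    username: username ?? null, // absent in "host:port" form
    password: password ?? null,
  };
}

console.log(parseProxy("proxy.example:3128"));
// { host: 'proxy.example', port: 3128, username: null, password: null }
console.log(parseProxy("proxy.example:3128:alice:s3cret"));
// { host: 'proxy.example', port: 3128, username: 'alice', password: 's3cret' }
```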

Development

# Install dependencies
npm install

# Run tests
npm test

# Lint
npm run lint

# Format code
npm run format

# Check formatting
npm run checkformat