@arachnodex/job-sitemap

v1.0.1

Arachnodex job for generating crawl sitemaps.

The Sitemap job creates an XML sitemap from crawlable internal URLs discovered while Arachnodex spiders the configured site.

It can include normal HTML pages, optionally include document URLs such as PDFs, and optionally skip URLs that point at a different canonical target.
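As a rough illustration of the kind of file the job produces, the sketch below builds a minimal sitemaps.org 0.9 document from a list of URLs. The helper names `escapeXml` and `buildSitemap` are hypothetical and are not part of the package's API; the real job may also emit additional elements per entry.

```typescript
// Hypothetical helpers for illustration; not the @arachnodex/job-sitemap API.

/** Entity-encode the five characters that must be escaped in XML text. */
function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

/** Build a sitemaps.org 0.9 document with one <url> entry per URL. */
function buildSitemap(urls: string[]): string {
  const entries = urls.map(
    (url) => `  <url><loc>${escapeXml(url)}</loc></url>`
  );
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...entries,
    "</urlset>",
  ].join("\n");
}

console.log(
  buildSitemap([
    "https://example.com/",
    "https://example.com/search?q=a&page=2",
  ])
);
```

Escaping matters because query strings often contain `&`, which is not valid unescaped inside a `<loc>` element.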

Install

Projects created with npm create @arachnodex include this job by default. For a manual install, add it beside @arachnodex/core:

npm install @arachnodex/job-sitemap

The package uses @arachnodex/core as a peer dependency, so it should be installed in the same project as the crawler.

Usage

Run the job with the default crawler config:

npm exec -- arachnodex -c default -j sitemap

Use a job-specific config by placing -c after the job name:

npm exec -- arachnodex -c default -j sitemap -c sitemap

That loads the crawler config from config/default.json and the Sitemap job config from config/sitemap.json.

Config File

The package example config is available at:

config/sitemap.example.json

A generated Arachnodex project copies this to:

config/sitemap.json

For a manual install, copy the example into your Arachnodex project's config/ directory as sitemap.json before running the job.

Default config:

{
  "includeOnlyCanonical": true,
  "includeDocs": true,
  "emailReportEnabled": true,
  "outputFile": "../web/sitemap.xml",
  "includeDocPattern": "((x-)?pdf)|(ms-?excel)|(vnd.)|(ms-?word)|(ms-?powerpoint)|(ms-?access)|(download)"
}
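For readers who script around this file, the shape of the config can be written down as a type. The sketch below mirrors the five keys shown above; the names `SitemapJobConfig` and `withDefaults` are hypothetical, and whether the job itself fills in missing keys from defaults is an assumption, not something this README states.

```typescript
// Hypothetical type mirroring the keys in config/sitemap.json.
interface SitemapJobConfig {
  includeOnlyCanonical: boolean;
  includeDocs: boolean;
  emailReportEnabled: boolean;
  outputFile: string;
  includeDocPattern: string;
}

// Values copied from the default config shown above.
const DEFAULT_CONFIG: SitemapJobConfig = {
  includeOnlyCanonical: true,
  includeDocs: true,
  emailReportEnabled: true,
  outputFile: "../web/sitemap.xml",
  includeDocPattern:
    "((x-)?pdf)|(ms-?excel)|(vnd.)|(ms-?word)|(ms-?powerpoint)|(ms-?access)|(download)",
};

/** Assumed behavior: keys missing from the overrides fall back to defaults. */
function withDefaults(overrides: Partial<SitemapJobConfig>): SitemapJobConfig {
  return { ...DEFAULT_CONFIG, ...overrides };
}

// Example: keep everything default but leave documents out of the sitemap.
const pagesOnly = withDefaults({ includeDocs: false });
console.log(pagesOnly.includeDocs, pagesOnly.outputFile);
```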

Settings

| Setting | Type | Default | Description |
| --- | --- | --- | --- |
| includeOnlyCanonical | boolean | true | Only include HTML page URLs when the page has no canonical URL or its canonical URL matches the current crawled URL. Set to false to include crawled page URLs even when they point at another canonical target. |
| includeDocs | boolean | true | Include matching non-HTML document URLs discovered from successful response headers. |
| emailReportEnabled | boolean | true | Include the Sitemap job summary in Arachnodex report emails. |
| outputFile | string | "../web/sitemap.xml" | Path for the generated sitemap file, resolved from the directory where the crawler is run. The default assumes the Arachnodex project sits beside a website document root at ../web. |
| includeDocPattern | string | "((x-)?pdf)\|(ms-?excel)\|(vnd.)\|(ms-?word)\|(ms-?powerpoint)\|(ms-?access)\|(download)" | Regular expression used against response content-type headers when includeDocs is enabled. Matching URLs are written as document entries. |
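The interaction between includeOnlyCanonical, includeDocs, and includeDocPattern can be sketched as a single decision function. This is an illustration of the rules described in the table, not the job's actual code: the `classify` function and its types are hypothetical, HTML detection by a `text/html` prefix is an assumption, canonical URLs are compared as exact strings (the real job may normalize them), and the case-insensitive regex flag is also an assumption.

```typescript
type Entry = "page" | "document" | "skip";

interface CrawledResponse {
  url: string;
  contentType: string; // value of the content-type response header
  canonical?: string; // canonical URL declared by the page, if any
}

interface FilterSettings {
  includeOnlyCanonical: boolean;
  includeDocs: boolean;
  includeDocPattern: string;
}

function classify(res: CrawledResponse, settings: FilterSettings): Entry {
  // Assumption: an HTML page is identified by a text/html content type.
  if (/^text\/html\b/i.test(res.contentType)) {
    if (!settings.includeOnlyCanonical) return "page";
    // Keep the page when it has no canonical or is its own canonical.
    const selfCanonical =
      res.canonical === undefined || res.canonical === res.url;
    return selfCanonical ? "page" : "skip";
  }
  // Non-HTML: include only when docs are enabled and the header matches.
  const docRegex = new RegExp(settings.includeDocPattern, "i");
  if (settings.includeDocs && docRegex.test(res.contentType)) {
    return "document";
  }
  return "skip";
}

const settings: FilterSettings = {
  includeOnlyCanonical: true,
  includeDocs: true,
  includeDocPattern:
    "((x-)?pdf)|(ms-?excel)|(vnd.)|(ms-?word)|(ms-?powerpoint)|(ms-?access)|(download)",
};

console.log(
  classify({ url: "https://example.com/a.pdf", contentType: "application/pdf" }, settings)
);
```

Note that the `.` in `(vnd.)` is unescaped, so it matches `vnd` followed by any single character, not only a literal dot. That is broader than `vnd\.` but still catches vendor types such as `application/vnd.ms-excel`.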

Output Path

outputFile may point outside the Arachnodex project directory as long as the Node process has filesystem permission to write there. A common layout is:

site-audit/
web/

With that layout, running Arachnodex from site-audit/ and leaving outputFile as ../web/sitemap.xml writes the public sitemap to web/sitemap.xml, in the document root that sits beside the audit project.
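To check where a relative outputFile will land before running a crawl, the resolution can be reproduced with Node's `path.resolve`. The function name `resolveOutputFile` and the `/srv/site-audit` directory are hypothetical; the sketch assumes the job resolves the path against the process working directory, as the settings table describes.

```typescript
import * as path from "node:path";

/**
 * Resolve outputFile against the directory the crawler is run from.
 * runDir defaults to the current working directory of the Node process.
 */
function resolveOutputFile(
  outputFile: string,
  runDir: string = process.cwd()
): string {
  return path.resolve(runDir, outputFile);
}

// With the layout above and the crawler started in /srv/site-audit,
// a POSIX system resolves this to /srv/web/sitemap.xml.
console.log(resolveOutputFile("../web/sitemap.xml", "/srv/site-audit"));
```

An absolute outputFile is returned unchanged by `path.resolve`, so the same helper works for both styles of path.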

The job overwrites the configured output file at the end of each successful crawl. Temporary working files are run-specific and are cleaned up after the final sitemap is written.
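One common way to get this overwrite-at-the-end behavior is to write to a run-specific temporary file and then rename it over the target. The sketch below shows that pattern; it is an assumption about how such a step could be done, not a description of the job's internals, and `writeSitemapAtomically` and the temporary file naming are hypothetical.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

/** Write xml to outputFile by way of a temp file in the same directory. */
function writeSitemapAtomically(outputFile: string, xml: string): void {
  const dir = path.dirname(outputFile);
  // Run-specific name; same directory keeps the rename on one filesystem.
  const tmpFile = path.join(dir, `.sitemap-${process.pid}-${Date.now()}.tmp`);
  try {
    fs.writeFileSync(tmpFile, xml, "utf8");
    // Replaces any existing sitemap in a single step.
    fs.renameSync(tmpFile, outputFile);
  } finally {
    // No-op after a successful rename; removes the temp file on failure.
    fs.rmSync(tmpFile, { force: true });
  }
}

// Demo in a throwaway directory: the second write replaces the first.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "sitemap-demo-"));
const demoFile = path.join(demoDir, "sitemap.xml");
writeSitemapAtomically(demoFile, "<urlset>first</urlset>");
writeSitemapAtomically(demoFile, "<urlset>second</urlset>");
console.log(fs.readFileSync(demoFile, "utf8"));
```

With this approach a failed or interrupted write leaves the previous sitemap in place, which matches the idea that only a successful crawl replaces the published file.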

Switches

| Switch | Description |
| --- | --- |
| -v, --version | Print the Sitemap job version and exit without crawling. |