better-sitemap-crawler
v0.0.1
To install:
npm i -g better-sitemap-crawler
Usage:
Usage: better-sitemap-crawler --url <startUrl> [options]
Options:
  -u, --url <url>          Required: Starting URL for the crawl
  -o, --out <file>         Output sitemap.xml file path (default: sitemap.xml)
  -c, --concurrency <num>  Max concurrent requests (default: 10)
  -p, --max-pages <num>    Max pages to crawl (default: 1000)
  -t, --timeout <ms>       Request timeout in ms (default: 15000)
  -d, --delay <ms>         Base delay between requests in ms (default: 50)
  -A, --user-agent <str>   Custom User-Agent string
      --strip-qs           Remove query strings from URLs before processing
      --no-lastmod         Do not include <lastmod> tags in the sitemap
  -h, --help               Show this help message
Example: better-sitemap-crawler -u https://example.com/docs -o docs.xml -c 5 --strip-qs
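
To make the effect of --strip-qs and --no-lastmod concrete, here is a minimal sketch of how such options could shape the output. It is not the package's actual code: the helper names (stripQueryString, escapeXml, buildSitemap), the includeLastmod parameter, and the sample URLs and dates are all assumptions made for illustration. It relies only on the standard URL class built into Node.js.

```javascript
// Illustrative only: these helpers are hypothetical and are not taken
// from better-sitemap-crawler's source.

// Roughly what --strip-qs is described as doing: remove the query string
// so URLs that differ only in parameters collapse into one entry.
function stripQueryString(rawUrl) {
  const url = new URL(rawUrl);
  url.search = "";
  return url.toString();
}

// Escape the characters that are special in XML text content.
function escapeXml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Build a sitemap document. Passing includeLastmod: false mimics the
// documented effect of --no-lastmod (no <lastmod> tags are written).
function buildSitemap(entries, { includeLastmod = true } = {}) {
  const urls = entries.map(({ loc, lastmod }) => {
    const lastmodTag =
      includeLastmod && lastmod
        ? `\n    <lastmod>${escapeXml(lastmod)}</lastmod>`
        : "";
    return `  <url>\n    <loc>${escapeXml(loc)}</loc>${lastmodTag}\n  </url>`;
  });
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...urls,
    "</urlset>",
  ].join("\n");
}

// Made-up crawl results: two of the three differ only by query string.
const crawled = [
  { loc: "https://example.com/docs?page=1", lastmod: "2024-01-15" },
  { loc: "https://example.com/docs?page=2", lastmod: "2024-01-15" },
  { loc: "https://example.com/docs/install", lastmod: "2024-02-01" },
];

// Deduplicate after stripping query strings, keeping the first occurrence.
const seen = new Map();
for (const entry of crawled) {
  const loc = stripQueryString(entry.loc);
  if (!seen.has(loc)) {
    seen.set(loc, { ...entry, loc });
  }
}

const xml = buildSitemap([...seen.values()], { includeLastmod: false });
console.log(xml);
```

In this sketch the two paginated /docs URLs collapse into a single entry once the query string is removed, and the printed sitemap carries only <loc> elements. Whether the real crawler deduplicates in the same way is not stated in this README.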