super-simple-sitemap-generator
v2.1.0
Node.js powered scraper that generates a sitemap.xml file from command line with just a URL. Works with CSR apps (React, Angular, Vue).
Super Simple Sitemap Generator
A Node.js powered scraper that crawls websites and generates sitemap.xml files. Works seamlessly with client-side rendered (CSR) applications like React, Angular, and Vue.
Built with TypeScript and Playwright for modern web scraping.
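For readers unfamiliar with the output, the sketch below shows roughly what a sitemap.xml contains, assuming the standard sitemaps.org `urlset` layout. `buildSitemapXml` and `escapeXml` are hypothetical helpers written for this illustration; they are not part of this package's API, and the package's actual output may include additional fields.

```typescript
// Illustrative only: hypothetical helpers, not this package's API.

// Escape the five characters that are special in XML text and attributes.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Build a minimal sitemap document with one <url><loc> entry per URL.
function buildSitemapXml(urls: string[]): string {
  const lines = [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...urls.map((url) => `  <url><loc>${escapeXml(url)}</loc></url>`),
    "</urlset>",
  ];
  return lines.join("\n") + "\n";
}
```

Escaping matters because crawled URLs often carry query strings, and a raw `&` inside `<loc>` would make the file invalid XML.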
Features
- Works with dynamic CSR applications (React, Angular, Vue, etc.)
- Built with TypeScript for type safety
- Powered by Playwright for reliable browser automation
- Configurable wait times for dynamic content
- Customizable URL limits
- Custom output paths
- Comprehensive test coverage
- Memory-efficient with proper resource cleanup
Installation
Install globally:
npm install -g super-simple-sitemap-generator
Or use with npx (no installation required):
npx super-simple-sitemap-generator https://example.com
CLI Usage
Basic usage:
sitemap https://example.com
With options:
sitemap --wait 2500 --limit 100 --output my-sitemap.xml https://example.com
CLI Options
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| <url> | string | required | The base URL to start crawling from |
| -w, --wait <ms> | number | 1500 | Time in milliseconds to wait for dynamic content to load |
| -l, --limit <n> | number | 99999 | Maximum number of URLs to crawl |
| -o, --output <path> | string | sitemap.xml | Output file path for the generated sitemap |
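As a rough illustration of how the options in the table resolve to values, the sketch below parses an argument list and falls back to the documented defaults. `CrawlOptions`, `DEFAULTS`, `parseCliArgs`, and `toNonNegativeInt` are hypothetical names invented for this example; the package's own CLI parsing may work differently.

```typescript
// Hypothetical sketch: not the package's actual argument parser.
interface CrawlOptions {
  url: string;    // base URL to start crawling from
  wait: number;   // milliseconds to wait for dynamic content
  limit: number;  // maximum number of URLs to crawl
  output: string; // path of the generated sitemap
}

// Defaults copied from the CLI options table.
const DEFAULTS = { wait: 1500, limit: 99999, output: "sitemap.xml" };

function toNonNegativeInt(value: string | undefined, flag: string): number {
  const n = Number(value);
  if (value === undefined || value.trim() === "" || !Number.isInteger(n) || n < 0) {
    throw new Error(`${flag} expects a non-negative integer, got: ${value}`);
  }
  return n;
}

function parseCliArgs(argv: string[]): CrawlOptions {
  const args = [...argv]; // copy so the caller's array is not mutated
  const options = { ...DEFAULTS };
  let url: string | undefined;

  while (args.length > 0) {
    const arg = args.shift() as string;
    if (arg === "-w" || arg === "--wait") {
      options.wait = toNonNegativeInt(args.shift(), arg);
    } else if (arg === "-l" || arg === "--limit") {
      options.limit = toNonNegativeInt(args.shift(), arg);
    } else if (arg === "-o" || arg === "--output") {
      const value = args.shift();
      if (value === undefined) throw new Error(`${arg} requires a path`);
      options.output = value;
    } else if (arg.startsWith("-")) {
      throw new Error(`Unknown option: ${arg}`);
    } else {
      url = arg; // the single positional argument
    }
  }

  if (url === undefined) throw new Error("A base URL is required");
  return { url, ...options };
}
```

For example, `parseCliArgs(["--limit", "100", "https://example.com"])` yields `wait: 1500`, `limit: 100`, and `output: "sitemap.xml"`. In a real script the input would typically be `process.argv.slice(2)`.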
Programmatic Usage
You can also use this package as a library in your Node.js/TypeScript projects:
import { Sitemapper } from 'super-simple-sitemap-generator';
import * as fs from 'fs';

async function generateSitemap() {
  // Arguments inferred from the CLI options: wait time (ms), URL limit, base URL
  const mapper = new Sitemapper(1500, 100, 'https://example.com');
  try {
    // Initialize the browser
    await mapper.init();
    // Parse the base URL
    await mapper.parse('https://example.com');
    // Parse discovered URLs until the queue is empty or the limit is reached
    while (mapper.getUrls().length > 0 && mapper.getParsedUrls().length < 100) {
      const nextUrl = mapper.getUrls()[0];
      await mapper.parse(nextUrl);
    }
    // Generate the XML
    mapper.generateXml();
    // Write to file
    fs.writeFileSync('sitemap.xml', mapper.getXml());
    console.log(`Sitemap generated with ${mapper.getParsedUrls().length} URLs`);
  } catch (error) {
    console.error('Error generating sitemap:', error);
  } finally {
    // Cleanup: close the browser whether or not an error occurred
    await mapper.close();
  }
}

generateSitemap();
Requirements
- Node.js >= 18.0.0
What's New in v2.0
- TypeScript rewrite - Full type safety and better developer experience
- Playwright instead of Puppeteer - More reliable and modern browser automation
- Better error handling - Graceful error recovery and detailed error reporting
- Memory leak fixes - Proper cleanup of browser resources
- Improved CLI - Better progress indication and error messages
- Custom output paths - Specify where to save the sitemap
- Bug fixes - Fixed URL matching for short domains
- Comprehensive tests - Full test coverage with Vitest
Breaking Changes from v1.x
- Node.js requirement: Now requires Node.js >= 18 (previously >= 10)
- Browser engine: Uses Playwright instead of Puppeteer (transparent to most users)
- Exit codes: Fixed to return 0 on success (previously returned 1)
Development
# Install dependencies
npm install
# Run tests
npm test
# Run tests with coverage
npm run test:coverage
# Build
npm run build
License
MIT - Josep Vidal
