mapdown
v1.0.2
Mapdown
A CLI tool to crawl sitemap.xml and convert all pages to LLM-friendly Markdown.
Features
- Parse sitemap.xml files from websites
- Extract content from web pages
- Convert HTML content to clean Markdown format
- Optimized for LLM consumption
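The first step in the list above, reading page URLs out of a sitemap, can be sketched as follows. This is an illustration under stated assumptions, not mapdown's actual implementation: `extractSitemapUrls` and `decodeXmlEntities` are hypothetical names, and the sketch uses a regular expression where a real crawler would more likely use an XML parser.

```typescript
// Hypothetical helpers for illustration; not part of mapdown's published API.

// Decode the five predefined XML entities that can appear inside <loc> values.
function decodeXmlEntities(text: string): string {
  return text
    .replace(/&lt;/g, "<")
    .replace(/&gt;/g, ">")
    .replace(/&quot;/g, '"')
    .replace(/&apos;/g, "'")
    .replace(/&amp;/g, "&"); // decoded last so "&amp;lt;" becomes "&lt;", not "<"
}

// Collect the text of every <loc> element in a sitemap.xml string.
function extractSitemapUrls(xml: string): string[] {
  const urls: string[] = [];
  const locPattern = /<loc>\s*([\s\S]*?)\s*<\/loc>/gi;
  let match: RegExpExecArray | null;
  while ((match = locPattern.exec(xml)) !== null) {
    urls.push(decodeXmlEntities(match[1]));
  }
  return urls;
}

const sampleSitemap = `<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/docs?page=1&amp;lang=en</loc></url>
</urlset>`;

console.log(extractSitemapUrls(sampleSitemap));
```

Sitemap index files also use `<loc>`, but there each value points to another sitemap, not to a page. This sketch does not distinguish the two cases.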
Installation
Install globally to use the CLI command:
npm install -g mapdown
Or use directly without installation:
npx mapdown [sitemap-url-or-file]
Usage
Basic Usage
# Using npx (recommended)
npx mapdown https://example.com/sitemap.xml
# If installed globally
mapdown https://example.com/sitemap.xml
# Using local sitemap file
mapdown ./sitemap.xml
Examples
# Crawl a website's sitemap
npx mapdown https://example.com/sitemap.xml
# Process a local sitemap file
npx mapdown ./my-sitemap.xml
The tool will output consolidated Markdown content with:
- Table of contents
- Page metadata (title, description, URL)
- Clean content extracted from each page
- Progress tracking during crawling
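To make the shape of that consolidated output concrete, here is a minimal sketch of how a table of contents, per-page metadata, and page content might be combined into one document. The `Page` interface, its field names, and the exact layout are assumptions made for illustration. The real output format of mapdown may differ.

```typescript
// Hypothetical shape for one crawled page; the field names are assumptions.
interface Page {
  url: string;
  title: string;
  description: string;
  markdown: string; // page body, already converted to Markdown
}

// Turn a title into a simple anchor slug, e.g. "Getting Started" -> "getting-started".
function slugify(title: string): string {
  return title
    .toLowerCase()
    .trim()
    .replace(/[^a-z0-9\s-]/g, "")
    .replace(/\s+/g, "-");
}

// Build one Markdown document: a table of contents, then one section per page.
function buildConsolidatedMarkdown(pages: Page[]): string {
  const toc = pages.map((p) => `- [${p.title}](#${slugify(p.title)})`);
  const sections = pages.map((p) =>
    [
      `## ${p.title}`,
      "",
      `- URL: ${p.url}`,
      `- Description: ${p.description}`,
      "",
      p.markdown.trim(),
    ].join("\n")
  );
  return ["# Table of contents", "", toc.join("\n"), "", sections.join("\n\n"), ""].join("\n");
}

const examplePages: Page[] = [
  {
    url: "https://example.com/start",
    title: "Getting Started",
    description: "How to begin",
    markdown: "Install the tool, then run it.",
  },
  {
    url: "https://example.com/faq",
    title: "FAQ & Help",
    description: "Common questions",
    markdown: "Answers to common questions.",
  },
];

console.log(buildConsolidatedMarkdown(examplePages));
```

Pages with identical titles would produce colliding anchors in this sketch. A fuller version would need to make the slugs unique, for example by appending a counter.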
Development
# Install dependencies
npm install
# Run in development mode
npm run dev
# Build the project
npm run build
# Run tests
npm test
# Run tests in watch mode
npm run test:watch
Requirements
- Node.js >= 18.0.0
License
MIT
