@knowcode/screenshotfetch
v1.0.0
Web application spider with screenshot capture and customer journey documentation. Automate user flow documentation with authentication support.
A comprehensive web application spider with screenshot capture and customer journey documentation, built on Puppeteer. Automatically crawl web applications, handle authentication, and generate step-by-step user flow documentation with screenshots.
🚀 Features
- 📸 High-quality screenshot capture - Crystal clear screenshots with full customization
- 🕷️ Web application spidering - Intelligent crawling with flow documentation
- 🔐 Automated authentication - Username/password login with smart form detection
- 📋 Customer journey mapping - Step-by-step flows with visual documentation
- 🔗 URL tracking & cross-referencing - Complete traceability for all screenshots
- 🍪 Advanced cookie consent handling - Multiple strategies for banner removal
- 🚀 Fast and reliable - Built on Puppeteer with robust error handling
- 💻 CLI and programmatic API - Use via command line or integrate into your code
- 📦 Batch processing - Process multiple URLs efficiently
- 🎯 Smart filtering - Avoids destructive actions like logout/delete
📦 Installation
Global Installation (CLI Usage)
npm install -g @knowcode/screenshotfetch
Local Installation (Programmatic Usage)
npm install @knowcode/screenshotfetch
⚡ Quick Start
Spider a Web Application
# Spider with authentication
screenshotfetch spider https://app.example.com/login -u username -p password
# Generate documentation in custom directory
screenshotfetch spider https://app.example.com -u [email protected] -p pass123 -o ./my-docs
Single Screenshot
# Basic screenshot
screenshotfetch capture https://example.com -o screenshot.png
# Full page screenshot
screenshotfetch capture https://example.com -o fullpage.png --fullpage
CLI Usage
Web Application Spider (NEW)
# Spider a web application with authentication
screenshotfetch spider https://app.example.com/login -u username -p password
# Custom output directory and flow limits
screenshotfetch spider https://app.example.com -u [email protected] -p pass123 -o ./my-docs --max-flows 3
# Debug mode (visible browser)
screenshotfetch spider https://app.example.com -u username -p password --headless false
# Custom viewport and timing
screenshotfetch spider https://app.example.com -u user -p pass -w 1280 -h 720 --wait 3000
Capture Single Screenshot
# Basic usage
screenshotfetch capture https://example.com -o ./screenshot.png
# Full page capture
screenshotfetch capture https://example.com -o ./fullpage.png --fullpage
# Custom viewport
screenshotfetch capture https://example.com -w 1280 -h 720
# Different cookie strategies
screenshotfetch capture https://example.com -s none # No cookie handling
screenshotfetch capture https://example.com -s click # Only try clicking
screenshotfetch capture https://example.com -s remove # Only remove banners
screenshotfetch capture https://example.com -s block # Only block services
screenshotfetch capture https://example.com -s all # Try all strategies (default)
Batch Capture
Create a JSON file with your screenshots:
[
  {
    "url": "https://example.com",
    "output": "./screenshots/home.png",
    "fullPage": false
  },
  {
    "url": "https://example.com/about",
    "output": "./screenshots/about.png",
    "fullPage": true
  }
]
Then run:
screenshotfetch batch screenshots.json -d 3000
View Example Format
screenshotfetch example
Programmatic Usage
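For programmatic workflows, the batch file described earlier can also be generated from a plain list of URLs rather than written by hand. The following is a minimal sketch that uses only Node.js built-ins. The helper names `fileNameForUrl` and `buildBatchEntries` are hypothetical and not part of this package; only the `url`, `output`, and `fullPage` fields come from the batch format shown above.

```javascript
const path = require('node:path');

// Hypothetical helper: turn a URL into a filesystem-friendly file name.
function fileNameForUrl(url) {
  const { hostname, pathname } = new URL(url);
  const host = hostname.replace(/^www\./, '');
  // Collapse slashes and other unsafe characters into single dashes.
  const slug = pathname.replace(/[^a-zA-Z0-9]+/g, '-').replace(/^-+|-+$/g, '');
  return slug ? `${host}-${slug}.png` : `${host}.png`;
}

// Hypothetical helper: build entries in the batch format (url, output, fullPage).
function buildBatchEntries(urls, outputDir = './screenshots', fullPage = false) {
  return urls.map((url) => ({
    url,
    output: path.posix.join(outputDir, fileNameForUrl(url)),
    fullPage,
  }));
}

const entries = buildBatchEntries([
  'https://example.com',
  'https://www.example.com/about',
]);
console.log(JSON.stringify(entries, null, 2));
```

Writing the result with `fs.writeFileSync('screenshots.json', JSON.stringify(entries, null, 2))` should produce a file that can be passed to `screenshotfetch batch`.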
Web Application Spider API
const { ApplicationSpider } = require('@knowcode/screenshotfetch');

async function spiderApplication() {
  const spider = new ApplicationSpider({
    viewport: { width: 1920, height: 1080 },
    cookieStrategy: 'all',
    maxFlows: 5,
    maxDepth: 10,
    outputDir: './docs'
  });

  // Spider with authentication
  const result = await spider.spiderApplication(
    'https://app.example.com/login',
    'username',
    'password'
  );

  console.log(`Discovered ${result.summary.completedFlows} customer journeys`);
  console.log(`Generated ${result.summary.screenshotCount} screenshots`);
  console.log(`Documentation in: ${result.summary.outputDirectory}`);
}
Screenshot Capture API
const { ScreenshotCapture } = require('@knowcode/screenshotfetch');

async function captureScreenshots() {
  const capture = new ScreenshotCapture({
    viewport: { width: 1920, height: 1080 },
    cookieStrategy: 'all',
    waitTime: 3000
  });
  await capture.init();

  // Single screenshot
  const result = await capture.captureScreenshot(
    'https://example.com',
    './screenshot.png',
    { fullPage: false }
  );

  // Batch screenshots
  const screenshots = [
    { url: 'https://example.com', output: './home.png' },
    { url: 'https://example.com/about', output: './about.png' }
  ];
  const results = await capture.captureMultiple(screenshots);

  await capture.close();
}
📋 Generated Documentation Structure
When using the spider functionality, the tool generates comprehensive documentation:
docs/                          # Output directory
├── index.md                   # Overview with flow summary
├── flows/                     # Customer journey documentation
│   ├── flow-1/
│   │   ├── flow-1.md          # Step-by-step journey with screenshots
│   │   ├── _images/           # Screenshots for this flow
│   │   │   ├── 01-login.png
│   │   │   ├── 02-dashboard.png
│   │   │   └── 03-settings.png
│   │   └── flow-1.json        # Metadata and URL mappings
│   └── flow-2/
│       ├── flow-2.md          # Another customer journey
│       ├── _images/           # Flow-specific screenshots
│       └── flow-2.json        # Flow metadata
└── metadata/
    ├── url-index.json         # Complete URL-to-screenshot mapping
    └── flow-summary.json      # Summary of all discovered flows
Each flow markdown file contains:
- Step-by-step customer journey with screenshots
- URL tracking for each step
- Action descriptions and navigation flow
- Metadata for programmatic access
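Scripts that post-process the generated output need to locate these files. The sketch below derives the expected paths from the directory layout shown above. It assumes flows are always numbered `flow-1`, `flow-2`, and so on. The helper names `flowPaths` and `metadataPaths` are hypothetical and not part of the package, and the contents of the JSON files are not modelled here.

```javascript
const path = require('node:path');

// Assumption: each flow lives in flows/flow-N/, as in the tree above.
function flowPaths(outputDir, flowNumber) {
  const name = `flow-${flowNumber}`;
  const dir = path.posix.join(outputDir, 'flows', name);
  return {
    markdown: path.posix.join(dir, `${name}.md`),
    metadata: path.posix.join(dir, `${name}.json`),
    images: path.posix.join(dir, '_images'),
  };
}

// Paths of the two top-level metadata files listed in the tree.
function metadataPaths(outputDir) {
  return {
    urlIndex: path.posix.join(outputDir, 'metadata', 'url-index.json'),
    flowSummary: path.posix.join(outputDir, 'metadata', 'flow-summary.json'),
  };
}

console.log(flowPaths('docs', 1));
console.log(metadataPaths('docs'));
```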
Cookie Handling Strategies
The tool provides multiple strategies for handling cookie consent banners:
- all (default) - Tries all strategies in sequence
- click - Attempts to click accept/agree buttons
- remove - Removes banner elements from DOM
- block - Blocks cookie consent service requests
- none - No cookie handling
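When the strategy name comes from user input or a config file, it can help to check it against the documented values before passing it on. This is a small sketch of such a check. `resolveCookieStrategy` is a hypothetical helper, not something the package exports, and whether the package itself validates the value is not stated here.

```javascript
// The five strategy names documented above.
const COOKIE_STRATEGIES = ['all', 'click', 'remove', 'block', 'none'];

// Hypothetical helper: return a valid strategy, defaulting to 'all'.
function resolveCookieStrategy(value) {
  if (value === undefined) return 'all';
  if (!COOKIE_STRATEGIES.includes(value)) {
    throw new Error(
      `Unknown cookie strategy "${value}". Expected one of: ${COOKIE_STRATEGIES.join(', ')}`
    );
  }
  return value;
}

console.log(resolveCookieStrategy());
console.log(resolveCookieStrategy('block'));
```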
Advanced Options
Constructor Options
{
  viewport: { width: 1920, height: 1080, deviceScaleFactor: 1 },
  timeout: 30000,         // Navigation timeout
  headless: 'new',        // 'new' or false
  cookieStrategy: 'all',  // Cookie handling strategy
  waitTime: 3000          // Wait after page load
}
Screenshot Options
{
  fullPage: false,        // Capture full scrollable page
  clip: {                 // Capture specific region
    x: 0,
    y: 0,
    width: 800,
    height: 600
  },
  omitBackground: false,  // Transparent background
  encoding: 'binary',     // 'base64' or 'binary'
  type: 'png',            // 'png' or 'jpeg'
  quality: 90             // JPEG quality (0-100)
}
Examples
Capture Competitor Screenshots
const { ScreenshotCapture } = require('@knowcode/screenshotfetch');

async function captureCompetitors() {
  const capture = new ScreenshotCapture({
    cookieStrategy: 'all'
  });
  await capture.init();

  const competitors = [
    'https://mailchimp.com',
    'https://activecampaign.com',
    'https://convertkit.com'
  ];

  for (const url of competitors) {
    // Use the hostname (without a leading "www.") as the file name
    const name = new URL(url).hostname.replace('www.', '');
    await capture.captureScreenshot(
      url,
      `./competitors/${name}.png`
    );
  }

  await capture.close();
}
Custom Cookie Handler
const { ScreenshotCapture, CookieHandler } = require('@knowcode/screenshotfetch');

const cookieHandler = new CookieHandler();

// Add custom selectors for specific sites
cookieHandler.addCustomSelectors([
  'button[id="my-custom-accept"]',
  '.my-site-cookie-accept'
]);

// Add domains to block
cookieHandler.addBlockedDomains([
  'mycookieservice.com'
]);
Troubleshooting
Navigation Timeout
If you're getting timeout errors, try:
- Using domcontentloaded instead of networkidle2
- Increasing the timeout value
- Using headless: false to see what's happening
Cookie Banners Still Visible
Try different strategies:
- Use -s all to try all methods
- Add custom selectors for specific sites
- Use headless: false to debug
Memory Issues
For large batches:
- Increase delay between captures
- Process in smaller batches
- Monitor system resources
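The first two suggestions can be combined in a small wrapper that splits a long list into chunks and pauses between them. This is a sketch in plain JavaScript, not part of the package. `chunk`, `sleep`, and `processInBatches` are hypothetical helpers, and `processBatch` stands in for whatever performs the captures and is assumed to resolve to an array.

```javascript
// Split a list into chunks of at most `size` items.
function chunk(items, size) {
  const chunks = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Run `processBatch` on each chunk in turn, pausing between chunks.
async function processInBatches(items, size, delayMs, processBatch) {
  const results = [];
  const batches = chunk(items, size);
  for (let i = 0; i < batches.length; i += 1) {
    results.push(...(await processBatch(batches[i])));
    // No need to wait after the final chunk.
    if (i < batches.length - 1) await sleep(delayMs);
  }
  return results;
}

// Demo with a stand-in processor that doubles numbers instead of capturing.
processInBatches([1, 2, 3, 4, 5], 2, 10, async (batch) => batch.map((n) => n * 2))
  .then((results) => console.log(results));
```

With the capture API shown earlier, the processor could plausibly be `(batch) => capture.captureMultiple(batch)`, assuming `captureMultiple` resolves to an array of results.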
🤝 Use Cases
UX Research & Documentation
- Document user flows for analysis and improvement
- Create visual user journey maps
- Generate training materials automatically
Quality Assurance & Testing
- Automate UI regression testing with screenshots
- Document application state changes
- Validate user experience flows
Competitive Analysis
- Document competitor application flows
- Compare user experience patterns
- Generate competitive intelligence reports
Process Documentation
- Create step-by-step user guides
- Document internal workflows
- Generate training materials
📋 Requirements
- Node.js: 16.0.0 or higher
- Operating System: Windows, macOS, or Linux
- Memory: 512MB+ available RAM
- Disk Space: Varies based on screenshot quantity
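A script that depends on this tool can check the Node.js requirement up front. The following sketch compares the running major version against the minimum listed above. The helper names are hypothetical, and the check looks only at the major version number.

```javascript
// Minimum major version taken from the requirements list above.
const MIN_NODE_MAJOR = 16;

// Parse the major version from a string such as "18.17.1".
function nodeMajor(version = process.versions.node) {
  return Number.parseInt(version.split('.')[0], 10);
}

// True when the given (or running) Node.js version is new enough.
function meetsNodeRequirement(version = process.versions.node) {
  return nodeMajor(version) >= MIN_NODE_MAJOR;
}

if (!meetsNodeRequirement()) {
  console.warn(`Node.js ${MIN_NODE_MAJOR}+ is required; found ${process.versions.node}`);
}
```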
🔧 Advanced Configuration
Spider Options
{
  maxFlows: 5,               // Maximum customer journeys to discover
  maxDepth: 10,              // Maximum steps per journey
  maxPages: 100,             // Maximum total pages to visit
  waitTime: 2000,            // Wait between actions (ms)
  includeQueryParams: true,  // Include URL query parameters
  cookieStrategy: 'all'      // Cookie consent handling strategy
}
Screenshot Options
{
  fullPage: false,  // Capture full scrollable page
  viewport: {       // Browser viewport size
    width: 1920,
    height: 1080,
    deviceScaleFactor: 1
  },
  type: 'png',      // Image format ('png' or 'jpeg')
  quality: 90       // JPEG quality (0-100)
}
📈 Contributing
We welcome contributions! Please see our contributing guidelines for details on how to:
- Report bugs and request features
- Submit pull requests
- Improve documentation
📄 License
MIT © KnowCode
