repro-in-a-box
v2.6.0
Published
Production-grade automated bug detection with configuration system and 111+ tests. Find bugs with Playwright, configure detectors, record HAR files, validate reproducibility, and integrate with Claude Desktop.
Maintainers
Readme
Repro-in-a-Box v2.6 🎁
Find bugs. Freeze them. Ship them.
Autonomous QA agent that finds bugs on your site, captures reproducible evidence (HAR files + screenshots), validates reproducibility, and provides Claude Desktop integration via MCP.
✨ Features
- 7 Built-in Detectors: JavaScript errors, network failures, broken assets, accessibility (WCAG 2.1), web vitals, mixed content, broken links
- Production-Grade Testing: 112+ tests across 7 test files, ~85% code coverage
- Performance Benchmarked: <100ms detector attach, <500ms collect, <1s for 100 issues
- Multi-Page Crawler: Configurable depth, rate limiting, same-domain filtering
- Auto-Bundling: Creates reproducible ZIP packages with HAR files and screenshots
- HAR Replay: Validates reproducibility by replaying network traffic 3x
- Diff Comparison: Compare scan results across runs
- MCP Server: Claude Desktop integration for AI-powered bug hunting
🚀 Quick Start
# Install
npm install
# Build
npm run build
# Create config file (optional but recommended)
repro init
# Scan a website and create reproducible bundle
repro scan https://your-site.com --bundle
# Validate reproducibility
repro validate repro-your-site-com-*.zip
# Compare two scans
repro diff scan-results-1.json scan-results-2.json
# Start MCP server (for Claude Desktop)
npm run mcp📦 What's in the Box
✅ Week 1-5: Complete Implementation
7 Working Detectors
JavaScript Errors (
js-errors)- Console errors and warnings
- Uncaught exceptions
- Unhandled promise rejections
- Full stack traces
Network Errors (
network-errors)- Failed HTTP requests (4xx, 5xx)
- Timeouts and DNS failures
- Connection errors
- Request/response details
Broken Assets (
broken-assets)- Missing images, scripts, stylesheets
- Failed fonts and media
- Any resource with HTTP ≥400
Accessibility (
accessibility)- WCAG 2.1 Level A & AA via axe-core
- Missing alt text
- Color contrast issues
- Form label violations
- Landmark structure problems
Web Vitals (
web-vitals)- Core Web Vitals (CLS, INP, LCP)
- FCP, TTFB measurements
- Performance thresholds
- Only reports issues (<75th percentile)
Mixed Content (
mixed-content)- HTTP resources on HTTPS pages
- Active/passive mixed content
- Security downgrade detection
Broken Links (
broken-links)- Checks all links on page
- HTTP 4xx/5xx detection
- Network failure tracking
- HEAD request optimization
Auto-Bundling (Week 2)
- Creates ZIP bundles with:
- Scan results JSON
- HAR file (full network recording)
- Screenshots of issues
- Reproduction script
- Setup README
- One command to create reproducible packages
- Bundle size: typically 100-500KB
HAR Replay & Validation (Week 3-4)
- Replays HAR files using Playwright's
routeFromHAR - Runs scans 3x to validate reproducibility
- Calculates reproducibility score (0-100%)
- Detailed diff analysis
- Consistency tracking (always/never/sometimes present)
MCP Server (Week 5)
- stdio transport for Claude Desktop
- Three tools:
scan_site: Scan and bundle websitesvalidate_reproduction: Validate HAR replaydiff_scans: Compare scan results
- Ready for AI-powered bug hunting
📋 Commands
scan - Detect issues and create bundles
repro scan <url> [options]
Options:
-d, --max-depth <number> Maximum crawl depth (default: 2)
-p, --max-pages <number> Maximum pages to scan (default: 10)
-r, --rate-limit <ms> Rate limit between requests (default: 1000)
-o, --output <path> Output directory for results
--bundle Create reproducible ZIP bundle (includes HAR + screenshots)
--screenshots Capture screenshots when issues detected
--record-har Record HAR file during scan
--no-headless Run browser in visible mode
--same-domain-only Only crawl pages on the same domain (default: true)Examples:
# Quick scan with bundle
repro scan https://example.com --max-pages 1 --bundle
# Deep scan (multiple pages)
repro scan https://example.com --max-pages 50 --max-depth 3 --bundle
# Scan without bundling
repro scan https://example.com --output ./scan-results.json
# Watch the browser
repro scan https://example.com --no-headless --bundlevalidate - Verify reproducibility via HAR replay
repro validate <bundle.zip> [options]
Options:
-r, --runs <number> Number of replay runs (default: 3)
-t, --threshold <percent> Minimum reproducibility score (default: 70)
-o, --output <dir> Output directory for extracted bundle
-v, --verbose Show detailed diff and consistency analysis
--json Output results as JSONExamples:
# Validate a bundle (3 runs, 70% threshold)
repro validate repro-example-com-2026-02-15.zip
# Verbose output with 5 runs
repro validate repro-example-com-2026-02-15.zip --runs 5 --verbose
# Strict validation (90% threshold)
repro validate repro-example-com-2026-02-15.zip --threshold 90
# JSON output for parsing
repro validate repro-example-com-2026-02-15.zip --json > validation-results.jsonOutput:
🔍 Validating reproducibility...
📊 Original Scan
URL: https://example.com
Issues: 2
🔄 Replay Runs
Run 1: ✅ Success - 2 issues found
Run 2: ✅ Success - 2 issues found
Run 3: ✅ Success - 2 issues found
📈 Summary
Total runs: 3
Successful: 3/3
Average issues: 2.0
🎯 Reproducibility Score
100.0%
Grade: 🥇 Excellentdiff - Compare scan results
repro diff <baseline.json> <comparison.json>Coming soon: Full CLI implementation (currently available via MCP).
⚙️ Configuration
Repro-in-a-Box supports configuration files to set default values for all options. This eliminates the need to pass the same flags repeatedly.
Quick Start
Create a configuration file interactively:
repro initThis wizard will ask you questions and generate a .reprorc.json file with your preferences.
Configuration Files
Repro-in-a-Box searches for configuration in this order:
--config <path>flag (if specified).reprorc.jsonin current directory.reprorc.jsin current directory (JavaScript module)reprofield inpackage.json
Priority: CLI flags > Config file > Defaults
Example: .reprorc.json
{
"detectors": {
"enabled": ["javascript-errors", "network-errors", "broken-assets"],
"disabled": ["web-vitals"]
},
"crawler": {
"maxDepth": 3,
"maxPages": 100,
"rateLimit": 100,
"sameDomain": true,
"followRedirects": true
},
"browser": {
"headless": true,
"slowMo": 0,
"timeout": 30000
},
"output": {
"format": "json",
"path": "./repro-results",
"verbose": false
},
"thresholds": {
"minReproducibility": 70,
"failOn": ["error"]
},
"bundle": {
"enabled": true,
"includeScreenshots": true,
"includeHar": true,
"compression": "fast"
}
}Example: .reprorc.js (JavaScript)
export default {
crawler: {
maxDepth: 5,
maxPages: 50,
},
detectors: {
// Only run these detectors
enabled: ['javascript-errors', 'accessibility'],
},
bundle: {
enabled: true,
},
};Example: package.json
{
"name": "my-project",
"repro": {
"crawler": {
"maxDepth": 2,
"maxPages": 20
},
"bundle": {
"enabled": true
}
}
}Configuration Options
Detectors
{
"detectors": {
"enabled": [
"javascript-errors", // Console errors, exceptions
"network-errors", // Failed HTTP requests
"broken-assets", // Missing images, scripts
"accessibility", // WCAG 2.1 violations
"web-vitals", // Core Web Vitals
"mixed-content", // HTTP on HTTPS
"broken-links" // Check all links
],
"disabled": [] // Disable specific detectors
}
}Note: If enabled is empty or not specified, all detectors run. Use disabled to exclude specific ones.
Crawler
{
"crawler": {
"maxDepth": 3, // How many clicks deep (1-10)
"maxPages": 100, // Maximum pages to scan (1-1000)
"rateLimit": 100, // Delay between requests in ms (0-10000)
"sameDomain": true, // Only crawl same domain
"followRedirects": true // Follow HTTP redirects
}
}Browser
{
"browser": {
"headless": true, // Run browser in background
"slowMo": 0, // Slow down operations (ms)
"timeout": 30000, // Page load timeout (ms)
"userAgent": "..." // Custom user agent (optional)
}
}Output
{
"output": {
"format": "json", // json | text | csv | html
"path": "./repro-results", // Output directory
"verbose": false, // Detailed logging
"quiet": false // Suppress output except errors
}
}Thresholds
{
"thresholds": {
"minReproducibility": 70, // Min score to pass (0-100)
"maxIssues": null, // Max issues before failing
"failOn": ["error"] // Fail on: error | warning | info
}
}Bundle
{
"bundle": {
"enabled": false, // Create ZIP bundles
"includeScreenshots": true, // Include issue screenshots
"includeHar": true, // Include HAR file
"compression": "fast" // none | fast | best
}
}Using Configurations
1. With .reprorc.json in project
# Uses .reprorc.json settings
repro scan https://example.com
# CLI flags override config
repro scan https://example.com --max-depth 52. With custom config path
repro scan https://example.com --config ./configs/prod.json3. Mix and match
# Config: maxDepth=3, maxPages=100, bundle=true
# CLI overrides maxPages to 20
repro scan https://example.com --max-pages 20Real-World Examples
CI/CD Pipeline (.reprorc.json):
{
"crawler": { "maxPages": 50, "rateLimit": 0 },
"thresholds": { "minReproducibility": 80, "failOn": ["error", "warning"] },
"bundle": { "enabled": true }
}Local Development:
repro scan https://localhost:3000 --no-headless --max-pages 5Production Monitoring (repro-prod.json):
{
"detectors": {
"enabled": ["javascript-errors", "network-errors", "accessibility"]
},
"crawler": { "maxDepth": 5, "maxPages": 200 },
"thresholds": { "failOn": ["error"] }
}repro scan https://prod.example.com --config repro-prod.json� MCP Server Integration
Repro-in-a-Box includes an MCP server for Claude Desktop integration.
Setup
- Build the project:
npm run build- Add to your Claude Desktop config (
~/Library/Application Support/Claude/claude_desktop_config.jsonon macOS):
{
"mcpServers": {
"repro-in-a-box": {
"command": "node",
"args": ["/absolute/path/to/repro-in-a-box/dist/mcp/index.js"]
}
}
}Restart Claude Desktop
Verify it's working:
- Ask Claude: "What tools do you have?"
- You should see:
scan_site,validate_reproduction,diff_scans
Using with Claude
Ask Claude to scan websites:
"Scan https://example.com for bugs and create a reproducible bundle"Validate reproducibility:
"Validate the reproducibility of repro-example-com-2026-02-15.zip"Compare scans:
"Compare scan-results-1.json with scan-results-2.json"🧪 Testing
# Run all tests
npm test
# Run tests with UI
npm run test:ui
# Run tests in watch mode
npm test -- --watch
# Generate coverage report
npm test -- --coverageCurrent test coverage:
- ✅ Diff utility: 10 tests passing
- ✅ Detector registry: 5 tests passing
- ✅ 15 total tests passing
📊 Example Output
Scan Output
🔍 Repro-in-a-Box Scanner
========================
URL: https://example.com
Max Depth: 2
Max Pages: 10
Bundle: Yes (includes HAR + screenshots)
📦 Registered detectors:
- JavaScript Errors (js-errors)
- Network Errors (network-errors)
- Broken Assets (broken-assets)
- Accessibility (accessibility)
- Web Vitals (web-vitals)
- Mixed Content (mixed-content)
- Broken Links (broken-links)
🚀 Starting scan...
📄 Scanning: https://example.com/
⚠️ Found 2 issue(s)
📊 Scan Results
===============
Pages scanned: 1
Total issues: 2
Duration: 3.27s
Issues by severity:
error: 2
Issues by category:
accessibility: 2
💾 Results saved to: scan-results.json
📦 Creating reproducible bundle...
✅ Bundle created: repro-example-com-2026-02-15-16-03-28.zip
Size: 20.77 KB
Contents: 6 files
To reproduce:
unzip repro-example-com-2026-02-15-16-03-28.zip
chmod +x reproduce.sh
./reproduce.sh🗺️ Roadmap
✅ Week 1: Foundation (Complete)
- [x] Detector framework with lifecycle hooks
- [x] 7 core detectors (JS errors, network, assets, accessibility, web vitals, mixed content, broken links)
- [x] CLI framework with Commander
- [x] Multi-page crawler with rate limiting
✅ Week 2: Auto-Bundling (Complete)
- [x] ZIP creation with HAR + scan results
- [x] Screenshot capture on issues
- [x] Reproduction script generation
- [x] Bundle README generation
✅ Week 3-4: Determinism Engine (Complete)
- [x] HAR recording during scan
- [x] HAR replay validation (3x runs)
- [x] Reproducibility scoring (0-100%)
- [x] Diff utility for comparing scans
- [x] Consistency analysis
- [x]
validatecommand
✅ Week 5: MCP Server (Complete)
- [x] MCP server with stdio transport
- [x]
scan_sitetool - [x]
validate_reproductiontool - [x]
diff_scanstool - [x] Claude Desktop integration
✅ Week 6: Polish & Testing (Complete - v2.5.0)
- [x] 112+ comprehensive tests (5x increase)
- [x] ~85% code coverage across all modules
- [x] Performance benchmarks established
- [x] Security vulnerabilities fixed
- [x] npm published successfully
🎯 Future Enhancements (v2.6+)
Developer Experience:
- Interactive setup wizard (
repro init) - Configuration file support (
.reprorc.json) - Multiple output formats (JSON, CSV, GitHub Actions)
- Custom detector plugins API
New Detectors:
- SEO detector (meta tags, Open Graph, structured data)
- Performance detector (bundle size, render blocking resources)
- Security detector (CSP violations, mixed content warnings)
- Console warnings (separate from errors)
- Memory leak detection
Advanced Features:
- HTML report generator with charts
- GitHub Action for CI/CD integration
- Historical trending and comparison
- Scheduled scanning (cron-like)
- Webhook notifications (Slack, Discord)
UI/Visualization:
- Web dashboard for scan results
- Browser extension for one-click scanning
- Interactive issue explorer
- Visual diff comparison
🏗️ Architecture
src/
├── detectors/ # Issue detection plugins
│ ├── base.ts # Base detector interface
│ ├── registry.ts # Detector management
│ ├── js-errors.ts # JavaScript error detector
│ ├── network-errors.ts # Network failure detector
│ ├── broken-assets.ts # Asset loading detector
│ ├── accessibility.ts # WCAG violations (axe-core)
│ ├── web-vitals.ts # Core Web Vitals
│ ├── mixed-content.ts # HTTP/HTTPS mixed content
│ └── broken-links.ts # Broken link checker
├── crawler/ # Multi-page web crawler
│ └── index.ts # Configurable depth/rate limiting
├── scanner/ # Orchestrates detectors + crawler
│ └── index.ts # Scan lifecycle management
├── bundler/ # ZIP creation (Week 2)
│ └── index.ts # HAR + screenshots + script
├── determinism/ # HAR replay & validation (Week 3-4)
│ ├── replayer.ts # HAR replay via Playwright
│ ├── diff.ts # Scan comparison utility
│ └── __tests__/ # Unit tests
├── mcp/ # MCP server (Week 5)
│ ├── server.ts # MCP tool implementations
│ └── index.ts # stdio transport
└── cli/ # Command-line interface
├── index.ts # CLI entry point
└── commands/ # scan, validate, diff🔧 Development
# Install dependencies
npm install
# Watch mode for development
npm run dev -- scan https://example.com
# Build TypeScript
npm run build
# Run tests (112+ tests)
npm test
# Run tests with coverage
npm test -- --coverage --run
# Run tests with UI
npm run test:ui
# Start MCP server
npm run mcp
# Type check
npx tsc --noEmitTest Coverage
v2.5.0 Test Suite:
- 7 test files, 1,397 lines of test code
- 112+ tests across all modules
- ~85% code coverage
- Performance benchmarks included
Test Files:
tests/bundler.test.ts- Bundle creation & validationtests/crawler.test.ts- Multi-page crawling logictests/detectors.test.ts- All 7 detector implementationstests/cli.test.ts- CLI command parsing (22 tests)tests/detector-edge-cases.test.ts- Edge case coverage (22 tests)tests/performance.test.ts- Performance benchmarks (9 benchmarks)tests/integration/mcp-server.test.ts- MCP tool validation (35 tests)
🤝 Contributing
Contributions are welcome! This project follows a 6-week development plan:
- Weeks 1-6: Core features & testing (✅ Complete)
- v2.5.0: Production-grade quality with comprehensive test coverage
See ENHANCEMENT_SUMMARY.md for v2.5.0 improvements.
📝 License
MIT © 2026
Current Version: 2.5.0 🚀
Test Coverage: 112+ tests, ~85% coverage
Status: Production-ready
Status: Week 6 in progress - Final polish before v2.0.0 release
Tests: 15/15 passing ✅
