npm package discovery and stats viewer.

Discover Tips

  • General search

    [free text search, go nuts!]

  • Package details

    pkg:[package-name]

  • User packages

    @[username]

Sponsor

Optimize Toolset

I’ve always been into building performant and accessible sites, but lately I’ve been taking it extremely seriously. So much so that I’ve been building a tool to help me optimize and monitor the sites that I build to make sure that I’m making an attempt to offer the best experience to those who visit them. If you’re into performant, accessible and SEO friendly sites, you might like it too! You can check it out at Optimize Toolset.

About

Hi, 👋, I’m Ryan Hefner  and I built this site for me, and you! The goal of this site was to provide an easy way for me to check the stats on my npm packages, both for prioritizing issues and updates, and to give me a little kick in the pants to keep up on stuff.

As I was building it, I realized that I was actually using the tool to build the tool, and figured I might as well put this out there and hopefully others will find it to be a fast and useful way to search and browse npm packages as I have.

If you’re interested in other things I’m working on, follow me on Twitter or check out the open source projects I’ve been publishing on GitHub.

I am also working on a Twitter bot for this site to tweet the most popular, newest, random packages from npm. Please follow that account now and it will start sending out packages soon–ish.

Open Software & Tools

This site wouldn’t be possible without the immense generosity and tireless efforts from the people who make contributions to the world and share their work via open source initiatives. Thank you 🙏

© 2025 – Pkg Stats / Ryan Hefner

playwright-smart-reporter

v0.6.0

Published

An intelligent Playwright HTML reporter with AI-powered failure analysis, flakiness detection, and performance regression alerts

Downloads

1,243

Readme

playwright-smart-reporter

Let's Build QA

An intelligent Playwright HTML reporter with AI-powered failure analysis, flakiness detection, and performance regression alerts.

Report Overview v0.5.0 dashboard showing: stability score filtering (A-F grades), trend charts with 2x2 grid layout, run comparison, attachment gallery, and comprehensive test results

Features

  • AI Failure Analysis - Get AI-powered suggestions to fix failing tests (Claude/OpenAI)
  • Flakiness Detection - Tracks test history to identify unreliable tests
  • Performance Regression Alerts - Warns when tests get significantly slower
  • Pass Rate Trend Chart - Visual graph showing pass rates across runs
  • Per-Test History - Sparklines and duration charts for each test
  • Step Timing Breakdown - See which steps are slowest with visual bars
  • Screenshot Embedding - Failure screenshots displayed inline
  • Video Links - Quick access to test recordings
  • One-Click Trace Viewing - Downloads trace and opens trace.playwright.dev
  • Collapsible File Groups - Tests organized by file
  • Search & Filter - Find tests by name, filter by status
  • JSON Export - Download results for external processing
  • Slack/Teams Notifications - Get alerted on failures
  • CI Integration - Auto-detects GitHub, GitLab, CircleCI, Jenkins, Azure DevOps
  • NEW: Test Retry Analysis - Track tests that frequently need retries
  • NEW: Failure Clustering - Group similar failures by error type
  • NEW: Stability Scoring - Composite health metrics (0-100 with letter grades)
  • NEW: Attachment Gallery - Grid view of all screenshots, videos, and traces
  • NEW: Run Comparison - Compare current run vs baseline to see what changed
  • NEW: Smart Recommendations - AI-powered actionable insights
  • NEW: Trace Viewer Integration - Open Playwright traces directly from report
  • NEW: Merge History CLI - Combine parallel CI run histories

New in v0.5.0

Test Retry Analysis

Track tests that require multiple attempts to pass. The report now identifies tests with high retry rates, helping you spot unreliable tests that pass inconsistently.

  • Automatically detects tests that passed only after retries
  • Shows retry count for each test
  • Highlights tests that exceed the retry threshold (default: 3)
  • Helps identify flaky tests that might pass/fail randomly

Failure Clustering

Automatically groups similar failures together by analyzing error messages and stack traces. This makes it easier to identify patterns and common root causes across multiple test failures.

  • Groups failures by error type (timeout, assertion, network, etc.)
  • Shows count of similar failures in each cluster
  • Helps prioritize fixes by identifying widespread issues
  • Reduces noise when debugging multiple failures

Stability Scoring

Comprehensive health metrics that give each test a stability score from 0-100 with letter grades (A+ to F).

  • A+ (95-100): Rock solid, consistently passing
  • A (90-94): Very stable with rare failures
  • B (80-89): Generally stable with occasional issues
  • C (70-79): Moderately stable, needs attention
  • D (60-69): Unstable, frequent failures
  • F (<60): Critically unstable, requires immediate attention

Scores are calculated based on:

  • Pass rate over recent runs
  • Retry frequency
  • Performance consistency
  • Failure patterns

Attachment Gallery

New gallery view displaying all test attachments (screenshots, videos, traces) in an organized grid layout.

  • Visual grid of all screenshots from test runs
  • Quick access to videos and trace files
  • Filter by test status or file name
  • Thumbnail previews with click-to-expand
  • Trace Viewer integration - open .zip traces directly in Playwright's trace viewer

Run Comparison

Compare the current test run against a baseline to see exactly what changed.

  • Side-by-side comparison of test results
  • Highlights new failures, new passes, and regressions
  • Shows performance changes (faster/slower)
  • Identifies newly added or removed tests
  • Configure baseline using baselineRunId option

Smart Recommendations

AI-powered actionable insights based on your test results. The reporter analyzes patterns and provides specific recommendations:

  • Suggests tests to investigate based on failure patterns
  • Identifies performance bottlenecks
  • Recommends retry configuration adjustments
  • Highlights flaky tests that need stabilization
  • Provides priority-ranked action items

Merge History CLI

New command-line tool for combining test histories from parallel CI runs.

# Merge histories from multiple parallel shards
npx playwright-smart-reporter-merge-history history1.json history2.json -o merged.json

# Use glob patterns for multiple files
npx playwright-smart-reporter-merge-history 'blob-reports/**/test-history.json' -o test-history.json --max-runs 10

This is essential for maintaining accurate history when running tests in parallel across multiple machines or shards.

Installation

npm install -D playwright-smart-reporter

Usage

Add to your playwright.config.ts:

import { defineConfig } from '@playwright/test';

export default defineConfig({
  reporter: [
    ['playwright-smart-reporter', {
      outputFile: 'smart-report.html',
      historyFile: 'test-history.json',
      maxHistoryRuns: 10,
      performanceThreshold: 0.2,
      slackWebhook: process.env.SLACK_WEBHOOK_URL,
      teamsWebhook: process.env.TEAMS_WEBHOOK_URL,
      // v0.5.0 features
      enableRetryAnalysis: true,
      enableFailureClustering: true,
      enableStabilityScore: true,
      enableGalleryView: true,
      enableComparison: true,
      enableAIRecommendations: true,
      stabilityThreshold: 70,
      retryFailureThreshold: 3,
      baselineRunId: 'main-branch-baseline', // optional
    }],
  ],
});

Configuration Options

| Option | Default | Description | |--------|---------|-------------| | outputFile | smart-report.html | Path for the HTML report | | historyFile | test-history.json | Path for test history storage | | maxHistoryRuns | 10 | Number of runs to keep in history | | performanceThreshold | 0.2 | Threshold for performance regression (20%) | | slackWebhook | - | Slack webhook URL for failure notifications | | teamsWebhook | - | Microsoft Teams webhook URL for notifications | | enableRetryAnalysis | true | Track tests that frequently need retries | | enableFailureClustering | true | Group similar failures by error type | | enableStabilityScore | true | Show stability scores (0-100) with letter grades | | enableGalleryView | true | Display attachment gallery view | | enableComparison | true | Enable run comparison against baseline | | enableAIRecommendations | true | Generate AI-powered recommendations | | stabilityThreshold | 70 | Minimum stability score (C grade) to avoid warnings | | retryFailureThreshold | 3 | Number of retries before flagging as problematic | | baselineRunId | - | Optional: Run ID to use as baseline for comparisons |

AI Analysis

To enable AI-powered failure analysis, set one of these environment variables:

# Using Anthropic Claude
export ANTHROPIC_API_KEY=your-api-key

# OR using OpenAI
export OPENAI_API_KEY=your-api-key

The reporter will automatically analyze failures and provide fix suggestions in the report.

Report Features

Summary Dashboard

  • Pass rate ring with percentage
  • Pass/fail/skip counts
  • Flaky test count
  • Slow test count
  • Total duration

Trend Charts

Trend Charts Responsive 2x2 grid layout with interactive tooltips showing exact values on hover

Visual charts showing test history across runs:

Main Trend Chart (top):

  • Green - Passed tests
  • Red - Failed tests
  • Gray - Skipped tests
  • Hover over any bar to see exact counts
  • Current run highlighted with border

Secondary Charts (2x2 grid):

  • Duration - Suite execution time per run with values on hover
  • Flaky - Number of flaky tests detected per run
  • Slow - Number of slow tests detected per run
  • Run Comparison - Diff against baseline showing changes

Flakiness Indicators

  • 🟢 Stable (<10% failure rate)
  • 🟡 Unstable (10-30% failure rate)
  • 🔴 Flaky (>30% failure rate)
  • New (no history yet)
  • Skipped (test was skipped)

Performance Trends

  • Regression - Test is slower than average
  • Improved - Test is faster than average
  • Stable - Test is within normal range

Per-Test Details

Expanded Test Expanded failed test showing: run history sparkline, duration trend chart, step timings with SLOWEST badge, clean error display (no ANSI codes), embedded screenshot, and AI-powered suggestions

When you expand a test, you'll see:

  • Pass/Fail Sparkline - Green/red/gray dots showing the pattern across runs
  • Duration Trend Chart - Bar chart showing performance over time
  • Step Timings - Visual breakdown with slowest step highlighted (SLOWEST badge)
  • Error Details - Clean error messages with ANSI escape codes stripped
  • Screenshot - Embedded failure screenshot (click to expand to full size)
  • AI Suggestion - Contextual fix recommendations from Claude or OpenAI
  • Retry Information - Shows which retry attempt passed (if applicable)
  • Stability Score - Letter grade (A-F) indicating test health
  • Current run highlighted with distinct styling

Step Timings

  • Visual bar chart of step durations
  • "SLOWEST" badge on the slowest step
  • Helps identify bottlenecks in your tests

Additional Features

  • Screenshot embedding - Failure screenshots shown inline
  • Video links - Quick access to recordings
  • Trace viewing - Downloads trace and opens Playwright's trace viewer
  • Retry badge - Shows if test passed on retry
  • Search - Filter tests by name or file
  • File groups - Collapsible groups by file
  • JSON export - Download full results

CI Integration

Persisting History Across Runs

For flakiness detection and performance trends to work in CI, you need to persist test-history.json between runs.

GitHub Actions

- name: Restore test history
  uses: actions/cache@v4
  with:
    path: test-history.json
    key: test-history-${{ github.ref }}
    restore-keys: |
      test-history-

- name: Run Playwright tests
  run: npx playwright test

- name: Save test history
  uses: actions/cache/save@v4
  if: always()
  with:
    path: test-history.json
    key: test-history-${{ github.ref }}-${{ github.run_id }}

GitLab CI

test:
  cache:
    key: test-history-$CI_COMMIT_REF_SLUG
    paths:
      - test-history.json
    policy: pull-push
  script:
    - npx playwright test

CircleCI

- restore_cache:
    keys:
      - test-history-{{ .Branch }}
      - test-history-

- run: npx playwright test

- save_cache:
    key: test-history-{{ .Branch }}-{{ .Revision }}
    paths:
      - test-history.json

CI Environment Detection

The reporter automatically detects CI environments and enriches history with:

  • Run ID - Unique identifier for the run
  • Branch - Current branch name
  • Commit - Short commit SHA
  • CI Provider - GitHub, GitLab, CircleCI, Jenkins, Azure DevOps

This metadata is stored with each run for debugging and audit purposes.

Webhook Notifications

Configure Slack or Teams webhooks to receive notifications when tests fail:

reporter: [
  ['playwright-smart-reporter', {
    slackWebhook: process.env.SLACK_WEBHOOK_URL,
    teamsWebhook: process.env.TEAMS_WEBHOOK_URL,
  }],
]

Notifications include:

  • Summary of passed/failed tests
  • List of first 5 failed test names
  • Only sent when there are failures

Merging Reports from Multiple Machines

When running tests in parallel across multiple machines (sharding), you can merge both the test results and history files.

1. Configure each machine to output blobs and history

// playwright.config.ts
const outputDir = process.env.PLAYWRIGHT_BLOB_OUTPUT_DIR || 'blob-report';

export default defineConfig({
  reporter: [
    ['blob'],
    ['playwright-smart-reporter', {
      outputFile: `${outputDir}/smart-report.html`,
      historyFile: `${outputDir}/test-history.json`,
    }],
  ],
});

2. Run tests on each machine

# Machine 1
PLAYWRIGHT_BLOB_OUTPUT_DIR=blob-reports/machine1 npx playwright test --shard=1/2

# Machine 2
PLAYWRIGHT_BLOB_OUTPUT_DIR=blob-reports/machine2 npx playwright test --shard=2/2

3. Merge history files

npx playwright-smart-reporter-merge-history \
  blob-reports/machine1/test-history.json \
  blob-reports/machine2/test-history.json \
  -o blob-reports/merged/test-history.json

4. Merge blob reports

cp blob-reports/machine1/*.zip blob-reports/machine2/*.zip blob-reports/merged/
npx playwright merge-reports --reporter=playwright-smart-reporter blob-reports/merged

The merged report will include all tests from all machines with unified history for flakiness detection and trend charts.

Development

# Install dependencies
npm install

# Build
npm run build

# Run demo tests
npm run test:demo

# Open the report
open example/smart-report.html

Merge History CLI

When running tests in parallel across multiple machines or CI shards, use the merge history CLI tool to combine test histories:

# Merge parallel test histories
npx playwright-smart-reporter-merge-history history1.json history2.json -o merged.json

# Use with glob patterns
npx playwright-smart-reporter-merge-history 'blob-reports/**/test-history.json' -o test-history.json --max-runs 10

This ensures your flakiness detection and performance trends work correctly even with parallelized test execution. The merged history maintains all the metadata from individual runs while deduplicating test results.

License

MIT