@phoenixaihub/flake-finder

v0.1.0

flake-finder

Detect, score, and quarantine flaky tests in your CI pipeline.

flake-finder tracks test results over time and uses Bayesian statistics + change-point detection to identify flaky tests — tests that fail non-deterministically without any code change.


Features

  • 📥 Ingest JUnit XML, Jest JSON, pytest JSON, or generic JSON
  • 📊 Score each test with a Bayesian flakiness score (0–100)
  • 🔍 Distinguish regression from flakiness via CUSUM change-point detection
  • ⚖️ Weight recent results more heavily via exponential decay (14-day half-life)
  • 🚫 Quarantine flaky tests with ready-to-use config for Jest, pytest, and JUnit
  • 🤖 CI-native commands: ingest, check (exits with code 1 to fail the build), and GitHub PR comment generation

Install

npm install -g @phoenixaihub/flake-finder
# or as a dev dependency:
npm install -D @phoenixaihub/flake-finder

Requires Node.js 18+.


Quick Start

1. Track test results

# JUnit XML (Java, Go, Python, Rust...)
flake-finder track results.xml

# Jest JSON (jest --json)
jest --json --outputFile results.json
flake-finder track results.json --format jest

# pytest (pytest-json-report)
pytest --json-report --json-report-file=results.json
flake-finder track results.json --format pytest

Run this after every CI build. Results accumulate in .flake-finder/results.db.
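
To make the ingestion step concrete, here is a minimal sketch of how pass/fail outcomes can be read from a JUnit XML report. It is not flake-finder's own parser: the `parse_junit` function, the `classname#name` naming, and the sample report are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical JUnit XML report with one failing, one passing, one skipped test.
SAMPLE = """<testsuite name="payments" tests="3">
  <testcase classname="PaymentService" name="processRefund">
    <failure message="timeout"/>
  </testcase>
  <testcase classname="PaymentService" name="processCharge"/>
  <testcase classname="PaymentService" name="legacyFlow"><skipped/></testcase>
</testsuite>"""


def parse_junit(xml_text: str) -> list[tuple[str, str]]:
    """Return (test name, status) pairs; status is 'pass', 'fail' or 'skipped'."""
    root = ET.fromstring(xml_text)
    results = []
    # iter() finds <testcase> elements under <testsuite> or <testsuites> roots.
    for case in root.iter("testcase"):
        name = f'{case.get("classname", "")}#{case.get("name", "")}'
        if case.find("skipped") is not None:
            status = "skipped"
        elif case.find("failure") is not None or case.find("error") is not None:
            status = "fail"
        else:
            status = "pass"
        results.append((name, status))
    return results


for name, status in parse_junit(SAMPLE):
    print(f"{status:<7} {name}")
```

Each run contributes one such outcome per test, which is what lets a score be computed across builds.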

2. View the flakiness report

flake-finder report

# Only show tests with score > 20
flake-finder report --threshold 20

# Output as Markdown
flake-finder report --format markdown

Sample output:

🔍 Flaky Test Report

┌─────────────────────────────────────────────────┬───────┬───────────┬──────┬───────┬──────────────┐
│ Test                                            │ Score │ Fail Rate │ Runs │ Fails │ Change Point │
├─────────────────────────────────────────────────┼───────┼───────────┼──────┼───────┼──────────────┤
│ LoginPage > handles session timeout             │  72.4 │    68.0%  │  25  │   17  │ flaky        │
│ PaymentService#processRefund                    │  48.1 │    42.0%  │  12  │    5  │ ⚠ regression │
│ UserAuth > validates expired token              │  31.2 │    28.0%  │  18  │    5  │ flaky        │
│ …SearchController#testPaginationEdgeCase        │  12.7 │    11.0%  │  27  │    3  │ flaky        │
└─────────────────────────────────────────────────┴───────┴───────────┴──────┴───────┴──────────────┘

  4 flaky test(s) found
  Score: 0-100 (higher = flakier) | ⚠ = regression detected, not pure flakiness

3. Generate quarantine config

# Show all formats
flake-finder quarantine --threshold 20

# Dry run to preview
flake-finder quarantine --threshold 20 --dry-run

Output includes:

📦 Jest (--testPathIgnorePatterns):
--testPathIgnorePatterns \
  "LoginPage",
  "UserAuth"

🐍 pytest (-k exclusion):
pytest -k 'not "LoginPage > handles session timeout" and not "UserAuth > validates expired token"'

☕ JUnit (@Ignore annotations):
  @Ignore("flaky: score=72.4")
  // LoginPage > handles session timeout

4. Stats dashboard

flake-finder stats

Sample output:

📊 Flake-Finder Stats Dashboard

  Total tests tracked:  142
  Total test runs:      38
  Total results:        5,396
  Flaky tests:          7 / 142 (threshold: 10)
  Date range:           12/15/2023 → 1/15/2024

🔥 Top 10 Flakiest Tests:
  ███████░░░  72.4 LoginPage > handles session timeout
  ████░░░░░░  48.1 PaymentService#processRefund
  ███░░░░░░░  31.2 UserAuth > validates expired token
  ...

CI Integration

GitHub Actions

# .github/workflows/test.yml
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with: { node-version: '20' }
      - run: npm ci && npm test -- --json --outputFile results.json

      - name: Ingest flake results
        if: always()  # still record results when the test step fails
        run: npx flake-finder ci ingest results.json
        env:
          GITHUB_SHA: ${{ github.sha }}

      - name: Check for flaky tests
        run: npx flake-finder ci check --threshold 25

      - name: Post PR comment
        if: github.event_name == 'pull_request'
        run: |
          npx flake-finder ci comment > /tmp/flake-comment.md
          gh pr comment ${{ github.event.pull_request.number }} --body-file /tmp/flake-comment.md
        env:
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}

CircleCI

jobs:
  test:
    steps:
      - checkout
      - run: npm test -- --json --outputFile results.json
      - run: |
          npx flake-finder ci ingest results.json
          npx flake-finder ci check --threshold 25

Persisting the database across runs

For the flakiness scores to improve over time, persist the .flake-finder/ directory as a CI cache:

GitHub Actions:

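- uses: actions/cache@v4
  with:
    path: .flake-finder
    # Caches are immutable per key: save under a unique key, restore the latest.
    key: flake-finder-${{ runner.os }}-${{ github.run_id }}
    restore-keys: |
      flake-finder-${{ runner.os }}-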

How It Works

Bayesian Flakiness Score

Each test gets a Beta distribution as its failure rate posterior:

  • Prior: Beta(1, 1), a uniform (uninformative) prior that assumes nothing about the failure rate
  • For each test result: add weight to α (failure) or β (pass)
  • Results are exponentially decayed by age (14-day half-life by default)
  • Score = E[failure_rate] × confidence_weight × 100

This means:

  • A test that fails once in 100 runs scores ~1–2
  • A test that fails 5 times in 10 runs scores ~40–60
  • A test that always fails scores close to 100 (high confidence)
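
The scoring steps above can be sketched as follows. The README does not define `confidence_weight`, so this sketch assumes `evidence / (evidence + pseudo_count)`, where `evidence` is the decayed number of observations and `pseudo_count` is a hypothetical constant. The `Result` class and `flakiness_score` function are also illustrative names, not the package's API.

```python
from dataclasses import dataclass

HALF_LIFE_DAYS = 14.0  # default half-life stated in the README


@dataclass
class Result:
    failed: bool
    age_days: float


def flakiness_score(results: list[Result], pseudo_count: float = 2.0) -> float:
    """Beta-posterior failure rate, scaled by an assumed confidence weight."""
    alpha, beta = 1.0, 1.0  # Beta(1, 1) prior
    for r in results:
        weight = 2.0 ** (-(r.age_days / HALF_LIFE_DAYS))
        if r.failed:
            alpha += weight
        else:
            beta += weight
    expected_failure_rate = alpha / (alpha + beta)  # mean of Beta(alpha, beta)
    evidence = (alpha - 1.0) + (beta - 1.0)         # total decayed weight observed
    confidence = evidence / (evidence + pseudo_count)  # assumption, see lead-in
    return expected_failure_rate * confidence * 100.0


def history(fails: int, runs: int) -> list[Result]:
    """Build a same-day history with the given number of failures."""
    return [Result(failed=i < fails, age_days=0.0) for i in range(runs)]


for fails, runs in ((1, 100), (5, 10), (50, 50)):
    print(f"{fails}/{runs} failures -> {flakiness_score(history(fails, runs)):.1f}")
```

With this choice of confidence weight, same-day histories land in the ranges listed above. Other definitions of the weight would shift the numbers.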

Change-Point Detection (CUSUM)

Uses Page's CUSUM algorithm to detect whether a test has shifted from passing to failing (a regression) rather than flipping at random (flakiness):

  • Encodes pass=0, fail=1
  • Accumulates deviations from baseline failure rate
  • If cumulative sum exceeds threshold → change point detected
  • Tests with change points are flagged ⚠ regression in reports
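
As an illustration of the steps above, here is a minimal one-sided CUSUM sketch. The `slack` and `threshold` values, the use of the overall mean as the baseline, and the function name are assumptions. The README does not state which parameters flake-finder uses.

```python
def cusum_change_point(outcomes, baseline, slack=0.1, threshold=3.0):
    """Return the index where the upper CUSUM first exceeds `threshold`, else None.

    outcomes: sequence of 0 (pass) / 1 (fail), oldest first.
    baseline: reference failure rate that deviations are measured against.
    slack:    allowance subtracted each step so small deviations do not add up.
    """
    s = 0.0
    for i, x in enumerate(outcomes):
        # Accumulate the deviation from baseline; clamp at zero (Page's scheme).
        s = max(0.0, s + (x - baseline - slack))
        if s > threshold:
            return i
    return None


regression = [0] * 10 + [1] * 10  # passes, then starts failing consistently
flaky = [0, 1] * 10               # flips back and forth, same 50% failure rate

for label, seq in (("regression", regression), ("flaky", flaky)):
    baseline = sum(seq) / len(seq)
    print(label, "->", cusum_change_point(seq, baseline))
```

Both sequences fail half the time, but only the sustained run of failures pushes the cumulative sum over the threshold. The alternating sequence keeps being pulled back toward zero.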

Exponential Decay

Results decay with a half-life of 14 days:

weight = 2^(-(age_days / 14))

A result from 14 days ago therefore contributes half as much signal as one recorded today.
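
A short worked example of the weight formula, using the 14-day default. The function name `decay_weight` is chosen here for illustration.

```python
def decay_weight(age_days: float, half_life_days: float = 14.0) -> float:
    """Weight of a result that is `age_days` old: 2^(-(age_days / half_life_days))."""
    return 2.0 ** (-(age_days / half_life_days))


for age in (0, 7, 14, 28):
    print(f"{age:>2} days old -> weight {decay_weight(age):.3f}")
```

Each additional half-life halves the weight again, so a 28-day-old result counts a quarter as much as a fresh one.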


Configuration

All commands accept --db <path> to override the default .flake-finder/results.db location.

Environment variables respected during ci ingest:

  • GITHUB_SHA or GIT_COMMIT — auto-attached as commit SHA
  • GITHUB_RUN_ID or CI_RUN_ID — auto-attached as run ID
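
A minimal sketch of how such a lookup could work. The precedence (first variable listed wins) and the names `first_env` and `ci_metadata` are assumptions. The README only says which variables are respected.

```python
import os


def first_env(names, env=None):
    """Return the first non-empty value among `names`, or None if none is set."""
    env = os.environ if env is None else env
    for name in names:
        value = env.get(name)
        if value:
            return value
    return None


def ci_metadata(env=None):
    """Collect the commit SHA and run ID that would be attached to an ingest."""
    return {
        "commit_sha": first_env(("GITHUB_SHA", "GIT_COMMIT"), env),
        "run_id": first_env(("GITHUB_RUN_ID", "CI_RUN_ID"), env),
    }


# Example with an explicit mapping instead of the real process environment.
print(ci_metadata({"GIT_COMMIT": "abc123", "GITHUB_RUN_ID": "42"}))
```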

Contributing

See CONTRIBUTING.md.


License

MIT © PhoenixAI Hub