regression-proof

v2.0.0

Published

a month ago

Local red/green evidence summaries for bug-fix pull requests.

jinhyuk9714

cli testing regression regression-testing pull-requests open-source git

regression-proof helps external contributors prepare a short verification note before opening a PR. It records that a focused regression command failed before the fix and passed after it, adds JS/TS-first review hints for the diff, and renders Markdown that maintainers can scan quickly.

Why

Bug-fix PRs are easier to review when they answer a few concrete questions:

Did the contributor reproduce the bug with a focused test or command?
Did the same command pass after the fix?
Did the diff stay small and reviewable?
Were lockfile, dependency, generated-file, and binary-file review hints checked?

The local PR body is still contributor-provided evidence, not a guarantee. For stronger maintainer-side verification, use verify-commits or the GitHub Action so CI reruns the red/green command on local refs.

Quick Start

npm install -D regression-proof

Using npx regression-proof or pnpm dlx regression-proof downloads the package through the npm registry. The CLI itself does not make network calls by default while recording local evidence.

# Once per working repo
npx regression-proof init --base origin/main

# After adding a regression test, before applying the fix
npx regression-proof red --red-match "expected token" "pnpm test tests/parser.test.ts"

# After applying the fix
npx regression-proof green "pnpm test tests/parser.test.ts"

# Record review hints and check readiness
npx regression-proof diff
npx regression-proof status

# Generate Markdown for the PR body
npx regression-proof body --issue 123 --output pr-body.md

Run regression-proof status before regression-proof body. Missing, stale, failed, inconclusive, or incompatible evidence is shown explicitly instead of being rendered as ready evidence.
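The ordering above can be scripted. This is a minimal sketch, assuming status exits non-zero when evidence is not ready; the README documents the exit codes but does not state this for status specifically. The prepare_pr_body helper and the RP variable are hypothetical names, not part of the CLI.

```shell
#!/usr/bin/env bash
# RP holds the CLI invocation; override it (for example with pnpm dlx) if needed.
RP="${RP:-npx regression-proof}"

# Hypothetical helper: render the PR body only when status reports success.
# Assumption: `status` exits non-zero when evidence is missing or stale.
prepare_pr_body() {
  local issue="$1" out="$2" code
  if $RP status; then
    $RP body --issue "$issue" --output "$out"
  else
    code=$?
    echo "status exited $code; fix the evidence before rendering" >&2
    return "$code"
  fi
}
```

Calling prepare_pr_body 123 pr-body.md uses the same flags as the Quick Start example.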

Example Output

## Regression Evidence

This PR includes red/green verification for the reported issue.

| Check | Result |
|---|---|
| Base branch | <code>origin/main</code> |
| Regression test failed before fix | Yes |
| Red failure matched | Yes |
| Regression test passed after fix | Yes |
| Lockfile changed | No |
| Dependencies changed | No |
| Changed files | 3 |
| Diff evidence stale | No |
| Diff review hint | Low |

## Commands

- Red failure must match: <code>expected token</code>
- Red: <code>pnpm test tests/parser.test.ts</code> - failed as expected
- Green: <code>pnpm test tests/parser.test.ts</code> - passed

## Notes

This evidence does not prove the implementation is perfect. It records the
verification steps used before opening the PR.

Fixes #123

What This Is Not

regression-proof does not generate code, review code, open pull requests, call AI APIs, call GitHub APIs, bypass human review, or decide whether code should merge. It records verification evidence so contributors can make bug-fix PRs easier to review.

Command Safety

regression-proof red, green, and verify-commits run the commands you pass to them in your local shell. Read commands before running them, especially when they come from an issue, comment, or external contributor.

Do not run commands you do not understand. Output redaction and bounded capture can reduce accidental storage of common secrets, but they are not a sandbox and cannot guarantee that every secret format is removed. Prefer the default tail capture or outputCapture: "none" for sensitive projects.

How Is This Different?

| Tool | Main user | Where it runs | Purpose |
|---|---|---|---|
| Danger JS | Maintainer/team | CI | Automate common PR review chores with project rules |
| reviewdog | Maintainer/team | Local or CI, often posting to code hosts | Report linter or code-analysis findings on diffs |
| regression-proof | External contributor or maintainer CI | Local machine or GitHub Actions | Record red/green regression evidence and render a PR summary |

The main difference is ownership. CI-based gates help maintainers enforce project policy after a PR exists. regression-proof starts as contributor-owned local evidence, and verify-commits or the GitHub Action can move the same workflow into maintainer-side CI when a project wants a stronger check.

GitHub Action

Maintainers can run commit-mode verification in CI without GitHub API calls from the CLI:

name: Regression Evidence

on:
  workflow_dispatch:
    inputs:
      red_ref:
        description: Commit that contains the regression test but not the fix
        required: true
        type: string

jobs:
  regression-evidence:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v6
        with:
          fetch-depth: 0

      - uses: sjh9714/regression-proof/.github/actions/[email protected]
        with:
          red-ref: ${{ inputs.red_ref }}
          green-ref: ${{ github.sha }}
          base: origin/main
          setup-command: pnpm install --frozen-lockfile
          setup-artifact-policy: ignored-only
          red-match: expected token
          red-not-match: Cannot find module
          command: pnpm test tests/parser.test.ts

red-ref must be a commit where the regression command already exists and fails for the intended bug-reproduction reason. Do not use the PR base SHA as red-ref for newly added regression tests unless the same command already exists at the base ref.

One common local workflow is to commit the regression test first, keep that ref as the red state, then apply the fix:

git switch -c fix/parser-regression
# Add the focused regression test only.
git add tests/parser.test.ts
git commit -m "test: reproduce parser bug"
git branch red-state

# Apply the production fix, then verify both commits.
git add src/parser.ts tests/parser.test.ts
git commit -m "fix: handle parser token edge case"
npx regression-proof verify-commits \
  --red-ref red-state \
  --green-ref HEAD \
  --setup-command "pnpm install --frozen-lockfile" \
  --setup-artifact-policy ignored-only \
  --red-match "expected token" \
  --red-not-match "Cannot find module" \
  "pnpm test tests/parser.test.ts"

See GitHub Action usage and Trust model for the difference between local PR-body evidence and CI-rerun evidence.

Project Command Policy

Maintainers can define approved evidence commands in .regression-proof-policy.json and ask contributors or CI to use --command-id instead of copying raw shell commands into every run:

{
  "version": 1,
  "commands": {
    "parser": {
      "description": "Parser regression evidence",
      "command": "pnpm test tests/parser.test.ts",
      "cwd": ".",
      "setupCommands": ["pnpm install --frozen-lockfile"],
      "redMatch": ["expected token"],
      "redNotMatch": ["Cannot find module"]
    }
  }
}

npx regression-proof red --command-id parser
npx regression-proof green --command-id parser
npx regression-proof verify-commits --command-id parser --red-ref red-state --green-ref HEAD

When --command-id is used, the policy owns the wrapped command, cwd, setup commands, and red failure matching. Passing raw command text or command-shaping overrides such as --cwd, --setup-command, --red-match, or --red-not-match with --command-id exits 2.
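The conflict rule above can be illustrated with a small pre-check. This is a sketch of the rule as described, not the CLI's actual argument parser. The check_command_id_args function is hypothetical and receives the arguments that follow the subcommand; it knows only the flags named in this README and ignores other flags.

```shell
#!/usr/bin/env bash
# Hypothetical pre-check mirroring the documented rule: --command-id cannot be
# combined with raw command text or command-shaping overrides.
# Returns 2 (the documented usage-error code) on conflict, 0 otherwise.
check_command_id_args() {
  local has_id=0 conflict=""
  while [ $# -gt 0 ]; do
    case "$1" in
      --command-id)
        has_id=1; shift ;;                      # drop flag; value dropped below
      --cwd|--setup-command|--red-match|--red-not-match)
        conflict="${conflict:-$1}"; shift ;;    # remember the first override seen
      --red-ref|--green-ref|--timeout-ms)
        shift ;;                                # allowed alongside --command-id
      --*)
        ;;                                      # other flags: ignored in this sketch
      *)
        conflict="${conflict:-raw command text}" ;;
    esac
    [ $# -gt 0 ] && shift                       # drop the flag value or positional
  done
  if [ "$has_id" -eq 1 ] && [ -n "$conflict" ]; then
    echo "cannot combine --command-id with $conflict" >&2
    return 2
  fi
  return 0
}
```

For instance, check_command_id_args --command-id parser --red-ref red-state --green-ref HEAD returns 0, while adding --red-match "expected token" returns 2.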

Red/Green Semantics

The red command expects the wrapped command to fail. If the wrapped command fails, regression-proof red exits successfully because the regression was reproduced. The green command expects the wrapped command to pass.

For stricter red evidence, pass --red-match <regex> and/or --red-not-match <regex>. The red command must still exit non-zero, and its bounded redacted output must match every required pattern and avoid every forbidden pattern. A pattern mismatch exits 1 and is rendered as failed evidence, not as Yes.

| Command | Wrapped command exit code | Expected | CLI exit |
|---|---:|---|---:|
| red | non-zero | failure | 0 |
| red | 0 | failure | 1 |
| green | 0 | success | 0 |
| green | non-zero | success | 1 |
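The red/green rules in this section can be sketched in shell. Both functions are hypothetical illustrations, not the CLI's code. The matching helper assumes POSIX extended regular expressions via grep -E, which may differ from the regex dialect the CLI uses, and it matches raw text rather than the bounded, redacted output the CLI stores.

```shell
#!/usr/bin/env bash
# Map a wrapped command's exit code to the documented CLI exit code.
# mode: "red" or "green"; wrapped: exit code of the wrapped command.
evidence_exit() {
  local mode="$1" wrapped="$2"
  case "$mode" in
    red)   [ "$wrapped" -ne 0 ] && echo 0 || echo 1 ;;
    green) [ "$wrapped" -eq 0 ] && echo 0 || echo 1 ;;
    *)     echo 2 ;;  # unknown mode: treated as a usage error
  esac
}

# Strict red matching: the output must match every required pattern and
# none of the forbidden ones. Patterns before "--" are required; patterns
# after it are forbidden.
red_output_ok() {
  local output="$1" forbidden=0 pattern
  shift
  for pattern in "$@"; do
    if [ "$pattern" = "--" ]; then
      forbidden=1
      continue
    fi
    if printf '%s\n' "$output" | grep -Eq -- "$pattern"; then
      [ "$forbidden" -eq 1 ] && return 1   # forbidden pattern appeared
    else
      [ "$forbidden" -eq 0 ] && return 1   # required pattern missing
    fi
  done
  return 0
}
```

With the patterns from the earlier examples, red_output_ok "$output" "expected token" -- "Cannot find module" succeeds only when the failure is the intended one rather than a missing-module error.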

Exit codes are consistent across commands:

| Code | Meaning |
|---:|---|
| 0 | Evidence command succeeded according to expected semantics |
| 1 | Evidence command ran but contradicted the red/green expectation |
| 2 | Usage or config error |
| 3 | Git or repository error |
| 4 | Command execution infrastructure or setup-command error |
| 5 | Timeout or interrupted command |
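Wrapper scripts or CI steps may want to turn these codes into readable messages. A minimal sketch follows; describe_exit_code is a hypothetical helper, and its wording paraphrases the table above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: translate a regression-proof exit code into a message.
describe_exit_code() {
  case "$1" in
    0) echo "evidence command succeeded" ;;
    1) echo "evidence contradicted the red/green expectation" ;;
    2) echo "usage or config error" ;;
    3) echo "git or repository error" ;;
    4) echo "command execution infrastructure or setup-command error" ;;
    5) echo "timeout or interrupted command" ;;
    *) echo "unknown exit code: $1" ;;
  esac
}

# Example (not run here):
#   npx regression-proof green "pnpm test tests/parser.test.ts"
#   describe_exit_code "$?"
```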

Privacy And Storage

No telemetry. No runtime network calls by default. Reports stay local unless you paste them into your PR.

Evidence is stored under .regression-proof/:

.regression-proof/
  config.json
  report.json
  logs/

Command output is not stored in full by default. The default report stores a short redacted tail, and init adds .regression-proof/ to .git/info/exclude so local evidence does not appear in your PR diff.

Wrapped commands time out after 10 minutes by default. Use --timeout-ms on red, green, or verify-commits for unusually slow test commands. Timed-out commands and child processes that exit due to a signal are recorded as inconclusive evidence and exit with code 5.
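Since the flag takes milliseconds, a small conversion helper can make longer timeouts easier to read. This is a sketch; minutes_to_ms is a hypothetical helper name, and the 25-minute value is only an example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: convert whole minutes to milliseconds for --timeout-ms.
minutes_to_ms() {
  echo $(( $1 * 60 * 1000 ))
}

# The documented 10-minute default corresponds to 600000 ms.
# Example (not run here):
#   npx regression-proof green --timeout-ms "$(minutes_to_ms 25)" "pnpm test tests/parser.test.ts"
```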

Commit-mode setup commands default to --setup-artifact-policy ignored-only: ignored cache/build artifacts are allowed, but tracked or unignored changes fail before evidence is recorded. Use no-changes for stricter runs, or allow-listed plus --allow-setup-change <pattern> for focused generated artifacts that should be recorded in the report.
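The idea behind ignored-only can be approximated with plain git. This is a sketch under stated assumptions, not the CLI's implementation: it assumes the working tree was clean before the setup command ran, and setup_leaves_only_ignored is a hypothetical helper name. It relies on git status --porcelain omitting ignored files by default.

```shell
#!/usr/bin/env bash
# Hypothetical check: succeed only if the working tree has no tracked or
# unignored changes. Ignored cache/build artifacts do not show up in
# `git status --porcelain`, so they are allowed.
setup_leaves_only_ignored() {
  local changes
  changes="$(git status --porcelain)" || return   # propagate git failures
  if [ -n "$changes" ]; then
    printf 'setup left tracked or unignored changes:\n%s\n' "$changes" >&2
    return 1
  fi
  return 0
}

# Example (not run here):
#   pnpm install --frozen-lockfile && setup_leaves_only_ignored
```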

JS/TS-First Review Hints

The wrapped command can be any shell command, so the red/green workflow can be used outside JavaScript and TypeScript projects. The built-in dependency and lockfile review hints are JS/TS-first: package.json, npm, pnpm, yarn, and bun lockfiles receive the most specific handling.

Commands

regression-proof init --base origin/main
regression-proof red --red-match "expected failure" "pnpm test path/to/test"
regression-proof red --command-id parser
regression-proof green "pnpm test path/to/test"
regression-proof diff
regression-proof status
regression-proof body --issue 123
regression-proof doctor
regression-proof clean --yes
regression-proof action-summary
regression-proof verify-commits "pnpm test path/to/test" --red-ref pre-fix --green-ref HEAD --setup-command "pnpm install --frozen-lockfile" --setup-artifact-policy ignored-only --red-match "expected failure"
regression-proof verify-commits --command-id parser --red-ref pre-fix --green-ref HEAD

See docs/commands.md for the full command and option reference. Schema and release details live in:

Limitations

This tool records evidence; it does not prove the implementation is correct.
Diff review hints are heuristic and can be wrong.
Generated-file, binary-file, lockfile, and dependency-change detection is JS/TS-first.
verify-commits only uses local refs; it does not fetch missing commits.
The CLI does not create PRs or integrate with GitHub APIs.
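Because verify-commits does not fetch missing commits, it can help to confirm that both refs resolve locally before running it. A minimal sketch using standard git commands follows; refs_available_locally is a hypothetical helper, not part of the CLI.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight: confirm each ref resolves to a commit in the
# local repository.
refs_available_locally() {
  local ref
  for ref in "$@"; do
    if ! git rev-parse --verify --quiet "${ref}^{commit}" >/dev/null; then
      echo "ref not found locally: $ref (fetch it first)" >&2
      return 1
    fi
  done
  return 0
}

# Example (not run here):
#   refs_available_locally red-state HEAD && \
#     npx regression-proof verify-commits --command-id parser --red-ref red-state --green-ref HEAD
```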

Development And Release

pnpm install --frozen-lockfile
pnpm typecheck
pnpm test
pnpm build
npm pack --dry-run

Publishing is manual. Before running npm publish, make sure npm whoami returns the intended publisher account and that npm publish --dry-run shows only the expected package files.

Roadmap

The trust roadmap is intentionally staged. The current v2 work writes structured report metadata while preserving v1 report read compatibility; future work should stay focused on adoption and maintainer-side dogfood rather than broad feature expansion.

See docs/trust-roadmap.md.
