regression-proof
v2.0.0
Local red/green evidence summaries for bug-fix pull requests.
regression-proof helps external contributors prepare a short verification
note before opening a PR. It records that a focused regression command failed
before the fix and passed after it, adds JS/TS-first review hints for the
diff, and renders Markdown that maintainers can scan quickly.

Why
Bug-fix PRs are easier to review when they answer a few concrete questions:
- Did the contributor reproduce the bug with a focused test or command?
- Did the same command pass after the fix?
- Did the diff stay small and reviewable?
- Were lockfile, dependency, generated-file, and binary-file review hints checked?
The local PR body is still contributor-provided evidence, not a guarantee. For
stronger maintainer-side verification, use verify-commits or the GitHub
Action so CI reruns the red/green command on local refs.
Quick Start
npm install -D regression-proof
Using npx regression-proof or pnpm dlx regression-proof downloads the
package through the npm registry. The CLI itself does not make network calls by
default while recording local evidence.
# Once per working repo
npx regression-proof init --base origin/main
# After adding a regression test, before applying the fix
npx regression-proof red --red-match "expected token" "pnpm test tests/parser.test.ts"
# After applying the fix
npx regression-proof green "pnpm test tests/parser.test.ts"
# Record review hints and check readiness
npx regression-proof diff
npx regression-proof status
# Generate Markdown for the PR body
npx regression-proof body --issue 123 --output pr-body.md
Run regression-proof status before regression-proof body. Missing, stale,
failed, inconclusive, or incompatible evidence is shown explicitly instead of
being rendered as ready evidence.
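The readiness rule above can be pictured as a small state check. This is a sketch under stated assumptions, not the tool's internals: the state names mirror the wording in this README, but the `EvidenceState` type, the `EvidenceSummary` shape, and both function names are hypothetical.

```typescript
// Hypothetical evidence states, named after the wording in this README.
type EvidenceState =
  | "ok"
  | "missing"
  | "stale"
  | "failed"
  | "inconclusive"
  | "incompatible";

// Hypothetical summary of the three recorded steps (not the report schema).
interface EvidenceSummary {
  red: EvidenceState;
  green: EvidenceState;
  diff: EvidenceState;
}

// Lists every step that is not "ok", so problems can be shown explicitly
// instead of being rendered as ready evidence.
function blockingSteps(summary: EvidenceSummary): string[] {
  const steps: Array<keyof EvidenceSummary> = ["red", "green", "diff"];
  return steps
    .filter((step) => summary[step] !== "ok")
    .map((step) => `${step}: ${summary[step]}`);
}

// Evidence counts as ready only when no step is blocking.
function isReadyForBody(summary: EvidenceSummary): boolean {
  return blockingSteps(summary).length === 0;
}
```

In this sketch a summary with a missing green step and a stale diff yields two blocking entries, which is the kind of explicit listing that status is described as producing.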
Example Output
## Regression Evidence
This PR includes red/green verification for the reported issue.
| Check | Result |
|---|---|
| Base branch | <code>origin/main</code> |
| Regression test failed before fix | Yes |
| Red failure matched | Yes |
| Regression test passed after fix | Yes |
| Lockfile changed | No |
| Dependencies changed | No |
| Changed files | 3 |
| Diff evidence stale | No |
| Diff review hint | Low |
## Commands
- Red failure must match: <code>expected token</code>
- Red: <code>pnpm test tests/parser.test.ts</code> - failed as expected
- Green: <code>pnpm test tests/parser.test.ts</code> - passed
## Notes
This evidence does not prove the implementation is perfect. It records the
verification steps used before opening the PR.
Fixes #123
What This Is Not
regression-proof does not generate code, review code, open pull requests, call
AI APIs, call GitHub APIs, bypass human review, or decide whether code should
merge. It records verification evidence so contributors can make bug-fix PRs
easier to review.
Command Safety
regression-proof red, green, and verify-commits execute the commands you
pass through your local shell. Read commands before running them, especially
when they come from an issue, comment, or external contributor.
Do not run commands you do not understand. Output redaction and bounded capture
can reduce accidental storage of common secrets, but they are not a sandbox and
cannot guarantee that every secret format is removed. Prefer the default tail
capture or outputCapture: "none" for sensitive projects.
How Is This Different?
| Tool | Main user | Where it runs | Purpose |
|---|---|---|---|
| Danger JS | Maintainer/team | CI | Automate common PR review chores with project rules |
| reviewdog | Maintainer/team | Local or CI, often posting to code hosts | Report linter or code-analysis findings on diffs |
| regression-proof | External contributor or maintainer CI | Local machine or GitHub Actions | Record red/green regression evidence and render a PR summary |
The main difference is ownership. CI-based gates help maintainers enforce
project policy after a PR exists. regression-proof starts as contributor-owned
local evidence, and verify-commits/the Action can move the same workflow into
maintainer-side CI when a project wants a stronger check.
GitHub Action
Maintainers can run commit-mode verification in CI without GitHub API calls from the CLI:
name: Regression Evidence
on:
  workflow_dispatch:
    inputs:
      red_ref:
        description: Commit that contains the regression test but not the fix
        required: true
        type: string
jobs:
  regression-evidence:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v6
        with:
          fetch-depth: 0
      - uses: sjh9714/regression-proof/.github/actions/[email protected]
        with:
          red-ref: ${{ inputs.red_ref }}
          green-ref: ${{ github.sha }}
          base: origin/main
          setup-command: pnpm install --frozen-lockfile
          setup-artifact-policy: ignored-only
          red-match: expected token
          red-not-match: Cannot find module
          command: pnpm test tests/parser.test.ts
red-ref must be a commit where the regression command already exists and
fails for the intended bug-reproduction reason. Do not use the PR base SHA as
red-ref for newly added regression tests unless the same command already
exists at the base ref.
One common local workflow is to commit the regression test first, keep that ref as the red state, then apply the fix:
git switch -c fix/parser-regression
# Add the focused regression test only.
git add tests/parser.test.ts
git commit -m "test: reproduce parser bug"
git branch red-state
# Apply the production fix, then verify both commits.
git add src/parser.ts tests/parser.test.ts
git commit -m "fix: handle parser token edge case"
npx regression-proof verify-commits \
  --red-ref red-state \
  --green-ref HEAD \
  --setup-command "pnpm install --frozen-lockfile" \
  --setup-artifact-policy ignored-only \
  --red-match "expected token" \
  --red-not-match "Cannot find module" \
  "pnpm test tests/parser.test.ts"
See GitHub Action usage and Trust model for the difference between local PR-body evidence and CI-rerun evidence.
Project Command Policy
Maintainers can define approved evidence commands in
.regression-proof-policy.json and ask contributors or CI to use
--command-id instead of copying raw shell commands into every run:
{
  "version": 1,
  "commands": {
    "parser": {
      "description": "Parser regression evidence",
      "command": "pnpm test tests/parser.test.ts",
      "cwd": ".",
      "setupCommands": ["pnpm install --frozen-lockfile"],
      "redMatch": ["expected token"],
      "redNotMatch": ["Cannot find module"]
    }
  }
}
npx regression-proof red --command-id parser
npx regression-proof green --command-id parser
npx regression-proof verify-commits --command-id parser --red-ref red-state --green-ref HEAD
When --command-id is used, the policy owns the wrapped command, cwd, setup
commands, and red failure matching. Passing raw command text or command-shaping
overrides such as --cwd, --setup-command, --red-match, or
--red-not-match with --command-id exits 2.
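To make the ownership rule concrete, here is a minimal sketch of how the conflict check could look. The `PolicyCommand` interface is inferred only from the fields in the example policy file; `CliInput`, `conflictingOverrides`, and `lookupPolicyCommand` are hypothetical names for illustration and are not taken from the tool's source.

```typescript
// Hypothetical shape of one policy entry, inferred from the example file.
interface PolicyCommand {
  description: string;
  command: string;
  cwd: string;
  setupCommands: string[];
  redMatch: string[];
  redNotMatch: string[];
}

// Hypothetical parsed CLI input; field names are illustrative only.
interface CliInput {
  commandId?: string;
  rawCommand?: string;
  cwd?: string;
  setupCommand?: string;
  redMatch?: string;
  redNotMatch?: string;
}

// Returns the inputs that conflict with --command-id. A non-empty result
// corresponds to the documented exit code 2 (usage or config error).
function conflictingOverrides(input: CliInput): string[] {
  if (input.commandId === undefined) return [];
  const conflicts: string[] = [];
  if (input.rawCommand !== undefined) conflicts.push("raw command");
  if (input.cwd !== undefined) conflicts.push("--cwd");
  if (input.setupCommand !== undefined) conflicts.push("--setup-command");
  if (input.redMatch !== undefined) conflicts.push("--red-match");
  if (input.redNotMatch !== undefined) conflicts.push("--red-not-match");
  return conflicts;
}

// Looks up an approved command by id; undefined means the id is not in
// the policy (how the real CLI reports that case is not covered here).
function lookupPolicyCommand(
  commands: Record<string, PolicyCommand>,
  id: string,
): PolicyCommand | undefined {
  return Object.prototype.hasOwnProperty.call(commands, id)
    ? commands[id]
    : undefined;
}
```

The point of the design is that a reviewer only needs to audit the policy file once, rather than re-reading shell text on every run.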
Red/Green Semantics
The red command expects the wrapped command to fail. If the wrapped command
fails, regression-proof red exits successfully because the regression was
reproduced. The green command expects the wrapped command to pass.
For stricter red evidence, pass --red-match <regex> and/or
--red-not-match <regex>. The red command must still exit non-zero, and its
bounded redacted output must match every required pattern and avoid every
forbidden pattern. A pattern mismatch exits 1 and is rendered as failed
evidence, not as Yes.
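The stricter red rule can be sketched as a single predicate. This is an illustration only: it assumes the patterns are applied as JavaScript regular expressions with no extra flags, and the function name `redEvidenceHolds` is hypothetical.

```typescript
// Assumption: each pattern is a JavaScript regular expression tested
// against the bounded, redacted output captured from the red command.
function redEvidenceHolds(
  wrappedExitCode: number,
  capturedOutput: string,
  redMatch: string[],
  redNotMatch: string[],
): boolean {
  // The wrapped command must still fail.
  const failed = wrappedExitCode !== 0;
  // Every required pattern must appear in the output.
  const matchesAllRequired = redMatch.every((pattern) =>
    new RegExp(pattern).test(capturedOutput),
  );
  // No forbidden pattern may appear in the output.
  const avoidsAllForbidden = redNotMatch.every(
    (pattern) => !new RegExp(pattern).test(capturedOutput),
  );
  return failed && matchesAllRequired && avoidsAllForbidden;
}
```

A forbidden pattern such as "Cannot find module" guards against a red run that fails for an unrelated reason, for example a missing dependency rather than the bug under test.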
| Command | Wrapped command exit code | Expected | CLI exit |
|---|---:|---|---:|
| red | non-zero | failure | 0 |
| red | 0 | failure | 1 |
| green | 0 | success | 0 |
| green | non-zero | success | 1 |
Exit codes are consistent across commands:
| Code | Meaning |
|---:|---|
| 0 | Evidence command succeeded according to expected semantics |
| 1 | Evidence command ran but contradicted the red/green expectation |
| 2 | Usage or config error |
| 3 | Git or repository error |
| 4 | Command execution infrastructure or setup-command error |
| 5 | Timeout or interrupted command |
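The red/green rows and the exit-code meanings above can be summarized in a few lines. The numeric values come from the tables; the constant and function names are hypothetical, and the sketch covers only the 0 and 1 outcomes.

```typescript
type Mode = "red" | "green";

// Exit codes as listed in the table above; the key names are illustrative.
const ExitCode = {
  Ok: 0,
  Contradicted: 1,
  Usage: 2,
  Git: 3,
  Infrastructure: 4,
  TimeoutOrInterrupted: 5,
} as const;

// Maps the wrapped command's exit code to the CLI exit code:
// red expects failure, green expects success.
function cliExitCode(mode: Mode, wrappedExitCode: number): 0 | 1 {
  const wrappedFailed = wrappedExitCode !== 0;
  const expectationMet = mode === "red" ? wrappedFailed : !wrappedFailed;
  return expectationMet ? ExitCode.Ok : ExitCode.Contradicted;
}
```

Returning 0 for a failing red command lets a script chain red, the fix, and green without special-casing the expected failure.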
Privacy And Storage
No telemetry. No runtime network calls by default. Reports stay local unless you paste them into your PR.
Evidence is stored under .regression-proof/:
.regression-proof/
  config.json
  report.json
  logs/
Command output is not stored in full by default. The default report stores a
short redacted tail, and init adds .regression-proof/ to .git/info/exclude
so local evidence does not appear in your PR diff.
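As a rough picture of what a "short redacted tail" means, the sketch below keeps only the last few lines and masks values that look like credentials. The pattern list, the placeholder text, and the name `redactedTail` are assumptions for illustration; the tool's actual redaction rules and tail length are not documented in this README.

```typescript
// Assumption: example patterns only; the real redaction rules may differ.
const SECRET_PATTERNS: RegExp[] = [
  /(token|secret|password|api[_-]?key)\s*[=:]\s*\S+/gi,
];

// Keeps the last `maxLines` lines of output and masks secret-like values.
function redactedTail(output: string, maxLines: number): string {
  const lines = output.split(/\r?\n/);
  const tail = lines.slice(Math.max(0, lines.length - maxLines));
  return tail
    .map((line) =>
      SECRET_PATTERNS.reduce(
        // Keep the key name, replace the value.
        (acc, pattern) => acc.replace(pattern, "$1=[REDACTED]"),
        line,
      ),
    )
    .join("\n");
}
```

Pattern-based masking of this kind only catches formats it knows about, so it reduces accidental storage rather than guaranteeing removal.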
Wrapped commands time out after 10 minutes by default. Use --timeout-ms on
red, green, or verify-commits for unusually slow test commands. Timed-out
commands and child processes that exit due to a signal are recorded as
inconclusive evidence and exit with code 5.
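The timeout and signal rule can be expressed as a small classifier. The 10-minute default comes from the text above; the `RunResult` shape, the `Outcome` names, and `classifyRun` are hypothetical and only illustrate the described behavior.

```typescript
// Default from the text above: 10 minutes, in milliseconds.
const DEFAULT_TIMEOUT_MS = 10 * 60 * 1000;

// Hypothetical result of running a wrapped command.
interface RunResult {
  exitCode: number | null; // null when the process did not exit normally
  signal: string | null; // e.g. "SIGTERM" when killed by a signal
  timedOut: boolean;
}

type Outcome = "passed" | "failed" | "inconclusive";

// Timeouts and signal exits say nothing about the bug itself, so they
// are classified as inconclusive rather than as red or green evidence.
function classifyRun(result: RunResult): Outcome {
  if (result.timedOut || result.signal !== null || result.exitCode === null) {
    return "inconclusive";
  }
  return result.exitCode === 0 ? "passed" : "failed";
}
```

Keeping inconclusive separate from failed prevents a killed or hung test run from being recorded as a reproduced regression.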
Commit-mode setup commands default to --setup-artifact-policy ignored-only:
ignored cache/build artifacts are allowed, but tracked or unignored changes
fail before evidence is recorded. Use no-changes for stricter runs, or
allow-listed plus --allow-setup-change <pattern> for focused generated
artifacts that should be recorded in the report.
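The three policies can be compared with a short sketch. Several details are assumptions and are marked as such in the comments: that no-changes rejects every change including ignored ones, that allow-listed still permits ignored artifacts, and that allow patterns behave like path prefixes. The type and function names are hypothetical.

```typescript
type SetupArtifactPolicy = "ignored-only" | "no-changes" | "allow-listed";

// Hypothetical description of one path changed by a setup command.
interface SetupChange {
  path: string;
  ignored: boolean; // true when git ignores the path
}

// Returns the changes that the policy does not allow. A non-empty
// result corresponds to failing before evidence is recorded.
function disallowedSetupChanges(
  changes: SetupChange[],
  policy: SetupArtifactPolicy,
  allowPatterns: string[] = [],
): SetupChange[] {
  switch (policy) {
    case "no-changes":
      // Assumption: any change at all is rejected.
      return changes;
    case "ignored-only":
      // Ignored cache/build artifacts are allowed; everything else fails.
      return changes.filter((change) => !change.ignored);
    case "allow-listed":
      // Assumption: patterns are path prefixes and ignored paths stay allowed.
      return changes.filter(
        (change) =>
          !change.ignored &&
          !allowPatterns.some((prefix) => change.path.startsWith(prefix)),
      );
  }
}
```

For example, a setup step that writes node_modules/ (ignored) and regenerates src/generated/api.ts (tracked) would pass under allow-listed with a src/generated/ pattern in this sketch, but fail under ignored-only.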
JS/TS-First Review Hints
The wrapped command can be any shell command, so the red/green workflow can be
used outside JavaScript and TypeScript projects. The built-in dependency and
lockfile review hints are JS/TS-first: package.json, npm, pnpm, yarn, and bun
lockfiles receive the most specific handling.
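A minimal sketch of what JS/TS-first classification might look like follows. The file names are the standard lockfile and manifest names for npm, pnpm, yarn, and bun; the `ReviewHint` categories and the function `classifyPath` are hypothetical and do not describe the tool's actual heuristics.

```typescript
// Hypothetical hint categories for a changed path.
type ReviewHint = "lockfile" | "dependency-manifest" | "other";

// Standard lockfile names for the package managers named above.
const LOCKFILES = new Set<string>([
  "package-lock.json",
  "npm-shrinkwrap.json",
  "pnpm-lock.yaml",
  "yarn.lock",
  "bun.lockb",
  "bun.lock",
]);

// Classifies a changed path by its file name, so nested workspaces
// (e.g. packages/a/package.json) are handled like root-level files.
function classifyPath(path: string): ReviewHint {
  const base = path.split("/").pop() ?? path;
  if (LOCKFILES.has(base)) return "lockfile";
  if (base === "package.json") return "dependency-manifest";
  return "other";
}
```

A lockfile from another ecosystem, such as Cargo.lock, would fall into "other" here, which matches the caveat that the hints are JS/TS-first.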
Commands
regression-proof init --base origin/main
regression-proof red --red-match "expected failure" "pnpm test path/to/test"
regression-proof red --command-id parser
regression-proof green "pnpm test path/to/test"
regression-proof diff
regression-proof status
regression-proof body --issue 123
regression-proof doctor
regression-proof clean --yes
regression-proof action-summary
regression-proof verify-commits "pnpm test path/to/test" --red-ref pre-fix --green-ref HEAD --setup-command "pnpm install --frozen-lockfile" --setup-artifact-policy ignored-only --red-match "expected failure"
regression-proof verify-commits --command-id parser --red-ref pre-fix --green-ref HEAD
See docs/commands.md for the full command and option reference. Schema and release details live in:
- Config schema
- Report schema
- Trust model
- Trust roadmap
- GitHub Action usage
- Privacy and security
- Security checklist
- Release runbook
- Launch kit
- Dogfood log
Limitations
- This tool records evidence; it does not prove the implementation is correct.
- Diff review hints are heuristic and can be wrong.
- Generated-file, binary-file, lockfile, and dependency-change detection are JS/TS-first.
- verify-commits only uses local refs; it does not fetch missing commits.
- The CLI does not create PRs or integrate with GitHub APIs.
Development And Release
pnpm install --frozen-lockfile
pnpm typecheck
pnpm test
pnpm build
npm pack --dry-run
Publishing is manual. Before running npm publish, make sure npm whoami
returns the intended publisher account and that npm publish --dry-run shows
only the expected package files.
Roadmap
The trust roadmap is intentionally staged. The current v2 work writes structured report metadata while preserving read compatibility with v1 reports. Future work should stay focused on adoption and maintainer-side dogfooding rather than broad feature expansion.
