secure-review

v1.0.2

Multi-model security review for AI-generated code. Runs OpenAI, Anthropic, and Google reviewers in parallel and posts findings as PR comments.

Downloads

159

Multi-model security review for AI-generated code. CLI and GitHub Action that runs several LLM reviewers (Anthropic, OpenAI, Google) and SAST tools (Semgrep, ESLint, npm audit) against your codebase. Findings are aggregated across reviewers — overlap becomes a confidence signal. Modes: scan (SAST only), review (multi-model report), fix (cross-model rotating loop applies fixes), pr (GitHub Action entrypoint for static review), estimate (preview cost without running), baseline (mark known/accepted findings), benchmark (compare writer models), compare (A/B path diff), reviewer-benchmark (single vs multi-model reviewer comparison).

Live target testing (deterministic attack, attack-ai, ZAP, Nuclei, browser-login hooks) lives in the companion package secure-review-runtime, which depends on this library for shared types and reporters. Install both if you need static review plus runtime probes.

npm install --save-dev secure-review        # https://www.npmjs.com/package/secure-review
npx secure-review init                      # interactive scaffold
npx secure-review review ./src              # report — no file changes
npx secure-review fix ./src                 # report + apply fixes via cross-model loop

For how the tool works under the hood, see WORKFLOW.md, which has the full per-mode pseudo-code (read it if you're evaluating the methodology, not just running the tool).

The design is grounded in recent LLM-security research showing that (1) SAST alone is nearly blind to AI-generated code, and (2) same-model self-review loops often regress. The tool operationalizes the cross-model-review pattern the industry uses informally.

Why this, not GitHub Copilot PR review?

| | GitHub Copilot code review | secure-review |
|---|---|---|
| Models | 1 (OpenAI via Copilot) | anthropic / openai / google (3 supported providers) |
| Security-specialized | No (general quality) | Yes (skill-configurable) |
| Agreement signal across models | No | Yes |
| SAST integrated with AI | No | Yes (Semgrep + ESLint + npm audit) |
| Provider-agnostic | No (Copilot only) | Yes |
| Empirical justification | Marketing | Grounded in LLM-security research (see below) |

Requirements

  • Node.js >= 20 (enforced by package.json engines field).
  • At least one provider configured for any AI-backed mode. API mode requires an API key for that provider (ANTHROPIC_API_KEY, OPENAI_API_KEY, or GOOGLE_API_KEY); CLI mode (ANTHROPIC_MODE=cli / GOOGLE_MODE=cli) needs the claude/gemini binary on PATH but no API key. scan and estimate work with no keys at all.
  • git is required for diff-based modes (--since, pr mode PR diff filtering, GitHub Action checkouts).
  • npm ci (used in the GitHub Action quick-start) requires a committed package-lock.json in the consumer repo. Use npm install instead if your repo doesn't ship one.
  • Outbound HTTPS to provider APIs for AI-backed modes — api.anthropic.com, api.openai.com, generativelanguage.googleapis.com. CI runners with egress restrictions need these allow-listed.
  • Semgrep (optional but recommended): install separately to enable the Semgrep SAST layer — pip install semgrep or brew install semgrep. Without it the Semgrep layer silently degrades to available: false and only ESLint + npm audit run.
  • ESLint v9+ flat config (optional): required in the target project for the ESLint layer to find any rules. Older .eslintrc.* configs are not picked up.
  • claude / gemini CLI binaries on PATH (optional): required only if you set ANTHROPIC_MODE=cli or GOOGLE_MODE=cli to route through a local subscription instead of the API. The default mode is api, which uses HTTP and does not need the CLIs.
  • gh CLI (optional): required for secure-review setup-secrets and for the GitHub Action quick-start. Not needed for local CLI use.
  • pr mode is GitHub-Actions-only: it requires GITHUB_EVENT_PATH, which the runner sets automatically, and GITHUB_TOKEN, which Actions creates for each run but which the workflow must pass to the step explicitly (the GitHub Action quick-start does this under env). Use review mode for local testing; running secure-review pr from your laptop will fail.

Quick start — CLI

npm install --save-dev secure-review
npx secure-review init        # interactive scaffold: .secure-review.yml + .env or .env.example
# if init created .env.example: cp .env.example .env
# edit .env — paste your API keys
npx secure-review review ./src

.env in the current directory is auto-loaded — no source .env needed.

init asks interactive prompts (yes/no flags for which providers and SAST tools to enable, provider/model choices for the writer, max iterations for the fix loop, and a 3-way choice for GitHub Action mode) and drops a working config + env file. Use --yes to skip the prompts and accept all defaults; that non-interactive path writes .env.example, so copy it to .env and fill in the keys before running an AI-backed mode.

Other CLI subcommands

Modes (9) — the actual review/fix/analysis pipelines:

| Command | Purpose |
|---|---|
| secure-review scan <path> | SAST only: no AI calls, no API keys needed |
| secure-review review <path> | Multi-model review, no file changes |
| secure-review fix <path> | Iterative review → write → re-review loop |
| secure-review estimate <path> [--mode review\|fix] | Print a pre-run cost estimate without invoking any model (see WORKFLOW.md for details) |
| secure-review baseline <findings.json> [--merge] [--reason ...] | Create or update a .secure-review-baseline.json of known/accepted findings to suppress in subsequent runs (see WORKFLOW.md for details) |
| secure-review benchmark <path> | Compare multiple writer models head-to-head on fix quality |
| secure-review compare <pathA> <pathB> | Side-by-side security diff of two codebases |
| secure-review reviewer-benchmark <path> | Show what each single model misses vs the combined multi-model ensemble |
| secure-review pr | GitHub Action entry point: static multi-model review with PR inline comments (see action.yml) |

Utilities (2) — setup helpers that don't run a review:

| Command | Purpose |
|---|---|
| secure-review init | Scaffold .secure-review.yml + .env or .env.example + optional GitHub Actions workflow |
| secure-review setup-secrets | Push API keys from local .env to GitHub Action secrets via gh CLI |

One key is enough. You don't need keys for all three providers — secure-review runs with as few as one reviewer, as long as the writer also uses an enabled provider. Disable any provider during init (or remove its entry from .secure-review.yml) and the tool simply doesn't instantiate that provider. This is useful if you only have an OpenAI key, or want to keep cost down to a single provider.
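
The one-key rule can be sketched as a small pre-flight check. This is an illustration only, not the tool's own validation: `providersWithKeys` and `checkConfig` are hypothetical helpers, and the sketch covers API mode only (it ignores ANTHROPIC_MODE=cli / GOOGLE_MODE=cli). The environment variable names are the ones listed under Requirements.

```typescript
type Provider = "anthropic" | "openai" | "google";

// Key names as documented in this README.
const KEY_FOR: Record<Provider, string> = {
  anthropic: "ANTHROPIC_API_KEY",
  openai: "OPENAI_API_KEY",
  google: "GOOGLE_API_KEY",
};

// Hypothetical helper: a provider counts as usable when its key is non-empty.
function providersWithKeys(env: Record<string, string | undefined>): Provider[] {
  return (Object.keys(KEY_FOR) as Provider[]).filter(
    (p) => (env[KEY_FOR[p]] ?? "").trim() !== "",
  );
}

// Hypothetical helper: returns a list of problems, empty when the config can run.
function checkConfig(
  writer: Provider,
  reviewers: Provider[],
  env: Record<string, string | undefined>,
): string[] {
  const usable = new Set(providersWithKeys(env));
  const problems: string[] = [];
  if (!usable.has(writer)) {
    problems.push(`writer provider "${writer}" has no API key`);
  }
  if (!reviewers.some((r) => usable.has(r))) {
    problems.push("no reviewer uses a provider with an API key");
  }
  return problems;
}

// Placeholder key value for illustration only.
const onlyOpenAi: Record<string, string | undefined> = { OPENAI_API_KEY: "sk-example" };
console.log(checkConfig("openai", ["openai"], onlyOpenAi));
console.log(checkConfig("anthropic", ["openai"], onlyOpenAi));
```

In a real script you would pass `process.env` instead of the literal object; the second call reports the writer problem because only the OpenAI key is present.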

Quick start — GitHub Action

# .github/workflows/secure-review.yml
name: Secure Review
on: pull_request
permissions:
  contents: read
  pull-requests: write
  checks: write
jobs:
  review:
    runs-on: ubuntu-latest
    if: github.event.pull_request.head.repo.fork == false
    steps:
      - uses: actions/checkout@v4
        with: { fetch-depth: 0 }
      - uses: actions/setup-node@v4
        with: { node-version: 20 }
      - run: npm ci
      - uses: fonCki/secure-review@v1
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
          OPENAI_API_KEY:    ${{ secrets.OPENAI_API_KEY }}
          GOOGLE_API_KEY:    ${{ secrets.GOOGLE_API_KEY }}
          GITHUB_TOKEN:      ${{ secrets.GITHUB_TOKEN }}

Open a PR — a single review is posted, with inline comments for findings that land on GitHub-commentable diff lines and summary text for changed-file findings outside those lines.

Setting GitHub Action secrets

You need to set the API keys as GitHub repo secrets so the action can authenticate with the providers. Two ways:

A) Automated (requires gh CLI installed and gh auth login done):

npx secure-review setup-secrets
# Reads keys from .env, sets one secret per enabled provider via `gh secret set`.
# Use --repo owner/name if not running inside a clone.

B) Manual (always works):

gh secret set ANTHROPIC_API_KEY    # paste when prompted
gh secret set OPENAI_API_KEY
gh secret set GOOGLE_API_KEY

Or via the web UI: https://github.com/<owner>/<repo>/settings/secrets/actions — click New repository secret for each key.

Only set secrets for providers you actually enabled. If you only use OpenAI, set just OPENAI_API_KEY. GITHUB_TOKEN is provided automatically by Actions; don't create a secret for it.

Config (.secure-review.yml)

writer:
  provider: anthropic
  model: claude-sonnet-4-6
  skill: skills/secure-node-writer.md

reviewers:
  - name: codex-web-sec
    provider: openai
    model: gpt-5-codex
    skill: skills/web-sec-reviewer.md
  - name: sonnet-owasp
    provider: anthropic
    model: claude-sonnet-4-6
    skill: skills/owasp-reviewer.md
  - name: gemini-dependencies
    provider: google
    model: gemini-2.5-pro
    skill: skills/dependency-reviewer.md

sast:
  enabled: true
  tools: [semgrep, eslint, npm_audit]
  inject_into_reviewer_context: true   # reviewers see SAST findings

review:
  parallel: true

fix:
  mode: sequential_rotation             # verifier = reviewers[i % len] each iteration
  max_iterations: 3
  final_verification: all_reviewers
  min_confidence_to_fix: 0             # only send findings with confidence >= this (0 = all)
  min_severity_to_fix: INFO            # only send findings at or above this severity (INFO = all)

# Optional: list additional writer models to benchmark head-to-head
writers:
  - provider: anthropic
    model: claude-sonnet-4-6
    skill: skills/secure-node-writer.md
  - provider: openai
    model: gpt-4o
    skill: skills/secure-node-writer.md

gates:
  block_on_new_critical: true
  block_on_new_high: false              # default — set true to also fail on new HIGH findings (see `pr` mode)
  max_cost_usd: 20
  max_wall_time_minutes: 15

# Optional `dynamic:` block — preserved for YAML compatibility; runtime probing lives in
# secure-review-runtime (https://github.com/sstaempfli/secure-review-runtime), not in core CLI modes.

Every reviewer is a {name, provider, model, skill} quad — name is required (it appears in reportedBy and PR comments to identify which reviewer flagged a finding). Skills are Markdown files defining the reviewer's role (web-sec pen-tester, OWASP auditor, supply-chain specialist, etc.). Write your own by copying skills/*.md.

Environment

ANTHROPIC_API_KEY=...
OPENAI_API_KEY=...
GOOGLE_API_KEY=...

# Local dev: use the provider's CLI binary instead of API (Claude Max / Gemini CLI subscription).
# GitHub Actions runners: must be api (factory refuses cli mode in runners).
ANTHROPIC_MODE=api        # api | cli
OPENAI_MODE=api           # api only
GOOGLE_MODE=api           # api | cli

# For `secure-review pr`
GITHUB_TOKEN=...

Modes

Each mode below is the friendly summary. For the full per-step pseudo-code, see WORKFLOW.md.

scan — SAST only

secure-review scan ./src

Runs Semgrep, then ESLint, then npm audit, and normalizes their output to the same Finding schema the AI reviewers use. No LLM calls, no API keys required. Cheapest pre-commit triage.

review — multi-model parallel one-shot

secure-review review ./src                 # full scan (asks before running once cost is shown)
secure-review review ./src --since main    # only files changed since `main`
secure-review review ./src --baseline none # ignore any local .secure-review-baseline.json
secure-review review ./src --yes           # skip the cost-estimate prompt

SAST runs first, then every reviewer (e.g. anthropic-haiku + openai-mini + gemini-flash) scans the same code with the SAST findings passed as prior context when enabled. Reviewers run in parallel by default; set review.parallel: false in .secure-review.yml to run them sequentially.

Findings are deduped by {file, line-bucket, cwe-or-title-prefix}: overlapping findings at the same location merge only when they share a CWE (or, when the CWE is missing, a 24-char title prefix). Two genuinely distinct vulnerabilities in the same 10-line bucket of the same file (e.g., a SQL injection at line 7 and a command injection at line 13) stay separate. When several models report the same CWE at the same location, the findings merge and reportedBy accumulates the reviewer names. Confidence per finding is min(1, |reportedBy| / 3), so a finding flagged by 2 of 3 reviewers is high-confidence (about 0.67). The report sorts findings by agreement count descending and highlights multi-model agreement with a badge.
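
The dedupe-and-confidence rule can be sketched as follows. This is a minimal illustration, not the tool's source: `RawFinding`, `dedupeKey`, and `aggregate` are hypothetical names, and the bucket boundary (`floor(line / 10)`) and the `|`-joined key format are assumptions. Only the key components and the `min(1, |reportedBy| / 3)` formula come from the text above.

```typescript
// Hypothetical shapes; field names loosely mirror the evidence JSON.
interface RawFinding {
  file: string;
  line: number;
  title: string;
  cwe: string | null;
  reporter: string;
}

interface MergedFinding {
  file: string;
  line: number;
  title: string;
  cwe: string | null;
  reportedBy: string[];
  confidence: number;
}

// Assumption: a bucket is floor(line / 10); the real boundaries may differ.
function dedupeKey(f: RawFinding): string {
  const bucket = Math.floor(f.line / 10);
  const discriminator = f.cwe ?? f.title.slice(0, 24);
  return `${f.file}|${bucket}|${discriminator}`;
}

function aggregate(findings: RawFinding[]): MergedFinding[] {
  const merged = new Map<string, MergedFinding>();
  for (const f of findings) {
    const existing = merged.get(dedupeKey(f));
    if (existing === undefined) {
      merged.set(dedupeKey(f), {
        file: f.file, line: f.line, title: f.title, cwe: f.cwe,
        reportedBy: [f.reporter], confidence: 0,
      });
    } else if (!existing.reportedBy.includes(f.reporter)) {
      existing.reportedBy.push(f.reporter); // same bug, another reviewer agrees
    }
  }
  const result = Array.from(merged.values());
  for (const m of result) {
    m.confidence = Math.min(1, m.reportedBy.length / 3);
  }
  // Most-agreed-upon findings first.
  return result.sort((a, b) => b.reportedBy.length - a.reportedBy.length);
}

const sample: RawFinding[] = [
  { file: "src/db/users.ts", line: 42, title: "SQL injection", cwe: "CWE-89", reporter: "codex-web-sec" },
  { file: "src/db/users.ts", line: 44, title: "Unsanitized input in query", cwe: "CWE-89", reporter: "sonnet-owasp" },
  { file: "src/db/users.ts", line: 47, title: "Command injection", cwe: "CWE-78", reporter: "sonnet-owasp" },
];
const mergedSample = aggregate(sample);
console.log(mergedSample.map((m) => `${m.cwe} reported by ${m.reportedBy.length}`));
```

In the sample, the two CWE-89 reports collapse into one finding with two reviewers, while the CWE-78 report in the same bucket stays separate.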

If a .secure-review-baseline.json is present in the scan root (or --baseline <path> is set), findings whose fingerprint matches an entry are excluded from the headline findings array (still recorded under baselineSuppressed for transparency). With --since <ref>, only files changed since that git ref are reviewed — useful on iterative PR workflows where the full tree hasn't changed.
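
Baseline suppression can be pictured as a partition over fingerprints. The sketch below is hedged: the evidence JSON names the algorithm as `sha256:file+line-bucket+cwe-or-title-prefix`, but the exact serialization (here a `|`-joined string with `floor(line / 10)` buckets) and the in-memory baseline shape (a set of hex digests) are assumptions, and `fingerprint` and `applyBaseline` are hypothetical names.

```typescript
import { createHash } from "node:crypto";

interface Finding {
  file: string;
  line: number;
  title: string;
  cwe: string | null;
}

// Assumption: components joined with "|" before hashing; bucket = floor(line / 10).
function fingerprint(f: Finding): string {
  const bucket = Math.floor(f.line / 10);
  const discriminator = f.cwe ?? f.title.slice(0, 24);
  return createHash("sha256")
    .update(`${f.file}|${bucket}|${discriminator}`)
    .digest("hex");
}

// Splits findings into the headline set and the suppressed set.
function applyBaseline(all: Finding[], accepted: Set<string>) {
  const findings: Finding[] = [];
  const baselineSuppressed: Finding[] = [];
  for (const f of all) {
    (accepted.has(fingerprint(f)) ? baselineSuppressed : findings).push(f);
  }
  return { findings, baselineSuppressed };
}

// A previously accepted weak-hash finding in a legacy file.
const accepted = new Set([
  fingerprint({ file: "src/legacy.ts", line: 10, title: "Weak hash", cwe: "CWE-328" }),
]);

const baselineResult = applyBaseline(
  [
    { file: "src/legacy.ts", line: 14, title: "MD5 used for checksum", cwe: "CWE-328" },
    { file: "src/db/users.ts", line: 42, title: "SQL injection", cwe: "CWE-89" },
  ],
  accepted,
);
console.log(baselineResult.findings.length, baselineResult.baselineSuppressed.length);
```

Because the fingerprint ignores the title when a CWE is present, the rephrased weak-hash finding a few lines away still matches the accepted entry and lands in `baselineSuppressed`.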

No file mutations. Output: reports/review-<timestamp>.{md,html,json}. The HTML report is a single self-contained file (inline CSS + vanilla JS, no external assets) with sortable/filterable findings, severity badges, agreement counts, and collapsible per-finding detail. Open it in any browser; works offline.

fix — cross-model rotating loop (0.5.0+ semantics)

secure-review fix ./src --max-iterations 3 --max-cost-usd 20
secure-review fix ./src --since main                  # only files changed since `main`
secure-review fix ./src --baseline ./baseline.json    # use a specific baseline file
secure-review fix ./src --yes --no-estimate           # CI-friendly: skip prompt + skip preview

The mode that actually fixes things. Three phases:

  1. Initial union scan — SAST runs first, then all reviewers run in parallel. The aggregated union becomes the writer's iter-1 to-do list (no reviewer's blind spots get a free pass).
  2. Iteration loop (rotating verifier per iter):
    • Step A: Writer applies fixes for the current findings list (iter 1: union; iter 2+: previous verifier's audit).
    • Step B: Next reviewer in rotation acts as the verifier and audits the writer's output with fresh eyes (different model = different blind spots).
    • Step C: Baseline filter + stable-ID annotation, then the audit becomes the next iteration's input.
    • The loop exits when any of these four conditions hits: (a) N consecutive verifiers all see clean (full rotation; N = number of configured reviewers), (b) max_iterations is reached (the for-loop ceiling — fix.max_iterations in the config, default 3), (c) a gate fires (block_on_new_critical, block_on_new_high, max_cost_usd, max_wall_time_minutes), or (d) divergence is detected.
  3. Final verification — by default, all reviewers in parallel re-scan the final state. Catches anything the per-iteration verifiers missed individually.

The writer is always the same model; the verifier rotates. This prevents the writer from drifting toward "code that satisfies one specific model" — every iteration a different judge shows up.
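
The rotation and three of the four exit conditions can be sketched with synchronous stand-ins for the model calls. This is a simplified illustration under stated assumptions: `runFixLoop`, `Verifier`, and the mock writer/verifiers are hypothetical, gates, rollback, and baseline filtering are omitted, and "divergence" is read as the finding count growing on two consecutive iterations.

```typescript
interface Finding { id: string }
interface Verifier { name: string; audit: (code: string) => Finding[] }
type ExitReason = "clean_rotation" | "max_iterations" | "divergence";

function runFixLoop(
  code: string,
  initialFindings: Finding[],
  write: (code: string, todo: Finding[]) => string,
  verifiers: Verifier[],
  maxIterations = 3,
): { iterations: number; exitReason: ExitReason; code: string } {
  let todo = initialFindings; // iter 1 works from the union scan
  let previousCount = initialFindings.length;
  let cleanStreak = 0;
  let growthStreak = 0;

  for (let i = 0; i < maxIterations; i++) {
    code = write(code, todo); // Step A: writer applies fixes
    const verifier = verifiers[i % verifiers.length]; // Step B: rotate the verifier
    if (verifier === undefined) throw new Error("at least one reviewer is required");
    todo = verifier.audit(code); // Step C: the audit feeds the next iteration

    cleanStreak = todo.length === 0 ? cleanStreak + 1 : 0;
    if (cleanStreak >= verifiers.length) {
      return { iterations: i + 1, exitReason: "clean_rotation", code };
    }
    growthStreak = todo.length > previousCount ? growthStreak + 1 : 0;
    if (growthStreak >= 2) {
      return { iterations: i + 1, exitReason: "divergence", code };
    }
    previousCount = todo.length;
  }
  return { iterations: maxIterations, exitReason: "max_iterations", code };
}

// Mock "code": a comma-separated list of outstanding bug ids.
const seesAll = (code: string): Finding[] =>
  code.split(",").filter((id) => id !== "").map((id) => ({ id }));
const fixAll = (code: string, todo: Finding[]): string => {
  const fixed = new Set(todo.map((f) => f.id));
  return code.split(",").filter((id) => id !== "" && !fixed.has(id)).join(",");
};

const loopResult = runFixLoop(
  "sqli,xss",
  [{ id: "sqli" }, { id: "xss" }],
  fixAll,
  [{ name: "codex-web-sec", audit: seesAll }, { name: "sonnet-owasp", audit: seesAll }],
);
console.log(loopResult.exitReason, loopResult.iterations);
```

With two mock verifiers and a writer that fixes everything on the first pass, the loop needs a second iteration so that both verifiers see a clean result before it exits.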

Safety controls:

  • Pre-run cost estimate — before any model call, print a token-cost projection per model (point + ±30% band) and (in interactive shells) prompt for confirmation. --yes skips the prompt; --no-estimate skips the preview entirely. In CI / non-TTY contexts the estimate is printed but the run proceeds without prompting (gates.max_cost_usd remains the budget contract). Standalone preview: secure-review estimate ./src --mode fix.
  • Run caps: gates.max_cost_usd and gates.max_wall_time_minutes stop runaway fix loops. For one-off longer runs, use secure-review fix ... --max-wall-time-minutes 60 instead of editing YAML.
  • Baseline / FP suppression — findings whose fingerprint matches .secure-review-baseline.json are filtered before the writer ever sees them and never appear in the remaining set, so the loop spends only on net-new issues.
  • Stable finding IDs across iterations — every finding is assigned a session-scoped S-NNN keyed on {file, line-bucket, cwe-or-title-prefix}. The same bug (same CWE) keeps the same ID even when the verifier rephrases the title, so the per-iteration resolved and introduced deltas in the report reflect actual writer effects, not relabeling.
  • Rollback — if the writer introduces a new CRITICAL finding and a gate fires, the loop rolls back to the pre-iteration snapshot before stopping. New files created by the writer are also removed.
  • Divergence detection — if total findings grow for 2 consecutive iterations, the loop stops to prevent regression spirals (the loop-divergence failure mode).
  • Filtering — configure min_confidence_to_fix and min_severity_to_fix in the config to limit what the writer attempts (e.g. only fix HIGH+ findings with ≥50% confidence).
  • Incremental mode: --since <ref> restricts the pipeline to tracked files changed since <ref> (via git diff --name-only --diff-filter=ACMR <ref>) plus untracked, non-gitignored files (via git ls-files --others --exclude-standard). Reviewers (and the writer in fix mode) only see that changed-files set directly. SAST tools scan the entire root, then findings outside the changed-files set are filtered out post-scan (Semgrep and ESLint don't reliably honor per-file include lists, so a scan + filter is the safe path).
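
The filtering control in the list above can be illustrated with a short sketch. `selectForWriter` is a hypothetical helper, and the severity ordering is an assumption inferred from the severity names in the evidence JSON (INFO lowest, CRITICAL highest); the defaults mirror `min_severity_to_fix: INFO` and `min_confidence_to_fix: 0` from the sample config.

```typescript
// Assumed ordering, lowest to highest.
const SEVERITY_ORDER = ["INFO", "LOW", "MEDIUM", "HIGH", "CRITICAL"] as const;
type Severity = (typeof SEVERITY_ORDER)[number];

interface Finding {
  id: string;
  severity: Severity;
  confidence: number; // 0..1, as produced by min(1, |reportedBy| / 3)
}

// Keeps only findings at or above both thresholds.
function selectForWriter(
  findings: Finding[],
  minSeverity: Severity = "INFO",
  minConfidence = 0,
): Finding[] {
  const floor = SEVERITY_ORDER.indexOf(minSeverity);
  return findings.filter(
    (f) => SEVERITY_ORDER.indexOf(f.severity) >= floor && f.confidence >= minConfidence,
  );
}

const candidates: Finding[] = [
  { id: "S-001", severity: "HIGH", confidence: 0.67 },
  { id: "S-002", severity: "MEDIUM", confidence: 1 },
  { id: "S-003", severity: "CRITICAL", confidence: 0.33 },
  { id: "S-004", severity: "HIGH", confidence: 0.33 },
];

// "Only fix HIGH+ findings with at least 50% confidence."
const toFix = selectForWriter(candidates, "HIGH", 0.5);
console.log(toFix.map((f) => f.id));
```

Note that a confidence floor also drops single-reviewer CRITICAL findings such as S-003 here, which is worth keeping in mind when choosing thresholds.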

Earlier versions (pre-0.5.0) used a different loop: each iteration's reviewer scanned alone, a single reviewer reporting zero findings ended the loop early, and the initial scan served only as a baseline metric rather than as the writer's to-do list. See CHANGELOG.md for the migration notes.

Output: reports/fix-<timestamp>.{md,html,json} plus the unified diff (fix-<timestamp>.patch) and modified source files. The HTML report adds a before/after delta block and a per-iteration timeline with the resolved/introduced split per iteration — useful for thesis presentations and code-review walk-throughs.

benchmark — compare writer models

secure-review benchmark ./src

Runs the initial full scan to get a baseline finding set, then for each writer model configured under writers: in the config: applies one round of fixes, re-scans with all reviewers, measures how many findings were resolved vs introduced, and restores files before running the next writer. Produces a markdown comparison table.

# .secure-review.yml — add a writers array to benchmark multiple models
writers:
  - provider: anthropic
    model: claude-sonnet-4-6
    skill: skills/secure-node-writer.md
  - provider: openai
    model: gpt-4o
    skill: skills/secure-node-writer.md

Output: reports/benchmark-<timestamp>.md

compare — A/B path security diff

secure-review compare ./v1 ./v2

Reviews two directories in parallel and produces a side-by-side report: findings unique to A, findings unique to B, findings common to both, and an overall delta (better / worse / same). Useful for comparing AI-generated vs human-written code, or before/after a refactor.
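
The report sections map onto plain set operations over finding fingerprints. The sketch below is an assumption-laden illustration: `compareFindings` is a hypothetical name, findings are reduced to opaque fingerprint strings, and the rule for the overall delta (B is "better" when it has fewer distinct findings than A) is a guess at the intent, not the tool's documented definition.

```typescript
interface Comparison {
  uniqueToA: string[];
  uniqueToB: string[];
  common: string[];
  delta: "better" | "worse" | "same";
}

function compareFindings(a: string[], b: string[]): Comparison {
  const setA = new Set(a);
  const setB = new Set(b);
  const uniqueToA = Array.from(setA).filter((fp) => !setB.has(fp));
  const uniqueToB = Array.from(setB).filter((fp) => !setA.has(fp));
  const common = Array.from(setA).filter((fp) => setB.has(fp));
  // Assumption: delta describes B relative to A by distinct-finding count.
  const delta = setB.size < setA.size ? "better" : setB.size > setA.size ? "worse" : "same";
  return { uniqueToA, uniqueToB, common, delta };
}

// Placeholder fingerprints for ./v1 and ./v2.
const comparison = compareFindings(
  ["fp-sqli", "fp-xss", "fp-weak-hash"],
  ["fp-xss", "fp-open-redirect"],
);
console.log(comparison);
```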

Output: reports/compare-<timestamp>.md

reviewer-benchmark — single vs combined multi-model

secure-review reviewer-benchmark ./src

Answers the question: what does each individual model miss that the ensemble catches? Runs each configured reviewer in isolation (+ SAST), then compares against the full multi-model aggregate. The report shows:

  • Per-model blind spot percentage (findings in combined that the solo model missed)
  • Unique contributions per model (findings only that model found)
  • Multi-model agreement breakdown on the combined finding set

This is the empirical justification for the multi-model design — no single model catches everything.
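
The first two report metrics can be computed from per-reviewer fingerprint lists. This is a sketch under assumptions: `blindSpots` is a hypothetical helper, findings are reduced to opaque fingerprint strings, and the blind spot percentage is taken to be missed findings divided by the size of the combined set.

```typescript
interface ReviewerStats {
  name: string;
  blindSpotPct: number;
  uniqueContributions: string[];
}

function blindSpots(perReviewer: Record<string, string[]>): ReviewerStats[] {
  const names = Object.keys(perReviewer);
  // Combined set: everything any reviewer found.
  const combined = new Set<string>();
  for (const name of names) {
    for (const fp of perReviewer[name] ?? []) combined.add(fp);
  }
  return names.map((name) => {
    const own = new Set(perReviewer[name] ?? []);
    const missed = Array.from(combined).filter((fp) => !own.has(fp));
    // Findings no other reviewer reported.
    const uniqueContributions = Array.from(own).filter((fp) =>
      names.every((other) => other === name || !(perReviewer[other] ?? []).includes(fp)),
    );
    const blindSpotPct = combined.size === 0 ? 0 : (missed.length / combined.size) * 100;
    return { name, blindSpotPct, uniqueContributions };
  });
}

// Placeholder fingerprints; reviewer names are taken from the sample config.
const stats = blindSpots({
  "codex-web-sec": ["fp-1", "fp-2"],
  "sonnet-owasp": ["fp-2", "fp-3"],
  "gemini-dependencies": ["fp-2"],
});
console.log(stats);
```

In this sample the combined set has three findings, so a reviewer that found only the shared one misses two of three.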

Output: reports/reviewer-benchmark-<timestamp>.md

pr — GitHub Action entrypoint

Runs review mode on the full checkout, then filters the aggregated findings against the PR diff before posting a single review. Findings are split into three buckets:

  • inline — finding on a changed line in a changed file → posted as inline comment
  • summary — finding in a changed file but on an unchanged line → mentioned in the review summary
  • dropped — finding in an untouched file → not posted
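
The three buckets amount to a two-step lookup. The sketch below is illustrative only: `classify` is a hypothetical name, and the `Map<string, Set<number>>` of commentable lines per changed file is an assumed input shape (the real reporter in src/reporters/github-pr.ts derives this from the PR diff).

```typescript
type Bucket = "inline" | "summary" | "dropped";

interface Located {
  file: string;
  line: number;
}

// changedLines: for each changed file, the set of lines that can take an inline comment.
function classify(finding: Located, changedLines: Map<string, Set<number>>): Bucket {
  const lines = changedLines.get(finding.file);
  if (lines === undefined) return "dropped"; // untouched file
  return lines.has(finding.line) ? "inline" : "summary";
}

const changed = new Map<string, Set<number>>([
  ["src/db/users.ts", new Set([40, 41, 42])],
]);

const buckets = [
  classify({ file: "src/db/users.ts", line: 42 }, changed),
  classify({ file: "src/db/users.ts", line: 90 }, changed),
  classify({ file: "src/util.ts", line: 3 }, changed),
];
console.log(buckets);
```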

Fork PRs are skipped by default (forks don't have secret access). Fails the check (exit 2) if block_on_new_critical is true and any CRITICAL finding lands on a changed file (not only commentable diff lines), or if block_on_new_high triggers on a HIGH, or if cost cap is exceeded — see evaluatePrGates in src/reporters/github-pr.ts.

Architecture

[Diagram: secure-review architecture. Entrypoints flow through modes and core modules to reports and GitHub PR review.]

For the per-mode runtime flow (sequence diagrams, state diagrams, full pseudo-code), see WORKFLOW.md.

Evidence JSON

review and fix modes emit a self-contained JSON evidence file with per-iteration counts and severity breakdowns — suitable for plotting, diffing across runs, or feeding into dashboards. Other modes emit markdown summaries or stdout output: benchmark, compare, and reviewer-benchmark write markdown to reports/, and scan prints a JSON summary to stdout (no report file).

{
  "task_id": "my-app",
  "tool": "secure-review",
  "tool_version": "0.5.2",
  "fingerprint_algorithm": "sha256:file+line-bucket+cwe-or-title-prefix",
  "condition": "F-fix",
  "run": 1,
  "timestamp": "2026-04-28T12:00:00.000Z",
  "generation_time_seconds": 184.7,
  "total_cost_usd": 0.42,
  "model_version": "claude-sonnet-4-6|gpt-5-codex+claude-sonnet-4-6+gemini-2.5-pro",
  "total_findings_initial": 12,
  "findings_by_severity_initial": { "CRITICAL": 1, "HIGH": 3, "MEDIUM": 5, "LOW": 2, "INFO": 1 },
  "total_findings_after_fix": 4,
  "findings_by_severity_after_fix": { "CRITICAL": 0, "HIGH": 1, "MEDIUM": 2, "LOW": 1, "INFO": 0 },
  "new_findings_introduced": 1,
  "findings_resolved": 9,
  "resolution_rate_pct": 75.0,
  "semgrep_after_fix": 0,
  "eslint_after_fix": 0,
  "lines_of_code_fixed": 0,
  "reviewers": ["codex-web-sec", "sonnet-owasp", "gemini-dependencies"],
  "iterations": 3,
  "review_status": "ok",
  "failed_reviewers": [],
  "findings": [
    {
      "id": "S-001",
      "severity": "HIGH",
      "title": "Unsanitized user input flows to SQL query",
      "file": "src/db/users.ts",
      "line": 42,
      "cwe": "CWE-89",
      "reportedBy": ["codex-web-sec", "sonnet-owasp"],
      "confidence": 0.67
    }
  ],
  "per_iteration": [
    { "iteration": 1, "reviewer": "codex-web-sec",      "findings_found": 8 },
    { "iteration": 2, "reviewer": "sonnet-owasp",       "findings_found": 5 },
    { "iteration": 3, "reviewer": "gemini-dependencies","findings_found": 4 }
  ]
}

The same schema is used by both review and fix modes. Review-only runs use condition: "F-review" and set the before/after finding counts to the same values because no fixes are applied.
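
When consuming the evidence file in a dashboard or script, a quick consistency check can catch truncated or hand-edited files. The sketch below is hedged: `EvidenceSummary` covers only a subset of the fields shown above, `checkEvidence` is a hypothetical helper, and the two relations it checks are ones that hold in the sample JSON, not guarantees documented by the tool.

```typescript
// Subset of the evidence JSON fields shown above.
interface EvidenceSummary {
  condition: string;
  total_findings_initial: number;
  total_findings_after_fix: number;
  new_findings_introduced: number;
  findings_resolved: number;
  resolution_rate_pct: number;
}

// Returns human-readable inconsistencies; an empty array means the numbers line up.
function checkEvidence(e: EvidenceSummary): string[] {
  const problems: string[] = [];
  const expectedAfter =
    e.total_findings_initial - e.findings_resolved + e.new_findings_introduced;
  if (expectedAfter !== e.total_findings_after_fix) {
    problems.push(`after-fix count is ${e.total_findings_after_fix}, expected ${expectedAfter}`);
  }
  if (e.total_findings_initial > 0) {
    const expectedRate = (e.findings_resolved / e.total_findings_initial) * 100;
    if (Math.abs(expectedRate - e.resolution_rate_pct) > 0.05) {
      problems.push(`resolution rate is ${e.resolution_rate_pct}, expected ${expectedRate.toFixed(1)}`);
    }
  }
  return problems;
}

// Values copied from the sample evidence file; normally this would come from JSON.parse.
const evidence: EvidenceSummary = {
  condition: "F-fix",
  total_findings_initial: 12,
  total_findings_after_fix: 4,
  new_findings_introduced: 1,
  findings_resolved: 9,
  resolution_rate_pct: 75.0,
};
console.log(checkEvidence(evidence));
```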

Runtime evidence JSON from secure-review-runtime uses condition: "F-attack" or "F-attack-ai"; see the repo for details.

Developing and verifying

Use this checklist when you change the tool or want confidence it behaves as documented in WORKFLOW.md.

Automated (no API keys required for most tests)

npm install
npm run typecheck          # TypeScript — catches broken imports/types
npm test                   # Vitest — CLI, schema, reporters, core modes (mocked LLMs)
npm run build              # library (dist/) — what `npx secure-review` runs via bin
npm run build:action       # GitHub Action bundle (dist-action/index.js) — commit when src/ changes affect the action

Most tests mock LLM adapters and spin short-lived HTTP servers on 127.0.0.1; they do not call external APIs unless you add separate live checks yourself.

Smoke checks (fast, local)

| Goal | Command |
|------|---------|
| CLI loads | npx secure-review --help |
| Config + SAST path only | npx secure-review scan ./src (prints JSON summary to stdout; no report files) |
| Cost math only | npx secure-review estimate ./src --mode review or --mode fix |
| Runtime probes (optional) | Use secure-review-runtime against a live app |

After editing TypeScript under src/, run npm run build before expecting npx secure-review to pick up those changes (unless you invoke node dist/cli.js from a fresh build).

End-to-end with real models

Requires provider keys in .env and consumes quota for AI-backed modes (review, fix, etc.). Inspect generated JSON under reports/ and cross-check behavior with the mode pseudo-code in WORKFLOW.md.

License

MIT © 2026 Alfonso Pedro Ridao, Shana Stampfli