@christianmorup/review-intent

v0.1.3

Published

2 hours ago

Render the diff between the current branch and main as an intent-annotated HTML review page with mermaid class & sequence diagrams.

Vulnerabilities: 0 High, 0 Medium, 0 Low

christianmorup

code-review diff git mermaid cli

review-intent

A CLI that renders the diff between your current branch and main as an intent-annotated HTML review page — opened in your browser — with mermaid class & sequence diagrams.

It exists to fix shallow PR review: a diff shows what changed but erases why. review-intent puts the agent's stated intent side-by-side with each change, so a reviewer adjudicates decisions instead of skimming lines.

How it works

The CLI is a pure renderer. It does two things:

Runs git diff <base>...HEAD itself (PR-style, merge-base diff).
Reads an agent-authored artifact (./.review/intent.json) for the intent prose and the mermaid diagram sources.

It joins them and emits one self-contained review.html, then opens it. No LLM call, no API key, no token cost per run. The agent that made the changes is responsible for writing the artifact.
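A minimal sketch of that two-input pipeline. The names `Deps`, `diffArgs`, and `gatherInput` are hypothetical and not taken from the package source; the git call and the file read are injected so the sketch stays free of side effects.

```typescript
// Hypothetical sketch of the renderer's input stage; names are illustrative only.
interface Deps {
  runGit: (args: string[]) => string;   // e.g. a wrapper around execFileSync("git", args)
  readFile: (path: string) => string;   // e.g. a wrapper around readFileSync(path, "utf8")
}

interface ReviewInput {
  diff: string;                                             // raw unified diff text
  artifact: { title: string; tldr: string; overall: string }; // subset of intent.json
}

// PR-style diff: the three-dot form compares HEAD against its merge base with <base>.
function diffArgs(base: string): string[] {
  return ["diff", `${base}...HEAD`];
}

// Collect both inputs; joining and HTML rendering would happen after this step.
function gatherInput(
  deps: Deps,
  base: string = "main",
  artifactPath: string = ".review/intent.json",
): ReviewInput {
  const diff = deps.runGit(diffArgs(base));
  const artifact = JSON.parse(deps.readFile(artifactPath));
  return { diff, artifact };
}
```

Injecting `runGit` and `readFile` is a design choice of this sketch: it lets the join be exercised with fake inputs and no repository on disk.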

Usage

Run it without installing:

npx @christianmorup/review-intent    # diff HEAD vs main, read ./.review/intent.json, open browser

Or install it globally:

npm install -g @christianmorup/review-intent
review-intent

From a clone (development):

npm install
npm run build
node dist/cli.js              # diff HEAD vs main, read ./.review/intent.json, open browser

Options:

| Flag | Default | Meaning |
|------|---------|---------|
| --base <ref> | main, else master | Base branch to diff against |
| --artifact <path> | .review/intent.json | Intent artifact location |
| --out <path> | OS temp file | Where to write the HTML |
| --no-open | (opens) | Write the file but don't launch the browser |
| --allow-gaps | (off) | Render a draft even if intent is incomplete; gaps render as red markers |

Blast-radius summary

The top of the page carries a blast-radius block — the part that earns the tool its keep — with three parts that deliberately pit claimed against measured:

Surface-area scorecard (measured, CLI-computed from the diff): files/hunks/±lines, test-vs-code file and line counts, net line delta, hunks-per-file concentration, new-file count, file-level reach fan-in, intent coverage (files and hunks annotated), diagram coverage, the single most-churned file, a count of debt/debug markers introduced (TODO/FIXME/console.log/debugger), a noise-file count (lockfiles, generated files, build output, binaries), a red badge when code changed but no tests did, sensitive-path flags (auth, *.bicep, ADO pipelines, app config, secrets/Key Vault, Dockerfile, dependency manifests), and a churn flag. Objective and un-gameable.
Risk ledger (claimed, agent-authored): assumption → if false → how you'd know. If absent, an honesty nudge appears instead of a blank.
Reach graph (measured, CLI-computed): a mermaid flowchart of repo files that import the changed files. Heuristic (import/require/from scan), labelled as such; bounded, and any truncation is shown, never silent.
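The import scan behind the reach graph could look roughly like the sketch below. It is an illustration only: the regex, the relative-path resolution, and the names `importSpecifiers`, `resolveRelative`, and `reach` are assumptions, not the CLI's actual implementation, and the bounding and truncation reporting are left out.

```typescript
// Heuristic specifier scan: matches  from "x",  import "x",  and  require("x").
function importSpecifiers(source: string): string[] {
  const re = /(?:\bfrom\s+|\bimport\s+|\brequire\s*\(\s*)["']([^"']+)["']/g;
  const out: string[] = [];
  let m: RegExpExecArray | null;
  while ((m = re.exec(source)) !== null) out.push(m[1]);
  return out;
}

// Resolve a relative specifier against the importing file's directory (posix paths only).
function resolveRelative(importer: string, spec: string): string | null {
  if (spec.charAt(0) !== ".") return null; // bare package names are not repo files
  const parts = importer.split("/").slice(0, -1);
  for (const seg of spec.split("/")) {
    if (seg === "..") parts.pop();
    else if (seg !== "." && seg !== "") parts.push(seg);
  }
  return parts.join("/");
}

// Drop a trailing file extension so "./util" and "src/util.ts" compare equal.
const stripExt = (p: string): string => p.replace(/\.[^./]+$/, "");

// Map each changed file to the repo files whose source appears to import it.
function reach(changed: string[], repo: Record<string, string>): Record<string, string[]> {
  const result: Record<string, string[]> = {};
  for (const c of changed) result[c] = [];
  for (const path of Object.keys(repo)) {
    for (const spec of importSpecifiers(repo[path])) {
      const target = resolveRelative(path, spec);
      if (target === null) continue;
      for (const c of changed) {
        const sameFile = stripExt(c) === stripExt(target);
        if (c !== path && sameFile && result[c].indexOf(path) === -1) result[c].push(path);
      }
    }
  }
  return result;
}
```

A text scan like this misses dynamic imports and path aliases, which is why a graph built this way is best presented as a heuristic.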

The scorecard sitting next to the ledger is the point: if the agent claims "low risk" while the scorecard flags "touches auth/" and "0 tests", the contradiction is visible at a glance.
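Two of the measured signals, the sensitive-path flags and the "code changed but no tests did" badge, can be sketched from nothing but the list of changed paths. The `SensitivePath` shape follows the `.review/config.json` example; the `isTestFile` heuristic and the function names are assumptions made for illustration, and noise files are not separated out here.

```typescript
interface SensitivePath { label: string; pattern: string } // regex source, matched on posix paths

// Assumed heuristic for spotting test files; the CLI's real rule may differ.
function isTestFile(path: string): boolean {
  return /(^|\/)(tests?|__tests__)\//.test(path) || /\.(test|spec)\.[a-z]+$/.test(path);
}

// Derive a small slice of the scorecard from the changed paths alone.
function scorecardFlags(paths: string[], sensitive: SensitivePath[]) {
  const testFiles = paths.filter(isTestFile);
  const codeFiles = paths.filter((p) => !isTestFile(p));
  const touched = sensitive
    .filter((s) => paths.some((p) => new RegExp(s.pattern).test(p)))
    .map((s) => s.label);
  return {
    testFiles: testFiles.length,
    codeFiles: codeFiles.length,
    noTestsBadge: codeFiles.length > 0 && testFiles.length === 0, // the red badge
    sensitive: touched,
  };
}
```

Because every value is computed from the diff, the agent writing the risk ledger has no way to influence it.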

Visual summary

Below the blast radius is a visual summary — five charts rendered as pure, self-contained inline SVG (no charting dependency, deterministic output):

Diff mass — diverging add/remove bars per file, sorted by churn, coloured by category (test/code/noise) with a green/red dot for intent present/missing.
Change treemap — rectangles sized by ± lines, coloured by top-level directory; files with no intent get a red outline.
Intent-coverage rings — donut gauges for the share of files and hunks that carry agent rationale (the completeness contract, visualized).
Reach ripple — the reach graph as concentric rings: changed files at the centre, importers rippling outward.
Change map — the signature view: one dot per changed file, placed by measured downstream reach (how many repo files import it) on the x-axis against measured churn (± lines) on the y-axis, sized by hunk count and coloured red when the file carries a complexity hotspot. Files in the shaded top-right corner are the largest and most depended-on — the review-first targets, picked out instead of being averaged into a single PR-wide score.

npm run sample builds and writes a representative sample-output.html you can open to see the whole page.

Code complexity (optional, via `lizard`)

If the lizard analyzer is installed (pip install lizard), the scorecard also reports measured cyclomatic complexity of the changed functions — max CCN, a count of hotspots at/above the threshold, and a "complexity hotspots" bar chart in the visual summary. lizard covers the whole Immeo stack (C#, TS/JS, Python) from source, no build required. It's a measured signal, so it sits on the same un-gameable side as the scorecard. If lizard isn't on PATH, the page says so rather than hiding the gap — nothing else changes.

Optional repo policy — `.review/config.json`

{
  "sensitivePaths": [ { "label": "pii", "pattern": "(^|/)pii" } ],  // regex on posix path; replaces defaults
  "churnFiles": 20,           // flag "large change set" above this many files
  "churnLines": 600,          // ...or this many ± lines
  "complexityThreshold": 15   // cyclomatic complexity at/above which a function is a hotspot
}

Absent → built-in defaults (tuned to the Immeo stack). It's repo policy, kept out of the per-change artifact so it can't be gamed per-PR.
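One way to read "absent means defaults" is a shallow merge, sketched below. The `DEFAULTS` values are placeholders copied from the example above, not the CLI's real built-ins, and `resolveConfig` is a hypothetical name. The sketch assumes the file is strict JSON, so the `//` comments in the example would have to be stripped first; whether the CLI itself tolerates comments is not stated here.

```typescript
interface ReviewConfig {
  sensitivePaths: { label: string; pattern: string }[];
  churnFiles: number;
  churnLines: number;
  complexityThreshold: number;
}

// Placeholder defaults for illustration; the tool's actual built-ins are not documented here.
const DEFAULTS: ReviewConfig = {
  sensitivePaths: [{ label: "auth", pattern: "(^|/)auth/" }],
  churnFiles: 20,
  churnLines: 600,
  complexityThreshold: 15,
};

// Shallow merge: a key present in the file wins outright, so a sensitivePaths
// entry replaces the default list instead of extending it.
function resolveConfig(fileText: string | null): ReviewConfig {
  if (fileText === null) return DEFAULTS;
  const overrides = JSON.parse(fileText) as Partial<ReviewConfig>;
  return { ...DEFAULTS, ...overrides };
}
```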

The artifact contract (`.review/intent.json`)

{
  "title": "Short change-set title",
  "tldr": "Five-second headline: what this does + the single most important why.",
  "overall": "Why this change set exists, what was rejected, what it rests on. (markdown)",
  "risks": [
    { "assumption": "Data is request-scoped", "ifFalse": "Cache leaks across users", "howYoudKnow": "Concurrent-session test" }
  ],
  "tests": [
    { "describes": "Cache returns null on a miss instead of throwing.", "name": "CacheMiss_ReturnsNull", "kind": "unit" }
  ],
  "diagrams": {
    "class": "classDiagram\n  ...",       // mermaid source, authored by the agent
    "sequence": "sequenceDiagram\n  ..."   // highlight changed steps with rect / Note
  },
  "files": [
    {
      "path": "src/foo.ts",
      "what": "What changed in this file.",
      "why": "Why — the decision behind it. (markdown)",
      "hunks": [
        { "anchor": 42, "what": "What this change does.", "why": "Why this specific change." }
      ]
    }
  ]
}

title, tldr, overall, and every file/hunk's what + why are required. diagrams, risks, tests, and hunks are optional. The tldr renders as a lede at the top (a five-second read); overall is the fuller story in a collapsible block beneath it.
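The contract can be written down as TypeScript types with a small validator. This is a sketch inferred from the JSON example and the required/optional rules above; the type names and `validateArtifact` are hypothetical, and treating a missing `files` array as empty is an assumption.

```typescript
interface HunkIntent { anchor: number; what: string; why: string }
interface FileIntent { path: string; what: string; why: string; hunks?: HunkIntent[] }
interface IntentArtifact {
  title: string;
  tldr: string;
  overall: string;
  risks?: { assumption: string; ifFalse: string; howYoudKnow: string }[];
  tests?: { describes: string; name?: string; kind?: string }[];
  diagrams?: { class?: string; sequence?: string };
  files?: FileIntent[];
}

// A field counts as blank when it is missing, not a string, or only whitespace.
const blank = (v: unknown): boolean => typeof v !== "string" || v.trim() === "";

// Return one message per violated "required" rule; an empty array means the artifact is well-formed.
function validateArtifact(a: Partial<IntentArtifact>): string[] {
  const errors: string[] = [];
  for (const key of ["title", "tldr", "overall"] as const) {
    if (blank(a[key])) errors.push(`${key} is required`);
  }
  for (const f of a.files ?? []) {
    if (blank(f.what) || blank(f.why)) errors.push(`${f.path}: what + why are required`);
    for (const h of f.hunks ?? []) {
      if (blank(h.what) || blank(h.why)) errors.push(`${f.path}@${h.anchor}: what + why are required`);
    }
  }
  return errors;
}
```

This checks only the shape of the artifact; whether every changed file and hunk in the diff is covered is a separate question.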

Tests section (claimed, agent-authored)

tests is an optional list of test cases described in plain language — the point is a reviewer reading "Cache returns null on a miss" instead of decoding a name like CacheMiss_ReturnsNull. Each entry needs a describes sentence; name (the real test identifier, for cross-reference) and kind (unit, integration, e2e, manual, or anything else) are optional. Known kinds get a coloured tag and group the list. It renders as a standalone Tests section between the visual summary and the diagrams. It is pure display — review-intent never parses or runs your tests — so it sits on the claimed side, like the risk ledger.

Completeness is enforced

The original pain point was agents leaving intent blank. So the contract has teeth: every changed file needs a what + why, and every diff hunk needs an intent. review-intent runs a completeness gate and refuses to render if anything is missing, printing the exact files and hunks that lack rationale:

Intent is incomplete — 2 gap(s) found:
  - src/util.js: no what/why written for this changed file
  - src/util.js: hunk @@ -1 +1 @@ has no intent

--allow-gaps renders a work-in-progress draft anyway, with each gap shown as a red marker in place — so even a draft can't hide an empty spot. what is a cheap one-line description; why is the decision and must not just restate the what.
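The gate logic can be sketched as follows, assuming the diff has already been joined with the artifact so that each file and hunk knows whether it carries intent. The `JoinedFile` and `JoinedHunk` shapes and the function names are hypothetical; only the per-gap wording is taken from the sample output above.

```typescript
interface JoinedHunk { header: string; hasIntent: boolean } // header like "@@ -1 +1 @@"
interface JoinedFile { path: string; hasIntent: boolean; hunks: JoinedHunk[] }

// List every gap, in the same per-line wording as the CLI's sample output.
function findGaps(files: JoinedFile[]): string[] {
  const gaps: string[] = [];
  for (const f of files) {
    if (!f.hasIntent) gaps.push(`${f.path}: no what/why written for this changed file`);
    for (const h of f.hunks) {
      if (!h.hasIntent) gaps.push(`${f.path}: hunk ${h.header} has no intent`);
    }
  }
  return gaps;
}

// Refuse to render when gaps exist, unless the caller opted into a draft.
// The gaps are returned either way so a draft can still mark them in place.
function gate(files: JoinedFile[], allowGaps: boolean): { render: boolean; gaps: string[] } {
  const gaps = findGaps(files);
  return { render: gaps.length === 0 || allowGaps, gaps };
}
```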

How per-hunk intent is matched

anchor is a line number in the new version of the file. The CLI attaches the note to whichever diff hunk's new-line range contains that anchor. This is robust to minor hunk-boundary shifts (unlike matching by hunk ordinal).
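A sketch of that range check, using the standard unified-diff hunk header to recover each hunk's new-line range. The function and type names are hypothetical, and letting a pure-deletion hunk (new length 0) claim its start line is an assumption of this sketch, not documented behaviour.

```typescript
interface Hunk { newStart: number; newLines: number }
interface Note { anchor: number; what: string; why: string }

// Parse the new-file range from a header such as "@@ -10,4 +12,6 @@".
// In unified diff format an omitted count means 1.
function parseHunkHeader(header: string): Hunk | null {
  const m = /^@@ -\d+(?:,\d+)? \+(\d+)(?:,(\d+))? @@/.exec(header);
  if (!m) return null;
  return { newStart: Number(m[1]), newLines: m[2] === undefined ? 1 : Number(m[2]) };
}

// Attach each note to the first hunk whose new-line range contains its anchor.
// Notes that fit no hunk are returned separately so they can be surfaced.
function matchNotes(hunks: Hunk[], notes: Note[]) {
  const matched = new Map<number, Note[]>(); // hunk index -> notes
  const unmatched: Note[] = [];
  for (const note of notes) {
    const i = hunks.findIndex(
      // Assumption: a zero-length range still covers its start line.
      (h) => note.anchor >= h.newStart && note.anchor < h.newStart + Math.max(h.newLines, 1),
    );
    if (i === -1) unmatched.push(note);
    else matched.set(i, [...(matched.get(i) ?? []), note]);
  }
  return { matched, unmatched };
}
```

Matching by range rather than by position in the hunk list means an extra hunk appearing earlier in the file does not shift every later note onto the wrong change.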

Notes that match no hunk are surfaced under "Notes not matched to a hunk" — never silently dropped. Artifact entries for files absent from the diff appear under "Intent for files not in this diff". (Visibility over silent truncation, by design.)

Claude Code integration: authoring the artifact

Nothing generates intent.json — that's the change-making agent's job, and whether the intent is genuine reasoning or post-hoc rationalization is the whole ballgame. review-intent ships a Claude Code skill that teaches the agent to author the artifact honestly (real rejected alternatives, stated assumptions, incidental changes marked as incidental) when it finishes a change set, then render it.

review-intent skill install            # ~/.claude/skills/review-intent-authoring (all repos)
review-intent skill install --local    # ./.claude/skills/review-intent-authoring (this repo only)
review-intent skill uninstall          # remove user-scoped skill
review-intent skill uninstall --local  # remove repo-scoped skill

Once the artifact is written the skill tells the agent to run review-intent and open the review — not to ask first, since invoking the skill already means a review is wanted. Add --force to overwrite or remove a hand-edited skill file. User and --local scopes are independent.

The honesty contract is the point: a fluent rationalization is worse than nothing because it lowers the reviewer's guard while adding no signal. The skill pushes for "why I chose this over X" and "what this rests on" — and explicitly tells the agent to admit gaps rather than invent thoroughness.

Development

npm test          # vitest, pure-module unit tests
npm run test:watch

Modules are deliberately small and single-purpose: git.ts (diff), artifact.ts (load + validate), diff-parser.ts (parse), match.ts (pure join), render.ts (pure HTML), cli.ts (thin orchestrator).
