presto-review

v0.3.0

Published

a month ago

AI-powered PR code review bot for Azure DevOps — posts inline findings via Claude and blocks merge on HIGH severity issues


bigtp25

code-review pull-request azure-devops ado claude anthropic ai ci pipeline pr-review

Azure DevOps AI PR Review Bot

A self-contained Node script + ADO pipeline that runs Claude over every pull request and posts findings as a PR comment thread. When the AI flags a 🔴 HIGH severity issue, the comment is left active and (combined with a standard branch policy) blocks merge until a human resolves it.

Built for solo / small-team workflows where you want a second pair of eyes on every PR without paying for a managed service.

What it actually does

1. PR opens against a long-lived branch (e.g. main)
2. Pipeline triggers via Azure DevOps branch policy
3. Pipeline fetches the diff (git diff origin/<target>...HEAD)
4. Calls the Anthropic API with the diff + a system prompt you control
5. Claude returns a structured review with severity-labeled findings
6. Pipeline posts the review as a PR comment thread via the ADO REST API
   - 🔴 HIGH found → thread posted as Active → branch policy blocks merge
   - Otherwise → thread posted as Closed → visible-but-non-blocking
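
The final posting step can be sketched roughly as below. This is an illustration under stated assumptions, not the code in pr-review.mjs: the helper names (buildThreadPayload, threadsUrl, postThread) are hypothetical, and the environment variables are the ones Azure Pipelines normally exposes on PR validation builds (SYSTEM_ACCESSTOKEN has to be mapped into the step explicitly). In the ADO pull request threads API, status 1 means Active and 4 means Closed.

```javascript
// Hypothetical helper: request body for the ADO "create thread" call.
// status 1 = active (counts against the comment-resolution policy), 4 = closed.
function buildThreadPayload(reviewMarkdown, hasHigh) {
  return {
    comments: [{ parentCommentId: 0, content: reviewMarkdown, commentType: 1 }],
    status: hasHigh ? 1 : 4,
  };
}

// Hypothetical helper: derive the threads endpoint from pipeline variables.
function threadsUrl(env) {
  const base = env.SYSTEM_COLLECTIONURI.replace(/\/+$/, '');
  const project = encodeURIComponent(env.SYSTEM_TEAMPROJECT);
  return `${base}/${project}/_apis/git/repositories/${env.BUILD_REPOSITORY_ID}` +
    `/pullRequests/${env.SYSTEM_PULLREQUEST_PULLREQUESTID}/threads?api-version=7.1`;
}

// Posts the review; resolves to true when ADO accepted the thread.
async function postThread(env, reviewMarkdown, hasHigh) {
  const res = await fetch(threadsUrl(env), {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: `Bearer ${env.SYSTEM_ACCESSTOKEN}`,
    },
    body: JSON.stringify(buildThreadPayload(reviewMarkdown, hasHigh)),
  });
  return res.ok;
}
```

Keeping the payload and URL builders separate from the network call makes the Active/Closed decision easy to check without touching a real PR.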

Cost: roughly $0.02–$0.30 per PR depending on diff size. Always exits 0 (advisory infrastructure must not break your build).
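
One way to get the "never break the build" behavior is to wrap the Anthropic call so that every failure collapses to "no review". The sketch below assumes the public Messages API endpoint and headers; the function names, the default model alias, and the max_tokens value are illustrative assumptions rather than what pr-review.mjs actually uses.

```javascript
// Build the Messages API request. The default model alias and max_tokens
// are assumptions for illustration; ANTHROPIC_MODEL overrides the model.
function buildAnthropicRequest(env, systemPrompt, diff) {
  return {
    url: 'https://api.anthropic.com/v1/messages',
    init: {
      method: 'POST',
      headers: {
        'content-type': 'application/json',
        'x-api-key': env.ANTHROPIC_API_KEY,
        'anthropic-version': '2023-06-01',
      },
      body: JSON.stringify({
        model: env.ANTHROPIC_MODEL || 'claude-sonnet-4-5',
        max_tokens: 2048,
        system: systemPrompt,
        messages: [{ role: 'user', content: diff }],
      }),
    },
  };
}

// Returns the review text, or null on any failure. Never throws.
async function reviewOrNull(env, systemPrompt, diff, fetchImpl = fetch) {
  try {
    const { url, init } = buildAnthropicRequest(env, systemPrompt, diff);
    const res = await fetchImpl(url, init);
    if (!res.ok) return null;
    const data = await res.json();
    return data.content
      .filter((block) => block.type === 'text')
      .map((block) => block.text)
      .join('\n');
  } catch (err) {
    console.error(`review skipped: ${err.message}`);
    return null;
  }
}
```

A caller would treat null as "post nothing" and still finish with exit code 0, so an API outage shows up as a missing comment instead of a red build.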

Why this vs. a managed service

| | This tool | CodeRabbit / Greptile / Diamond |
|---|---|---|
| Cost | API costs only (~$0.02–0.30/PR) | $20–50/dev/mo subscriptions |
| Customization | Edit a markdown file | Limited to vendor's knobs |
| Vendor lock-in | None | Yes |
| Setup time | ~30 min first time | ~5 min |
| Prompt tailored to your codebase | Yes (you write it) | Generic |
| Polish | DIY | Professional |
| You own the data flow | Yes | No |

Pick this if: you want full control, low cost, and don't mind editing a markdown file to teach the AI about your codebase.

Pick a managed service if: you want polished UX with zero config, you have budget, and your codebase is conventional enough that a generic reviewer works well.

Quick start (5 minutes once prerequisites are met)

Prerequisites: Azure DevOps project + repo, an Anthropic API key, project admin permissions.

1. Copy 3 files into your repo:
   - pr-review.mjs → tools/pr-review.mjs (or wherever you like)
   - system-prompt.md → tools/system-prompt.md (edit for your project)
   - pipeline-template.yml → pipelines/pr-review.yml (search for TODO: markers and replace)
2. In Azure DevOps:
   - Add ANTHROPIC_API_KEY to a variable group
   - Grant Contribute to pull requests to the build service account
   - Register the pipeline (Pipelines → New pipeline → Existing YAML)
   - On your target branch, enable the Check for comment resolution policy and add this pipeline as Build Validation (optional)
3. Open a test PR. Within ~60 seconds you should see a Claude review comment appear. If it flagged HIGH, the merge button will be greyed out until you resolve the thread.
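
If the test PR produces no comment, a missing or unmapped variable is a likely cause. The preflight below is a hypothetical helper, not part of the shipped script. It assumes the script needs ANTHROPIC_API_KEY plus the ADO-provided SYSTEM_ACCESSTOKEN and SYSTEM_PULLREQUEST_PULLREQUESTID, and it relies on the fact that Azure Pipelines leaves an undefined macro such as $(ANTHROPIC_API_KEY) in place as literal text.

```javascript
// Assumed set of variables the review step needs; adjust to your pipeline.
const REQUIRED_ENV = [
  'ANTHROPIC_API_KEY',
  'SYSTEM_ACCESSTOKEN',
  'SYSTEM_PULLREQUEST_PULLREQUESTID',
];

// Returns the names that are unset, blank, or still an unexpanded $(macro).
function missingEnv(env, required = REQUIRED_ENV) {
  return required.filter((name) => {
    const value = env[name];
    if (value === undefined || String(value).trim() === '') return true;
    return /^\$\(.+\)$/.test(String(value).trim());
  });
}

// Example: log the problem and carry on, in keeping with the always-exit-0 rule.
function reportMissing(env) {
  const missing = missingEnv(env);
  if (missing.length > 0) {
    console.error(`PR review skipped, missing variables: ${missing.join(', ')}`);
  }
  return missing.length === 0;
}
```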

Full step-by-step instructions at screenshot-level detail: docs/01-setup-guide.md.

Repository structure

ado-ai-pr-review/
├── README.md                          ← you are here
├── LICENSE                            ← MIT
├── pr-review.mjs                      ← the script (drop into your repo)
├── system-prompt.md                   ← default prompt (copy + edit per-project)
├── pipeline-template.yml              ← ADO pipeline (copy + edit per-project)
├── docs/
│   ├── 01-setup-guide.md              ← step-by-step setup for a new project
│   ├── 02-how-it-works.md             ← architecture deep dive + design rationale
│   ├── 03-customizing-the-prompt.md   ← how to teach the AI about your codebase
│   ├── 04-severity-and-blocking.md    ← how merge blocking works
│   └── 05-troubleshooting.md          ← common issues + fixes (battle-tested)
└── prompt-templates/                  ← starter prompts for common stacks
    ├── README.md
    ├── nextjs-typescript-monorepo.md
    ├── python-fastapi.md
    ├── go.md
    └── generic.md

What's in the script that's worth knowing

The script is ~200 lines, single file, no dependencies beyond Node 20's built-in fetch. The non-obvious behaviors:

Reads the diff from stdin — the pipeline does git diff > /tmp/pr.diff && node pr-review.mjs < /tmp/pr.diff. Keeps the script unaware of git mechanics.
System prompt loaded from file — the script reads system-prompt.md from its own directory (or $SYSTEM_PROMPT_PATH). Lets you tailor per-project without touching code.
Diff truncated at 100K chars (configurable via MAX_DIFF_CHARS). Bigger PRs get the first ~25K tokens reviewed; the AI is told it was truncated.
Always exits 0 — if Anthropic is down, your build doesn't break. The merge-block mechanism is the Active thread status, not a build failure.
HIGH-detection is a configurable regex (HIGH_MARKER env var, default 🔴\s*HIGH). If you change the severity emoji convention in your prompt, change this too.
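
Two of the behaviors above, truncation and HIGH detection, can be sketched in a few lines. This is a minimal illustration assuming the documented defaults (100K characters, 🔴\s*HIGH); the function names and the wording of the truncation notice are hypothetical and may differ from the actual script.

```javascript
// Cut the diff at the configured limit and tell the model it was cut.
function truncateDiff(diff, env = {}) {
  const maxChars = Number(env.MAX_DIFF_CHARS) || 100000;
  if (diff.length <= maxChars) return { text: diff, truncated: false };
  const notice =
    `\n\n[diff truncated: first ${maxChars} of ${diff.length} characters shown]`;
  return { text: diff.slice(0, maxChars) + notice, truncated: true };
}

// Build the HIGH detector from HIGH_MARKER, falling back to the default pattern.
function highMarker(env = {}) {
  return new RegExp(env.HIGH_MARKER || '🔴\\s*HIGH', 'u');
}

// True when the review text contains at least one HIGH-severity marker.
function hasHighFinding(reviewText, env = {}) {
  return highMarker(env).test(reviewText);
}
```

Because the detector only pattern-matches the review text, the prompt and HIGH_MARKER have to agree on the severity convention: a prompt that asks for "[H]" labels needs something like HIGH_MARKER='\[H\]'.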

Costs

Sonnet 4.5 (default) at typical PR sizes:

~2K input + 500 output tokens = ~$0.02 / PR
Larger PRs (50K diff) closer to $0.20-0.30
Diffs above 100K chars get truncated (no further cost growth)
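
To sanity-check these figures against your own PR sizes, a back-of-the-envelope estimator helps. The per-token prices below are assumptions (USD per million tokens, Sonnet-class list pricing at the time of writing) and the 4-characters-per-token ratio is a rough rule of thumb, so check both against the current price list before relying on the output.

```javascript
// Assumed list prices in USD per million tokens; verify before use.
const PRICE_PER_MTOK = { input: 3, output: 15 };

// Cost of one review given token counts.
function estimateCostUsd(inputTokens, outputTokens, price = PRICE_PER_MTOK) {
  return (inputTokens * price.input + outputTokens * price.output) / 1e6;
}

// Rough token count for a diff, assuming ~4 characters per token.
function approxTokens(chars) {
  return Math.ceil(chars / 4);
}

// Typical PR from the list above: ~2K input + 500 output tokens.
console.log(estimateCostUsd(2000, 500)); // 0.0135
```

Under these assumed prices the typical case comes out a little under the ~$0.02 quoted above, which is the same ballpark once system-prompt tokens are included.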

Set ANTHROPIC_MODEL=claude-haiku-4-5-20251001 in the pipeline env to drop to Haiku for roughly 3× cheaper reviews at list prices. Review quality is lower, but fine for low-stakes repos.

Limitations

Tested on Azure Repos Git only. The pr: YAML trigger doesn't work for Azure Repos (documented ADO quirk), so you must rely on branch policy validation triggers. It should work as-is for GitHub-hosted repos built with ADO pipelines, but that is untested.
No conversation memory. Each PR review is fresh; the AI doesn't remember prior PRs or learn from your feedback. (Feature: keeps it stateless and cheap. Limitation: it'll make the same mistake twice.)
Diff-only. The AI doesn't see the full file context, only the diff hunks. Some bugs that require broader context will be missed.
English-only prompt. The default prompt is written in English and assumes the codebase's comments are in English too.

License

MIT. Use it, modify it, ship it. No warranty.

Provenance

Extracted from a working production Azure DevOps PR review pipeline on 2026-05-23. The original was built in a single afternoon to address one team's review-discipline concerns. After a week of in-production use, all the lessons learned (Next.js cache invalidation, SYSTEM_PULLREQUEST_* env vars, status code semantics, etc.) have been folded back into this template and the troubleshooting guide.
