plotroom
v0.1.2
Embeddable CI/CD + pull-request operations dashboard — watch GitHub Actions pipeline health, runner timing and critical path, and PR merge-readiness, with one-click ready-to-merge / draft levers. Ships a React component (./embed) and a mountable Express b
pr-dashboard
Locally-hosted dashboard for every open PR across one or more GitHub orgs/users: CI/CD lifecycle stage, percent complete, and ETA — all in a single browser tab.

UI
The dashboard opens with a status strip: five colour-coded tiles (Running, Queued, Deploying, Failed, Idle) that show live PR counts. Clicking any tile filters the PR list to that state, and clicking again clears the filter.
Below the strip, each open PR is rendered as a metro-map row: a horizontal track of nodes (CI → merge queue → QA deploy → prod) with the active node pulsing, done segments filled green, and a small elapsed/ETA annotation under the active node.
Clicking a row expands a check Gantt panel: one row per CI job, grouped by workflow. The rollup workflow (e.g. CI) is shown first with required checks before advisory; other workflows (e.g. Auto-merge PRs) appear in a separate labeled section below. Each row has a proportional progress bar, elapsed time, and an ETA-accuracy footer line (typically ±Xm, n=N).
At the very top, when any PRs are in the merge queue, a queue-train strip shows the current build batch(es) as blue bordered "cars" with progress bars, followed by dashed "next batch" and "then" cars listing the waiting PRs. Clicking any PR anchor in a car smoothly scrolls the page to that PR's row (respecting prefers-reduced-motion).
Queued PRs show two labeled sections in their expanded panel: the merge group build (driving the stage ETA) and PR checks (head commit).
Features


Screenshots show public repos (facebook/*), captured from a demo instance.
- Pipeline stages per PR: CI → merge queue → QA deploy → awaiting prod (deploy stages only for repos with configured environments), plus parked substates (draft, conflicting, CI failed, queue group failed…).
- Live updates over SSE with frame dedup: unchanged snapshots aren't re-sent. A server keepalive forces a frame at least every 60s so the header's `live · updated HH:MM` stamp stays fresh.
- Connection badge: the header shows `live · updated HH:MM` while the SSE stream is connected and a red `disconnected — retrying…` badge when it drops (plus `stale since HH:MM` when GitHub fetches are failing).
- Queue position: queued PRs show `behind N` (entries ahead of it in the merge queue).
- ETAs from observed history: per-check p50/p90 durations, observed whole-group merge-queue runs, and merged→QA-live deploy gaps.
- Conditional-remaining estimator: when ≥5 historical samples exist for a running check, the remaining time is re-anchored on the samples that exceed the current elapsed time (handles bimodal cold/warm caches); if elapsed exceeds every sample the check is flagged overdue.
- ETA accuracy footer: the expanded check panel shows `ETA accuracy (ci): typically ±2m (n=14)`, the median absolute error of first-ETA predictions vs actual stage durations (seconds shown below one minute, e.g. `±45s`).
- Runner visibility: queued jobs are split into waiting-for-runner (every `needs:` dependency completed OK, so the job is eligible and just needs a machine: `⧗ waiting for runner · 3m (typical ~1m)`, amber when waiting exceeds 2× typical) vs blocked on a named upstream job (`⊘ blocked on static-checks`). Pickup waits are learned per `(repo, check, event)` as `wait = startedAt − max(needs' completedAt)`, with no extra API calls, and the medians feed stage ETAs for still-queued jobs. The `needs:` graph derived from `ci.yml` is phase-aware: a dependency whose job provably never runs for the check's event (e.g. a `merge_group`-only job seen from a PR-phase check) is satisfied by absence instead of being reported as blocking forever.
- Workflow-scoped required population: checks are attributed to the workflow that emitted them; prefix matching for required checks is scoped to the rollup workflow so helper workflows (e.g. `Auto-merge PRs`) can never pollute the required set.
- Settings panel: a gear button (⚙) in the header opens a slide-over panel for editing the safe subset of instance config live (see Settings panel below).
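The conditional-remaining estimator in the list above can be sketched roughly as follows. This is an illustration under stated assumptions, not the dashboard's actual code: the name `conditionalRemaining`, the choice of the median of the surviving samples, and the return shape are hypothetical. Only the ≥5-sample threshold, the re-anchoring on samples longer than the elapsed time, and the overdue rule come from the description.

```typescript
// Hypothetical sketch of a conditional-remaining estimator (durations in seconds).
type Estimate =
  | { kind: "insufficient-history" }
  | { kind: "overdue" }
  | { kind: "remaining"; seconds: number };

// Median of an already-sorted, non-empty array.
function medianOfSorted(sorted: number[]): number {
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]!
    : (sorted[mid - 1]! + sorted[mid]!) / 2;
}

function conditionalRemaining(samples: number[], elapsed: number): Estimate {
  // Fewer than 5 samples: the conditional estimate does not apply.
  if (samples.length < 5) return { kind: "insufficient-history" };
  // Keep only historical runs that lasted longer than we have already waited.
  const survivors = samples.filter((d) => d > elapsed).sort((a, b) => a - b);
  // Elapsed exceeds every sample: flag the check as overdue.
  if (survivors.length === 0) return { kind: "overdue" };
  // Assumption: use the median of the survivors as the expected total duration.
  return { kind: "remaining", seconds: medianOfSorted(survivors) - elapsed };
}
```

With a bimodal history such as `[60, 62, 65, 300, 310, 320]` and 100s elapsed, the warm-cache samples are discarded and the estimate is anchored on the cold-cache cluster.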
Setup (new user quickstart)
1. Clone and install
git clone https://github.com/your-fork/pr-dashboard.git
cd pr-dashboard
pnpm install
2. Authenticate with GitHub
The default token source is the gh CLI keyring:
gh auth login # once, follow the prompts
Alternatively, set tokenSource: "env" in your config and export GITHUB_TOKEN,
or register a dedicated GitHub App with pnpm app:setup (see
GitHub App mode below).
3. Configure (optional but recommended)
Create config.json in the repo root or ~/.config/pr-dashboard/config.json
(XDG; useful when you want one config for multiple checkouts). The first existing
file wins; PRDASH_CONFIG env var overrides both. See config.example.json for a
complete annotated example.
Minimal config to watch one org:
{ "owners": ["your-org"] }
Without a config file the app auto-derives the owner from the GitHub token (viewer.login).
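The lookup order described in this step might be sketched as below. The function name `resolveConfigPath` and the injected `exists` predicate are hypothetical, and checking the repo-root file before the XDG file is an assumption based on the order in which the text lists them.

```typescript
// Hypothetical helper illustrating "PRDASH_CONFIG overrides both; first existing file wins".
function resolveConfigPath(
  env: Record<string, string | undefined>,
  repoRoot: string,
  home: string,
  exists: (path: string) => boolean,
): string | null {
  const override = env["PRDASH_CONFIG"];
  if (override) return override; // the env var overrides both files
  const candidates = [
    `${repoRoot}/config.json`, // assumption: repo root is checked first
    `${home}/.config/pr-dashboard/config.json`,
  ];
  // null means no config file was found, so the owner is derived from the token.
  return candidates.find(exists) ?? null;
}
```

Passing `exists` in (for example `fs.existsSync`) keeps the resolution logic testable without touching the filesystem.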
4. Build and run
pnpm build # compiles frontend → dist/public; only needed once and after updates
pnpm start # http://127.0.0.1:4400
Dev mode with hot-module reload:
pnpm dev # Vite on :5173 proxying /api to :4400
First launch backfills ~50 commits of check-run history per repo so ETAs work
immediately. History lives in data/history.db. Deploy ancestry ("is this
merge commit live on QA/prod yet?") is answered through the GitHub compare API
by default — no git binary, no local clones. Only with ancestrySource:
"clone" do bare clones get created in data/clones/.
5. Install as a systemd user service
pnpm service:install # renders deploy/pr-dashboard.service.template → ~/.config/systemd/user/ and daemon-reloads
systemctl --user enable --now pr-dashboard
loginctl enable-linger $USER # keep running after logout
# Inspect
systemctl --user status pr-dashboard
journalctl --user -u pr-dashboard -f
To restart after a config change or update:
systemctl --user restart pr-dashboard
Config field reference
All fields are optional; the table shows default values.
| Field | Default | Description |
|---|---|---|
| owners | [] | GitHub org/user logins whose PRs are shown. Empty → auto-derived: App mode uses the installation account logins; gh/env use the token's viewer.login. |
| exclude | [] | Repo full names ("org/repo") to skip entirely. |
| port | 4400 | HTTP port. Bound to 127.0.0.1 only (loopback-only by design — see Security). |
| retentionDays | 7 | How many days of check-run history to keep in data/history.db. |
| batchSize | 6 | Default merge-queue batch size used in queue-position arithmetic. |
| tokenSource | "gh" | Where the GitHub token comes from. "gh" reads the gh CLI keyring (strips GITHUB_TOKEN). "env" reads GITHUB_TOKEN. "app" mints GitHub App installation tokens (see GitHub App mode). |
| app.appId | — | Required when tokenSource is "app". Numeric GitHub App id. File-only (never PUT-writable). |
| app.privateKeyPath | — | Required when tokenSource is "app". Path to the App's RSA private key PEM. File-only. |
| app.installationId | all installations | Optional restriction: pin the dashboard to a single installation. Omit to watch repos across every account the App is installed on (see Multi-installation). File-only. |
| webhooks.enabled | false | Opt-in signed webhook receiver (see Webhooks). File-only. |
| webhooks.secretPath | — | Required when webhooks are enabled. Path to the shared webhook secret file (written by pnpm app:setup). File-only. |
| webhooks.path | "/api/webhooks/github" | Route the receiver listens on. File-only. |
| notifications | see Notifications | Desktop/webhook notification sinks + daily digest. File-only except the enabled toggle. webhookUrl (often token-bearing) is shown host-masked everywhere. |
| deployUrlAllowlist | unset | Optional hostname allowlist for in-repo (.pr-dashboard.yml-sourced) deploy healthUrl/cloneUrl (cloneUrl is only checked with ancestrySource: "clone" — it is never touched otherwise). Unset → in-repo URLs honored as-is. File-only. |
| ancestrySource | "api" | How deploy ancestry is answered. "api" = GitHub compare API (no git binary, no local clones; a pre-existing clone serves as a transport-error fallback only). "clone" = local bare clones in data/clones/ (the previous mechanism — useful if you prefer local checks or are rate-limit constrained). File-only. |
| apiUrl | "https://api.github.com/graphql" | GraphQL endpoint. Override for GitHub Enterprise (e.g. "https://github.example.com/api/graphql"). |
| rateLimitFloor | 1000 | Remaining rate-limit budget below which polling degrades to slow intervals. |
| intervals.sweepMs | 60000 | Full-sweep poll interval (ms). |
| intervals.hotMs | 15000 | Fast-poll interval when active PRs are in flight (ms). |
| intervals.deployMs | 30000 | Deploy-health-check interval (ms). |
| deploy.<repo> | {} | Deploy-tracking config keyed by "owner/repo". Omit the key to disable deploy stages for that repo. |
| deploy.<repo>.cloneUrl | "https://github.com/<repo>.git" | Git URL for the bare clone used for ancestry checks. Only used (and only needed) when ancestrySource is "clone". |
| deploy.<repo>.defaultBranch | "main" | Branch that merges land on (used to anchor ancestry walks). |
| deploy.<repo>.environments[] | — | Array of deployment environments (at most one qa and one prod). |
| deploy.<repo>.environments[].name | — | Required. "qa" or "prod". |
| deploy.<repo>.environments[].healthUrl | — | Required. URL polled for the deployed SHA (expects a JSON body). |
| deploy.<repo>.environments[].auto | true for qa, false for prod | Whether deploys to this env trigger automatically (affects stage transitions). |
| deploy.<repo>.environments[].shaKey | "commitSha" | JSON key in the health response that contains the deployed commit SHA. |
| repos.<repo> | {} | Per-repo behaviour overrides keyed by "owner/repo". |
| repos.<repo>.requiredCheckPrefixes | derived from ci.yml | Check name prefixes that force a check to be treated as required mid-run, before GitHub marks it isRequired. Explicit [] disables prefix matching entirely for this repo. |
| repos.<repo>.rollupJobId | "ci" | The rollup job in ci.yml whose needs: closure defines required checks. Also used to scope prefix matching to the right workflow. Must equal the branch-protection required check (it's the gate the merge queue waits on); change both together when you rename the rollup job. |
| repos.<repo>.workflowPath | ".github/workflows/ci.yml" | Repo-relative path to the workflow YAML read for needs: derivation. |
| repos.<repo>.batchSize | global batchSize | Merge-queue batch size for this repo. Overrides the global value. |
| costPerMinute | unset | CI cost attribution: pool label → $ per runner-minute ("default" prices unlisted pools). Unset → the cost panels report minutes only. File-only (money figures come from the operator's file, never the browser). |
| poolMeta.<pool>.instanceType | unset | Display-only instance type for the pool (shown in the cost explorer tables). File-only. |
| poolMeta.<pool>.dollarsPerMinute | unset | $ per runner-minute for the pool — supersedes the costPerMinute entry for the same label (the "default" key participates in fallback the same way). File-only. |
| poolMeta.<pool>.podsPerNode | 1 | Bin-packing divisor (≥ 1, fractional OK): when N runner pods share one node (ARC/Kubernetes), pricing each job-minute at the node's $/min overstates spend N× — the effective rate is rate ÷ podsPerNode (label's entry, else "default"'s, else 1). File-only. |
| poolMeta.<pool>.note | unset | Operator free-text for the pool. File-only. |
Settings panel
The gear button (⚙) in the header opens a slide-over settings panel backed by
GET/PUT /api/config. Changes are written back to the loaded config file
(read-modify-write — hand-written fields outside the editable subset are
preserved verbatim) and hot-applied without a restart: the poller swaps its
config, re-arms its timers with the new intervals, and triggers an immediate
sweep.
Editable in the panel (the safe subset):
| Section | Fields |
|---|---|
| Watched repos | owners, exclude |
| Tuning | retentionDays, batchSize, intervals (sweep / hot / deploy) |
File-only (shown read-only in the panel):
- `tokenSource`, `apiUrl`, `port`: anything the UI can write, anything running on localhost can write via `PUT /api/config`. These three are the credential and network surface: a writable `tokenSource`/`apiUrl` would let a local process redirect your GitHub token to an attacker-controlled endpoint, and a writable `port` could re-bind the service. The server rejects any attempt to PUT them with `400 { offendingKeys }`; the UI's read-only rendering is a convenience, not the security boundary.
- `deploy.<repo>` and `repos.<repo>` blocks: edit them in `config.json` directly, or set them per-repo via an in-repo `.pr-dashboard.yml` (next section). The panel shows each repo's effective settings with a per-field source tag (override / in-repo / derived / default).
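The server-side rejection of file-only keys could look something like the sketch below. It assumes the editable subset is exactly the top-level keys from the panel table above; the names `EDITABLE_KEYS` and `validatePut`, and the `200` success status, are hypothetical.

```typescript
// Assumption: these are the only top-level keys a PUT may contain.
const EDITABLE_KEYS = new Set(["owners", "exclude", "retentionDays", "batchSize", "intervals"]);

interface PutResult {
  status: number;
  offendingKeys?: string[];
}

// Reject the whole request if it names any key outside the editable subset.
function validatePut(body: Record<string, unknown>): PutResult {
  const offending = Object.keys(body).filter((key) => !EDITABLE_KEYS.has(key));
  return offending.length > 0
    ? { status: 400, offendingKeys: offending }
    : { status: 200 };
}
```

The check is an allowlist rather than a denylist, so a newly added config field is file-only until someone deliberately makes it editable.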
The panel also has a Restart button (POST /api/admin/restart, with inline
confirmation): the server responds 202 and exits non-zero shortly after, so
systemd (Restart=on-failure) revives it; no shell execution is involved. The
UI rides out the bounce on the existing SSE auto-reconnect.
CI cost explorer
The metrics tab's CI cost panel attributes runner occupancy (every job's
start→end span, all conclusions — a failed or cancelled job burned its runner
too) to the job's runs-on pool, then prices it through the file-only rate
config: poolMeta.<pool>.dollarsPerMinute supersedes costPerMinute[<pool>]
per label, the "default" pair backs everything unlisted, and the winning
rate divides by podsPerNode (bin-packing correction). Unpriced pools report
minutes but stay out of the $ totals — a documented undercount, never a
fabricated zero. The explorer breaks the window down by pool (with
instance types), by job (top 15 by minutes), and by run (top 20,
grouped by event + head sha + run number, with best-effort PR links); each
open PR's expanded panel also shows CI cost this run for its current head
((partial) flags a $ figure that's missing unpriced checks).
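The rate resolution described above can be sketched as follows. The interface and function names are hypothetical, and the relative order of the two `"default"` entries (`poolMeta` before `costPerMinute`) is an assumption that mirrors the per-label rule.

```typescript
interface PoolMeta {
  dollarsPerMinute?: number;
  podsPerNode?: number;
}
interface RateConfig {
  costPerMinute?: Record<string, number>;
  poolMeta?: Record<string, PoolMeta>;
}

// Returns the effective $ per runner-minute for a pool, or null when the pool is unpriced.
function effectiveRate(cfg: RateConfig, pool: string): number | null {
  const rate =
    cfg.poolMeta?.[pool]?.dollarsPerMinute ?? // per-label poolMeta wins
    cfg.costPerMinute?.[pool] ?? // then the per-label costPerMinute entry
    cfg.poolMeta?.["default"]?.dollarsPerMinute ?? // then the "default" pair
    cfg.costPerMinute?.["default"];
  if (rate === undefined) return null; // unpriced: report minutes only, never a zero
  // Bin-packing correction: the label's entry, else "default"'s, else 1.
  const pods =
    cfg.poolMeta?.[pool]?.podsPerNode ?? cfg.poolMeta?.["default"]?.podsPerNode ?? 1;
  return rate / pods;
}
```

Returning `null` instead of `0` for unpriced pools keeps them out of the dollar totals, matching the "documented undercount" behaviour.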
Actuals import (POST /api/cost/actuals)
Attribution explains which jobs burn the minutes; it can't see idle runner capacity, node boot/teardown, or anything else on the real bill. To close the loop, push your actual daily spend in — the panel then overlays actual vs attributed dollars per day and headlines attribution coverage ("jobs explain 58% of fleet spend"); the unexplained remainder is idle/overhead. Coverage trending down while job minutes hold steady means the fleet is idling more.
The endpoint is provider-agnostic — anything that can curl can feed it:
// single row or an array; scope defaults to "fleet" (or name a pool label)
{ "scope": "fleet", "date": "2026-06-11", "dollars": 123.45, "source": "aws-ce" }
Rows upsert on (scope, date) so re-imports are idempotent. Validation is
strict (real calendar YYYY-MM-DD, finite dollars ≥ 0, no unknown keys) and
all-or-nothing — a cron never half-imports. Browser-originated cross-site
POSTs are blocked by the same-origin guard; header-less clients (curl, cron)
pass. Example nightly cron pulling yesterday's spend from AWS Cost Explorer
(illustrative — adapt the query/filters to your account):
# /etc/cron.d/pr-dashboard-cost-actuals — import yesterday's fleet spend at 09:30
30 9 * * * tony aws ce get-cost-and-usage --time-period Start=$(date -d yesterday +\%F),End=$(date +\%F) --granularity DAILY --metrics UnblendedCost --query 'ResultsByTime[0].{date:TimePeriod.Start,dollars:Total.UnblendedCost.Amount}' --output json | jq '{scope:"fleet",date:.date,dollars:(.dollars|tonumber),source:"aws-ce"}' | curl -sS -X POST -H 'content-type: application/json' -d @- http://127.0.0.1:4400/api/cost/actuals
Notifications
The poller already detects every alert-worthy transition; the notifier layer (issue #19) turns them into desktop notifications. The event types:
| Type | Fires when | Default |
|---|---|---|
| ci-failed | a PR enters parked/ci-failed (a required check failed) | on |
| group-failed | a queued PR's merge-group build fails | on |
| queue-blocked | a queue entry goes UNMERGEABLE (genuine conflict or cascade victim — the detail names the conflicting culprit PR) | on |
| ready | a PR's checks go green (ci -> ready/armed or ready/idle) | off |
| overdue | a stage's ETA is exceeded (overdue flips true) | off |
| prod-live | a merged PR's commit becomes prod ancestry ("shipped") | on |
| queue-stalled | a repo's merge queue enters dispatch-stall (repo-level) | on |
| duration-regression | a check's recent p50 steps up persistently — recent 10-run median ≥ 1.5× the prior 20-run median AND +60s (repo-level; hourly scan, issue #41) | on |
| runner-starvation | a runner pool's last-hour pickup-wait p90 exceeds max(5min, 4× its 7-day baseline p90) with ≥5 samples (repo-level; hourly scan, issue #45) | on |
Debounce: one notification per (PR, event type) while the condition holds;
if the condition clears (e.g. the failing check is retried green) and later
re-enters, it fires again. prod-live fires once per PR per process lifetime.
Repo-level types debounce per repo (queue-stalled) or per (repo, check,
event) (duration-regression — clears when the ratio drops below 1.2) or per
(repo, pool) (runner-starvation — hysteresis clears below 2× baseline).
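The fire/clear hysteresis for `duration-regression` can be illustrated with a small state function. The thresholds (1.5× and +60s to fire, below 1.2× to clear, 10 recent vs 20 prior runs) come from the table and debounce notes above; the function name, the oldest-first input order, and the requirement of 30 runs before evaluating are assumptions.

```typescript
function medianOf(values: number[]): number {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]!
    : (sorted[mid - 1]! + sorted[mid]!) / 2;
}

// durations: check durations in seconds, oldest first.
// active: whether the regression alert is currently raised.
// Returns the new alert state.
function nextRegressionState(durations: number[], active: boolean): boolean {
  if (durations.length < 30) return active; // assumption: wait for a full window
  const recent = medianOf(durations.slice(-10));
  const prior = medianOf(durations.slice(-30, -10));
  if (prior <= 0) return active; // avoid dividing by zero
  const ratio = recent / prior;
  if (!active) return ratio >= 1.5 && recent - prior >= 60; // fire threshold
  return ratio >= 1.2; // stay raised until the ratio drops below 1.2
}
```

The gap between the fire and clear thresholds is what keeps a check hovering around 1.5× from flapping between alerting and clearing.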
Sink A — host command (notifications in config.json, file-only)
"notifications": {
"enabled": true,
"command": ["notify-send", "{title}", "{body}"],
"webhookUrl": "https://hooks.slack.com/services/T123/B456/xxxx",
"digest": { "enabled": false, "hourLocal": 8 },
"events": { "ci-failed": true, "group-failed": true, "queue-blocked": true,
"ready": false, "overdue": false, "prod-live": true,
"queue-stalled": true, "duration-regression": true,
"runner-starvation": true }
}
- `command` is an argv array, run via `execFile`, never a shell, so a hostile PR title can't inject. `{title}`/`{body}` are substituted in any argument (never in `command[0]`, the executable).
- The whole block is file-only: `PUT /api/config` rejects it (the command executes on the host, so it must never be writable from the browser).
- Command failures are logged once and never crash a poll cycle.
- A type set `false` in `events` fires neither sink.
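A minimal sketch of the placeholder substitution described above, assuming a hypothetical helper named `buildArgv`. The single-pass replacement is a design choice of this sketch, not something the README specifies.

```typescript
// Substitute {title}/{body} in every argument except command[0], the executable.
function buildArgv(command: string[], title: string, body: string): string[] {
  const [executable, ...args] = command;
  if (executable === undefined) throw new Error("command must not be empty");
  // One pass with a function replacer: substituted text is never re-scanned,
  // and "$" sequences in a PR title are not treated as replacement patterns.
  const fill = (arg: string): string =>
    arg.replace(/\{(title|body)\}/g, (_match: string, key: string) =>
      key === "title" ? title : body,
    );
  return [executable, ...args.map(fill)];
}
```

The result would then be handed to Node's `execFile(argv[0], argv.slice(1), callback)` from `node:child_process`, which passes each element as a separate argument without invoking a shell.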
Sink B — browser notifications (the header bell)
Notification events also ride the SSE stream as named notification frames.
The bell button in the header toggles browser Web Notifications: turning it on
requests Notification permission and persists the choice in localStorage.
Works regardless of notifications.enabled (that flag gates only the host
command and the webhook); the per-type events toggles apply to every sink.
Caveat: there is no service worker — the dashboard tab must be open (backgrounded is fine) to receive browser notifications. For tab-independent delivery, use the command or webhook sink.
Sink C — generic webhook (notifications.webhookUrl, file-only)
Set webhookUrl to any HTTP(S) endpoint (Slack/Discord/ntfy relay, your own
receiver) and every notification event is POSTed as JSON while
notifications.enabled is true:
{ "type": "ci-failed", "repo": "acme/widgets", "prNumber": 7,
"title": "fix: the thing", "detail": "a required check failed",
"at": "2026-06-12T08:00:00.000Z" }
- Fire-and-forget, no retries (v1): one POST per event with a 5s timeout. A missed delivery is dropped. A retry queue would swap the occasional missed ping for a potential duplicate storm, which is the wrong trade for notifications. If a receiver needs reliability, poll `GET /api/state` instead.
- Failures are logged at most once per hour and never crash a poll cycle.
- File-only: webhook URLs routinely carry tokens in the path (Slack), so the key is rejected by `PUT /api/config` and the settings panel only ever shows the scheme+host (`https://hooks.slack.com/…`); the same mask applies to `GET /api/config` and the server log line.
- Slack/Discord expect their own payload shapes (`{"text": …}` etc.), so point `webhookUrl` at a small relay (or an ntfy topic) if you need translation; the dashboard deliberately stays format-agnostic.
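The scheme+host masking mentioned above can be approximated with the standard `URL` class. The function name `maskWebhookUrl` and the placeholder returned for unparseable input are assumptions of this sketch.

```typescript
// Show only scheme and host; the path (which may carry a token) is never echoed.
function maskWebhookUrl(raw: string): string {
  try {
    const url = new URL(raw);
    return `${url.protocol}//${url.host}/…`;
  } catch {
    // Assumption: an unparseable value is replaced wholesale rather than shown.
    return "(invalid URL)";
  }
}
```

Applying the same function in the settings panel, `GET /api/config`, and the log line keeps the three surfaces consistent.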
Daily digest (notifications.digest, file-only)
A scheduled morning summary of the last 24h — merges, queue ejects with the
top culprit check (cross-referenced against the flake radar), runner-pool p90
vs baseline, active duration regressions, and queue health — sent through the
command and webhook sinks (and the SSE stream) as one digest event with a
multi-line body.
- Enable with `"digest": { "enabled": true, "hourLocal": 8 }` in the `notifications` block (file-only; restart to apply). `hourLocal` is the server's local hour, 0–23.
- The scheduler self-re-arms to the next wall-clock occurrence after each firing (DST-safe), and a compose failure never kills the schedule.
- Delivery uses the same gates as every event: `enabled` must be true for the command/webhook sinks; the browser bell shows it regardless.
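One way to compute the "next wall-clock occurrence" that the scheduler re-arms to is shown below. This is a sketch of the general technique, with a hypothetical function name; how the dashboard actually schedules the digest is not shown in this README.

```typescript
// Next local-time occurrence of hourLocal:00 strictly after `now`.
function nextDigestTime(now: Date, hourLocal: number): Date {
  const next = new Date(now.getTime());
  next.setHours(hourLocal, 0, 0, 0); // local wall clock, not UTC
  if (next.getTime() <= now.getTime()) {
    // Step by calendar day rather than adding 24h, so a DST shift
    // does not move the firing hour.
    next.setDate(next.getDate() + 1);
  }
  return next;
}
```

A scheduler would call this after each firing and arm a single `setTimeout` for `next.getTime() - Date.now()` milliseconds.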
Actuals cron (cloud cost feed)
scripts/post-actuals.sh [days] pulls daily cost from AWS Cost Explorer
(filter configurable via ACTUALS_FILTER_JSON; defaults to EC2 compute) and
POSTs it to /api/cost/actuals, powering the attribution-coverage view.
Wire it as a systemd user timer with the templates in deploy/
(pr-dashboard-actuals.{service,timer}.template — render __APP_ROOT__,
enable with systemctl --user enable --now pr-dashboard-actuals.timer).
Any other cost source works too — the endpoint just takes {date, dollars}.
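For feeds written from scratch, the validation rules stated for `POST /api/cost/actuals` (real calendar `YYYY-MM-DD`, finite dollars ≥ 0, no unknown keys, all-or-nothing) can be pre-checked on the client. The sketch below is an assumption-laden illustration: the allowed key set is inferred from the example row, and the function names are hypothetical.

```typescript
// Assumption: the accepted keys are the four shown in the example row.
const ACTUALS_KEYS = new Set(["scope", "date", "dollars", "source"]);

function isRealCalendarDate(value: string): boolean {
  const m = /^(\d{4})-(\d{2})-(\d{2})$/.exec(value);
  if (!m) return false;
  const year = Number(m[1]);
  const month = Number(m[2]);
  const day = Number(m[3]);
  // Round-trip through Date: 2026-02-30 would normalise to March and fail.
  const dt = new Date(Date.UTC(year, month - 1, day));
  return (
    dt.getUTCFullYear() === year &&
    dt.getUTCMonth() === month - 1 &&
    dt.getUTCDate() === day
  );
}

function rowIsValid(row: Record<string, unknown>): boolean {
  if (Object.keys(row).some((key) => !ACTUALS_KEYS.has(key))) return false;
  const date = row["date"];
  if (typeof date !== "string" || !isRealCalendarDate(date)) return false;
  const dollars = row["dollars"];
  return typeof dollars === "number" && Number.isFinite(dollars) && dollars >= 0;
}

// All-or-nothing: one bad row rejects the whole batch.
function batchIsValid(rows: Record<string, unknown>[]): boolean {
  return rows.every(rowIsValid);
}
```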
Pool-rate relay (true per-capacity $/min)
Where the actuals cron feeds the fleet total (the denominator of the
coverage view), scripts/relay-pool-rates.sh [days] feeds the per-pool
rates (the numerator). It asks Cost Explorer for EC2-Compute split by
PURCHASE_TYPE (spot vs on-demand), reads the dashboard's own per-pool
runner-minutes from /api/metrics, and writes a real
dollarsPerMinute into each pool's config.json poolMeta —
type $ ÷ that type's runner-minutes. Spot pools land on the cheap spot
rate, on-demand pools on the dearer one, automatically. GitHub-hosted
pools (ubuntu-latest) are on a separate bill and are skipped.
It's self-correcting: the rate's denominator is the same minute
snapshot the dashboard prices against, so attributed-$ reconciles to
actual-$ by construction. Cumulative fleet coverage settles at the
honest EC2-Compute ÷ total-fleet ratio (~90% — the gap is non-compute
EC2-Other/EKS/VPC, which isn't per-runner-minute attributable). Two
expected wrinkles: (1) coverage spikes briefly right after the relay's
reload while pool-learning re-warms the volatile unknown pool, then
settles within ~30s; (2) per-day coverage is noisy because a blended
monthly $/min won't match each day's CE allocation — read the
cumulative number. DRY_RUN=1 prints the rates without writing.
Wire it as a daily systemd user timer with the deploy/ templates
(pr-dashboard-pool-rates.{service,timer}.template — render
__APP_ROOT__, enable with
systemctl --user enable --now pr-dashboard-pool-rates.timer). It runs
at 06:30, after the actuals feed, so CE has settled for the day.
Kiosk mode (wall displays)
Append ?kiosk=1 to the dashboard URL for a read-only, at-a-distance view
intended for wall-mounted displays (e.g. a Raspberry Pi running Chromium in
kiosk mode):
http://127.0.0.1:4400/?kiosk=1 # 30s per view (default)
http://127.0.0.1:4400/?kiosk=1&cycle=20 # 20s per view (minimum 10)
- Read-only chrome: the settings gear, legend, and notification bell are hidden, the Pipeline/Metrics tab bar is gone, status tiles are plain (non-filtering) summaries, and PR rows don't expand on click. The status strip stays: it's the glanceable summary.
- Larger type for readability across the room.
- Auto-cycling: the view rotates every `cycle` seconds. Each repo section is scrolled to the top of the viewport in turn, then the Metrics trends view shows, then the cycle wraps. Cycling pauses while the tab is hidden and honors `prefers-reduced-motion` (instant jumps instead of smooth scrolling).
- Live updates unchanged: the same SSE stream (keepalive + auto-reconnect) feeds the kiosk view, so it never needs a manual refresh.
Params are read once at page load; change the URL and reload to adjust.
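Reading the two kiosk parameters might look like the sketch below. The 30s default and the 10s minimum come from the examples above; clamping a too-small `cycle` up to 10 (rather than ignoring it) and the function name are assumptions.

```typescript
interface KioskParams {
  kiosk: boolean;
  cycleSeconds: number;
}

// `search` is the query string, e.g. window.location.search.
function parseKioskParams(search: string): KioskParams {
  const params = new URLSearchParams(search);
  const kiosk = params.get("kiosk") === "1";
  const raw = Number(params.get("cycle")); // a missing param yields 0
  const cycleSeconds = Number.isFinite(raw) && raw > 0 ? Math.max(10, raw) : 30;
  return { kiosk, cycleSeconds };
}
```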
In-repo .pr-dashboard.yml
Any watched repo can carry its own dashboard settings in a .pr-dashboard.yml
at the repo root of its default branch. The file is read over the GraphQL API
(blob read — no clone needed) at startup and refreshed on the same 24h cycle as
ci.yml prefix derivation. This is the "repo layer" of the config model — the
repo's maintainers describe how their CI/deploys work, and any dashboard
instance watching the repo picks it up automatically.
Schema
All fields optional; unknown keys are ignored with a logged warning, and an invalid field is dropped individually (one bad field never takes the rest of the file with it).
rollupJobId: ci # rollup job in the workflow below (default: ci)
workflowPath: .github/workflows/ci.yml
requiredCheckPrefixes: [] # replaces ci.yml derivation when set ([] disables prefix matching)
batchSize: 6 # merge-queue batch size
aliases: # carry learned history across a check rename
  static-checks: checks # old canonical check name -> new name
deploy: # enables deploy stages for this repo
  cloneUrl: https://github.com/owner/repo.git # default: GitHub URL of the repo (clone mode only)
  defaultBranch: main
  environments: # at most one qa and one prod
    - name: qa # qa | prod
      healthUrl: https://qa.example.com/health # required per environment
      auto: true # default: true for qa, false for prod
      shaKey: commitSha # JSON key holding the deployed SHA
Validation and defaulting are identical to the same fields in config.json
(env names lowercased, qa/prod only, healthUrl required, shaKey
defaults to commitSha).
Precedence
Per-repo settings resolve field-by-field, highest first:
- Instance override: `repos.<repo>` / `deploy.<repo>` in your `config.json` always wins (your instance, your last word).
- In-repo: the repo's `.pr-dashboard.yml`.
- Derived: prefixes derived from the repo's `ci.yml` (`requiredCheckPrefixes` only).
- Defaults.
The settings panel's per-repo section shows which layer each effective value
came from (override / in-repo / derived / default).
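Field-by-field resolution with a source tag can be sketched as a small generic function. The names `resolveField` and `Resolved` are hypothetical; the layer order and the four tags are taken from the list above.

```typescript
type Source = "override" | "in-repo" | "derived" | "default";

interface Resolved<T> {
  value: T;
  source: Source;
}

// Pick the highest-precedence layer that defines the field.
function resolveField<T>(
  layers: { override?: T; inRepo?: T; derived?: T },
  fallback: T,
): Resolved<T> {
  if (layers.override !== undefined) return { value: layers.override, source: "override" };
  if (layers.inRepo !== undefined) return { value: layers.inRepo, source: "in-repo" };
  if (layers.derived !== undefined) return { value: layers.derived, source: "derived" };
  return { value: fallback, source: "default" };
}
```

Testing against `undefined` rather than truthiness matters here: an explicit empty `requiredCheckPrefixes: []` is a real value that must win over lower layers.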
Surviving workflow/job renames
Reorganizing CI is a non-event for the dashboard if you treat .pr-dashboard.yml
as config-as-code that travels in the same PR as the rename. Everything below is
a top-level key in the watched repo's .pr-dashboard.yml (at the repo root
of its default branch — the same file shown under Schema above), so
the dashboard never observes a mismatched state. A dashboard-instance override
in config.json (repos.<repo>.*) would win over the in-repo file, but you
rarely want that for a rename — keep the change with the repo.
- Rename a workflow file (`ci.yml` → `main.yml`): nothing to do. If the configured/default path stops resolving, the dashboard auto-discovers which file under `.github/workflows/` now defines `rollupJobId` (one GraphQL tree read), adopts it, and remembers the path. Pinning `workflowPath` explicitly opts out of discovery (the declared path is honored verbatim).
- Rename the rollup job (`ci` → something): add a top-level `rollupJobId: <new>` in the same PR. This is the one value that can't self-discover, because it must equal the branch-protection required check (the gate the merge queue waits on), so you're changing the ruleset anyway, and this line moves with it.
- Rename any other check/job: add an `aliases:` entry (`old-name: new-name`). The renamed check then keeps its learned history (ETAs, runner-pool mapping, flake/duration stats) instead of cold-starting. Without an alias the new name still self-heals (durations re-learn over ~20 runs, pools re-learn within hours); the alias just removes that transient. Aliases are applied once and rewrite stored history under the new name (idempotent; reverse with the inverse alias).
A worked example — .pr-dashboard.yml in the PR that renames the ci rollup
job to merge-gate, the workflow file to main.yml, and the static-checks
job to checks:
rollupJobId: merge-gate # was: ci (default), must match branch protection
# workflowPath omitted on purpose: auto-discovered after the file rename
aliases:
  static-checks: checks # carry static-checks' ETA/pool/flake history forward
Trust note
By default, in-repo deploy URLs are honored as-is: a .pr-dashboard.yml
lets that repo's maintainers point healthUrl (which the dashboard polls) and
— in clone mode — cloneUrl (which it clones) anywhere. That is acceptable for
the single-user self-hosted deployment — you control the repos you watch. If
you watch repos you don't fully control, set deployUrlAllowlist in
config.json: when set, in-repo deploy entries whose healthUrl host (plus
the cloneUrl host when ancestrySource is "clone" — the only mode that
touches it) is not on the list are dropped with a logged warning.
Instance-override deploy config (deploy.<repo> in your own config.json) is
exempt — the operator wrote it.
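The allowlist filter described in this note might be approximated as follows. Exact hostname matching, dropping the whole deploy entry when any URL fails, and all names in the sketch are assumptions; the README only states that non-listed hosts cause in-repo entries to be dropped.

```typescript
interface InRepoDeploy {
  healthUrls: string[];
  cloneUrl?: string;
}

function isHostAllowed(rawUrl: string, allowlist: string[]): boolean {
  try {
    const host = new URL(rawUrl).hostname.toLowerCase();
    return allowlist.some((allowed) => allowed.toLowerCase() === host);
  } catch {
    return false; // an unparseable URL is never allowed
  }
}

// Decide whether an in-repo deploy entry survives the allowlist.
function keepInRepoDeploy(
  entry: InRepoDeploy,
  allowlist: string[] | undefined,
  ancestrySource: "api" | "clone",
): boolean {
  if (allowlist === undefined) return true; // unset: in-repo URLs honored as-is
  const urls = [...entry.healthUrls];
  // cloneUrl is only checked in clone mode, the only mode that touches it.
  if (ancestrySource === "clone" && entry.cloneUrl !== undefined) urls.push(entry.cloneUrl);
  return urls.every((url) => isHostAllowed(url, allowlist));
}
```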
Path anchoring
All runtime paths (data directory, config file, static files) are resolved
relative to the package root — the directory containing package.json.
The server does not rely on process.cwd(), so it works when started from
any directory (including via systemd).
Environment overrides:
- `PRDASH_DATA_DIR`: override the data directory (default: `<root>/data`)
- `PRDASH_CONFIG`: override the config file path (default: `<root>/config.json`)
Required-check prefixes (derived from ci.yml)
Repos that gate merges on a single rollup job (e.g. a ci rollup) don't mark
checks isRequired until late in the run. To classify required checks mid-run
the poller uses name prefixes, resolved in this order:
- Config: `repos["<owner>/<repo>"].requiredCheckPrefixes` in `config.json` always wins. An explicit empty array (`[]`) disables prefix matching entirely for that repo.
- Derived: the poller reads the repo's workflow file (default `.github/workflows/ci.yml`) at startup (and re-derives every 24h) and walks the rollup job's `needs:` graph; each job in the closure contributes its display name as a prefix (reusable-workflow jobs get a `/` suffix). With `ancestrySource: "api"` (the default) the file is read over the GraphQL blob API, with no clone needed, for every deploy repo and every repo that opts in via a `repos.<repo>` entry or an in-repo `.pr-dashboard.yml`. With `ancestrySource: "clone"` derivation reads the bare clone and is limited to deploy repos (as before). A successful derivation is logged: `[poller] derived required-check prefixes for <repo>: …`. Unparseable YAML leaves the previous prefixes in place; valid YAML with no rollup job degrades to `['ci']`.
- Fallback: `['ci']`, used only until derivation succeeds.
Prefix matching is scoped to the rollup workflow: a check whose name starts with a required prefix only counts as required when it was emitted by the same workflow that owns the rollup job. Checks from helper workflows (e.g. an auto-merge orchestrator) are excluded from the required population regardless of their names.
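The `needs:` closure walk behind the derived prefixes can be sketched on an already-parsed workflow. Several details here are assumptions rather than documented behaviour: the `Jobs` shape, excluding the rollup job itself from the result, sorting by job id, and rendering the reusable-workflow suffix as `" /"`.

```typescript
type Jobs = Record<string, { name?: string; needs?: string[]; uses?: string }>;

// Collect every job reachable from the rollup job via needs:, then map to prefixes.
function derivePrefixes(jobs: Jobs, rollupJobId: string): string[] {
  const rollup = jobs[rollupJobId];
  if (!rollup) return ["ci"]; // valid YAML with no rollup job degrades to ['ci']
  const seen = new Set<string>();
  const stack: string[] = [...(rollup.needs ?? [])];
  while (stack.length > 0) {
    const id = stack.pop()!;
    if (seen.has(id)) continue; // guard against diamonds and cycles
    seen.add(id);
    stack.push(...(jobs[id]?.needs ?? []));
  }
  return Array.from(seen)
    .sort()
    .map((id) => {
      const job = jobs[id];
      const display = job?.name ?? id; // display name, falling back to the job id
      // Assumption: reusable-workflow jobs (those with `uses`) get a " /" suffix.
      return job?.uses ? `${display} /` : display;
    });
}
```

Jobs that the rollup never depends on, directly or transitively, stay out of the result, which is what keeps unrelated jobs in the same file from being treated as required.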
GitHub App mode (pnpm app:setup)
Instead of a personal token, the dashboard can authenticate as a GitHub App
you register yourself — read-only permissions (checks, pull_requests,
actions, contents, metadata), private to your account, no PAT to rotate.
Registration is one command via GitHub's app-manifest flow:
pnpm app:setup # optional: pnpm app:setup my-dashboard-app
The script starts a one-shot localhost listener and prints a URL. Open it: it forwards you to GitHub's pre-filled "Create GitHub App" page (the name is editable there). After you confirm, GitHub redirects back to the local listener and setup completes automatically:
- the App's private key is written to `~/.config/pr-dashboard/<slug>.private-key.pem` (mode 0600), and the generated webhook secret to `<slug>.webhook-secret` alongside it;
- your active config file is patched (read-modify-write, other fields are preserved) to `tokenSource: "app"` with `app: { appId, privateKeyPath }`;
- the install URL (`https://github.com/apps/<slug>/installations/new`) is printed. Install the App on the account/repos the dashboard should watch, then restart the dashboard.
At runtime the server mints short-lived installation tokens itself (App JWT →
installation access token, cached and refreshed before expiry, built on
node:crypto only). At startup it lists the App's installations and builds a
per-owner client map; the list is refreshed every 24 hours, so new
installations are picked up without a restart.
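The first half of that chain, signing the App JWT with node:crypto alone, can be sketched as follows. This is a sketch under stated assumptions, not the server's implementation: signAppJwt and the demo variables are hypothetical names, and the demo uses a throwaway key pair instead of the PEM at app.privateKeyPath. The claim values follow GitHub's documented rules for App JWTs (iss is the App ID, expiry at most 10 minutes ahead, iat backdated to tolerate clock drift).

```typescript
import { createSign, createVerify, generateKeyPairSync } from "node:crypto";

// JWT segments are unpadded base64url.
const b64url = (text: string): string => Buffer.from(text, "utf8").toString("base64url");

// Build an RS256-signed App JWT valid for a 10-minute window around nowSec.
function signAppJwt(appId: string, privateKeyPem: string, nowSec: number): string {
  const header = b64url(JSON.stringify({ alg: "RS256", typ: "JWT" }));
  const payload = b64url(
    JSON.stringify({ iat: nowSec - 60, exp: nowSec + 9 * 60, iss: appId }),
  );
  const signingInput = `${header}.${payload}`;
  const signature = createSign("RSA-SHA256").update(signingInput).sign(privateKeyPem);
  return `${signingInput}.${signature.toString("base64url")}`;
}

// Demo with a throwaway key pair (illustration only).
const { privateKey, publicKey } = generateKeyPairSync("rsa", { modulusLength: 2048 });
const demoPem = privateKey.export({ type: "pkcs8", format: "pem" }).toString();
const demoJwt = signAppJwt("12345", demoPem, Math.floor(Date.now() / 1000));

// Verify the signature with the matching public key.
const [headerPart, payloadPart, signaturePart] = demoJwt.split(".");
const demoJwtValid = createVerify("RSA-SHA256")
  .update(`${headerPart}.${payloadPart}`)
  .verify(publicKey, Buffer.from(signaturePart, "base64url"));
```

The JWT is then exchanged for an installation access token via GitHub's POST /app/installations/{installation_id}/access_tokens endpoint. Those tokens expire after an hour, which is why caching and refreshing before expiry is needed.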
Multi-installation
One instance watches repos across all of the App's installations. Install
the App on each account whose repos the dashboard should watch (the install
URL again: https://github.com/apps/<slug>/installations/new). Every GitHub
request is routed to the installation that covers the repo's owner — each
installation gets its own token and its own rate-limit budget. An owner in
your owners list that no installation covers is skipped with a one-time
owner '<owner>' has no installation — skipped warning (it's a config
mismatch, not an outage — the repo's data is never marked stale for it).
With tokenSource: "app" and no owners configured, the owners list defaults
to the installation account logins — installing the App on an account is
enough to start watching it.
app.installationId is an optional restriction: when set, the registry is
pinned to that single installation and only its account's repos are visible.
Use it when one App serves several dashboard instances and each should only
see its own account.
Alternative pattern — one instance per account. Instead of one instance
spanning installations, you can still run a separate instance per account:
distinct PRDASH_CONFIG / PRDASH_DATA_DIR / port, with app.installationId
pinned to that account's installation. Useful when you want per-account
isolation (separate ports, data dirs, lifecycles) rather than a single
combined dashboard.
Webhooks (optional)
Polling is the primary update mechanism and works with zero ingress. If you want lower-latency updates, the server ships an opt-in, signature-verified webhook receiver:
{
  "webhooks": {
    "enabled": true,
    "secretPath": "~/.config/pr-dashboard/<slug>.webhook-secret"
  }
}
Every delivery must carry a valid X-Hub-Signature-256 HMAC (verified with a timing-safe compare against the shared secret in secretPath, which pnpm app:setup writes for you). Bad/missing signature → 401; receiver disabled → 404.
Events nudge the poller out-of-band: pull_request/check_run/check_suite trigger a targeted PR refresh, merge_group a queue refresh, workflow_run a sweep. Webhooks are a hint, not the source of truth: polling still reconciles everything.
When webhooks are enabled the hot-poll interval automatically relaxes ×4 (an explicit intervals.hotMs in your config still wins).
The server stays loopback-only, so GitHub needs a tunnel to reach it. Point the tunnel at the receiver path on loopback:
# cloudflared
cloudflared tunnel --url http://127.0.0.1:4400
# then set the App's webhook URL to https://<tunnel-host>/api/webhooks/github

# or smee.io (dev-grade)
smee --url https://smee.io/<channel> --target http://127.0.0.1:4400/api/webhooks/github
Enable the webhook in your App's settings (URL + the same secret) after the tunnel is up. Without ingress, simply leave webhooks.enabled false; everything works on polling alone.
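For reference, the signature check described in this section can be sketched with node:crypto. This is a minimal sketch, not the receiver's actual code: verifyGithubSignature and the demo values are hypothetical. The header format (sha256= followed by a hex digest of the raw request body) is GitHub's documented scheme.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Returns true when the X-Hub-Signature-256 header value matches the
// HMAC-SHA256 of the raw request body under the shared secret.
function verifyGithubSignature(
  rawBody: Buffer,
  signatureHeader: string | undefined,
  secret: string,
): boolean {
  const prefix = "sha256=";
  if (!signatureHeader || !signatureHeader.startsWith(prefix)) return false;
  const expected = createHmac("sha256", secret).update(rawBody).digest();
  const received = Buffer.from(signatureHeader.slice(prefix.length), "hex");
  // timingSafeEqual throws on unequal lengths, so reject those up front.
  if (received.length !== expected.length) return false;
  return timingSafeEqual(received, expected);
}

// Demo: a correctly signed body, a tampered body, and a missing header.
const demoSecret = "example-secret";
const demoBody = Buffer.from('{"action":"opened"}', "utf8");
const demoHeader = "sha256=" + createHmac("sha256", demoSecret).update(demoBody).digest("hex");

const acceptsValid = verifyGithubSignature(demoBody, demoHeader, demoSecret);
const rejectsTampered = !verifyGithubSignature(Buffer.from("{}", "utf8"), demoHeader, demoSecret);
const rejectsMissing = !verifyGithubSignature(demoBody, undefined, demoSecret);
```

The HMAC must be computed over the raw bytes as received. Re-serializing parsed JSON can change whitespace or key order and make valid deliveries fail. A false result maps to the 401 response described above.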
Security model
- Loopback-only by design. The server binds to 127.0.0.1 only and there is no configuration option to expose it publicly. Do not reverse-proxy it without adding authentication.
- Same-origin guard on mutations. Mutating endpoints (PUT /api/config, POST /api/admin/restart) reject cross-site requests with 403: a present Sec-Fetch-Site header must be same-origin/none, and a present Origin must be a localhost origin on the configured port. This blocks browser-mediated CSRF from random websites against the loopback service. The webhook path is exempt: it is authenticated by its HMAC signature instead.
- Credential/network config is file-only. tokenSource, apiUrl, port, app, webhooks, ancestrySource, and deployUrlAllowlist can never be written through PUT /api/config (the server rejects them with 400 { offendingKeys }). Anything that could redirect your token or re-bind the service requires editing the config file on disk.
- In-repo deploy URLs and deployUrlAllowlist. Watched repos can carry a .pr-dashboard.yml that points healthUrl (and, in clone mode, cloneUrl) anywhere. If you don't fully control every watched repo, set deployUrlAllowlist to the hostnames you trust; non-matching in-repo deploy entries are dropped with a warning (instance-override config is exempt).
- Bare clones in data/clones/ contain full repository history (clone mode only). With the default ancestrySource: "api" no clones are ever created (ancestry runs over the compare API) and this concern does not apply. With ancestrySource: "clone", any repo configured under deploy.<repo> will have its full git history cloned locally; on private repos this means all commits, messages, and tree objects are stored on disk in data/clones/. Protect this directory accordingly; it is excluded from source control via .gitignore.
- Token source. The default tokenSource: "gh" reads the token from the gh CLI keyring and deliberately strips the GITHUB_TOKEN environment variable so a stale env var cannot shadow the fresh keyring credential. tokenSource: "env" reads GITHUB_TOKEN directly; ensure the environment is clean when using this mode (e.g. avoid exporting a stale token from a shell profile). tokenSource: "app" reads the App's PEM from app.privateKeyPath; installation tokens are cached in memory only and never logged or returned by any endpoint.
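The same-origin guard from the list above can be sketched as a pure header check. This is an illustration under assumptions, not the server's code: isSameOriginRequest is a hypothetical name, and treating localhost, 127.0.0.1, and [::1] as the "localhost origins" is this example's guess at what the real guard accepts.

```typescript
// Decide whether a mutating request passes the guard. Header names are
// lowercase, as Node's http module delivers them. Absent headers pass,
// because the rule only constrains headers that are present.
function isSameOriginRequest(
  headers: Record<string, string | undefined>,
  port: number,
): boolean {
  const site = headers["sec-fetch-site"];
  if (site !== undefined && site !== "same-origin" && site !== "none") return false;

  const origin = headers["origin"];
  if (origin !== undefined) {
    let url: URL;
    try {
      url = new URL(origin);
    } catch {
      return false; // malformed Origin, including the literal "null"
    }
    // Assumption: these hostnames are what counts as "localhost".
    const localHosts = ["localhost", "127.0.0.1", "[::1]"];
    const defaultPort = url.protocol === "https:" ? 443 : 80;
    const effectivePort = url.port === "" ? defaultPort : Number(url.port);
    if (!localHosts.includes(url.hostname) || effectivePort !== port) return false;
  }
  return true;
}

// Example decisions for a dashboard on port 4400.
const allowsNoHeaders = isSameOriginRequest({}, 4400);
const allowsLocalUi = isSameOriginRequest(
  { "sec-fetch-site": "same-origin", origin: "http://localhost:4400" },
  4400,
);
const blocksCrossSite = !isSameOriginRequest({ "sec-fetch-site": "cross-site" }, 4400);
const blocksForeignOrigin = !isSameOriginRequest({ origin: "https://evil.example" }, 4400);
```

A caller would answer 403 whenever this returns false. Requests without either header (for example from curl) still pass, which is consistent with the guard targeting browser-mediated CSRF rather than local tooling.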
Embedding Plotroom in a host app
Plotroom (npm: plotroom) is source-only: a host app consumes it and hosts
both tiers in-process — there is no separate Plotroom service and no cross-service
proxy. It exposes two mountable surfaces, frontend (./embed) and backend
(./server). The standalone (pnpm start, :4400) is unchanged for local/dev.
Full handoff: docs/embedding-host-guide.md.
Backend — mount the API + poller in your own Express server (plotroom/server):
import { createPrDashboardBackend } from 'plotroom/server';

const prdash = await createPrDashboardBackend({
  config,                           // resolved config object, or { path }
  dataDir: '/data/prdash',          // YOUR volume: history.db / workspace.db / clones
  githubApp: { appId, privateKey }, // PEM string from YOUR env (not a file path)
  // serveStatic defaults false (you serve the frontend); trustHostAuth defaults true
});
app.use('/bff/ops/prdash', requireAdminSession, prdash.router); // YOUR auth gates it
const stopPoller = prdash.startPoller(); // run the poller in-process; call stopPoller() on shutdown
- router is a mountable Express app, so there is no .listen(). trustHostAuth (default true) skips the built-in same-origin guard because your middleware (requireAdminSession) is the gate; there is no shared secret.
- Single instance + a persistent volume at dataDir (better-sqlite3 is single-writer; don't horizontally scale the poller). /api/admin/restart is a no-op when mounted (it never calls process.exit on your host).
- With inline githubApp creds, pass config as an object (as above), not { path }: the file loader still requires app.privateKeyPath for tokenSource: "app", whereas the object path lets the inline privateKey stand alone.
Frontend — mount the component (plotroom/embed), pointing apiBase at your own backend mount:
import { PrDashboard } from 'plotroom/embed';
import 'plotroom/embed/style.css';
// the API lives under `${mount}/api`, so apiBase = your mount path + /api
<PrDashboard apiBase="/bff/ops/prdash/api" basename="/ops/prdash" routerMode="path" />
Props: apiBase?, basename?, routerMode?, focusedRepo?, onFocusChange?, className?, withCredentials?.
Consumption: the package is published to npm as plotroom —
npm i plotroom ships the prebuilt dist/embed (frontend) + dist/server
(backend, tsc), no build step at install time. A git dependency also works
("plotroom": "github:steiner385/plotroom#<sha>"; its prepare builds both on
install). React 19 is a peer dependency (host provides the single instance). The
./server export's emitted ESM uses extensionless imports — import it under
tsx or a bundler (plain node ESM needs .js extensions; a plain-node
build is a follow-up, #200).
License
MIT © 2026 Tony Stein.
