@webui-rubric/cli
v0.1.8
Published
End-user CLI for `webui-rubric`. Orchestrates the full evaluation pipeline: loads and validates the project config, runs the Playwright capture pipeline (`@webui-rubric/capture`), runs all deterministic checks (`@webui-rubric/checks`), scores dimensions a
Downloads
1,321
Readme
@webui-rubric/cli
End-user CLI for webui-rubric. Orchestrates the full evaluation pipeline: loads and validates the project config, runs the Playwright capture pipeline (@webui-rubric/capture), runs all deterministic checks (@webui-rubric/checks), scores dimensions and computes the composite using the V1 rubric (@webui-rubric/core), assembles and validates the EvaluationResult JSON artifact, and routes output to stdout or a file. Also provides version, validate-config, and check-tools utility commands.
Installation
# Global install
npm install -g @webui-rubric/cli
# Or run directly with npx
npx @webui-rubric/cli evaluate https://example.comPrerequisites
- Node.js >= 20 LTS
- Playwright Chromium (for browser capture):
npx playwright install chromium - Chrome/Chromium on
PATH(for Lighthouse performance checks)
Dependencies
| Dependency | Version | Purpose |
|---|---|---|
| commander | ^13.0.0 | CLI argument parsing and command definitions |
| yaml | ^2.6.0 | Parses .webui-rubric.yml configuration files |
| @webui-rubric/core | workspace:* | Rubric definition, scoring math, config validation, redaction, output validation, logger |
| @webui-rubric/capture | workspace:* | Headless browser pipeline (screenshots, DOM, HAR, styles) |
| @webui-rubric/checks | workspace:* | All deterministic check adapters |
Package Interactions
@webui-rubric/cli
├── @webui-rubric/core (V1_RUBRIC, scoring, validateProjectConfig, validateOutput,
│ redactHarHeaders, redactDomSnapshot, logger, loop utilities)
├── @webui-rubric/capture (capturePage, loadReferenceImage, inferDpr)
└── @webui-rubric/checks (checkHeadingOrder, checkLandmarkUsage, ..., runAxeChecks,
runLighthouseChecks, runPixelmatch, mapDiffRegionsToElements, ...)@webui-rubric/cli is the only package that has no consumers — it is the end-user entry point. All imports from the other packages are done lazily (dynamic import()) inside command handlers to avoid circular dependency issues at module load time.
Commands
evaluate <url> (default command)
Runs a full deterministic evaluation of a live web UI against the 10-dimension rubric.
webui-rubric evaluate https://example.com [options]<url> must be a fully qualified, publicly reachable URL.
Options
| Flag | Type | Default | Description |
|---|---|---|---|
| --config <path> | string | .webui-rubric.yml | Project configuration file path |
| --out <path> | string | — | Write JSON artifact to file (default: stdout) |
| --reference <path> | string | — | Reference design PNG for pixel comparison |
| --reference-viewport <name> | string | desktop | Viewport the reference image represents |
| --viewports <list> | string | desktop,mobile | Comma-separated viewport names to capture |
| --debug-dir <path> | string | — | Persist debug artifacts (screenshots, DOM, HAR, diff PNGs) |
| --iteration <n> | integer | — | Loop iteration index (for Evaluator/Generator loops) |
| --previous-composite <n> | float | — | Previous run's composite score (for delta computation) |
| --attempted-fixes <path> | string | — | Path to JSON array of attempted fix hashes (oscillation prevention) |
| --allow-overrun | boolean | false | Permit iterations beyond iteration_cap |
| --allow-tool-version-drift | boolean | false | Proceed when installed tool versions differ from rubric pins |
| --no-redact | boolean | false | Disable HAR/DOM/evidence redaction |
| --log-level <level> | string | info | Log verbosity: debug, info, warn, error |
| -q, --quiet | boolean | false | Suppress all logs below error |
Exit codes
| Code | Meaning |
|---|---|
| 0 | Success — JSON artifact emitted |
| 1 | Runtime error (target unreachable, tool crash, schema validation failure) |
| 2 | Configuration error (invalid config, weights don't sum to 100, missing anchors) |
| 3 | Tool version mismatch (installed version ≠ rubric pin; use --allow-tool-version-drift to bypass) |
| 4 | Iteration cap exceeded (use --allow-overrun to bypass) |
| 5 | Precondition failure (auth wall detected, server returned 4xx/5xx, page settlement timeout) |
version [--json]
Prints the CLI version, rubric version, and pinned tool versions.
webui-rubric version
# CLI: 0.1.7 Rubric: 1.0.0
# axe-core: 4.10.2 lighthouse: 12.2.1 pixelmatch: 7.1.0 playwright: 1.52.0
webui-rubric version --json
# { "cli": "0.1.7", "rubric": "1.0.0", "tools": { ... } }validate-config [--config <path>]
Validates a .webui-rubric.yml project configuration file without running an evaluation. Reports all validation errors to stderr.
webui-rubric validate-config
# ✓ Config valid
webui-rubric validate-config --config custom.yml
# ✗ weights: Weights must sum to 100, got 95Exit codes: 0 (valid), 2 (invalid).
check-tools [--json]
Verifies that the installed versions of axe-core, Lighthouse, pixelmatch, and playwright match the rubric's pinned versions. Use this to diagnose exit-code 3 from evaluate.
webui-rubric check-tools
# ✓ axe-core: pinned=4.10.2 resolved=4.10.2
# ✓ lighthouse: pinned=12.2.1 resolved=12.2.1
# ✗ pixelmatch: pinned=7.1.0 resolved=6.0.0
webui-rubric check-tools --json
# { "axe-core": { "pinned": "4.10.2", "resolved": "4.10.2", "match": true }, ... }Exit codes: 0 (all match), 3 (one or more mismatches).
Output Routing
The routing contract depends on whether --out is specified:
| Mode | stdout | stderr |
|---|---|---|
| No --out | JSON artifact | Logs + one-line summary |
| With --out <file> | One-line summary | Logs only |
One-line summary format:
score=82 blocking=0 issues=5 ship_ready=trueConfiguration File
Create .webui-rubric.yml in your project root (or pass --config <path>). All fields are optional — unset fields use the defaults shown below.
# Dimension weights — must sum to 100
# Accessibility has a weight floor of 10; include it in weight_overrides_ack to override
weights:
visual_design: 10
layout: 10
usability: 12
accessibility: 15 # floor: 10
content_ia: 8
performance: 12
code_quality: 8
brand: 5
consistency: 10
microinteractions: 10
# Include a dimension ID here to acknowledge overriding its weight floor
weight_overrides_ack: []
# Toggle blocking status of individual sub-criteria
blocking_overrides:
accessibility.color-contrast: true # default is already true for WCAG sub-criteria
# Viewport dimensions (pixels)
viewports:
desktop:
width: 1280
height: 800
mobile:
width: 375
height: 812
custom:
tablet:
width: 768
height: 1024
# Reference images for pixel comparison (viewport name → PNG path)
reference_images:
desktop: ./designs/homepage-desktop.png
mobile: ./designs/homepage-mobile.png
# What to do when reference dimensions don't match the screenshot
reference_image_mismatch_policy: fail-fast # or "resize"
# pixelmatch anti-alias tolerance (0–1)
pixelmatch_threshold: 0.1
# What to do when a check tool is not available
tool_fallback_policy: fail-fast # or "mark-unavailable"
# Maximum evaluation loop iterations before CLI refuses to proceed
iteration_cap: 5
# Composite score threshold for ship_ready = true
ship_threshold: 75
# Maximum entries in top_issues array
top_issues_cap: 10
# Network idle + navigation timeout in ms
settle_timeout_ms: 30000
# Default true — redacts HAR headers, POST bodies, and input values
redaction: true
# Cookie consent banner configuration
capture:
auto_dismiss: true
dismiss_selectors:
- '[aria-label*="accept" i][aria-label*="cookie" i]'
- '#onetrust-accept-btn-handler'
# Pixel comparison configuration
pixel_comparison:
mask_selectors:
- '.dynamic-carousel'
- '[data-ad]'
mask_color: '#FF00FF'
device_pixel_ratio: auto # or a number: 1, 2, etc.
# Add custom sub-criteria to existing dimensions
custom_sub_criteria:
- dimension: usability
id: usability.custom-form-validation
name: Custom Form Validation Check
description: Checks for inline validation patterns
bound_check:
check_family: dom
check_id: custom-form-validation
full_id: dom.custom-form-validation
anchors:
- score: 0
label: Critical
description: No validation present
threshold: { operator: eq, value: 0, min: null, max: null }
- score: 1
label: Poor
description: Minimal validation
threshold: { operator: eq, value: 1, min: null, max: null }
- score: 2
label: Needs Improvement
description: Basic validation
threshold: { operator: eq, value: 2, min: null, max: null }
- score: 3
label: Good
description: Good validation patterns
threshold: { operator: eq, value: 3, min: null, max: null }
- score: 4
label: Excellent
description: Comprehensive validation
threshold: { operator: eq, value: 4, min: null, max: null }
blocking_if_zero: falseEvaluation Pipeline
When evaluate runs, it follows this sequence:
Load and validate config — Reads
.webui-rubric.yml, validates withvalidateProjectConfig, merges with rubric defaults, and validates weight constraints.Check iteration cap — If
--iterationis set and exceedsiteration_cap, exit code4(unless--allow-overrun).Resolve DPR — If
--referenceis supplied, reads the reference image and callsinferDprto determine the device scale factor. Falls back to the config value or1.Capture phase — Calls
capturePage(url, options)which:- Launches headless Chromium
- Navigates, settles, dismisses consent banners
- Injects stabilization CSS
- Captures screenshots at each viewport
- Extracts DOM, computed styles, element locations, console errors, HAR
Apply redaction — If enabled (default), calls
redactHarHeadersandredactDomSnapshoton the captured artifacts.Run structural and runtime checks — Synchronously on the captured artifacts:
- DOM:
checkHeadingOrder,checkLandmarkUsage,checkLinkDescriptiveness,checkImageAlt,checkFormLabels,checkMetaViewport - CSS:
checkUniqueColorCount,checkFontFamilyCount,checkSpacingConsistency - Runtime:
checkConsoleErrors,checkResourceCount
- DOM:
Run Lighthouse — Async. Launches Chrome, runs performance audit, extracts LCP, FCP, CLS, TBT metrics. Failure is non-fatal (findings marked
tool_unavailable).Run pixel comparison — If
--referenceis supplied, callsrunPixelmatchandmapDiffRegionsToElements. Dimension mismatch triggerstool_unavailablefor visual-parity sub-criteria.Score dimensions — Maps check results to rubric sub-criteria, calls
buildDimensionResultfor each dimension, thencomputeCompositeScore.Build blocking list — Collects sub-criteria with
score === 0andblocking_if_zero: true.Build top issues — Ranks all imperfect findings by
priority_score = dimension_weight × severity, filters out hashes in--attempted-fixes, caps attop_issues_cap.Detect loop state — Computes
delta = composite - previousCompositeand setsno_progress = |delta| < 3.Validate output — Calls
validateOutputagainst the Zod schema. Exits code1if invalid — no partial JSON.Persist debug artifacts — If
--debug-diris set, writes screenshots,dom-snapshot.html,recording.har,console-errors.json, and diff PNGs to the directory at mode0700.Route output —
routeJsonOutput(to stdout or--outfile) androuteSummary(to stderr or stdout per routing contract).
Usage Examples
Basic evaluation
webui-rubric evaluate https://example.com
# JSON artifact → stdout
# score=72 blocking=2 issues=8 ship_ready=false → stderrSave JSON to file
webui-rubric evaluate https://example.com --out result.json
# JSON → result.json
# score=72 blocking=2 issues=8 ship_ready=false → stdoutPixel comparison against a reference design
webui-rubric evaluate https://example.com \
--reference ./designs/homepage.png \
--reference-viewport desktopCapture debug artifacts
webui-rubric evaluate https://example.com --debug-dir ./debug
# Saves: debug/screenshot-desktop.png, debug/screenshot-mobile.png,
# debug/dom-snapshot.html, debug/recording.har, debug/console-errors.jsonIterative loop mode
# Iteration 1 (initial run)
webui-rubric evaluate https://example.com \
--out iter1.json \
--iteration 1
# Iteration 2 (after Generator has applied fixes)
webui-rubric evaluate https://example.com \
--out iter2.json \
--iteration 2 \
--previous-composite 68.5 \
--attempted-fixes iter1-fixes.json
# meta.delta and no_progress are now set in outputCustom configuration
webui-rubric evaluate https://example.com --config ./ci/webui-rubric.ymlSuppress logs (machine-readable mode)
webui-rubric evaluate https://example.com --quiet --out result.json
# Only errors → stderr; no info/warn logsValidate config without running
webui-rubric validate-config --config staging.ymlCheck tool versions
webui-rubric check-tools
webui-rubric check-tools --json | jq '.lighthouse'