comfy-qa
v2.4.1
ComfyUI QA automation CLI
comfy-qa
E2E QA automation for user-facing frontend repos. AI-driven Playwright tests with video recording, HUD overlay, and structured reports.
One-shot setup
Tell your AI agent (Claude Code, Cursor, etc.):
run npx comfy-qa setup
The agent reads the emitted prompt and automatically:
- Detects your framework, package manager, backend, and auth
- Installs Playwright
- Creates `playwright.qa.config.ts` with video/trace/screenshot enabled
- Creates `.claude/skills/comfy-qa/SKILL.md` tailored to your repo
- Creates `.claude/skills/comfy-qa/REPRODUCE.md` for issue reproduction
- Creates starter `tests/e2e/qa.spec.ts` covering your key routes
- Updates `.gitignore`
Re-running npx comfy-qa setup updates existing files without overwriting what's already correct.
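For orientation, a QA config with video, trace, and screenshot capture enabled might look roughly like the sketch below. The `testDir` value and the always-on capture modes are assumptions made for illustration; the file that setup actually generates for your repo may differ.

```typescript
// playwright.qa.config.ts (illustrative sketch, not the generated file)
import { defineConfig } from "@playwright/test";

export default defineConfig({
  // Assumed location of the starter spec (tests/e2e/qa.spec.ts).
  testDir: "./tests/e2e",
  use: {
    // Record a video, a trace, and screenshots for every test.
    video: "on",
    trace: "on",
    screenshot: "on",
  },
});
```

Such a config can be run with `npx playwright test --config playwright.qa.config.ts`.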
QA a PR or issue
# Paste a GitHub URL — auto-detects PR vs issue
comfy-qa https://github.com/org/repo/pull/123
comfy-qa https://github.com/org/repo/issues/456
# Or use subcommands
comfy-qa pr https://github.com/org/repo/pull/123
comfy-qa issue org/repo#456
# Batch QA recent open issues
comfy-qa full org/repo --limit 5
Each run produces the following files in .comfy-qa/<slug>/:
| File | Content |
|------|---------|
| report.md | Full QA report — bug analysis, checklist, test scenarios |
| qa-sheet.md | Printable QA checklist for manual testing |
| <type>-<N>.e2e.ts | Generated Playwright E2E test |
| qa-<N>.webm | Recorded session video with HUD overlay |
| screenshots/ | Step-by-step screenshots |
| research.json | Raw research data |
| agent-log.txt | Agent action log |
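
The PR-versus-issue auto-detection for pasted GitHub URLs can be pictured as a small URL classifier. The following is a minimal sketch under that assumption, not comfy-qa's actual parser; `QaTarget` and `parseTarget` are hypothetical names.

```typescript
// Hypothetical shape for a parsed QA target.
type QaTarget = {
  kind: "pr" | "issue";
  owner: string;
  repo: string;
  number: number;
};

// Classify a GitHub URL as a PR or an issue based on its path segment.
// Returns null for anything that is not a github.com PR/issue URL.
function parseTarget(input: string): QaTarget | null {
  const match = input.match(
    /^https?:\/\/github\.com\/([^\/\s]+)\/([^\/\s]+)\/(pull|issues)\/(\d+)(?:[\/?#].*)?$/
  );
  if (!match) {
    return null;
  }
  return {
    kind: match[3] === "pull" ? "pr" : "issue",
    owner: match[1],
    repo: match[2],
    number: Number(match[4]),
  };
}

// Usage: GitHub uses /pull/ for PRs and /issues/ for issues.
const target = parseTarget("https://github.com/org/repo/pull/123");
console.log(target ? `${target.kind} #${target.number}` : "unrecognized");
```

Keying on the `/pull/` versus `/issues/` path segment avoids a network round trip just to decide which kind of target was pasted.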
Options
--no-record Disable video recording (ON by default)
--add-comment Post report as GitHub comment
--comfy-url <url> Point to a running dev server (default: auto-detect)
--limit N Number of issues for batch mode (default: 5)
How it works
- Research — Fetches PR/issue from GitHub, Claude analyzes bug/feature and generates QA checklist + test scenarios
- Record — Playwright opens the app with a HUD overlay showing what the agent is doing. If the dev server is running, an AI agent drives the browser through each test scenario
- Report — Generates structured markdown report, QA sheet, E2E test file, and video
Backend strategy
No mocks. QA runs against real servers:
- If the QA target is unrelated to the backend, use the repo's default staging server
- If the QA target is related to the backend, clone the backend to `tmp/`, build it, run it locally, and point the frontend to `localhost`
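
The two-way backend rule above can be expressed as a small decision function. This is only a sketch of the rule as stated: `BackendPlan`, `planBackend`, the staging URL argument, and the default port 8000 are all hypothetical and not part of comfy-qa.

```typescript
// Hypothetical description of which backend a QA run should talk to.
type BackendPlan =
  | { mode: "staging"; apiBase: string }
  | { mode: "local"; apiBase: string; steps: string[] };

// Pick staging when the QA target does not touch the backend,
// otherwise plan a local clone/build/run of the backend.
function planBackend(
  touchesBackend: boolean,
  stagingUrl: string,
  localPort: number = 8000 // assumed port, for illustration only
): BackendPlan {
  if (!touchesBackend) {
    return { mode: "staging", apiBase: stagingUrl };
  }
  return {
    mode: "local",
    apiBase: `http://localhost:${localPort}`,
    steps: [
      "clone the backend into tmp/",
      "build the backend",
      "run the backend locally",
      "point the frontend to localhost",
    ],
  };
}

// Usage: a frontend-only change keeps using the staging server.
console.log(planBackend(false, "https://staging.example.test").mode);
```

Either branch talks to a real server, which matches the no-mocks policy.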
Install
bun install
Roadmap
Short-term, in priority order:
1. Improve reproduction precision. Current pipeline misses bugs that depend on specific workflows or custom nodes (see PR #9430: 8/11 reproduced). Environment setup tools (workflow loader, custom-node installer, attachment downloader) close this gap.
2. Measure reliability (flakiness). Run the same checklist N times, track pass→fail→pass transitions per operation. A QA run is only trustworthy if its result is stable across repeats. Surface flaky operations on the dashboard.
3. Auto-file GitHub issues for failing operations, gated behind a confidence threshold. `scripts/report-failures.sh` exists but is intentionally not wired up yet: while reproduction rate is still ~70%, auto-filing would drown maintainers in false positives. Enable once (1) and (2) lift the floor, and keep a manual review step before filing the first batch per repo.
4. Cross-product QA matrix. Today: registry, docs, website, download-data, embedded-editor. Add cloud.comfy.org (WebGL, already working via `--headless=new`), comfy-vibe, and first-party ComfyUI_frontend runs.
5. Continuous QA. Schedule daily runs, track score trends per product, alert on regressions (score drop ≥ 10%).
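
The flakiness measurement in the roadmap (repeat the checklist N times and track pass/fail transitions per operation) could be prototyped along these lines. This is a sketch under assumptions: the `RunResult` shape, the `measureStability` name, and the rule "any outcome change between consecutive runs means flaky" are illustrative, not an existing comfy-qa feature.

```typescript
// Assumed shape of one QA run: operation name -> did it pass?
type RunResult = Record<string, boolean>;

interface OperationStability {
  operation: string;
  runs: number;   // how many runs included this operation
  flips: number;  // outcome changes between consecutive runs
  flaky: boolean; // true if the outcome changed at least once
}

// Collect each operation's outcomes in run order, then count flips.
function measureStability(runs: RunResult[]): OperationStability[] {
  const history = new Map<string, boolean[]>();
  for (const run of runs) {
    for (const operation of Object.keys(run)) {
      const outcomes = history.get(operation) || [];
      outcomes.push(run[operation]);
      history.set(operation, outcomes);
    }
  }

  const report: OperationStability[] = [];
  history.forEach((outcomes, operation) => {
    let flips = 0;
    for (let i = 1; i < outcomes.length; i++) {
      if (outcomes[i] !== outcomes[i - 1]) {
        flips++;
      }
    }
    report.push({ operation, runs: outcomes.length, flips, flaky: flips > 0 });
  });
  return report;
}

// Usage: "upload" goes pass -> fail -> pass, so it has 2 flips.
const stability = measureStability([
  { login: true, upload: true },
  { login: true, upload: false },
  { login: true, upload: true },
]);
console.log(stability.filter((s) => s.flaky).map((s) => s.operation)); // ["upload"]
```

Counting flips rather than failures separates operations that fail consistently (likely real bugs) from those whose result is unstable across repeats.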
License
MIT
