qa-workflow-cc

v1.0.0

Published

13 days ago

dlandry85

claude claude-code qa testing autonomous orchestrator

QA Workflow

Autonomous QA orchestrator for Claude Code — test-fix-verify cycles that run themselves.

npx qa-workflow-cc

Works on Mac, Windows, and Linux. Any tech stack.

"Point it at your project, walk away, come back to a QA report with a fix plan ready to approve."

How It Works · Commands · Architecture · Model Profiles · Getting Started

The Problem

Manual QA in AI-assisted development is a bottleneck. You build features fast with Claude Code, but then spend hours manually testing, triaging bugs, writing fixes, and re-testing. Every context reset loses your progress. Every new session starts from scratch.

QA Workflow fixes that. It discovers your tech stack, spawns specialized test agents, consolidates results into a structured report, and plans fixes — all autonomously. When it needs your input, it stops cleanly and waits. When you approve, it picks up exactly where it left off.

Who This Is For

Developers using Claude Code who want real QA — not just "run the test suite." QA Workflow covers:

- Unit, component, and E2E tests: generated from your actual codebase
- Security audits: OWASP checklist, auth boundary testing, tenant isolation
- UX heuristic evaluation: Nielsen's 10 heuristics scored and reported
- Performance benchmarks: Lighthouse scores, load testing baselines
- Visual regression: screenshot-based change detection

All of it automated. All of it resumable.

How It Works

1. Bootstrap your project

/qa:init

QA Workflow discovers your tech stack — languages, frameworks, test runners, auth patterns, database setup. It generates specialized test agents tailored to your project. This runs once per project.

2. Run a full QA cycle

/qa:full

The orchestrator takes over:

Phase 0: Bootstrap (if needed)
Phase 1: Load resources & profile
Phase 2: Parse scope
Phase 3: Execute tests          ← spawns parallel test agents
Phase 4: Consolidate report     ← merges all results
Phase 5: Decision gate          ← PASS / FAIL / ESCALATE
Phase 6: Plan fixes             ← ■ STOPS here for your review

You get a structured report and a prioritized fix plan. Review it, then continue.
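
As a rough illustration of Phases 3 and 4, the sketch below runs several test agents concurrently and merges their results into one report. It is not the project's implementation: real agents are Claude Code subagents, and `run_agent`, the canned defects, and the report fields are all hypothetical stand-ins.

```python
from concurrent.futures import ThreadPoolExecutor


def run_agent(name):
    """Hypothetical stand-in for a spawned test agent; returns canned results."""
    canned = {
        "unit": [{"id": "U-1", "severity": "minor"}],
        "security": [{"id": "S-1", "severity": "critical"}],
        "ux": [],
    }
    return {"agent": name, "defects": canned.get(name, [])}


def consolidate(results):
    """Merge per-agent results into one report and count defects by severity."""
    report = {"agents": [], "defects": [], "by_severity": {}}
    for result in results:
        report["agents"].append(result["agent"])
        for defect in result["defects"]:
            # Tag each defect with the agent that found it.
            report["defects"].append({**defect, "agent": result["agent"]})
            severity = defect["severity"]
            report["by_severity"][severity] = report["by_severity"].get(severity, 0) + 1
    return report


def run_test_phase(agent_names):
    """Run all agents concurrently, then merge their results."""
    with ThreadPoolExecutor(max_workers=max(1, len(agent_names))) as pool:
        # pool.map returns results in the same order as agent_names.
        results = list(pool.map(run_agent, agent_names))
    return consolidate(results)


report = run_test_phase(["unit", "security", "ux"])
print(report["by_severity"])
```

Keeping consolidation separate from execution means the merge step sees every agent's output in one place, which is what a single structured report needs.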

3. Approve and let it fix

/qa:continue

The system executes fixes in priority order, verifies each batch, then automatically re-runs the failing tests:

Phase 7:  Execute fixes         ← batch checkpoints per priority
Phase 7b: Verify                ← type-check + build + re-test
    └─→ Loop back to Phase 3   ← next cycle (max 3)

If everything passes → Certified. If defects persist after 3 cycles → Escalated with a detailed report.
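
The "batch checkpoints per priority" idea in Phase 7 might look something like the sketch below: fixes are grouped by priority, each batch is applied, and verification runs before the next batch starts. The fix plan format, the priority labels on fixes, and the `apply_fix` and `verify` callbacks are assumptions for illustration only.

```python
from itertools import groupby

# Hypothetical fix plan; the real plan format is not documented here.
fix_plan = [
    {"id": "F-3", "priority": "P1"},
    {"id": "F-1", "priority": "P0"},
    {"id": "F-2", "priority": "P0"},
    {"id": "F-4", "priority": "P2"},
]


def batches_by_priority(plan):
    """Group fix ids into batches, highest priority (P0) first."""
    ordered = sorted(plan, key=lambda fix: fix["priority"])  # stable sort: "P0" < "P1" < "P2"
    return [
        (priority, [fix["id"] for fix in group])
        for priority, group in groupby(ordered, key=lambda fix: fix["priority"])
    ]


def execute_plan(plan, apply_fix, verify):
    """Apply each batch, then verify it before moving to the next one."""
    completed = []
    for priority, fix_ids in batches_by_priority(plan):
        for fix_id in fix_ids:
            apply_fix(fix_id)
        if not verify(priority):
            # Stop at the first batch that fails verification.
            return {"status": "verify_failed", "failed_batch": priority, "completed": completed}
        completed.append(priority)
    return {"status": "ok", "completed": completed}


applied = []
result = execute_plan(fix_plan, applied.append, lambda priority: priority != "P1")
print(result)
```

In this example the P1 batch fails its verification step, so the P2 fix is never applied and the result records P0 as the last completed batch.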

4. Resume from anywhere

/qa:resume

Context reset mid-cycle? Machine restarted? No problem. QA Workflow writes state before every phase, so /qa:resume can read cycle-state.json and re-enter at the phase that was interrupted.
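
A minimal sketch of the write-before-execute pattern follows. The actual schema of cycle-state.json is not shown in this README, so the `phase` and `cycle` fields and all function names here are assumptions.

```python
import json
import tempfile
from pathlib import Path


def write_checkpoint(state_file, phase, cycle):
    """Record the phase that is about to run (field names are assumed)."""
    tmp = state_file.with_name(state_file.name + ".tmp")
    tmp.write_text(json.dumps({"phase": phase, "cycle": cycle}))
    tmp.replace(state_file)  # swap the finished file into place in one step


def run_phase(state_file, phase, cycle, action):
    """Checkpoint first, then execute, so an interruption leaves a resume point."""
    write_checkpoint(state_file, phase, cycle)
    action()


def resume_point(state_file):
    """Return the last recorded checkpoint, or None if no cycle has started."""
    if not state_file.exists():
        return None
    return json.loads(state_file.read_text())


def interrupted():
    raise RuntimeError("context reset")  # simulates losing the session mid-phase


with tempfile.TemporaryDirectory() as workdir:
    state_file = Path(workdir) / "cycle-state.json"
    fresh = resume_point(state_file)
    run_phase(state_file, "3", 1, lambda: None)
    try:
        run_phase(state_file, "4", 1, interrupted)
    except RuntimeError:
        pass
    resumed = resume_point(state_file)

print(fresh, resumed)
```

Because the checkpoint is written before the phase body runs, the interrupted Phase 4 is the one recorded, and a resume would re-enter there rather than at Phase 3.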

Commands

| Command | What it does |
|---------|--------------|
| /qa:init | Bootstrap QA infrastructure: discover tech stack, generate agents |
| /qa:full | Run complete QA cycle: test, report, decision gate, fix plan |
| /qa:full api | API/backend tests only |
| /qa:full security | Security audit only |
| /qa:full ux | UX heuristic evaluation only |
| /qa:continue | Execute approved fix plan: apply fixes, verify, re-test |
| /qa:resume | Resume from any interrupted state |
| /qa:status | View current QA progress dashboard |

The Lifecycle

QA Workflow is a state machine. Every phase writes its state before executing, so an interrupted cycle can resume from the last recorded phase.

                          /qa:full
                             │
               ┌─────────────┤
               │             │
          Phase 0         Phase 1─2
         Bootstrap       Load & Parse
         (if needed)         │
               │             │
               └─────────────┤
                             │
           ╔═════════════════╧══════════════════╗
           ║        CYCLE LOOP (max 3)          ║
           ║                                    ║
           ║   Phase 3: Execute Tests           ║
           ║     └─ Parallel test agents        ║
           ║   Phase 4: Consolidate Report      ║
           ║     └─ Merge raw results           ║
           ║   Phase 5: Decision Gate           ║
           ║     ├─ PASS  → Phase 8 (certify)   ║
           ║     ├─ FAIL  → Phase 6 (plan)      ║
           ║     └─ STUCK → Phase 9 (escalate)  ║
           ║                                    ║
           ║   Phase 6: Plan Fixes              ║
           ║     └─ ■ STOP — review plan        ║
           ║         run: /qa:continue          ║
           ║                                    ║
           ║   Phase 7: Execute Fixes           ║
           ║     └─ Batch by priority           ║
           ║   Phase 7b: Verify & Re-test       ║
           ║     └─ → back to Phase 3           ║
           ║                                    ║
           ╚════════════════════════════════════╝
                             │
              ┌──────────────┴──────────────┐
              │                             │
         Phase 8                       Phase 9
        Certified ✓                   Escalated ⚠
       (all criteria pass)        (stuck after 3 cycles)
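
The cycle loop in the diagram can be read as a small state machine. The sketch below is one possible interpretation, assuming that the gate returns STUCK when blocking defects remain in the third cycle; the exact STUCK rule and the `defects_per_cycle` input are assumptions, not documented behavior.

```python
MAX_CYCLES = 3  # matches "CYCLE LOOP (max 3)" in the diagram


def decision_gate(open_blocking_defects, cycle):
    """Phase 5: choose the next transition (the STUCK condition is assumed)."""
    if open_blocking_defects == 0:
        return "PASS"
    if cycle >= MAX_CYCLES:
        return "STUCK"
    return "FAIL"


def run_cycles(defects_per_cycle):
    """Walk the loop given the blocking defects found in each cycle's test phase."""
    cycle = 0
    for cycle, defects in enumerate(defects_per_cycle[:MAX_CYCLES], start=1):
        verdict = decision_gate(defects, cycle)
        if verdict == "PASS":
            return ("certified", cycle)  # Phase 8
        if verdict == "STUCK":
            return ("escalated", cycle)  # Phase 9
        # FAIL: Phase 6 plans fixes and stops for review; after /qa:continue,
        # Phases 7 and 7b run and the loop returns to Phase 3.
    return ("in_progress", cycle)


outcomes = [run_cycles([5, 2, 0]), run_cycles([5, 4, 3]), run_cycles([2])]
print(outcomes)
```

The three sample runs cover a project that is certified on its third cycle, one that is escalated after three failing cycles, and one that has only completed its first cycle.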

Stop Points

The system pauses at well-defined points — you're always in control.

| Stop Point | When | What to do |
|-----------|------|------------|
| Bootstrap Complete | After /qa:init | Run /qa:full to start testing |
| Fix Plan Ready | After Phase 6 | Review the plan, then /qa:continue |
| Certified | Phase 8 | Done: all exit criteria passed |
| Escalated | Phase 9 | Manual intervention needed for stuck defects |
| Phase Transition | Between phases | Recovery checkpoint (automatic) |

Exit Criteria

QA certification requires all gates to pass:

| Gate | Threshold |
|------|-----------|
| P0 features pass | 100% |
| P1 features pass | 100% (or documented workarounds) |
| Critical defects open | 0 |
| Major defects open | 0 |
| Minor defects open | < 10 |
| UX Score (Nielsen avg) | >= 3.5 / 5.0 |
| WCAG 2.1 AA critical violations | 0 |
| Lighthouse Performance | >= 80 (if frontend) |
| Lighthouse Accessibility | >= 85 (if frontend) |
| Auth boundary tests | 100% pass (if auth exists) |
| Tenant isolation verified | All routers (if multi-tenant) |
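
To make the thresholds concrete, here is a sketch of how the gates could be evaluated against a set of metrics. The thresholds come from the table above; the metric names and the shape of the `metrics` dictionary are hypothetical.

```python
def evaluate_exit_criteria(m):
    """Return the names of failed gates; an empty list means certification passes."""
    failures = []

    def check(name, ok):
        if not ok:
            failures.append(name)

    check("P0 features pass", m["p0_pass_rate"] == 1.0)
    check("P1 features pass",
          m["p1_pass_rate"] == 1.0 or m.get("p1_workarounds_documented", False))
    check("Critical defects open", m["critical_open"] == 0)
    check("Major defects open", m["major_open"] == 0)
    check("Minor defects open", m["minor_open"] < 10)
    check("UX Score", m["ux_score"] >= 3.5)
    check("WCAG 2.1 AA critical violations", m["wcag_critical"] == 0)
    # Conditional gates apply only when the project has the relevant feature.
    if m.get("has_frontend"):
        check("Lighthouse Performance", m["lighthouse_performance"] >= 80)
        check("Lighthouse Accessibility", m["lighthouse_accessibility"] >= 85)
    if m.get("has_auth"):
        check("Auth boundary tests", m["auth_pass_rate"] == 1.0)
    if m.get("multi_tenant"):
        check("Tenant isolation verified", m["tenant_isolation_verified"])
    return failures


metrics = {
    "p0_pass_rate": 1.0, "p1_pass_rate": 0.95, "p1_workarounds_documented": True,
    "critical_open": 0, "major_open": 0, "minor_open": 12,
    "ux_score": 3.8, "wcag_critical": 0,
    "has_frontend": True, "lighthouse_performance": 78, "lighthouse_accessibility": 90,
    "has_auth": True, "auth_pass_rate": 1.0, "multi_tenant": False,
}
failed = evaluate_exit_criteria(metrics)
print(failed)
```

With these sample metrics, the P1 gate passes through its documented-workarounds clause, while the minor defect count (12) and the Lighthouse Performance score (78) fall short of their thresholds.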

Architecture

Command (thin dispatcher, ~65-150 lines)
  │
  ├─ Resolves model profile from references/model-profiles.md
  ├─ Reads state from cycle-state.json
  ├─ Routes to appropriate workflow
  │
  └─ Workflow (thick execution logic, ~50-100 lines)
       │
       ├─ Inlines profile + state into Task() prompts
       ├─ Spawns specialized agents (parallel where possible)
       ├─ Writes state checkpoints before each phase
       │
       └─ Outputs stop-point template
            └─ templates/stop-points/*.md

Design Principles

| Principle | Why |
|-----------|-----|
| Thin commands, thick workflows | Commands parse args and route. Workflows contain logic. Clean separation. |
| Context inlining | All Task() spawns include inlined profile/state data, with no cross-boundary @ references that break in subagents. |
| Extracted templates | Stop-point output is defined in template files: a single source of truth, not buried in code. |
| State-before-execute | Every phase writes to cycle-state.json before running. If context resets, /qa:resume knows exactly where to continue. |
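
Context inlining can be pictured as building each subagent prompt from data rather than file references. The sketch below only assembles a prompt string; it does not call Claude Code's Task tool, and the function name, section headings, and state fields are assumptions.

```python
import json


def build_task_prompt(instructions, profile, state):
    """Embed profile and state directly in the prompt text for a subagent."""
    sections = [
        instructions.strip(),
        "## Model profile (inlined)\n" + json.dumps(profile, indent=2, sort_keys=True),
        "## Cycle state (inlined)\n" + json.dumps(state, indent=2, sort_keys=True),
    ]
    return "\n\n".join(sections)


prompt = build_task_prompt(
    "Run the security checklist and report defects.",
    {"model_profile": "balanced"},
    {"phase": "3", "cycle": 1},  # hypothetical state fields
)
print(prompt)
```

Because the subagent receives the values themselves, it does not need to resolve any path or @ reference from the parent's context.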

Test Agent Templates

QA Workflow generates specialized agents from these templates:

| Template | Coverage |
|----------|----------|
| unit-test.md | Unit test generation and execution |
| component-test.md | Component/integration testing |
| e2e-test.md | End-to-end user flow testing |
| security-checklist-owasp.md | OWASP top 10 security audit |
| nielsen-heuristics.md | UX heuristic evaluation (Nielsen's 10) |
| performance-benchmarks-base.md | Performance and Lighthouse scoring |
| visual-regression.md | Screenshot-based regression detection |
| domain-security-profiles.md | Domain-specific security rules |
| domain-research-queries.md | Domain-aware test generation |

Model Profiles

Control cost and quality by routing agents to different Claude models.

Set in .claude/qa-profile.json:

{ "config": { "model_profile": "balanced" } }

| Profile | Philosophy | Best for |
|---------|-----------|----------|
| quality | Maximum reasoning: Opus for test execution, security, fix planning | Production releases, security-critical apps |
| balanced | Smart allocation: Opus for fix planning only, Sonnet elsewhere | Daily development, most projects |
| budget | Minimal Opus: Sonnet for code, Haiku for research/reports | Rapid iteration, cost-sensitive workflows |

> [!TIP]
> Start with balanced. Switch to quality before shipping. Use budget when iterating fast on early-stage features.

See skills/qa/references/model-profiles.md for the complete table showing which model handles each agent role across all three profiles.
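
As an illustration of profile resolution, the sketch below reads the config shape shown above and picks a model per agent role. The role names and every `default` entry are assumptions; only the overrides named in the profile table come from this README, and the authoritative mapping is the one in model-profiles.md.

```python
import json

# Illustrative rules only. Overrides follow the profile table; the role names
# and the "default" values for roles the table does not mention are assumed.
PROFILE_RULES = {
    "quality": {
        "overrides": {"test-execution": "opus", "security": "opus", "fix-planning": "opus"},
        "default": "sonnet",
    },
    "balanced": {
        "overrides": {"fix-planning": "opus"},
        "default": "sonnet",
    },
    "budget": {
        "overrides": {"research": "haiku", "reports": "haiku"},
        "default": "sonnet",
    },
}


def resolve_model(config_text, role):
    """Pick a model for an agent role from the contents of qa-profile.json."""
    profile = json.loads(config_text)["config"]["model_profile"]
    rules = PROFILE_RULES[profile]
    return rules["overrides"].get(role, rules["default"])


config_text = '{ "config": { "model_profile": "balanced" } }'
print(resolve_model(config_text, "fix-planning"), resolve_model(config_text, "security"))
```

An unknown profile name raises a KeyError here, which surfaces a typo in the config instead of silently falling back to some model.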

Getting Started

Installation

npx qa-workflow-cc

That's it. The installer copies commands and skills into the target .claude/ directory (~/.claude/ for a global install, ./.claude/ for a local one) and backs up any existing files.

npx qa-workflow-cc --global   # Install to ~/.claude/
npx qa-workflow-cc --local    # Install to ./.claude/

Use --global (-g) or --local (-l) to skip the interactive prompt.

For development, clone the repository and install with symlinks for live editing:

git clone https://github.com/desland01/qa-workflow.git ~/qa-workflow
cd ~/qa-workflow
chmod +x install.sh
./install.sh

Changes in ~/qa-workflow/ are live immediately via symlinks — no reinstall needed.

First Run

# 1. Open your project in Claude Code
claude

# 2. Bootstrap QA for this project
/qa:init

# 3. Run the full QA cycle
/qa:full

# 4. When the fix plan appears, review it, then:
/qa:continue

Recommended: Skip Permissions

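QA Workflow spawns multiple agents that run tests, read files, and write reports, and each of those actions can trigger a permission prompt. To run a cycle without interruptions, start Claude Code with permission prompts disabled: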

claude --dangerously-skip-permissions

Alternatively, keep permission checks enabled and allow specific commands in your project's .claude/settings.json:

{
  "permissions": {
    "allow": [
      "Bash(date:*)",
      "Bash(echo:*)",
      "Bash(cat:*)",
      "Bash(ls:*)",
      "Bash(mkdir:*)",
      "Bash(wc:*)",
      "Bash(head:*)",
      "Bash(tail:*)",
      "Bash(git add:*)",
      "Bash(git commit:*)",
      "Bash(git status:*)",
      "Bash(git log:*)",
      "Bash(git diff:*)",
      "Bash(npm test:*)",
      "Bash(npx:*)",
      "Bash(pytest:*)",
      "Bash(cargo test:*)"
    ]
  }
}

Directory Structure

qa-workflow/
├── README.md
├── VERSION
├── install.sh
│
├── commands/qa/                    # Thin dispatchers
│   ├── full.md                     # Main QA cycle orchestrator
│   ├── continue.md                 # Fix execution entry point
│   ├── init.md                     # Bootstrap-only command
│   ├── resume.md                   # Universal recovery command
│   └── status.md                   # Read-only dashboard
│
└── skills/qa/                      # Thick execution layer
    ├── SKILL.md                    # Bootstrap protocol (B1-B9)
    ├── references/
    │   ├── continuation-format.md  # Format rules + template index
    │   ├── exit-criteria.md        # Pass/fail thresholds
    │   ├── lifecycle.md            # Phase definitions + state machine
    │   └── model-profiles.md       # Agent-to-model mapping
    ├── templates/
    │   ├── agent-skeleton.md       # Base agent template
    │   ├── unit-test.md            # Unit test agent
    │   ├── component-test.md       # Component test agent
    │   ├── e2e-test.md             # E2E test agent
    │   ├── security-checklist-owasp.md
    │   ├── nielsen-heuristics.md   # UX evaluation
    │   ├── performance-benchmarks-base.md
    │   ├── visual-regression.md
    │   ├── qa-report-template.md
    │   ├── test-standards.md
    │   ├── domain-research-queries.md
    │   ├── domain-security-profiles.md
    │   └── stop-points/            # Output templates
    │       ├── bootstrap-complete.md
    │       ├── fix-ready.md
    │       ├── certified.md
    │       ├── escalated.md
    │       ├── status-dashboard.md
    │       └── phase-transition.md
    └── workflows/                  # Thick execution logic
        ├── bootstrap.md            # Phase 0
        ├── test-phase.md           # Phase 3
        ├── report-phase.md         # Phase 4
        ├── decision-gate.md        # Phase 5
        ├── fix-plan.md             # Phase 6
        ├── fix-execute.md          # Phase 7
        └── verify-phase.md         # Phase 7b

Troubleshooting

Commands not found after install?

- Restart Claude Code to reload slash commands
- Verify the installed files exist (they are symlinks if you installed from a clone): ls -la ~/.claude/commands/qa/ and ls -la ~/.claude/skills/qa/

QA cycle stuck or interrupted?

- Run /qa:resume, which reads cycle-state.json and re-enters at the correct phase
- Run /qa:status to see exactly where the cycle stopped

Want to re-run from scratch?

- Delete cycle-state.json in your project directory, then run /qa:full

Tests failing for the wrong reasons?

- Run /qa:init again to regenerate agents with updated tech stack detection
- Check that your project builds and tests pass manually first

Contributing

1. Clone the repo and run ./install.sh
2. Edit files in ~/qa-workflow/ (changes are live via symlinks)
3. Test with /qa:status (read-only, safe) to verify nothing breaks
4. For command changes, test the full flow with /qa:full in a test project

Claude Code is powerful. QA Workflow makes it thorough.

Autonomous test-fix-verify cycles — so you can build, not babysit.