@ai-qa/workflow

v2.0.29

Published

13 hours ago

AI QA Workflow Template — transforms any AI agent into an autonomous QA engineer. AI explores, plans, generates tests, and heals. Scripts execute and report.

houcine111

qa testing playwright ai test-automation e2e dashboard

AI QA Workflow Template

Turn any AI agent into an autonomous QA engineer.

This template gives an AI coding assistant (opencode, Copilot, Claude Code, Cursor, etc.) everything it needs to: explore your web app, plan tests, generate real Playwright code, execute them, debug failures, and visualize results.

The template handles the mechanical parts (execution, reporting, dashboard). The AI handles the creative parts (planning, test generation, debugging, healing).

How It Works


                                ┌──────────────────────────────────────┐
                                │         AI AGENT (does thinking)     │
                                │                                      │
                                │  Phase 1: Environment check          │
                                │  Phase 2: Explore app + write plan   │
                                │  Phase 3: Generate Playwright tests  │
                                │  Phase 4: Execute tests              │
                                │  Phase 5: Debug + heal failures      │
                                │  Phase 6: Report + update context    │
                                └───────┬────────────────────┬────────┘
                                        │                    │
                          ┌─────────────┘                    └─────────────┐
                          ▼                                                ▼
              ┌─────────────────────────┐              ┌─────────────────────────┐
              │  HUMAN SUPERVISION       │              │  SCRIPTS (mechanics)    │
              │                         │              │                         │
              │  At EVERY phase:        │              │  npm run qa:execute     │
              │  AI proposes ──▶ you    │              │  npm run qa:report      │
              │  approve ──▶ AI acts    │              │  npm run qa:retry       │
              │                         │              │  npm run qa:status      │
              │  You are always in      │              │  npm run qa:list        │
              │  control.               │              │  npm run dashboard      │
              └─────────────────────────┘              └─────────────────────────┘

One-Command Install

npx @ai-qa/workflow init --yes

Installs the full template into your project:

Playwright MCP + GitHub MCP configuration (opencode.json)
AI agent definitions (.github/agents/)
Workflow prompts (prompts/)
CLI scripts for execution and reporting
QA Dashboard (web UI)
Directory structure (user-story/, specs/, tests/, test-results/, docs/, .qa-context/)

Update an existing installation

npx @ai-qa/workflow update --yes

Updates template files while preserving your user stories, test specs, config, and run results.

What You Get

your-project/
├── ai-qa-workflow.js          # CLI orchestrator (8 commands)
├── .qa-workflow.json          # Project config (auto-detected)
├── scripts/
│   ├── utils.js               # Config loading, shared helpers
│   ├── executor.js            # Runs Playwright tests
│   ├── retrier.js             # Re-runs failed tests (longer timeout)
│   └── reporter.js            # Markdown + JSON reports
├── opencode.json              # MCP config (Playwright + GitHub)
├── .github/agents/            # AI agent definitions
│   ├── playwright-test-planner.agent.md
│   ├── playwright-test-generator.agent.md
│   └── playwright-test-healer.agent.md
├── prompts/
│   ├── QAe2eprompt.md         # Full 9-step AI workflow
│   └── general_prompt.md      # Quick-start prompt
├── prompting_template.md       # Conversation guide — what to say to the AI at each phase
├── router.md                  # AI entry point — routes to correct agent
├── qa-dashboard/              # Web UI (port 4000)
├── .qa-context/                # Persistent AI memory (pipeline state, selectors, healing history, traceability)
├── .auth/                      # Auth credentials + Playwright storage state (gitignored)
├── user-story/                 # Your .md stories
├── specs/                     # AI-generated test plans
├── tests/                     # AI-generated Playwright specs
├── test-results/              # Run results, reports, screenshots
└── docs/                      # Application context (AI knowledge base)

Quick Start

# 1. Install the template
npx @ai-qa/workflow init --yes

# 2. Initialize project config
npm run qa:init

# 3. Write a user story in user-story/my-feature.md (see format below)

# 4. Open the project in your AI editor (opencode, Copilot, Cursor)
#    The AI agent auto-detects its role and runs an environment check.
#    It will report what's ready and what's missing.

# 5. Send this one prompt to start:
#    "Read router.md and follow the QA workflow for my-feature.md"

# The AI will:
#   - Read router.md → understand its mission
#   - Read playwright-test-planner.agent.md
#   - Run environment check (if not already done)
#   - Explore your app with Playwright MCP
#   - Save a test plan to specs/
#   - STOP and ask for your approval
#   - Once approved → generate tests → STOP again
#   - Tell you when ready to execute

# 6. Execute tests
npm run qa:execute

# 7. If tests fail, tell your AI agent:
#    "Debug and fix the failing tests"

# 8. Launch dashboard
npm run dashboard

💡 Full conversation guide: See prompting_template.md for a complete script of what to say to the AI at every phase — from installation to final report.

AI Agent Auto-Bootstrap

When you open this project in an AI-powered editor, the agent automatically understands its purpose:

Reads .opencode/rules.md (or .github/copilot-instructions.md) → discovers it's an AI QA Engineer
Runs an environment check → reports what's ready ✅ and what's missing ❌
Reads router.md → learns the workflow and supervision rules
Stops and waits for you to provide a user story

No manual instructions needed. The agent knows its role on first contact.

Your First Prompt

After installation and running npm run qa:init, open the project in your AI editor.

Option A: Run the environment check first

The very first thing you should say to the AI agent:

"Run the environment check and show me the status report"

The AI will check all 10 preconditions and report what's ready ✅ and what's missing ❌. It then waits for your instructions.

Option B: Go straight to the QA workflow (if you already have a user story)

"Read router.md and follow the QA workflow for user-story/my-story.md"

The AI will run the environment check (if not done), then proceed through the workflow phases. Replace my-story.md with the name of your user story file in user-story/.

📖 Need more prompts? See prompting_template.md for the full conversation script — approval responses, healing prompts, report prompts, and an example session.

What happens when you send this prompt:

| Step | AI does this | You do this |
|------|-------------|-------------|
| 1 | Runs environment check (if first time) | Read the status report |
| 2 | Reads router.md → playwright-test-planner.agent.md | - |
| 3 | Explores your app with Playwright MCP | - |
| 4 | Writes test plan to specs/ | Review and approve |
| 5 | Reads playwright-test-generator.agent.md | - |
| 6 | Generates test files to tests/ | Review and approve |
| 7 | Tells you tests are ready to run | Run npm run qa:execute |
| 8 | If tests fail, debugs and proposes fix | Review and approve fix |

You are the supervisor. The AI never moves to the next phase without your approval.

For Humans: Understanding the Workflow

The template follows a 9-step AI workflow defined in prompts/QAe2eprompt.md. Here's what happens at each step:

Step 1 — Context Discovery

The AI reads docs/application-context.md and explores your project to understand the tech stack, authentication, and environment.

Step 2 — Test Strategy & Plan

The AI reads playwright-test-planner.agent.md and uses Playwright MCP to explore your app visually. It maps user flows, identifies critical paths, and designs comprehensive test scenarios. The plan is saved to specs/.

⛔ Approval gate: The AI stops after saving the plan and presents it to you. You review, give feedback, and say "approved" or "continue" before the AI proceeds to test generation.

Step 3 — Manual Exploratory Testing

The AI executes each scenario manually using Playwright MCP, capturing real selectors, observing behavior, and noting issues.

Step 4 — Test Code Generation

The AI reads playwright-test-generator.agent.md and writes real Playwright .spec.ts files using selectors it discovered during exploration. Generated files go in tests/.

⛔ Approval gate: The AI stops after generating test files and presents them to you. You review the code, check selectors and logic, then say "approved" or "execute" before the AI proceeds.

Step 5 — Execution

Run the tests mechanically:

npm run qa:execute [test-name]

Or run all tests:

npm run qa:execute

The executor runs Playwright with the configured options and saves results to test-results/.

Step 6 — Debug & Heal (AI)

If tests fail, the AI reads playwright-test-healer.agent.md and:

Runs the failing test with test_run
Debugs with test_debug — examines the actual UI state
Classifies the failure and proposes a fix (1-3 line change)

⛔ Approval gate: The AI stops and presents its diagnosis to you. It shows what's broken, why, and what it wants to change. You approve the fix before the AI edits any file.

Once approved, applies fix and re-runs once
If still failing, marks as test.fixme() and reports as defect
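
The classification step can be pictured with a small sketch. This is only an illustration: the `classifyFailure` function and its string patterns are hypothetical, and the healer agent is described as inspecting the live UI state rather than matching error text.

```typescript
// The three failure classes named in this README.
type FailureClass = "selector" | "timing" | "bug";

// Hypothetical heuristic: guess a class from the error text alone.
// The patterns below are illustrative, not an exhaustive list.
function classifyFailure(errorMessage: string): FailureClass {
  const msg = errorMessage.toLowerCase();
  const looksLikeSelector =
    msg.includes("strict mode violation") ||
    msg.includes("waiting for locator") ||
    msg.includes("no element");
  if (looksLikeSelector) return "selector";
  if (msg.includes("timeout") || msg.includes("timed out")) return "timing";
  // Anything else is treated as a candidate application bug.
  return "bug";
}

const sample = classifyFailure("page.goto: Timeout 30000ms exceeded.");
```

A classification like this would only be a first guess; under the approval gate above, the diagnosis is still shown to you before any file is edited.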

For a quick mechanical re-run (no AI diagnosis):

npm run qa:retry [run-id]

This re-runs failed tests with a longer timeout. If they still fail, the AI investigates.

Step 7 — Bug Classification (AI)

The AI classifies every defect by severity, priority, type, root cause, and reproducibility. Results are saved to test-results/defects-log.md.
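
To make the five classification fields concrete, here is a minimal sketch of a defect record and a formatter that renders it as one markdown table row. The `Defect` shape, the severity and priority labels, and `toMarkdownRow` are assumptions for illustration; the actual layout of test-results/defects-log.md may differ.

```typescript
// Assumed label sets; the template may use different ones.
type Severity = "critical" | "major" | "minor";
type Priority = "P1" | "P2" | "P3";

interface Defect {
  test: string;
  severity: Severity;
  priority: Priority;
  type: string;
  rootCause: string;
  reproducible: boolean;
}

// Hypothetical formatter: one defect becomes one markdown table row.
function toMarkdownRow(d: Defect): string {
  const cells = [
    d.test,
    d.severity,
    d.priority,
    d.type,
    d.rootCause,
    d.reproducible ? "always" : "intermittent",
  ];
  // Escape pipes so free text cannot break the table layout.
  return "| " + cells.map((c) => c.replace(/\|/g, "\\|")).join(" | ") + " |";
}

const row = toMarkdownRow({
  test: "login.spec.ts",
  severity: "major",
  priority: "P2",
  type: "functional",
  rootCause: "Submit button disabled after valid input",
  reproducible: true,
});
```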

Step 8 — Report

npm run qa:report [run-id]

Generates a markdown report from execution results.

Step 9 — Knowledge Retention

The AI updates docs/application-context.md with:

Stable selectors discovered
Known flaky areas
Healing strategies that worked
Environment observations

Required Dependencies

Install the dependencies you need:

@playwright/test

npm install -D @playwright/test

E2E test engine. Required.

@types/node

npm install -D @types/node

Node.js types (process, Buffer, etc.). Required.

allure-playwright

npm install -D allure-playwright

Allure reporter for Playwright. Required for reports.

allure-commandline

npm install -D allure-commandline

Generates the Allure HTML report. Required for reports.

@applitools/eyes-playwright

npm install -D @applitools/eyes-playwright

Applitools Eyes visual testing. Optional (requires APPLITOOLS_API_KEY).

@modelcontextprotocol/server-github

npm install -D @modelcontextprotocol/server-github

GitHub integration (PRs, issues). Optional (requires GITHUB_TOKEN).

Alternative: Dynamic Global Installation

The install script (npx @ai-qa/workflow init) normally installs all required dependencies for you. If you need to force a full manual installation, run:

npm install -D @playwright/test @types/node allure-playwright allure-commandline @applitools/eyes-playwright @applitools/mcp
npx playwright install chromium

Visual Testing (Applitools)

The template supports Applitools Eyes for automated visual testing via two components:

@applitools/mcp — MCP server for AI-driven visual testing
@applitools/eyes-playwright — Playwright integration for test-level visual assertions

Both are pre-installed in package.json — no extra install needed.

If APPLITOOLS_API_KEY is configured in your environment, the AI agent automatically adds visual checkpoints to critical pages during test generation. It captures screenshots of pages like login, dashboard, and checkout, and compares them against baselines to detect visual regressions.

Setup

Set your API key (get it from https://applitools.com):

# Option A: Export in terminal
export APPLITOOLS_API_KEY=your_key_here

# Option B: Add to .env file
echo "APPLITOOLS_API_KEY=your_key_here" >> .env

The AI will detect the key during its environment check and use Applitools automatically. If the key is not set, visual testing is skipped entirely — no errors, no blocks.

Allure Reports

Allure test reports are pre-configured:

allure-playwright reporter generates raw results during npm run qa:execute
The report is auto-generated as HTML after each test run
View via npm run dashboard or open allure-report/index.html
Manual regeneration: npm run qa:report:allure

Commands

| Command | Description | Who does it |
|---------|-------------|-------------|
| npm run qa:init | Create directories + auto-detect config | You (once) |
| npm run qa:execute [test] | Run Playwright tests | You or CI |
| npm run qa:retry [run-id] | Re-run failed tests (longer timeout) | You or CI |
| npm run qa:report [run-id] | Generate markdown report | You or CI |
| npm run qa:status | Show pipeline state | You |
| npm run qa:list | List stories, plans, specs | You |
| node ai-qa-workflow.js context <phase> <story> | Mark a pipeline phase complete for a story | AI agent or you |
| npm run dashboard | Launch web UI at :4000 | You |

The commands qa:plan, qa:generate, and qa:run do not exist in this template. Planning, test generation, and healing are done by the AI agent, not by scripts.

Zero-Config Design

On npm run qa:init, the template auto-detects your project configuration:

| Scan source | What it detects |
|-------------|----------------|
| package.json | Project name, description, dev server port |
| README.md | Project name, URL, description |
| docker-compose.yml | Exposed port, project name |
| .env | APP_URL, APP_ENV / NODE_ENV |
| playwright.config.* | Browser type (edge, webkit) |
| Directory structure | Framework (Angular, Next.js, Python) |
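
As a rough sketch of what the .env scan might involve, the following reads APP_URL and the environment name from the text of a .env file. The `parseEnv` and `detectFromEnv` helpers are hypothetical, and the choice to prefer APP_ENV over NODE_ENV when both are set is an assumption; the template's own detection logic is not shown in this README.

```typescript
// Parse KEY=VALUE lines, ignoring blank lines and # comments.
function parseEnv(text: string): Map<string, string> {
  const values = new Map<string, string>();
  for (const rawLine of text.split("\n")) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue;
    const eq = line.indexOf("=");
    if (eq <= 0) continue; // no key before "="
    const key = line.slice(0, eq).trim();
    // Strip one pair of matching surrounding quotes, if present.
    const value = line.slice(eq + 1).trim().replace(/^(["'])(.*)\1$/, "$2");
    values.set(key, value);
  }
  return values;
}

interface DetectedProject {
  url: string | undefined;
  environment: string | undefined;
}

// Assumption: APP_ENV wins over NODE_ENV when both are present.
function detectFromEnv(text: string): DetectedProject {
  const env = parseEnv(text);
  return {
    url: env.get("APP_URL"),
    environment: env.get("APP_ENV") ?? env.get("NODE_ENV"),
  };
}

const detected = detectFromEnv(
  '# local settings\nAPP_URL="http://localhost:5173"\nNODE_ENV=development\n',
);
```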

Generated .qa-workflow.json:

{
  "project": {
    "name": "my-app",
    "url": "http://localhost:5173",
    "environment": "Development"
  },
  "browser": {
    "type": "chromium",
    "cdpPort": 9222,
    "headed": false
  },
  "test": {
    "timeout": 120000,
    "retries": 0,
    "workers": 1
  },
  "auth": {
    "user": "",
    "credentials": {}
  }
}

Edit this file to override auto-detected values.
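
One way to picture the override behavior is a section-by-section merge of your edits over the detected values. The sketch below is an assumption about how such a merge could work: the `QaWorkflowConfig` type mirrors the JSON above, while `applyOverrides` is a hypothetical helper and not a function exported by scripts/utils.js.

```typescript
// Shape of .qa-workflow.json as shown above.
interface QaWorkflowConfig {
  project: { name: string; url: string; environment: string };
  browser: { type: string; cdpPort: number; headed: boolean };
  test: { timeout: number; retries: number; workers: number };
  auth: { user: string; credentials: Record<string, string> };
}

// Each section may be partially overridden.
type ConfigOverrides = {
  [K in keyof QaWorkflowConfig]?: Partial<QaWorkflowConfig[K]>;
};

const detectedDefaults: QaWorkflowConfig = {
  project: { name: "my-app", url: "http://localhost:5173", environment: "Development" },
  browser: { type: "chromium", cdpPort: 9222, headed: false },
  test: { timeout: 120000, retries: 0, workers: 1 },
  auth: { user: "", credentials: {} },
};

// Hypothetical helper: shallow merge per section, overrides win.
function applyOverrides(base: QaWorkflowConfig, overrides: ConfigOverrides): QaWorkflowConfig {
  return {
    project: { ...base.project, ...overrides.project },
    browser: { ...base.browser, ...overrides.browser },
    test: { ...base.test, ...overrides.test },
    auth: { ...base.auth, ...overrides.auth },
  };
}

const config = applyOverrides(detectedDefaults, {
  project: { url: "http://localhost:3000" },
  test: { retries: 1 },
});
```

Here only the URL and retry count change; every other value keeps what detection produced.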

Application Context (`docs/application-context.md`)

This file serves as the AI's knowledge base. On init it's auto-generated. The AI populates it during exploration and testing with:

Stable selectors — CSS/XPath selectors discovered during exploration
Known flaky areas — elements or flows that frequently break
Auth details — users, tokens, environments
Tech stack notes — framework specifics the AI should know

For best results, edit this file with your project's specifics before the AI starts.

Persistent AI Memory (`.qa-context/`)

The .qa-context/ directory stores structured memory that persists between AI sessions and pipeline phases:

| File | Content | Used by |
|------|---------|---------|
| pipeline.json | Current pipeline state: which phases are complete, current story, last run | All agents (read on start, write on complete) |
| selectors.json | All discovered selectors with reliability scores, healing history, recommended alternatives | Generator (write best selectors), Healer (read alternatives) |
| heal-history.json | Every healing attempt: what was tried, which strategy, whether it succeeded | Healer (avoid repeating failed attempts), Planner (avoid flaky pages) |
| traceability.json | Full mapping: story → plan → spec → runs → healing | Reporter (link report to story), Dashboard (display traceability) |

Each agent reads .qa-context/ before starting and updates it after completing. This means:

The Generator reuses stable selectors discovered by the Planner
The Healer knows which selectors were tried and failed before
The Reporter links every run back to its original story
The Dashboard can display the full audit trail

The AI automatically consults these files — no manual setup needed.
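
To illustrate how an agent might consult selector memory, here is a minimal sketch that picks the most reliable selector for an element while skipping ones that already failed. The `SelectorEntry` fields and `pickSelector` are assumptions; the real schema of selectors.json is not documented here and may differ.

```typescript
// Assumed entry shape for illustration only.
interface SelectorEntry {
  element: string;       // logical name, e.g. "login.submit"
  selector: string;      // locator string
  reliability: number;   // 0..1, higher means more stable
  failedBefore: boolean; // recorded after a failed healing attempt
}

// Return the most reliable selector that has not failed before.
function pickSelector(entries: SelectorEntry[], element: string): string | undefined {
  const candidates = entries
    .filter((e) => e.element === element && !e.failedBefore)
    .sort((a, b) => b.reliability - a.reliability);
  return candidates[0]?.selector;
}

const memory: SelectorEntry[] = [
  { element: "login.submit", selector: "button.btn-primary", reliability: 0.4, failedBefore: true },
  { element: "login.submit", selector: "[data-testid='login-submit']", reliability: 0.95, failedBefore: false },
  { element: "login.submit", selector: "role=button[name='Sign In']", reliability: 0.8, failedBefore: false },
];

const best = pickSelector(memory, "login.submit");
```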

Authentication Management (`.auth/`)

The .auth/ directory stores everything needed for automated login during tests:

| File | Content | Git |
|------|---------|:---:|
| credentials.json | Username + password for the app | ❌ ignored |
| storage-state.json | Playwright session (cookies, localStorage, tokens) | ❌ ignored |
| .gitignore | Ensures nothing in .auth/ is committed | ✅ |

How it works

npm run qa:init creates .auth/ with .gitignore — ready to use
AI discovers login during the Plan phase: it navigates to /login, detects form fields, and saves the structure to .qa-context/auth.json
You provide credentials once — the AI saves them to .auth/credentials.json
Session is persisted — after first login, Playwright's storage state is saved to .auth/storage-state.json
Tests reuse the session — generated auth.setup.ts loads storage state before each run
If credentials change, the AI detects auth failures via auth-manager.js and prompts you to update .auth/credentials.json

First-time setup

# 1. After qa:init, open the project in your AI editor
# 2. The AI will ask for credentials during the Plan phase
# 3. Provide them once — they're saved to .auth/credentials.json
# 4. All future runs reuse the stored session

Manual setup

{
  "username": "[email protected]",
  "password": "your-password",
  "url": "http://localhost:3000/login"
}

Save this as .auth/credentials.json. The AI detects it automatically on next run.
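
If you write this file by hand, a quick shape check can catch a missing field before a test run. The sketch below assumes the three fields shown above are all required; `parseCredentials` is a hypothetical helper, not part of the template, and the sample values are placeholders.

```typescript
interface Credentials {
  username: string;
  password: string;
  url: string;
}

// Hypothetical validator for the contents of .auth/credentials.json.
function parseCredentials(raw: string): Credentials {
  const data: unknown = JSON.parse(raw);
  if (typeof data !== "object" || data === null || Array.isArray(data)) {
    throw new Error("credentials.json must contain a JSON object");
  }
  const record = data as Record<string, unknown>;
  // Read one required, non-empty string field.
  const read = (key: string): string => {
    const value = record[key];
    if (typeof value !== "string" || value.trim() === "") {
      throw new Error(`credentials.json: missing or empty "${key}"`);
    }
    return value;
  };
  return { username: read("username"), password: read("password"), url: read("url") };
}

// Placeholder values for illustration.
const creds = parseCredentials(
  '{"username":"qa@example.com","password":"your-password","url":"http://localhost:3000/login"}',
);
```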

How the AI Agents Work

The template defines three specialized AI agents in .github/agents/:

Planner Agent

Triggers when the AI needs to create a test plan. The AI:

Opens your app with Playwright MCP
Explores navigation, flows, UI components
Maps critical paths and edge cases
Saves a structured test plan to specs/
⛔ Stops and presents the plan to you for approval

Generator Agent

Triggers when the AI needs to write test code. The AI:

Reads the test plan
Uses Playwright to manually execute each step
Captures real selectors (data-testid, aria-label, role)
Adds visual checkpoints via Applitools on critical pages (if APPLITOOLS_API_KEY is set)
Writes complete, executable Playwright tests to tests/
⛔ Stops and presents the code to you for approval

Healer Agent

Triggers when tests fail. The AI:

Runs only the failing tests
Debugs with test_debug — sees the actual UI state
Classifies the failure (selector, timing, or bug)
⛔ Stops and presents diagnosis + proposed fix for your approval
Once approved, applies a targeted 1-3 line fix
Re-runs once — if still failing, marks as test.fixme()

Writing Good User Stories

The AI reads your user stories to understand what to test. Write them in user-story/:

# User Login

**Story ID**: US-001
**Title**: User Login
**Feature**: Authentication

## Description
As a registered user, I want to log in with my credentials so I can access the dashboard.

## Preconditions
- User exists with email [email protected]
- Browser is on the login page

## Acceptance Criteria
1. Enter valid email and password
2. Click "Sign In"
3. Redirected to dashboard
4. User name displayed in header

Well-structured stories produce better test plans and better automated tests.
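
The value of this structure is that each part can be picked out mechanically. As an illustration, the sketch below extracts the header fields and the numbered acceptance criteria from a story in the format above. `parseUserStory` is a hypothetical helper; the AI agent reads the story directly and the template may not parse it this way.

```typescript
interface UserStory {
  id: string;
  title: string;
  feature: string;
  acceptanceCriteria: string[];
}

// Hypothetical parser for the story format shown above.
function parseUserStory(markdown: string): UserStory {
  // Read a "**Label**: value" header line; empty string if absent.
  const field = (label: string): string => {
    const pattern = new RegExp(`^\\*\\*${label}\\*\\*:[ \\t]*(.+)$`, "m");
    return markdown.match(pattern)?.[1]?.trim() ?? "";
  };

  // Collect numbered items under "## Acceptance Criteria" up to the next "## " heading.
  const lines = markdown.split("\n");
  const start = lines.findIndex((l) => l.trim() === "## Acceptance Criteria");
  const criteria: string[] = [];
  if (start !== -1) {
    for (const line of lines.slice(start + 1)) {
      if (line.startsWith("## ")) break;
      const text = line.match(/^\d+\.\s+(.+)$/)?.[1];
      if (text) criteria.push(text.trim());
    }
  }

  return {
    id: field("Story ID"),
    title: field("Title"),
    feature: field("Feature"),
    acceptanceCriteria: criteria,
  };
}

const story = parseUserStory(
  [
    "# User Login",
    "",
    "**Story ID**: US-001",
    "**Title**: User Login",
    "**Feature**: Authentication",
    "",
    "## Acceptance Criteria",
    "1. Enter valid email and password",
    '2. Click "Sign In"',
    "3. Redirected to dashboard",
    "4. User name displayed in header",
  ].join("\n"),
);
```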

Dashboard

npm run dashboard
# → http://localhost:4000

The dashboard provides:

Multi-project management
Execution history with pass/fail/healed stats
Analytics charts (pass rate, duration, healing vs defects)
Allure report integration
Export to HTML / plain text

Self-Healing Protocol (AI-Driven)

This is not an automatic script. The AI agent follows this protocol:

| Attempt | What happens | If still fails |
|---------|-------------|----------------|
| 1 | Standard re-run | Classify failure |
| 2 | Longer timeout (60s) | Mark as defect |
| STOP | - | test.fixme() + classify |

The mechanical npm run qa:retry just re-runs. True healing (fixing selectors, adjusting waits, fixing assertions) is done by the AI agent using Playwright MCP.
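
The attempt table can be read as a bounded, two-step procedure. The sketch below models only that control flow: `runTest` is a stand-in for whatever actually executes the test, both timeouts are passed in as parameters, and the outcome labels are assumptions. It is not the template's retrier.js.

```typescript
type Outcome = "passed" | "passed-with-longer-timeout" | "defect";

interface ProtocolResult {
  outcome: Outcome;
  attempts: number;
}

// runTest is a hypothetical stand-in; it returns true when the test passes.
function followProtocol(
  runTest: (timeoutMs: number) => boolean,
  standardTimeoutMs: number,
  longerTimeoutMs: number,
): ProtocolResult {
  // Attempt 1: standard re-run.
  if (runTest(standardTimeoutMs)) return { outcome: "passed", attempts: 1 };
  // Attempt 2: one more run with the longer timeout.
  if (runTest(longerTimeoutMs)) return { outcome: "passed-with-longer-timeout", attempts: 2 };
  // STOP: no further retries; the test would be marked test.fixme() and classified.
  return { outcome: "defect", attempts: 2 };
}

// A fake test that only passes when given at least 60 seconds.
const slowTest = (timeoutMs: number): boolean => timeoutMs >= 60_000;
const result = followProtocol(slowTest, 30_000, 60_000);
```

The point of the hard stop is that a test never loops: after two attempts it is either passing or recorded as a defect.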

Human Supervision (You Are in Control)

The AI agent never acts without your approval. Every major phase follows this cycle:

AI proposes ──▶ You review ──▶ You approve ──▶ AI executes ──▶ You verify

| Phase | AI does | You approve |
|-------|---------|-------------|
| Plan | Explore app, write test plan | Review plan content |
| Generate | Write Playwright test code | Review selectors and logic |
| Heal | Diagnose failure, propose fix | Review the fix before it's applied |

The only exception: npm run qa:execute runs tests mechanically — you run this yourself or via CI.

The AI never:

Generates tests without showing you the plan first
Edits test code without showing you the diagnosis first
Deploys, commits, or pushes without asking

You are the supervisor. The AI is your engineer.

Token Efficiency

The framework is designed to minimize token usage during AI-driven testing. These rules are embedded in the agent definitions and prompts:

Browser Navigation Rules

Planner explores once — navigates the app in ONE session and saves all selectors to .qa-context/selectors.json.
Verification pass (not re-exploration) — the Generator reads selectors.json, then does a single lightweight batch session to verify selectors via locator.isVisible() checks (not full snapshots). This catches stale selectors from dynamic frameworks (Ionic, Angular) without re-exploring every page.
test.stableSelectors flag — set to false in .qa-workflow.json for dynamic frameworks. The Generator always runs the verification pass when this is false.
Snapshot depth ≤ 3 — browser_snapshot capped at depth 3. Full DOM trees (depth 10+) can consume 50-100K tokens per snapshot.
Batch all navigation — all page visits happen in one session per phase, never per scenario.
Skip non-critical pages — static content, footers, about/legal pages are skipped.
No screenshots during exploration — pages referenced by path only.
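
To see why a depth cap shrinks snapshots, consider a toy tree pruned to a fixed depth. This is an analogy only: `SnapshotNode`, `pruneToDepth`, and the convention that depth 1 means the root alone are assumptions, and the real browser_snapshot tool applies its own limit.

```typescript
// Toy stand-in for a snapshot tree.
interface SnapshotNode {
  role: string;
  children: SnapshotNode[];
}

// Keep at most maxDepth levels; depth 1 keeps only the node itself.
function pruneToDepth(node: SnapshotNode, maxDepth: number): SnapshotNode {
  if (maxDepth <= 1) return { role: node.role, children: [] };
  return {
    role: node.role,
    children: node.children.map((child) => pruneToDepth(child, maxDepth - 1)),
  };
}

function countNodes(node: SnapshotNode): number {
  return 1 + node.children.reduce((sum, child) => sum + countNodes(child), 0);
}

// A five-level chain: main > form > group > field > input.
const page: SnapshotNode = {
  role: "main",
  children: [
    {
      role: "form",
      children: [
        {
          role: "group",
          children: [{ role: "field", children: [{ role: "input", children: [] }] }],
        },
      ],
    },
  ],
};

const capped = pruneToDepth(page, 3);
```

On a real page the tree branches at every level, so cutting the deepest levels removes most of the nodes.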

Selector Reuse

Planner captures selectors once → saved to .qa-context/selectors.json
Generator runs a single batch verification pass — lightweight isVisible() checks, not navigation per test
Healer reads existing selectors + healing history → avoids re-discovering failed locators
Every discovered or corrected selector is saved so no agent ever re-explores the same element

Execution & Healing

Max 1 fix attempt per test — no endless retries
Failures classified immediately (selector / timing / bug) — no ambiguous loops
Targeted 1-3 line edits only — never rewrite entire files
test.fixme() marks defects — no retrying known bugs

Context Management

Error messages truncated to 200 chars
Files cached in memory between reads
Screenshots referenced by path, not embedded
Directory listings done once per pipeline run
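
Two of these rules are simple enough to sketch directly: truncating error messages and caching file reads. The helpers below are hypothetical illustrations, with the 200-character limit taken from the list above; the `read` callback stands in for a real file read.

```typescript
const MAX_ERROR_CHARS = 200;

// Cut a message to at most `max` characters, marking the cut with an ellipsis.
function truncateError(message: string, max: number = MAX_ERROR_CHARS): string {
  return message.length <= max ? message : message.slice(0, max - 1) + "…";
}

// Wrap a reader so each path is read once and then served from memory.
function createCachedReader(read: (path: string) => string): (path: string) => string {
  const cache = new Map<string, string>();
  return (path: string): string => {
    const hit = cache.get(path);
    if (hit !== undefined) return hit;
    const content = read(path);
    cache.set(path, content);
    return content;
  };
}

// Count how often the underlying reader is actually called.
let underlyingReads = 0;
const readFile = createCachedReader((path) => {
  underlyingReads += 1;
  return `contents of ${path}`;
});

readFile("docs/application-context.md");
readFile("docs/application-context.md");
const shortened = truncateError("x".repeat(500));
```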

For details, see the agent files in .github/agents/ and the TOKEN & EFFICIENCY RULES section in prompts/QAe2eprompt.md.