@ai-qa/workflow
v2.0.29
AI QA Workflow Template — transforms any AI agent into an autonomous QA engineer. AI explores, plans, generates tests, and heals. Scripts execute and report.
AI QA Workflow Template
Turn any AI agent into an autonomous QA engineer.
This template gives an AI coding assistant (opencode, Copilot, Claude Code, Cursor, etc.) everything it needs to: explore your web app, plan tests, generate real Playwright code, execute them, debug failures, and visualize results.
The template handles the mechanical parts (execution, reporting, dashboard). The AI handles the creative parts (planning, test generation, debugging, healing).
How It Works
                   ┌─────────────────────────────────────┐
                   │      AI AGENT (does thinking)       │
                   │                                     │
                   │  Phase 1: Environment check         │
                   │  Phase 2: Explore app + write plan  │
                   │  Phase 3: Generate Playwright tests │
                   │  Phase 4: Execute tests             │
                   │  Phase 5: Debug + heal failures     │
                   │  Phase 6: Report + update context   │
                   └───────┬────────────────────┬────────┘
                           │                    │
             ┌─────────────┘                    └─────────────┐
             ▼                                                ▼
┌─────────────────────────┐                      ┌─────────────────────────┐
│    HUMAN SUPERVISION    │                      │   SCRIPTS (mechanics)   │
│                         │                      │                         │
│  At EVERY phase:        │                      │  npm run qa:execute     │
│  AI proposes ──▶ you    │                      │  npm run qa:report      │
│  approve ──▶ AI acts    │                      │  npm run qa:retry       │
│                         │                      │  npm run qa:status      │
│  You are always in      │                      │  npm run qa:list        │
│  control.               │                      │  npm run dashboard      │
└─────────────────────────┘                      └─────────────────────────┘
One-Command Install
npx @ai-qa/workflow init --yes
Installs the full template into your project:
- Playwright MCP + GitHub MCP configuration (opencode.json)
- AI agent definitions (.github/agents/)
- Workflow prompts (prompts/)
- CLI scripts for execution and reporting
- QA Dashboard (web UI)
- Directory structure (user-story/, specs/, tests/, test-results/, docs/, .qa-context/)
Update an existing installation
npx @ai-qa/workflow update --yes
Updates template files while preserving your user stories, test specs, config, and run results.
What You Get
your-project/
├── ai-qa-workflow.js # CLI orchestrator (8 commands)
├── .qa-workflow.json # Project config (auto-detected)
├── scripts/
│ ├── utils.js # Config loading, shared helpers
│ ├── executor.js # Runs Playwright tests
│ ├── retrier.js # Re-runs failed tests (longer timeout)
│ └── reporter.js # Markdown + JSON reports
├── opencode.json # MCP config (Playwright + GitHub)
├── .github/agents/ # AI agent definitions
│ ├── playwright-test-planner.agent.md
│ ├── playwright-test-generator.agent.md
│ └── playwright-test-healer.agent.md
├── prompts/
│ ├── QAe2eprompt.md # Full 9-step AI workflow
│ ├── general_prompt.md # Quick-start prompt
├── prompting_template.md # Conversation guide — what to say to the AI at each phase
├── router.md # AI entry point — routes to correct agent
├── qa-dashboard/ # Web UI (port 4000)
├── .qa-context/ # Persistent AI memory (pipeline state, selectors, healing history, traceability)
├── .auth/ # Auth credentials + Playwright storage state (gitignored)
├── user-story/ # Your .md stories
├── specs/ # AI-generated test plans
├── tests/ # AI-generated Playwright specs
├── test-results/ # Run results, reports, screenshots
└── docs/ # Application context (AI knowledge base)
Quick Start
# 1. Install the template
npx @ai-qa/workflow init --yes
# 2. Initialize project config
npm run qa:init
# 3. Write a user story in user-story/my-feature.md (see format below)
# 4. Open the project in your AI editor (opencode, Copilot, Cursor)
# The AI agent auto-detects its role and runs an environment check.
# It will report what's ready and what's missing.
# 5. Send this one prompt to start:
# "Read router.md and follow the QA workflow for my-feature.md"
# The AI will:
# - Read router.md → understand its mission
# - Read playwright-test-planner.agent.md
# - Run environment check (if not already done)
# - Explore your app with Playwright MCP
# - Save a test plan to specs/
# - STOP and ask for your approval
# - Once approved → generate tests → STOP again
# - Tell you when ready to execute
# 6. Execute tests
npm run qa:execute
# 7. If tests fail, tell your AI agent:
# "Debug and fix the failing tests"
# 8. Launch dashboard
npm run dashboard
💡 Full conversation guide: See prompting_template.md for a complete script of what to say to the AI at every phase, from installation to final report.
AI Agent Auto-Bootstrap
When you open this project in an AI-powered editor, the agent automatically understands its purpose:
- Reads .opencode/rules.md (or .github/copilot-instructions.md) → discovers it's an AI QA Engineer
- Runs an environment check → reports what's ready ✅ and what's missing ❌
- Reads router.md → learns the workflow and supervision rules
- Stops and waits for you to provide a user story
No manual instructions needed. The agent knows its role on first contact.
Your First Prompt
After installation and running npm run qa:init, open the project in your AI editor.
The very first thing you should say to the AI agent:
"Run the environment check and show me the status report"
The AI will check all 10 preconditions and report what's ready ✅ and what's missing ❌. Then wait for your instructions.
Option B — Go straight to QA workflow (if you already have a user story)
"Read router.md and follow the QA workflow for user-story/my-story.md"
The AI will run the environment check (if not done), then proceed through Pla
(Replace my-story.md with the name of your user story file in user-story/.)
📖 Need more prompts? See prompting_template.md for the full conversation script: approval responses, healing prompts, report prompts, and an example session.
What happens when you send this prompt:
| Step | AI does this | You do this |
|------|-------------|-------------|
| 1 | Runs environment check (if first time) | Read the status report |
| 2 | Reads router.md → playwright-test-planner.agent.md | — |
| 3 | Explores your app with Playwright MCP | — |
| 4 | Writes test plan to specs/ | Review and approve |
| 5 | Reads playwright-test-generator.agent.md | — |
| 6 | Generates test files to tests/ | Review and approve |
| 7 | Tells you tests are ready to run | Run npm run qa:execute |
| 8 | If tests fail, debugs and proposes fix | Review and approve fix |
You are the supervisor. The AI never moves to the next phase without your approval.
For Humans: Understanding the Workflow
The template follows a 9-step AI workflow defined in prompts/QAe2eprompt.md. Here's what happens at each step:
Step 1 — Context Discovery
The AI reads docs/application-context.md and explores your project to understand the tech stack, authentication, and environment.
Step 2 — Test Strategy & Plan
The AI reads playwright-test-planner.agent.md and uses Playwright MCP to explore your app visually. It maps user flows, identifies critical paths, and designs comprehensive test scenarios. The plan is saved to specs/.
⛔ Approval gate: The AI stops after saving the plan and presents it to you. You review, give feedback, and say "approved" or "continue" before the AI proceeds to test generation.
Step 3 — Manual Exploratory Testing
The AI executes each scenario manually using Playwright MCP, capturing real selectors, observing behavior, and noting issues.
Step 4 — Test Code Generation
The AI reads playwright-test-generator.agent.md and writes real Playwright .spec.ts files using selectors it discovered during exploration. Generated files go in tests/.
⛔ Approval gate: The AI stops after generating test files and presents them to you. You review the code, check selectors and logic, then say "approved" or "execute" before the AI proceeds.
Step 5 — Execution
Run the tests mechanically:
npm run qa:execute [test-name]
Or run all tests:
npm run qa:execute
The executor runs Playwright with the configured options and saves results to test-results/.
Step 6 — Debug & Heal (AI)
If tests fail, the AI reads playwright-test-healer.agent.md and:
- Runs the failing test with test_run
- Debugs with test_debug: examines the actual UI state
- Classifies the failure and proposes a fix (1-3 line change)
⛔ Approval gate: The AI stops and presents its diagnosis to you. It shows what's broken, why, and what it wants to change. You approve the fix before the AI edits any file.
- Once approved, applies fix and re-runs once
- If still failing, marks as test.fixme() and reports as defect
For a quick mechanical re-run (no AI diagnosis):
npm run qa:retry [run-id]
This re-runs failed tests with a longer timeout. If they still fail, the AI investigates.
Step 7 — Bug Classification (AI)
The AI classifies every defect by severity, priority, type, root cause, and reproducibility. Results are saved to test-results/defects-log.md.
Step 8 — Report
npm run qa:report [run-id]
Generates a markdown report from execution results.
Step 9 — Knowledge Retention
The AI updates docs/application-context.md with:
- Stable selectors discovered
- Known flaky areas
- Healing strategies that worked
- Environment observations
Required Dependencies
Install the dependencies you need:
@playwright/test
npm install -D @playwright/test
E2E test engine. Required.
@types/node
npm install -D @types/node
Node.js types (process, Buffer, ...). Required.
allure-playwright
npm install -D allure-playwright
Allure reporter for Playwright. Required for reports.
allure-commandline
npm install -D allure-commandline
Generates the Allure HTML report. Required for reports.
@applitools/eyes-playwright
npm install -D @applitools/eyes-playwright
Applitools Eyes visual tests. Optional (requires APPLITOOLS_API_KEY).
@modelcontextprotocol/server-github
npm install -D @modelcontextprotocol/server-github
GitHub integration (PRs, issues). Optional (requires GITHUB_TOKEN).
Alternative: Dynamic Global Installation
The install script (npx @ai-qa/workflow init) normally installs all required dependencies for you. If you need to force a full manual installation, run:
npm install -D @playwright/test @types/node allure-playwright allure-commandline @applitools/eyes-playwright @applitools/mcp
npx playwright install chromium
Visual Testing (Applitools)
The template supports Applitools Eyes for automated visual testing via two components:
- @applitools/mcp: MCP server for AI-driven visual testing
- @applitools/eyes-playwright: Playwright integration for test-level visual assertions
Both are pre-installed in package.json — no extra install needed.
If APPLITOOLS_API_KEY is configured in your environment, the AI agent automatically adds visual checkpoints to critical pages during test generation. It captures screenshots of pages like login, dashboard, and checkout, and compares them against baselines to detect visual regressions.
Setup
Set your API key (get it from https://applitools.com):
# Option A: Export in terminal
export APPLITOOLS_API_KEY=your_key_here
# Option B: Add to .env file
echo "APPLITOOLS_API_KEY=your_key_here" >> .env
The AI will detect the key during its environment check and use Applitools automatically. If the key is not set, visual testing is skipped entirely, with no errors and no blocked steps.
Allure Reports
Allure test reports are pre-configured:
- allure-playwright reporter generates raw results during npm run qa:execute
- The report is auto-generated as HTML after each test run
- View via npm run dashboard or open allure-report/index.html
- Manual regeneration: npm run qa:report:allure
Commands
| Command | Description | Who does it |
|---------|-------------|-------------|
| npm run qa:init | Create directories + auto-detect config | You (once) |
| npm run qa:execute [test] | Run Playwright tests | You or CI |
| npm run qa:retry [run-id] | Re-run failed tests (longer timeout) | You or CI |
| npm run qa:report [run-id] | Generate markdown report | You or CI |
| npm run qa:status | Show pipeline state | You |
| npm run qa:list | List stories, plans, specs | You |
| node ai-qa-workflow.js context <phase> <story> | Mark a pipeline phase complete for a story | AI agent or you |
| npm run dashboard | Launch web UI at :4000 | You |
The commands qa:plan, qa:generate, and qa:run do not exist in this template. Planning, test generation, and healing are done by the AI agent, not by scripts.
Zero-Config Design
On npm run qa:init, the template auto-detects your project configuration:
| Scan source | What it detects |
|-------------|----------------|
| package.json | Project name, description, dev server port |
| README.md | Project name, URL, description |
| docker-compose.yml | Exposed port, project name |
| .env | APP_URL, APP_ENV / NODE_ENV |
| playwright.config.* | Browser type (edge, webkit) |
| Directory structure | Framework (Angular, Next.js, Python) |
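As an illustration of the `.env` row, the sketch below parses `KEY=value` lines and picks out APP_URL and APP_ENV, falling back to NODE_ENV. The `parseEnv` and `detectFromEnv` helpers are hypothetical names; the template's actual detection code is not shown in this README.

```typescript
// Hypothetical sketch of the `.env` scan; parseEnv and detectFromEnv are not part of the template.
function parseEnv(text: string): Map<string, string> {
  const vars = new Map<string, string>();
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue; // skip blank lines and comments
    const eq = line.indexOf("=");
    if (eq <= 0) continue; // not a KEY=value line
    const key = line.slice(0, eq).trim();
    // Strip one pair of matching surrounding quotes, if present.
    const value = line.slice(eq + 1).trim().replace(/^(["'])(.*)\1$/, "$2");
    vars.set(key, value);
  }
  return vars;
}

function detectFromEnv(text: string): { url: string | undefined; environment: string | undefined } {
  const vars = parseEnv(text);
  return {
    url: vars.get("APP_URL"),
    // APP_ENV wins when both are set; NODE_ENV is the fallback.
    environment: vars.get("APP_ENV") ?? vars.get("NODE_ENV"),
  };
}

const sampleEnv = '# local settings\nAPP_URL="http://localhost:5173"\nNODE_ENV=development\n';
console.log(detectFromEnv(sampleEnv));
```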
Generated .qa-workflow.json:
{
"project": {
"name": "my-app",
"url": "http://localhost:5173",
"environment": "Development"
},
"browser": {
"type": "chromium",
"cdpPort": 9222,
"headed": false
},
"test": {
"timeout": 120000,
"retries": 0,
"workers": 1
},
"auth": {
"user": "",
"credentials": {}
}
}
Edit this file to override auto-detected values.
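Overriding auto-detected values amounts to layering your edits over the detected defaults. The sketch below shows one way to do that, assuming a hypothetical `mergeConfig` helper; the template's real loader (scripts/utils.js) may work differently. Field names follow the JSON above, and the `auth` section is left out for brevity.

```typescript
// Hypothetical sketch: layer user overrides on top of auto-detected defaults.
// mergeConfig is an illustrative name, not a function exported by the template.
interface QaConfig {
  project: { name: string; url: string; environment: string };
  browser: { type: string; cdpPort: number; headed: boolean };
  test: { timeout: number; retries: number; workers: number };
}

// Every section and every field inside it is optional in an override.
type QaConfigOverrides = { [K in keyof QaConfig]?: Partial<QaConfig[K]> };

const detectedDefaults: QaConfig = {
  project: { name: "my-app", url: "http://localhost:5173", environment: "Development" },
  browser: { type: "chromium", cdpPort: 9222, headed: false },
  test: { timeout: 120000, retries: 0, workers: 1 },
};

function mergeConfig(defaults: QaConfig, overrides: QaConfigOverrides): QaConfig {
  // Merge section by section so a partial override keeps the other fields.
  return {
    project: { ...defaults.project, ...overrides.project },
    browser: { ...defaults.browser, ...overrides.browser },
    test: { ...defaults.test, ...overrides.test },
  };
}

const mergedConfig = mergeConfig(detectedDefaults, {
  browser: { headed: true },
  test: { retries: 1 },
});
console.log(mergedConfig.browser, mergedConfig.test);
```

Merging per section rather than replacing whole sections means an override such as `{ browser: { headed: true } }` does not discard the detected browser type or CDP port.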
Application Context (docs/application-context.md)
This file serves as the AI's knowledge base. On init it's auto-generated. The AI populates it during exploration and testing with:
- Stable selectors — CSS/XPath selectors discovered during exploration
- Known flaky areas — elements or flows that frequently break
- Auth details — users, tokens, environments
- Tech stack notes — framework specifics the AI should know
For best results, edit this file with your project's specifics before the AI starts.
Persistent AI Memory (.qa-context/)
The .qa-context/ directory stores structured memory that persists between AI sessions and pipeline phases:
| File | Content | Used by |
|------|---------|---------|
| pipeline.json | Current pipeline state: which phases are complete, current story, last run | All agents (read on start, write on complete) |
| selectors.json | All discovered selectors with reliability scores, healing history, recommended alternatives | Generator (write best selectors), Healer (read alternatives) |
| heal-history.json | Every healing attempt: what was tried, which strategy, whether it succeeded | Healer (avoid repeating failed attempts), Planner (avoid flaky pages) |
| traceability.json | Full mapping: story → plan → spec → runs → healing | Reporter (link report to story), Dashboard (display traceability) |
Each agent reads .qa-context/ before starting and updates it after completing. This means:
- The Generator reuses stable selectors discovered by the Planner
- The Healer knows which selectors were tried and failed before
- The Reporter links every run back to its original story
- The Dashboard can display the full audit trail
The AI automatically consults these files — no manual setup needed.
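To make the interplay between selectors.json and heal-history.json concrete, here is a minimal sketch of how a healer might choose its next selector. The record shapes (`SelectorEntry`, `HealAttempt`) and the 0-to-1 reliability score are assumptions for illustration; the actual file schemas are not documented in this README.

```typescript
// Hypothetical record shapes; the real selectors.json and heal-history.json schemas may differ.
interface SelectorEntry {
  element: string;     // logical element name, e.g. "login.submit"
  selector: string;    // locator string
  reliability: number; // assumed score between 0 and 1
}

interface HealAttempt {
  element: string;
  selector: string;
  succeeded: boolean;
}

// Pick the most reliable selector for an element, skipping any that already failed during healing.
function bestSelector(
  element: string,
  selectors: SelectorEntry[],
  attempts: HealAttempt[],
): string | undefined {
  const failed = new Set(
    attempts.filter((a) => a.element === element && !a.succeeded).map((a) => a.selector),
  );
  const candidates = selectors
    .filter((s) => s.element === element && !failed.has(s.selector))
    .sort((a, b) => b.reliability - a.reliability);
  return candidates[0]?.selector;
}

const knownSelectors: SelectorEntry[] = [
  { element: "login.submit", selector: "#submit-btn", reliability: 0.9 },
  { element: "login.submit", selector: "[data-testid=login-submit]", reliability: 0.8 },
];
const pastAttempts: HealAttempt[] = [
  { element: "login.submit", selector: "#submit-btn", succeeded: false },
];
console.log(bestSelector("login.submit", knownSelectors, pastAttempts));
```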
Authentication Management (.auth/)
The .auth/ directory stores everything needed for automated login during tests:
| File | Content | Git |
|------|---------|:---:|
| credentials.json | Username + password for the app | ❌ ignored |
| storage-state.json | Playwright session (cookies, localStorage, tokens) | ❌ ignored |
| .gitignore | Ensures nothing in .auth/ is committed | ✅ |
How it works
- npm run qa:init creates .auth/ with .gitignore, ready to use
- AI discovers login during Phase 1 (Plan): navigates to /login, detects form fields, saves structure to .qa-context/auth.json
- You provide credentials once: the AI saves them to .auth/credentials.json
- Session is persisted: after first login, Playwright's storage state is saved to .auth/storage-state.json
- Tests reuse the session: generated auth.setup.ts loads storage state before each run
- If credentials change, the AI detects auth failures via auth-manager.js and prompts you to update .auth/credentials.json
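One way to decide whether a saved session is still worth reusing is to check cookie expiry in the storage state file. The sketch below assumes Playwright's storage-state JSON layout, where each cookie's `expires` is a Unix time in seconds and `-1` marks a session cookie. `hasLiveSession` is a hypothetical helper and is not how auth-manager.js is known to work.

```typescript
// Subset of Playwright's storage-state JSON that matters for this check.
interface StoredCookie {
  name: string;
  expires: number; // Unix time in seconds; -1 for session cookies
}

interface SavedStorageState {
  cookies: StoredCookie[];
}

// Hypothetical helper: true if at least one cookie is still valid at `nowSeconds`.
function hasLiveSession(state: SavedStorageState, nowSeconds: number): boolean {
  return state.cookies.some((c) => c.expires === -1 || c.expires > nowSeconds);
}

const savedState: SavedStorageState = {
  cookies: [
    { name: "session", expires: 2000 },
    { name: "old-token", expires: 500 },
  ],
};
console.log(hasLiveSession(savedState, 1000)); // "session" outlives t=1000
```

A check like this only catches expired cookies. A session revoked on the server still needs the failure-driven detection described in the list above.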
First-time setup
# 1. After qa:init, open the project in your AI editor
# 2. The AI will ask for credentials during Phase 1 (Plan)
# 3. Provide them once — they're saved to .auth/credentials.json
# 4. All future runs reuse the stored session
Manual setup
{
"username": "[email protected]",
"password": "your-password",
"url": "http://localhost:3000/login"
}
Save this as .auth/credentials.json. The AI detects it automatically on next run.
How the AI Agents Work
The template defines three specialized AI agents in .github/agents/:
Planner Agent
Triggers when the AI needs to create a test plan. The AI:
- Opens your app with Playwright MCP
- Explores navigation, flows, UI components
- Maps critical paths and edge cases
- Saves a structured test plan to specs/
- ⛔ Stops and presents the plan to you for approval
Generator Agent
Triggers when the AI needs to write test code. The AI:
- Reads the test plan
- Uses Playwright to manually execute each step
- Captures real selectors (data-testid, aria-label, role)
- Adds visual checkpoints via Applitools on critical pages (if APPLITOOLS_API_KEY is set)
- Writes complete, executable Playwright tests to tests/
- ⛔ Stops and presents the code to you for approval
Healer Agent
Triggers when tests fail. The AI:
- Runs only the failing tests
- Debugs with test_debug: sees the actual UI state
- Classifies the failure (selector, timing, or bug)
- ⛔ Stops and presents diagnosis + proposed fix for your approval
- Once approved, applies a targeted 1-3 line fix
- Re-runs once; if still failing, marks as test.fixme()
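The three-way classification (selector, timing, or bug) is done by the AI after inspecting the live UI, so no script reproduces it. Purely as an illustration of the categories, the sketch below applies a few keyword rules to an error message. The rules and the `classifyFailure` name are assumptions, and real error messages vary too much for keyword matching alone to be reliable.

```typescript
type FailureKind = "selector" | "timing" | "bug";

// Illustrative keyword rules only; checked in order, first match wins.
const FAILURE_RULES: Array<{ kind: FailureKind; keywords: string[] }> = [
  { kind: "selector", keywords: ["strict mode violation"] },
  { kind: "timing", keywords: ["timeout", "timed out"] },
];

function classifyFailure(errorMessage: string): FailureKind {
  const msg = errorMessage.toLowerCase();
  for (const rule of FAILURE_RULES) {
    if (rule.keywords.some((k) => msg.includes(k))) return rule.kind;
  }
  return "bug"; // nothing matched: treat as a potential application defect
}

console.log(classifyFailure("Error: strict mode violation: locator resolved to 2 elements"));
console.log(classifyFailure("Test timeout of 30000ms exceeded."));
```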
Writing Good User Stories
The AI reads your user stories to understand what to test. Write them in user-story/:
# User Login
**Story ID**: US-001
**Title**: User Login
**Feature**: Authentication
## Description
As a registered user, I want to log in with my credentials so I can access the dashboard.
## Preconditions
- User exists with email [email protected]
- Browser is on the login page
## Acceptance Criteria
1. Enter valid email and password
2. Click "Sign In"
3. Redirected to dashboard
4. User name displayed in header
Well-structured stories produce better test plans and better automated tests.
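Because the header fields follow a regular `**Key**: value` pattern, they are easy to extract mechanically. The sketch below shows one way to read them, using a hypothetical `parseStoryMeta` helper. The AI agent reads the story as plain text, so nothing in the template is known to depend on a parser like this.

```typescript
// Hypothetical parser for the `**Key**: value` header lines of a user story.
function parseStoryMeta(markdown: string): Map<string, string> {
  const meta = new Map<string, string>();
  // One match per line: bold key, colon, then the rest of the line as the value.
  const pattern = /^\*\*(.+?)\*\*:[ \t]*(.+)$/gm;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(markdown)) !== null) {
    const [, key = "", value = ""] = match;
    meta.set(key.trim(), value.trim());
  }
  return meta;
}

const storyText = [
  "# User Login",
  "**Story ID**: US-001",
  "**Title**: User Login",
  "**Feature**: Authentication",
  "## Description",
  "As a registered user, I want to log in with my credentials so I can access the dashboard.",
].join("\n");

const storyMeta = parseStoryMeta(storyText);
console.log(storyMeta.get("Story ID"), storyMeta.get("Feature"));
```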
Dashboard
npm run dashboard
# → http://localhost:4000
The dashboard provides:
- Multi-project management
- Execution history with pass/fail/healed stats
- Analytics charts (pass rate, duration, healing vs defects)
- Allure report integration
- Export to HTML / plain text
Self-Healing Protocol (AI-Driven)
This is not an automatic script. The AI agent follows this protocol:
| Attempt | What happens | If still fails |
|---------|-------------|----------------|
| 1 | Standard re-run | Classify failure |
| 2 | Longer timeout (60s) | Mark as defect |
| STOP | — | test.fixme() + classify |
The mechanical npm run qa:retry just re-runs. True healing (fixing selectors, adjusting waits, fixing assertions) is done by the AI agent using Playwright MCP.
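The escalation in the table can be read as a small decision function. The sketch below encodes it under the assumption that the only inputs are the number of failed attempts so far and the 60s timeout from the table. `nextHealStep` is a hypothetical name, and the real protocol is carried out by the AI agent rather than by code like this.

```typescript
type HealStep =
  | { action: "rerun"; timeoutMs?: number } // timeoutMs omitted means the configured default
  | { action: "fixme" };                    // stop retrying: mark test.fixme() and classify

// failedAttempts counts re-runs that have already failed for this test.
function nextHealStep(failedAttempts: number): HealStep {
  if (failedAttempts === 0) return { action: "rerun" };                   // attempt 1: standard re-run
  if (failedAttempts === 1) return { action: "rerun", timeoutMs: 60000 }; // attempt 2: longer timeout
  return { action: "fixme" };                                             // STOP row of the table
}

for (let failures = 0; failures <= 2; failures++) {
  console.log(failures, nextHealStep(failures));
}
```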
Human Supervision (You Are in Control)
The AI agent never acts without your approval. Every major phase follows this cycle:
AI proposes ──▶ You review ──▶ You approve ──▶ AI executes ──▶ You verify
| Phase | AI does | You approve |
|-------|---------|-------------|
| Plan | Explore app, write test plan | Review plan content |
| Generate | Write Playwright test code | Review selectors and logic |
| Heal | Diagnose failure, propose fix | Review the fix before it's applied |
The only exception: npm run qa:execute runs tests mechanically — you run this yourself or via CI.
The AI never:
- Generates tests without showing you the plan first
- Edits test code without showing you the diagnosis first
- Deploys, commits, or pushes without asking
You are the supervisor. The AI is your engineer.
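The propose, approve, act cycle can be pictured as a gate that refuses to run an action until its proposal has been approved. The sketch below is a conceptual model only: `ApprovalGate` is a hypothetical class, and in the template the gate is enforced by the agent prompts, not by code.

```typescript
// Conceptual model of the supervision cycle; not part of the template's scripts.
class ApprovalGate {
  // Maps a proposal id to whether it has been approved yet.
  private proposals = new Map<string, boolean>();

  propose(id: string): void {
    this.proposals.set(id, false);
  }

  approve(id: string): void {
    if (!this.proposals.has(id)) throw new Error(`No proposal "${id}" to approve`);
    this.proposals.set(id, true);
  }

  // Runs the action only if the proposal was approved; each approval is single-use.
  act<T>(id: string, action: () => T): T {
    if (this.proposals.get(id) !== true) throw new Error(`Proposal "${id}" has not been approved`);
    this.proposals.delete(id);
    return action();
  }
}

const supervision = new ApprovalGate();
supervision.propose("test-plan");
supervision.approve("test-plan");
console.log(supervision.act("test-plan", () => "generating tests"));
```

Making each approval single-use mirrors the rule that every phase needs its own sign-off: approving the plan does not pre-approve a later fix.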
Token Efficiency
The framework is designed to minimize token usage during AI-driven testing. These rules are embedded in the agent definitions and prompts:
Browser Navigation Rules
- Planner explores once: navigates the app in ONE session and saves all selectors to .qa-context/selectors.json.
- Verification pass (not re-exploration): the Generator reads selectors.json, then does a single lightweight batch session to verify selectors via locator.isVisible() checks (not full snapshots). This catches stale selectors from dynamic frameworks (Ionic, Angular) without re-exploring every page.
- test.stableSelectors flag: set to false in .qa-workflow.json for dynamic frameworks. The Generator always runs the verification pass when this is false.
- Snapshot depth ≤ 3: browser_snapshot capped at depth 3. Full DOM trees (depth 10+) can consume 50-100K tokens per snapshot.
- Batch all navigation: all page visits happen in one session per phase, never per scenario.
- Skip non-critical pages: static content, footers, about/legal pages are skipped.
- No screenshots during exploration: pages referenced by path only.
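To show why a depth cap shrinks a snapshot, the sketch below prunes a tree to a maximum depth. `SnapshotNode` is a hypothetical structure and `capDepth` an illustrative helper. How the browser_snapshot tool actually applies its depth limit is not described here.

```typescript
// Hypothetical snapshot tree; the real browser_snapshot output format may differ.
interface SnapshotNode {
  role: string;
  label?: string;
  children: SnapshotNode[];
}

// Keep nodes down to maxDepth (the root is depth 1) and drop everything deeper.
function capDepth(node: SnapshotNode, maxDepth: number, depth = 1): SnapshotNode {
  const children =
    depth >= maxDepth ? [] : node.children.map((c) => capDepth(c, maxDepth, depth + 1));
  return { ...node, children };
}

function maxDepthOf(node: SnapshotNode): number {
  return 1 + node.children.reduce((deepest, c) => Math.max(deepest, maxDepthOf(c)), 0);
}

// Build a single-branch tree with the given number of levels.
function chain(levels: number): SnapshotNode {
  return { role: "group", children: levels > 1 ? [chain(levels - 1)] : [] };
}

const deepTree = chain(6);
const cappedTree = capDepth(deepTree, 3);
console.log(maxDepthOf(deepTree), maxDepthOf(cappedTree));
```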
Selector Reuse
- Planner captures selectors once → saved to .qa-context/selectors.json
- Generator runs a single batch verification pass: lightweight isVisible() checks, not navigation per test
- Healer reads existing selectors + healing history → avoids re-discovering failed locators
- Every discovered or corrected selector is saved so no agent ever re-explores the same element
Execution & Healing
- Max 1 fix attempt per test — no endless retries
- Failures classified immediately (selector / timing / bug) — no ambiguous loops
- Targeted 1-3 line edits only — never rewrite entire files
- test.fixme() marks defects: no retrying known bugs
Context Management
- Error messages truncated to 200 chars
- Files cached in memory between reads
- Screenshots referenced by path, not embedded
- Directory listings done once per pipeline run
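Two of these rules are simple enough to sketch directly: truncating error messages to 200 characters and caching file contents between reads. The helper names (`truncateError`, `makeFileCache`) are hypothetical, and the loader is passed in as a parameter so the example stays independent of any file system API.

```typescript
const MAX_ERROR_CHARS = 200;

// Cut long messages to the limit, ending with an ellipsis so the cut is visible.
function truncateError(message: string, limit = MAX_ERROR_CHARS): string {
  return message.length <= limit ? message : message.slice(0, limit - 1) + "…";
}

// Read-through cache: `load` is only called the first time a path is requested.
function makeFileCache(load: (path: string) => string): (path: string) => string {
  const cache = new Map<string, string>();
  return (path) => {
    const hit = cache.get(path);
    if (hit !== undefined) return hit;
    const content = load(path);
    cache.set(path, content);
    return content;
  };
}

let loadCalls = 0;
const readCached = makeFileCache((path) => {
  loadCalls += 1;
  return `contents of ${path}`; // stand-in for a real file read
});
readCached("docs/application-context.md");
readCached("docs/application-context.md");
console.log(loadCalls, truncateError("x".repeat(500)).length);
```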
For details, see the agent files in .github/agents/ and the TOKEN & EFFICIENCY RULES section in prompts/QAe2eprompt.md.
