qaa-agent
v1.9.2
QA Automation Agent for Claude Code — multi-agent pipeline that analyzes repos, generates tests, validates, and creates PRs
QAA - QA Automation Agent
Multi-agent QA pipeline for Claude Code. Analyzes any codebase, generates a complete test suite following industry standards, validates everything, and delivers the result as a draft pull request.
scan → map → research → analyze → plan → generate → validate → deliver
The Problem
- Starting from zero is painful — a new project with no tests means weeks of setup
- Coverage gaps are invisible — without analysis, teams don't know what's missing until production breaks
- Standards drift — different team members write tests differently: inconsistent locators, vague assertions, mixed naming
- QA is always behind dev — features ship faster than tests get written
The Solution
QAA runs a pipeline of 12 specialized AI agents across eight stages:
| Stage | What happens | Output |
|-------|-------------|--------|
| Scan | Detects framework, language, testable surfaces | SCAN_MANIFEST.md |
| Research | Investigates testing ecosystem via Context7 MCP and official docs | TESTING_STACK.md, FRAMEWORK_CAPABILITIES.md |
| Map | Deep-scans codebase with 4 parallel agents (testability, risk, patterns, existing tests) | 8 codebase documents |
| Analyze | Produces risk assessment, test inventory, testing pyramid | QA_ANALYSIS.md, TEST_INVENTORY.md |
| Plan | Groups test cases by feature, assigns to files, resolves dependencies | GENERATION_PLAN.md |
| Generate | Writes test files, POMs, fixtures, configs following project standards | Test suite on disk |
| Validate | 4-layer validation (syntax, structure, dependencies, logic) with auto-fix | VALIDATION_REPORT.md |
| Deliver | Creates branch, commits per stage, opens draft PR | Pull request URL |
Install
npx qaa-agent
The interactive installer:
- Copies agents, commands, skills, templates, and workflows into your runtime directory
- Registers two MCP servers in your user-scope config (~/.claude.json) so they're available in all projects:
  - Playwright MCP: live browser control for E2E tests and locator extraction
  - Context7 MCP: up-to-date library documentation on demand
- Merges required permissions into settings.json
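The registration step can be pictured as a non-destructive merge into the `mcpServers` map. The sketch below is illustrative only: `McpServer`, `McpConfig`, and `mergeMcpServers` are hypothetical names, not the installer's actual code, and the rule that existing user entries win is an assumption. The two server entries are the ones given in the manual config in this README.

```typescript
// Hypothetical minimal shape of the mcpServers section of ~/.claude.json.
interface McpServer {
  command: string;
  args: string[];
}

interface McpConfig {
  mcpServers?: Record<string, McpServer>;
  [key: string]: unknown;
}

// Server entries as documented in the manual config section of this README.
const QAA_SERVERS: Record<string, McpServer> = {
  playwright: { command: "npx", args: ["@playwright/mcp@latest"] },
  context7: { command: "npx", args: ["-y", "@upstash/context7-mcp@latest"] },
};

// Assumed behaviour: keep every key the user already has and add only
// the servers that are missing. Unrelated config keys are left untouched.
function mergeMcpServers(existing: McpConfig): McpConfig {
  return {
    ...existing,
    mcpServers: { ...QAA_SERVERS, ...(existing.mcpServers ?? {}) },
  };
}
```

Under this assumption, running the merge twice yields the same result as running it once, which is a useful property for an installer that may be re-run.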
Supported runtimes: Claude Code, OpenCode
Install scope: Global (~/.claude/, available in all projects) or Local (./.claude/, this project only)
Requirements
- Node.js 18+
- Claude Code installed
Bundled MCP servers
Both MCP servers are registered automatically in ~/.claude.json when you run npx qaa-agent. No manual setup required — once installed, they're available in every Claude Code project on your machine.
Playwright MCP — live browser control
Uses @playwright/mcp to:
- Open a real browser and navigate your running app
- Extract actual locators (data-testid, ARIA roles, labels) from live pages
- Run E2E tests, capture failures, and auto-fix locator mismatches
- Build a persistent Locator Registry (.qa-output/locators/) that caches real locators across features
Context7 MCP — up-to-date library docs
Uses @upstash/context7-mcp to:
- Fetch the latest documentation for Playwright, Cypress, Jest, Vitest, pytest, and any other library the agent is working with
- Keep generated tests aligned with current framework APIs instead of outdated training data
- Free tier: ~60 requests/hour, ~3,300 tokens/query
Verifying the MCPs are connected
Open Claude Code in any project and type /mcp. You should see both playwright and context7 listed as connected.
Manual config (fallback)
If automatic registration fails, add the servers manually to ~/.claude.json:
{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": ["@playwright/mcp@latest"]
    },
    "context7": {
      "command": "npx",
      "args": ["-y", "@upstash/context7-mcp@latest"]
    }
  }
}
Quick Start
New project, no tests
/qa-start --dev-repo ./myproject --auto
Runs the full pipeline end-to-end: scan, map, analyze, plan, generate, validate, and deliver as a draft PR.
Mature project, new feature
/qa-map # build the "brain" (once)
/qa-create-test "password reset" # generate tests using codebase knowledge
/qa-pr --ticket PROJ-123 "password reset tests" # ship as draft PR
From a Jira ticket
/qa-from-ticket https://company.atlassian.net/browse/PROJ-456
/qa-pr --ticket PROJ-456 "login flow tests"
Fix broken tests after a deploy
/qa-fix ./tests/e2e/checkout*
/qa-pr --ticket PROJ-789 "fix checkout tests"
Commands
| Command | Purpose |
|---------|---------|
| /qa-start | Full pipeline end-to-end (scan through PR) |
| /qa-research | Research testing ecosystem via Context7 MCP |
| /qa-map | Deep codebase analysis with 4 parallel agents |
| /qa-create-test <feature> | Generate tests for a specific feature |
| /qa-fix [path] | Diagnose and fix broken tests |
| /qa-audit [path] | 6-dimension quality audit with scoring |
| /qa-pr | Create a draft pull request from QA artifacts |
| /qa-testid [path] | Inject data-testid attributes into components |
Additional Commands
| Command | Purpose |
|---------|---------|
| /qa-from-ticket <url> | Generate tests from a Jira/Linear/GitHub Issue |
| /qa-analyze | Analyze a repo without generating tests |
| /qa-validate [path] | Validate test files against standards |
| /qa-gap | Find coverage gaps between dev and QA repos |
| /qa-report | Generate a QA status report |
| /qa-audit | Full quality audit with weighted scoring |
| /qa-blueprint | Generate QA repo structure from scratch |
| /qa-research | Research best testing stack for a project |
| /qa-pom | Generate Page Object Models |
| /update-test | Improve existing tests incrementally |
Run any command in Claude Code to see full usage and available flags.
Three Workflows
QAA adapts to the project's QA maturity:
Option 1: No QA repo yet. Full pipeline from scratch. Produces a complete test suite, repo blueprint, and draft PR.
/qa-start --dev-repo ./myproject
Option 2: Immature QA repo. Scans both repos, fixes broken tests, fills coverage gaps, standardizes existing tests.
/qa-start --dev-repo ./myproject --qa-repo ./tests
Option 3: Mature QA repo. Surgical additions only. Finds thin coverage areas and adds targeted tests without touching working code.
/qa-start --dev-repo ./myproject --qa-repo ./tests
The "Brain": Codebase Map
Before generating anything, QAA maps the codebase with 4 parallel agents producing 8 documents:
| Focus | Documents |
|-------|-----------|
| Testability | TESTABILITY.md, TEST_SURFACE.md — what's testable, entry points, mock boundaries |
| Risk | RISK_MAP.md, CRITICAL_PATHS.md — business-critical paths, security-sensitive areas |
| Patterns | CODE_PATTERNS.md, API_CONTRACTS.md — naming conventions, API shapes, import style |
| Existing tests | TEST_ASSESSMENT.md, COVERAGE_GAPS.md — current quality, frameworks, gaps |
Every downstream agent reads these documents. The result: generated tests feel native to the codebase, not generic boilerplate.
Standards Enforced
Every generated artifact follows strict rules:
Testing Pyramid
       /    E2E    \        3-5%    (critical path smoke only)
      /     API     \       20-25%  (endpoints + contracts)
     /  Integration  \      10-15%  (component interactions)
    /      Unit       \     60-70%  (business logic, pure functions)
Locator Hierarchy
- Tier 1 (Best): data-testid, ARIA roles with accessible names
- Tier 2 (Good): Form labels, placeholders, visible text
- Tier 3 (Acceptable): Alt text, title attributes
- Tier 4 (Last Resort): CSS selectors, XPath, always with a // TODO comment
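The tier list can be read as a ranking function over candidate locators. The following is a minimal sketch of that idea, not QAA's internal logic: `LocatorKind`, `LocatorCandidate`, `pickBestLocator`, and `needsTodoComment` are hypothetical names introduced for illustration.

```typescript
// Hypothetical locator kinds, grouped to mirror the four tiers above.
type LocatorKind =
  | "testid" | "role"                 // Tier 1
  | "label" | "placeholder" | "text"  // Tier 2
  | "alt" | "title"                   // Tier 3
  | "css" | "xpath";                  // Tier 4

const TIER_BY_KIND: Record<LocatorKind, 1 | 2 | 3 | 4> = {
  testid: 1, role: 1,
  label: 2, placeholder: 2, text: 2,
  alt: 3, title: 3,
  css: 4, xpath: 4,
};

interface LocatorCandidate {
  kind: LocatorKind;
  value: string;
}

// Return the candidate with the lowest tier number.
// On a tie, the earlier candidate in the array is kept.
function pickBestLocator(candidates: LocatorCandidate[]): LocatorCandidate | undefined {
  let best: LocatorCandidate | undefined;
  for (const candidate of candidates) {
    if (best === undefined || TIER_BY_KIND[candidate.kind] < TIER_BY_KIND[best.kind]) {
      best = candidate;
    }
  }
  return best;
}

// Tier 4 locators are the ones that must carry a TODO comment.
function needsTodoComment(locator: LocatorCandidate): boolean {
  return TIER_BY_KIND[locator.kind] === 4;
}
```

Given a CSS selector, a form label, and a `data-testid` for the same element, `pickBestLocator` returns the `data-testid` candidate.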
Page Object Model
- One class per page, no god objects
- No assertions in POMs — assertions belong in test specs
- Locators as readonly properties
- Every POM extends a shared BasePage
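A sketch of what these four rules might look like in practice. The `Page` and `Locator` interfaces below are local stand-ins modeled on a small subset of Playwright's API so the example has no dependencies; the `LoginPage` class, its test IDs, and the `/login` path are hypothetical.

```typescript
// Local stand-ins for the subset of a browser-automation API used here.
interface Locator {
  fill(value: string): Promise<void>;
  click(): Promise<void>;
}

interface Page {
  goto(url: string): Promise<void>;
  getByTestId(testId: string): Locator;
}

// Shared base class: common navigation lives here, never assertions.
abstract class BasePage {
  protected abstract readonly path: string;

  constructor(protected readonly page: Page) {}

  async open(baseUrl: string): Promise<void> {
    await this.page.goto(baseUrl + this.path);
  }
}

// One class per page. Locators are readonly properties built from
// Tier 1 selectors (data-testid). The IDs are illustrative.
class LoginPage extends BasePage {
  protected readonly path = "/login";

  readonly emailInput: Locator;
  readonly passwordInput: Locator;
  readonly submitButton: Locator;

  constructor(page: Page) {
    super(page);
    this.emailInput = page.getByTestId("login-email-input");
    this.passwordInput = page.getByTestId("login-password-input");
    this.submitButton = page.getByTestId("login-submit-btn");
  }

  // Actions only: the calling spec asserts on the outcome.
  async login(email: string, password: string): Promise<void> {
    await this.emailInput.fill(email);
    await this.passwordInput.fill(password);
    await this.submitButton.click();
  }
}
```

Keeping `login` free of assertions means the same method can serve both a happy-path spec and an invalid-credentials spec, each with its own expectations.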
Assertion Quality
// Good — concrete values
expect(response.status).toBe(200);
expect(data.name).toBe('Test User');
// Bad — never do this
expect(response.status).toBeTruthy();
expect(data).toBeDefined();
Test Case IDs
Every test case has a unique ID following the pattern:
- UT-MODULE-001: unit tests
- INT-MODULE-001: integration tests
- API-RESOURCE-001: API tests
- E2E-FLOW-001: E2E tests
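An ID scheme like this is straightforward to check mechanically. The sketch below assumes the format is `PREFIX-SCOPE-NNN` with an uppercase alphanumeric scope and exactly three digits; the README only shows examples, so the exact grammar, along with the names `TEST_ID_PATTERN` and `parseTestId`, is an assumption.

```typescript
type TestLevel = "unit" | "integration" | "api" | "e2e";

// Prefixes taken from the ID examples above.
const PREFIX_TO_LEVEL: Record<string, TestLevel> = {
  UT: "unit",
  INT: "integration",
  API: "api",
  E2E: "e2e",
};

// Assumed grammar: PREFIX-SCOPE-NNN (scope uppercase alphanumeric, three digits).
const TEST_ID_PATTERN = /^(UT|INT|API|E2E)-([A-Z][A-Z0-9]*)-(\d{3})$/;

interface ParsedTestId {
  level: TestLevel;
  scope: string;
  sequence: number;
}

// Returns the parsed parts, or null when the ID does not match the pattern.
function parseTestId(id: string): ParsedTestId | null {
  const match = TEST_ID_PATTERN.exec(id);
  if (match === null) return null;
  const level = PREFIX_TO_LEVEL[match[1] ?? ""];
  if (level === undefined) return null;
  return { level, scope: match[2] ?? "", sequence: Number(match[3]) };
}
```

A check like this could run in a lint step to catch duplicate or malformed IDs before review.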
Validation
Generated tests pass through a 4-layer validation with auto-fix (up to 3 loops):
- Syntax — does it parse? Are imports correct?
- Structure — POM rules, file organization, naming conventions
- Dependencies — all imports resolve, mocks set up correctly
- Logic — assertions are concrete, locators follow tier hierarchy
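The check-and-repair cycle can be sketched as a bounded loop. This is an illustration under stated assumptions, not QAA's implementation: the `Validator` interface and `validateWithAutoFix` are hypothetical, and running all four layers on every pass (rather than stopping at the first failing layer) is a simplification.

```typescript
type Layer = "syntax" | "structure" | "dependencies" | "logic";
const LAYERS: readonly Layer[] = ["syntax", "structure", "dependencies", "logic"];

interface Issue {
  layer: Layer;
  message: string;
}

// Hypothetical hooks: `check` reports issues for one layer,
// `fix` returns a repaired version of the source.
interface Validator {
  check(layer: Layer, source: string): Issue[];
  fix(source: string, issues: Issue[]): string;
}

interface ValidationResult {
  source: string;
  remaining: Issue[];
  loops: number;
}

const MAX_FIX_LOOPS = 3; // "up to 3 loops" per the text above

function runAllLayers(source: string, validator: Validator): Issue[] {
  const issues: Issue[] = [];
  for (const layer of LAYERS) {
    issues.push(...validator.check(layer, source));
  }
  return issues;
}

// Validate, then alternate fix and re-validate until clean or out of loops.
function validateWithAutoFix(source: string, validator: Validator): ValidationResult {
  let current = source;
  let loops = 0;
  let issues = runAllLayers(current, validator);
  while (issues.length > 0 && loops < MAX_FIX_LOOPS) {
    current = validator.fix(current, issues);
    loops += 1;
    issues = runAllLayers(current, validator);
  }
  return { source: current, remaining: issues, loops };
}
```

Whatever is left in `remaining` after the last loop is what gets handed to failure classification.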
If issues remain, the Bug Detective classifies each failure:
| Classification | Action |
|----------------|--------|
| APPLICATION BUG | Flagged for developer — not auto-fixed |
| TEST CODE ERROR | Auto-fixed at HIGH confidence |
| ENVIRONMENT ISSUE | Documented with setup instructions |
| INCONCLUSIVE | Flagged with evidence for manual review |
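The table maps naturally onto a small decision function. In this sketch the four classifications come from the table; the `Action` names, the `Confidence` levels other than HIGH, and the fallback to manual review for a TEST CODE ERROR below HIGH confidence are assumptions made for illustration.

```typescript
type Classification =
  | "APPLICATION BUG"
  | "TEST CODE ERROR"
  | "ENVIRONMENT ISSUE"
  | "INCONCLUSIVE";

// Only HIGH is named in the table; MEDIUM and LOW are assumed levels.
type Confidence = "HIGH" | "MEDIUM" | "LOW";

// Hypothetical action labels corresponding to the Action column.
type Action = "flag-for-developer" | "auto-fix" | "document-setup" | "manual-review";

function decideAction(classification: Classification, confidence: Confidence): Action {
  switch (classification) {
    case "APPLICATION BUG":
      return "flag-for-developer"; // never auto-fixed
    case "TEST CODE ERROR":
      // Assumption: below HIGH confidence, fall back to manual review.
      return confidence === "HIGH" ? "auto-fix" : "manual-review";
    case "ENVIRONMENT ISSUE":
      return "document-setup";
    case "INCONCLUSIVE":
      return "manual-review";
  }
}
```

The key property is that an application bug never reaches the auto-fix branch, regardless of confidence.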
Framework Support
QAA auto-detects the project's existing stack and matches it:
Languages: JavaScript/TypeScript, Python, Java, .NET/C#, Go, Ruby, PHP, Rust
Test Frameworks: Playwright, Cypress, Jest, Vitest, pytest, Selenium, and more
Build Tools: Vite, Next.js, Nuxt, Angular, Vue, Webpack, SvelteKit
Git Platforms: GitHub, Azure DevOps, GitLab
Learning System
QAA remembers your preferences across sessions. When you correct it — "use Playwright, not Cypress" or "our branches start with feature/" — it saves the rule permanently to MY_PREFERENCES.md. Every agent reads your preferences before generating output.
Your team's conventions always win over defaults.
Architecture
qaa-agent/
  agents/        # 12 specialized QA agents
  commands/      # 7 slash commands (user-facing entry points)
  skills/        # 6 reusable skills
  templates/     # 10 artifact templates (output format contracts)
  workflows/     # 7 workflow orchestration specs
  bin/           # Installer and CLI tools
  docs/          # User documentation
  CLAUDE.md      # QA standards (read by every agent)
  .mcp.json      # Playwright + Context7 MCP server config
  settings.json  # Claude Code permissions
Agents
| Agent | Responsibility |
|-------|---------------|
| qa-scanner | Framework detection, file tree scanning |
| qa-codebase-mapper | 4-parallel-agent deep analysis |
| qa-analyzer | Risk assessment, test inventory, pyramid |
| qa-planner | Test case grouping, file assignment |
| qa-executor | Test file, POM, fixture generation |
| qa-validator | 4-layer validation with auto-fix |
| qa-e2e-runner | Browser-based test execution via Playwright MCP |
| qa-bug-detective | Failure classification with evidence |
| qa-testid-injector | data-testid attribute injection |
| qa-project-researcher | Testing stack research |
| qa-discovery | Project discovery |
| qa-pipeline-orchestrator | Pipeline coordination |
Git Workflow
QAA follows strict git conventions:
- Branch: qa/auto-{project}-{date} (e.g., qa/auto-shopflow-2026-03-18)
- Commits: One per agent stage, e.g. qa(scanner): produce SCAN_MANIFEST.md for shopflow
- PR: Draft PR with analysis summary, test counts, coverage metrics, validation status
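The naming conventions above are simple enough to express as two helpers. This is a minimal sketch: `branchName` and `commitMessage` are hypothetical function names, and the slug rule (lowercase, non-alphanumerics collapsed to hyphens) and the use of the UTC date are assumptions, since the README only gives one example of each format.

```typescript
// Build a branch name of the form qa/auto-{project}-{date}.
// Assumed slug rule: lowercase, runs of other characters become one hyphen.
function branchName(project: string, date: Date): string {
  const slug = project
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
  const day = date.toISOString().slice(0, 10); // YYYY-MM-DD, in UTC
  return `qa/auto-${slug}-${day}`;
}

// Build a commit subject of the form qa({agent}): {summary}.
function commitMessage(agent: string, summary: string): string {
  return `qa(${agent}): ${summary}`;
}
```

For the project and date in the example above, `branchName("ShopFlow", new Date("2026-03-18T12:00:00Z"))` produces `qa/auto-shopflow-2026-03-18`.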
Documentation
All documentation is included in the installed package under docs/, templates/, and CLAUDE.md.
License
MIT
