# sparq-assistant

AI-powered QA test generation for Claude Code, Cursor, and Codex — generate Playwright and Cypress E2E suites from Jira, Figma, and Confluence requirements.
```
   _____                  ____
  / ___/____  ____ ______/ __ \
  \__ \/ __ \/ __ `/ ___/ / / /
 ___/ / /_/ / /_/ / /  / /_/ /
/____/ .___/\__,_/_/   \___\_\
    /_/
```

**Spar[QA]ssistant** — ⚡ The spark your QA pipeline was missing: an AI assistant that writes E2E and manual tests like your best engineer.
Getting Started · Documentation · Scenarios · Examples · Architecture
Any requirement source → production-ready tests → any test platform. One command to rule them all.

```mermaid
flowchart LR
    subgraph Sources["Requirements"]
        direction TB
        J[" Jira "] ~~~ C[" Confluence "] ~~~ F[" Figma "] ~~~ T[" Plain text "]
    end
    subgraph SparQ["⚡ SparQ Pipeline"]
        direction TB
        A[" 1 · Analyze "] --> M[" 2 · Manual tests "]
        A --> E[" 2 · E2E code "]
        M --> V[" 3 · Validate "]
        E --> V
        V -->|"Below Threshold"| A
    end
    subgraph Output["Test Artifacts"]
        direction TB
        MT[" Manual tests <br/>HP · VE · SEC · EC · A11Y"] ~~~ E2E[" E2E specs <br/>Playwright / Cypress"] ~~~ PO[" Page objects <br/>Fixtures<br/>Helpers"]
    end
    subgraph Destinations["Test Platforms"]
        direction TB
        TR[" TestRail "] ~~~ Q[" Qase "] ~~~ Z[" Zephyr Scale "] ~~~ LF[" Local folder "]
    end
    Sources --> SparQ
    SparQ --> Output
    Output --> Destinations
```

SparQ reads your requirements, generates both manual test cases and E2E automation code that matches your existing codebase patterns, then exports everything to your test management platform with human approval at every checkpoint.
## Quick Start

Install with:

```bash
npx sparq-assistant@latest init
```

Then simply start:

```
/sparq:generate EP-14
```

That's it. SparQ installed 5 AI agents, 20 skills, and configured your MCP integrations — all in one command. Human approval at every step.
## Who Is This For?
🎯 QA Engineers — Generate comprehensive test suites from Jira tickets in minutes, not hours. Export directly to TestRail, Qase, or Zephyr Scale. Maintain traceability from requirements to test cases automatically.
💻 Developers — Get E2E coverage for your features without writing tests from scratch. SparQ reads your code, detects your patterns, and generates tests that fit your project. Review, approve, commit.
📊 Engineering Managers — Add `sparq lint --strict` to your CI pipeline for deterministic test quality gates. SARIF output integrates with GitHub Code Scanning. Track coverage with structured matrices.
Prerequisites: Node.js >= 22 and an AI coding assistant (Claude Code, Cursor, and Codex are the approved providers, but others should also work) · Full setup guide: docs/SETUP.md
## Features
🔍 Multi-source requirements Pull acceptance criteria from Jira, specs from Confluence, and UI elements from Figma in a single pass.
✍️ Manual test generation Structured test cases across all categories:
- HP (Happy Path) — core success scenarios and expected user flows
- VE (Validation & Error) — input validation, error states, boundary conditions
- SEC (Security) — authentication, authorization, injection, XSS
- EC (Edge Case) — unusual inputs, race conditions, empty states, limits
- A11Y (Accessibility) — screen reader, keyboard navigation, WCAG compliance
🤖 E2E code generation Playwright or Cypress page objects, fixtures, and specs that match your existing patterns exactly.
🔄 Unified pipeline Manual tests AND E2E automation in one command with `/sparq:generate`.
📤 Multi-TMS export Push test cases to TestRail, Qase, Zephyr Scale, or a local folder and publish CI run results back.
📏 18 lint rubrics Like ESLint for test quality — catches flaky patterns, weak locators, missing assertions. Zero AI inference. SARIF output for GitHub Code Scanning. What are rubrics?
🚀 Parallel test generation Agents generate tests concurrently, splitting work by feature scope and merging results automatically.
🧠 Context-optimized agents Every prompt, skill, and agent carefully tuned for minimal token usage and maximum output quality.
🧪 Test validation Detect broken selectors, stale flows, and coverage gaps after UI changes.
🐛 Bug regression Pass any bug ticket to `/sparq:generate-e2e` — the orchestrator auto-detects it and appends an inline regression test with a `REG-{ticket}-{NNN}` ID to the relevant feature spec (see the sketch after this list). Filter with `npx playwright test --grep "REG-"`.
🔀 PR-scoped generation Generate tests only for changed files in a pull request.
📊 Coverage iteration Automatically re-dispatches agents to fill coverage gaps until your target is met (default 80%).
✅ Checkpoint-driven Every phase requires human approval before proceeding — no surprises.
⚙️ Auto-detection Reads `package.json` to identify framework, UI library, test runner, and language.
🛡️ Graceful degradation Jira down? Paste text. Figma unavailable? SparQ greps your codebase for selectors. Never hard-fails.
📱 Viewport matrix Responsive presets for cross-breakpoint testing.
⚡ Performance testing k6, Artillery, Lighthouse CI, Web Vitals — from a single skill.
🔄 Resume from anywhere Interrupted mid-workflow? Resume picks up exactly where you left off.
📦 Zero dependencies Pure Node.js built-ins only.
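To make the generation features above concrete, here is a rough sketch of what a generated spec could look like — illustrative only: the test names, `REG-` ID, and `LoginPage` page object are hypothetical, not actual SparQ output:

```typescript
// e2e/specs/auth/login.spec.ts — hypothetical sketch of generated output
import { test, expect } from '@playwright/test';
import { LoginPage } from '../../pages';

test.describe('Login', () => {
  // HP (Happy Path) — core success scenario
  test('HP-001: user signs in with valid credentials', async ({ page }) => {
    const login = new LoginPage(page);
    await login.goto();
    await login.signIn('user@example.com', 'correct-horse');
    await expect(page).toHaveURL(/dashboard/);
  });

  // VE (Validation & Error) — empty input boundary
  test('VE-001: empty password shows an inline error', async ({ page }) => {
    const login = new LoginPage(page);
    await login.goto();
    await login.signIn('user@example.com', '');
    await expect(login.passwordError).toBeVisible();
  });

  // Inline regression test appended for a bug ticket (REG-{ticket}-{NNN})
  test('REG-BUG-42-001: session survives a page reload', async ({ page }) => {
    const login = new LoginPage(page);
    await login.goto();
    await login.signIn('user@example.com', 'correct-horse');
    await page.reload();
    await expect(page).toHaveURL(/dashboard/);
  });
});
```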
## ✨ What You Get
SparQ generates a clean, best-practice test structure:
```
e2e/
├── pages/
│   ├── login.page.ts
│   ├── checkout.page.ts
│   └── index.ts
│
├── steps/
│   ├── auth.steps.ts
│   ├── checkout.steps.ts
│   └── index.ts
│
├── fixtures/
│   ├── auth.fixture.ts
│   ├── checkout-data.fixture.ts
│   └── index.ts
│
├── specs/
│   ├── auth/
│   │   └── login.spec.ts
│   └── checkout/
│       └── order-flow.spec.ts
│
└── playwright.config.ts
```

- 🧩 Page objects with `get` accessors — clean, typed locator properties; no raw selectors scattered in tests (sketched below)
- 🔗 Shared fixtures & barrel exports — `index.ts` re-exports everything; import from one place, always
- 🗂️ Feature-scoped specs — tests organized by domain, not by file type
- 🔁 Extends your existing structure in-place — if `e2e/` already exists, SparQ adds to it without overwriting a single file
- 🗄️ Metadata isolated to `.sparq/` — workflow state, artifacts, and coverage data never touch your source tree

Cypress projects follow the same pattern with `cypress/support/pages/`, `cypress/e2e/`, and `.cy.ts` extensions.
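The `get` accessor pattern from the first bullet might look like this — a minimal sketch assuming Playwright, with hypothetical locator names:

```typescript
// e2e/pages/login.page.ts — hypothetical page object sketch
import type { Page, Locator } from '@playwright/test';

export class LoginPage {
  constructor(private readonly page: Page) {}

  // Typed locator properties — tests never touch raw selectors
  get emailInput(): Locator { return this.page.getByLabel('Email'); }
  get passwordInput(): Locator { return this.page.getByLabel('Password'); }
  get submitButton(): Locator { return this.page.getByRole('button', { name: 'Sign in' }); }
  get passwordError(): Locator { return this.page.getByTestId('password-error'); }

  async goto(): Promise<void> {
    await this.page.goto('/login');
  }

  async signIn(email: string, password: string): Promise<void> {
    await this.emailInput.fill(email);
    await this.passwordInput.fill(password);
    await this.submitButton.click();
  }
}
```

The barrel file (`e2e/pages/index.ts`) would then simply re-export it, e.g. `export { LoginPage } from './login.page';`, so specs import everything from one place.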
## Before & After
| Without SparQ | With SparQ |
|:---|:---|
| Read Jira ticket, cross-reference Confluence, open Figma | `/sparq:generate EP-14` — all sources fetched automatically |
| Study existing page objects and fixtures manually | Pattern-matched from your codebase |
| Write page object from scratch | Generated, extending your real base class |
| Write spec file with test cases | 5 test categories: HP, VE, SEC, EC, A11Y |
| Fix selectors, re-run, fix again | Validated against live DOM via Playwright MCP |
| Copy test cases into TestRail manually | `/sparq:export` — direct API push |
| Requirements changed? Rewrite tests | `/sparq:sync EP-14 e2e/specs/auth/` — auto-diffs and updates |
| Found a bug? Write regression test from scratch | `/sparq:generate-e2e BUG-42` — auto-detected as bug ticket, appended inline with `REG-` ID |
## 📏 What Are Rubrics?
Rubrics are automated quality checks that run without AI inference. Think ESLint rules, but for test quality. Each rubric is a pure JavaScript function — deterministic, auditable, runs in milliseconds.
Three categories, 18 rubrics total:
- FILE rubrics — flaky test detection, locator quality, assertion coverage, naming conventions, executability checks, regression compliance
- ARTIFACT rubrics — handoff schema validation, parallel merge integrity, resume state consistency
- MARKDOWN rubrics — coverage completeness, cross-output traceability, requirement coverage, template compliance
Why this matters: every finding is traceable to a specific pattern. No probabilistic output. Fully auditable. Runs in CI without model access.
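For intuition, a FILE rubric could be as simple as the following pure function — a hypothetical sketch, not SparQ's actual rule set; the rule name and finding shape are assumptions:

```typescript
// Hypothetical FILE rubric: flag hard-coded waits as a flaky-test smell.
interface Finding {
  rule: string;
  line: number;
  message: string;
  severity: 'critical' | 'warning';
}

// Pure and deterministic: same source in, same findings out — zero model calls.
export function noHardCodedWaits(source: string): Finding[] {
  const findings: Finding[] = [];
  source.split('\n').forEach((text, i) => {
    if (/waitForTimeout\s*\(|cy\.wait\s*\(\s*\d+/.test(text)) {
      findings.push({
        rule: 'no-hard-coded-waits',
        line: i + 1,
        message: 'Hard-coded wait — prefer web-first assertions or event-based waiting.',
        severity: 'critical',
      });
    }
  });
  return findings;
}
```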
```bash
sparq lint [path]                     # Human-readable output
sparq lint [path] --strict            # Fail CI on any critical finding
sparq lint [path] --format sarif      # SARIF 2.1.0 for GitHub Code Scanning
sparq lint [path] --threshold 85      # Fail below 85% quality score
sparq lint [path] --coverage-gate 90  # Fail if <90% of files pass
```

## Scenarios
| Command | What it does |
|:--------|:-------------|
| `/sparq:generate EP-14` | Manual tests + E2E code in one pipeline |
| `/sparq:generate-e2e EP-198` | E2E tests from requirements |
| `/sparq:validate e2e/specs/auth/` | Detect broken selectors and stale flows |
| `/sparq:generate-e2e BUG-42` | Inline regression test from a bug ticket (`REG-` ID) |
| `/sparq:export` | Push test cases to TestRail, Qase, or Zephyr |
Covers manual generation, manual-to-E2E conversion, E2E generation, validation, requirement sync, and result publishing. Full walkthrough: docs/SCENARIOS.md
## Works With Your AI Coding Assistant
SparQ auto-detects your AI editor by scanning for `.cursor/`, `.codex/`, or `.agents/` directories — no config required. All detected editors are installed simultaneously.
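Under the hood, directory-based detection can be as simple as the following sketch (illustrative only — the marker-to-editor mapping is an assumption, and SparQ's real logic may differ):

```typescript
// Hypothetical sketch: detect AI editors by their marker directories.
import { existsSync } from 'node:fs';
import { join } from 'node:path';

const EDITOR_MARKERS: Record<string, string> = {
  Cursor: '.cursor',
  Codex: '.codex',
  'Claude Code': '.agents', // assumption: which editor owns this marker
};

export function detectEditors(projectRoot: string): string[] {
  return Object.entries(EDITOR_MARKERS)
    .filter(([, marker]) => existsSync(join(projectRoot, marker)))
    .map(([editor]) => editor);
}
```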
| | Claude Code | Cursor | Codex |
|:---|:---:|:---:|:---:|
| Status | Full support | Full support | Full support |
| Invoke | /sparq:start | /sparq:start | Ask about SparQ |
```bash
npx sparq-assistant@latest init   # auto-detects all present AI editors
```

## How It Works
5 specialized AI agents in a phased pipeline — orchestrator classifies your request into one of 6 scenarios, dispatches agents to gather requirements, generate tests, and validate results. Every phase pauses for your approval.
| Phase | Agent | Role |
|:------|:------|:-----|
| Orchestration | sparq-orchestrator | Classifies scenario, dispatches agents |
| Phase 1 | requirements-analyst | Fetches from Jira, Confluence, Figma |
| Phase 2 | manual-test-writer | Generates structured manual tests |
| Phase 2 | automation-engineer | Generates E2E code (parallel with above) |
| Phase 3 | test-validator | Validates against live DOM, checks coverage |
Configurable model tiers (Premium/Balanced/Economy). Full architecture: docs/ARCHITECTURE.md
## Built for Context Efficiency
Every token costs money and burns context window. SparQ is architected from the start to use both carefully.
A naive QA pipeline dumps everything into one prompt, runs one giant agent, and hopes for the best. That approach burns through a 200K context window fast — and produces worse results, because a single overloaded agent drifts from its instructions as the conversation grows. SparQ takes the opposite approach: a structured multi-agent pipeline where every agent has exactly one job and receives only the context it needs to do it.
Focused agents, not monoliths. The requirements analyst reads requirements. The test writer writes tests. The validator validates. Focused prompts produce sharper outputs at lower token cost — a 15K-token sub-agent doing one thing outperforms a 60K-token general-purpose agent doing four.
Handoffs, not conversation forwarding. When the orchestrator dispatches a sub-agent, it sends a structured handoff capped at 3,000 tokens (~12KB) — not the full conversation history. Each agent starts fresh with only the data it needs. This is the single biggest lever for keeping output quality high across long workflows.
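As a rough illustration, a capped handoff could be modeled like this — a hypothetical sketch; the field names and helper are assumptions, not SparQ internals:

```typescript
// Hypothetical sketch: a structured handoff with a hard size cap.
interface Handoff {
  scenario: string;        // e.g. 'generate-e2e'
  requirements: string[];  // only the items this sub-agent needs
  codebaseHints: string[]; // detected patterns, base classes, fixtures
  outputContract: string;  // what the sub-agent must return
}

const MAX_HANDOFF_BYTES = 12 * 1024; // ~3,000 tokens ≈ 12KB

function assertWithinBudget(handoff: Handoff): Handoff {
  const size = Buffer.byteLength(JSON.stringify(handoff), 'utf8');
  if (size > MAX_HANDOFF_BYTES) {
    throw new Error(`Handoff too large (${size} bytes) — trim before dispatch.`);
  }
  return handoff; // the sub-agent starts fresh with only this payload
}
```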
Hard limits that protect quality. The orchestrator enforces concrete work-item caps: 40 requirements per workflow, 30 manual tests per batch, 20 E2E tests per batch. The orchestrator warns at 120K accumulated tokens and hard-stops at 150K. These limits aren't arbitrary — they're tuned to leave enough room for multi-phase workflows without letting any single phase crowd out the next.
Prompt budget discipline baked in. Agents and skills are written using lists over tables (roughly 30–40% token savings), XML-tagged sections for precise extraction, and mermaid diagrams instead of ASCII art for flows. The /sparq:prompt-optimizations skill is a living reference for compression patterns used throughout the codebase.
Model assignment by task type. Opus runs requirements analysis and orchestration — where nuanced judgment and multi-step reasoning matter most. Sonnet runs test writing and validation — where volume and speed dominate. Assigning the wrong model wastes tokens and degrades results.
Quality gates without LLM inference. The 18 rubrics in sparq lint are pure JavaScript — deterministic, auditable, and running in milliseconds. Checking whether a test suite covers all 5 categories, uses stable locators, or follows naming conventions requires zero model calls. No tokens spent on output you could verify with a regex.
The end result: faster turnaround, more consistent output, and lower API costs on every workflow run.
## Supported Stacks
| Category | Supported |
|:---------|:----------|
| AI Platforms | Claude Code, Cursor, Codex, ... (auto-detected) |
| Frameworks | Vue, React, Angular, Svelte (auto-detected) |
| UI Libraries | PrimeVue, Vuetify, Quasar, Element Plus, MUI, Ant Design, Headless UI |
| E2E Runners | Playwright (full generation), Cypress (full generation) |
| Languages | TypeScript, JavaScript (auto-detected) |
| TMS Providers | TestRail, Qase, Zephyr Scale, Local folder |
| OS | macOS, Linux, Windows |
## CLI Commands
```bash
npx sparq-assistant init        # Install agents, skills, MCP configs
npx sparq-assistant doctor      # Verify installation and MCP connections
npx sparq-assistant lint [path] # Deterministic quality rubrics (SARIF, CI-safe)
npx sparq-assistant update      # Update to latest definitions
npx sparq-assistant uninstall   # Remove all SparQ files
```

Also: `clean`, `help`, `coverage`. Global flags: `--dry-run`, `--workspace`. Full reference: docs/DAILY-USAGE.md
## Configuration
After `sparq-assistant init`, settings live in `sparq.config.json` — auto-generated by the setup wizard. It configures sources (Jira, Confluence, Figma), test output (directory, TMS provider), and preferences (model tier, checkpoint level).
Most users never edit this manually. Full schema: docs/SETUP.md
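For orientation, the config's shape might look roughly like this TypeScript sketch — the field names are inferred from this README and are assumptions; the authoritative schema lives in docs/SETUP.md:

```typescript
// Hypothetical shape of sparq.config.json, inferred from this README.
interface SparqConfig {
  sources: {
    jira?: { projectKey: string };
    confluence?: { spaceKey: string };
    figma?: { fileId: string };
  };
  output: {
    directory: string; // e.g. 'e2e/'
    tms: 'testrail' | 'qase' | 'zephyr' | 'local';
  };
  preferences: {
    modelTier: 'premium' | 'balanced' | 'economy';
    checkpointLevel: string; // approval-gate granularity
  };
}
```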
## MCP Integrations
Optional MCP servers — Atlassian (Jira + Confluence), Figma, Playwright, TestRail, Qase, Zephyr Scale. All auto-configured during `sparq init`.
When unavailable, SparQ degrades gracefully — falls back to user input, local files, or codebase analysis. Details: docs/SETUP.md
## Environment Variables
MCP servers authenticate via environment variables — credentials stay in your shell, never in config files (which store only `${PLACEHOLDER}` syntax resolved at runtime).
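The resolution step is conceptually simple — a minimal sketch, not SparQ's actual code:

```typescript
// Hypothetical sketch: expand ${VAR} placeholders from the environment at runtime.
export function resolvePlaceholders(configText: string): string {
  return configText.replace(/\$\{([A-Z0-9_]+)\}/g, (_match, name: string) => {
    const value = process.env[name];
    if (value === undefined) {
      throw new Error(`Missing environment variable: ${name}`);
    }
    return value;
  });
}

// e.g. resolvePlaceholders('{ "apiKey": "${TESTRAIL_API_KEY}" }')
```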
Which servers need setup:
- Atlassian (Jira + Confluence) and Figma — OAuth only; Claude Code handles auth on first connect. No env vars needed.
- Playwright — no credentials needed.
- TestRail — `TESTRAIL_BASE_URL`, `TESTRAIL_USERNAME`, `TESTRAIL_API_KEY`
- Qase — `QASE_API_TOKEN`
- Zephyr Scale — `ZEPHYR_API_TOKEN`, `ZEPHYR_PROJECT_KEY`

Recommended storage (most secure → quickest):

- macOS Keychain / Windows Credential Manager — OS-encrypted. Best for personal machines.
- 1Password / Bitwarden CLI — team-friendly, centralised, with audit trail and rotation.
- `direnv` + `.envrc` — project-scoped, auto-loads on `cd`. Recommended for teams.
- Isolated secrets file — `~/.sparq-secrets` with `chmod 600`, sourced in your shell profile.
- `.env` file — never commit it (already in `.gitignore`); load with `source .env && claude`.

Never commit real credentials. Only `${PLACEHOLDER}` values belong in `mcp/` files. Copy `.env.example` → `.env`, fill in your values.
CI/CD: set secrets in GitHub Actions (Settings → Secrets) or GitLab CI (Settings → Variables). Add `SPARQ_NO_UPDATE_CHECK=1` to skip the update check in pipelines.
Verify all MCP credentials: `npx sparq-assistant doctor` · Full credential guide: docs/SETUP.md
## 📚 Documentation
### Core Guides
🚀 Getting Started — Install SparQ and run your first workflow. Learn how SparQ auto-detects your project structure, walks you through 3 approval gates, and generates tests that match your exact patterns. Includes Jira-based, plain text, and bug regression walkthrough.
⚙️ Setup — Advanced MCP configuration, OAuth flows, CI/CD integration, and troubleshooting. Everything beyond the basic install.
📋 Daily Usage — The QA Engineer's command reference. Decision trees for "which command do I run?", S4-vs-S5 disambiguation, quick-paste workflows, and tips for power users.
🗺️ Scenarios — All scenarios mapped out with phase walkthroughs, checkpoint rules, and a composability matrix showing which workflows chain into full pipelines.
### Deep Dives
🏗️ Architecture — How 5 specialized AI agents orchestrate test generation. Agent hierarchy, MCP integration map, project auto-detection rules, and data flow from requirement to generated code.
⚠️ Limitations — Honest trade-offs and graceful degradation. MCP servers optional with documented fallbacks. Framework maturity levels, batch limits, and practical workarounds.
### Examples
- Unified generate — Jira to manual tests + E2E in one flow
- E2E generation — feature ticket — Jira to Playwright tests
- Bug regression — Bug ticket to regression spec
More examples covering all scenarios: examples/
## Contributing
Contributions welcome. Please open an issue or submit a PR.
```bash
npm run check   # lint + test — run before every commit
```