@exodus/xqa
v11.5.0
AI-powered QA CLI tool for autonomous mobile app testing
AI-powered QA agent CLI for Exodus applications.
Overview
xqa automates mobile app QA by connecting to physical devices or emulators and running intelligent exploration and spec-based testing. The CLI orchestrates the pipeline that spawns agents to interact with your app, capture screenshots, and generate findings based on user-defined specs or breadth-first exploration.
The tool manages configuration, project initialization, session state tracking, and interactive review workflows for triaging findings.
Commands
init
Initialize a new xqa project in the current directory.
Creates a .xqa/ directory with app.md and explore.md templates plus subdirectories for specs, designs, and suites. Installs bundled xqa skills.
xqa init
update
Update installed xqa skills to the current CLI version.
xqa update
explore [prompt]
Run the explorer agent; omit prompt for a full breadth-first sweep.
The optional prompt is a focus hint for the explorer agent; omit it to explore the entire app from the starting state. The run generates a findings JSON file in .xqa/output/ and prints its path on completion.
xqa explore # breadth-first exploration
xqa explore "test the login flow" # focused exploration
xqa explore -v prompt,screen # verbose output for categories
xqa explore -v # verbose output for all categories
xqa explore -t 600 # override explorer timeout (seconds)
xqa explore --debug # log timing and event details to stderr
xqa explore --udid ABCD1234 # target a specific booted simulator
xqa explore --visual # force the visual-quality pass on
Flags:
- -v, --verbose [categories]: Log categories (prompt, tools, screen, memory). Default: all if the flag is present without a value.
- -t, --timeout <seconds>: Explorer timeout in seconds (overrides agents.explorer.timeoutSeconds in .xqa/config.yaml).
- --debug: Log timing and event details to stderr.
- --udid <id>: Target simulator UDID. Overrides auto-detection of the first booted simulator; exits with code 2 if the UDID is not booted.
- --visual: Force the explorer visual-quality pass on, overriding agents.explorer.visual.enabled in .xqa/config.yaml. Without a designsDir it runs in no-designs heuristic mode (emits design-system-violation findings).
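The -v behaviour described above (all categories when the flag has no value, a subset when a comma-separated list is given) can be sketched as a small pure function. This is an illustrative sketch only, not the code in src/core/parse-verbose.ts: the function name, the shape of the raw value, and the choice to throw on unknown categories are assumptions.

```typescript
// Categories listed in the flag description above.
const VERBOSE_CATEGORIES = ["prompt", "tools", "screen", "memory"] as const;
type VerboseCategory = (typeof VERBOSE_CATEGORIES)[number];

function isVerboseCategory(value: string): value is VerboseCategory {
  return (VERBOSE_CATEGORIES as readonly string[]).indexOf(value) !== -1;
}

// Assumed input shape for an optional-value flag:
//   undefined -> flag absent, true -> flag present without a value,
//   string    -> comma-separated category list.
function parseVerbose(raw: string | boolean | undefined): VerboseCategory[] {
  if (raw === undefined || raw === false) return [];
  if (raw === true) return VERBOSE_CATEGORIES.slice();

  const requested = raw
    .split(",")
    .map((part) => part.trim())
    .filter((part) => part.length > 0);

  // Assumption: unknown categories are an error rather than silently ignored.
  const unknown = requested.filter((part) => !isVerboseCategory(part));
  if (unknown.length > 0) {
    throw new Error(`Unknown verbose categories: ${unknown.join(", ")}`);
  }
  return requested.filter(isVerboseCategory);
}
```

With this sketch, parseVerbose(true) yields all four categories and parseVerbose("prompt,screen") yields only those two.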
spec [spec-file]
Run the explorer agent against a spec file.
Loads a spec markdown file from .xqa/specs/ (or an absolute path) and executes the agent against it. Omit the argument to pick from available specs interactively.
xqa spec # interactive spec picker
xqa spec .xqa/specs/authentication.test.md # explicit spec file
xqa spec -v tools,memory # verbose output
xqa spec --debug # debug logging
Flags:
- -v, --verbose [categories]: Same as explore.
- --debug: Log timing and event details to stderr.
Spec file format (YAML frontmatter + markdown):
---
feature: 'Feature Name'
timeout: 300
---
# Spec content
Frontmatter fields: feature (required), timeout (optional, seconds).
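To make the frontmatter contract concrete, here is a minimal sketch of a parser that enforces it. It is not the implementation in src/spec-frontmatter.ts: it only understands flat key: value lines rather than full YAML, and the function name, result shape, and the mapping of each failure to MISSING_FRONTMATTER, MISSING_FIELD, or PARSE_ERROR are assumptions.

```typescript
type SpecFrontmatter = { feature: string; timeout?: number };

type SpecFrontmatterError =
  | { type: "MISSING_FRONTMATTER" }
  | { type: "MISSING_FIELD"; field: string }
  | { type: "PARSE_ERROR"; message: string };

type ParseResult =
  | { ok: true; value: SpecFrontmatter; body: string }
  | { ok: false; error: SpecFrontmatterError };

function fail(error: SpecFrontmatterError): ParseResult {
  return { ok: false, error };
}

function parseSpecFrontmatter(markdown: string): ParseResult {
  const lines = markdown.split(/\r?\n/);
  if ((lines[0] ?? "").trim() !== "---") return fail({ type: "MISSING_FRONTMATTER" });

  // Find the closing delimiter of the frontmatter block.
  let end = -1;
  for (let i = 1; i < lines.length; i++) {
    if ((lines[i] ?? "").trim() === "---") {
      end = i;
      break;
    }
  }
  if (end === -1) return fail({ type: "MISSING_FRONTMATTER" });

  // Simplification: flat "key: value" pairs only, with optional quotes.
  const fields: { [key: string]: string } = {};
  for (const line of lines.slice(1, end)) {
    if (line.trim() === "") continue;
    const match = /^([A-Za-z_][\w-]*):\s*(.*)$/.exec(line.trim());
    if (match === null) {
      return fail({ type: "PARSE_ERROR", message: `Cannot parse line: ${line}` });
    }
    const key = match[1] ?? "";
    const raw = (match[2] ?? "").trim();
    fields[key] = raw.replace(/^(['"])(.*)\1$/, "$2");
  }

  const feature = fields["feature"];
  if (feature === undefined || feature === "") {
    return fail({ type: "MISSING_FIELD", field: "feature" });
  }

  const value: SpecFrontmatter = { feature };
  const rawTimeout = fields["timeout"];
  if (rawTimeout !== undefined) {
    const timeout = Number(rawTimeout);
    if (!isFinite(timeout) || timeout <= 0) {
      return fail({ type: "PARSE_ERROR", message: `Invalid timeout: ${rawTimeout}` });
    }
    value.timeout = timeout;
  }
  return { ok: true, value, body: lines.slice(end + 1).join("\n") };
}
```

Returning a result object instead of throwing keeps the error codes inspectable by the caller, which fits the discriminated-union error style listed under Error Types.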
run
Run a test suite or a set of spec files in parallel across booted simulators.
Exactly one of --suite or --spec is required.
xqa run --suite smoke # run .xqa/suites/smoke.suite.json
xqa run --spec 'specs/**/*.test.md' # run matching spec files
xqa run --suite smoke --only spec-login # run a single work item by id
xqa run --suite smoke --debug # debug logging
xqa run --suite smoke --udid ABCD1234 # constrain the suite to one booted simulator
Flags:
- --suite <name>: Name of the suite (<name>.suite.json) under .xqa/suites/.
- --spec <globs...>: Glob patterns matching spec files, resolved from the xqa directory.
- --only <id>: Run only the work item with the given id (requires --suite). Ids are deterministic: spec-<name-without-specs-prefix> for specs, freestyle-<index> for freestyle entries. Hooks still run. Output still lands at output/suite/<suiteId>/<date>/<runId>/findings.json with the single item in items[].
- --debug: Log timing and event details to stderr.
- --udid <id>: Target simulator UDID. When supplied, the suite is constrained to that one simulator; exits with code 2 if the UDID is not booted.
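The flag rules above (exactly one of --suite or --spec, and --only only alongside --suite) amount to a small validation step. The sketch below illustrates that logic; the option and type names are hypothetical and the error messages are not taken from the CLI.

```typescript
// Hypothetical shape of the parsed `xqa run` options.
type RunOptions = { suite?: string; spec?: string[]; only?: string };

type RunTarget =
  | { kind: "suite"; name: string; only?: string }
  | { kind: "spec"; globs: string[] };

function resolveRunTarget(options: RunOptions): RunTarget {
  const hasSuite = options.suite !== undefined;
  const hasSpec = options.spec !== undefined && options.spec.length > 0;

  // Both set or both missing violates "exactly one of --suite or --spec".
  if (hasSuite === hasSpec) {
    throw new Error("Exactly one of --suite or --spec is required");
  }
  if (options.only !== undefined && !hasSuite) {
    throw new Error("--only requires --suite");
  }

  if (options.suite !== undefined) {
    return options.only === undefined
      ? { kind: "suite", name: options.suite }
      : { kind: "suite", name: options.suite, only: options.only };
  }
  return { kind: "spec", globs: options.spec ?? [] };
}
```

Resolving the options into a tagged RunTarget up front means later code never has to re-check which flag combination was given.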
plan
Generate or evolve the manual test plan for the current branch.
Inspects the git diff between the current branch and its upstream, asks the planner agent to emit Markdown scenario specs, and writes them to .xqa/test-plan/default/ (or a custom directory). Subcommands let you refine individual scenarios, append new scenarios after fresh commits, and correlate findings from a run against the plan.
xqa plan # generate scenarios from current diff
xqa plan --intent "login changes" --out .xqa/test-plan/my-slug
xqa plan --base develop # diff against a branch other than origin/HEAD
xqa plan edit .xqa/test-plan/my-slug/scenario-1.test.md --feedback "rename step 2"
xqa plan extend # append scenarios for fresh commits
xqa plan report --findings .xqa/output/.../findings.json --specs .xqa/test-plan/my-slug
Flags:
- --intent <text>: Optional focus hint passed to the planner.
- --out <dir>: Output directory for the generated scenarios (default: <xqa>/test-plan/default).
- --base <ref>: Base git ref to diff against. When omitted, xqa auto-detects the base from an open PR via gh pr view and falls back to origin/HEAD. Pass explicitly to override.
- --debug: Log base/head refs, diff summary, existing specs count, classification, the full prompt sent to the model, and the raw AI response to stderr. Useful for investigating model-abstained empty results.
Subcommands:
- xqa plan edit <file> --feedback <text>: apply user-requested edits to an existing scenario spec.
- xqa plan extend [--intent <text>] [--out <dir>]: append new scenarios for commits since the last plan was generated.
- xqa plan report --findings <path> [--specs <dir>]: correlate findings with scenarios and write report.json next to the plan.
What does it do?
xqa plan reads the branch diff, summarizes it, and feeds the context to the planner agent, which emits one Markdown scenario spec per suggested flow. The specs are written to the plan directory so you can review or hand them to xqa run. After running the scenarios, xqa plan report correlates the resulting findings back to each scenario so you can see which flows passed, which surfaced issues, and which were skipped. xqa plan edit lets you nudge a single scenario with natural-language feedback; xqa plan extend picks up commits added after the initial generation and appends new scenarios without touching the existing ones.
review [findings-path]
Review findings and mark false positives.
Interactive session for triaging findings generated by explore or spec runs. Mark findings as dismissed (with optional reason) or undo previous dismissals. Dismissals are written to dismissals.json next to the .xqa directory (override with run.dismissalsPath in .xqa/config.yaml). Defaults to the last findings path if omitted.
xqa review # use last findings file
xqa review .xqa/output/findings-abc123.json # explicit path
designs sync
Pull Figma page designs into the local designs directory.
Reads figma.pages from .xqa/config.yaml, fetches each Figma page via the REST API, classifies exported frames with the Haiku classifier, and writes approved PNGs to agents.explorer.visual.designsDir. Requires FIGMA_TOKEN and ANTHROPIC_API_KEY.
xqa designs sync # sync all configured pages (uses cache)
xqa designs sync --no-cache # bypass classifier cache and lastModified gate
designs rebuild
Force a full design sync from Figma.
Bypasses the cache and last-modified gate while preserving the manifest so stale managed files can be removed safely. If a previous sync quarantined a corrupt manifest, rebuild re-derives a new manifest from the configured pages.
xqa designs rebuild
completion
Output shell completion script.
Generates a completion script for bash or zsh. Redirect the output into your shell config file to enable tab completion.
xqa completion bash # generate bash completions
xqa completion zsh # generate zsh completions
Suite config
Suite files live at .xqa/suites/<name>.suite.json and declare the work items plus optional hooks.
{
  "specs": ["specs/send.test.md"],
  "freestyle": [{ "prompt": "explore settings", "timeoutSeconds": 300 }],
  "hooks": {
    "beforeEach": {
      "script": "qa/prepare-sim.mjs",
      "env": { "APP_PROFILE": "funded" },
      "timeoutSeconds": 120,
      "retries": 3
    }
  }
}
Fields:
- specs (optional): glob patterns resolved from the xqa directory.
- freestyle (optional): either a positive integer (N empty entries) or an array of { prompt?, timeoutSeconds } entries.
- At least one of specs or freestyle must resolve to a work item.
- hooks.beforeEach (optional): runs before every work item on every simulator. Use for project-owned setup (wallet provisioning, cache warming, login seeding).
The hook script is invoked as a Node child process. It receives:
- Inherited process.env
- Suite-declared env overlaid with reserved keys (reserved wins)
- Reserved xqa-owned keys: XQA_SIM_UDID, XQA_ITEM_ID, XQA_ITEM_TYPE, XQA_ITEM_NAME, XQA_SUITE, and (when item type is spec) XQA_SPEC_PATH
Suite-declared env cannot override reserved keys; the parser rejects such configs.
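The layering described above can be sketched as two pure functions: one that a config parser could use to reject reserved keys, and one that builds the final environment with reserved keys applied last. The function names and signatures are hypothetical; only the key names and the precedence order come from this README.

```typescript
// Reserved xqa-owned keys listed above.
const RESERVED_HOOK_KEYS = [
  "XQA_SIM_UDID",
  "XQA_ITEM_ID",
  "XQA_ITEM_TYPE",
  "XQA_ITEM_NAME",
  "XQA_SUITE",
  "XQA_SPEC_PATH",
] as const;

type Env = { [key: string]: string | undefined };

// Returns the suite-declared keys that collide with reserved keys.
// A parser would reject the config when this list is non-empty.
function findReservedOverrides(suiteEnv: { [key: string]: string }): string[] {
  return Object.keys(suiteEnv).filter(
    (key) => (RESERVED_HOOK_KEYS as readonly string[]).indexOf(key) !== -1,
  );
}

// Precedence, lowest to highest: inherited env, suite env, reserved keys.
function buildHookEnv(
  inherited: Env,
  suiteEnv: { [key: string]: string },
  reserved: { [key: string]: string },
): Env {
  const conflicts = findReservedOverrides(suiteEnv);
  if (conflicts.length > 0) {
    throw new Error(`Suite env overrides reserved keys: ${conflicts.join(", ")}`);
  }
  return { ...inherited, ...suiteEnv, ...reserved };
}
```

Passing the inherited environment in as an argument, rather than reading process.env inside the function, keeps the sketch testable without touching the real process.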
Contract:
- Exit 0 → proceed with item.
- Non-zero exit → item marked failed, executeItem skipped, counts toward simulator-unhealthy threshold.
- Default 120s timeout, overridable via hooks.beforeEach.timeoutSeconds.
- Default 3 retries on failure (HOOK_EXIT_NONZERO, HOOK_TIMEOUT, HOOK_SPAWN_FAILED), overridable via hooks.beforeEach.retries (range 0..10). Set to 0 to disable retries. Aborts (HOOK_ABORTED) are never retried.
- A HOOK_RETRY suite event is emitted before each retry attempt with attempt, maxAttempts, and previousErrorType.
- Honors the suite abort signal.
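The retry part of the contract above can be illustrated with a short loop. This is a sketch under stated assumptions, not the hook runner in src/suite/shell/: it is synchronous for brevity (a real hook is a child process), the function and type names are hypothetical, and it assumes maxAttempts counts the initial attempt and that attempt numbers start at 1.

```typescript
type HookErrorType =
  | "HOOK_SPAWN_FAILED"
  | "HOOK_EXIT_NONZERO"
  | "HOOK_TIMEOUT"
  | "HOOK_ABORTED";

type HookResult = { ok: true } | { ok: false; errorType: HookErrorType };

type HookRetryEvent = {
  type: "HOOK_RETRY";
  attempt: number;
  maxAttempts: number;
  previousErrorType: HookErrorType;
};

function runHookWithRetries(
  runOnce: () => HookResult,
  retries: number,
  emit: (event: HookRetryEvent) => void,
): HookResult {
  if (Math.floor(retries) !== retries || retries < 0 || retries > 10) {
    throw new RangeError("retries must be an integer in 0..10");
  }
  // Assumption: the initial attempt counts toward maxAttempts.
  const maxAttempts = retries + 1;

  let result = runOnce();
  for (let attempt = 2; attempt <= maxAttempts; attempt++) {
    // Success ends the loop; aborts are never retried.
    if (result.ok || result.errorType === "HOOK_ABORTED") return result;
    emit({
      type: "HOOK_RETRY",
      attempt,
      maxAttempts,
      previousErrorType: result.errorType,
    });
    result = runOnce();
  }
  return result;
}
```

With retries set to 0 the loop body never runs, so the hook is attempted exactly once, matching the "set to 0 to disable retries" rule.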
Configuration
Configuration splits in two: non-sensitive runtime settings in .xqa/config.yaml, secrets in the environment.
.xqa/config.yaml
xqa init writes this file with sensible defaults. It's the canonical home for agent toggles and tunables:
version: 1
run:
  # id: my-run
  # dismissalsPath: .xqa/dismissals.json
suites:
  directory: .xqa/suites
agents:
  explorer:
    enabled: true
    timeoutSeconds: 1200
    buildEnv: dev
    capabilities:
      videoRecording: false
      viewUiServer: true
      findingScreenshots: true
  consolidator:
    enabled: true
  triager:
    enabled: false
figma:
  pages:
    - https://www.figma.com/design/<fileKey>/My-App?node-id=1-2
  scale: 2
  classifierModel: claude-haiku-4-5-20251001
  maxConcurrentPages: 1
Field reference:
| Field | Default | Description |
| ------------------------------------------------- | --------------------------- | ------------------------------------------------------------------ |
| version | 1 | Config schema version. |
| run.id | (auto) | Fixed run ID. Omit for sequential per-run IDs. |
| run.dismissalsPath | .xqa/dismissals.json | Where xqa review persists dismissals. |
| suites.directory | .xqa/suites | Directory containing *.suite.json files. |
| agents.explorer.enabled | true | Runs the explorer agent. |
| agents.explorer.timeoutSeconds | 1200 | Wall-clock limit per explore/spec run. |
| agents.explorer.buildEnv | dev | dev or prod. dev ignores debug overlays as findings. |
| agents.explorer.capabilities.videoRecording | false | Records the simulator screen to MP4. |
| agents.explorer.capabilities.viewUiServer | true | Registers the view_ui MCP tool for reading the UI tree. |
| agents.explorer.capabilities.findingScreenshots | true | Writes per-finding PNGs. |
| agents.consolidator.enabled | true | Merges and deduplicates findings from every agent. |
| agents.triager.enabled | false | Runs the PR suite matcher. Needs GITHUB_TOKEN. |
| figma.pages | [] | Figma page URLs to sync (xqa designs sync). Needs FIGMA_TOKEN. |
| figma.scale | 2 | Export scale factor (0.01–4). |
| figma.classifierModel | claude-haiku-4-5-20251001 | Haiku model used to classify frames as UI designs. |
| figma.maxConcurrentPages | 1 | Reserved for future parallel sync; only 1 is accepted today. |
Capabilities
Each agent has a capabilities block of opt-in feature flags. Enabling a capability doesn't enable the agent — both enabled: true and capabilities.<name>: true are required.
The explorer's videoRecording capability records the simulator screen to an MP4 that the viewer app uses for playback.
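The "both flags are required" rule can be written as a one-line check. The type and function names below are hypothetical and only mirror the shape of the agents block in .xqa/config.yaml.

```typescript
// Hypothetical mirror of one agent entry in .xqa/config.yaml.
type AgentConfig = {
  enabled: boolean;
  capabilities?: { [name: string]: boolean };
};

// A capability is active only when the agent itself is enabled
// and the capability flag is explicitly true.
function isCapabilityActive(agent: AgentConfig, name: string): boolean {
  return agent.enabled && agent.capabilities?.[name] === true;
}
```

For example, an explorer with enabled: false and capabilities.videoRecording: true would record nothing under this rule.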
Visual analysis
Explorer can perform visual review in addition to structural exploration. Enable via agents.explorer.visual.enabled: true in .xqa/config.yaml. When designs are available, point designsDir at a directory of *.png artboards.
| visual.enabled | designsDir | Behavior |
| ---------------- | -------------- | ----------------------------------------------------------------------- |
| false | any | Structural only (default; identical to today's explorer). |
| true | set, non-empty | Full visual review with artboard comparison and read_artboard budget. |
| true | unset or empty | Visual review without artboard reference (design-system-blind). |
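The table above reads as a small decision function. The sketch below encodes it, together with the --visual flag described under explore. The mode names and input shape are hypothetical, and it assumes "non-empty" means designsDir contains at least one *.png artboard.

```typescript
type VisualMode =
  | "structural-only"
  | "full-visual-review"
  | "visual-without-artboards";

interface VisualInputs {
  enabledInConfig: boolean; // agents.explorer.visual.enabled
  forcedByFlag: boolean; // --visual on the command line
  // *.png files found in designsDir; undefined when designsDir is unset.
  artboardFiles: string[] | undefined;
}

function resolveVisualMode(inputs: VisualInputs): VisualMode {
  const enabled = inputs.enabledInConfig || inputs.forcedByFlag;
  if (!enabled) return "structural-only";

  const hasArtboards =
    inputs.artboardFiles !== undefined && inputs.artboardFiles.length > 0;
  return hasArtboards ? "full-visual-review" : "visual-without-artboards";
}
```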
agents:
  explorer:
    visual:
      enabled: true
      designsDir: .xqa/designs
      matchTolerance: balanced # strict | balanced | loose
      candidateCount: 3
      readArtboardImageTokenBudget: 80000
      # Per-run in-memory cache that skips visual_pass on screens already
      # analysed during this run. Every analysed screen's fingerprint is
      # kept for the lifetime of the run.
      cacheTolerance: strict # strict | balanced | loose; how lenient to be when calling two screens "the same"
Environment variables
Secrets stay in .env.local (loaded by dotenv) or your shell. Lock the file down:
chmod 600 .env.local
- ANTHROPIC_API_KEY (required): Anthropic Claude API key for agent reasoning
- FIGMA_TOKEN (required for xqa designs sync): Figma personal access token with file_read scope; format: figd_…
- GITHUB_TOKEN (optional): required for xqa triage
Video recording
videoRecording is an independent capability that records the simulator screen to an MP4 (used by the viewer app for playback):
agents:
  explorer:
    enabled: true
    capabilities:
      videoRecording: true
Migration from legacy env vars
Legacy QA_* and XQA_* environment variables are rejected at startup with a LEGACY_ENV_DETECTED error. Move their values into .xqa/config.yaml:
| Legacy env var | New config path |
| ---------------------------- | -------------------------------- |
| QA_RUN_ID | run.id |
| QA_EXPLORE_TIMEOUT_SECONDS | agents.explorer.timeoutSeconds |
| QA_BUILD_ENV | agents.explorer.buildEnv |
| QA_DISMISSALS_PATH | run.dismissalsPath |
| XQA_SUITES_DIR | suites.directory |
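A startup check based on the table above might look like the following sketch. It only checks the five variables listed in the table; whether xqa matches by exact name or by prefix is not stated here, and the function name and error payload shape are assumptions.

```typescript
// Mapping copied from the migration table above.
const LEGACY_ENV_MIGRATIONS = [
  { envVar: "QA_RUN_ID", configPath: "run.id" },
  { envVar: "QA_EXPLORE_TIMEOUT_SECONDS", configPath: "agents.explorer.timeoutSeconds" },
  { envVar: "QA_BUILD_ENV", configPath: "agents.explorer.buildEnv" },
  { envVar: "QA_DISMISSALS_PATH", configPath: "run.dismissalsPath" },
  { envVar: "XQA_SUITES_DIR", configPath: "suites.directory" },
];

type LegacyEnvError = {
  type: "LEGACY_ENV_DETECTED";
  migrations: { envVar: string; configPath: string }[];
};

// Returns an error describing every legacy variable that is set,
// or undefined when the environment is clean.
function detectLegacyEnv(
  env: { [key: string]: string | undefined },
): LegacyEnvError | undefined {
  const migrations = LEGACY_ENV_MIGRATIONS.filter(
    (entry) => env[entry.envVar] !== undefined,
  );
  return migrations.length > 0
    ? { type: "LEGACY_ENV_DETECTED", migrations }
    : undefined;
}
```

Including the target config path in the error lets the message tell the user exactly where each value should move.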
Architecture
Key files and directories:
- src/index.ts: CLI entry point; wires commander commands and manages graceful shutdown via process locks
- src/commands/: Command implementations (init, update, explore, spec, review, completion)
- src/suite/: Suite runner (config parsing, work-item building, worker pool, hooks)
- src/core/: Pure functions (completion generation, verbose/timeout option parsing, last-path tracking)
- src/shell/: I/O wrappers (app/explore context reading, debug logging, display factory, preflight, xqa directory discovery)
- src/config.ts, src/config-schema.ts: Configuration loading and validation with Zod
- src/review-session.ts: Interactive finding review loop with dismissal tracking
- src/spec-frontmatter.ts: Spec markdown frontmatter parsing (YAML)
- src/spec-slug.ts: Spec filename to slug derivation for output organization
- src/pid-lock.ts: Process-level mutual exclusion to prevent concurrent runs
Error Types
Core error discriminated unions:
- ConfigError: Configuration validation failed (INVALID_CONFIG)
- AppContextError: Failed to read app.md or explore.md (READ_FAILED)
- XqaDirectoryError: No .xqa directory found (XQA_NOT_INITIALIZED)
- SpecFrontmatterError: Malformed spec markdown (MISSING_FRONTMATTER, MISSING_FIELD, PARSE_ERROR)
- LastPathError: No findings path provided and no prior session (NO_ARG_AND_NO_STATE)
- SuiteConfigError: Suite config JSON malformed or schema-invalid (INVALID_SUITE_CONFIG)
- HookError: Suite hook failure (HOOK_SPAWN_FAILED, HOOK_EXIT_NONZERO, HOOK_TIMEOUT, HOOK_ABORTED)
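For readers unfamiliar with the pattern, a discriminated union lets the compiler check that every error code is handled. The sketch below models three of the errors listed above. The discriminant field name (type) and the message texts are assumptions; the real definitions in the source may carry more data.

```typescript
type XqaDirectoryError = { type: "XQA_NOT_INITIALIZED" };
type LastPathError = { type: "NO_ARG_AND_NO_STATE" };
type HookError =
  | { type: "HOOK_SPAWN_FAILED" }
  | { type: "HOOK_EXIT_NONZERO" }
  | { type: "HOOK_TIMEOUT" }
  | { type: "HOOK_ABORTED" };

type CliError = XqaDirectoryError | LastPathError | HookError;

function describeError(error: CliError): string {
  switch (error.type) {
    case "XQA_NOT_INITIALIZED":
      return "No .xqa directory found; run xqa init first.";
    case "NO_ARG_AND_NO_STATE":
      return "No findings path provided and no prior session.";
    case "HOOK_SPAWN_FAILED":
    case "HOOK_EXIT_NONZERO":
    case "HOOK_TIMEOUT":
    case "HOOK_ABORTED":
      return `Suite hook failed (${error.type}).`;
    default: {
      // Compile-time exhaustiveness check: adding a new error code
      // without a matching case makes this assignment fail to type-check.
      const unreachable: never = error;
      return unreachable;
    }
  }
}
```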
Development
Install dependencies:
pnpm install
Build the CLI:
pnpm run build
Run tests:
pnpm run test
Type check:
pnpm run typecheck
Lint and format:
pnpm run lint
pnpm run lint:fix
Full quality check (lint, typecheck, test):
pnpm run check
pnpm run check:fix
Watch mode (build + re-run on file changes):
pnpm run dev
Link binary globally (symlinks dist/xqa.cjs to ~/.local/bin/xqa):
pnpm run build:link
Unlink binary:
pnpm run build:unlink
Project Structure
src/
  index.ts # CLI entry point
  config.ts # Config loading and types
  config-schema.ts # Zod schema for env vars
  constants.ts # Tool lists and timeouts
  pid-lock.ts # Process exclusion lock
  spec-slug.ts # Spec file to slug conversion
  spec-frontmatter.ts # Spec YAML parsing
  review-session.ts # Interactive finding review loop
  commands/
    init-command.ts # Project initialization
    update-command.ts # Skill updates
    install-skills.ts # Bundled skill installer
    explore-command.ts # Breadth-first exploration
    spec-command.ts # Spec-based exploration
    spec-resolver.ts # Spec file discovery and parsing
    review-command.ts # Finding triage workflow
    completion-command.ts # Shell completion generation
    item-events.ts # Start/complete/fail event emitters
  core/
    parse-verbose.ts # Verbose flag parsing
    parse-timeout-seconds.ts # Timeout flag parsing
    completion-generator.ts # Bash/zsh completion script generation
    last-path.ts # Last findings path tracking
  shell/
    app-context.ts # Read app.md and explore.md
    xqa-directory.ts # Locate .xqa directory
    preflight.ts # Environment preflight checks
    display-factory.ts # Solo and suite display factories
    debug-logger.ts # Debug event logging
    debug-agent-events.ts # Agent event debug formatter
    debug-suite-events.ts # Suite event debug formatter
    debug-logger-core.ts # Pure logging helpers
    trigger-abort.ts # Abort signal plumbing
  suite/
    types.ts # Suite work-item and findings types
    core/ # Pure: config parser, item builders, hook env, queue
    shell/ # I/O: worker pool, hook runner, findings writer
    commands/ # run-command, execute-item, suite-run-context
  __tests__/
    *.test.ts # Test files co-located with src/