@exodus/xqa

v11.5.0

Published

6 hours ago

AI-powered QA CLI tool for autonomous mobile app testing

joshuabot

@exodus/xqa

AI-powered QA agent CLI for Exodus applications.

Overview

xqa automates mobile app QA by connecting to physical devices or emulators and running intelligent exploration and spec-based testing. The CLI orchestrates the pipeline that spawns agents to interact with your app, capture screenshots, and generate findings based on user-defined specs or breadth-first exploration.

The tool manages configuration, project initialization, session state tracking, and interactive review workflows for triaging findings.

Commands

init

Initialize a new xqa project in the current directory.

Creates a .xqa/ directory with app.md and explore.md templates plus subdirectories for specs, designs, and suites. Installs bundled xqa skills.

xqa init

update

Update installed xqa skills to the current CLI version.

xqa update

explore [prompt]

Run the explorer agent; omit prompt for a full breadth-first sweep.

prompt is an optional focus hint for the explorer agent; omit it to explore the entire app from the starting state. The run generates a findings JSON file in .xqa/output/ and prints its path on completion.

xqa explore                          # breadth-first exploration
xqa explore "test the login flow"    # focused exploration
xqa explore -v prompt,screen         # verbose output for categories
xqa explore -v                       # verbose output for all categories
xqa explore -t 600                   # override explorer timeout (seconds)
xqa explore --debug                  # log timing and event details to stderr
xqa explore --udid ABCD1234          # target a specific booted simulator
xqa explore --visual                 # force the visual-quality pass on

Flags:

-v, --verbose [categories] — Log categories (prompt, tools, screen, memory). Default: all if flag is present without value.
-t, --timeout <seconds> — Explorer timeout in seconds (overrides agents.explorer.timeoutSeconds in .xqa/config.yaml).
--debug — Log timing and event details to stderr.
--udid <id> — Target simulator UDID. Overrides auto-detection of the first booted simulator; exits with code 2 if the UDID is not booted.
--visual — Force the explorer visual-quality pass on, overriding agents.explorer.visual.enabled in .xqa/config.yaml. Without a designsDir it runs in no-designs heuristic mode (emits design-system-violation findings).
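The -v behavior described above (all categories when the flag has no value, a comma-separated subset otherwise) can be sketched as follows. This is a minimal illustration, not xqa's actual parser: the function name, the input shape, and the choice to silently drop unknown categories are all assumptions.

```typescript
// Hypothetical sketch; names are illustrative, not xqa's actual API.
const VERBOSE_CATEGORIES = ["prompt", "tools", "screen", "memory"] as const;
type VerboseCategory = (typeof VERBOSE_CATEGORIES)[number];

// `flag` is undefined when -v is absent, true when -v is passed without a
// value, and a comma-separated string otherwise.
function parseVerbose(flag: string | boolean | undefined): VerboseCategory[] {
  if (flag === undefined || flag === false) return [];
  if (flag === true) return [...VERBOSE_CATEGORIES];
  const known = new Set<string>(VERBOSE_CATEGORIES);
  return flag
    .split(",")
    .map((part) => part.trim())
    // Assumption: unknown category names are dropped rather than rejected.
    .filter((part): part is VerboseCategory => known.has(part));
}
```

With this sketch, `parseVerbose("prompt,screen")` yields the two named categories and `parseVerbose(true)` yields all four.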

spec [spec-file]

Run the explorer agent against a spec file.

Loads a spec markdown file from .xqa/specs/ (or an absolute path) and executes the agent against it. Omit the argument to pick from available specs interactively.

xqa spec                                      # interactive spec picker
xqa spec .xqa/specs/authentication.test.md    # explicit spec file
xqa spec -v tools,memory                      # verbose output
xqa spec --debug                              # debug logging

Flags:

-v, --verbose [categories] — Same as explore.
--debug — Log timing and event details to stderr.

Spec file format (YAML frontmatter + markdown):

---
feature: 'Feature Name'
timeout: 300
---

# Spec content

Frontmatter fields: feature (required), timeout (optional, seconds).
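To make the frontmatter rules concrete, here is a minimal sketch of a reader that enforces them and reports the error codes listed under Error Types. It is not xqa's implementation: it handles only flat `key: value` lines instead of full YAML, and the function name, result shape, and the choice to report a non-positive timeout as PARSE_ERROR are assumptions.

```typescript
// Hypothetical sketch; a flat key/value reader, not a real YAML parser.
type SpecFrontmatter = { feature: string; timeout?: number };
type FrontmatterResult =
  | { ok: true; value: SpecFrontmatter; body: string }
  | { ok: false; type: "MISSING_FRONTMATTER" | "MISSING_FIELD" | "PARSE_ERROR" };

function parseSpec(source: string): FrontmatterResult {
  // The frontmatter block must open on the first line and close with '---'.
  const match = /^---\r?\n([\s\S]*?)\r?\n---\r?\n?([\s\S]*)$/.exec(source);
  if (match === null) return { ok: false, type: "MISSING_FRONTMATTER" };
  const body = match[2] ?? "";
  const fields = new Map<string, string>();
  for (const line of (match[1] ?? "").split(/\r?\n/)) {
    if (line.trim() === "") continue;
    const colon = line.indexOf(":");
    if (colon === -1) return { ok: false, type: "PARSE_ERROR" };
    // Strip one pair of matching single or double quotes around the value.
    const value = line.slice(colon + 1).trim().replace(/^(['"])(.*)\1$/, "$2");
    fields.set(line.slice(0, colon).trim(), value);
  }
  const feature = fields.get("feature");
  if (feature === undefined || feature === "") return { ok: false, type: "MISSING_FIELD" };
  const rawTimeout = fields.get("timeout");
  if (rawTimeout === undefined) return { ok: true, value: { feature }, body };
  const timeout = Number(rawTimeout);
  if (!Number.isFinite(timeout) || timeout <= 0) return { ok: false, type: "PARSE_ERROR" };
  return { ok: true, value: { feature, timeout }, body };
}
```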

run

Run a test suite or a set of spec files in parallel across booted simulators.

Exactly one of --suite or --spec is required.

xqa run --suite smoke                         # run .xqa/suites/smoke.suite.json
xqa run --spec 'specs/**/*.test.md'           # run matching spec files
xqa run --suite smoke --only spec-login       # run a single work item by id
xqa run --suite smoke --debug                 # debug logging
xqa run --suite smoke --udid ABCD1234         # constrain the suite to one booted simulator

Flags:

--suite <name> — Name of the suite (<name>.suite.json) under .xqa/suites/.
--spec <globs...> — Glob patterns matching spec files, resolved from the xqa directory.
--only <id> — Run only the work item with the given id (requires --suite). Ids are deterministic: spec-<name-without-specs-prefix> for specs, freestyle-<index> for freestyle entries. Hooks still run. Output still lands at output/suite/<suiteId>/<date>/<runId>/findings.json with the single item in items[].
--debug — Log timing and event details to stderr.
--udid <id> — Target simulator UDID. When supplied, the suite is constrained to that one simulator; exits with code 2 if the UDID is not booted.
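The flag constraints above (exactly one of --suite or --spec, and --only only alongside --suite) amount to a small validation step. The sketch below is illustrative only; the option type, function name, and message strings are assumptions, not xqa's actual output.

```typescript
// Hypothetical sketch of the flag rules; messages are illustrative.
type RunOptions = { suite?: string; spec?: string[]; only?: string };

// Returns a list of problems; an empty list means the combination is valid.
function validateRunOptions(options: RunOptions): string[] {
  const problems: string[] = [];
  const hasSuite = options.suite !== undefined;
  const hasSpec = options.spec !== undefined && options.spec.length > 0;
  // Both set or both missing violates "exactly one".
  if (hasSuite === hasSpec) problems.push("exactly one of --suite or --spec is required");
  if (options.only !== undefined && !hasSuite) problems.push("--only requires --suite");
  return problems;
}
```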

plan

Generate or evolve the manual test plan for the current branch.

Inspects the git diff between the current branch and its upstream, asks the planner agent to emit Markdown scenario specs, and writes them to .xqa/test-plan/default/ (or a custom directory). Subcommands let you refine individual scenarios, append new scenarios after fresh commits, and correlate findings from a run against the plan.

xqa plan                                          # generate scenarios from current diff
xqa plan --intent "login changes" --out .xqa/test-plan/my-slug
xqa plan --base develop                           # diff against a branch other than origin/HEAD
xqa plan edit .xqa/test-plan/my-slug/scenario-1.test.md --feedback "rename step 2"
xqa plan extend                                   # append scenarios for fresh commits
xqa plan report --findings .xqa/output/.../findings.json --specs .xqa/test-plan/my-slug

Flags:

--intent <text> — Optional focus hint passed to the planner.
--out <dir> — Output directory for the generated scenarios (default: <xqa>/test-plan/default).
--base <ref> — Base git ref to diff against. When omitted, xqa auto-detects the base from an open PR via gh pr view and falls back to origin/HEAD. Pass explicitly to override.
--debug — Log base/head refs, diff summary, existing specs count, classification, the full prompt sent to the model, and the raw AI response to stderr. Useful for investigating model-abstained empty results.
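The base-ref precedence described for --base (explicit value first, then an open PR's base, then origin/HEAD) can be sketched as below. The function name and the injected `detectPrBase` callback are hypothetical; the callback stands in for the `gh pr view` lookup, whose exact invocation is not shown in this README.

```typescript
// Hypothetical sketch; `detectPrBase` stands in for the `gh pr view` lookup
// and returns undefined when no open PR is found.
function resolveBaseRef(
  explicitBase: string | undefined,
  detectPrBase: () => string | undefined,
): string {
  if (explicitBase !== undefined) return explicitBase;
  return detectPrBase() ?? "origin/HEAD";
}
```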

Subcommands:

xqa plan edit <file> --feedback <text> — apply user-requested edits to an existing scenario spec.
xqa plan extend [--intent <text>] [--out <dir>] — append new scenarios for commits since the last plan was generated.
xqa plan report --findings <path> [--specs <dir>] — correlate findings with scenarios and write report.json next to the plan.

What does it do?

xqa plan reads the branch diff, summarizes it, and feeds the context to the planner agent, which emits one Markdown scenario spec per suggested flow. The specs are written to the plan directory so you can review or hand them to xqa run. After running the scenarios, xqa plan report correlates the resulting findings back to each scenario so you can see which flows passed, which surfaced issues, and which were skipped. xqa plan edit lets you nudge a single scenario with natural-language feedback; xqa plan extend picks up commits added after the initial generation and appends new scenarios without touching the existing ones.

review [findings-path]

Review findings and mark false positives.

Starts an interactive session for triaging findings generated by explore or spec runs. Mark findings as dismissed (with an optional reason) or undo previous dismissals. Dismissals are written to .xqa/dismissals.json by default (override with run.dismissalsPath in .xqa/config.yaml). If findings-path is omitted, the last findings path is used.

xqa review                                     # use last findings file
xqa review .xqa/output/findings-abc123.json    # explicit path

designs sync

Pull Figma page designs into the local designs directory.

Reads figma.pages from .xqa/config.yaml, fetches each Figma page via the REST API, classifies exported frames with the Haiku classifier, and writes approved PNGs to agents.explorer.visual.designsDir. Requires FIGMA_TOKEN and ANTHROPIC_API_KEY.

xqa designs sync             # sync all configured pages (uses cache)
xqa designs sync --no-cache  # bypass classifier cache and lastModified gate

designs rebuild

Force a full design sync from Figma.

Bypasses the cache and last-modified gate while preserving the manifest so stale managed files can be removed safely. If a previous sync quarantined a corrupt manifest, rebuild re-derives a new manifest from the configured pages.

xqa designs rebuild

completion

Output shell completion script.

Generates a completion script for bash or zsh. Redirect the output into your shell config file to enable tab completion.

xqa completion bash  # generate bash completions
xqa completion zsh   # generate zsh completions

Suite config

Suite files live at .xqa/suites/<name>.suite.json and declare the work items plus optional hooks.

{
  "specs": ["specs/send.test.md"],
  "freestyle": [{ "prompt": "explore settings", "timeoutSeconds": 300 }],
  "hooks": {
    "beforeEach": {
      "script": "qa/prepare-sim.mjs",
      "env": { "APP_PROFILE": "funded" },
      "timeoutSeconds": 120,
      "retries": 3
    }
  }
}

Fields:

specs (optional) — glob patterns resolved from the xqa directory.
freestyle (optional) — either a positive integer (N empty entries) or an array of { prompt?, timeoutSeconds } entries.
At least one of specs or freestyle must resolve to a work item.
hooks.beforeEach (optional) — runs before every work item on every simulator. Use for project-owned setup (wallet provisioning, cache warming, login seeding).
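As a rough model of the field rules above, the sketch below normalizes the two accepted freestyle shapes and enforces the "at least one work item" rule. The type and function names are hypothetical, and it assumes spec globs have already been resolved to file paths elsewhere.

```typescript
// Hypothetical sketch; type and function names are illustrative.
type FreestyleEntry = { prompt?: string; timeoutSeconds: number };
type FreestyleItem = { prompt?: string; timeoutSeconds?: number };
type SuiteConfig = { specs?: string[]; freestyle?: number | FreestyleEntry[] };

// A positive integer N expands to N empty entries; an array is kept as-is.
function normalizeFreestyle(freestyle: SuiteConfig["freestyle"]): FreestyleItem[] {
  if (freestyle === undefined) return [];
  if (typeof freestyle === "number") {
    if (!Number.isInteger(freestyle) || freestyle <= 0) {
      throw new RangeError("freestyle must be a positive integer");
    }
    return Array.from({ length: freestyle }, () => ({}));
  }
  return freestyle.map((entry) => ({ ...entry }));
}

// `resolvedSpecPaths` is assumed to be the result of expanding `specs` globs.
function countWorkItems(resolvedSpecPaths: string[], config: SuiteConfig): number {
  const total = resolvedSpecPaths.length + normalizeFreestyle(config.freestyle).length;
  if (total === 0) throw new Error("suite must resolve to at least one work item");
  return total;
}
```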

The hook script is invoked as a Node child process. It receives:

Inherited process.env
Suite-declared env overlaid with reserved keys (reserved wins)
Reserved xqa-owned keys: XQA_SIM_UDID, XQA_ITEM_ID, XQA_ITEM_TYPE, XQA_ITEM_NAME, XQA_SUITE, and (when item type is spec) XQA_SPEC_PATH

Suite-declared env cannot override reserved keys — the parser rejects such configs.
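The layering above can be pictured as three object spreads, with a guard that rejects suite env entries naming a reserved key. This is a sketch under assumptions: the function names and the error message are hypothetical, and the real parser may report conflicts differently.

```typescript
// Hypothetical sketch of the env layering; names are illustrative.
type Env = Record<string, string | undefined>;

const RESERVED_KEYS = new Set([
  "XQA_SIM_UDID", "XQA_ITEM_ID", "XQA_ITEM_TYPE",
  "XQA_ITEM_NAME", "XQA_SUITE", "XQA_SPEC_PATH",
]);

// Lists suite-declared keys that collide with reserved xqa-owned keys.
function findReservedConflicts(suiteEnv: Record<string, string>): string[] {
  return Object.keys(suiteEnv).filter((key) => RESERVED_KEYS.has(key));
}

function buildHookEnv(
  inherited: Env,
  suiteEnv: Record<string, string>,
  reserved: Record<string, string>,
): Env {
  const conflicts = findReservedConflicts(suiteEnv);
  if (conflicts.length > 0) {
    throw new Error(`suite env overrides reserved keys: ${conflicts.join(", ")}`);
  }
  // Later spreads win: inherited < suite-declared < reserved.
  return { ...inherited, ...suiteEnv, ...reserved };
}
```

Inside the hook script itself, the reserved values would then be read from process.env, for example process.env.XQA_SIM_UDID.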

Contract:

Exit 0 → proceed with item.
Non-zero exit → item marked failed, executeItem skipped, counts toward simulator-unhealthy threshold.
Default 120s timeout, overridable via hooks.beforeEach.timeoutSeconds.
Default 3 retries on failure (HOOK_EXIT_NONZERO, HOOK_TIMEOUT, HOOK_SPAWN_FAILED), overridable via hooks.beforeEach.retries (range 0..10). Set to 0 to disable retries. Aborts (HOOK_ABORTED) are never retried.
A HOOK_RETRY suite event is emitted before each retry attempt with attempt, maxAttempts, and previousErrorType.
Honors the suite abort signal.
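The retry contract can be sketched as a small loop. This is not the actual hook runner: it is synchronous for brevity, `runOnce` and `emit` are hypothetical injected callbacks, and it assumes maxAttempts equals retries + 1 with attempt numbered from 1, which the text above does not spell out.

```typescript
// Hypothetical sketch; the real runner is presumably async and signal-aware.
type HookErrorType = "HOOK_EXIT_NONZERO" | "HOOK_TIMEOUT" | "HOOK_SPAWN_FAILED" | "HOOK_ABORTED";
type HookResult = { ok: true } | { ok: false; type: HookErrorType };
type RetryEvent = {
  type: "HOOK_RETRY";
  attempt: number;
  maxAttempts: number;
  previousErrorType: HookErrorType;
};

function runHookWithRetries(
  runOnce: () => HookResult,
  emit: (event: RetryEvent) => void,
  retries = 3,
): HookResult {
  if (!Number.isInteger(retries) || retries < 0 || retries > 10) {
    throw new RangeError("retries must be an integer in 0..10");
  }
  const maxAttempts = retries + 1; // assumption: first run plus N retries
  let result = runOnce();
  for (let attempt = 2; attempt <= maxAttempts; attempt += 1) {
    // Success ends the loop; aborts are never retried.
    if (result.ok || result.type === "HOOK_ABORTED") return result;
    emit({ type: "HOOK_RETRY", attempt, maxAttempts, previousErrorType: result.type });
    result = runOnce();
  }
  return result;
}
```

Passing `retries = 0` makes the loop body unreachable, so the hook runs exactly once.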

Configuration

Configuration splits in two: non-sensitive runtime settings in .xqa/config.yaml, secrets in the environment.

`.xqa/config.yaml`

xqa init writes this file with sensible defaults. It's the canonical home for agent toggles and tunables:

version: 1

run:
  # id: my-run
  # dismissalsPath: .xqa/dismissals.json

suites:
  directory: .xqa/suites

agents:
  explorer:
    enabled: true
    timeoutSeconds: 1200
    buildEnv: dev
    capabilities:
      videoRecording: false
      viewUiServer: true
      findingScreenshots: true

  consolidator:
    enabled: true

  triager:
    enabled: false

figma:
  pages:
    - https://www.figma.com/design/<fileKey>/My-App?node-id=1-2
  scale: 2
  classifierModel: claude-haiku-4-5-20251001
  maxConcurrentPages: 1

Field reference:

| Field | Default | Description |
| --- | --- | --- |
| version | 1 | Config schema version. |
| run.id | (auto) | Fixed run ID. Omit for sequential per-run IDs. |
| run.dismissalsPath | .xqa/dismissals.json | Where xqa review persists dismissals. |
| suites.directory | .xqa/suites | Directory containing *.suite.json files. |
| agents.explorer.enabled | true | Runs the explorer agent. |
| agents.explorer.timeoutSeconds | 1200 | Wall-clock limit per explore/spec run. |
| agents.explorer.buildEnv | dev | dev or prod. dev ignores debug overlays as findings. |
| agents.explorer.capabilities.videoRecording | false | Records the simulator screen to MP4. |
| agents.explorer.capabilities.viewUiServer | true | Registers the view_ui MCP tool for reading the UI tree. |
| agents.explorer.capabilities.findingScreenshots | true | Writes per-finding PNGs. |
| agents.consolidator.enabled | true | Merges and deduplicates findings from every agent. |
| agents.triager.enabled | false | Runs the PR suite matcher. Needs GITHUB_TOKEN. |
| figma.pages | [] | Figma page URLs to sync (xqa designs sync). Needs FIGMA_TOKEN. |
| figma.scale | 2 | Export scale factor (0.01–4). |
| figma.classifierModel | claude-haiku-4-5-20251001 | Haiku model used to classify frames as UI designs. |
| figma.maxConcurrentPages | 1 | Reserved for future parallel sync; only 1 is accepted today. |

Capabilities

Each agent has a capabilities block of opt-in feature flags. Enabling a capability doesn't enable the agent — both enabled: true and capabilities.<name>: true are required.
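The two-condition rule above reduces to a single boolean check. The sketch below is illustrative only; the `AgentConfig` shape and the function name are assumptions modeled on the config.yaml layout.

```typescript
// Hypothetical sketch; the config shape mirrors the YAML layout shown earlier.
type AgentConfig = { enabled: boolean; capabilities?: Record<string, boolean> };

// A capability is active only when the agent itself is enabled too.
function isCapabilityActive(agent: AgentConfig, name: string): boolean {
  return agent.enabled && agent.capabilities?.[name] === true;
}
```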

The explorer's videoRecording capability records the simulator screen to an MP4 that the viewer app uses for playback.

Visual analysis

Explorer can perform visual review in addition to structural exploration. Enable via agents.explorer.visual.enabled: true in .xqa/config.yaml. When designs are available, point designsDir at a directory of *.png artboards.

| visual.enabled | designsDir | Behavior |
| --- | --- | --- |
| false | any | Structural only (default; identical to today's explorer). |
| true | set, non-empty | Full visual review with artboard comparison and read_artboard budget. |
| true | unset or empty | Visual review without artboard reference (design-system-blind). |

agents:
  explorer:
    visual:
      enabled: true
      designsDir: .xqa/designs
      matchTolerance: balanced # strict | balanced | loose
      candidateCount: 3
      readArtboardImageTokenBudget: 80000
      # Per-run in-memory cache that skips visual_pass on screens already
      # analysed during this run. Every analysed screen's fingerprint is
      # kept for the lifetime of the run.
      cacheTolerance: strict # strict | balanced | loose — how lenient to be when calling two screens "the same"
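The three rows of the behavior table can be read as a small decision function. This sketch is an interpretation, not xqa's code: the mode names are hypothetical, and it treats "empty" as an unset or blank designsDir value, whereas the tool may instead check whether the directory contains any artboards.

```typescript
// Hypothetical sketch; mode names are illustrative labels for the table rows.
type VisualMode = "structural-only" | "full-visual" | "visual-without-artboards";

function resolveVisualMode(visual: { enabled: boolean; designsDir?: string }): VisualMode {
  if (!visual.enabled) return "structural-only";
  // Assumption: "empty" means the setting is unset or blank.
  const hasDesigns = visual.designsDir !== undefined && visual.designsDir.trim() !== "";
  return hasDesigns ? "full-visual" : "visual-without-artboards";
}
```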

Environment variables

Secrets stay in .env.local (loaded by dotenv) or your shell. Lock the file down:

chmod 600 .env.local

ANTHROPIC_API_KEY (required) — Anthropic Claude API key for agent reasoning
FIGMA_TOKEN (required for xqa designs sync) — Figma personal access token with file_read scope; format: figd_…
GITHUB_TOKEN (optional) — required only when the triager agent is enabled (agents.triager.enabled: true)

Video recording

videoRecording is an independent capability that records the simulator screen to an MP4 (used by the viewer app for playback):

agents:
  explorer:
    enabled: true
    capabilities:
      videoRecording: true

Migration from legacy env vars

Legacy QA_* and XQA_* environment variables are rejected at startup with a LEGACY_ENV_DETECTED error. Move their values into .xqa/config.yaml:

| Legacy env var | New config path |
| --- | --- |
| QA_RUN_ID | run.id |
| QA_EXPLORE_TIMEOUT_SECONDS | agents.explorer.timeoutSeconds |
| QA_BUILD_ENV | agents.explorer.buildEnv |
| QA_DISMISSALS_PATH | run.dismissalsPath |
| XQA_SUITES_DIR | suites.directory |
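A startup check along these lines could look like the sketch below. It covers only the five variables in the migration table; the function name and the shape of the returned error object are assumptions, and only the LEGACY_ENV_DETECTED code comes from the text above.

```typescript
// Hypothetical sketch; covers only the variables listed in the table.
const LEGACY_ENV_VARS: ReadonlyArray<{ name: string; configPath: string }> = [
  { name: "QA_RUN_ID", configPath: "run.id" },
  { name: "QA_EXPLORE_TIMEOUT_SECONDS", configPath: "agents.explorer.timeoutSeconds" },
  { name: "QA_BUILD_ENV", configPath: "agents.explorer.buildEnv" },
  { name: "QA_DISMISSALS_PATH", configPath: "run.dismissalsPath" },
  { name: "XQA_SUITES_DIR", configPath: "suites.directory" },
];

type LegacyEnvError = {
  type: "LEGACY_ENV_DETECTED";
  variables: { name: string; configPath: string }[];
};

// Returns undefined when no legacy variable is set in `env`.
function detectLegacyEnv(env: Record<string, string | undefined>): LegacyEnvError | undefined {
  const variables = LEGACY_ENV_VARS.filter((entry) => env[entry.name] !== undefined);
  return variables.length > 0 ? { type: "LEGACY_ENV_DETECTED", variables } : undefined;
}
```

The returned `configPath` values give enough information to print a message telling the user where each value now belongs.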

Architecture

Key files and directories:

src/index.ts — CLI entry point; wires commander commands and manages graceful shutdown via process locks
src/commands/ — Command implementations (init, update, explore, spec, review, completion)
src/suite/ — Suite runner: config parsing, work-item building, worker pool, hooks
src/core/ — Pure functions: completion generation, verbose/timeout option parsing, last-path tracking
src/shell/ — I/O wrappers: app/explore context reading, debug logging, display factory, preflight, xqa directory discovery
src/config.ts, src/config-schema.ts — Configuration loading and validation with Zod
src/review-session.ts — Interactive finding review loop with dismissal tracking
src/spec-frontmatter.ts — Spec markdown frontmatter parsing (YAML)
src/spec-slug.ts — Spec filename to slug derivation for output organization
src/pid-lock.ts — Process-level mutual exclusion to prevent concurrent runs

Error Types

Core error discriminated unions:

ConfigError — Configuration validation failed (INVALID_CONFIG)
AppContextError — Failed to read app.md or explore.md (READ_FAILED)
XqaDirectoryError — No .xqa directory found (XQA_NOT_INITIALIZED)
SpecFrontmatterError — Malformed spec markdown (MISSING_FRONTMATTER, MISSING_FIELD, PARSE_ERROR)
LastPathError — No findings path provided and no prior session (NO_ARG_AND_NO_STATE)
SuiteConfigError — Suite config JSON malformed or schema-invalid (INVALID_SUITE_CONFIG)
HookError — Suite hook failure (HOOK_SPAWN_FAILED, HOOK_EXIT_NONZERO, HOOK_TIMEOUT, HOOK_ABORTED)
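For readers unfamiliar with the pattern, a discriminated union lets the compiler check that every error code is handled. The sketch below models three of the error families listed above; the `type` discriminant field, the `CliError` name, and the message strings are assumptions rather than xqa's actual definitions.

```typescript
// Hypothetical sketch; the discriminant field name and messages are assumed.
type HookError = {
  type: "HOOK_SPAWN_FAILED" | "HOOK_EXIT_NONZERO" | "HOOK_TIMEOUT" | "HOOK_ABORTED";
};
type XqaDirectoryError = { type: "XQA_NOT_INITIALIZED" };
type LastPathError = { type: "NO_ARG_AND_NO_STATE" };
type CliError = HookError | XqaDirectoryError | LastPathError;

// Every code is handled, so the compiler accepts the function without a
// default branch; removing a case makes the missing return a type error.
function describeError(error: CliError): string {
  switch (error.type) {
    case "XQA_NOT_INITIALIZED":
      return "No .xqa directory found; run xqa init first.";
    case "NO_ARG_AND_NO_STATE":
      return "No findings path given and no prior session recorded.";
    case "HOOK_ABORTED":
      return "Hook aborted; it will not be retried.";
    case "HOOK_SPAWN_FAILED":
    case "HOOK_EXIT_NONZERO":
    case "HOOK_TIMEOUT":
      return "Hook failed; eligible for retry.";
  }
}
```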

Development

Install dependencies:

pnpm install

Build the CLI:

pnpm run build

Run tests:

pnpm run test

Type check:

pnpm run typecheck

Lint and format:

pnpm run lint
pnpm run lint:fix

Full quality check (lint, typecheck, test):

pnpm run check
pnpm run check:fix

Watch mode (build + re-run on file changes):

pnpm run dev

Link binary globally (symlinks dist/xqa.cjs to ~/.local/bin/xqa):

pnpm run build:link

Unlink binary:

pnpm run build:unlink

Project Structure

src/
  index.ts                    # CLI entry point
  config.ts                   # Config loading and types
  config-schema.ts            # Zod schema for env vars
  constants.ts                # Tool lists and timeouts
  pid-lock.ts                 # Process exclusion lock
  spec-slug.ts                # Spec file to slug conversion
  spec-frontmatter.ts         # Spec YAML parsing
  review-session.ts           # Interactive finding review loop

  commands/
    init-command.ts           # Project initialization
    update-command.ts         # Skill updates
    install-skills.ts         # Bundled skill installer
    explore-command.ts        # Breadth-first exploration
    spec-command.ts           # Spec-based exploration
    spec-resolver.ts          # Spec file discovery and parsing
    review-command.ts         # Finding triage workflow
    completion-command.ts     # Shell completion generation
    item-events.ts            # Start/complete/fail event emitters

  core/
    parse-verbose.ts          # Verbose flag parsing
    parse-timeout-seconds.ts  # Timeout flag parsing
    completion-generator.ts   # Bash/zsh completion script generation
    last-path.ts              # Last findings path tracking

  shell/
    app-context.ts            # Read app.md and explore.md
    xqa-directory.ts          # Locate .xqa directory
    preflight.ts              # Environment preflight checks
    display-factory.ts        # Solo and suite display factories
    debug-logger.ts           # Debug event logging
    debug-agent-events.ts     # Agent event debug formatter
    debug-suite-events.ts     # Suite event debug formatter
    debug-logger-core.ts      # Pure logging helpers
    trigger-abort.ts          # Abort signal plumbing

  suite/
    types.ts                  # Suite work-item and findings types
    core/                     # Pure: config parser, item builders, hook env, queue
    shell/                    # I/O: worker pool, hook runner, findings writer
    commands/                 # run-command, execute-item, suite-run-context

  __tests__/
    *.test.ts                 # Test files co-located with src/
