qaosmonkey
v0.1.1
Tech-agnostic agentic mobile exploratory testing CLI.
QAosMonkey
QAosMonkey is a tech-agnostic exploratory mobile testing agent. It drives iOS and Android emulators through agent-device, asks a configurable model what to try next, pauses when human help is required, and writes reproducible bug reports with screenshots and step traces.
Website: qaosmonkey.com
The current implementation is a working scaffold: it can already control an iOS simulator, collect accessibility snapshots, execute model decisions, persist run state, and generate reports.
Table of Contents
- Installation
- Requirements
- Find Your Simulator or Emulator Id
- First Verify QAosMonkey Works
- Smoke Test Your App
- When a Run Stops
- Guide What Must Be Tested
- Human Input During a Run
- Console Progress Logging
- Config Reference
- Supported Model Actions
- Device Driver Notes
- Useful Commands
- Publishing to npm
- Dependency Links
Installation
Install QAosMonkey from npm in the project where you want to run mobile smoke tests:
npm install --save-dev qaosmonkey
Then create a starter config:
npx qaosmonkey init
Run QAosMonkey with:
npx qaosmonkey run --config qaos-monkey.config.ts
Do not put -- between qaosmonkey and run when using npx. Use -- only with npm run qaosmonkey -- ... because that form is forwarding arguments through an npm script.
You can also run without adding it to package.json:
npx qaosmonkey@latest --help
Requirements
- Node.js 22.6 or newer.
- For iOS: Xcode with a booted iOS Simulator.
- For Android: a running Android emulator.
- agent-device, either installed globally or run through npx.
- One model provider:
  - a CLI command such as Codex CLI or Claude Code that reads a prompt from stdin and prints one JSON decision, or
  - an OpenAI-compatible or Anthropic API key.
Useful sanity checks:
node --version
npx agent-device --help
npx agent-device devices --platform ios --json
Find Your Simulator or Emulator Id
QAosMonkey needs a stable device identifier so agent-device controls the simulator/emulator you intend.
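If you want to script the lookup, the identifiers can be pulled out of the command output described in the two subsections that follow. This is a minimal sketch, assuming the text formats shown in the examples in this section. Both function names are hypothetical and are not part of QAosMonkey or agent-device.

```typescript
// Hypothetical helpers; they only understand the line formats shown in this README.

// Extracts UDIDs from lines such as "iPhone 16 Pro Max (UDID) (Booted)".
function parseBootedSimulatorUdids(simctlOutput: string): string[] {
  const pattern = /\(([0-9A-Fa-f]{8}(?:-[0-9A-Fa-f]{4}){3}-[0-9A-Fa-f]{12})\)\s+\(Booted\)/g;
  const udids: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(simctlOutput)) !== null) {
    udids.push(match[1]);
  }
  return udids;
}

// Extracts serials from `adb devices` lines such as "emulator-5554 device".
// Lines whose second column is not exactly "device" (headers, offline entries) are skipped.
function parseRunningEmulatorSerials(adbOutput: string): string[] {
  return adbOutput
    .split(/\r?\n/)
    .map((line) => line.trim().split(/\s+/))
    .filter((parts) => parts.length === 2 && parts[1] === "device")
    .map((parts) => parts[0]);
}
```

Feed the functions the captured stdout of `xcrun simctl list devices booted` or `adb devices`, then copy the first result into `device.id` and the `--udid` or `--serial` flag.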
iOS Simulator UDID
With an iOS simulator already booted, use either command:
npx agent-device devices --platform ios --json
xcrun simctl list devices booted
Look for the booted simulator entry. The UDID is the long UUID-like value.
Example:
iPhone 16 Pro Max (83E1501B-FFFD-4ACE-87D2-B80B8247D272) (Booted)
Use that value in config:
const simulatorUdid = "83E1501B-FFFD-4ACE-87D2-B80B8247D272";
device: {
command: ["npx", "agent-device", "--platform", "ios", "--udid", simulatorUdid],
id: simulatorUdid
}
Android Emulator Serial
With an Android emulator already running, use either command:
npx agent-device devices --platform android --json
adb devices
Look for the running emulator serial. It usually looks like emulator-5554.
Example:
emulator-5554 device
Use that value in config:
const emulatorSerial = "emulator-5554";
device: {
command: ["npx", "agent-device", "--platform", "android", "--serial", emulatorSerial],
id: emulatorSerial
}
First Verify QAosMonkey Works
Before pointing QAosMonkey at your own app, run the included iOS simulator smoke test. This does not need an LLM. It uses a deterministic local model fixture, opens iOS Settings, taps General, captures screenshots, and writes a report.
- Boot an iOS simulator.
xcrun simctl list devices booted
- Update the simulator UDID in qaos-monkey.ios-sim.config.ts if it differs from your booted simulator.
const simulatorUdid = "YOUR_BOOTED_SIMULATOR_UDID";
- Run the smoke test.
npm run qaosmonkey -- run --config qaos-monkey.ios-sim.config.ts
- Open the generated report.
ls .qaos-monkey/runs
cat .qaos-monkey/runs/<runId>/report.md
A successful run should show Status: finished, two visited screens, screenshots under screenshots/, and a trace containing tap then finish.
Smoke Test Your App
Use this path once the built-in Settings smoke test works.
- Create your config.
npm run qaosmonkey -- init
This creates qaos-monkey.config.ts. Edit it for your app.
- Set the platform and app id.
For iOS:
app: {
platform: "ios",
name: "My iOS App",
bundleId: "com.yourcompany.yourapp",
launchCommand: ["npx", "agent-device", "--platform", "ios", "--udid", "YOUR_SIMULATOR_UDID", "open", "com.yourcompany.yourapp"]
}
For Android:
app: {
platform: "android",
name: "My Android App",
packageName: "com.yourcompany.yourapp",
launchCommand: ["npx", "agent-device", "--platform", "android", "--serial", "YOUR_EMULATOR_SERIAL", "open", "com.yourcompany.yourapp"]
}
- Configure the device driver.
For iOS:
device: {
driver: "agent-device",
command: ["npx", "agent-device", "--platform", "ios", "--udid", "YOUR_SIMULATOR_UDID"],
id: "YOUR_SIMULATOR_UDID",
orientation: "portrait",
resetBeforeRun: false
}
For Android:
device: {
driver: "agent-device",
command: ["npx", "agent-device", "--platform", "android", "--serial", "YOUR_EMULATOR_SERIAL"],
id: "YOUR_EMULATOR_SERIAL",
orientation: "portrait",
resetBeforeRun: false
}
Find ids with:
npx agent-device devices --platform ios --json
npx agent-device devices --platform android --json
- Configure the model.
For a CLI provider, the command must read the full QAosMonkey prompt from stdin and print exactly one JSON action to stdout.
model: {
provider: "codex-cli",
command: ["codex", "exec", "--json", "--skip-git-repo-check"],
model: "gpt-5",
temperature: 0.4,
maxContextMessages: 30,
vision: true
}
If you see spawn codex ENOENT, the codex executable is not on the PATH visible to QAosMonkey. On macOS with the Codex desktop app, use the full executable path:
model: {
provider: "codex-cli",
command: ["/Applications/Codex.app/Contents/Resources/codex", "exec", "--json", "--skip-git-repo-check"],
model: "gpt-5",
temperature: 0.4,
maxContextMessages: 30,
vision: true
}
You can check what your shell sees with:
which codex
For an OpenAI-compatible API:
model: {
provider: "openai-compatible",
model: "gpt-4o",
apiKeyEnv: "OPENAI_API_KEY",
vision: true
}
For Anthropic:
model: {
provider: "anthropic",
model: "claude-3-5-sonnet-latest",
apiKeyEnv: "ANTHROPIC_API_KEY",
vision: true
}
- Add credentials if your app needs sign-in.
For local runs, copy the example env file and fill it with test credentials:
cp .env.example .env
QAOSMONKEY_ADMIN_EMAIL=[email protected]
QAOSMONKEY_ADMIN_PASSWORD=replace-me
Then reference those variables from config. Do not put the actual values in the config file.
credentials: {
envFile: ".env",
accounts: [
{
id: "admin",
description: "Admin test user. Can create projects, manage users, and access billing screens.",
fields: {
email: {
label: "Email",
env: "QAOSMONKEY_ADMIN_EMAIL"
},
password: {
label: "Password",
env: "QAOSMONKEY_ADMIN_PASSWORD",
sensitive: true
}
}
}
]
}
For CI/CD, set the same environment variables in your secret store and omit envFile, or leave it set to .env if the file exists only in local development. CI environment variables take precedence.
QAosMonkey gives credential values to the model so it can type them into login forms, but redacts sensitive values before writing state.json, state.jsonl, and reports.
- Tune the exploration limits for a smoke test.
Start small. A first run should be short, low-risk, and easy to inspect.
exploration: {
goal: "Smoke test login, navigation, empty states, and obvious broken screens.",
persona: "You are an autonomous Chaos Monkey QA agent. Be curious, adversarial, and systematic.",
mustTest: [
"Login must work with the configured test account.",
"Create post: after tapping Post, the new post should be immediately visible."
],
maxSteps: 20,
minimumStepsBeforeFinish: 8,
timeLimitSeconds: 600,
destructiveLevel: "low",
maxRepeatedScreenVisits: 4,
excludedActions: [],
excludedScreens: ["Payment", "Delete account"],
allowlistRiskyAreas: []
}
- Run QAosMonkey.
npm run qaosmonkey -- run --config qaos-monkey.config.ts
- Read the report.
cat .qaos-monkey/runs/<runId>/report.md
cat .qaos-monkey/runs/<runId>/state.json
Each run directory contains:
- report.md: human-readable report.
- report.json: machine-readable report.
- state.json: latest persisted run state.
- state.jsonl: append-only event trace for every step. This is the main per-step log file for debugging what QAosMonkey did.
- screenshots/: screenshots captured during the run.
By default all run artifacts are stored under .qaos-monkey/runs/<runId>/. You can change this with reporting.outputDir.
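For scripts that post-process a run, the layout above can be captured in a small helper. This is a sketch only: the file names come from the artifact list in this section, while `runArtifactPaths` and `parseJsonLines` are hypothetical names, and the shape of each state.jsonl event is deliberately left untyped because this README does not document it.

```typescript
// File names come from this README; the helper names are hypothetical.
interface RunArtifactPaths {
  reportMarkdown: string;
  reportJson: string;
  state: string;
  trace: string;
  screenshotsDir: string;
}

// Builds the artifact paths for one run under reporting.outputDir.
function runArtifactPaths(runId: string, outputDir = ".qaos-monkey/runs"): RunArtifactPaths {
  const runDir = `${outputDir.replace(/\/+$/, "")}/${runId}`;
  return {
    reportMarkdown: `${runDir}/report.md`,
    reportJson: `${runDir}/report.json`,
    state: `${runDir}/state.json`,
    trace: `${runDir}/state.jsonl`,
    screenshotsDir: `${runDir}/screenshots`,
  };
}

// state.jsonl holds one JSON document per line; blank lines are ignored.
function parseJsonLines(contents: string): unknown[] {
  return contents
    .split(/\r?\n/)
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as unknown);
}
```

Read the file at `runArtifactPaths(runId).trace` with your preferred file API and pass its contents to `parseJsonLines` to inspect the per-step events.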
When a Run Stops
QAosMonkey stops a run in these cases:
- The model returns {"action":"finish"} and the run has already reached minimumStepsBeforeFinish.
- maxSteps is reached.
- timeLimitSeconds is reached.
- The model asks for human input and the user responds with /abort.
- A paused run remains paused until you resume it with qaosmonkey resume <runId>.
The model is told to return finish only when it has reached the step budget, satisfied the goal and mustTest guidance, or sees no meaningful unexplored controls left. QAosMonkey does not blindly accept early finish: before minimumStepsBeforeFinish, it chooses a simple fallback exploratory action instead.
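The stop rules in this section can be summarized as a small decision function. This is an illustrative sketch, not the runner's actual code: the type and function names are hypothetical, and human-input pauses and /abort are left out for brevity.

```typescript
// Hypothetical types and names; the real runner's internals are not shown in this README.
interface StopLimits {
  maxSteps: number;
  minimumStepsBeforeFinish: number;
  timeLimitSeconds: number;
}

type NextStep = "stop" | "continue" | "fallback";

function decideNextStep(
  limits: StopLimits,
  stepsTaken: number,
  elapsedSeconds: number,
  modelAction: string,
): NextStep {
  // Hard caps always end the run.
  if (stepsTaken >= limits.maxSteps) return "stop";
  if (elapsedSeconds >= limits.timeLimitSeconds) return "stop";
  if (modelAction === "finish") {
    // An early finish is replaced by a fallback exploratory action.
    return stepsTaken >= limits.minimumStepsBeforeFinish ? "stop" : "fallback";
  }
  return "continue";
}
```

With maxSteps 20, minimumStepsBeforeFinish 8, and timeLimitSeconds 600, a finish returned at step 3 yields "fallback", while the same action at step 8 yields "stop".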
Guide What Must Be Tested
Use exploration.goal for broad intent and exploration.mustTest for specific coverage the agent should prioritize. The items are natural language on purpose: describe what must be true, not the exact taps.
exploration: {
goal: "Explore the social app with emphasis on auth, posting, profile visibility, and destructive edge cases.",
mustTest: [
"The following features must at least work: login, sign up, create post, delete post.",
"When creating a post, make sure the post is immediately visible after tapping Post.",
"When blocking a user, make sure the blocked user can no longer see the blocker profile."
]
}
QAosMonkey still chooses the concrete path itself, so it can handle different UI layouts and continue exploratory testing around the required checks.
Human Input During a Run
If the model encounters something it cannot solve itself, it can return ask_human. QAosMonkey pauses and prompts you in the terminal.
You can respond with:
- normal text, such as an OTP, email link, test credential, or instruction.
- /resolved after you manually fix the blocker in the simulator.
- /skip to continue without solving the blocker.
- /abort to stop the run and still write artifacts.
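The response types above map naturally onto a tagged union. The sketch below is hypothetical and not taken from QAosMonkey's source: the kind names mirror the ask_human options listed under Supported Model Actions, and case-insensitive matching of the slash commands is an assumption.

```typescript
// Hypothetical parser; the kind names mirror the ask_human options in this README.
type HumanReply =
  | { kind: "provided"; text: string }
  | { kind: "resolved" }
  | { kind: "skip" }
  | { kind: "abort" };

function parseHumanReply(input: string): HumanReply {
  const text = input.trim();
  // Assumption: slash commands are matched case-insensitively.
  switch (text.toLowerCase()) {
    case "/resolved":
      return { kind: "resolved" };
    case "/skip":
      return { kind: "skip" };
    case "/abort":
      return { kind: "abort" };
    default:
      // Anything else is passed to the model as free text, such as an OTP.
      return { kind: "provided", text };
  }
}
```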
Resume a paused run with:
npm run qaosmonkey -- resume <runId> --config qaos-monkey.config.ts
Console Progress Logging
During run and resume, QAosMonkey prints progress lines with the [qaosmonkey] prefix. These show the current step, screen signature, number of interactive refs, when the model is being called, what action the model chose, whether the device action is executing, and the result.
CLI model providers such as Codex CLI and Claude Code also stream lightweight progress while they are running. If the model command takes a long time, QAosMonkey prints a heartbeat every 10 seconds so you can tell it is still waiting rather than hung.
Configured credential secrets are redacted from runner logs. CLI provider stream logs intentionally summarize model decisions without printing typed values.
To silence progress logs:
QAOSMONKEY_QUIET=1 npm run qaosmonkey -- run --config qaos-monkey.config.ts
Config Reference
The config file exports a plain config object. Start from qaos-monkey.config.example.ts or generate qaos-monkey.config.ts with npm run qaosmonkey -- init.
app
Describes the app under test and how QAosMonkey should launch it.
app: {
platform: "ios",
name: "My App",
bundleId: "com.example.ios",
packageName: "com.example.android",
launchCommand: ["npx", "agent-device", "--platform", "ios", "--udid", "UDID", "open", "com.example.ios"],
installCommand: ["npx", "agent-device", "--platform", "ios", "install", "com.example.ios", "./MyApp.app"]
}
- platform: ios or android.
- name: optional human-readable app name used in reports and config clarity.
- bundleId: iOS bundle id, such as com.company.app.
- packageName: Android package name, such as com.company.app.
- launchCommand: optional explicit command used before a run starts. Prefer this when you need custom agent-device flags or deep links.
- installCommand: optional command reserved for app installation workflows. The current runner stores it in config but does not automatically execute it yet.
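If you maintain configs for both platforms, the launchCommand arrays shown in this README can be generated rather than typed by hand. This is a sketch under that assumption: `buildLaunchCommand` and `AppTarget` are hypothetical names, and the flags are copied from the examples in this document, not checked against every agent-device version.

```typescript
// Hypothetical helper that rebuilds the launchCommand arrays shown in this README.
interface AppTarget {
  platform: "ios" | "android";
  bundleId?: string;
  packageName?: string;
}

function buildLaunchCommand(app: AppTarget, deviceId: string): string[] {
  // iOS uses bundleId, Android uses packageName.
  const appId = app.platform === "ios" ? app.bundleId : app.packageName;
  if (!appId) {
    throw new Error(`Missing app id for platform ${app.platform}`);
  }
  const deviceFlag = app.platform === "ios" ? "--udid" : "--serial";
  return ["npx", "agent-device", "--platform", app.platform, deviceFlag, deviceId, "open", appId];
}
```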
device
Controls which automation driver and simulator/emulator QAosMonkey uses.
device: {
driver: "agent-device",
command: ["npx", "agent-device", "--platform", "ios", "--udid", "UDID"],
id: "UDID",
orientation: "portrait",
resetBeforeRun: false,
commandMap: {
snapshot: ["snapshot", "-i"],
tap: ["click"],
type: ["fill"],
press_back: ["back"],
screenshot: ["screenshot"],
logs: ["logs"],
launch: ["open"]
}
}
- driver: agent-device or maestro.
- command: base command prepended to all driver actions. Use this to bind platform, UDID, serial, session, or config flags.
- id: optional simulator/emulator/device id for documentation and future driver behavior.
- orientation: portrait or landscape. The current runner records this but does not rotate automatically yet.
- resetBeforeRun: reserved for app-reset behavior. The current runner records this but does not reset automatically yet.
- commandMap: optional overrides for driver subcommands if your local tool version differs from QAosMonkey defaults.
model
Chooses who decides the next action.
model: {
provider: "openai-compatible",
model: "gpt-4o",
baseUrl: "https://api.openai.com/v1",
apiKeyEnv: "OPENAI_API_KEY",
command: ["codex", "exec", "--json"],
temperature: 0.4,
maxContextMessages: 30,
vision: true
}
- provider: one of openai-compatible, anthropic, codex-cli, or claude-code.
- model: provider-specific model name.
- baseUrl: optional API base URL. Useful for OpenAI-compatible gateways or self-hosted endpoints.
- apiKeyEnv: environment variable containing the provider API key. Defaults are OPENAI_API_KEY and ANTHROPIC_API_KEY.
- command: command used by CLI providers. It receives the QAosMonkey prompt on stdin and must print one JSON decision on stdout.
- temperature: provider sampling temperature.
- maxContextMessages: number of recent QAosMonkey steps included in each model prompt.
- vision: when true, API providers include screenshots when available.
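The apiKeyEnv fallback can be pictured as below. This is a hedged sketch, not QAosMonkey's implementation: `resolveApiKey` is a hypothetical name, and the mapping of OPENAI_API_KEY to openai-compatible and ANTHROPIC_API_KEY to anthropic is an assumption inferred from the two defaults named above.

```typescript
// Hypothetical helper; only the two default variable names come from this README.
type ApiProvider = "openai-compatible" | "anthropic";

function resolveApiKey(
  provider: ApiProvider,
  env: Record<string, string | undefined>,
  apiKeyEnv?: string,
): string {
  // Assumption: each API provider falls back to its own conventional variable.
  const defaultName = provider === "anthropic" ? "ANTHROPIC_API_KEY" : "OPENAI_API_KEY";
  const name = apiKeyEnv ?? defaultName;
  const value = env[name];
  if (!value) {
    throw new Error(`Environment variable ${name} is not set`);
  }
  return value;
}
```

Passing the environment as a parameter keeps the function easy to test; in a real script you would pass the process environment.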
Common CLI provider issue:
- spawn codex ENOENT: use the full path to the Codex executable in model.command, or add Codex to the PATH of the process running QAosMonkey.
credentials
Provides test accounts to the agent without committing secrets.
credentials: {
envFile: ".env",
accounts: [
{
id: "admin",
description: "Admin test user. Can manage users and access privileged screens.",
fields: {
email: {
label: "Email",
env: "QAOSMONKEY_ADMIN_EMAIL",
sensitive: false
},
password: {
label: "Password",
env: "QAOSMONKEY_ADMIN_PASSWORD",
sensitive: true
}
}
}
]
}
- envFile: optional dotenv-style file loaded before a run. .env is ignored by git.
- accounts: list of test accounts the model may use.
- id: short account identifier shown to the model, such as admin or free_user.
- description: tells the model what the account can do, for example “this user is an admin.”
- fields: named credential fields. Common keys are email, username, password, otpSeed, or apiToken.
- fields.*.label: human-readable field label for the model.
- fields.*.env: environment variable that contains the actual value.
- fields.*.sensitive: defaults to true. Sensitive values are redacted from persisted run state and reports.
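To make the field semantics concrete, the sketch below resolves credential fields from an env file plus the process environment. It is illustrative only: all names are hypothetical, the parser handles just plain KEY=value lines, and the rule that already-set environment variables win follows the CI/CD note in the smoke-test walkthrough earlier in this README.

```typescript
// Hypothetical names throughout; this is not QAosMonkey's actual loader.
type EnvValues = Record<string, string | undefined>;

interface CredentialFieldConfig {
  label: string;
  env: string;
  sensitive?: boolean;
}

interface ResolvedCredentialField {
  label: string;
  value: string;
  sensitive: boolean;
}

// Parses plain KEY=value lines, skipping blank lines and # comments.
// Quoting and multi-line values are deliberately not handled.
function parseEnvFile(contents: string): Record<string, string> {
  const values: Record<string, string> = {};
  for (const rawLine of contents.split(/\r?\n/)) {
    const line = rawLine.trim();
    const separator = line.indexOf("=");
    if (line === "" || line.charAt(0) === "#" || separator <= 0) continue;
    values[line.slice(0, separator).trim()] = line.slice(separator + 1).trim();
  }
  return values;
}

// Values already present in the process environment (for example from CI) win over the file.
function mergeEnv(fileValues: Record<string, string>, processEnv: EnvValues): EnvValues {
  const merged: EnvValues = { ...fileValues };
  for (const key of Object.keys(processEnv)) {
    if (processEnv[key] !== undefined) merged[key] = processEnv[key];
  }
  return merged;
}

function resolveCredentialFields(
  fields: Record<string, CredentialFieldConfig>,
  env: EnvValues,
): Record<string, ResolvedCredentialField> {
  const resolved: Record<string, ResolvedCredentialField> = {};
  for (const key of Object.keys(fields)) {
    const field = fields[key];
    const value = env[field.env];
    if (value === undefined) {
      throw new Error(`Missing environment variable ${field.env} for field "${key}"`);
    }
    // sensitive defaults to true when omitted.
    resolved[key] = { label: field.label, value, sensitive: field.sensitive ?? true };
  }
  return resolved;
}
```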
CI/CD usage:
QAOSMONKEY_ADMIN_EMAIL=[email protected] \
QAOSMONKEY_ADMIN_PASSWORD=secret \
npm run qaosmonkey -- run --config qaos-monkey.config.ts
Safety notes:
- Put real values in .env or CI/CD secrets, not in qaos-monkey.config.ts.
- Keep .env.example with fake values only.
- The model receives credential values so it can sign in. Use dedicated test accounts and non-production environments.
- QAosMonkey redacts configured sensitive values before writing artifacts, including failed command errors that may echo typed text.
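Redaction of this kind can be approximated with plain string replacement. The following is a minimal sketch, not QAosMonkey's actual redaction code: `redactSecrets` is a hypothetical name and the `[REDACTED]` placeholder is an assumption.

```typescript
// Hypothetical helper; the placeholder text is an assumption, not QAosMonkey's actual marker.
function redactSecrets(text: string, secrets: string[], placeholder = "[REDACTED]"): string {
  // Replace longer secrets first so a secret that contains another is fully masked.
  // filter() returns a copy, so the caller's array is not reordered.
  const ordered = secrets.filter((secret) => secret.length > 0).sort((a, b) => b.length - a.length);
  let redacted = text;
  for (const secret of ordered) {
    redacted = redacted.split(secret).join(placeholder);
  }
  return redacted;
}
```

Applying the same function to error messages before they are written covers the case where a failed command echoes typed text.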
exploration
Controls what the agent is trying to do and how far it may go.
exploration: {
goal: "Smoke test login, navigation, empty states, and obvious broken screens.",
persona: "You are an autonomous Chaos Monkey QA agent. Be curious, adversarial, and systematic.",
mustTest: [
"Login must work with the configured test account.",
"Create post: after tapping Post, the new post should be immediately visible.",
"Blocking a user: the blocked user should no longer be able to see the blocker profile."
],
maxSteps: 20,
minimumStepsBeforeFinish: 8,
timeLimitSeconds: 600,
destructiveLevel: "low",
maxRepeatedScreenVisits: 4,
excludedActions: [],
excludedScreens: ["Payment", "Delete account"],
allowlistRiskyAreas: []
}
- goal: task-level instruction given to the model on every decision.
- persona: behavioral instruction that shapes exploration style.
- mustTest: natural-language required coverage. Use this for features, assertions, and business rules that must be exercised at least once while still letting the model decide the exact path.
- maxSteps: hard cap on model decisions in a run.
- minimumStepsBeforeFinish: prevents a model from ending the run too early. If the model returns finish before this many steps, QAosMonkey picks a simple fallback exploratory action instead.
- timeLimitSeconds: hard cap on run duration.
- destructiveLevel: low, medium, or high; tells the model how aggressive it may be.
- maxRepeatedScreenVisits: number of visits to the same screen signature before QAosMonkey records a blocker-style finding.
- excludedActions: action names the runner must skip, such as type or press_back.
- excludedScreens: case-insensitive screen names or destination labels to avoid. QAosMonkey treats title/header-like text as the current screen identity and blocks taps/types/scrolls/swipes on controls whose own label matches an excluded screen. For example, if Forgot password is excluded, a login page that merely contains a small Forgot password link is still testable, but tapping that link is blocked.
- allowlistRiskyAreas: names of risky areas the model may interact with. Use this to explicitly permit flows like payment, account deletion, or production-impacting actions later.
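The excludedScreens rule for controls can be sketched as a label comparison. This is an illustration under stated assumptions, not the runner's code: the function names are hypothetical, and treating a match as exact equality after trimming and lowercasing is an assumption, since this README only says the match is case-insensitive.

```typescript
// Hypothetical check; exact equality after trimming and lowercasing is an assumption.
function normalizeLabel(label: string): string {
  return label.trim().toLowerCase();
}

// Returns true when a control's own label matches one of the excluded screens.
function isControlBlocked(controlLabel: string, excludedScreens: string[]): boolean {
  const label = normalizeLabel(controlLabel);
  return excludedScreens.some((screen) => normalizeLabel(screen) === label);
}
```

With excludedScreens set to ["Payment", "Delete account"], a button labeled "payment" is blocked while a "Log in" button is not.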
humanInput
Controls how QAosMonkey asks for help.
humanInput: {
provider: "cli"
}
- provider: currently only cli. The terminal prompts you when the model returns ask_human.
reporting
Controls where artifacts go.
reporting: {
outputDir: ".qaos-monkey/runs",
retainScreenshots: true
}
- outputDir: root directory for run artifacts.
- retainScreenshots: when true, QAosMonkey asks the driver to save screenshots for observations.
Supported Model Actions
Every model provider must return one JSON object using one of these actions:
{"action":"tap","ref":"@e1","reason":"Open the login form"}
{"action":"tap","x":120,"y":340,"reason":"Fallback coordinate tap"}
{"action":"type","ref":"@e2","value":"[email protected]","submit":false,"reason":"Try invalid email"}
{"action":"scroll","direction":"down","reason":"Look for more settings"}
{"action":"swipe","direction":"left","reason":"Test carousel navigation"}
{"action":"press_back","reason":"Check back navigation"}
{"action":"dismiss_overlay","reason":"React Native warning/error overlay is blocking the app"}
{"action":"wait","milliseconds":1000,"reason":"Wait for loading"}
{"action":"ask_human","reason":"Captcha blocks progress","options":["provided","resolved","skip","abort"]}
{"action":"log_bug","finding":{"severity":"high","category":"functional","title":"Login button does nothing","description":"Tapping Login has no visible effect.","expected":"Login should submit or show validation.","actual":"No change after tap.","stepsToReproduce":["Open app","Tap Login"],"confidence":0.8},"reason":"Observed no-op on primary action"}
{"action":"finish","reason":"Smoke coverage target reached"}
The runner validates refs against the current accessibility snapshot before executing tap/type actions.
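If you are writing your own CLI provider, it can help to check decisions before printing them. The sketch below parses one decision and applies a ref check similar in spirit to the one described above. It is not the runner's validator: the names are hypothetical, only tap and type are checked, and the error strings are made up for illustration.

```typescript
// Hypothetical validator covering only tap and type; other actions pass through unchecked.
interface ModelAction {
  action: string;
  ref?: string;
  x?: number;
  y?: number;
  value?: string;
  reason?: string;
}

// Parses provider output and insists on a single JSON object with a string "action".
function parseModelAction(raw: string): ModelAction {
  const parsed: unknown = JSON.parse(raw);
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    throw new Error("Model output must be a single JSON object");
  }
  const candidate = parsed as Record<string, unknown>;
  if (typeof candidate["action"] !== "string") {
    throw new Error('Model output needs a string "action" field');
  }
  return candidate as unknown as ModelAction;
}

// Returns null when the action is acceptable, otherwise a short problem description.
function validateRef(action: ModelAction, snapshotRefs: string[]): string | null {
  if (action.action !== "tap" && action.action !== "type") return null;
  if (action.ref === undefined) {
    // Coordinate taps carry x/y instead of a ref.
    const hasCoordinates =
      action.action === "tap" && typeof action.x === "number" && typeof action.y === "number";
    return hasCoordinates ? null : `${action.action} needs a ref`;
  }
  return snapshotRefs.indexOf(action.ref) >= 0 ? null : `Unknown ref ${action.ref}`;
}
```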
Device Driver Notes
The default driver wraps agent-device commands:
- snapshot: snapshot -i
- tap: click
- type into field: fill
- scroll: scroll
- swipe: swipe
- back: back
- screenshot: screenshot
- launch: open
If your installed agent-device version uses different command names, override them in device.commandMap.
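One way to picture how device.command and device.commandMap combine is shown below. This is a sketch, not the driver's implementation: the default subcommands are copied from the list in this section, while `buildDriverCommand`, the type names, and the position of the extra arguments after the subcommand are assumptions.

```typescript
// Defaults copied from this README; the helper name and shape are hypothetical.
type DriverAction =
  | "snapshot"
  | "tap"
  | "type"
  | "scroll"
  | "swipe"
  | "press_back"
  | "screenshot"
  | "launch";

type CommandMap = Partial<Record<DriverAction, string[]>>;

const defaultCommandMap: Record<DriverAction, string[]> = {
  snapshot: ["snapshot", "-i"],
  tap: ["click"],
  type: ["fill"],
  scroll: ["scroll"],
  swipe: ["swipe"],
  press_back: ["back"],
  screenshot: ["screenshot"],
  launch: ["open"],
};

// Base command first, then the (possibly overridden) subcommand, then action arguments.
function buildDriverCommand(
  baseCommand: string[],
  action: DriverAction,
  args: string[],
  overrides: CommandMap = {},
): string[] {
  const subcommand = overrides[action] ?? defaultCommandMap[action];
  return baseCommand.concat(subcommand, args);
}
```

An override only needs to list the actions that differ; every other action keeps its default subcommand.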
Useful Commands
npm test
npm run qaosmonkey -- --help
npm run qaosmonkey -- init
npm run qaosmonkey -- run --config qaos-monkey.config.ts
npm run qaosmonkey -- resume <runId> --config qaos-monkey.config.ts
npm run qaosmonkey -- report <runId> --config qaos-monkey.config.ts
Publishing to npm
The npm package name is qaosmonkey, not qaos-monkey. This follows the project naming rule: use QAosMonkey for the brand, qaosmonkey for terminal and package registry identifiers, and qaos-monkey for files and directories.
Publishing is handled by .github/workflows/publish-npm.yml. The workflow runs tests, builds the package, shows npm pack --dry-run, and publishes the root package only. The website is excluded through package.json files and .npmignore.
To enable publishing:
- Create an npm access token for the qaosmonkey package.
- Add it to the GitHub repository secrets as NPM_TOKEN.
- Push a version tag such as v0.1.1, create and publish a GitHub Release, or run the workflow manually from GitHub Actions.
You can verify the package contents locally with:
npm pack --dry-run
Dependency Links
The published qaosmonkey package currently uses Node.js built-ins for its core runtime and calls external tools or APIs through configuration. These are the main projects and services QAosMonkey integrates with:
- Node.js for the CLI runtime.
- agent-device as the default iOS/Android device driver.
- Maestro as the optional fallback driver.
- OpenAI API for OpenAI-compatible model providers.
- Anthropic Claude API for Anthropic model providers.
- Codex CLI for local CLI-based model execution.
- Claude Code for local CLI-based model execution.
The documentation website is a separate private package under website/. Its main dependencies are:
- Docusaurus for the documentation site.
- React and React DOM for the website UI.
- MDX for docs content.
- clsx for conditional CSS class names.
- Prism React Renderer for code highlighting.
For dependency license notes, see THIRD_PARTY_LICENSES_SUMMARY.md.
