@shaykec/ai-tester
v0.4.4
AI Tester
AI-powered testing toolkit for mobile, web, and macOS applications
Installation
npm install -g @shaykec/ai-tester

Or use directly with npx:

npx @shaykec/ai-tester

Overview
AI Tester enables exploratory testing with AI assistance in Cursor, then automatically generates regression tests for continuous testing. Works across iOS, Android, macOS, and Web platforms.
┌─────────────────────────────────────────────────────────────────┐
│ AI TESTER │
├─────────────────────────────────────────────────────────────────┤
│ EXPLORE (Cursor + MCP) REGRESS (CLI + CI/CD) │
│ ┌─────────────┐ ┌─────────────┐ │
│ │ Appium │──generates──────│ Maestro │ │
│ │ Playwright │ ↓ │ Playwright │ │
│ └─────────────┘ flows └─────────────┘ │
│ │ │ │
│ AI Vision Standard CLIs │
│ Fallback (CI/CD ready) │
└─────────────────────────────────────────────────────────────────┘

✨ Features
- 🔍 Exploratory Testing - Use Appium (mobile) or Playwright (web) through Cursor AI
- 📝 Direct Test Generation - AI generates tests directly from context (no intermediate format)
- 🔄 Stateless API - Every call is self-contained, connections auto-managed
- 📸 Inspection by Default - Actions return screenshot + element tree automatically
- 🎬 Video Recording - Automatic video capture (enabled by default) for debugging and CI/CD
- ⚡ Batch Operations - Multiple actions in one call for efficiency
- 👁️ AI Vision Fallback - Self-healing element location when locators break
- 🏃 Test Runner - Execute Maestro YAML tests with step-by-step inspection on failure
- 🖼️ Visual Regression - Screenshot comparison with baseline management and exclusion regions
- 🔐 Custom Authentication - Configurable HTTP-based auth for enterprise/internal apps
🎯 Use Cases
- AI-Powered QA Automation - Point an AI agent at your app and let it autonomously explore, find bugs, and generate test suites
- Cross-Platform Regression Testing - Write a single YAML test and run it across Web, iOS, Android, and macOS
- Visual Regression Testing - Pixel-level screenshot comparison with baseline management and exclusion regions
- Self-Healing Tests - AI vision automatically re-locates elements when selectors break after UI changes
- Synthetic Monitoring - Schedule critical user journeys (login, checkout, search) to run against production on a cadence
- UX Audit & Documentation - AI walks through every flow, captures annotated screenshots, and generates up-to-date docs
- CI/CD Test Pipeline - Explore and generate tests in Cursor, then run them in CI with Playwright or Maestro
- Bug Discovery Bot - An exploratory agent that systematically navigates your app looking for crashes, console errors, and visual anomalies
🤖 Testing Agent (Cursor IDE + CLI)
AI Tester ships with an Agent Skill that turns Cursor into a dedicated testing agent. The skill provides testing domain expertise (exploration strategy, bug detection heuristics, test quality patterns) on top of the MCP tools.
The skill activates automatically when you ask about testing, or invoke it explicitly with /ai-tester in chat.
Cursor IDE (Chat)
Ask the agent to test your app directly in Cursor:
"Test the login flow in my web app at http://localhost:3000"
"Explore my iOS app (com.myapp.ios) and find bugs"
"Run the test at tests/flows/checkout.yaml and fix any failures"

Cursor CLI (Terminal)
Use the agent command for terminal-based testing:
# Interactive session
agent "Test the login flow in my web app at http://localhost:3000"
# Non-interactive (CI/CD pipelines)
agent -p "Run all smoke tests at tests/flows/ on web" --output-format json
# Fix failing tests automatically
agent -p --force "Fix the failing test at tests/flows/login.yaml"

Cloud Agent (Long-Running Tasks)
Push long exploration tasks to a Cloud Agent and pick up results later:
& explore my iOS app comprehensively and create a full regression suite

Installing the Skill
The skill is scaffolded automatically when you initialize ai-tester:
# Install Cursor CLI (if not already installed)
curl https://cursor.com/install -fsSL | bash
# Initialize ai-tester (creates skill + test directories + MCP config)
npx @shaykec/ai-tester init
# Restart Cursor, then start testing

Or install the skill from GitHub via Cursor Settings > Rules > Add Rule > Remote Rule (GitHub).
🔐 Authentication (Optional)
AI Tester supports custom HTTP-based authentication for testing apps that require login. This is useful for:
- Enterprise apps with SSO/internal auth
- Wix internal testing with Pilot login
- Any app with a programmatic auth endpoint
Quick Setup
Set environment variables:
export CUSTOM_AUTH_URL="https://your-auth-service.com/api/login"
export CUSTOM_AUTH_KEY="your-api-key"

Or configure in .cursor/mcp.json:

{
  "mcpServers": {
    "ai-tester": {
      "command": "npx",
      "args": ["@shaykec/ai-tester"],
      "env": {
        "CUSTOM_AUTH_URL": "https://your-auth-service.com/api/login",
        "CUSTOM_AUTH_KEY": "your-api-key"
      }
    }
  }
}

Use in tests:

# Maestro YAML
- customLogin: [email protected]
- assertVisible: "Dashboard"

Or via MCP tool:

custom_login({ email: "[email protected]", platform: "web", url: "http://localhost:3000" })
For Wix employees: Copy .env.wix.example to .env.wix and configure your credentials.
🗣️ Prompts for AI Agents
Once you have AI Tester configured in Cursor, use these prompts:
| What You Want | Example Prompt |
|---------------|----------------|
| Test a feature | "Test the login in my iOS app (com.myapp.ios)" |
| E2E test | "Create an E2E test for the checkout flow on web at http://localhost:3000" |
| Explore for bugs | "Explore my Android app and look for UI issues" |
| Regression suite | "Create regression tests for user settings" |
| Run existing test | "Run the test at tests/flows/login.yaml on iOS" |
| Fix failing test | "Run this test and fix any failures" |
📖 Full Prompt Guide → - Copy-paste prompts for every testing scenario
Platform Support
| Platform | Exploration | Regression |
|----------|-------------|------------|
| iOS | Appium | Maestro |
| Android | Appium | Maestro |
| macOS | Accessibility APIs | AppleScript |
| Web | Playwright | Playwright Test |
Quick Start
1. Install from npm
npm install -g @shaykec/ai-tester

Or use directly with npx:

npx @shaykec/ai-tester --help

2. Configure MCP in Cursor
Add to your project's .cursor/mcp.json:
{
"mcpServers": {
"ai-tester": {
"command": "npx",
"args": ["-y", "-p", "@shaykec/ai-tester", "ai-tester-mcp"]
}
}
}

With custom authentication (optional):
{
"mcpServers": {
"ai-tester": {
"command": "npx",
"args": ["-y", "-p", "@shaykec/ai-tester", "ai-tester-mcp"],
"env": {
"CUSTOM_AUTH_URL": "https://your-auth-service.com/api/login",
"CUSTOM_AUTH_KEY": "your-api-key"
}
}
}
}

Restart Cursor to load the MCP server.
3. Start Testing
In Cursor, ask the AI to test your app:
"Test the login flow in my iOS app (com.myapp.ios)"
"Explore the checkout process on my web app at http://localhost:3000"
"Generate a regression test from this session"

📖 See Prompt Guide for ready-to-use prompts for all testing scenarios.
Alternative: Install from Source
For development or to get the latest features:
1. Clone and Build
git clone https://github.com/shayke-cohen/ai-tester.git
cd ai-tester
yarn install
yarn build

2. Configure MCP in Cursor
{
"mcpServers": {
"ai-tester": {
"command": "node",
"args": ["./dist/mcp/index.js"]
}
}
}

Use in Your Project
Want to use AI Tester to test your own application? Follow these steps.
1. Initialize in Your Project
cd /path/to/your-project
# Using npx
npx @shaykec/ai-tester init
# Or if installed globally
ai-tester init
# Creates tests/ directory structure and registry

2. Configure MCP in Cursor
Add to your project's .cursor/mcp.json:
{
"mcpServers": {
"ai-tester": {
"command": "npx",
"args": ["-y", "-p", "@shaykec/ai-tester", "ai-tester-mcp"]
}
}
}

Restart Cursor to load the MCP server.
3. Workflow for Agents (Consolidated 10-Tool API)
AI Tester uses a consolidated API with just 10 tools. After the first inspect(), context is inherited:
- Inspect the app - Start by inspecting (sets context):
  inspect({ platform: "web", url: "http://localhost:3000" })
- Interact with elements - Use act() for all interactions (context inherited):
  act({ action: "tap", selector: "#login" })
  act({ action: "input", selector: "#email", text: "[email protected]" })
- Use batch for multiple actions - Reduce round trips:
  batch({ actions: [
    { action: "input", selector: "#email", text: "[email protected]" },
    { action: "input", selector: "#password", text: "secret" },
    { action: "tap", selector: "#login" }
  ] })
- Make assertions:
  assert({ type: "visible", selector: ".dashboard" })
- Generate tests directly - AI has full context; generate Playwright/Maestro code directly
The 10 Tools: inspect, act, navigate, batch, assert, wait, device, test, logs, session
4. Running Generated Tests
Generated tests can be run directly with their native runners:
# Web tests (Playwright)
npx playwright test tests/web/login.spec.ts
# Mobile tests (Maestro)
maestro test tests/flows/login.yaml

5. Platform-Specific Requirements
| Platform | Requirements |
|----------|--------------|
| Web | Base URL of your app |
| iOS | Xcode, Simulator, bundle ID or .app path |
| Android | Android Studio, Emulator, package name or .apk path |
| macOS | App name, accessibility permissions |
Usage Examples
Mobile Testing (iOS/Android) - Consolidated API
User: "Test login on iOS simulator"
Cursor AI (context inherited after first call):
1. inspect({ platform: "ios", app: "com.example.app" })
→ Returns screenshot + element tree, sets context
2. act({ action: "input", strategy: "accessibility id", selector: "email_input", text: "[email protected]" })
→ Context inherited, returns screenshot
3. act({ action: "input", strategy: "accessibility id", selector: "password_input", text: "pass123" })
4. act({ action: "tap", strategy: "accessibility id", selector: "login_button" })
→ Returns screenshot showing login result
5. (AI generates Maestro test directly from context)

Web Testing - Consolidated API
User: "Test the product page"
Cursor AI (context inherited after first call):
1. inspect({ platform: "web", url: "http://localhost:3000" })
→ Returns screenshot + accessibility tree, sets context
2. act({ action: "tap", selector: "[data-testid='products']" })
→ Context inherited, returns screenshot after click
3. assert({ type: "visible", selector: ".product-list" })
4. (AI generates Playwright test directly from context)

Batch Operations (Reduce Round Trips)
User: "Fill out the login form"
Cursor AI:
batch({
actions: [
{ action: "input", selector: "#email", text: "[email protected]" },
{ action: "input", selector: "#password", text: "password123" },
{ action: "tap", selector: "#login-button" },
{ action: "wait", duration: 1000 }
]
})
→ Executes all actions in ONE call (context inherited), returns final screenshot

Running Generated Tests
# Run Playwright tests directly
npx playwright test tests/login.spec.ts
# Run Maestro flows directly
maestro test flows/login.yaml
# Run all Playwright tests
npx playwright test
# Run with specific browser
npx playwright test --project=firefox

Handling Permission Dialogs
When your app requests permissions (camera, location, notifications), system dialogs appear. Here's how to handle them:
Quick fix: Tap the button directly
# Wait for and tap the permission button
ai-tester wait --for element --text "Allow"
ai-tester act --action tap --text "Allow"

Better: Auto-accept alerts during session
# iOS: Auto-accept all system alerts
ai-tester inspect --platform ios --app com.example.app --auto-accept-alerts
# Android: Auto-grant all permissions
ai-tester inspect --platform android --app com.example.app --auto-grant-permissions

Best: Pre-grant permissions before testing
# iOS Simulator - grant all permissions
xcrun simctl privacy booted grant all com.example.app
# Android - grant specific permissions
adb shell pm grant com.example.app android.permission.CAMERA
adb shell pm grant com.example.app android.permission.ACCESS_FINE_LOCATION

Common dialog button texts by platform:
- iOS: "Allow", "Don't Allow", "Allow While Using App", "Allow Once"
- Android: "Allow", "Deny", "While using the app", "Only this time"
- macOS: "OK", "Allow", "Open"
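A dismissal helper could key off these labels. The sketch below is a hypothetical helper (not part of the ai-tester API) that picks the first grant-style button visible on a dialog, using the per-platform lists above:

```javascript
// Hypothetical helper: map each platform to the dialog labels that grant
// permission, mirroring the lists above. Not part of the ai-tester API.
const ACCEPT_LABELS = {
  ios: ["Allow", "Allow While Using App", "Allow Once"],
  android: ["Allow", "While using the app", "Only this time"],
  macos: ["OK", "Allow", "Open"],
};

// Given the visible button texts on a dialog, return the first one that
// grants the permission on this platform (or null if none matches).
function acceptLabel(platform, visibleButtons) {
  const candidates = ACCEPT_LABELS[platform] ?? [];
  return visibleButtons.find((b) => candidates.includes(b)) ?? null;
}

console.log(acceptLabel("ios", ["Don't Allow", "Allow"])); // → "Allow"
```

The returned label could then be passed to a tap action such as `ai-tester act --action tap --text "Allow"`.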
🎬 Video Recording
Video recording is enabled by default on all platforms. Videos are automatically captured during test sessions for debugging, CI/CD evidence, and exploratory testing review.
How It Works
// Video recording is ON by default
inspect({ platform: "web", url: "http://localhost:3000" })
// Session automatically records video
// When session ends, video is returned:
// → { video: { path: "tests/videos/web-abc123.webm", duration: 45.2, format: "webm" } }
// Disable video recording when needed
inspect({ platform: "ios", app: "com.example.app", recordVideo: false })

Video Storage
Videos are saved to tests/videos/ with the naming convention:
{platform}-{sessionId}-{timestamp}.{format}

| Platform | Format | Method |
|----------|--------|--------|
| Web | WebM | Playwright recordVideo context |
| iOS | MP4 | Appium startRecordingScreen() |
| Android | MP4 | Appium startRecordingScreen() |
| macOS | MOV | screencapture -V native command |
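The naming convention and format table above can be sketched as a small helper. This is illustrative only (the package's actual implementation may differ); the function name is an assumption:

```javascript
// File extension per platform, taken from the format table above.
const VIDEO_FORMATS = { web: "webm", ios: "mp4", android: "mp4", macos: "mov" };

// Build a video file name following {platform}-{sessionId}-{timestamp}.{format}.
function videoFileName(platform, sessionId, timestamp = Date.now()) {
  const format = VIDEO_FORMATS[platform];
  if (!format) throw new Error(`unknown platform: ${platform}`);
  return `${platform}-${sessionId}-${timestamp}.${format}`;
}

console.log(videoFileName("web", "abc123", 1700000000000));
// → "web-abc123-1700000000000.webm"
```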
CLI Usage
# Inspect with video (default)
npx @shaykec/ai-tester inspect --platform web --url http://localhost:3000
# Disable video recording
npx @shaykec/ai-tester inspect --platform ios --app com.example.app --record-video false
# Run test with video
npx @shaykec/ai-tester run-test --platform web --path tests/flows/login.yaml
# Video saved when test completes

Use Cases
- Debug test failures - Review video to see exactly what happened
- CI/CD artifacts - Attach videos to test reports for evidence
- Exploratory testing - Capture live sessions for later analysis
- Bug reports - Include video with reproducible steps
📋 Log Retrieval
AI Tester provides unified log retrieval across all platforms - runtime logs, crash reports, diagnostic reports, and system console logs.
Sources
| Source | Description | Platforms |
|--------|-------------|-----------|
| buffer | Runtime logs captured during session | All |
| device | Mobile device logs (syslog, logcat) | iOS, Android |
| network | Network request/response logs | Web |
| crash | Crash reports (.ips files) | macOS, iOS Simulator |
| diagnostic | Hang/spin/resource reports | macOS |
| console | System unified logs | macOS, iOS Simulator |
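Regardless of source, entries can be narrowed with the `--level` option (documented under Filtering Options). A minimal sketch of min-level filtering, assuming a simple `{ level, msg }` entry shape:

```javascript
// Severity order for the --level option: everything at or above the
// requested level is kept. Entry shape here is an assumption.
const LEVELS = ["debug", "info", "warn", "error"];

function filterByLevel(entries, minLevel) {
  const min = LEVELS.indexOf(minLevel);
  return entries.filter((e) => LEVELS.indexOf(e.level) >= min);
}

const logs = [
  { level: "debug", msg: "boot" },
  { level: "warn", msg: "slow request" },
  { level: "error", msg: "crash" },
];
console.log(filterByLevel(logs, "warn")); // → the warn and error entries
```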
Target Auto-Detection
The logs tool auto-detects the target based on your current session:
- If you've inspected a macOS app → logs from host
- If you've inspected an iOS simulator app → logs from simulator
- If you've inspected a real device → logs from device
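The rules above can be sketched as a small decision function. This is an illustration of the documented behavior, not the tool's actual code; the session fields used (`platform`, `realDevice`) are assumptions:

```javascript
// Sketch of the logs tool's target auto-detection rules.
function detectTarget(session) {
  if (!session) return "host";                      // no active session → host
  if (session.platform === "macos") return "host";  // macOS app → host logs
  if (session.platform === "ios" || session.platform === "android") {
    return session.realDevice ? "device" : "simulator";
  }
  return "host"; // e.g. web sessions fall back to host logs (assumption)
}

console.log(detectTarget({ platform: "ios", realDevice: false })); // → "simulator"
```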
Override with explicit --target:
# Get macOS crash reports
npx @shaykec/ai-tester logs --target host --log-type crash --since 24h
# Get iOS simulator console logs
npx @shaykec/ai-tester logs --target simulator --log-type console --filter "MyApp"
# Get all logs with filtering
npx @shaykec/ai-tester logs --target host --log-type all --level error --limit 50

Filtering Options
| Option | Description | Example |
|--------|-------------|---------|
| --since | Time range start | 5m, 1h, 24h, 7d, ISO date |
| --until | Time range end | Same formats as --since |
| --level | Min log level | debug, info, warn, error |
| --filter | Text substring match | "Calculator" |
| --regex | Regex pattern | "error\|warning" |
| --limit | Max entries | 50 |
| --include-stack-trace | Include crash stack traces | (flag) |
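A `--since` value like `5m`, `1h`, `24h`, or `7d` has to be resolved to a concrete start time; ISO dates can pass straight through `Date.parse`. The sketch below illustrates that resolution under those assumptions (it is not the package's actual parser):

```javascript
// Milliseconds per supported duration unit.
const UNIT_MS = { m: 60_000, h: 3_600_000, d: 86_400_000 };

// Resolve a --since value (relative duration or ISO date) to a Date.
function sinceToDate(value, now = Date.now()) {
  const match = /^(\d+)([mhd])$/.exec(value);
  if (match) return new Date(now - Number(match[1]) * UNIT_MS[match[2]]);
  const parsed = Date.parse(value); // ISO date fallback
  if (Number.isNaN(parsed)) throw new Error(`bad --since value: ${value}`);
  return new Date(parsed);
}

console.log(sinceToDate("24h", 86_400_000).toISOString());
// → "1970-01-01T00:00:00.000Z"
```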
MCP Tool Usage
// Get macOS crash reports from last 24 hours
logs({ target: "host", logType: "crash", since: "24h" })
// Get iOS simulator logs filtered by app
logs({ target: "simulator", logType: "console", filter: "MyApp" })
// Get runtime buffer logs with level filter
logs({ source: "buffer", level: "error", limit: 50 })
// Get network logs filtered by URL
logs({ source: "network", filterUrl: "/api/", filterMethod: "POST" })
// Get crash reports with stack traces
logs({ target: "host", logType: "crash", includeStackTrace: true })

Crash Info in Inspect
When inspecting a macOS app, recent crash information is automatically included if any crashes occurred in the last 24 hours:
inspect({ platform: "macos", appName: "Calculator" })
// → { ..., crashInfo: { count: 2, lastCrash: { timestamp: "...", reason: "..." } } }

🖼️ Visual Regression Testing
Visual regression testing compares screenshots against saved baselines to detect unintended UI changes. The feature is integrated into the inspect tool - no new tools needed.
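The core idea — count differing pixels, skip ignored regions, and compare the diff ratio against a threshold — can be sketched in a few lines. The real tool diffs PNG screenshots; this grayscale-matrix version is an illustration under that simplification, not its implementation:

```javascript
// Compare two same-sized grayscale frames (2D arrays of pixel values).
// Pixels inside any ignore region are excluded from the comparison.
function compareFrames(baseline, current, { threshold = 0.05, ignoreRegions = [] } = {}) {
  const width = baseline[0].length;
  const inIgnored = (x, y) =>
    ignoreRegions.some((r) => x >= r.x && x < r.x + r.width && y >= r.y && y < r.y + r.height);

  let compared = 0;
  let differing = 0;
  for (let y = 0; y < baseline.length; y++) {
    for (let x = 0; x < width; x++) {
      if (inIgnored(x, y)) continue;
      compared++;
      if (baseline[y][x] !== current[y][x]) differing++;
    }
  }
  const diffPercent = compared ? differing / compared : 0;
  return { match: diffPercent <= threshold, diffPercent };
}

const a = [[0, 0], [0, 0]];
const b = [[0, 9], [0, 0]];
console.log(compareFrames(a, b)); // 1 of 4 pixels differ → 25% > 5% → no match
```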
Save a Baseline
During exploration, save important visual states as baselines:
// Save current screen as baseline
inspect({
platform: "web",
url: "http://localhost:3000",
saveBaseline: "checkout_page"
})
// → { baselineSaved: { name: "checkout_page", path: "tests/baselines/web/checkout_page.png" } }

Compare to Baseline
Compare current state against a saved baseline:
inspect({
platform: "web",
url: "http://localhost:3000",
compareBaseline: "checkout_page",
threshold: 0.05 // 5% tolerance
})
// Match: { visualComparison: { match: true, diffPercent: 0.02 } }
// Mismatch: { visualComparison: { match: false, diffPercent: 12.5, diffImage: "base64...", guidance: "..." } }

Exclude Dynamic Regions
Avoid false positives from timestamps, ads, or dynamic content:
inspect({
platform: "ios",
app: "com.example.app",
compareBaseline: "dashboard",
ignoreRegions: [
{ x: 10, y: 50, width: 200, height: 30 }, // Rectangle coordinates
{ selector: ".timestamp" }, // CSS selector (web)
{ id: "loading_spinner" }, // Accessibility ID (mobile)
{ maskImage: "tests/masks/dashboard.png" } // Reusable mask image
]
})

Maestro YAML Integration
Use assertVisualMatch in your Maestro test files:
# Simple form
- tapOn:
id: "submit_button"
- assertVisualMatch: "checkout_success"
# With options
- assertVisualMatch:
baseline: "order_summary"
threshold: 0.05
ignoreRegions:
- { selector: ".order-number" }
- { x: 0, y: 0, width: 100, height: 50 }

Storage Structure
tests/
baselines/ # Golden images (per platform)
web/
checkout_page.png
ios/
main_screen.png
masks/ # Reusable exclusion masks
dashboard_mask.png
screenshots/ # Runtime captures + diffs
diffs/
diff_2024-01-15.png

Annotation Tool
The annotate tool creates visual annotation overlays for any platform, perfect for AI agents to identify and act on specific UI elements.
Platform Support
| Platform | autoAnnotate | Method |
|----------|--------------|--------|
| Web | ✅ Yes | Playwright + JS overlay |
| iOS (Simulator) | ✅ Yes | Swift overlay |
| Android (Emulator) | ✅ Yes | Swift overlay |
| React Native | ✅ Yes | Swift overlay |
| macOS | ✅ Yes | Swift overlay |
Usage
// Auto-annotate iOS Simulator (native or React Native)
annotate({
platform: "ios",
app: "com.example.app",
autoAnnotate: true
})
// Auto-annotate Android Emulator
annotate({
platform: "android",
app: "com.example.app",
autoAnnotate: true
})
// Auto-annotate Web
annotate({
platform: "web",
url: "http://localhost:3000",
autoAnnotate: true
})

Response
Returns structured annotation data with element selectors, bounds, and annotated screenshots:
{
"success": true,
"session": {
"annotations": [
{
"number": 1,
"element": {
"selector": "~login_button",
"tagName": "XCUIElementTypeButton",
"bounds": { "x": 50, "y": 300, "width": 200, "height": 44 }
},
"label": "Login"
}
],
"screenshot": "base64...",
"elementTree": [...]
}
}

CLI Tool Commands
All MCP tools are also available as CLI commands for scripting, CI/CD, and LLM integration:
# Get LLM-friendly schema for all commands
npx @shaykec/ai-tester schema --all --json
# Inspect a web page
npx @shaykec/ai-tester inspect --platform web --url http://localhost:3000
# Tap an element
npx @shaykec/ai-tester tap --platform ios --app com.example.app --text "Login"
# Run a test
npx @shaykec/ai-tester run-test --platform web --url http://localhost:3000 --path tests/flows/login.yaml
# List available simulators
npx @shaykec/ai-tester list-devices --platform ios --json

If installed globally, you can use ai-tester directly:
ai-tester inspect --platform web --url http://localhost:3000

Available CLI Commands
| Category | Commands |
|----------|----------|
| Actions | tap, input, swipe, press, navigate, batch |
| Observation | inspect, find-element, wait-for, logs |
| Assertions | expect-visible, expect-text |
| Devices | list-devices, boot-simulator, shutdown-simulator, install-app |
| Test Runner | run-test, parse-test, export-test |
| Schema | schema - LLM-friendly command documentation |
LLM-Friendly Schema
The schema command provides machine-readable documentation for LLMs:
# Get schema for a specific command
npx @shaykec/ai-tester schema tap --json
# Get all schemas (for LLM context)
npx @shaykec/ai-tester schema --all --json
# Human-readable help
npx @shaykec/ai-tester schema tap

API Overview
Key Tools
| Tool | Description |
|------|-------------|
| inspect | Get screenshot + element tree (auto-connects) |
| tap | Click/tap element (returns inspection by default) |
| input | Type text into field |
| batch | Multiple actions in one call |
| screenshot | Just the screenshot |
| navigate | Go to URL/deep link |
| evaluate | Execute JavaScript in app context (React Native + Web) |
JavaScript Evaluation
The evaluate action allows executing JavaScript directly in the application context. This is powerful for:
- Reading/writing React state - Access component state, Redux stores, context values
- Debugging - Inspect runtime values, call internal functions
- Test setup - Pre-populate state, skip authentication flows
- Assertions - Verify internal state matches expectations
Platform Support
| Platform | Method | Requirements |
|----------|--------|--------------|
| Web | Playwright page.evaluate() | Active Playwright session |
| iOS (React Native) | Chrome DevTools Protocol via Metro | Metro bundler running (port 8081) |
| Android (React Native) | Chrome DevTools Protocol via Metro | Metro bundler running (port 8081) |
Usage Examples
// Web: Get document title
act({ platform: "web", action: "evaluate", expression: "document.title" })
// Web: Read React state from window
act({ platform: "web", action: "evaluate", expression: "window.__STORE__.getState()" })
// React Native: Read component state via DevTools hook
act({
platform: "ios",
action: "evaluate",
expression: "__REACT_DEVTOOLS_GLOBAL_HOOK__.renderers.get(1).getFiberRoots(1).values().next().value.current.memoizedState"
})
// React Native: Show alert
act({ platform: "ios", action: "evaluate", expression: "__r(2).Alert.alert('Test', 'Hello from evaluate!')" })
// React Native: Reload app
act({ platform: "ios", action: "evaluate", expression: "__r(2).DevSettings.reload()" })

React Native Internals
For React Native apps, you can access:
- __REACT_DEVTOOLS_GLOBAL_HOOK__ - React DevTools hook for fiber tree access
- __r - Metro's require function for accessing bundled modules (e.g., __r(2).Alert)
- Component state via fiber tree traversal (memoizedState, memoizedProps)
- Global variables exposed by the app
Connection Parameters
Every action includes target info (auto-managed):
| Platform | Parameters |
|----------|------------|
| Web | platform: "web", url: "http://..." |
| iOS | platform: "ios", app: "com.bundle.id" |
| Android | platform: "android", app: "com.package.name" |
| macOS | platform: "macos", appName: "App Name" |
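The table above can be condensed into a tiny helper that builds the connection parameters for a call. This is an illustrative sketch derived from the table, not part of the ai-tester API:

```javascript
// Which target parameter each platform's calls carry alongside "platform",
// per the connection-parameters table above.
const TARGET_PARAM = { web: "url", ios: "app", android: "app", macos: "appName" };

function connectionParams(platform, target) {
  const key = TARGET_PARAM[platform];
  if (!key) throw new Error(`unsupported platform: ${platform}`);
  return { platform, [key]: target };
}

console.log(connectionParams("ios", "com.bundle.id"));
// → { platform: "ios", app: "com.bundle.id" }
```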
Test Runner CLI Options
ai-tester run [options]
-i, --id <id> Run specific test by ID
-t, --tags <tags> Filter by tags (comma-separated)
-p, --platform <p> Filter by platform (ios, android, macos, web)
--type <type> Filter by type (maestro, playwright, applescript)
--headed Run browser tests in headed mode
--update-snapshots Update visual regression snapshots
--report <format>    Report format (json, html, console)

MCP Tools Reference (57 tools)
Guide Tools (4)
| Tool | Description |
|------|-------------|
| guide_test_generation | Get guidance on test generation workflow |
| guide_available_tools | List all tools by category |
| guide_locator_strategies | Get locator strategies for a platform |
| guide_workflow | Get step-by-step workflow for a task |
Unified Tools (29) - Recommended
Platform-agnostic tools that work across iOS, Android, macOS, and Web with a platform parameter:
Session Lifecycle:
| Tool | Description |
|------|-------------|
| start_session | Start testing session for any platform |
| end_session | End testing session |
| start_test | Start session + recording + inspect in ONE call (saves round trips) |
Actions:
| Tool | Description |
|------|-------------|
| tap | Tap/click element (with optional auto-inspect) |
| input | Enter text into field (with optional auto-inspect) |
| clear | Clear text from input field |
| swipe | Swipe gesture (mobile) or scroll (web) |
| scroll_to | Scroll until element is visible |
| type | Type character by character (for autocomplete) |
| press | Press keyboard key |
Navigation:
| Tool | Description |
|------|-------------|
| navigate | Navigate to URL (web) or deep link (mobile) |
| select | Select dropdown/picker option |
| hover | Hover over element (web only) |
Inspection:
| Tool | Description |
|------|-------------|
| inspect | Get screenshot + element tree for any platform |
| screenshot | Capture screenshot for any platform |
| get_source | Get raw page/screen source |
| annotate | Visual annotation overlay with autoAnnotate mode |
Elements:
| Tool | Description |
|------|-------------|
| find_element | Find UI element by locator |
| find_elements | Find multiple UI elements |
| wait_for | Wait for element to appear |
| get_text | Get element text content |
| get_attribute | Get element attribute |
Assertions:
| Tool | Description |
|------|-------------|
| expect_visible | Assert element is visible |
| expect_text | Assert element contains text |
Batch & Test Management:
| Tool | Description |
|------|-------------|
| batch | Execute multiple actions in ONE call |
| generate_test | Generate test from session (Maestro/Playwright/AppleScript) |
| run_test | Run test by ID or path |
| run_tests_by_tags | Run all tests matching tags |
| regenerate_test | Regenerate test from session |
| list_devices | List available simulators/emulators |
Session Management (5)
| Tool | Description |
|------|-------------|
| session_start | Start recording a testing session |
| session_stop | Stop recording and save session |
| session_annotate | Add comment to session |
| session_status | Get current session status |
| session_set_auto_capture | Enable auto screenshot/element tree capture |
Test Registry (8)
| Tool | Description |
|------|-------------|
| registry_list | List all tests |
| registry_get | Get test by ID |
| registry_search | Search tests |
| registry_add | Add test to registry |
| registry_update | Update test |
| registry_delete | Delete test |
| registry_tags | List all tags |
| registry_stats | Get registry statistics |
Device Management (4)
| Tool | Description |
|------|-------------|
| boot_simulator | Boot iOS Simulator or Android Emulator |
| shutdown_simulator | Shutdown simulator/emulator |
| install_app | Install app on device |
| launch_app | Launch app on device |
Web-Only (1)
| Tool | Description |
|------|-------------|
| playwright_expect_url | Assert current URL matches pattern |
Validation (1)
| Tool | Description |
|------|-------------|
| maestro_validate | Validate a Maestro flow file |
AI Vision (5)
| Tool | Description |
|------|-------------|
| vision_locate_element | Find element using AI vision |
| vision_describe_screen | Get AI description of screen |
| vision_suggest_locators | Generate alternative locators |
| vision_compare_screens | Compare two screenshots |
| vision_verify_element | Verify element appears as expected |
Test Registry
Tests are tracked in tests/registry.json:
{
"tests": {
"ios-login": {
"id": "ios-login",
"name": "Login Flow",
"platform": "ios",
"type": "maestro",
"flowPath": "tests/flows/mobile/ios/login.yaml",
"sessionPath": "tests/sessions/ios-login.json",
"tags": ["auth", "smoke"],
"visualRegression": false,
"status": "active"
}
}
}

Project Structure
ai-tester/
├── src/
│ ├── mcp/ # MCP Server
│ │ ├── index.ts # Entry point
│ │ └── tools/ # Tool implementations
│ └── cli/ # CLI Tool
│ └── index.ts
├── test-apps/ # Demo apps for testing
│ ├── react-native/
│ ├── kotlin-android/
│ ├── swift-ios/
│ ├── swift-macos/
│ └── react-web/
├── tests/ # Generated tests
│ ├── registry.json
│ ├── flows/ # Maestro flows
│ ├── web/ # Playwright specs
│ └── sessions/ # Raw sessions
└── scripts/              # Setup scripts

Requirements
- Node.js 20+
- Yarn 4.x
- Xcode (for iOS)
- Android Studio (for Android)
- Java 17+ (for Maestro - auto-installed if missing)
Note: Appium and Maestro are automatically installed and managed by the MCP server:
- Appium: Auto-installed via npm when first mobile tool is called, including drivers (xcuitest for iOS, uiautomator2 for Android)
- Maestro: Auto-installed via official install script when first Maestro tool is called
- Appium Server: Auto-started/stopped - no manual management needed
Installation Scripts
# Install all dependencies and configure git hooks
./scripts/setup.sh
# Validate setup
yarn validate

The setup script configures Git hooks that run yarn lint and yarn typecheck before each commit.
Testing
Running Tests
# Run all tests
yarn test
# Run specific test suites
yarn test:e2e:cli # CLI regression tests (basic)
yarn test:e2e:cli:full # CLI E2E tests with real browser (CLI_E2E=1)
# Run validation tests
yarn test:validation:unit # Unit tests
yarn test:validation:integration # Integration tests
yarn test:validation:e2e # E2E tests
# Run test flows by platform
yarn test:flows:web # Web Maestro flows
yarn test:flows:ios # iOS Maestro flows
yarn test:flows:android # Android Maestro flows
yarn test:flows:macos # macOS AppleScript flows
# Run by tags
yarn test:sanity # Sanity tests (P0)
yarn test:smoke # Smoke tests
yarn test:regression            # Full regression suite

CLI E2E Regression Suite
The CLI E2E regression suite (tests/validation/e2e/cli-e2e-regression.test.ts) provides comprehensive validation of all CLI functionality:
| Test Category | Tests | Description |
|---------------|-------|-------------|
| Help & Version | 3 | --help, --version, command help |
| Registry Commands | 7 | stats, list, init with filters |
| Schema Commands | 8 | JSON/human schema output |
| Device Management | 4 | list-devices with real simulators |
| Test Runner | 6 | parse-test, export-test |
| Error Handling | 4 | Invalid commands/options |
| Web E2E | 3 | Real browser: inspect, navigate, expect-visible |
| Integration | 2 | Cross-command consistency |
The Web E2E tests launch a real Chromium browser, navigate to websites, capture screenshots, and verify elements.
# Run basic CLI tests (no browser)
yarn test:e2e:cli
# Run full E2E including real browser interactions
yarn test:e2e:cli:full

CI/CD Integration
GitHub Actions Example
name: Regression Tests
on: [push]
jobs:
test:
runs-on: macos-latest
steps:
- uses: actions/checkout@v4
- name: Setup Node
uses: actions/setup-node@v4
with:
node-version: '20'
- name: Setup Java 17 (for Maestro)
uses: actions/setup-java@v4
with:
distribution: 'temurin'
java-version: '17'
- name: Enable Corepack
run: corepack enable
- name: Install dependencies
run: yarn install
- name: Install AI Tester
run: npm install -g @shaykec/ai-tester
- name: Run Playwright tests
run: npx playwright test tests/web/
- name: Run Maestro tests with JUnit reports
run: |
# Install Maestro CLI
curl -Ls "https://get.maestro.mobile.dev" | bash
export PATH="$PATH:$HOME/.maestro/bin"
# Run tests with JUnit XML report for CI integration
maestro test tests/flows/ --format junit --output test-results.xml
- name: Publish Test Results
uses: dorny/test-reporter@v1
if: always()
with:
name: Maestro Tests
path: test-results.xml
reporter: java-junit
- name: Upload test artifacts
uses: actions/upload-artifact@v4
if: failure()
with:
name: test-artifacts
path: ./test-output/

Using AI Tester CLI in CI
You can also use the AI Tester CLI for running tests with built-in reporting:
- name: Run tests via AI Tester CLI
run: |
npx @shaykec/ai-tester run-test --platform ios --app com.example.app \
--path tests/flows/ --mode native \
--report-format junit --report-output test-results.xml \
--test-output-dir ./test-output

Report Formats
| Format | Flag | Use Case |
|--------|------|----------|
| JUnit XML | --format junit | CI/CD integration (GitHub Actions, Jenkins, CircleCI) |
| HTML | --format html | Human-readable reports for review |
Test Artifacts
Use --test-output-dir to capture screenshots, logs, and debug information:
maestro test tests/flows/ --test-output-dir ./test-output

This directory will contain:
- Screenshots captured during test execution
- Logs and debug information
- AI analysis reports (if enabled with --analyze)
Contributing
See CONTRIBUTING.md for guidelines.
License
MIT - see LICENSE for details.