@x360airuntest/agent

v0.1.8

Published

5 days ago

X360AIRunTest Web Agent

Downloads

800

What is X360AIRunTest?

X360AIRunTest is an AI-powered browser automation and testing framework built on top of Playwright. Instead of writing brittle CSS selectors or XPath expressions, you describe your test steps in plain English - the framework uses a Large Language Model (LLM) to translate your intent into real browser actions.

// Plain English. No selectors. No XPath. Just intent.
await agent.executeTask([
  'Navigate to https://app.example.com/login',
  'Enter user@example.com in the Email field',
  'Enter SecurePass123 in the Password field',
  'Click the Sign In button',
  'Verify the Dashboard heading is displayed',
].join('\n'));

The Problem with Traditional Test Automation

Traditional Playwright tests break as soon as a developer changes an element ID, class name, or page structure:

// ❌ Breaks when the developer renames the button
await page.locator('#login-btn-v2').click();
await page.locator('.auth-form > input[type="email"]').fill('user@example.com');

This leads to:

Frequent test failures that block CI/CD pipelines
Hours spent updating selectors instead of writing new tests
Loss of confidence in the test suite

The Solution

X360AIRunTest automatically adapts to UI changes using AI-powered recovery, reducing test maintenance time by over 80% while keeping your costs minimal through a Traditional-First, AI-When-Needed strategy.

Why Automation Engineers Choose X360AIRunTest

| Benefit | Detail |
|---|---|
| Self-Healing Tests | AI automatically recovers when selectors break due to UI changes |
| Zero Selector Rewrites | Write tests in natural language - no XPath or CSS knowledge required |
| Playwright Native | Drop-in enhancement; works alongside your existing Playwright suite |
| Full Evaluation Engine | Every test run is graded: Passed, AI Smart Passed, or Failed |
| Multi-LLM Support | Azure OpenAI, OpenAI, Anthropic Claude, Google Gemini, DeepSeek |
| Debug Artifacts | Detailed per-action logs, screenshots, DOM trees, and LLM responses |
| Cost Optimised | Traditional selectors run first - AI only activates on failure |

How It Works

┌─────────────────────────────────────────────────────────────┐
│                        Your Test File                        │
│  steps = ["Navigate to ...", "Click ...", "Verify ..."]      │
└────────────────────────┬────────────────────────────────────┘
                         │
                         ▼
┌─────────────────────────────────────────────────────────────┐
│                     X360AIAgent                              │
│  • Reads the accessibility DOM tree of the current page      │
│  • Sends the DOM + your instruction to the configured LLM    │
│  • LLM returns the best action: click, type, scroll, …       │
│  • Agent executes the action in the real browser             │
└────────────────────────┬────────────────────────────────────┘
                         │
                         ▼
┌─────────────────────────────────────────────────────────────┐
│                  Evaluation Engine (optional)                 │
│  • Compares planned steps vs. executed steps                 │
│  • Classifies result: Passed / AI Smart Passed / Failed      │
│  • Writes agentReport.txt + taskOutput.json to debug/        │
└─────────────────────────────────────────────────────────────┘
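
The loop in the diagram can be sketched in a few lines. This is an illustration only, not the library's implementation: `AgentAction`, `StepDeps`, and `runSteps` are hypothetical names, and the DOM reader, LLM call, and executor are injected as plain functions so the flow can be followed without a browser or an API key.

```typescript
// Illustrative sketch; none of these names are exported by @x360airuntest/agent.
type AgentAction =
  | { kind: 'click'; target: string }
  | { kind: 'type'; target: string; text: string }
  | { kind: 'scroll'; direction: 'up' | 'down' };

interface StepDeps {
  // Returns a text snapshot of the page (stand-in for the accessibility DOM tree).
  readDom: () => Promise<string>;
  // Stand-in for the LLM call: snapshot + instruction in, one action out.
  askLlm: (dom: string, instruction: string) => Promise<AgentAction>;
  // Performs the chosen action in the browser.
  execute: (action: AgentAction) => Promise<void>;
}

// Runs each instruction through observe -> decide -> act and returns the
// executed actions, which an evaluator could later compare with the plan.
async function runSteps(steps: string[], deps: StepDeps): Promise<AgentAction[]> {
  const executed: AgentAction[] = [];
  for (const instruction of steps) {
    const dom = await deps.readDom();
    const action = await deps.askLlm(dom, instruction);
    await deps.execute(action);
    executed.push(action);
  }
  return executed;
}
```

Injecting the three dependencies keeps the sketch testable with stubs; the real agent presumably wires them to Playwright and the configured provider.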

The Traditional-First, AI-When-Needed Strategy

X360AIRunTest wraps your existing Playwright page object. Stable selectors run at full native speed with zero LLM cost. In the example below, AI recovery is called from a catch block, so it runs only when a native selector throws.

import { test, expect } from '@playwright/test';
import { X360AIAgent } from '@x360airuntest/agent';

let agent: X360AIAgent;

test.beforeAll(async () => {
  agent = new X360AIAgent({
    llm: { provider: 'openai', model: 'gpt-4o-mini' },
  });
});

test.afterAll(async () => {
  await agent.closeAgent();
});

test('login with intelligent recovery', async ({ page }) => {
  const aiPage = await agent.wrapPage(page);

  await page.goto('https://app.example.com/login');

  // Native Playwright - zero cost, full speed
  await page.locator('#email').fill('user@example.com');
  await page.locator('#password').fill('SecurePass123');

  // AI recovery activates only if this selector fails
  try {
    await page.locator('#login-button').click();
  } catch {
    // ⚠️ If .ai() also throws, the error propagates and fails the test.
    // The nested try/catch below only adds context to that failure message.
    try {
      await aiPage.ai('Click the Sign In button');
    } catch (aiError) {
      throw new Error(`Both native selector and AI recovery failed: ${aiError}`);
    }
  }

  // Standard Playwright assertion
  await expect(page.locator('[data-testid="dashboard-header"]'))
    .toBeVisible();
});

What this means in practice:

✅ Stable elements run at full Playwright speed (zero LLM cost)
✅ AI activates only for problematic or recently changed selectors
✅ Optimal balance of speed, reliability, and cost
⚠️ If .ai() throws inside a catch block, the error is not swallowed: it propagates up and fails the test. Wrap AI recovery in its own try/catch when you need explicit failure handling or a clearer error message.
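
The nested try/catch shown above tends to get repeated for every guarded selector. One way to factor it out is a small helper. This is a sketch under assumptions: `withAiFallback` is a hypothetical name, not part of the package, and it accepts any two async functions, so it makes no claim about what `.ai()` returns.

```typescript
// Hypothetical helper, not part of @x360airuntest/agent.
// Runs the native action first; calls the AI fallback only if the native
// action throws. Resolves to the path that succeeded.
async function withAiFallback(
  native: () => Promise<unknown>,
  aiFallback: () => Promise<unknown>,
  label = 'step',
): Promise<'native' | 'ai'> {
  try {
    await native();
    return 'native';
  } catch (nativeError) {
    try {
      await aiFallback();
      return 'ai';
    } catch (aiError) {
      // Report both failures so the test output shows what was attempted.
      throw new Error(
        `${label}: native selector and AI recovery both failed. ` +
          `Native: ${String(nativeError)}; AI: ${String(aiError)}`,
      );
    }
  }
}
```

In a test this could read `await withAiFallback(() => page.locator('#login-button').click(), () => aiPage.ai('Click the Sign In button'), 'sign in')`. Logging the returned path is one way to spot selectors that now depend on recovery.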

Pricing Plans

🚀 X360 Managed LLM (Recommended)

Fully Hosted Solution

We handle all infrastructure, configuration, and model management for you.

| Feature | Detail |
|---|---|
| Configuration | Zero - plug in your API key and go |
| Compliance | SOC 2 Type II compliant infrastructure |
| Models | GPT-4o, Claude, Gemini, DeepSeek (multi-model) |
| Updates | Automatic optimisations and model updates |
| Reliability | 99.9% uptime SLA |
| Support | Enterprise-grade dedicated support |

Perfect for: Teams who want to focus on testing, not infrastructure management.

🔧 Bring Your Own LLM

Self-Managed Solution

Connect X360AIRunTest directly to your own LLM provider credentials.

| Feature | Detail |
|---|---|
| Providers | Azure OpenAI, OpenAI, Anthropic, Gemini, DeepSeek |
| Control | Full control over model versions and endpoints |
| Cost | You pay your LLM provider directly |
| Privacy | Data stays within your own cloud tenancy |

Perfect for: Teams with existing Azure OpenAI, OpenAI, Anthropic, Google Gemini, or DeepSeek agreements, or those with strict data residency requirements.

Installation & Setup

Prerequisites

Node.js 18 or later
A Playwright project (or start a new one)
An LLM provider API key (Azure OpenAI recommended)
X360 Automation API credentials

Step 1 - Install the Package

npm install @x360airuntest/agent
# or
yarn add @x360airuntest/agent
# or
pnpm add @x360airuntest/agent

Step 2 - Install Playwright and the Chromium Browser (if not already installed)

npm install -D @playwright/test
npx playwright install chromium

Step 3 - Get Your X360 Automation API Credentials

X360 Automation API credentials are required.

How to Get Your Credentials

Sign up or log in at x360aitech.com.
Create your account and at least one Project - a Project must exist before an API key can be used.
Navigate to Settings → API Keys and create a new key. Copy the value - this is your X360_AUTOMATION_API_KEY.
On the same API Keys page you will also find your Account ID and Project ID. Copy both values - these are your X360_AUTOMATION_ACCOUNT_ID and X360_AUTOMATION_PROJECT_ID.

| Credential | Where to Find It |
|---|---|
| X360_AUTOMATION_API_KEY | Settings → API Keys |
| X360_AUTOMATION_ACCOUNT_ID | Settings → API Keys |
| X360_AUTOMATION_PROJECT_ID | Settings → API Keys |

💡 First time? Visit x360aitech.com to create your account, or contact [email protected] for access.

Step 4 - Configure Environment Variables

Create a .env file in your project root. Never commit this file.

# ─── Azure OpenAI (recommended for production) ────────────────────────
AZURE_OPENAI_API_KEY=your-azure-openai-key
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/
AZURE_OPENAI_DEPLOYMENT=...
AZURE_OPENAI_API_VERSION=...

# ─── Alternative: Standard OpenAI ────────────────────────────────────
# OPENAI_API_KEY=sk-...

# ─── Alternative: Anthropic Claude ───────────────────────────────────
# ANTHROPIC_API_KEY=sk-ant-...

# ─── X360 Automation API ───────
X360_AUTOMATION_API_KEY=your-x360-api-key
X360_AUTOMATION_ACCOUNT_ID=your-account-id
X360_AUTOMATION_PROJECT_ID=your-project-id

Create a .env.example file (safe to commit to version control):

AZURE_OPENAI_API_KEY=
AZURE_OPENAI_ENDPOINT=
AZURE_OPENAI_DEPLOYMENT=gpt-4o
AZURE_OPENAI_API_VERSION=2024-08-01-preview
X360_AUTOMATION_API_KEY=
X360_AUTOMATION_ACCOUNT_ID=
X360_AUTOMATION_PROJECT_ID=

Add .env to your .gitignore:

.env
node_modules/
dist/
debug/
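
A missing variable otherwise tends to surface later as a 401 in the middle of a test run. The sketch below checks the three X360 variables up front. `missingEnvVars` and `assertEnv` are hypothetical helpers, not package exports, and the environment is passed in as a plain object so the check can be exercised without touching `process.env`.

```typescript
type Env = Record<string, string | undefined>;

// Variable names taken from the .env example above.
const REQUIRED_X360_VARS = [
  'X360_AUTOMATION_API_KEY',
  'X360_AUTOMATION_ACCOUNT_ID',
  'X360_AUTOMATION_PROJECT_ID',
] as const;

// Returns the names of required variables that are unset or blank.
function missingEnvVars(
  env: Env,
  required: readonly string[] = REQUIRED_X360_VARS,
): string[] {
  return required.filter((name) => (env[name] ?? '').trim() === '');
}

// Throws a single error listing every missing variable.
function assertEnv(
  env: Env,
  required: readonly string[] = REQUIRED_X360_VARS,
): void {
  const missing = missingEnvVars(env, required);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`);
  }
}
```

Calling `assertEnv(process.env)` right after `dotenv.config()` would make a misconfigured machine fail before any browser is launched.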

Step 5 - Recommended Project Structure

your-project/
├── .env                      ← Secret keys - NEVER commit
├── .env.example              ← Key names only - safe to commit
├── tsconfig.json
├── package.json
├── helpers/
│   └── agent-helper.ts       ← Shared agent factory (see below)
├── tests/
│   ├── login.spec.ts
│   ├── products.spec.ts
│   ├── cart.spec.ts
│   └── checkout.spec.ts
└── debug/                    ← Auto-generated test artifacts

Quick Start

Create a Shared Agent Factory

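Create helpers/agent-helper.ts to centralise your agent configuration. The example loads .env with the dotenv package, so install it first if your project does not already have it (npm install dotenv):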

import { X360AIAgent } from '@x360airuntest/agent';
import dotenv from 'dotenv';

dotenv.config();

export const createAgent = (): X360AIAgent => {
  return new X360AIAgent({
    llm: {
      provider: 'azure-openai',
      model: process.env.AZURE_OPENAI_DEPLOYMENT || 'gpt-4o',
    },
    // Optional if X360_AUTOMATION_* variables are set in your .env file
    automationApiKey: process.env.X360_AUTOMATION_API_KEY,
    automationAccountId: process.env.X360_AUTOMATION_ACCOUNT_ID,
    automationProjectId: process.env.X360_AUTOMATION_PROJECT_ID,
  });
};

// Credentials for SauceDemo - adapt this structure to fit your own project and architecture
export const CREDENTIALS = {
  standard: { username: 'standard_user', password: 'secret_sauce' },
  locked:   { username: 'locked_out_user', password: 'secret_sauce' },
  problem:  { username: 'problem_user', password: 'secret_sauce' },
  invalid:  { username: 'invalid_user', password: 'wrong_password' },
} as const;

export const SAUCE_DEMO_URL = 'https://www.saucedemo.com/';

Write Your First Test

// tests/login.spec.ts
import { test } from '@playwright/test';
import { createAgent, CREDENTIALS, SAUCE_DEMO_URL } from '../helpers/agent-helper';

test('standard user can log in', async () => {
  // Create the agent from the shared factory before using it
  const agent = createAgent();

  try {
    const result = await agent.executeTask([
      `Navigate to ${SAUCE_DEMO_URL}`,
      `Enter the username as ${CREDENTIALS.standard.username}`,
      `Enter the password as ${CREDENTIALS.standard.password}`,
      'Click the LOGIN button',
      'Verify the Products page heading is displayed',
    ].join('\n'));

    console.log(result.output);
  } finally {
    await agent.closeAgent();
  }
});

Run Your Test

npx playwright test tests/login.spec.ts

Implementation Patterns

Pattern 1 - `agent.executeTask()` (Recommended for Full AI Tests)

Best for production tests that need per-step tracking and evaluation.

const agent = createAgent();

const result = await agent.executeTask([
  'Navigate to https://www.saucedemo.com/',
  'Enter standard_user in the Username field',
  'Enter secret_sauce in the Password field',
  'Click the Login button',
  'Verify the Products page heading is displayed',
].join('\n'));

console.log(result.output);
await agent.closeAgent();

Pattern 2 - `page.aiAction()` / `page.ai()` (Granular Recovery - Recommended for Existing Tests)

Best for Playwright-style tests where you want AI recovery for specific actions only.

const agent = createAgent();
const page = await agent.newPage();

await page.goto('https://www.saucedemo.com/');
await page.aiAction('enter standard_user in the Username field');
await page.aiAction('enter secret_sauce in the Password field');
await page.aiAction('click the Login button');

await agent.closeAgent();

Pattern 3 - `agent.wrapPage()` (Drop-In Recovery for Existing Playwright Tests)

The most cost-efficient pattern. Use your existing Playwright selectors with AI as a fallback.

⚠️ Error propagation: If .ai() throws inside a catch block, the error is not swallowed. It is re-thrown as an X360AIRunTestAgentError and fails the test. Wrap AI recovery in its own try/catch to handle the failure or enrich its message.

// Assumes test/expect are imported from '@playwright/test' and `agent` is created in beforeAll
test('cart flow with recovery', async ({ page }) => {
  const aiPage = await agent.wrapPage(page);

  await page.goto('https://www.saucedemo.com/');

  // Native Playwright first (zero cost)
  try {
    await page.locator('#user-name').fill('standard_user');
    await page.locator('#password').fill('secret_sauce');
    await page.locator('#login-button').click();
  } catch {
    // AI recovery if selector fails.
    // If .ai() also throws, that error propagates and fails the test.
    try {
      await aiPage.ai('Log in with standard_user and secret_sauce');
    } catch (aiError) {
      throw new Error(`Both native selector and AI recovery failed: ${aiError}`);
    }
  }

  // Standard assertion - no AI needed here
  await expect(page.locator('.title')).toHaveText('Products');
});

Supported AI Models

| Provider | Models | Environment Variables |
|---|---|---|
| Azure OpenAI (recommended) | gpt-4o, gpt-4o-mini | AZURE_OPENAI_API_KEY, AZURE_OPENAI_ENDPOINT, AZURE_OPENAI_DEPLOYMENT |
| OpenAI | gpt-4o, gpt-4o-mini, gpt-4-turbo | OPENAI_API_KEY |
| Anthropic | claude-opus-4, claude-sonnet-4-5, claude-haiku-3-5 | ANTHROPIC_API_KEY |
| Google Gemini | gemini-2.0-flash, gemini-1.5-pro | GEMINI_API_KEY |
| DeepSeek | deepseek-chat, deepseek-reasoner | DEEPSEEK_API_KEY |

Provider Configuration Examples

Azure OpenAI:

llm: {
  provider: 'azure-openai',
  model: 'gpt-4o',
  // Optional if AZURE_OPENAI_* variables are set in your .env file
  apiKey: process.env.AZURE_OPENAI_API_KEY,
  endpoint: process.env.AZURE_OPENAI_ENDPOINT,
  deployment: process.env.AZURE_OPENAI_DEPLOYMENT,
  apiVersion: process.env.AZURE_OPENAI_API_VERSION || '2024-08-01-preview',
}

Anthropic Claude:

llm: {
  provider: 'anthropic',
  model: 'claude-sonnet-4-5',
  // Optional if ANTHROPIC_API_KEY is set in your .env file
  apiKey: process.env.ANTHROPIC_API_KEY,
}

OpenAI:

llm: {
  provider: 'openai',
  model: 'gpt-4o',
  // Optional if OPENAI_API_KEY is set in your .env file
  apiKey: process.env.OPENAI_API_KEY,
}

Evaluation & Reporting

When enableEvaluation: true is set, the agent runs a second LLM pass after each task to grade the execution.

Test Result Classifications

| Classification | Meaning |
|---|---|
| ✅ Passed | Every step executed in exact order; all validations matched precisely |
| 🤖 AI Smart Passed | All steps completed with minor acceptable differences (reworded messages, reordered steps, alternative paths) |
| ❌ Failed | A step failed outright, the test objective was not achieved, or a required validation was missing |

AI Smart Passed - Examples

| Expected | Actual | Result |
|---|---|---|
| "Login failed" | "Invalid credentials" | 🤖 AI Smart Passed |
| "Error: user is locked" | "Epic sadface: Sorry, this user has been locked out." | 🤖 AI Smart Passed |
| "Email is required" | "Please enter your email address" | 🤖 AI Smart Passed |

Note: Numeric values (amounts, counts, IDs) are always exact-matched. Only text/message assertions use semantic comparison.
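
The rule in the note can be made concrete with a small sketch. It is not the evaluation engine's code: `classifyAssertion` and `isNumeric` are hypothetical names, and the semantic comparison (an LLM call in the real framework) is injected as a plain function. The `exactMatchOnly` flag mirrors the #ExactMatchOnly annotation described under Step Annotations.

```typescript
type Classification = 'Passed' | 'AI Smart Passed' | 'Failed';

// True when the whole string is a plain number such as "29.99" or "-3".
function isNumeric(value: string): boolean {
  return /^-?\d+(\.\d+)?$/.test(value.trim());
}

// Classifies one expected/actual pair.
// - Identical text is a plain pass.
// - Numeric values and exact-match-only steps never use semantic comparison.
// - Other text falls back to the injected semantic check.
function classifyAssertion(
  expected: string,
  actual: string,
  semanticallyEqual: (a: string, b: string) => boolean,
  exactMatchOnly = false,
): Classification {
  if (expected.trim() === actual.trim()) return 'Passed';
  if (exactMatchOnly || isNumeric(expected) || isNumeric(actual)) return 'Failed';
  return semanticallyEqual(expected, actual) ? 'AI Smart Passed' : 'Failed';
}
```

With a stub that treats every pair as equivalent, "Login failed" against "Invalid credentials" yields 'AI Smart Passed', while "29.99" against "30" still yields 'Failed'.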

Debug Artifacts

When debug: true is set, X360AIRunTest writes a full artifact set to the debug/ folder after each run:

debug/
├── agentReport.txt           ← Human-readable summary (start here)
├── taskOutput.json           ← Machine-readable step-by-step log
├── agentHistory.json         ← Full agent thought process
├── action-0/
│   ├── dom-tree.txt          ← Accessibility DOM at this step
│   ├── llm-response.json     ← LLM decision and reasoning
│   ├── found-element.json    ← Element the agent targeted
│   └── metadata.json         ← Timing, tokens, selector used
├── action-1/
│   └── ...
└── video.webm                ← Full screen recording (if enabled)

Step Annotations

Control evaluation behaviour with inline annotations:

| Annotation | Meaning |
|---|---|
| #ExpectedToFail | Step is expected to fail (e.g., testing a blocked feature) |
| #MustPass | Step must not fail under any circumstances |
| #WarningOnly | Failure raises a warning but does not fail the test |
| #ExactMatchOnly | Disables semantic comparison; requires exact text match |
| #Eval: custom note | Attaches a human-readable evaluation note to the step |

const steps = [
  'Navigate to https://app.example.com',
  'Click the Delete Account button #MustPass',
  'Verify the confirmation dialog is displayed',
  'Click Cancel #WarningOnly',
];
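
To show how annotations attach to a step, here is a minimal parser sketch. It is an assumption about the syntax, not the framework's parser: `parseStep` is a hypothetical function, it expects whitespace before each `#`, and it treats `#Eval:` as running to the end of the line.

```typescript
type Flag = 'ExpectedToFail' | 'MustPass' | 'WarningOnly' | 'ExactMatchOnly';

interface ParsedStep {
  instruction: string;      // step text with annotations removed
  flags: Flag[];            // annotations found, in order of appearance
  evalNote: string | null;  // text following "#Eval:", if present
}

function parseStep(raw: string): ParsedStep {
  let text = raw;
  let evalNote: string | null = null;

  // "#Eval: note" is assumed to consume the rest of the line.
  const evalMatch = text.match(/\s#Eval:\s*(.*)$/);
  if (evalMatch && evalMatch.index !== undefined) {
    evalNote = (evalMatch[1] ?? '').trim();
    text = text.slice(0, evalMatch.index);
  }

  // Requiring whitespace before '#' leaves URL fragments untouched.
  const flags: Flag[] = [];
  const flagPattern = /\s#(ExpectedToFail|MustPass|WarningOnly|ExactMatchOnly)\b/g;
  text = text.replace(flagPattern, (_match: string, name: string) => {
    flags.push(name as Flag);
    return '';
  });

  return { instruction: text.trim(), flags, evalNote };
}
```

The annotated array itself is passed to the agent as in the earlier examples, for instance `agent.executeTask(steps.join('\n'))`.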

Best Practices

✅ When to Use `agent.executeTask()`

Full end-to-end flows written in plain English
New test suites where you want maximum AI coverage
Exploratory automation against frequently-changing UIs

✅ When to Use `agent.wrapPage()` Recovery

Existing Playwright tests that keep breaking due to UI changes
Stable tests you want to protect with minimal changes
Cost-sensitive scenarios where AI should only activate on failure

❌ When to Avoid AI Recovery

Simple, static pages with stable, semantic HTML
High-frequency smoke tests where execution time is critical
Steps that interact with data (amounts, IDs) - use exact Playwright assertions

💡 Cost Optimisation

Always try native selectors first; use AI as a last resort
Use gpt-4o-mini for simple navigation and interaction tasks
Reserve gpt-4o or claude-opus-4 for complex multi-step reasoning
Set enableEvaluation: false during development to reduce LLM calls
Use debug: false in CI to skip artifact writes
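
The last two tips can be derived from the environment instead of being toggled by hand. The sketch below is an assumption-laden illustration: `runOptionsFor` is a hypothetical helper, it relies on the `CI` variable that many CI providers set to `true`, and where `enableEvaluation` and `debug` are passed to the agent is not confirmed here.

```typescript
interface RunOptions {
  enableEvaluation: boolean;
  debug: boolean;
}

// Hypothetical policy: evaluate only in CI, write debug artifacts only locally.
function runOptionsFor(env: Record<string, string | undefined>): RunOptions {
  const inCi = env['CI'] === 'true';
  return {
    enableEvaluation: inCi, // skip the extra LLM grading pass during development
    debug: !inCi,           // skip artifact writes in CI
  };
}
```

If the two flags turn out to be constructor options, the result could be spread into the agent configuration, for example `{ llm, ...runOptionsFor(process.env) }`.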

🛠️ Maintenance Workflow

Run tests - recovery mode catches broken selectors automatically
Review the agent report - debug/agentReport.txt shows exactly what recovered
Apply permanent fixes - update the selector in your test if the change is stable
Remove recovery code - once the selector is fixed, remove the AI fallback

// Before (AI recovery active)
try {
  await page.locator('#login-btn').click();
} catch {
  await aiPage.ai('Click the login button');
}

// After (permanent fix applied - AI no longer needed here)
await page.locator('[data-testid="btn-login"]').click();

Complete Working Example

A full end-to-end e-commerce test with login, cart, and checkout:

// tests/checkout.spec.ts
import { test, expect } from '@playwright/test';
import { createAgent, CREDENTIALS, SAUCE_DEMO_URL } from '../helpers/agent-helper';

let agent: ReturnType<typeof createAgent>;

test.beforeAll(() => {
  agent = createAgent();
});

test.afterAll(async () => {
  await agent.closeAgent();
});

test('complete checkout flow', async ({ page }) => {
  const aiPage = await agent.wrapPage(page);

  // ── Login ────────────────────────────────────────────────────────────
  await page.goto(SAUCE_DEMO_URL);
  await page.locator('#user-name').fill(CREDENTIALS.standard.username);
  await page.locator('#password').fill(CREDENTIALS.standard.password);
  await page.locator('#login-button').click();

  await expect(page.locator('.title')).toHaveText('Products');

  // ── Add item to cart ─────────────────────────────────────────────────
  try {
    await page.locator('[data-test="add-to-cart-sauce-labs-backpack"]').click();
  } catch {
    // Nested try/catch ensures a clear failure message if AI recovery also fails
    try {
      await aiPage.ai('Click "Add to cart" on the Sauce Labs Backpack product');
    } catch (aiError) {
      throw new Error(`Both native selector and AI recovery failed: ${aiError}`);
    }
  }

  await expect(page.locator('.shopping_cart_badge')).toHaveText('1');

  // ── Checkout ─────────────────────────────────────────────────────────
  await page.locator('.shopping_cart_link').click();
  await page.locator('[data-test="checkout"]').click();

  await aiPage.ai('Fill in First Name as John');
  await aiPage.ai('Fill in Last Name as Smith');
  await aiPage.ai('Fill in Zip/Postal Code as 10001');

  await page.locator('[data-test="continue"]').click();
  await page.locator('[data-test="finish"]').click();

  // ── Verify ───────────────────────────────────────────────────────────
  await expect(page.locator('[data-test="complete-header"]'))
    .toContainText('Thank you');
});

Advanced Topics

Data Extraction

Extract structured data from any page using a Zod schema:

import { z } from 'zod';

const page = await agent.newPage();
await page.goto('https://www.saucedemo.com/inventory.html');

const inventory = await page.extract(
  'get all product names and prices from the inventory list',
  z.object({
    products: z.array(z.object({
      name: z.string(),
      price: z.number(),
    })),
  })
);

console.log(inventory.products);
// [{ name: 'Sauce Labs Backpack', price: 29.99 }, ...]

Custom Actions

import { z } from 'zod';
import { X360AIAgent, AgentActionDefinition } from '@x360airuntest/agent';

const LoginActionDefinition: AgentActionDefinition = {
  type: 'customLogin',
  toolName: 'customLogin',
  toolDescription: 'Log in using the application-specific login flow',
  actionParams: z.object({
    username: z.string(),
    password: z.string(),
  }),
  run: async ({ params, page }) => {
    await page.goto('https://app.example.com/login');
    await page.locator('#username').fill(params.username);
    await page.locator('#password').fill(params.password);
    await page.locator('#login-btn').click();
    return { success: true };
  },
};

const agent = new X360AIAgent({
  llm: { provider: 'openai', model: 'gpt-4o' },
  // Optional if X360_AUTOMATION_* variables are set in your .env file
  automationApiKey: process.env.X360_AUTOMATION_API_KEY,
  automationAccountId: process.env.X360_AUTOMATION_ACCOUNT_ID,
  automationProjectId: process.env.X360_AUTOMATION_PROJECT_ID,
  customActions: [LoginActionDefinition],
});

iFrame Support

X360AIRunTest automatically detects and interacts with elements inside iFrames:

await agent.executeTask(
  'Navigate to https://demo.automationtesting.in/Frames.html and fill in the text box inside the nested iFrame'
);

File Uploads

await agent.executeTask([
  'Navigate to https://app.example.com/upload',
  'Upload /tmp/report.pdf to the Attachment field',
].join('\n'));

Troubleshooting

Authentication Errors

401 Unauthorized or missing credentials error:

Ensure all three X360 credentials are set in your .env file:

X360_AUTOMATION_API_KEY=your-api-key
X360_AUTOMATION_ACCOUNT_ID=your-account-id
X360_AUTOMATION_PROJECT_ID=your-project-id

These are required for all AI methods: agent.executeTask(), agent.wrapPage(), page.ai(), page.aiAction(), and page.extract().

403 Forbidden:

Confirm the API key has not expired.
Confirm the account and project IDs are correct.
Contact [email protected] if the issue persists.

Agent Does Not Find the Element

Enable debug: true to capture the DOM snapshot at the failing step.
Open debug/action-N/dom-tree.txt to inspect what the agent sees.
Rewrite your step instruction to use the exact label or role visible in the DOM tree.

Example improvement:

// ❌ Too vague
'Click the button'

// ✅ Specific and unambiguous
'Click the "Add to Cart" button next to the Sauce Labs Backpack product'

Test Is Slow or Expensive

Replace AI steps with native Playwright selectors for stable, unchanging elements.
Use gpt-4o-mini instead of gpt-4o for simpler tasks.
Set enableEvaluation: false during development.
Scope page.aiAction() calls to only the steps that actually need recovery.

Browser Does Not Launch

# Re-install the browser binary
npx playwright install chromium

Support & Resources

| Resource | Link |
|---|---|
| 🐛 Bug Reports & Feature Requests | GitHub Issues |
| 📧 Technical Support | [email protected] |
| 💼 Enterprise & Licensing | [email protected] |
| 🌐 Website | x360aitech.com |

Contributing

We welcome contributions from the automation engineering community.

Getting Started

# 1. Clone the repository
git clone https://github.com/X360-Technologies/x360-ai-test-runner-lite.git

# 2. Install dependencies
yarn install

# 3. Build the project
yarn build

# 4. Run verification
node verify-build.js

Contribution Guidelines

Follow the short imperative commit style: Fix dom state retrieval, Add custom action support
Reference issues in commit messages: Fix selector resolution (#42)
Add tests for all new features
Update the relevant chapter in docs/ for any API or behaviour changes
PRs must pass CI before merging

Repository Structure

| Directory | Purpose |
|---|---|
| src/agent/ | Core agent logic, task loop, actions |
| src/llm/ | LLM adapters (OpenAI, Anthropic, Gemini, DeepSeek) |
| src/context-providers/ | DOM capture and accessibility tree extraction |
| src/evaluation/ | Test evaluation and classification engine |
| src/utils/ | Shared helpers, logging |
| src/custom-actions/ | Extension point for domain-specific actions |
| docs/ | End-user documentation (12 chapters) |
| scripts/ | Integration smoke tests and eval harnesses |
| examples/ | Reference implementations |

License

Licensing information not yet determined.

For licensing enquiries, please contact: [email protected]