ai-wright

v0.0.11

Published

3 months ago

AI-steps in your Playwright scripts


testchimphq

ai-wright

AI steps (ai.act, ai.verify) in your Playwright tests — open source, vision-enabled, with no vendor lock-in.

Introduction

ai-wright lets you include AI-native actions, verifications, and data extraction in any Playwright test. It supports:

BYOL (bring your own license): use your own OpenAI, Google Gemini, or Anthropic Claude API keys.
A TestChimp license key, which saves you from paying separately for token usage.

Unlike other solutions, ai-wright relies on vision intelligence: screenshots are annotated with Set-of-Marks (SoM) overlays and combined with DOM element maps for disambiguation, so the LLM can navigate complex UIs with far greater accuracy and resilience.
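
To make the Set-of-Marks idea concrete, here is a minimal sketch under stated assumptions; it is not ai-wright's actual implementation. Each visible element gets a numeric mark (the number an overlay would draw on the screenshot), and a parallel map ties that mark back to DOM metadata. The `ElementInfo` shape and the `buildSomMap` helper are hypothetical names introduced for illustration.

```typescript
// Hypothetical shape for an element found on the page (not ai-wright's real type).
interface ElementInfo {
  selector: string;
  role: string;
  text: string;
  visible: boolean;
  box: { x: number; y: number; width: number; height: number };
}

// Assign sequential mark IDs to visible, non-empty elements.
// The ID is what an overlay would draw on the screenshot; the map lets an
// LLM answer such as "click mark 2" be resolved back to a concrete element.
function buildSomMap(elements: ElementInfo[]): Map<number, ElementInfo> {
  const map = new Map<number, ElementInfo>();
  let nextId = 1;
  for (const el of elements) {
    // Skip elements the user cannot see; marking them would only add noise.
    if (!el.visible || el.box.width === 0 || el.box.height === 0) continue;
    map.set(nextId++, el);
  }
  return map;
}
```

Pairing the numbered screenshot with this map is what allows two visually similar buttons to be told apart: the image shows where each mark sits, and the map supplies its role and text.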

Why teams adopt ai-wright:

Vendor flexibility & BYOL – use your own OpenAI, Gemini, or Claude keys, or your TestChimp license (to avoid separate token usage costs).
Vision-first semantics – SoM overlays + DOM metadata give the model precise context.
Resilient prompting – pre-action planning (e.g., handling blockers such as modals before tackling the actual step), retry guidance, and multi-step planning for coarse-grained steps.
Open source – complete transparency and community support.
Pluggable LLM providers – extend to any LLM provider (e.g., a local LLM) by implementing a provider (see src/llm-providers/README.md).

Usage Guide

Installation

npm install ai-wright
# or
yarn add ai-wright

Then import the library inside your Playwright tests:

import { ai } from 'ai-wright';

AI Commands

`ai.act(objective, {page,test})`

Executes one or more UI actions to satisfy the given objective. The library:

Waits for page stability.
Generates a SoM map + screenshot.
Queries the LLM for any necessary pre-actions (e.g., closing modals) and for the main commands.
Runs each command sequentially, retrying on failure, until the objective is achieved.

await ai.act('Log in as [email protected] with password TestPass123', {
  page,
  test,
});
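
The plan-then-execute flow described above can be sketched as follows. This is a simplified, synchronous illustration under assumptions, not the library's code: the `Command` and `Plan` shapes, `runWithRetries`, `executePlan`, and the default of three attempts are all hypothetical. Real commands would be asynchronous Playwright calls.

```typescript
// Hypothetical command shape; `run` throws when the action fails.
interface Command {
  description: string;
  run: () => void;
}

// Hypothetical plan returned by the LLM for one objective.
interface Plan {
  preActions: Command[]; // e.g. dismiss a blocking modal
  commands: Command[];   // the steps that satisfy the objective itself
}

// Try a command up to maxAttempts times; return the attempt that succeeded.
function runWithRetries(cmd: Command, maxAttempts: number): number {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      cmd.run();
      return attempt;
    } catch (err) {
      lastError = err; // remember the failure and try again
    }
  }
  throw new Error(
    `"${cmd.description}" failed after ${maxAttempts} attempts: ${String(lastError)}`,
  );
}

// Run pre-actions first, then the main commands, strictly in order.
function executePlan(plan: Plan, maxAttempts = 3): string[] {
  const executed: string[] = [];
  for (const cmd of [...plan.preActions, ...plan.commands]) {
    runWithRetries(cmd, maxAttempts);
    executed.push(cmd.description);
  }
  return executed;
}
```

Running pre-actions before the main commands is what lets a blocker such as a cookie banner be cleared before the step that actually matters is attempted.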

`ai.verify(requirement, {page,test}, options?)`

Vision-driven assertion that works like expect. It fails the Playwright step if the LLM reports verificationSuccess = false or if the reported confidence falls below options.confidence_threshold (default 70%).

await ai.verify('The toast should say "Message sent"', {
  page,
  test,
}, {
  confidence_threshold: 85,
});
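
The pass/fail rule stated above reduces to a small predicate. The sketch below is illustrative only: `verificationSuccess` and the 70% default come from the description, while the `VerifyResult` shape, the `confidence` field name, and the `verificationPasses` helper are assumptions.

```typescript
// Hypothetical shape of the LLM's verification answer.
interface VerifyResult {
  verificationSuccess: boolean;
  confidence: number; // assumed to be a percentage, 0 to 100
}

const DEFAULT_CONFIDENCE_THRESHOLD = 70;

// A verification passes only if the LLM reports success AND its confidence
// is at or above the threshold; either condition failing fails the step.
function verificationPasses(
  result: VerifyResult,
  confidenceThreshold: number = DEFAULT_CONFIDENCE_THRESHOLD,
): boolean {
  return result.verificationSuccess && result.confidence >= confidenceThreshold;
}
```

Under this rule a positive answer at 80% confidence passes with the default threshold but fails when `confidence_threshold` is raised to 85, which is the trade-off to weigh when tightening assertions.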

`ai.extract(requirement, context, options?)`

Pulls structured data from the page. Set options.return_type to shape the output ('string' | 'string_array' | 'int' | 'int_array').

const orderIds = await ai.extract('List the order IDs from the table', {
  page,
  test,
}, {
  return_type: 'string_array',
});
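
One way to picture how `return_type` shapes the result is the mapping below. It is a sketch under assumptions rather than the library's implementation: the four type names come from the text, while `ExtractReturnType`, `toInt`, and `coerceExtracted` are hypothetical helpers.

```typescript
type ExtractReturnType = 'string' | 'string_array' | 'int' | 'int_array';

// Parse a value as a base-10 integer, rejecting anything non-numeric.
function toInt(value: unknown): number {
  const n = Number.parseInt(String(value), 10);
  if (Number.isNaN(n)) throw new Error(`Not an integer: ${String(value)}`);
  return n;
}

// Overloads tie each return_type to the TypeScript type the caller gets back.
function coerceExtracted(raw: unknown, returnType: 'string'): string;
function coerceExtracted(raw: unknown, returnType: 'string_array'): string[];
function coerceExtracted(raw: unknown, returnType: 'int'): number;
function coerceExtracted(raw: unknown, returnType: 'int_array'): number[];
function coerceExtracted(
  raw: unknown,
  returnType: ExtractReturnType,
): string | string[] | number | number[] {
  // Treat a single value as a one-element list for the array variants.
  const list: unknown[] = Array.isArray(raw) ? raw : [raw];
  switch (returnType) {
    case 'string':
      return String(raw);
    case 'string_array':
      return list.map((v) => String(v));
    case 'int':
      return toInt(raw);
    case 'int_array':
      return list.map((v) => toInt(v));
    default:
      throw new Error(`Unsupported return type: ${String(returnType)}`);
  }
}
```

With this shape, a call using `return_type: 'string_array'` would be typed as `string[]`, so downstream assertions need no manual casting.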

Authentication

ai-wright chooses credentials in priority order:

OpenAI API key
- Set the OPENAI_API_KEY env var (and optionally OPENAI_MODEL, which defaults to gpt-5-mini).
TestChimp API keys
- Set TESTCHIMP_API_KEY + TESTCHIMP_PROJECT_ID, or TESTCHIMP_USER_AUTH_KEY + TESTCHIMP_USER_MAIL.
- Benefit: reuse your existing TestChimp account; the TestChimp backend proxies the LLM calls and covers token costs.
Google Gemini API key
- Set GEMINI_API_KEY (and optionally GEMINI_MODEL, defaults to gemini-1.5-flash).
Anthropic Claude API key
- Set CLAUDE_API_KEY (and optionally CLAUDE_MODEL, defaults to claude-3-sonnet-20240229).

The selection order is configurable in src/llm-providers/config.ts. See the LLM provider guide for instructions on adding new providers.
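
The priority order listed above can be sketched as a first-match lookup over environment variables. This is an illustration only, not the contents of src/llm-providers/config.ts: the variable names come from the list, while `ProviderName`, `PROVIDER_PRIORITY`, and `selectProvider` are hypothetical.

```typescript
type ProviderName = 'openai' | 'testchimp' | 'gemini' | 'claude';
type Env = Record<string, string | undefined>;

// Providers in the documented priority order; the first configured one wins.
const PROVIDER_PRIORITY: { name: ProviderName; isConfigured: (env: Env) => boolean }[] = [
  { name: 'openai', isConfigured: (env) => Boolean(env['OPENAI_API_KEY']) },
  {
    name: 'testchimp',
    // Either credential pair is sufficient.
    isConfigured: (env) =>
      Boolean(env['TESTCHIMP_API_KEY'] && env['TESTCHIMP_PROJECT_ID']) ||
      Boolean(env['TESTCHIMP_USER_AUTH_KEY'] && env['TESTCHIMP_USER_MAIL']),
  },
  { name: 'gemini', isConfigured: (env) => Boolean(env['GEMINI_API_KEY']) },
  { name: 'claude', isConfigured: (env) => Boolean(env['CLAUDE_API_KEY']) },
];

// Return the highest-priority provider with credentials, or undefined if none.
function selectProvider(env: Env): ProviderName | undefined {
  return PROVIDER_PRIORITY.find((p) => p.isConfigured(env))?.name;
}
```

In a Node.js process you would pass `process.env` as the argument. A practical consequence of first-match selection is that a stray OPENAI_API_KEY in the environment would take precedence over TestChimp credentials.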

Example Playwright Test

import { test } from '@playwright/test';
import { ai } from 'ai-wright';

test('send message', async ({ page }) => {
  await page.goto('https://studio--cafetime-afg2v.us-central1.hosted.app/');

  await ai.act('Log in with [email protected] / TestPass123', { page, test });

  await ai.act('Open the Messages tab and send "Hello"', { page, test });

  await ai.verify('The message input field should be empty afterwards', { page, test });
});

Advanced Configuration

Environment variables:

| Variable | Description | Default |
| --- | --- | --- |
| AI_PLAYWRIGHT_DEBUG | Enable verbose logging (1, true, on, yes). | off |
| AI_PLAYWRIGHT_TEST_TIMEOUT_MS | Extend Playwright test timeouts automatically; 0 disables extension. | 180000 |
| AI_PLAYWRIGHT_MAX_WAIT_RETRIES | How many times the LLM may request additional waits. | 2 |
| LLM_CALL_TIMEOUT | Max duration (ms) for each LLM request. | 120000 |
| COMMAND_EXEC_TIMEOUT | Timeout (ms) for individual DOM actions. | 5000 |
| NAVIGATION_COMMAND_TIMEOUT | Timeout (ms) for navigation actions. | 15000 |
| OPENAI_MODEL | Override the OpenAI model (ignored when using TestChimp). | gpt-5-mini |
| GEMINI_API_KEY | Google Gemini API key used by the Gemini provider. | (none) |
| GEMINI_MODEL | Override the Gemini model. | gemini-1.5-flash |
| CLAUDE_API_KEY | Anthropic Claude API key used by the Claude provider. | (none) |
| CLAUDE_MODEL | Override the Claude model. | claude-3-sonnet-20240229 |
| CLAUDE_MAX_TOKENS | Override the Claude response token limit. | 1024 |
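
As a rough illustration of how such settings might be read, the sketch below parses the flag and the numeric values with the defaults from the table. It is not the library's parser: `parseFlag`, `parseNonNegative`, and `readSettings` are hypothetical, and case-insensitive matching of the truthy values is an assumption.

```typescript
type Env = Record<string, string | undefined>;

// Values the table lists as enabling debug logging.
const TRUTHY = new Set(['1', 'true', 'on', 'yes']);

// Assumption: matching is case-insensitive and ignores surrounding whitespace.
function parseFlag(value: string | undefined): boolean {
  return value !== undefined && TRUTHY.has(value.trim().toLowerCase());
}

// Fall back to the default when the value is missing, empty, or not a
// non-negative number. An explicit 0 is kept (it disables timeout extension).
function parseNonNegative(value: string | undefined, fallback: number): number {
  if (value === undefined || value.trim() === '') return fallback;
  const n = Number(value);
  return Number.isFinite(n) && n >= 0 ? n : fallback;
}

function readSettings(env: Env) {
  return {
    debug: parseFlag(env['AI_PLAYWRIGHT_DEBUG']),
    testTimeoutMs: parseNonNegative(env['AI_PLAYWRIGHT_TEST_TIMEOUT_MS'], 180000),
    maxWaitRetries: parseNonNegative(env['AI_PLAYWRIGHT_MAX_WAIT_RETRIES'], 2),
    llmCallTimeoutMs: parseNonNegative(env['LLM_CALL_TIMEOUT'], 120000),
    commandExecTimeoutMs: parseNonNegative(env['COMMAND_EXEC_TIMEOUT'], 5000),
    navigationTimeoutMs: parseNonNegative(env['NAVIGATION_COMMAND_TIMEOUT'], 15000),
  };
}
```

In a Node.js process, `readSettings(process.env)` would yield the effective values, which can be handy to print when a CI run behaves differently from a local one.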

Optional context/options:

context.logger: (message: string) => void to receive internal log output.
options.confidence_threshold: override verification threshold per call.
options.return_type: control extraction result shape.
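
Since `context.logger` is just a `(message: string) => void` callback, a small collecting logger is often enough to capture the library's output for later inspection. The sketch below is a hypothetical helper, not part of ai-wright; `createCollectingLogger` and its prefix parameter are introduced for illustration.

```typescript
type Logger = (message: string) => void;

// Build a logger that stores every message (with a prefix) in an array,
// so a test can attach or inspect the log after an AI step finishes.
function createCollectingLogger(prefix: string): { logger: Logger; messages: string[] } {
  const messages: string[] = [];
  const logger: Logger = (message) => {
    messages.push(`${prefix} ${message}`);
  };
  return { logger, messages };
}
```

The returned `logger` would then be passed alongside `page` and `test` in the context object, for example `{ page, test, logger }`.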

Comparison with Other Solutions

ZeroStep

Requires a proprietary license key, causing vendor lock-in.
Unmaintained and limited to GPT-3.5.
Tied to CDP (Chrome DevTools Protocol), so it works only with Chrome.
Offers fewer resilience mechanisms than SoM plus multi-strategy retries.

auto-playwright

The agent relies on DOM context, significantly limiting its ability to navigate complex UIs.
Reliance on DOM means the prompt sizes are unbounded.
Verifications require you to parse the AI response manually and call expect yourself; ai.verify does this automatically.
Project activity is minimal, raising maintenance concerns.

Fully-Agentic Test Suites

Vendor lock-in: fully agentic tests require proprietary runners and custom formats.
All steps are agentic, resulting in slow, costly, non-deterministic tests.
ai-wright enables a hybrid approach: keep 90% of your test as deterministic Playwright code, and inject AI only for the messy, nondeterministic UI flows and verifications.
This balances speed and reliability while still unlocking AI flexibility where you actually need it.

Start by installing the package and setting either TestChimp or OpenAI credentials, then layer ai.act, ai.verify, or ai.extract onto the toughest parts of your Playwright suite.

AI where it helps, plain Playwright everywhere else.

Author

ai-wright is an open-source project contributed by TestChimp, an AI QA platform that learns your web app through exploration to provide context-aware AI assistance for QA workflows.

License

Distributed under the GNU Affero General Public License v3.0. See the LICENSE file for full terms.
