ai-wright
v0.0.10
AI-steps in your Playwright scripts
ai-wright
AI steps (ai.act, ai.verify) in your Playwright tests — open source, vision-enabled, with no vendor lock-in.
Introduction
ai-wright lets you include AI-native actions, verifications, and data extraction in any Playwright test. It supports two credential options:
- BYOL (bring-your-own-license): use your own OpenAI, Google Gemini, or Anthropic Claude API keys.
- TestChimp license: use your TestChimp license key to avoid paying separately for token usage.
Unlike other solutions, ai-wright relies on vision intelligence: screenshots are annotated with Set-of-Marks (SoM) overlays and combined with DOM element maps for disambiguation, so the LLM can navigate complex UIs with far greater accuracy and resilience.
Why teams adopt ai-wright:
- Vendor flexibility & BYOL – use your own OpenAI, Gemini, or Claude keys, or your TestChimp license (to avoid separate token usage costs).
- Vision-first semantics – SoM overlays + DOM metadata give the model precise context.
- Resilient prompting – pre-action planning (e.g., handling blockers such as modals before the requested step), retry guidance, and multi-step planning for coarse-grained steps.
- Open source – complete transparency and community support.
- Pluggable LLM providers – extend to any LLM provider (e.g., a local LLM) by implementing a provider (see src/llm-providers/README.md).
Usage Guide
Installation
npm install ai-wright
# or
yarn add ai-wright
Then import the library inside your Playwright tests:
import { ai } from 'ai-wright';
AI Commands
ai.act(objective, {page,test})
Executes one or more UI actions to satisfy the given objective. The library:
- Waits for page stability.
- Generates a SoM map + screenshot.
- Queries the LLM for any necessary pre-actions (e.g., closing modals) and the main commands.
- Runs each command sequentially, with retries, to achieve the given objective.
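The last two steps above can be sketched as a small plan executor. This is only an illustration of the idea, not ai-wright's implementation: the `Command` and `Plan` shapes, the `runWithRetries` and `executePlan` names, and the default of three attempts are all assumptions made for this example.

```typescript
// Hypothetical shapes; none of these names are taken from ai-wright's source.
type Command = { description: string; run: () => Promise<void> };
type Plan = { preActions: Command[]; mainCommands: Command[] };

// Run one command, retrying up to `maxAttempts` times before giving up.
// Resolves with the number of attempts that were needed.
async function runWithRetries(cmd: Command, maxAttempts = 3): Promise<number> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      await cmd.run();
      return attempt;
    } catch (err) {
      lastError = err; // remember the failure and try again
    }
  }
  throw new Error(
    `"${cmd.description}" failed after ${maxAttempts} attempts: ${String(lastError)}`
  );
}

// Execute pre-actions (e.g. closing a modal) first, then the main commands, in order.
async function executePlan(plan: Plan): Promise<string[]> {
  const log: string[] = [];
  for (const cmd of [...plan.preActions, ...plan.mainCommands]) {
    const attempts = await runWithRetries(cmd);
    log.push(`${cmd.description} (attempts: ${attempts})`);
  }
  return log;
}
```

Running pre-actions ahead of the main commands keeps a blocker such as a modal from causing the first real action to fail.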
await ai.act('Log in as [email protected] with password TestPass123', {
page,
test,
});
ai.verify(requirement, {page,test}, options?)
Vision-driven assertion that works like expect. It fails the Playwright step if the LLM reports verificationSuccess = false or if the reported confidence falls below options.confidence_threshold (default 70%).
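The pass/fail rule described above can be written as a small predicate. This is a sketch under stated assumptions: `verificationSuccess` is the field named in the text, while the `confidence` field name, its 0 to 100 scale, and the `verificationPasses` helper are hypothetical and not taken from the library.

```typescript
// `verificationSuccess` comes from the description above;
// `confidence` (0-100) is an assumed field name for this sketch.
interface VerificationResult {
  verificationSuccess: boolean;
  confidence: number;
}

// Default threshold stated in the description (70%).
const DEFAULT_CONFIDENCE_THRESHOLD = 70;

// Returns true only when the LLM reports success AND the reported
// confidence does not fall below the threshold.
function verificationPasses(
  result: VerificationResult,
  confidenceThreshold: number = DEFAULT_CONFIDENCE_THRESHOLD
): boolean {
  if (!result.verificationSuccess) return false;
  return result.confidence >= confidenceThreshold;
}
```

Under this reading, a result with confidence 80 passes at the default threshold but fails when the call sets confidence_threshold to 85.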
await ai.verify('The toast should say "Message sent"', {
page,
test,
}, {
confidence_threshold: 85,
});
ai.extract(requirement, context, options?)
Pulls structured data from the page. Set options.return_type to shape the output ('string' | 'string_array' | 'int' | 'int_array').
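One way to picture the four return_type values is as a mapping from each option to a TypeScript type. The sketch below is illustrative only: `ExtractResultMap` and `coerceExtraction` are hypothetical names, and the coercion rules (wrapping a single value in an array, parsing integers in base 10) are assumptions rather than documented ai-wright behavior.

```typescript
// The four values come from the description above.
type ExtractReturnType = 'string' | 'string_array' | 'int' | 'int_array';

// Hypothetical mapping from each return_type to the shape a caller would receive.
interface ExtractResultMap {
  string: string;
  string_array: string[];
  int: number;
  int_array: number[];
}

// Parse a value as a base-10 integer, rejecting anything non-numeric.
function toInt(value: unknown): number {
  const parsed = parseInt(String(value), 10);
  if (isNaN(parsed)) throw new Error(`Not an integer: ${String(value)}`);
  return parsed;
}

// Treat a single value as a one-element array.
function toArray(value: unknown): unknown[] {
  return Array.isArray(value) ? value : [value];
}

// Coerce a raw (e.g. JSON-decoded) value into the requested shape.
function coerceExtraction<T extends ExtractReturnType>(
  raw: unknown,
  returnType: T
): ExtractResultMap[T] {
  let result: unknown;
  switch (returnType) {
    case 'string':
      result = String(raw);
      break;
    case 'string_array':
      result = toArray(raw).map((item) => String(item));
      break;
    case 'int':
      result = toInt(raw);
      break;
    case 'int_array':
      result = toArray(raw).map((item) => toInt(item));
      break;
    default:
      throw new Error(`Unsupported return_type: ${String(returnType)}`);
  }
  return result as ExtractResultMap[T];
}
```

With a mapping like this, a call that requests 'string_array' is typed as string[] at compile time, so the caller does not need a cast.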
const orderIds = await ai.extract('List the order IDs from the table', {
page,
test,
}, {
return_type: 'string_array',
});
Authentication
ai-wright chooses credentials in priority order:
- OpenAI API key
  - Set OPENAI_API_KEY env var (and optionally OPENAI_MODEL, defaults to gpt-5-mini).
- TestChimp API keys
  - Set TESTCHIMP_API_KEY + TESTCHIMP_PROJECT_ID, or TESTCHIMP_USER_AUTH_KEY + TESTCHIMP_USER_MAIL.
  - Benefit: reuse your existing TestChimp account, letting their backend proxy the LLM and cover token costs.
- Google Gemini API key
  - Set GEMINI_API_KEY (and optionally GEMINI_MODEL, defaults to gemini-1.5-flash).
- Anthropic Claude API key
  - Set CLAUDE_API_KEY (and optionally CLAUDE_MODEL, defaults to claude-3-sonnet-20240229).
The selection order is configurable in src/llm-providers/config.ts. See the LLM provider guide for instructions on adding new providers.
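The priority order above can be sketched as a lookup over environment variables. The variable names are the ones listed in this section; the `selectProvider` function, the `PRIORITY` table, and the provider labels are hypothetical and do not reflect the actual contents of src/llm-providers/config.ts.

```typescript
type Env = Record<string, string | undefined>;
type ProviderName = 'openai' | 'testchimp' | 'gemini' | 'claude';

// Each entry lists the groups of variables that enable a provider;
// a provider is usable when every variable in at least one group is set.
// The array order mirrors the priority list above.
const PRIORITY: Array<{ name: ProviderName; anyOf: string[][] }> = [
  { name: 'openai', anyOf: [['OPENAI_API_KEY']] },
  {
    name: 'testchimp',
    anyOf: [
      ['TESTCHIMP_API_KEY', 'TESTCHIMP_PROJECT_ID'],
      ['TESTCHIMP_USER_AUTH_KEY', 'TESTCHIMP_USER_MAIL'],
    ],
  },
  { name: 'gemini', anyOf: [['GEMINI_API_KEY']] },
  { name: 'claude', anyOf: [['CLAUDE_API_KEY']] },
];

// Return the first provider whose credentials are fully present, if any.
function selectProvider(env: Env): ProviderName | undefined {
  for (const candidate of PRIORITY) {
    const satisfied = candidate.anyOf.some((group) =>
      group.every((key) => !!env[key])
    );
    if (satisfied) return candidate.name;
  }
  return undefined;
}
```

In a Node environment this could be called as selectProvider(process.env). Note that under this sketch a lone TESTCHIMP_API_KEY without TESTCHIMP_PROJECT_ID does not select TestChimp.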
Example Playwright Test
import { test } from '@playwright/test';
import { ai } from 'ai-wright';
test('send message', async ({ page }) => {
await page.goto('https://studio--cafetime-afg2v.us-central1.hosted.app/');
await ai.act('Log in with [email protected] / TestPass123', { page, test });
await ai.act('Open the Messages tab and send "Hello"', { page, test });
await ai.verify('The message input field should be empty afterwards', { page, test });
});
Advanced Configuration
Environment variables:
| Variable | Description | Default |
| --- | --- | --- |
| AI_PLAYWRIGHT_DEBUG | Enable verbose logging (1, true, on, yes). | off |
| AI_PLAYWRIGHT_TEST_TIMEOUT_MS | Extend Playwright test timeouts automatically; 0 disables extension. | 180000 |
| AI_PLAYWRIGHT_MAX_WAIT_RETRIES | How many times the LLM may request additional waits. | 2 |
| LLM_CALL_TIMEOUT | Max duration (ms) for each LLM request. | 120000 |
| COMMAND_EXEC_TIMEOUT | Timeout (ms) for individual DOM actions. | 5000 |
| NAVIGATION_COMMAND_TIMEOUT | Timeout (ms) for navigation actions. | 15000 |
| OPENAI_MODEL | Override the OpenAI model (ignored when using TestChimp). | gpt-5-mini |
| GEMINI_API_KEY | Google Gemini API key used by the Gemini provider. | — |
| GEMINI_MODEL | Override the Gemini model. | gemini-1.5-flash |
| CLAUDE_API_KEY | Anthropic Claude API key used by the Claude provider. | — |
| CLAUDE_MODEL | Override the Claude model. | claude-3-sonnet-20240229 |
| CLAUDE_MAX_TOKENS | Override the Claude response token limit. | 1024 |
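As an illustration of how the flag and timeout variables in the table could be read with their defaults, here is a minimal sketch. The variable names and default values come from the table; the `loadSettings`, `readFlag`, and `readInt` helpers are hypothetical, and the case-insensitive flag matching and the fallback to the default on invalid input are assumptions, not documented behavior.

```typescript
type Env = Record<string, string | undefined>;

// Truthy spellings listed for AI_PLAYWRIGHT_DEBUG in the table above.
const TRUTHY = ['1', 'true', 'on', 'yes'];

// Assumption: matching ignores case and surrounding whitespace.
function readFlag(env: Env, name: string): boolean {
  const raw = (env[name] || '').trim().toLowerCase();
  return TRUTHY.indexOf(raw) !== -1;
}

// Assumption: unset, empty, non-numeric, or negative values fall back to the default.
function readInt(env: Env, name: string, fallback: number): number {
  const raw = env[name];
  if (raw === undefined || raw.trim() === '') return fallback;
  const parsed = parseInt(raw, 10);
  return isNaN(parsed) || parsed < 0 ? fallback : parsed;
}

// Collect the settings using the defaults from the table.
function loadSettings(env: Env) {
  return {
    debug: readFlag(env, 'AI_PLAYWRIGHT_DEBUG'),
    testTimeoutMs: readInt(env, 'AI_PLAYWRIGHT_TEST_TIMEOUT_MS', 180000),
    maxWaitRetries: readInt(env, 'AI_PLAYWRIGHT_MAX_WAIT_RETRIES', 2),
    llmCallTimeoutMs: readInt(env, 'LLM_CALL_TIMEOUT', 120000),
    commandExecTimeoutMs: readInt(env, 'COMMAND_EXEC_TIMEOUT', 5000),
    navigationTimeoutMs: readInt(env, 'NAVIGATION_COMMAND_TIMEOUT', 15000),
  };
}
```

Because 0 is a valid value here, setting AI_PLAYWRIGHT_TEST_TIMEOUT_MS=0 is preserved rather than replaced by the default, which matches the table's note that 0 disables the extension.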
Optional context/options:
- context.logger: (message: string) => void to receive internal log output.
- options.confidence_threshold: override verification threshold per call.
- options.return_type: control extraction result shape.
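A logger matching the (message: string) => void signature above might simply collect messages so they can be attached to a test report later. This is a sketch only; `createBufferedLogger` and its prefix argument are hypothetical helpers and not part of ai-wright.

```typescript
// Signature taken from the context.logger description above.
type Logger = (message: string) => void;

// Hypothetical helper: returns a logger plus the array it writes into.
function createBufferedLogger(prefix: string): { logger: Logger; lines: string[] } {
  const lines: string[] = [];
  const logger: Logger = (message) => {
    lines.push(`${prefix} ${message}`);
  };
  return { logger, lines };
}
```

The returned logger would then be passed alongside page and test in the context object, for example { page, test, logger }.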
Comparison with Other Solutions
ZeroStep
- Requires a proprietary license key, causing vendor lock-in.
- Unmaintained and limited to GPT-3.5.
- Tied to CDP (Chrome DevTools Protocol), so it works only with Chrome.
- Offers fewer resilience mechanisms than SoM plus multi-strategy retries.
auto-playwright
- The agent relies on DOM context, significantly limiting its ability to navigate complex UIs.
- Reliance on DOM means the prompt sizes are unbounded.
- Verifications require you to parse the AI response manually and call expect yourself; ai.verify does this automatically.
- Project activity is minimal, raising maintenance concerns.
Fully-Agentic Test Suites
- Vendor lock-in: fully agentic tests require proprietary runners and custom formats.
- All steps are agentic, resulting in slow, costly, non-deterministic tests.
- ai-wright enables a hybrid approach: keep 90% of your test deterministic Playwright code, and inject AI only for the messy, nondeterministic UI flows / verifications.
- This balances speed and reliability while still unlocking AI flexibility where you actually need it.
To get started, install the package, set either TestChimp or OpenAI credentials, and layer ai.act, ai.verify, or ai.extract onto the toughest parts of your Playwright suite.
AI where it helps, plain Playwright everywhere else.
Author
ai-wright is an open-source project contributed by TestChimp, an AI QA platform that learns your web app through explorations to provide context-aware AI assistance for QA workflows.
License
Distributed under the GNU Affero General Public License v3.0. See the LICENSE file for full terms.
