qortest

v0.2.0

Published

a month ago

AI-powered browser testing for Playwright — write tests in plain English


nq14

playwright testing ai natural-language browser-testing automation

Qortest

AI-powered browser testing — write tests in plain English.

Replace brittle selectors with natural language. Instead of `page.locator('button.submit-form > span.label')`, just write `t.act('Click the submit button')`.

import { test, expect } from "@playwright/test";
import { qor } from "qortest";

test("user can add items to cart", async ({ page }) => {
  const t = qor(page);

  await page.goto("https://shop.example.com");
  await t.act("Click on <Running Shoes>");
  await t.act("Select size <10> from the size dropdown");
  await t.act("Click <Add to Cart>");

  const cartCount = await t.query("What number is shown on the cart badge?");
  expect(cartCount).toBe("1");
});

Installation

npm install qortest @playwright/test
npx playwright install  # if you haven't set up browsers yet

1. Set your API key:

export QORTEST_API_KEY=sk-...

Or in .env.test:

QORTEST_API_KEY=sk-...
QORTEST_MODEL=gpt-4.1-mini

2. Add the reporter to playwright.config.ts:

reporter: [
  ["html"],
  ["qortest/reporter"],
],

3. Write a test:

import { test, expect } from "@playwright/test";
import { qor } from "qortest";

test("homepage loads", async ({ page }) => {
  const t = qor(page);
  await page.goto("https://example.com");
  const heading = await t.query("What is the main heading text?");
  expect(heading).toBe("Example Domain");
});

4. Run it:

npx playwright test

API

`t.act(instruction)` — interact with the page

await t.act("Click the login button");
await t.act("Type <[email protected]> in the email field");
await t.act("Select <Canada> from the country dropdown");
await t.act("Scroll down to the pricing section");

Wrap element names and values in <...> to distinguish them from the surrounding instruction.
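To make the marker convention concrete, the sketch below pulls the `<...>` segments out of an instruction string. The `extractMarked` helper is hypothetical and only illustrates the syntax; it is an assumption, not a description of how qortest parses instructions internally.

```typescript
// Hypothetical helper (not part of qortest's API): collects the text
// wrapped in <...> markers, in the order it appears in the instruction.
function extractMarked(instruction: string): string[] {
  const pattern = /<([^<>]+)>/g;
  const found: string[] = [];
  let match: RegExpExecArray | null;
  // exec() with the "g" flag advances through the string on each call.
  while ((match = pattern.exec(instruction)) !== null) {
    found.push(match[1]);
  }
  return found;
}
```

For `"Select size <10> from the size dropdown"` this returns a one-element array containing `"10"`; an instruction without markers yields an empty array.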

`t.query(instruction)` — extract information

const price = await t.query("What is the total price shown?");
const count = await t.query("How many items are in the cart?");

Returns a string.

`t.assert(instruction)` — verify page state

const ok = await t.assert("Is the success message visible?");
expect(ok).toBe(true);

Returns a boolean.

`t.run(...steps)` — batch steps

await t.run(
  "Click the email field",
  "Type <[email protected]>",
  "Click the login button",
  "? Welcome dashboard is visible",  // ? prefix = assert, throws if false
);
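The `?` prefix convention can be pictured as a small routing step: each string is either an action or an assertion. The sketch below is an assumption about how such routing might look, using a hypothetical `classifyStep` helper; it is not qortest's implementation.

```typescript
// Hypothetical model of a batch step: either an action or an assertion.
type BatchStep = { kind: "act" | "assert"; instruction: string };

// Steps that start with "?" are treated as assertions; the prefix is
// stripped before the instruction is passed on. Everything else is an action.
function classifyStep(raw: string): BatchStep {
  const trimmed = raw.trim();
  if (trimmed.startsWith("?")) {
    return { kind: "assert", instruction: trimmed.slice(1).trim() };
  }
  return { kind: "act", instruction: trimmed };
}
```

Under this model, `"? Welcome dashboard is visible"` becomes an assertion with the instruction `"Welcome dashboard is visible"`, while `"Click the login button"` stays an action.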

Caching — full guide

Qortest caches the selectors it discovers so that repeat runs don't call the LLM. The cache is a JSON file that you can commit to Git.

export QORTEST_CACHE=true
export QORTEST_CACHE_DIR=.qortest  # default

On a cache hit, qortest re-runs the cached selector against a fresh snapshot. If the page structure has changed (fingerprint mismatch), it falls back to a fresh LLM call. Cache hits cost zero tokens.
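The hit-or-fallback decision described above can be sketched as a lookup that compares fingerprints. Everything here is illustrative: the `CacheEntry` shape, keying by instruction text, and the simple string hash are assumptions, and only the `{ role, name, op }` selector shape comes from this README.

```typescript
interface CachedSelector { role: string; name: string; op: string }
interface CacheEntry { fingerprint: string; selector: CachedSelector }

// Small FNV-1a-style hash over UTF-16 code units. This is a stand-in for
// whatever fingerprint qortest really computes from the page structure.
function fingerprintOf(snapshot: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < snapshot.length; i++) {
    hash ^= snapshot.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return (hash >>> 0).toString(16);
}

// Returns the cached selector only when the fresh snapshot still matches
// the stored fingerprint; null means "fall back to a fresh LLM call".
function lookupSelector(
  cache: Map<string, CacheEntry>,
  instruction: string,
  freshSnapshot: string,
): CachedSelector | null {
  const entry = cache.get(instruction);
  if (!entry) return null;
  return entry.fingerprint === fingerprintOf(freshSnapshot) ? entry.selector : null;
}
```

A caller would treat a `null` result as a cache miss, ask the LLM for a new selector, and store it together with the new fingerprint.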

To force fresh LLM calls for a run:

export QORTEST_CACHE_BUST=true

Fallback Model

Configure a more capable fallback model for retries. On a second attempt (after a suspicious result or execution failure), qortest automatically uses the fallback.

export QORTEST_FALLBACK_MODEL=gpt-4o
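As a rough mental model, the choice of model per attempt could look like the sketch below. The `pickModel` function and its 1-based `attempt` counter are hypothetical; the only behavior taken from this README is that the fallback is used from the second attempt onward when one is configured.

```typescript
// Hypothetical selection rule: the first attempt uses the primary model,
// later attempts use the fallback if one is configured.
function pickModel(attempt: number, primary: string, fallback?: string): string {
  return attempt >= 2 && fallback ? fallback : primary;
}
```

With no fallback configured, every attempt stays on the primary model.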

Fixture

For cleaner test setup, use the built-in fixture:

// qor-test.ts
import { test as base } from "@playwright/test";
import { qorFixture, type QorFixture } from "qortest";

export const test = base.extend<QorFixture>({
  ...qorFixture(base),
});

// my-spec.ts
import { test } from "./qor-test";

test("example", async ({ page, t }) => {
  await t.act("Click login");
});

Report

The reporter (configured in step 2 above) injects a summary banner into Playwright's HTML report and writes a standalone qortest-report/index.html. It shows per-test LLM calls, cache hit rate, retries, token usage, and estimated cost for OpenAI, Anthropic, Google, MiniMax, and GLM models.
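An estimated cost of this kind is typically derived from token counts and per-token rates. The sketch below shows that arithmetic under stated assumptions: the `TokenUsage` and `ModelRates` shapes are hypothetical, and the rates in the example are illustrative placeholders, not real pricing for any provider or the values the reporter uses.

```typescript
interface TokenUsage { inputTokens: number; outputTokens: number }
// Rates are expressed in USD per one million tokens.
interface ModelRates { inputPerMillion: number; outputPerMillion: number }

// Cost = input tokens at the input rate plus output tokens at the output rate.
function estimateCostUsd(usage: TokenUsage, rates: ModelRates): number {
  const inputCost = (usage.inputTokens / 1e6) * rates.inputPerMillion;
  const outputCost = (usage.outputTokens / 1e6) * rates.outputPerMillion;
  return inputCost + outputCost;
}

// Illustrative numbers only: 1200 input and 300 output tokens at made-up rates.
const exampleCost = estimateCostUsd(
  { inputTokens: 1200, outputTokens: 300 },
  { inputPerMillion: 0.5, outputPerMillion: 2 },
);
```

With these placeholder rates, `exampleCost` comes to roughly 0.0012 USD (0.0006 for input plus 0.0006 for output).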

Configuration

| Variable | Description | Default |
|---|---|---|
| QORTEST_API_KEY | LLM provider API key | — |
| QORTEST_MODEL | Model to use | gpt-4.1-mini |
| QORTEST_BASE_URL | Custom endpoint (any OpenAI-compatible API) | https://api.openai.com/v1 |
| QORTEST_FALLBACK_MODEL | Model used on retry attempts | — |
| QORTEST_CACHE | Enable selector caching | false |
| QORTEST_CACHE_DIR | Cache directory | .qortest |
| QORTEST_CACHE_BUST | Force fresh LLM calls for this run | false |
| QORTEST_RETRIES | LLM call retries on network error | 0 |
| QORTEST_DEBUG | Enable debug logging (DEBUG=qortest:*) | false |
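To show how these variables and defaults fit together, here is a minimal sketch that resolves them from an environment-like object. The `QortestConfig` shape and `resolveConfig` function are hypothetical, and treating only the literal string `true` as enabled is an assumption; the variable names and default values are the ones documented in the table.

```typescript
// Hypothetical resolved-config shape; field names are illustrative.
interface QortestConfig {
  apiKey?: string;
  model: string;
  baseUrl: string;
  fallbackModel?: string;
  cache: boolean;
  cacheDir: string;
  cacheBust: boolean;
  retries: number;
  debug: boolean;
}

type Env = Record<string, string | undefined>;

// Assumption: a flag is on only when it is set to the string "true".
const isTrue = (value: string | undefined): boolean => value === "true";

function resolveConfig(env: Env): QortestConfig {
  const retries = parseInt(env["QORTEST_RETRIES"] ?? "0", 10);
  return {
    apiKey: env["QORTEST_API_KEY"],
    model: env["QORTEST_MODEL"] ?? "gpt-4.1-mini",
    baseUrl: env["QORTEST_BASE_URL"] ?? "https://api.openai.com/v1",
    fallbackModel: env["QORTEST_FALLBACK_MODEL"],
    cache: isTrue(env["QORTEST_CACHE"]),
    cacheDir: env["QORTEST_CACHE_DIR"] ?? ".qortest",
    cacheBust: isTrue(env["QORTEST_CACHE_BUST"]),
    // Fall back to the documented default when the value is not a number.
    retries: Number.isNaN(retries) ? 0 : retries,
    debug: isTrue(env["QORTEST_DEBUG"]),
  };
}
```

In a Node.js project the same function could be called with `process.env`; passing a plain object keeps the sketch easy to test.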

How It Works

1. You call `t.act("Click the login button")`.
2. Qortest captures an aria snapshot of the page, a compact semantic tree that is typically 1–2 KB.
3. The snapshot and the instruction are sent to your LLM.
4. The LLM returns a structured selector (`{ role, name, op }`).
5. Qortest executes it via Playwright and caches the selector for next time.
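The last two steps can be illustrated by showing which Playwright call a structured selector might correspond to. This is a sketch under assumptions: the `op` values `"click"` and `"fill"` and the optional `value` field are hypothetical, and `toPlaywrightCall` only renders the call as text rather than executing it. `page.getByRole(role, { name })` is a real Playwright locator method.

```typescript
// Selector shape from the steps above; the allowed op values and the
// optional value field are assumptions made for this sketch.
interface StructuredSelector {
  role: string;
  name: string;
  op: "click" | "fill";
  value?: string;
}

// Renders the Playwright expression a selector could map to, as a string.
function toPlaywrightCall(sel: StructuredSelector): string {
  const locator =
    `page.getByRole(${JSON.stringify(sel.role)}, { name: ${JSON.stringify(sel.name)} })`;
  if (sel.op === "fill") {
    return `${locator}.fill(${JSON.stringify(sel.value ?? "")})`;
  }
  return `${locator}.click()`;
}
```

For `{ role: "button", name: "Login", op: "click" }` this produces `page.getByRole("button", { name: "Login" }).click()`, which is why the same selector resolves to the same locator on every run.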

Aria snapshots are 50–100× smaller than screenshots, which keeps both token costs and latency low. Structured selectors are deterministic: same input, same locator, every run.

Supported Browsers

Chromium
Firefox

Benchmarks

25-test suite on the-internet, gpt-4.1-mini, Chromium + Firefox, 3 workers.

| Mode | Pass rate | Avg time | LLM calls | Cost/run |
|------|-----------|----------|-----------|----------|
| Qortest — cold (no cache) | 100% | ~1.5m | 51 | ~$0.13 |
| Qortest — warm (cache hit) | 100% | ~57s | ~5 | ~$0.007 |
| Raw Playwright | 100% | ~49s | 0 | $0 |

Once the cache is warm, qortest runs within ~15% of raw Playwright speed at ~$0.007/run — with no selectors to write or maintain.
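The comparison can be reproduced from the table's approximate figures. The sketch below only restates that arithmetic; the `relativeOverhead` helper is illustrative, and because the inputs are rounded the results are rough.

```typescript
// Extra time relative to a baseline, as a fraction (0.25 means 25% slower).
function relativeOverhead(measuredSeconds: number, baselineSeconds: number): number {
  return (measuredSeconds - baselineSeconds) / baselineSeconds;
}

// Warm cache vs raw Playwright: (57 - 49) / 49, about 0.16.
const warmOverhead = relativeOverhead(57, 49);
// Cold run vs raw Playwright, taking ~1.5 min as 90 s: (90 - 49) / 49, about 0.84.
const coldOverhead = relativeOverhead(90, 49);
// Share of the cold-run cost saved by a warm cache: 1 - 0.007 / 0.13, about 0.95.
const warmCostSaving = 1 - 0.007 / 0.13;
```

So a warm run is roughly 16% slower than raw Playwright, in line with the approximate figure quoted above, while costing about 95% less than a cold run.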

Troubleshooting

See docs/troubleshooting.md for common issues — auth errors, element not found, flaky tests, cache problems, and debug logging.

License

MIT
