qortest
v0.2.0
AI-powered browser testing for Playwright — write tests in plain English
Qortest
AI-powered browser testing — write tests in plain English.
Replace brittle selectors with natural language. Instead of page.locator('button.submit-form > span.label'), just write t.act('Click the submit button').
import { test, expect } from "@playwright/test";
import { qor } from "qortest";
test("user can add items to cart", async ({ page }) => {
  const t = qor(page);
  await page.goto("https://shop.example.com");
  await t.act("Click on <Running Shoes>");
  await t.act("Select size <10> from the size dropdown");
  await t.act("Click <Add to Cart>");
  const cartCount = await t.query("What number is shown on the cart badge?");
  expect(cartCount).toBe("1");
});
Installation
npm install qortest @playwright/test
npx playwright install # if you haven't set up browsers yet
1. Set your API key:
export QORTEST_API_KEY=sk-...
Or in .env.test:
QORTEST_API_KEY=sk-...
QORTEST_MODEL=gpt-4.1-mini
2. Add the reporter to playwright.config.ts:
reporter: [
  ["html"],
  ["qortest/reporter"],
],
3. Write a test:
import { test, expect } from "@playwright/test";
import { qor } from "qortest";
test("homepage loads", async ({ page }) => {
  const t = qor(page);
  await page.goto("https://example.com");
  const heading = await t.query("What is the main heading text?");
  expect(heading).toBe("Example Domain");
});
4. Run it:
npx playwright test
API
t.act(instruction) — interact with the page
await t.act("Click the login button");
await t.act("Type <[email protected]> in the email field");
await t.act("Select <Canada> from the country dropdown");
await t.act("Scroll down to the pricing section");
Wrap element names and values in <...> to distinguish them from the surrounding instruction.
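When the element name or value comes from a variable, a small helper can apply the <...> convention for you. The tagged template below is a hypothetical convenience and not part of qortest; it only builds the instruction string.

```typescript
// Hypothetical helper (not part of qortest): wraps every interpolated
// value in <...> so it stands out from the surrounding instruction.
function instruction(strings: TemplateStringsArray, ...values: unknown[]): string {
  return strings.reduce((out, chunk, i) => {
    const value = i < values.length ? `<${String(values[i])}>` : "";
    return out + chunk + value;
  }, "");
}

const country = "Canada";
const step = instruction`Select ${country} from the country dropdown`;
// step is "Select <Canada> from the country dropdown"
```

The resulting string can be passed straight to t.act(step).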
t.query(instruction) — extract information
const price = await t.query("What is the total price shown?");
const count = await t.query("How many items are in the cart?");
Returns a string.
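Because the result is a string, numeric comparisons need an explicit conversion. The sketch below is one way to do that; toNumber is a hypothetical helper, and it assumes commas are thousands separators.

```typescript
// Hypothetical helper (not part of qortest): extracts the first number
// from an answer such as "$1,234.50" or "3 items".
// Assumption: commas are thousands separators, not decimal marks.
function toNumber(answer: string): number {
  const match = answer.replace(/,/g, "").match(/-?\d+(\.\d+)?/);
  if (!match) {
    throw new Error(`No number found in "${answer}"`);
  }
  return Number(match[0]);
}

const total = toNumber("$1,234.50"); // 1234.5
const items = toNumber("3 items"); // 3
```

In a test this might read expect(toNumber(price)).toBeGreaterThan(0), where price is the string returned by t.query.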
t.assert(instruction) — verify page state
const ok = await t.assert("Is the success message visible?");
expect(ok).toBe(true);
Returns a boolean.
t.run(...steps) — batch steps
await t.run(
  "Click the email field",
  "Type <[email protected]>",
  "Click the login button",
  "? Welcome dashboard is visible", // ? prefix = assert, throws if false
);
Caching (full guide)
Qortest caches the selectors it discovers so repeat runs don't call the LLM. The cache is a JSON file you can commit to git.
export QORTEST_CACHE=true
export QORTEST_CACHE_DIR=.qortest # default
On a cache hit, qortest re-runs the cached selector against a fresh snapshot. If the page structure has changed (fingerprint mismatch), it falls back to a fresh LLM call. Cache hits cost zero tokens.
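The hit-or-fallback rule above can be sketched as a lookup that compares fingerprints. This is an illustration only: the entry shape, keying by instruction text, and the hash function are all assumptions and do not describe qortest's actual cache file.

```typescript
// Illustrative assumptions throughout; not qortest's real cache format.
interface CacheEntry {
  fingerprint: string; // fingerprint of the snapshot the selector was found in
  selector: { role: string; name: string; op: string };
}

type Cache = Record<string, CacheEntry | undefined>;

// Simple polynomial string hash (mod 2^32), used here as a stand-in fingerprint.
function fingerprint(snapshot: string): string {
  let hash = 0;
  for (let i = 0; i < snapshot.length; i++) {
    hash = (hash * 31 + snapshot.charCodeAt(i)) >>> 0;
  }
  return hash.toString(16);
}

type Lookup =
  | { hit: true; selector: CacheEntry["selector"] }
  | { hit: false };

function lookup(cache: Cache, instruction: string, snapshot: string): Lookup {
  const entry = cache[instruction];
  if (entry && entry.fingerprint === fingerprint(snapshot)) {
    return { hit: true, selector: entry.selector }; // no LLM call needed
  }
  return { hit: false }; // no entry, or the page changed: caller asks the LLM
}
```

Storing the fingerprint beside the selector means a stale entry is detected before the selector is trusted.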
To force fresh LLM calls for a run:
export QORTEST_CACHE_BUST=true
Fallback Model
Configure a more capable fallback model for retries. On a second attempt (after a suspicious result or execution failure), qortest automatically uses the fallback.
export QORTEST_FALLBACK_MODEL=gpt-4o
Fixture
For cleaner test setup, use the built-in fixture:
// qor-test.ts
import { test as base } from "@playwright/test";
import { qorFixture, type QorFixture } from "qortest";
export const test = base.extend<QorFixture>({
  ...qorFixture(base),
});
// my-spec.ts
import { test } from "./qor-test";
test("example", async ({ page, t }) => {
  await t.act("Click login");
});
Report
The reporter (configured in step 2 above) injects a summary banner into Playwright's HTML report and writes a standalone qortest-report/index.html. It shows per-test LLM calls, cache hit rate, retries, token usage, and estimated cost for OpenAI, Anthropic, Google, MiniMax, and GLM models.
Configuration
| Variable | Description | Default |
|---|---|---|
| QORTEST_API_KEY | LLM provider API key | — |
| QORTEST_MODEL | Model to use | gpt-4.1-mini |
| QORTEST_BASE_URL | Custom endpoint (any OpenAI-compatible API) | https://api.openai.com/v1 |
| QORTEST_FALLBACK_MODEL | Model used on retry attempts | — |
| QORTEST_CACHE | Enable selector caching | false |
| QORTEST_CACHE_DIR | Cache directory | .qortest |
| QORTEST_CACHE_BUST | Force fresh LLM calls for this run | false |
| QORTEST_RETRIES | LLM call retries on network error | 0 |
| QORTEST_DEBUG | Enable debug logging (DEBUG=qortest:*) | false |
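To make the defaults concrete, here is a sketch of how these variables could be resolved into a config object. The QorConfig shape, loadConfig, modelForAttempt, and the rule that only the literal string "true" enables a flag are assumptions for illustration; the default values are taken from the table.

```typescript
// Illustrative only: names and parsing rules are assumptions,
// the defaults are copied from the table above.
interface QorConfig {
  apiKey: string | undefined;
  model: string;
  baseUrl: string;
  fallbackModel: string | undefined;
  cache: boolean;
  cacheDir: string;
  cacheBust: boolean;
  retries: number;
  debug: boolean;
}

type Env = Record<string, string | undefined>;

function loadConfig(env: Env): QorConfig {
  return {
    apiKey: env.QORTEST_API_KEY,
    model: env.QORTEST_MODEL ?? "gpt-4.1-mini",
    baseUrl: env.QORTEST_BASE_URL ?? "https://api.openai.com/v1",
    fallbackModel: env.QORTEST_FALLBACK_MODEL,
    cache: env.QORTEST_CACHE === "true",
    cacheDir: env.QORTEST_CACHE_DIR ?? ".qortest",
    cacheBust: env.QORTEST_CACHE_BUST === "true",
    retries: Number(env.QORTEST_RETRIES ?? "0"),
    debug: env.QORTEST_DEBUG === "true",
  };
}

// Assumed policy, following the Fallback Model section: the first attempt
// uses the primary model, later attempts use the fallback when one is set.
function modelForAttempt(config: QorConfig, attempt: number): string {
  return attempt > 1 && config.fallbackModel ? config.fallbackModel : config.model;
}
```

In a Node process the function would be called as loadConfig(process.env).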
How It Works
- You call t.act("Click the login button").
- Qortest captures an aria snapshot of the page: a compact semantic tree, typically 1–2 KB.
- The snapshot + instruction go to your LLM.
- The LLM returns a structured selector ({ role, name, op }).
- Qortest executes it via Playwright and caches the selector for next time.
Aria snapshots are 50–100× smaller than screenshots, which keeps token costs and latency low. Structured selectors are deterministic: the same input yields the same locator on every run.
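The final step, turning a structured selector into a Playwright call, might look roughly like this. The { role, name, op } fields come from the description above; the allowed op values, the optional value field, and the execute function are assumptions. PageLike and LocatorLike are minimal stand-ins that mirror Playwright's page.getByRole(role, { name }), locator.click(), and locator.fill(value).

```typescript
// Selector shape as described above; `op` values and `value` are assumed.
interface StructuredSelector {
  role: string;
  name: string;
  op: "click" | "fill";
  value?: string;
}

// Minimal stand-ins for the parts of Playwright's Page and Locator used here.
interface LocatorLike {
  click(): Promise<void>;
  fill(value: string): Promise<void>;
}

interface PageLike {
  getByRole(role: string, options: { name: string }): LocatorLike;
}

// Hypothetical executor: resolves the locator by ARIA role and accessible
// name, then performs the requested operation.
async function execute(page: PageLike, selector: StructuredSelector): Promise<void> {
  const locator = page.getByRole(selector.role, { name: selector.name });
  if (selector.op === "click") {
    await locator.click();
  } else {
    await locator.fill(selector.value ?? "");
  }
}
```

Because the selector is plain data, the same { role, name, op } always resolves to the same locator call, which is what makes cached runs repeatable.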
Supported Browsers
- Chromium
- Firefox
Benchmarks
25-test suite on the-internet, gpt-4.1-mini, Chromium + Firefox, 3 workers.
| Mode | Pass rate | Avg time | LLM calls | Cost/run |
|------|-----------|----------|-----------|----------|
| Qortest, cold (no cache) | 100% | ~1.5m | 51 | ~$0.13 |
| Qortest, warm (cache hit) | 100% | ~57s | ~5 | ~$0.007 |
| Raw Playwright | 100% | ~49s | 0 | $0 |
Once the cache is warm, qortest runs about 16% slower than raw Playwright (~57s vs ~49s) at ~$0.007/run, with no selectors to write or maintain.
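For reference, the ratios implied by the benchmark figures can be worked out directly. The inputs below are the approximate values from the table (with ~1.5m taken as 90 seconds), so the results are approximate too.

```typescript
// Approximate figures copied from the benchmark table.
const cold = { seconds: 90, llmCalls: 51, costUsd: 0.13 };
const warm = { seconds: 57, llmCalls: 5, costUsd: 0.007 };
const raw = { seconds: 49 };

// Extra wall-clock time of a warm run relative to raw Playwright.
const warmOverheadPct = ((warm.seconds - raw.seconds) / raw.seconds) * 100; // ≈ 16.3

// How many times cheaper a warm run is than a cold run.
const costReduction = cold.costUsd / warm.costUsd; // ≈ 18.6

// Share of LLM calls avoided once the cache is warm.
const callsAvoidedPct = (1 - warm.llmCalls / cold.llmCalls) * 100; // ≈ 90.2
```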
Troubleshooting
See docs/troubleshooting.md for common issues — auth errors, element not found, flaky tests, cache problems, and debug logging.
License
MIT
