playwright-anchor

v0.2.0

Published

8 days ago

Self-healing Playwright locators that commit as a reviewable git diff and replay in CI with zero LLM. Heal from your coding agent over MCP, or with your own local LLM.

Downloads

328


mk668a

playwright self-healing locator selector flaky-tests e2e testing ollama local-llm ai mcp model-context-protocol agent

playwright-anchor


Renamed a button's id and now a dozen Playwright tests fail? A local AI finds what each broken selector was pointing at, you commit the fix, and CI replays it with zero AI.

A Playwright test drives a real browser. To click a button or read a field, it first has to find that element on the page with a locator: a selector such as #buy-button. Locators are tied to the page's HTML, so the day a teammate renames that button from #buy-button to #checkout-cta, the locator matches nothing and the test fails red, even though the button still works perfectly for real users. That is a broken locator, and across a large test suite it happens all the time.

playwright-anchor fixes a broken locator once, on your own machine. It shows your own local AI model (an LLM such as llama3.2, run via Ollama, llama.cpp, LM Studio, anything OpenAI-compatible) a snapshot of the page and asks which element the old selector meant. From that answer it works out a new, sturdier selector and records it in a small file, .playwright-anchors.json:

--- a/.playwright-anchors.json
+++ b/.playwright-anchors.json
@@
+    "#old-buy-button": {
+      "healed": "[data-testid=\"buy-now\"]",
+      "healedAt": "2026-06-11T09:14:03.512Z",
+      "model": "llama3.2",
+      "reason": "same purchase button, renamed id",
+      "via": "ref"
+    }

From then on, every run (yours and CI's) just replays that committed fix. CI never talks to an AI, needs no API key, no Redis, no cloud service, and behaves the same way every time.
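Replay therefore amounts to a lookup in the committed file. The following is a minimal sketch of that idea, not the package's actual code: the `HealEntry` type and the `replaySelector` function are hypothetical names, and because the diff above does not show the file's top-level layout, the function takes the selector-to-entry map directly.

```typescript
// Hypothetical shape of one committed heal, mirroring the fields in the diff above.
interface HealEntry {
  healed: string;   // the durable selector that replaces the broken one
  healedAt: string; // ISO timestamp of the heal
  model: string;    // model that picked the element
  reason: string;   // the model's short justification
  via: string;      // how the element was identified, e.g. "ref"
}

// Assumed layout: original selector -> heal entry.
type HealMap = Record<string, HealEntry>;

// Pure lookup: no network, no model. A miss throws instead of guessing.
function replaySelector(entries: HealMap, original: string): string {
  const entry = Object.prototype.hasOwnProperty.call(entries, original)
    ? entries[original]
    : undefined;
  if (!entry) {
    throw new Error(
      `No committed heal for "${original}". Run heal mode locally and commit the result.`
    );
  }
  return entry.healed;
}

const entries: HealMap = {
  "#old-buy-button": {
    healed: '[data-testid="buy-now"]',
    healedAt: "2026-06-11T09:14:03.512Z",
    model: "llama3.2",
    reason: "same purchase button, renamed id",
    via: "ref",
  },
};

console.log(replaySelector(entries, "#old-buy-button"));
```

Because the result depends only on the committed JSON, two runs against the same commit resolve the same selector.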

Heal once. Replay forever. Zero LLM in CI.

Why not runtime self-healing?

Runtime self-healing tools re-heal on every run, inside CI, which makes test results nondeterministic. That is the reason the Playwright team declined to build self-healing into Playwright ("It is important for our customers to know if the test failed or passed").

playwright-anchor treats healing as a code change. It happens on your machine, with your model, and lands in your repo as a reviewable diff. CI just replays it.

| | playwright-anchor | runtime self-healing tools | editor healing agents |
|---|---|---|---|
| When healing happens | once, locally, before commit | every run, inside CI | interactively, in your editor |
| LLM calls in CI | 0 | per broken locator | — |
| The fix is | a committed, reviewable JSON diff | runtime behavior | a patch to test source |
| Extra infrastructure | none | varies (cache stores, API keys) | an agent loop |

Quick start

npm i -D playwright-anchor
ollama pull llama3.2        # or any model you like

Swap one import and use anchor() where locators tend to rot:

// before: import { test, expect } from '@playwright/test';
import { test, expect } from 'playwright-anchor';

test('checkout', async ({ page, anchor }) => {
  await page.goto('/shop');

  await anchor('#buy-button').click();          // actions work directly
  const status = await anchor('.order-status'); // `await` → genuine Locator
  await expect(status).toHaveText('purchased'); // web-first assertions work unchanged
});

That's it. While selectors keep working, anchor() behaves exactly like page.locator(). When one breaks:

Locally (heal mode): your local model gets an accessibility snapshot of the page and picks the element the broken selector meant. playwright-anchor then derives a durable selector (test-id → id → stable attributes → CSS path, uniqueness-verified), saves it to .playwright-anchors.json, and lets the test proceed. You review the diff and commit it.
In CI (replay mode, automatic when CI is set): the committed heal resolves instantly with zero LLM calls. A cache miss fails loudly with instructions. Never silently, never nondeterministically.

Modes

| Mode | When | Behavior |
|---|---|---|
| heal | default locally | broken selector → cache → local LLM (once) → commit |
| replay | default when CI env is set | broken selector → cache only. Miss = actionable failure. Never calls an LLM. |
| off | — | anchor() behaves like page.locator() |
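
The mode table can be read as a small decision function. The sketch below is an illustration under stated assumptions rather than the library's implementation: `defaultMode`, `decide`, and the `Outcome` type are hypothetical names, CI detection is assumed to be a non-empty `CI` environment variable, and the case where a cached heal has itself gone stale is left out.

```typescript
type Mode = "heal" | "replay" | "off";

// What should happen for one anchor() call.
type Outcome =
  | { kind: "original" }                 // use the selector as written
  | { kind: "cached"; selector: string } // use the committed heal
  | { kind: "ask-llm" }                  // heal mode only: ask the model once
  | { kind: "fail"; message: string };   // replay mode: actionable failure

// Assumption: replay is the default whenever CI is set to a non-empty value.
function defaultMode(env: Record<string, string | undefined>): Mode {
  return env["CI"] ? "replay" : "heal";
}

function decide(
  mode: Mode,
  originalMatches: boolean,
  cachedSelector: string | undefined
): Outcome {
  // The original selector always wins; "off" never consults the cache.
  if (originalMatches || mode === "off") return { kind: "original" };
  if (cachedSelector !== undefined) return { kind: "cached", selector: cachedSelector };
  if (mode === "heal") return { kind: "ask-llm" };
  return {
    kind: "fail",
    message: "No committed heal for this selector. Heal locally, then commit the cache file.",
  };
}

console.log(decide(defaultMode({ CI: "true" }), false, undefined).kind);
```

Only the `heal` branch can reach a model, which is why replay runs stay deterministic.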

Configuration

Via test.use() (or per-project in playwright.config.ts):

test.use({
  anchorOptions: {
    mode: 'heal',                          // heal | replay | off
    cacheFile: '.playwright-anchors.json', // relative to Playwright rootDir
    resolveTimeout: 2000,                  // ms before a selector counts as broken
    testIdAttribute: 'data-testid',        // first choice for derived selectors
    llm: {
      baseURL: 'http://127.0.0.1:11434/v1', // any OpenAI-compatible endpoint
      model: 'llama3.2',
      // apiKey: only if your own endpoint needs one
    },
  },
});

Environment variables override options: PLAYWRIGHT_ANCHOR_MODE, PLAYWRIGHT_ANCHOR_CACHE, PLAYWRIGHT_ANCHOR_LLM_URL, PLAYWRIGHT_ANCHOR_LLM_MODEL, PLAYWRIGHT_ANCHOR_LLM_API_KEY.
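
As a rough illustration of that precedence, the sketch below layers the environment variables over configured options. The variable names come from the list above; the mapping from each variable to an option field, the `applyEnvOverrides` helper, and the choice to ignore unrecognized mode values are assumptions made for the example, not documented behavior.

```typescript
interface LlmOptions {
  baseURL: string;
  model: string;
  apiKey?: string;
}

interface AnchorOptions {
  mode: "heal" | "replay" | "off";
  cacheFile: string;
  llm: LlmOptions;
}

type Env = Record<string, string | undefined>;

// Returns a new options object; a variable that is set takes precedence over config.
function applyEnvOverrides(options: AnchorOptions, env: Env): AnchorOptions {
  const mode = env["PLAYWRIGHT_ANCHOR_MODE"];
  const apiKey = env["PLAYWRIGHT_ANCHOR_LLM_API_KEY"] ?? options.llm.apiKey;
  return {
    // Assumption: values other than heal/replay/off fall back to the configured mode.
    mode: mode === "heal" || mode === "replay" || mode === "off" ? mode : options.mode,
    cacheFile: env["PLAYWRIGHT_ANCHOR_CACHE"] ?? options.cacheFile,
    llm: {
      baseURL: env["PLAYWRIGHT_ANCHOR_LLM_URL"] ?? options.llm.baseURL,
      model: env["PLAYWRIGHT_ANCHOR_LLM_MODEL"] ?? options.llm.model,
      ...(apiKey !== undefined ? { apiKey } : {}),
    },
  };
}

const fromConfig: AnchorOptions = {
  mode: "heal",
  cacheFile: ".playwright-anchors.json",
  llm: { baseURL: "http://127.0.0.1:11434/v1", model: "llama3.2" },
};

// "my-local-model" is a placeholder value for the example.
const effective = applyEnvOverrides(fromConfig, {
  PLAYWRIGHT_ANCHOR_MODE: "replay",
  PLAYWRIGHT_ANCHOR_LLM_MODEL: "my-local-model",
});

console.log(effective.mode, effective.llm.model);
```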

How healing works (and why small models are enough)

The LLM is never asked to write a selector. It receives Playwright's accessibility snapshot (ariaSnapshot({ mode: 'ai' })) where every element carries a [ref=eN] marker, and only has to point at the right element:

{"ref": "e12", "reason": "same purchase button, renamed id"}

playwright-anchor then derives the committed selector deterministically in the browser (preferring your test-id attribute, then ids, then stable attributes, then a minimal CSS path), and verifies it resolves uniquely before saving. Picking one element from a labeled list is easy enough that 3–8B local models handle it well; the part that must be precise is never delegated to the model.
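
The preference order described above can be sketched as a candidate list filtered by a uniqueness check. This is a simplified illustration, not the library's code: `ElementInfo`, `candidateSelectors`, `deriveSelector`, and the list of "stable" attributes are hypothetical, the match counter is injected instead of querying a real page, and CSS escaping is reduced to the simple cases.

```typescript
// Simplified stand-in for a DOM element; the real tool inspects the live page.
interface ElementInfo {
  attributes: Record<string, string>;
  cssPath: string; // precomputed minimal CSS path, used as the last resort
}

// Which attributes count as "stable" is an illustrative guess.
const STABLE_ATTRIBUTES = ["name", "aria-label", "role"];

// Builds candidates in preference order: test-id, id, stable attributes, CSS path.
function candidateSelectors(el: ElementInfo, testIdAttribute: string): string[] {
  const out: string[] = [];
  // JSON-style quoting is enough for plain values; real CSS escaping is stricter.
  const quote = (value: string): string => JSON.stringify(value);

  const testId = el.attributes[testIdAttribute];
  if (testId) out.push(`[${testIdAttribute}=${quote(testId)}]`);

  const id = el.attributes["id"];
  if (id) out.push(`#${id}`); // assumes the id is already a valid CSS identifier

  for (const attr of STABLE_ATTRIBUTES) {
    const value = el.attributes[attr];
    if (value) out.push(`[${attr}=${quote(value)}]`);
  }

  out.push(el.cssPath);
  return out;
}

// Returns the first candidate that matches exactly one element, if any.
function deriveSelector(
  el: ElementInfo,
  testIdAttribute: string,
  countMatches: (selector: string) => number
): string | undefined {
  return candidateSelectors(el, testIdAttribute).find((s) => countMatches(s) === 1);
}

const button: ElementInfo = {
  attributes: { "data-testid": "buy-now", id: "checkout-cta" },
  cssPath: "main > form > button",
};

// Fake page: how many elements each selector would match.
const counts: Record<string, number> = {
  '[data-testid="buy-now"]': 1,
  "#checkout-cta": 1,
};

const healed = deriveSelector(button, "data-testid", (s) => counts[s] ?? 0);
console.log(healed);
```

Since the model only supplies the element reference, everything in this step is ordinary deterministic code that can be unit-tested.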

CLI

npx playwright-anchor heal     # run tests in heal mode, then show what was healed
npx playwright-anchor replay   # verify locally what CI will do (zero LLM)
npx playwright-anchor list     # print committed heals
npx playwright-anchor rm "#old-selector"   # drop one heal (re-heals next run)

heal/replay pass any extra arguments through to npx playwright test.

MCP server (heal from any agent, no local model)

Prefer to drive healing from your coding agent? playwright-anchor mcp is a stdio MCP server. Your agent's own model does the element-pick, so you don't need Ollama (or any local model) at all, and the server itself never calls an LLM. CI is unchanged: it still replays with zero LLM.

claude mcp add --transport stdio playwright-anchor -- npx -y playwright-anchor mcp

Or commit a project .mcp.json so your whole team gets it:

{
  "mcpServers": {
    "playwright-anchor": {
      "type": "stdio",
      "command": "npx",
      "args": ["-y", "playwright-anchor", "mcp"]
    }
  }
}

Then ask your agent to "heal my broken Playwright selectors". It calls anchor_heal; for each broken locator the server hands it a page snapshot, the agent picks the element and calls anchor_pick, and the durable selector is derived and written to .playwright-anchors.json for you to review and commit. Tools: anchor_heal, anchor_pick, anchor_replay, anchor_list, anchor_remove, anchor_status.

The server needs the optional dependency @modelcontextprotocol/sdk (installed automatically; if your install skipped optional deps, run npm i @modelcontextprotocol/sdk). Claude Code does not support MCP sampling yet, so healing goes through the explicit anchor_heal and anchor_pick tools, which is exactly the flow described above.

Using with Claude Code / coding agents

The bundled Claude Code skill is the local-LLM variant of the agent workflow: the agent drives npx playwright-anchor heal against your Ollama. If you'd rather skip the local model entirely, use the MCP server above instead. Either way it's the same "agent proposes, human reviews, CI replays" loop, at dev time only:

cp -r node_modules/playwright-anchor/skills/playwright-anchor .claude/skills/

The skill makes the agent: run npx playwright-anchor heal against your local model, show you git diff .playwright-anchors.json, verify with replay (zero LLM, exactly what CI runs), and leave the commit decision to you.

BYO model: your hardware, your keys, your choice

This tool is bring-your-own-inference by design:

No maintainer-provided API, no embedded keys, no telemetry. Nobody pays per-token costs but you, and with a local model, you don't either.
Provider-agnostic. Anything speaking the OpenAI chat-completions protocol works: Ollama, llama.cpp server, LM Studio, vLLM, or your own Anthropic/OpenAI key via their OpenAI-compatible endpoints. Swap with one env var.
Fully local / offline capable. The default configuration (http://127.0.0.1:11434/v1) never leaves your machine.
CI needs no model at all. Replay mode is pure JSON lookup.

FAQ

What if the original selector starts working again? The original always wins: cache entries are only consulted when the original fails. Stale entries are inert (and easy to spot in the JSON).

What about dynamic pages where the element genuinely isn't there? Then the heal fails too. anchor() does not invent elements. You get an AnchorHealError/AnchorReplayError instead of a false green. Healing repairs renamed or moved elements; it does not paper over real regressions.

Do I have to wrap every locator? No. Use anchor() for selectors that historically rot (deep CSS, generated ids); keep page.getByRole() and friends everywhere else.

Does this replace good locators? No. It's a safety net plus a migration path: every heal upgrades a brittle selector to the most durable one available (ideally your test-id).

Can I use this from Claude Code / Cursor / my agent loop? Yes, as the heal step, at dev time. Let your agent run PLAYWRIGHT_ANCHOR_MODE=heal npx playwright test, then review the .playwright-anchors.json diff like any other change it proposes. CI is unaffected either way: it only replays the committed cache.

License

MIT
