elementus-ai
v1.0.2
Self-healing element resolution for Playwright, WDIO & Appium. AI-powered fallback when selectors break.
Elementus
Self-healing element resolution for Playwright, WebDriverIO & Appium.
When a selector breaks, Elementus uses AI to find the element by natural-language description. Works with any action (click, fill, hover) and any assertion (toHaveText, toBeVisible). Supports local LLMs via LM Studio and cloud LLMs via Google Gemini API.
Installation
npm install elementus-ai
One-Prompt Setup
Copy this prompt to your AI coding agent (Claude, Cursor, Copilot, etc.) and it will analyze your project and integrate Elementus automatically:
I just installed the npm package "elementus-ai" — a self-healing element resolution library for test automation. Analyze my project and integrate it. Follow these steps:
1. DETECT MY FRAMEWORK
- Search for: playwright.config, wdio.conf, appium config files
- Check package.json for: @playwright/test, playwright, webdriverio, wdio, appium
- Read a few existing test files to understand the test structure
- If none found, tell me you can't detect a supported framework and stop
2. CHOOSE THE LLM PROVIDER
- Ask me: "Do you want to use a local LLM (LM Studio, free, private) or Google Gemini (cloud, fast, ~$0.01/500 tests)?"
- If Gemini: ask for API key or check for GEMINI_API_KEY env var
- If LM Studio: use defaults (localhost:1234, gemma model)
3. INTEGRATE BASED ON MY FRAMEWORK
For Playwright:
- Create or update a fixtures file that wraps page with el.wrapPage(page)
- Make sure all tests import from the fixtures file instead of @playwright/test
- Set actionTimeout: 10000 in playwright config (Elementus respects framework timeouts)
For WebDriverIO:
- In wdio.conf.js before hook, wrap browser and override global $:
const wrapped = el.wrapBrowser(browser); globalThis.$ = wrapped.$.bind(wrapped)
- This way all page objects use plain $() with optional { ai } — zero changes needed
For Appium:
- Add el.wrapBrowser(driver) in the before hook
4. EXAMPLE TEST
Ask me one of these three options:
a) "Which test case would you like me to add { ai } to as an example?"
b) "Or should I pick one test from your repo that has fragile selectors?"
c) "Or do you prefer no changes to existing tests — just the setup/config?"
If (a): apply { ai } to 1-2 fragile locators in the test I specify
If (b): find one test with fragile selectors (auto-generated IDs, deep CSS paths, nth-child), apply { ai } to 1-2 locators in that single test, explain why you chose it
If (c): skip this step — just confirm the setup is ready and show a standalone code snippet of how { ai } would look with my framework
IMPORTANT: Never modify more than one test file. This is an example — the user decides where to apply { ai } going forward.
5. VERIFY
- If a test was modified: run that single test to confirm it passes
- If no test was modified: confirm Elementus loads without errors by running a quick import check
Rules:
- Only modify ONE test file maximum, and only 1-2 locators in it — this is a demo, not a migration
- Do NOT add { ai } to every locator — only to ones with fragile selectors
- Stable selectors (data-testid, explicit IDs, aria labels) should NOT get { ai } — zero overhead matters
- The { ai } description should use words that appear in or near the element's visible text
- Never add runtime dependencies: Elementus has zero deps by design
Quick Start
const { createElementus } = require('elementus-ai')
const el = createElementus({
provider: 'gemini',
geminiApiKey: process.env.GEMINI_API_KEY,
})
// Wrap your page — add { ai } to any locator that might break:
const p = el.wrapPage(page)
await p.locator('#submit-btn', { ai: 'Submit order button' }).click()
// Locators WITHOUT { ai } work normally — zero overhead:
await p.locator('#stable-element').click()
LLM Provider Setup
Option A: Local LLM via LM Studio (free, private)
- Download LM Studio
- Load a vision-capable model (e.g., gemma-4-26b-a4b-it)
- Start the local server (default: http://localhost:1234)
const el = createElementus({
provider: 'lmstudio',
lmStudioUrl: 'http://localhost:1234/v1/chat/completions',
model: 'gemma-4-26b-a4b-it',
})
Option B: Google Gemini API (cloud, fast, better vision)
- Get an API key from Google AI Studio
const el = createElementus({
provider: 'gemini',
geminiApiKey: 'AIza...', // or set GEMINI_API_KEY env var
geminiModel: 'gemini-2.5-flash',
})
Framework Setup
Playwright
Wrap page once, add { ai } to any locator:
const p = el.wrapPage(page)
await p.locator('#btn', { ai: 'Submit order button' }).click()
await p.locator('#email', { ai: 'Email input field' }).fill('[email protected]')
Recommended: Playwright fixture (wrap once for all tests):
// fixtures.js
const { test: base } = require('@playwright/test')
const { createElementus } = require('elementus-ai')
const el = createElementus({ provider: 'gemini', geminiApiKey: '...' })
module.exports = base.extend({
page: async ({ page }, use) => {
await use(el.wrapPage(page))
}
})
// In tests: import test from the fixtures file, so page is already wrapped
const test = require('./fixtures')
test('example', async ({ page }) => {
await page.locator('#btn', { ai: 'Submit button' }).click()
})
WebDriverIO
Override the global $ in your wdio.conf.js so all page objects work transparently:
// wdio.conf.js
const { createElementus } = require('elementus-ai')
const el = createElementus({ provider: 'gemini', geminiApiKey: '...' })
exports.config = {
// ... other config
async before() {
const wrapped = el.wrapBrowser(browser)
globalThis.$ = wrapped.$.bind(wrapped)
}
}
// In tests / page objects — plain $() with optional { ai }:
await $('[data-testid="btn-send"]') // unchanged, zero overhead
await $('#btn', { ai: 'Submit order button' }).click() // self-healing
await $('#email', { ai: 'Email input field' }).setValue('[email protected]')
Appium (Native Android / iOS / Flutter)
Same wrapBrowser pattern. Elementus auto-detects native apps and parses the element tree from driver.getPageSource() (XML) instead of DOM scanning.
const d = el.wrapBrowser(driver)
await d.$('~loginButton', { ai: 'Login button on welcome screen' }).click()
await d.$('~emailField', { ai: 'Email input' }).setValue('[email protected]')
Works with Flutter, React Native, native Android/iOS: any Appium driver.
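To make the native-app path more concrete, here is a minimal sketch of how interactive candidates could be pulled out of an XML page source. It is not Elementus source code: extractCandidates is a hypothetical helper, the sample XML is invented, and the attribute names (text, content-desc, clickable) assume Android UiAutomator2 output. A real implementation would likely use a proper XML parser rather than regular expressions.

```javascript
// Hypothetical sketch: collect clickable elements from an Android page source.
// extractCandidates is illustrative only and is not part of the Elementus API.
function extractCandidates(xml) {
  const candidates = []
  // Match each opening or self-closing tag; capture its name and raw attributes.
  const tagPattern = /<([\w.]+)((?:\s+[\w:-]+="[^"]*")*)\s*\/?>/g
  for (const [, tag, attrText] of xml.matchAll(tagPattern)) {
    const attrs = {}
    for (const [, name, value] of attrText.matchAll(/([\w:-]+)="([^"]*)"/g)) {
      attrs[name] = value
    }
    if (attrs.clickable === 'true') {
      // Prefer visible text, fall back to the accessibility label.
      candidates.push({ tag, label: attrs.text || attrs['content-desc'] || '' })
    }
  }
  return candidates
}

// Invented sample in the shape that driver.getPageSource() returns on Android.
const source = `
<hierarchy rotation="0">
  <android.widget.FrameLayout clickable="false">
    <android.widget.Button text="Login" content-desc="loginButton" clickable="true" />
    <android.widget.EditText text="" content-desc="emailField" clickable="true" />
    <android.widget.TextView text="Welcome" clickable="false" />
  </android.widget.FrameLayout>
</hierarchy>`

const candidates = extractCandidates(source)
console.log(candidates)
```

The resulting labels are the kind of text a description such as 'Login button on welcome screen' could be matched against.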
API Reference
el.wrapPage(page)
Wraps a Playwright page. Returns a proxy where page.locator(selector, { ai: 'description' }) auto-creates AI-fallback locators. Locators without { ai } pass through unchanged.
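The proxy behavior described above can be pictured with a small, self-contained sketch. This is an assumption about the general technique, not the library's implementation: wrapPageSketch, onAiLocator, and fakePage are hypothetical names, and fakePage stands in for a real Playwright page so the snippet runs without a browser.

```javascript
// Hypothetical sketch of the proxy idea; not Elementus source code.
function wrapPageSketch(page, onAiLocator) {
  return new Proxy(page, {
    get(target, prop, receiver) {
      if (prop !== 'locator') {
        // Everything except locator() passes straight through.
        const value = Reflect.get(target, prop, receiver)
        return typeof value === 'function' ? value.bind(target) : value
      }
      return (selector, options = {}) => {
        const { ai, ...rest } = options
        const locator = target.locator(selector, rest)
        // Without a description, return the untouched locator.
        return ai ? onAiLocator(locator, ai) : locator
      }
    },
  })
}

// Stand-in for a Playwright page (assumption for the demo).
const fakePage = {
  locator: (selector) => ({ selector }),
  title: () => 'Checkout',
}

// Here the "AI-fallback locator" is just the locator tagged with its description.
const wrapped = wrapPageSketch(fakePage, (locator, ai) => ({ ...locator, ai }))
const plain = wrapped.locator('#stable-element')
const healing = wrapped.locator('#submit-btn', { ai: 'Submit order button' })
console.log(plain, healing, wrapped.title())
```

Only calls that carry { ai } reach the fallback hook, which is consistent with the pass-through behavior stated above.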
el.wrapBrowser(browser)
Wraps a WDIO/Appium browser. Returns a proxy where browser.$(selector, { ai: 'description' }) auto-creates AI-fallback elements.
el.find(context, description)
Find element by description only (no locator needed). Returns a framework-native locator/element.
const found = await el.find(page, 'Submit order button')
await found.click()
await expect(found).toHaveText('Submit')
el.locate(context, locator, description)
Try locator first, fall back to AI if it fails. Returns a framework-native locator/element. Respects your framework's configured action timeout.
const found = await el.locate(page, page.locator('#btn'), 'Submit button')
await found.click()
el.click(context, locator, description)
Click with optimized fallback: uses page.goto() for links (avoids hover/overlay issues) and JS click for buttons (no mouse movement). Best for navigation clicks. Respects your framework's configured action timeout.
await el.click(page, page.locator('#nav-blog'), 'Blog page link')
el.wrap(context, locator, description)
Low-level: wraps any single locator/element with AI fallback. Prefer wrapPage/wrapBrowser for cleaner code.
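As a rough illustration of the el.click fallback strategy described earlier in this section (navigate for links, JS click for buttons), the choice could look something like the sketch below. The element shape ({ tag, href }), the strategy names, and both helper functions are assumptions made for this example. page.goto() and locator.evaluate() are standard Playwright calls, but whether Elementus uses them in exactly this way is not confirmed by this README.

```javascript
// Hypothetical sketch: pick a click strategy from element metadata.
function chooseClickStrategy(element) {
  if (element.tag === 'a' && element.href) {
    // Links: navigate directly, which sidesteps hover and overlay problems.
    return { kind: 'navigate', url: element.href }
  }
  // Everything else: click from inside the page, with no mouse movement.
  return { kind: 'js-click' }
}

// Applies a strategy using a Playwright-style page and locator.
async function applyStrategy(page, locator, strategy) {
  if (strategy.kind === 'navigate') {
    await page.goto(strategy.url)
  } else {
    await locator.evaluate((node) => node.click())
  }
}

const linkStrategy = chooseClickStrategy({ tag: 'a', href: 'https://example.com/blog' })
const buttonStrategy = chooseClickStrategy({ tag: 'button' })
console.log(linkStrategy, buttonStrategy)

// Demo with a stand-in page object that records navigations.
const visited = []
const fakePage = { goto: async (url) => { visited.push(url) } }
applyStrategy(fakePage, null, linkStrategy).then(() => console.log(visited))
```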
Configuration
createElementus({
// LLM Provider
provider: 'lmstudio', // 'lmstudio' | 'gemini'
// LM Studio
lmStudioUrl: 'http://localhost:1234/v1/chat/completions',
model: 'gemma-4-26b-a4b-it',
// Gemini
geminiApiKey: null, // or GEMINI_API_KEY env var
geminiModel: 'gemini-2.5-flash',
// Behavior
maxCandidates: 20, // max elements sent to LLM for disambiguation
visionMaxWidth: 1280, // max screenshot width (px) sent to vision LLM
// Debugging
debug: false, // save screenshots to debugDir
debugDir: './debug', // directory for debug screenshots
// Custom stop words
stopWords: null, // Set of words to ignore in descriptions
})
Timeouts
Elementus respects your framework's configured timeouts. It does not override or race against them. Set appropriate action timeouts in your framework config:
// Playwright: playwright.config.js or test.use()
test.use({ actionTimeout: 10_000 }) // 10s before locator fails, then AI takes over
// WDIO: wdio.conf.js
waitforTimeout: 10000
If a selector works, it returns immediately (zero overhead). If it fails after your configured timeout, Elementus AI fallback kicks in.
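The "selector first, AI second" ordering can be sketched in a few lines. This is a simplified model under stated assumptions, not the library's code: resolveWithFallback and failingLocator are hypothetical, and the 50 ms delay stands in for a framework timeout such as actionTimeout or waitforTimeout.

```javascript
// Hypothetical sketch: try the primary resolver, fall back only on failure.
async function resolveWithFallback(primary, fallback) {
  try {
    // primary is expected to reject once the framework's own timeout elapses.
    return { element: await primary(), healed: false }
  } catch (error) {
    return { element: await fallback(error), healed: true }
  }
}

// Simulates a locator that rejects after `ms` milliseconds.
const failingLocator = (ms) => () =>
  new Promise((_, reject) =>
    setTimeout(() => reject(new Error(`Timeout ${ms}ms exceeded`)), ms))

const demo = Promise.all([
  // Working selector: the fallback is never invoked.
  resolveWithFallback(async () => 'button#stable-element', async () => 'unused'),
  // Broken selector: the fallback runs after the simulated timeout.
  resolveWithFallback(failingLocator(50), async () => 'button found by description'),
])
demo.then((results) => console.log(results))
```

Because the fallback starts only after the primary rejects, a shorter framework timeout means a shorter wait before healing begins on broken selectors.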
How It Works
When a selector fails, Elementus runs a 3-step pipeline:
Step 1: Locator — Try the original selector. If it works, done (zero overhead).
Step 2: DOM/Element Tree Scoring — Scan all interactive elements on the page (DOM for web, XML source for native apps). Score each by keyword and phrase relevance to the description. If one clear winner, use it. If multiple tied, send candidates to LLM. If all identical (e.g., 10x "Edit" buttons), use positional LLM with coordinates.
Step 3: Vision — Take a screenshot with a labeled grid overlay. Ask the vision LLM which region contains the target. Scroll there, re-scan. If still unresolved, ask for precise pixel coordinates.
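The branching in Step 2 can be written down as a small decision function. The sketch below is one plausible reading of the steps above, not the actual implementation: decideNextStep, the { text, score } candidate shape, and the rule that a zero best score hands over to vision are all assumptions.

```javascript
// Hypothetical sketch of the Step 2 decision over already-scored candidates.
function decideNextStep(scored) {
  if (scored.length === 0) return { action: 'vision' }
  const best = Math.max(...scored.map((c) => c.score))
  // Assumption: nothing relevant on the page means vision takes over.
  if (best === 0) return { action: 'vision' }
  const top = scored.filter((c) => c.score === best)
  // One clear winner: use it directly, no LLM call needed.
  if (top.length === 1) return { action: 'use', candidate: top[0] }
  // Tied candidates: identical text needs positional reasoning, otherwise disambiguate.
  const allIdentical = top.every((c) => c.text === top[0].text)
  return { action: allIdentical ? 'positional-llm' : 'llm', candidates: top }
}

const examples = {
  clearWinner: decideNextStep([{ text: 'Submit', score: 2 }, { text: 'Cancel', score: 0 }]),
  tied: decideNextStep([{ text: 'Save draft', score: 1 }, { text: 'Save changes', score: 1 }]),
  identical: decideNextStep([{ text: 'Edit', score: 1 }, { text: 'Edit', score: 1 }]),
  nothing: decideNextStep([]),
}
console.log(Object.fromEntries(Object.entries(examples).map(([k, v]) => [k, v.action])))
```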
Tips for Writing Descriptions
Good — use words that appear in or near the element:
- 'Submit order button' matches <button>Submit</button>
- 'Email input field' matches <input> near "Email" label
- 'Privacy Policy footer link' matches <a>Privacy Policy</a>
For identical elements, add positional context:
- 'first Edit button near the top'
- 'Delete button in the third row'
Avoid vague descriptions:
- 'the button' matches every button
- 'Save Changes button' is specific and matchable
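A toy scorer shows why specific descriptions resolve cleanly and vague ones do not. Everything here is an assumption made for illustration: the stop-word list, the tokenizer, and scoring by counting shared words between the description and the element's tag plus text. The library's real scoring may differ.

```javascript
// Hypothetical keyword scorer; the stop words and weighting are assumptions.
const STOP_WORDS = new Set(['the', 'a', 'an', 'in', 'on', 'near'])

// Lowercase, split on non-alphanumerics, drop empty strings and stop words.
function tokenize(text, stopWords = STOP_WORDS) {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter((w) => w && !stopWords.has(w))
}

// Score = number of description words found in the element's tag or text.
function scoreElement(description, element) {
  const haystack = new Set(tokenize(`${element.tag} ${element.text}`))
  return tokenize(description).filter((word) => haystack.has(word)).length
}

const buttons = [
  { tag: 'button', text: 'Submit' },
  { tag: 'button', text: 'Save Changes' },
  { tag: 'button', text: 'Cancel' },
]

const vague = buttons.map((b) => scoreElement('the button', b))
const specific = buttons.map((b) => scoreElement('Save Changes button', b))
console.log(vague)    // → [1, 1, 1]: every button ties
console.log(specific) // → [1, 3, 1]: one clear winner
```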
Platform Support
| Platform | Element scan | Vision | Status |
|----------|--------------|--------|--------|
| Playwright (web) | DOM | Screenshot + LLM | Full support |
| WDIO (web) | DOM | Screenshot + LLM | Full support |
| Appium (mobile web) | DOM | Screenshot + LLM | Full support |
| Appium (native Android/iOS) | XML source | Screenshot + LLM | Full support |
| Appium (Flutter) | XML source | Screenshot + LLM | Full support |
| Appium (React Native) | XML source | Screenshot + LLM | Full support |
License
MIT
