agentester

v1.0.0

Published

24 days ago

Automated UI/UX quality testing — crawls your web app and generates a scored report with performance, accessibility and network error analysis.

cynchro

testing ui ux accessibility playwright crawler quality a11y performance

AgenTester

© 2025 Cynchro — www.cynchrolabs.com.ar

Automated UI/UX testing system. Paste a URL and hit Start: AgenTester crawls every route, runs a full test suite (CRUD, filters, pagination, buttons, navigation, JS errors), streams screenshots in real time, and generates an HTML + JSON report with a quality score.

Quick start

Local (Node.js)

npm install
npm run setup          # installs Chromium
node server.js

Open http://localhost:3000

Docker

docker compose up -d --build

Open http://localhost:3000

Note: The container uses network_mode: host (Linux). This means 127.0.0.1 inside the container reaches your host machine directly — useful when the target app is also in Docker with a mapped port.

Architecture

cli.js             Headless CLI runner for CI/CD. Exit codes: 0=pass, 1=fail, 2=fatal.
server.js          Express + WebSocket server. One WS connection = one test session.
src/
  tester.js        Orchestrator. BFS crawler + per-page test runner + AI/regression hooks.
  analyzer.js      DOM reader. Extracts and classifies all interactive elements.
  executor.js      Test functions: CRUD, filters, pagination, navigation, buttons, selects, JS errors.
  reporter.js      Generates the HTML and JSON reports (including AI + regression sections).
  ai.js            LLM integration (Anthropic/DeepSeek/OpenAI): app detection, semantic analysis, narrative.
  regression.js    Baseline per URL: saves/loads snapshots, computes run diff.
  memory.js        Run history per app: trends, flaky test detection, cached page hints.
  alerts.js        Post-run notifications: Slack, generic webhook, threshold conditions.
  i18n.js          Translation dictionaries (ES / EN). t(key) factory pattern.
public/
  index.html       Single-page frontend. WebSocket client, real-time log, live screenshots.
reports/
  *.html / *.json  Generated reports.
  baselines/       Per-URL JSON baselines for regression comparison.
  memory/          Per-app run history and cached AI page hints.
.github/workflows/
  agentester.yml   Ready-to-use GitHub Actions workflow.
Dockerfile
docker-compose.yml
.env.example

Data flow

Browser (frontend)
  → WebSocket { type:'start', url, lang, credentials, options }
    → Tester.run()
        → _crawlAllPages()   BFS, up to 80 pages
        → _testOnePage()     per page, 90s timeout
            → analyzePage()  DOM snapshot
            → testCRUD / testFilters / testPagination / testNavigation / testButtons / testSelects / testConsoleErrors
        → reporter.save()    writes .html + .json
    → WebSocket { type:'complete', reportUrl, jsonUrl, summary }

Authentication

Four modes, selectable from the UI:

| Mode | When to use |
|---|---|
| No auth | Public app or already-authenticated session |
| Username / Password | Standard login form (email + password inputs) |
| Cookies JSON | Session cookie from DevTools → Application → Cookies |
| Bearer Token | API-driven frontend that reads a JWT from a header |

Cookie injection example

Copy from DevTools → Application → Cookies, format as JSON array:

[
  {
    "name": "session",
    "value": "eyJhbGciOiJIUzI1NiJ9...",
    "domain": "myapp.com",
    "path": "/"
  }
]

Paste into the Cookies JSON textarea. The tester injects them into the browser context before the first navigation — the app sees an already-authenticated browser.
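If all you have is a raw Cookie header string (for example, copied from the Network tab), a small helper can produce the array format shown above. This is a minimal sketch: `cookieHeaderToJson` is a hypothetical helper, not part of AgenTester, and it only fills the four fields used in the example.

```javascript
// Hypothetical helper (not part of AgenTester): converts a raw Cookie header
// string into the JSON array format expected by the Cookies JSON textarea.
function cookieHeaderToJson(header, domain, path = '/') {
  return header
    .split(';')
    .map((pair) => pair.trim())
    .filter((pair) => pair.includes('=')) // ignore fragments without a value
    .map((pair) => {
      const eq = pair.indexOf('='); // split on the first "=" only; values may contain "="
      return {
        name: pair.slice(0, eq).trim(),
        value: pair.slice(eq + 1).trim(),
        domain,
        path,
      };
    });
}

console.log(JSON.stringify(cookieHeaderToJson('session=abc123; theme=dark', 'myapp.com'), null, 2));
```

The printed array can be pasted into the textarea as-is.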

Bearer token

Paste the raw token (without Bearer prefix). It is added as Authorization: Bearer <token> to every HTTP request the browser makes.
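Tokens are often copied together with their prefix. The sketch below strips an optional `Bearer ` prefix before pasting and shows the header value described above. Both function names are hypothetical and are not AgenTester APIs.

```javascript
// Hypothetical helpers: normalise a copied token and show the resulting header.
function toRawToken(input) {
  // Remove surrounding whitespace and an optional "Bearer " prefix.
  return input.trim().replace(/^Bearer\s+/i, '');
}

function authorizationHeader(rawToken) {
  return { Authorization: `Bearer ${rawToken}` };
}

console.log(authorizationHeader(toRawToken('Bearer abc.def.ghi')));
```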

localStorage injection (programmatic)

Pass credentials.localStorage as a key→value object in the WebSocket start message. AgenTester injects it after the first load and reloads the page:

{
  "type": "start",
  "url": "https://myapp.com",
  "credentials": {
    "type": "bearer",
    "token": "eyJ...",
    "localStorage": {
      "auth_token": "eyJ...",
      "user_role": "admin"
    }
  }
}
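A client script could assemble this message as sketched below. `buildStartMessage` is a hypothetical helper; the field names follow the example above, and the string coercion reflects the fact that localStorage only stores strings.

```javascript
// Sketch: builds the WebSocket "start" message with localStorage entries.
// buildStartMessage is a hypothetical helper, not part of AgenTester.
function buildStartMessage(url, token, localStorageEntries = {}) {
  // localStorage values are always strings, so coerce them up front.
  const localStorage = Object.fromEntries(
    Object.entries(localStorageEntries).map(([key, value]) => [key, String(value)])
  );
  return {
    type: 'start',
    url,
    credentials: { type: 'bearer', token, localStorage },
  };
}

const startMessage = buildStartMessage('https://myapp.com', 'eyJ...', {
  auth_token: 'eyJ...',
  user_role: 'admin',
});
console.log(JSON.stringify(startMessage));
```

A client would then send the serialised message over its open WebSocket connection.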

Session expiry

If a form-login session expires mid-crawl (browser is redirected back to the login page), AgenTester detects it and re-authenticates automatically before continuing.
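The README does not describe how the redirect is detected. One plausible check, shown here purely as an illustration, compares the URL that was requested with the URL the browser landed on. The function name and the `/login` default path are assumptions.

```javascript
// Illustrative only: AgenTester's actual detection logic is not documented here.
// Assumption: the login page lives at a known path (loginPath).
function isLoginRedirect(requestedUrl, landedUrl, loginPath = '/login') {
  const stripSlash = (pathname) => pathname.replace(/\/+$/, '');
  const requested = stripSlash(new URL(requestedUrl).pathname);
  const landed = stripSlash(new URL(landedUrl).pathname);
  // Expired session: we asked for some other page but ended up on the login page.
  return landed === loginPath && requested !== loginPath;
}

console.log(isLoginRedirect('https://myapp.com/orders', 'https://myapp.com/login?next=%2Forders'));
```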

How the crawler works

Starts at the provided URL
BFS-extracts all <a href> links on the same origin
Also clicks nav/sidebar/menu buttons and captures any URL changes (SPA route discovery)
Normalises URLs (strips ?page=N, hashes) to avoid duplicate visits
Skips binary files (images, PDFs, fonts, CSS, JS)
Caps at 80 pages by default (configurable per run via the UI field, --max-pages, or the maxPages WebSocket option)
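The link-based rules above can be sketched over an in-memory link graph. This is a simplified model, not the code in src/tester.js: `linksByPage` stands in for the links found on each page, the extension list is an assumption, and SPA route discovery by clicking is left out.

```javascript
// Simplified model of the crawl rules; not AgenTester's actual implementation.
const BINARY_EXT = /\.(png|jpe?g|gif|svg|webp|ico|pdf|woff2?|ttf|eot|css|js)$/i;

// Strip the hash and the ?page=N parameter so paginated views count as one page.
function normalizeUrl(raw, base) {
  const url = new URL(raw, base);
  url.hash = '';
  url.searchParams.delete('page');
  return url.toString();
}

// Breadth-first traversal limited to the start URL's origin and to maxPages.
function crawl(startUrl, linksByPage, maxPages = 80) {
  const origin = new URL(startUrl).origin;
  const start = normalizeUrl(startUrl);
  const visited = new Set([start]);
  const queue = [start];
  while (queue.length > 0) {
    const page = queue.shift();
    for (const href of linksByPage[page] || []) {
      const next = normalizeUrl(href, page);
      const url = new URL(next);
      if (url.origin !== origin) continue; // same origin only
      if (BINARY_EXT.test(url.pathname)) continue; // skip binary and static assets
      if (visited.has(next)) continue; // already queued or visited
      if (visited.size >= maxPages) return [...visited]; // page cap reached
      visited.add(next);
      queue.push(next);
    }
  }
  return [...visited];
}

const demoGraph = {
  'https://myapp.com/': ['/users?page=2', '/users#top', '/logo.png', '/about'],
  'https://myapp.com/users': ['/users?page=3', '/'],
};
console.log(crawl('https://myapp.com', demoGraph));
```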

Tests run on every page

| Test | What it checks |
|---|---|
| Page load | No 404/error text in body |
| Navigation | Every nav link resolves without error |
| CRUD – Create | Click create button → form appears → fill → submit → success/error feedback |
| CRUD – Validation | Submit empty form → validation errors shown |
| CRUD – Edit | Click edit button → form appears |
| CRUD – Delete | Click delete → confirmation dialog present |
| Filters | Apply filter → result count changes |
| Pagination | Click Next → data changes |
| Selects | Dropdowns have selectable options |
| General buttons | Click each button → DOM reacts or navigates |
| JS errors | window.onerror + unhandledrejection captured |
| Network errors | 4xx/5xx responses logged as errors |
| Accessibility | axe-core WCAG 2.1 AA scan per page; violations mapped to severity and factored into the score |

Quality score

Computed at the end of each run (0–100):

score = 100
score -= criticalErrors × 15
score -= highErrors     × 8
score -= mediumErrors   × 3
score -= lowErrors      × 1
score -= (failedTests / totalTests) × 20
score = clamp(score, 0, 100)

Displayed live in the complete banner (green ≥ 80, yellow ≥ 60, red < 60) and stored in the JSON report.
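The formula and colour bands translate directly into code. The sketch below is a restatement of the rules above, not the project's own function: the names `qualityScore` and `scoreBand` are hypothetical, the guard for zero tests is an assumption, and any rounding AgenTester applies is not shown.

```javascript
// Restates the documented formula; function names are hypothetical.
function qualityScore({ criticalErrors, highErrors, mediumErrors, lowErrors, failedTests, totalTests }) {
  let score = 100;
  score -= criticalErrors * 15;
  score -= highErrors * 8;
  score -= mediumErrors * 3;
  score -= lowErrors * 1;
  // Assumption: with no tests at all, the failure-ratio penalty is skipped.
  if (totalTests > 0) score -= (failedTests / totalTests) * 20;
  return Math.min(100, Math.max(0, score)); // clamp to 0..100
}

// Colour band used by the complete banner: green >= 80, yellow >= 60, red < 60.
function scoreBand(score) {
  if (score >= 80) return 'green';
  if (score >= 60) return 'yellow';
  return 'red';
}

const exampleScore = qualityScore({
  criticalErrors: 1, highErrors: 1, mediumErrors: 1, lowErrors: 1,
  failedTests: 10, totalTests: 40,
});
console.log(exampleScore, scoreBand(exampleScore)); // 68 yellow
```

In the example, the deductions are 15 + 8 + 3 + 1 for errors plus (10 / 40) × 20 = 5 for failed tests, leaving 68.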

Reports

Each test run produces two files in reports/:

report_<session>_<ts>.html — self-contained visual report with screenshots, filterable by severity
report_<session>_<ts>.json — structured data (no screenshots) for programmatic use, diffs, dashboards

JSON report structure

{
  "meta": {
    "sessionId": "...",
    "targetUrl": "https://myapp.com",
    "startTime": "2024-01-15T10:00:00Z",
    "generatedAt": "2024-01-15T10:12:34Z",
    "duration": 754
  },
  "summary": {
    "total": 142,
    "passed": 128,
    "failed": 9,
    "warnings": 5,
    "criticalErrors": 0,
    "highErrors": 3,
    "mediumErrors": 6,
    "lowErrors": 2,
    "score": 74
  },
  "errors": [ ... ],
  "testResults": [ ... ],
  "pagesVisited": [ ... ]
}
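For dashboards or a custom quality gate, the JSON report can be reduced to a few numbers. This sketch relies only on the `meta` and `summary` fields shown above; `summarizeReport` and `minScore` are hypothetical names.

```javascript
// Sketch: condenses a parsed JSON report. summarizeReport is a hypothetical helper.
function summarizeReport(report, minScore) {
  const { total, passed, failed, score } = report.summary;
  return {
    target: report.meta.targetUrl,
    passRate: total > 0 ? passed / total : 0,
    failed,
    score,
    ok: score >= minScore,
  };
}

// In practice the object would come from JSON.parse on a file in reports/.
const sampleReport = {
  meta: { targetUrl: 'https://myapp.com' },
  summary: { total: 142, passed: 128, failed: 9, score: 74 },
};
console.log(summarizeReport(sampleReport, 70));
```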

CI/CD Integration

AgenTester ships with a ready-to-use GitHub Actions workflow. Copy it into your app's repo and every push will automatically crawl your staging environment, score it, and fail the pipeline if quality drops.

How it works

Developer pushes code
        │
        ▼
GitHub Actions runner
        │
        ├─ npm ci  +  npm run setup (Chromium)
        │
        ├─ node cli.js --url https://staging.myapp.com --threshold 70
        │       │
        │       ├─ crawls all routes (BFS, up to 80 pages)
        │       ├─ runs UI/UX test suite per page
        │       ├─ computes quality score (0–100)
        │       └─ compares vs previous baseline (regression check)
        │
        ├─ exit 0  →  score ≥ 70 and no regression  →  pipeline PASSES ✅
        ├─ exit 1  →  score < 70 or regression       →  pipeline FAILS  ❌
        └─ exit 2  →  fatal error (app unreachable)  →  pipeline FAILS  ❌
                │
                ├─ Slack alert sent  (if ALERT_SLACK_URL is set)
                ├─ HTML report uploaded as artifact
                └─ Score summary posted to job summary page

Setup (3 steps)

1. Copy the workflow into your app repo

# from your app's repo root
mkdir -p .github/workflows
curl -o .github/workflows/agentester.yml \
  https://raw.githubusercontent.com/cynchro/AgenTester/main/examples/github-actions.yml

Or copy examples/github-actions.yml manually.

2. Set your staging URL

Repository → Settings → Variables → Actions → New variable
Name:  TEST_URL
Value: https://staging.myapp.com

3. (Optional) Add secrets for AI analysis and Slack alerts

Repository → Settings → Secrets → Actions → New secret
ANTHROPIC_API_KEY   (or DEEPSEEK_API_KEY / OPENAI_API_KEY)
ALERT_SLACK_URL     (Slack Incoming Webhook URL)

That's it. Push to main and the workflow runs automatically.

CLI reference

node cli.js --url https://staging.myapp.com --threshold 70
# exit 0 → score ≥ 70, no regression
# exit 1 → score < 70 or regression detected
# exit 2 → fatal error

| Flag | Default | Description |
|---|---|---|
| --url <url> | required | Target URL |
| --threshold <n> | 0 | Minimum score for exit 0 |
| --lang <es\|en> | en | Report language |
| --output <file> | — | Save JSON result to file |
| --username / --password | — | Form login credentials |
| --token | — | Bearer token |
| --no-crud / --no-nav / --no-filters / --no-pagination / --no-buttons | — | Disable specific test suites |
| --no-a11y | — | Disable accessibility testing (WCAG 2.1 AA) |
| --max-pages <n> | 80 | Maximum pages to crawl (1–200) |
| --crawl-wait <ms> | 700 | Delay between crawl navigations |
| --test-wait <ms> | 1000 | Delay before running tests on each page |

The workflow file (examples/github-actions.yml) triggers on push, PR, schedule (Mon–Fri 8 AM UTC), and manual dispatch. It publishes the score as a job summary and uploads the HTML report as a 30-day artifact.
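The exit-code contract documented in this section can be restated as a small function, which may be handy when wrapping the CLI in another script. It is a sketch of the documented behaviour, not the code in cli.js, and the parameter names are assumptions.

```javascript
// Restates the documented exit codes: 0 = pass, 1 = fail, 2 = fatal.
// Parameter names are hypothetical.
function exitCodeFor({ fatal = false, score = 0, threshold = 0, isRegression = false }) {
  if (fatal) return 2; // e.g. the app is unreachable
  if (score < threshold || isRegression) return 1;
  return 0;
}

console.log(exitCodeFor({ score: 74, threshold: 70 })); // 0
console.log(exitCodeFor({ score: 74, threshold: 70, isRegression: true })); // 1
```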

Alerts

AgenTester sends notifications when quality thresholds are breached:

| Condition | Level |
|---|---|
| Score dropped ≥ N points vs baseline | Critical |
| New critical errors found | Critical |
| Regression detected | Warning |
| Score below configured minimum | Warning |
| Sustained score decline (trend) | Info |

Configure via environment variables — see .env.example.
Supported channels: Slack webhook, generic webhook (POST JSON).
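The threshold conditions in the table can be sketched as follows. This is an illustration rather than the logic in src/alerts.js: the shape of `run` and `baseline` is assumed, the trend condition is omitted because it needs run history, and the payload uses the plain `text` field that Slack Incoming Webhooks accept.

```javascript
// Illustration of the alert conditions; the run/baseline field names are assumptions.
function evaluateAlerts(run, baseline, env = process.env) {
  const minScore = Number(env.ALERT_MIN_SCORE ?? 0);
  const dropThreshold = Number(env.ALERT_SCORE_DROP_THRESHOLD ?? 10);
  const alerts = [];
  if (baseline && baseline.score - run.score >= dropThreshold) {
    alerts.push({ level: 'critical', reason: `Score dropped ${baseline.score - run.score} points` });
  }
  if (run.newCriticalErrors > 0) {
    alerts.push({ level: 'critical', reason: 'New critical errors found' });
  }
  if (run.isRegression) {
    alerts.push({ level: 'warning', reason: 'Regression detected' });
  }
  if (run.score < minScore) {
    alerts.push({ level: 'warning', reason: `Score ${run.score} is below minimum ${minScore}` });
  }
  return alerts;
}

// Slack Incoming Webhooks accept a JSON body with a "text" field.
function slackPayload(alerts) {
  return { text: alerts.map((a) => `[${a.level}] ${a.reason}`).join('\n') };
}

const demoAlerts = evaluateAlerts(
  { score: 60, newCriticalErrors: 0, isRegression: true },
  { score: 75 },
  { ALERT_MIN_SCORE: '70', ALERT_SCORE_DROP_THRESHOLD: '10' }
);
console.log(slackPayload(demoAlerts).text);
```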

AI Features (optional)

AgenTester can use an LLM to go beyond heuristic-based testing. The following capabilities are unlocked when an API key is present:

| Capability | When it runs | What it does |
|---|---|---|
| App Intelligence | After crawl | Detects app type (e-commerce, CRM, dashboard…), critical flows and risk areas |
| Semantic page analysis | Per complex page | Understands page purpose and suggests specific assertions. Results are cached per page and reused on subsequent runs. |
| Report narrative | End of run | Writes a 3–4 sentence QA summary in plain language, with regression status |
| Smart memory | Every run | Stores run history, detects score trends and flaky tests. Re-uses cached AI analysis to reduce API calls. |

Supported LLM providers

Set one of these environment variables — AgenTester picks the first one found:

| Variable | Provider | Model used |
|---|---|---|
| ANTHROPIC_API_KEY | Claude (Anthropic) | claude-haiku-4-5 (with prompt caching) |
| DEEPSEEK_API_KEY | DeepSeek | deepseek-chat |
| OPENAI_API_KEY | OpenAI | gpt-4o-mini |
| OPENAI_API_KEY + OPENAI_BASE_URL | Any OpenAI-compatible API | configured model |

No key? No problem. AgenTester runs its heuristic test suite unchanged, and all AI calls become graceful no-ops.
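The "first one found" rule might look like the sketch below. Anthropic taking priority is stated in the environment variable reference; checking DeepSeek before OpenAI is an assumption based on the table order, and `pickProvider` is a hypothetical name rather than a function from src/ai.js.

```javascript
// Sketch of provider selection. The DeepSeek-before-OpenAI order is assumed.
function pickProvider(env = process.env) {
  if (env.ANTHROPIC_API_KEY) return { provider: 'anthropic', model: 'claude-haiku-4-5' };
  if (env.DEEPSEEK_API_KEY) return { provider: 'deepseek', model: 'deepseek-chat' };
  if (env.OPENAI_API_KEY) {
    return env.OPENAI_BASE_URL
      ? { provider: 'openai-compatible', baseUrl: env.OPENAI_BASE_URL }
      : { provider: 'openai', model: 'gpt-4o-mini' };
  }
  return null; // no key: AI features are skipped
}

console.log(pickProvider({ DEEPSEEK_API_KEY: 'k1', OPENAI_API_KEY: 'k2' }));
```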

Regression memory

Every run automatically saves a baseline for the target URL in reports/baselines/. The next run compares against it and reports:

Score delta vs previous run
New test failures (regressions)
Resolved issues
Pages that disappeared or were added
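A comparison of this kind could be computed as sketched below. This is not the code in src/regression.js. It assumes that each run exposes `summary.score`, a `testResults` array whose items carry `test` and `status`, and a `pagesVisited` array of URL strings; the `isRegression` rule is also an assumption.

```javascript
// Sketch of a run diff; the input shapes and the isRegression rule are assumptions.
function diffRuns(baseline, current) {
  const failedTests = (run) =>
    new Set(run.testResults.filter((t) => t.status === 'fail').map((t) => t.test));
  const before = failedTests(baseline);
  const after = failedTests(current);
  const oldPages = new Set(baseline.pagesVisited);
  const newPages = new Set(current.pagesVisited);

  const newFailures = [...after].filter((name) => !before.has(name));
  const scoreDelta = current.summary.score - baseline.summary.score;
  return {
    scoreDelta,
    newFailures,
    resolved: [...before].filter((name) => !after.has(name)),
    pagesAdded: current.pagesVisited.filter((p) => !oldPages.has(p)),
    pagesRemoved: baseline.pagesVisited.filter((p) => !newPages.has(p)),
    isRegression: newFailures.length > 0 || scoreDelta < 0,
  };
}

const previousRun = {
  summary: { score: 82 },
  testResults: [{ test: 'Filters /users', status: 'fail' }, { test: 'Pagination /users', status: 'pass' }],
  pagesVisited: ['/users', '/legacy'],
};
const latestRun = {
  summary: { score: 74 },
  testResults: [{ test: 'Filters /users', status: 'pass' }, { test: 'Pagination /users', status: 'fail' }],
  pagesVisited: ['/users', '/reports'],
};
console.log(diffRuns(previousRun, latestRun));
```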

Configuration

Test options (UI checkboxes)

| Option | Default | Description |
|---|---|---|
| Navigation | ✅ | Test all nav links |
| Full CRUD | ✅ | Create / Edit / Delete flows |
| Filters / Search | ✅ | Filter inputs and selects |
| Pagination | ✅ | Next page button |
| General buttons | ✅ | All other buttons |
| Slow motion | ✅ | 80 ms delay between actions (visible mode) |
| Mode | Window | With window shows the browser; Headless hides it |

Environment variables

| Variable | Default | Description |
|---|---|---|
| PORT | 3000 | HTTP port |
| FORCE_HEADLESS | false | Force headless mode regardless of UI setting. Set to true in Docker. |
| NODE_ENV | — | Set to production in Docker |
| ANTHROPIC_API_KEY | — | Enables AI features via Claude (Haiku). Takes priority over other LLM keys. |
| DEEPSEEK_API_KEY | — | Enables AI features via DeepSeek Chat (OpenAI-compatible). |
| OPENAI_API_KEY | — | Enables AI features via OpenAI GPT-4o mini. |
| OPENAI_BASE_URL | — | Override base URL for any OpenAI-compatible provider (use with OPENAI_API_KEY). |
| ALERT_SLACK_URL | — | Slack Incoming Webhook URL for alert notifications. |
| ALERT_WEBHOOK_URL | — | Generic webhook URL (POST JSON) for alert notifications. |
| ALERT_MIN_SCORE | 0 | Minimum acceptable score. Alert sent if score falls below this value. |
| ALERT_SCORE_DROP_THRESHOLD | 10 | Score drop in points vs baseline that triggers an alert. |

Tuning crawler limits

Configurable per run — no code changes needed:

| Parameter | Default | UI | CLI | WS option |
|---|---|---|---|---|
| Max pages | 80 | ✅ Input field | --max-pages <n> | maxPages |
| Crawl wait | 700 ms | — | --crawl-wait <ms> | crawlWait |
| Test wait | 1000 ms | — | --test-wait <ms> | testWait |

Per-page test timeout is hardcoded to 90 seconds in _testOnePage.
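When building the `options` object for a WebSocket start message, the defaults and the documented 1–200 range for max pages can be applied up front. The sketch below is a hypothetical client-side helper; whether AgenTester itself clamps or rejects out-of-range values is not stated here.

```javascript
// Hypothetical helper: applies the documented defaults and the 1-200 page range.
function resolveCrawlOptions(options = {}) {
  const toNumber = (value, fallback) => {
    const n = Number(value);
    const missing = value === undefined || value === null || value === '';
    return missing || !Number.isFinite(n) ? fallback : n;
  };
  const clamp = (n, lo, hi) => Math.min(hi, Math.max(lo, n));
  return {
    maxPages: clamp(Math.trunc(toNumber(options.maxPages, 80)), 1, 200),
    crawlWait: Math.max(0, toNumber(options.crawlWait, 700)), // milliseconds
    testWait: Math.max(0, toNumber(options.testWait, 1000)), // milliseconds
  };
}

console.log(resolveCrawlOptions({ maxPages: 500, crawlWait: '250' }));
```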

i18n

The UI and all backend log/report strings support Spanish and English.

Frontend: language toggle button (🇪🇸 / 🇺🇸), persisted in localStorage
Backend: lang sent in the WebSocket start message → all phase logs, test names, error descriptions and the HTML report are generated in the selected language
Translation dictionaries: src/i18n.js (backend), LANGS object in public/index.html (frontend)
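The t(key) factory pattern mentioned for src/i18n.js generally looks like the sketch below. The dictionary keys, the `{count}` placeholder syntax and the fallback to English are illustrative assumptions, not the project's actual dictionaries.

```javascript
// Illustrative t(key) factory; keys and placeholder syntax are assumptions.
const DICTIONARIES = {
  en: { 'phase.crawl': 'Crawling {count} pages', 'test.pass': 'Passed' },
  es: { 'phase.crawl': 'Rastreando {count} páginas', 'test.pass': 'Aprobado' },
};

function createT(lang) {
  const dict = DICTIONARIES[lang] || DICTIONARIES.en; // unknown language: use English
  return (key, params = {}) => {
    // Fall back to English, then to the key itself, so missing entries stay visible.
    const template = dict[key] ?? DICTIONARIES.en[key] ?? key;
    return template.replace(/\{(\w+)\}/g, (match, name) =>
      name in params ? String(params[name]) : match
    );
  };
}

const t = createT('es');
console.log(t('phase.crawl', { count: 3 })); // Rastreando 3 páginas
```

Creating `t` once per session keeps the language choice out of every call site.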

Docker details

# docker-compose.yml highlights
network_mode: host      # container shares host network stack
shm_size: 2gb           # Chromium needs shared memory
security_opt:
  - seccomp=unconfined  # required for Chromium sandbox in some kernels

The Playwright Docker image (mcr.microsoft.com/playwright:v1.59.1-jammy) includes Chromium and all system dependencies — no separate npm run setup needed inside the container.

Reports are mounted as a volume so they persist across container restarts:

volumes:
  - ./reports:/app/reports

API

| Endpoint | Description |
|---|---|
| GET / | Frontend UI |
| GET /api/reports | JSON list of all report files, sorted by date |
| GET /reports/<filename> | Serve a specific HTML or JSON report |
| WS / | WebSocket endpoint for test sessions |

WebSocket messages

Client → Server

{ "type": "start", "url": "https://...", "lang": "es", "credentials": {...}, "options": {...} }
{ "type": "stop" }

Server → Client

{ "type": "log",            "level": "info|warn|error", "message": "...", "timestamp": "..." }
{ "type": "phase",          "name": "...", "progress": 42 }
{ "type": "phase_step",     "message": "..." }
{ "type": "screenshot",     "data": "<base64 jpeg>", "caption": "..." }
{ "type": "test_result",    "status": "pass|fail|warning", "test": "...", "details": "..." }
{ "type": "complete",       "reportUrl": "/reports/...", "jsonUrl": "/reports/...", "summary": {...}, "duration": 120 }
{ "type": "error",          "message": "..." }
{ "type": "ai_insight",     "appType": "crm", "description": "...", "criticalFlows": [...] }
{ "type": "ai_narrative",   "text": "The application scored 74/100..." }
{ "type": "regression_diff","scoreDelta": -8, "newFailures": 2, "resolved": 1, "isRegression": true, ... }
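A custom client can fold the server messages into a small session state. The reducer below is a sketch based only on the message shapes listed above. The state layout is hypothetical, and treating `error` as the end of the session is an assumption.

```javascript
// Sketch of a client-side reducer for server messages; the state shape is hypothetical.
const initialState = { logs: [], phase: null, progress: 0, counts: {}, done: false };

function applyMessage(state, msg) {
  switch (msg.type) {
    case 'log':
      return { ...state, logs: [...state.logs, `[${msg.level}] ${msg.message}`] };
    case 'phase':
      return { ...state, phase: msg.name, progress: msg.progress };
    case 'test_result': {
      const count = (state.counts[msg.status] || 0) + 1;
      return { ...state, counts: { ...state.counts, [msg.status]: count } };
    }
    case 'complete':
      return { ...state, done: true, reportUrl: msg.reportUrl, summary: msg.summary };
    case 'error':
      // Assumption: an error message ends the session.
      return { ...state, done: true, error: msg.message };
    default:
      return state; // ignore message types this client does not handle
  }
}

const demoMessages = [
  { type: 'phase', name: 'crawl', progress: 42 },
  { type: 'test_result', status: 'pass', test: 'Navigation', details: '' },
  { type: 'test_result', status: 'fail', test: 'Filters', details: '' },
  { type: 'complete', reportUrl: '/reports/example.html', summary: { score: 74 } },
];
console.log(demoMessages.reduce(applyMessage, initialState));
```

In a real client, each incoming WebSocket frame would be parsed with JSON.parse and passed to `applyMessage`.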

Contact

[email protected]

Buy me a coffee?

https://www.paypal.com/donate/?hosted_button_id=YX332RT7KSJ4Q

Landing page

See https://cynchrolabs.com.ar/#/landings/agentester

Contributing

See CONTRIBUTING.md.

Code of Conduct

See CODE_OF_CONDUCT.md.

License

MIT

⭐ If you like this project, give it a star!
