agentester
v1.0.0
Published
Automated UI/UX quality testing — crawls your web app and generates a scored report with performance, accessibility and network error analysis.
AgenTester
© 2025 Cynchro — www.cynchrolabs.com.ar
Automated UI/UX testing system. Paste a URL and hit Start: it crawls every route, runs a full test suite (CRUD, filters, pagination, buttons, navigation, JS errors), streams screenshots in real time, and generates an HTML + JSON report with a quality score.
Quick start
Local (Node.js)
npm install
npm run setup # installs Chromium
node server.js
Open http://localhost:3000
Docker
docker compose up -d --build
Open http://localhost:3000
Note: The container uses network_mode: host (Linux). This means 127.0.0.1 inside the container reaches your host machine directly, which is useful when the target app is also in Docker with a mapped port.
Architecture
cli.js                 Headless CLI runner for CI/CD. Exit codes: 0=pass, 1=fail, 2=fatal.
server.js              Express + WebSocket server. One WS connection = one test session.
src/
  tester.js            Orchestrator. BFS crawler + per-page test runner + AI/regression hooks.
  analyzer.js          DOM reader. Extracts and classifies all interactive elements.
  executor.js          Test functions: CRUD, filters, pagination, navigation, buttons, selects, JS errors.
  reporter.js          Generates the HTML and JSON reports (including AI + regression sections).
  ai.js                LLM integration (Anthropic/DeepSeek/OpenAI): app detection, semantic analysis, narrative.
  regression.js        Baseline per URL: saves/loads snapshots, computes run diff.
  memory.js            Run history per app: trends, flaky test detection, cached page hints.
  alerts.js            Post-run notifications: Slack, generic webhook, threshold conditions.
  i18n.js              Translation dictionaries (ES / EN). t(key) factory pattern.
public/
  index.html           Single-page frontend. WebSocket client, real-time log, live screenshots.
reports/
  *.html / *.json      Generated reports.
  baselines/           Per-URL JSON baselines for regression comparison.
  memory/              Per-app run history and cached AI page hints.
.github/workflows/
  agentester.yml       Ready-to-use GitHub Actions workflow.
Dockerfile
docker-compose.yml
.env.example
Data flow
Browser (frontend)
  → WebSocket { type:'start', url, lang, credentials, options }
    → Tester.run()
      → _crawlAllPages()    BFS, up to 80 pages
      → _testOnePage()      per page, 90s timeout
        → analyzePage()     DOM snapshot
        → testCRUD / testFilters / testPagination / testNavigation / testButtons / testSelects / testConsoleErrors
      → reporter.save()     writes .html + .json
  → WebSocket { type:'complete', reportUrl, jsonUrl, summary }
Authentication
Four modes, selectable from the UI:
| Mode | When to use |
|---|---|
| No auth | Public app or already-authenticated session |
| Username / Password | Standard login form (email + password inputs) |
| Cookies JSON | Session cookie from DevTools → Application → Cookies |
| Bearer Token | API-driven frontend that reads a JWT from a header |
Cookie injection example
Copy from DevTools → Application → Cookies, format as JSON array:
[
{
"name": "session",
"value": "eyJhbGciOiJIUzI1NiJ9...",
"domain": "myapp.com",
"path": "/"
}
]
Paste into the Cookies JSON textarea. The tester injects them into the browser context before the first navigation, so the app sees an already-authenticated browser.
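AgenTester's own injection code is not shown here. As a rough sketch, the pasted array could be validated before it reaches the browser context. Playwright's `browserContext.addCookies()` expects each cookie to carry `name`, `value`, and either a `url` or a `domain` + `path` pair. `parseCookiesJson` is a hypothetical helper, not part of AgenTester.

```javascript
// Hypothetical helper: validates the text pasted into the Cookies JSON textarea.
function parseCookiesJson(text) {
  const cookies = JSON.parse(text); // throws SyntaxError on malformed JSON
  if (!Array.isArray(cookies)) {
    throw new TypeError('Cookies JSON must be an array');
  }
  return cookies.map((cookie, i) => {
    if (cookie === null || typeof cookie !== 'object') {
      throw new TypeError(`Cookie #${i} must be an object`);
    }
    if (typeof cookie.name !== 'string' || typeof cookie.value !== 'string') {
      throw new TypeError(`Cookie #${i} needs string "name" and "value"`);
    }
    // Playwright accepts either a url, or a domain together with a path.
    const hasUrl = typeof cookie.url === 'string';
    const hasDomainPath =
      typeof cookie.domain === 'string' && typeof cookie.path === 'string';
    if (!hasUrl && !hasDomainPath) {
      throw new TypeError(`Cookie #${i} needs "url" or "domain" + "path"`);
    }
    return cookie;
  });
}
```

With a Playwright context in hand, the validated array could then be passed along as `await context.addCookies(parseCookiesJson(text))`.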
Bearer token
Paste the raw token (without Bearer prefix). It is added as Authorization: Bearer <token> to every HTTP request the browser makes.
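A minimal sketch of how the header could be built from the token field. `bearerHeaders` is a hypothetical name, and stripping an accidentally pasted `Bearer ` prefix is an assumption for convenience, not documented AgenTester behaviour.

```javascript
// Hypothetical helper: turns the raw token into the extra request header.
function bearerHeaders(rawToken) {
  // Tolerate a pasted "Bearer " prefix (assumption, not documented behaviour).
  const token = String(rawToken).trim().replace(/^Bearer\s+/i, '');
  if (!token) {
    throw new Error('Token is empty');
  }
  return { Authorization: `Bearer ${token}` };
}
```

In Playwright, an object like this can be supplied through the `extraHTTPHeaders` context option so that every request from the browser carries it.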
localStorage injection (programmatic)
Pass credentials.localStorage as a key→value object in the WebSocket start message. AgenTester injects it after the first load and reloads the page:
{
"type": "start",
"url": "https://myapp.com",
"credentials": {
"type": "bearer",
"token": "eyJ...",
"localStorage": {
"auth_token": "eyJ...",
"user_role": "admin"
}
}
}
Session expiry
If a form-login session expires mid-crawl (browser is redirected back to the login page), AgenTester detects it and re-authenticates automatically before continuing.
How the crawler works
- Starts at the provided URL
- BFS-extracts all <a href> links on the same origin
- Also clicks nav/sidebar/menu buttons and captures any URL changes (SPA route discovery)
- Normalises URLs (strips ?page=N and hashes) to avoid duplicate visits
- Skips static assets and binary files (images, PDFs, fonts, CSS, JS)
- Caps at 80 pages by default (MAX_PAGES in src/tester.js; overridable per run with --max-pages)
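The steps above can be sketched as a small breadth-first crawl. This is an illustration under stated assumptions, not the code in src/tester.js: `getLinks` stands in for "load the page and read its links", the extension list is illustrative, and the click-based SPA route discovery is left out.

```javascript
// Extensions treated as non-page assets (illustrative list, not AgenTester's own).
const SKIP_EXT = /\.(png|jpe?g|gif|svg|ico|pdf|woff2?|ttf|css|js)$/i;

// Strip the hash and the ?page=N parameter so paginated variants collapse to one URL.
function normalizeUrl(href, base) {
  const u = new URL(href, base);
  u.hash = '';
  u.searchParams.delete('page');
  return u.toString();
}

// Breadth-first crawl limited to the start URL's origin and to maxPages pages.
function crawl(startUrl, getLinks, maxPages = 80) {
  const origin = new URL(startUrl).origin;
  const start = normalizeUrl(startUrl);
  const seen = new Set([start]);
  const queue = [start];
  const visited = [];
  while (queue.length > 0 && visited.length < maxPages) {
    const url = queue.shift();
    visited.push(url);
    for (const href of getLinks(url)) {
      let next;
      try {
        next = new URL(normalizeUrl(href, url));
      } catch {
        continue; // unparsable href
      }
      if (next.origin !== origin || SKIP_EXT.test(next.pathname)) continue;
      if (!seen.has(next.href)) {
        seen.add(next.href);
        queue.push(next.href);
      }
    }
  }
  return visited;
}
```

Keeping a `seen` set separate from the queue means a URL is enqueued at most once, even when many pages link to it.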
Tests run on every page
| Test | What it checks |
|---|---|
| Page load | No 404/error text in body |
| Navigation | Every nav link resolves without error |
| CRUD – Create | Click create button → form appears → fill → submit → success/error feedback |
| CRUD – Validation | Submit empty form → validation errors shown |
| CRUD – Edit | Click edit button → form appears |
| CRUD – Delete | Click delete → confirmation dialog present |
| Filters | Apply filter → result count changes |
| Pagination | Click Next → data changes |
| Selects | Dropdowns have selectable options |
| General buttons | Click each button → DOM reacts or navigates |
| JS errors | window.onerror + unhandledrejection captured |
| Network errors | 4xx/5xx responses logged as errors |
| Accessibility | axe-core WCAG 2.1 AA scan per page — violations mapped to severity and factored into the score |
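axe-core tags each violation with an `impact` of `minor`, `moderate`, `serious` or `critical`. How AgenTester maps those onto its own severity levels is not documented here; the sketch below assumes a one-to-one mapping onto low / medium / high / critical, and `countBySeverity` is a hypothetical helper.

```javascript
// Assumed mapping from axe-core "impact" values to the report's severity buckets.
const IMPACT_TO_SEVERITY = {
  critical: 'critical',
  serious: 'high',
  moderate: 'medium',
  minor: 'low',
};

// Tally a list of axe-core violations into per-severity counts.
function countBySeverity(violations) {
  const counts = { critical: 0, high: 0, medium: 0, low: 0 };
  for (const violation of violations) {
    // A missing or unknown impact is counted as low (assumption).
    const severity = IMPACT_TO_SEVERITY[violation.impact] || 'low';
    counts[severity] += 1;
  }
  return counts;
}
```

Counts in this shape line up with the `criticalErrors` / `highErrors` / `mediumErrors` / `lowErrors` fields used by the quality score.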
Quality score
Computed at the end of each run (0–100):
score = 100
score -= criticalErrors × 15
score -= highErrors × 8
score -= mediumErrors × 3
score -= lowErrors × 1
score -= (failedTests / totalTests) × 20
score = clamp(score, 0, 100)
Displayed live in the complete banner (green ≥ 80, yellow ≥ 60, red < 60) and stored in the JSON report.
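The formula translates almost line for line into JavaScript. This is a sketch rather than the reporter's actual code: the function names are hypothetical, the zero-tests guard is an assumption, and no rounding is applied because the formula above does not specify any.

```javascript
// Quality score as described above: start at 100, subtract weighted penalties, clamp to 0..100.
function computeScore({ criticalErrors, highErrors, mediumErrors, lowErrors, failed, total }) {
  let score = 100;
  score -= criticalErrors * 15;
  score -= highErrors * 8;
  score -= mediumErrors * 3;
  score -= lowErrors * 1;
  // Avoid dividing by zero when no tests ran (assumption; not stated in the source).
  score -= total > 0 ? (failed / total) * 20 : 0;
  return Math.min(100, Math.max(0, score));
}

// Banner colour thresholds: green >= 80, yellow >= 60, red otherwise.
function scoreBand(score) {
  if (score >= 80) return 'green';
  if (score >= 60) return 'yellow';
  return 'red';
}
```

For example, one critical error, two high errors and 10 failed tests out of 40 give 100 − 15 − 16 − 5 = 64, which falls in the yellow band.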
Reports
Each test run produces two files in reports/:
- report_<session>_<ts>.html: self-contained visual report with screenshots, filterable by severity
- report_<session>_<ts>.json: structured data (no screenshots) for programmatic use, diffs, dashboards
JSON report structure
{
"meta": {
"sessionId": "...",
"targetUrl": "https://myapp.com",
"startTime": "2024-01-15T10:00:00Z",
"generatedAt": "2024-01-15T10:12:34Z",
"duration": 754
},
"summary": {
"total": 142,
"passed": 128,
"failed": 9,
"warnings": 5,
"criticalErrors": 0,
"highErrors": 3,
"mediumErrors": 6,
"lowErrors": 2,
"score": 74
},
"errors": [ ... ],
"testResults": [ ... ],
"pagesVisited": [ ... ]
}
CI/CD Integration
AgenTester ships with a ready-to-use GitHub Actions workflow. Copy it into your app's repo and every push will automatically crawl your staging environment, score it, and fail the pipeline if quality drops.
How it works
Developer pushes code
│
▼
GitHub Actions runner
│
├─ npm ci + npm run setup (Chromium)
│
├─ node cli.js --url https://staging.myapp.com --threshold 70
│ │
│ ├─ crawls all routes (BFS, up to 80 pages)
│ ├─ runs UI/UX test suite per page
│ ├─ computes quality score (0–100)
│ └─ compares vs previous baseline (regression check)
│
├─ exit 0 → score ≥ 70 and no regression → pipeline PASSES ✅
├─ exit 1 → score < 70 or regression → pipeline FAILS ❌
└─ exit 2 → fatal error (app unreachable) → pipeline FAILS ❌
│
├─ Slack alert sent (if ALERT_SLACK_URL is set)
├─ HTML report uploaded as artifact
└─ Score summary posted to job summary page
Setup (3 steps)
1. Copy the workflow into your app repo
# from your app's repo root
mkdir -p .github/workflows
curl -o .github/workflows/agentester.yml \
https://raw.githubusercontent.com/cynchro/AgenTester/main/examples/github-actions.yml
Or copy examples/github-actions.yml manually.
2. Set your staging URL
Repository → Settings → Variables → Actions → New variable
Name: TEST_URL
Value: https://staging.myapp.com
3. (Optional) Add secrets for AI analysis and Slack alerts
Repository → Settings → Secrets → Actions → New secret
ANTHROPIC_API_KEY (or DEEPSEEK_API_KEY / OPENAI_API_KEY)
ALERT_SLACK_URL (Slack Incoming Webhook URL)
That's it. Push to main and the workflow runs automatically.
CLI reference
node cli.js --url https://staging.myapp.com --threshold 70
# exit 0 → score ≥ 70, no regression
# exit 1 → score < 70 or regression detected
# exit 2 → fatal error
| Flag | Default | Description |
|---|---|---|
| --url <url> | required | Target URL |
| --threshold <n> | 0 | Minimum score for exit 0 |
| --lang <es\|en> | en | Report language |
| --output <file> | — | Save JSON result to file |
| --username / --password | — | Form login credentials |
| --token | — | Bearer token |
| --no-crud / --no-nav / --no-filters / --no-pagination / --no-buttons | — | Disable specific test suites |
| --no-a11y | — | Disable accessibility testing (WCAG 2.1 AA) |
| --max-pages <n> | 80 | Maximum pages to crawl (1–200) |
| --crawl-wait <ms> | 700 | Delay between crawl navigations |
| --test-wait <ms> | 1000 | Delay before running tests on each page |
The workflow file (examples/github-actions.yml) triggers on push, PR, schedule (Mon–Fri 8 AM UTC), and manual dispatch. It publishes the score as a job summary and uploads the HTML report as a 30-day artifact.
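The exit-code contract documented above fits in a few lines. The sketch below is illustrative only: `exitCodeFor` and its field names are hypothetical and this is not the logic in cli.js.

```javascript
// Maps a finished (or failed) run to the documented exit codes:
// 0 = pass, 1 = score below threshold or regression, 2 = fatal error.
function exitCodeFor({ fatal = false, score = 0, threshold = 0, isRegression = false }) {
  if (fatal) return 2;
  if (score < threshold || isRegression) return 1;
  return 0;
}
```

Because the default threshold is 0, a run without `--threshold` can only fail on a regression or a fatal error.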
Alerts
AgenTester sends notifications when quality thresholds are breached:
| Condition | Level |
|---|---|
| Score dropped ≥ N points vs baseline | Critical |
| New critical errors found | Critical |
| Regression detected | Warning |
| Score below configured minimum | Warning |
| Sustained score decline (trend) | Info |
Configure via environment variables — see .env.example.
Supported channels: Slack webhook, generic webhook (POST JSON).
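As an illustration of how the conditions in the table could be evaluated, the sketch below checks a run against its baseline using the documented `ALERT_MIN_SCORE` and `ALERT_SCORE_DROP_THRESHOLD` variables. The `run` and `baseline` field names are assumptions, and the trend-based Info alert is omitted because it needs run history.

```javascript
// Hypothetical evaluator: returns the alerts a run would raise.
function evaluateAlerts(run, baseline, env = {}) {
  const minScore = Number(env.ALERT_MIN_SCORE ?? 0);
  const dropThreshold = Number(env.ALERT_SCORE_DROP_THRESHOLD ?? 10);
  const alerts = [];

  if (baseline && baseline.score - run.score >= dropThreshold) {
    alerts.push({ level: 'critical', reason: 'score_drop' });
  }
  if (run.newCriticalErrors > 0) {
    alerts.push({ level: 'critical', reason: 'new_critical_errors' });
  }
  if (run.isRegression) {
    alerts.push({ level: 'warning', reason: 'regression' });
  }
  if (run.score < minScore) {
    alerts.push({ level: 'warning', reason: 'below_min_score' });
  }
  return alerts;
}
```

Each resulting alert could then be posted as JSON to whichever of `ALERT_SLACK_URL` or `ALERT_WEBHOOK_URL` is configured.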
AI Features (optional)
AgenTester can use an LLM to go beyond heuristic-based testing. Four capabilities are unlocked when an API key is present:
| Capability | When it runs | What it does |
|---|---|---|
| App Intelligence | After crawl | Detects app type (e-commerce, CRM, dashboard…), critical flows and risk areas |
| Semantic page analysis | Per complex page | Understands page purpose and suggests specific assertions. Results are cached per page and reused on subsequent runs. |
| Report narrative | End of run | Writes a 3–4 sentence QA summary in plain language, with regression status |
| Smart memory | Every run | Stores run history, detects score trends and flaky tests. Re-uses cached AI analysis to reduce API calls. |
Supported LLM providers
Set one of these environment variables — AgenTester picks the first one found:
| Variable | Provider | Model used |
|---|---|---|
| ANTHROPIC_API_KEY | Claude (Anthropic) | claude-haiku-4-5 — with prompt caching |
| DEEPSEEK_API_KEY | DeepSeek | deepseek-chat |
| OPENAI_API_KEY | OpenAI | gpt-4o-mini |
| OPENAI_API_KEY + OPENAI_BASE_URL | Any OpenAI-compatible API | configured model |
No key? No problem. AgenTester runs exactly as before — all AI calls are graceful no-ops.
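The "first one found" rule can be pictured as a simple priority chain. This sketch assumes the order of the table above (Anthropic, then DeepSeek, then OpenAI); `pickProvider` and its return shape are hypothetical and not the code in src/ai.js.

```javascript
// Picks the LLM provider from environment variables, in table order.
function pickProvider(env) {
  if (env.ANTHROPIC_API_KEY) {
    return { provider: 'anthropic', model: 'claude-haiku-4-5' };
  }
  if (env.DEEPSEEK_API_KEY) {
    return { provider: 'deepseek', model: 'deepseek-chat' };
  }
  if (env.OPENAI_API_KEY) {
    return env.OPENAI_BASE_URL
      // Custom endpoint: the model is configured elsewhere, so none is assumed here.
      ? { provider: 'openai-compatible', baseUrl: env.OPENAI_BASE_URL, model: null }
      : { provider: 'openai', model: 'gpt-4o-mini' };
  }
  return null; // no key: AI calls become no-ops
}
```

Calling `pickProvider(process.env)` once at startup and treating `null` as "AI disabled" keeps the rest of the code free of per-provider checks.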
Regression memory
Every run automatically saves a baseline for the target URL in reports/baselines/. The next run compares against it and reports:
- Score delta vs previous run
- New test failures (regressions)
- Resolved issues
- Pages that disappeared or were added
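A run diff of this kind comes down to a few set differences. The baseline shape used below (`score`, `failedTests`, `pages`) and the rule that any new failure counts as a regression are assumptions for illustration; the real format lives in src/regression.js.

```javascript
// Compares the current run against the previous baseline.
function diffRuns(previous, current) {
  const prevFailed = new Set(previous.failedTests);
  const currFailed = new Set(current.failedTests);
  const prevPages = new Set(previous.pages);
  const currPages = new Set(current.pages);

  const newFailures = current.failedTests.filter((t) => !prevFailed.has(t));
  const resolved = previous.failedTests.filter((t) => !currFailed.has(t));
  const pagesAdded = current.pages.filter((p) => !prevPages.has(p));
  const pagesRemoved = previous.pages.filter((p) => !currPages.has(p));

  return {
    scoreDelta: current.score - previous.score,
    newFailures,
    resolved,
    pagesAdded,
    pagesRemoved,
    // Assumption: a regression means at least one test that used to pass now fails.
    isRegression: newFailures.length > 0,
  };
}
```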
Configuration
Test options (UI checkboxes)
| Option | Default | Description |
|---|---|---|
| Navigation | ✅ | Test all nav links |
| Full CRUD | ✅ | Create / Edit / Delete flows |
| Filters / Search | ✅ | Filter inputs and selects |
| Pagination | ✅ | Next page button |
| General buttons | ✅ | All other buttons |
| Slow motion | ✅ | 80 ms delay between actions (visible mode) |
| Mode | Window | Window shows the browser; Headless hides it |
Environment variables
| Variable | Default | Description |
|---|---|---|
| PORT | 3000 | HTTP port |
| FORCE_HEADLESS | false | Force headless mode regardless of UI setting. Set to true in Docker. |
| NODE_ENV | — | Set to production in Docker |
| ANTHROPIC_API_KEY | — | Enables AI features via Claude (Haiku). Takes priority over other LLM keys. |
| DEEPSEEK_API_KEY | — | Enables AI features via DeepSeek Chat (OpenAI-compatible). |
| OPENAI_API_KEY | — | Enables AI features via OpenAI GPT-4o mini. |
| OPENAI_BASE_URL | — | Override base URL for any OpenAI-compatible provider (use with OPENAI_API_KEY). |
| ALERT_SLACK_URL | — | Slack Incoming Webhook URL for alert notifications. |
| ALERT_WEBHOOK_URL | — | Generic webhook URL (POST JSON) for alert notifications. |
| ALERT_MIN_SCORE | 0 | Minimum acceptable score. Alert sent if score falls below this value. |
| ALERT_SCORE_DROP_THRESHOLD | 10 | Score drop in points vs baseline that triggers an alert. |
Tuning crawler limits
Configurable per run — no code changes needed:
| Parameter | Default | UI | CLI | WS option |
|---|---|---|---|---|
| Max pages | 80 | ✅ Input field | --max-pages <n> | maxPages |
| Crawl wait | 700 ms | — | --crawl-wait <ms> | crawlWait |
| Test wait | 1000 ms | — | --test-wait <ms> | testWait |
Per-page test timeout is hardcoded to 90 seconds in _testOnePage.
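A sketch of how the three per-run limits could be resolved from the WebSocket `options` object, using the defaults and the 1–200 page range documented above. Clamping out-of-range values instead of rejecting them is an assumption, and `resolveCrawlOptions` is a hypothetical name.

```javascript
const CRAWL_DEFAULTS = { maxPages: 80, crawlWait: 700, testWait: 1000 };

// Parses a numeric option, falling back when it is missing or not a number.
function numberOr(value, fallback) {
  if (value === undefined || value === null || value === '') return fallback;
  const n = Number(value);
  return Number.isFinite(n) ? n : fallback;
}

// Resolves maxPages / crawlWait / testWait with defaults and bounds.
function resolveCrawlOptions(options = {}) {
  const maxPages = Math.round(numberOr(options.maxPages, CRAWL_DEFAULTS.maxPages));
  return {
    maxPages: Math.min(200, Math.max(1, maxPages)), // documented range: 1–200
    crawlWait: Math.max(0, numberOr(options.crawlWait, CRAWL_DEFAULTS.crawlWait)),
    testWait: Math.max(0, numberOr(options.testWait, CRAWL_DEFAULTS.testWait)),
  };
}
```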
i18n
The UI and all backend log/report strings support Spanish and English.
- Frontend: language toggle button (🇪🇸 / 🇺🇸), persisted in localStorage
- Backend: lang sent in the WebSocket start message → all phase logs, test names, error descriptions and the HTML report are generated in the selected language
- Translation dictionaries: src/i18n.js (backend), LANGS object in public/index.html (frontend)
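The `t(key)` factory pattern mentioned for src/i18n.js can be sketched as follows. The dictionary keys and strings are invented for illustration, and the fallback to English and then to the key itself is an assumption.

```javascript
// Illustrative dictionaries; the real keys live in src/i18n.js.
const DICTIONARIES = {
  en: { 'phase.crawl': 'Crawling pages', 'test.pagination': 'Pagination' },
  es: { 'phase.crawl': 'Rastreando páginas' },
};

// Factory: binds a language once and returns a t(key) lookup function.
function createT(lang) {
  const dict = DICTIONARIES[lang] || DICTIONARIES.en;
  // Fall back to English, then to the key itself, so a missing entry never throws.
  return (key) => dict[key] ?? DICTIONARIES.en[key] ?? key;
}
```

A session would create its translator once, for example `const t = createT(msg.lang)`, and pass `t` to whatever produces logs and report text.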
Docker details
# docker-compose.yml highlights
network_mode: host # container shares host network stack
shm_size: 2gb # Chromium needs shared memory
security_opt:
- seccomp=unconfined # required for Chromium sandbox in some kernels
The Playwright Docker image (mcr.microsoft.com/playwright:v1.59.1-jammy) includes Chromium and all system dependencies, so no separate npm run setup is needed inside the container.
Reports are mounted as a volume so they persist across container restarts:
volumes:
- ./reports:/app/reports
API
| Endpoint | Description |
|---|---|
| GET / | Frontend UI |
| GET /api/reports | JSON list of all report files, sorted by date |
| GET /reports/<filename> | Serve a specific HTML or JSON report |
| WS / | WebSocket endpoint for test sessions |
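A client of the WS / endpoint mostly needs to dispatch on the `type` field of each incoming message (the message shapes are listed under WebSocket messages). The reducer below is a sketch of that idea; the `state` object and the function name are hypothetical, and message types it does not care about are ignored.

```javascript
// Folds one raw server message into a plain session-state object.
function applyMessage(state, raw) {
  const msg = JSON.parse(raw);
  switch (msg.type) {
    case 'log':
      state.logs.push(`[${msg.level}] ${msg.message}`);
      break;
    case 'phase':
      state.progress = msg.progress;
      break;
    case 'test_result':
      state.results[msg.status] = (state.results[msg.status] || 0) + 1;
      break;
    case 'complete':
      state.done = true;
      state.reportUrl = msg.reportUrl;
      break;
    case 'error':
      state.done = true;
      state.error = msg.message;
      break;
    default:
      break; // screenshot, phase_step, ai_* and regression_diff are ignored here
  }
  return state;
}
```

In a browser or Node client this would typically be wired up as `ws.onmessage = (event) => applyMessage(state, event.data)`, starting from `{ logs: [], results: {}, progress: 0, done: false }`.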
WebSocket messages
Client → Server
{ "type": "start", "url": "https://...", "lang": "es", "credentials": {...}, "options": {...} }
{ "type": "stop" }
Server → Client
{ "type": "log", "level": "info|warn|error", "message": "...", "timestamp": "..." }
{ "type": "phase", "name": "...", "progress": 42 }
{ "type": "phase_step", "message": "..." }
{ "type": "screenshot", "data": "<base64 jpeg>", "caption": "..." }
{ "type": "test_result", "status": "pass|fail|warning", "test": "...", "details": "..." }
{ "type": "complete", "reportUrl": "/reports/...", "jsonUrl": "/reports/...", "summary": {...}, "duration": 120 }
{ "type": "error", "message": "..." }
{ "type": "ai_insight", "appType": "crm", "description": "...", "criticalFlows": [...] }
{ "type": "ai_narrative", "text": "The application scored 74/100..." }
{ "type": "regression_diff", "scoreDelta": -8, "newFailures": 2, "resolved": 1, "isRegression": true, ... }
Contact
Buy me a coffee?
https://www.paypal.com/donate/?hosted_button_id=YX332RT7KSJ4Q
Landing page
See https://cynchrolabs.com.ar/#/landings/agentester
Contributing
See CONTRIBUTING.md.
Code of Conduct
See CODE_OF_CONDUCT.md.
License
⭐ If you like this project, give it a star!
