@usharma124/ui-qa
v1.0.1
Published
AI-powered UI/UX testing CLI with beautiful TUI
Readme
UI/UX QA Agent
AI-powered terminal UI that tests websites using browser automation and LLM analysis. It drives a real browser, executes intelligent test plans, and generates detailed quality reports with screenshots.
Features
- Beautiful Terminal UI: Interactive TUI built with Ink for a modern CLI experience
- Intelligent Test Planning: LLM generates test plans based on page content and goals
- Coverage-Guided Exploration: Advanced exploration engine with state fingerprinting and budget management
- Business Logic Validation: Validates websites against specification documents with requirement traceability
- Real Browser Testing: Uses Playwright for actual browser interaction
- Sitemap Discovery: Automatically discovers pages via sitemap.xml, robots.txt, or link crawling
- Parallel Page Testing: Tests multiple pages concurrently for faster results
- Visual Audits: Fast browser-based visual heuristics (overlapping elements, clipped text, tap targets)
- State Tracking: Detects revisits and tracks state transitions to avoid redundant testing
- Auth Fixture Management: Save and reuse authentication states for testing authenticated areas
- Comprehensive Reports: Scored reports with categorized issues and evidence
- Screenshot Capture: Automatic screenshots at key moments and on errors
- Local Storage: All results saved locally with markdown reports
- Update Notifications: Automatic checks for new versions
Prerequisites
- Bun or Node.js v18+ runtime
- OpenRouter API key
Installation
# Clone the repository
git clone <repo-url>
cd ui-qa-agent
# Install dependencies
bun install
# or: pnpm install
# Install browser (first time only)
bunx playwright install chromium
# Set up environment
cp env.example .env
# Edit .env and add your OPENROUTER_API_KEYUsage
UI/UX Testing Mode
# Start the TUI (interactive mode)
bun start
# Or with a URL directly
bun start https://example.com
# With custom goals
bun start https://example.com --goals "test checkout flow"
# Development mode (with hot reload)
bun dev
# Show help
bun start --helpBusiness Logic Validation Mode
# Validate a website against a specification
bun start validate --spec ./requirements.md --url https://app.example.com
# With custom output directory
bun start validate -s ./prd.md -u https://staging.app.com -o ./reports
# Show validation help
bun start validate --helpInteractive Mode
When you run without a URL, the TUI will prompt you to enter one:
- Enter the URL you want to test
- Select testing mode using arrow keys:
- Coverage-Guided Exploration (default): Smart exploration that maximizes coverage using state tracking and beam search
- Parallel Page Testing: Fast systematic testing of discovered pages using multiple browsers
- Watch the phases progress in real-time:
- Init: Opens browser and takes initial screenshot
- Discovery: Finds pages via sitemap or link crawling
- Planning: Creates intelligent test plan using LLM
- Traversal: Tests using your selected mode
- Execution: Runs additional planned tests
- Evaluation: Generates final scored report
- View results summary with score and issues
Validation Mode
The validation mode validates a website against a specification document:
- Provide a specification file (Markdown) and URL
- Watch the validation phases:
- Parsing: Parses specification document
- Extraction: Extracts testable requirements using LLM
- Rubric: Generates evaluation rubric
- Discovery: Discovers site structure
- Planning: Creates requirement-linked test plan
- Execution: Runs tests with browser automation
- Cross-Validation: Validates results against requirements
- Reporting: Generates traceability report
- View traceability report with requirement-to-evidence mapping
Keyboard Shortcuts
Enter- Submit URL / Confirm selection / Continue↑/↓- Select testing mode / Scroll through logsr- Retry after errorq- Quit (when not running)
Configuration
Environment Variables
Required
| Variable | Required | Default | Description |
|----------|----------|---------|-------------|
| OPENROUTER_API_KEY | Yes | - | Your OpenRouter API key |
Core Configuration
| Variable | Default | Description |
|----------|---------|-------------|
| OPENROUTER_MODEL | anthropic/claude-sonnet-4.5 | LLM model to use |
| MAX_STEPS | 20 | Maximum test steps per page |
| MAX_PAGES | 50 | Maximum pages to test |
| PARALLEL_BROWSERS | 5 | Number of parallel browser instances (1-10) |
| GOALS | homepage UX + primary CTA + form validation + keyboard | Default test goals |
| BROWSER_TIMEOUT | 60000 | Browser command timeout (ms) |
| NAVIGATION_TIMEOUT | 45000 | Page load timeout (ms) |
| ACTION_TIMEOUT | 15000 | Click/fill action timeout (ms) |
| DEBUG | false | Enable verbose output |
Coverage-Guided Exploration (Advanced)
| Variable | Default | Description |
|----------|---------|-------------|
| COVERAGE_GUIDED | false | Enable coverage-guided exploration engine |
| EXPLORATION_MODE | coverage_guided | Strategy: coverage_guided, breadth_first, depth_first, random |
| BEAM_WIDTH | 3 | Beam width for beam search exploration |
| BUDGET_MAX_STEPS_PER_STATE | 10 | Max steps per unique page state |
| BUDGET_MAX_UNIQUE_STATES | 100 | Max unique states to visit |
| BUDGET_MAX_TOTAL_STEPS | 500 | Max total steps across all states |
| BUDGET_STAGNATION_THRESHOLD | 15 | Steps without coverage gain before stopping |
| BUDGET_MAX_DEPTH | 10 | Maximum exploration depth |
| BUDGET_MAX_TIME_MS | 600000 | Time limit in milliseconds (10 minutes) |
Visual Audits
| Variable | Default | Description |
|----------|---------|-------------|
| VISUAL_AUDITS | true | Enable visual heuristic audits |
| BASELINE_DIR | .ui-qa/baselines | Directory for screenshot baselines |
| DIFF_THRESHOLD | 5 | Visual diff threshold percentage (0-100) |
Auth Fixtures
| Variable | Default | Description |
|----------|---------|-------------|
| AUTH_FIXTURE_DIR | .ui-qa/auth-fixtures | Directory for auth fixture storage |
| AUTH_FIXTURE | - | Auth fixture ID or name to use for testing |
CLI Options
Test Mode:
| Option | Description |
|--------|-------------|
| --goals <string> | Test goals to focus on |
| --help, -h | Show help message |
Validation Mode:
| Option | Description |
|--------|-------------|
| --spec, -s <file> | Path to requirements/specification file (required) |
| --url, -u <url> | URL to validate against (required) |
| --output, -o <dir> | Output directory for reports (default: ./reports) |
| --help, -h | Show help message |
Output
Test Mode Output
Results are saved to .ui-qa-runs/<run-id>/:
.ui-qa-runs/
└── cli-1234567890/
├── run.json # Run metadata and status
├── report.json # Full report with scores and issues
├── evidence.json # Detailed execution evidence
├── report.md # Human-readable markdown report
├── llm-fix.txt # Instructions for AI to fix issues
└── screenshots/ # All captured screenshots
├── 00-initial.png
├── step-01-after.png
└── ...Report Contents:
- Score: 0-100 quality score
- Summary: Overall assessment
- Issues: Categorized problems with:
- Severity (critical, high, medium, low)
- Category (accessibility, usability, performance, etc.)
- Reproduction steps
- Suggested fixes
- Screenshot evidence
Validation Mode Output
Results are saved to the specified output directory (default: ./reports):
reports/
└── validation-1234567890/
├── traceability-report.json # Complete validation report
├── traceability-report.md # Human-readable summary
└── screenshots/ # Evidence linked to requirements
├── req-001-login.png
└── ...Report Contents:
- Requirements: All extracted requirements with IDs, priorities, and acceptance criteria
- Rubric: Evaluation criteria with pass/fail conditions
- Results: Requirement validation results with:
- Status (pass/partial/fail/not_tested)
- Score (0-100 per requirement)
- Evidence screenshots
- Reasoning
- Overall Score: Weighted average based on rubric weights
- Coverage Score: Percentage of requirements successfully tested
- Traceability: Links requirements to test evidence
Project Structure
.
├── src/
│ ├── cli-ink.tsx # TUI entry point
│ ├── config.ts # Configuration management
│ ├── agentBrowser.ts # Browser automation wrapper
│ ├── ink/ # TUI components
│ │ ├── App.tsx # Main TUI application (test mode)
│ │ ├── ValidateApp.tsx # Validation mode TUI
│ │ ├── components/ # UI components
│ │ │ ├── Header.tsx
│ │ │ ├── UrlInput.tsx
│ │ │ ├── PhaseIndicator.tsx
│ │ │ ├── TaskList.tsx
│ │ │ ├── LogStream.tsx
│ │ │ ├── ResultsSummary.tsx
│ │ │ ├── RequirementList.tsx
│ │ │ ├── RubricDisplay.tsx
│ │ │ ├── TraceabilityReport.tsx
│ │ │ └── ValidationProgress.tsx
│ │ └── hooks/
│ │ ├── useQARunner.ts
│ │ └── useValidationRunner.ts
│ ├── qa/ # QA core logic
│ │ ├── planner.ts # Test plan generation
│ │ ├── executor.ts # Test execution
│ │ ├── judge.ts # Result evaluation
│ │ ├── run-streaming.ts # Streaming run orchestrator
│ │ ├── parallelTester.ts # Parallel page testing
│ │ └── types.ts # Type definitions
│ ├── validation/ # Validation feature
│ │ ├── run-validation.ts # Validation orchestrator
│ │ ├── extractor.ts # Requirement extraction
│ │ ├── rubric-generator.ts # Rubric generation
│ │ ├── cross-validator.ts # Cross-validation
│ │ ├── traceability.ts # Report generation
│ │ ├── parsers/ # Document parsers
│ │ └── types.ts # Validation types
│ ├── prompts/ # LLM prompts
│ │ ├── planner.ts
│ │ ├── judge.ts
│ │ ├── extractor.ts
│ │ ├── rubric.ts
│ │ └── cross-validator.ts
│ ├── updates/ # Update checking
│ │ ├── checker.ts
│ │ └── types.ts
│ ├── storage/
│ │ └── local.ts # Local file storage
│ └── utils/ # Utility functions
│ ├── browserPool.ts # Browser instance pooling
│ ├── sitemap.ts # Sitemap parsing
│ └── ...
├── tests/ # Test files
├── .ui-qa-runs/ # Generated results (gitignored)
└── package.jsonArchitecture
┌─────────────────────────────────────────────────────────────┐
│ Terminal UI (Ink) │
│ ┌──────────┐ ┌───────────┐ ┌──────────┐ ┌──────────────┐ │
│ │URL Input │ │ Phases │ │ Logs │ │ Results │ │
│ └──────────┘ └───────────┘ └──────────┘ └──────────────┘ │
└─────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────┐
│ Phase 1: Init + Phase 2: Discovery │
│ ┌────────────────────┐ ┌────────────────────────────┐ │
│ │ Browser + Audits │ → │ Sitemap/Link Crawling │ │
│ └────────────────────┘ └────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────┐
│ Phase 3: Planning │
│ ┌──────────────────────────────────────────────────────┐ │
│ │ LLM Planner (uses discovery results to create plan) │ │
│ └──────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────┐
│ Phase 4: Traversal (Coverage-Guided by default) │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────────┐ │
│ │ Coverage │ │ State │ │ Budget │ │ Explorer │ │
│ │ Tracker │ │ Tracker │ │ Tracker │ │ Engine │ │
│ └──────────┘ └──────────┘ └──────────┘ └──────────────┘ │
│ Alternative: Parallel page testing (COVERAGE_GUIDED=false) │
└─────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────┐
│ Phase 5: Execution + Phase 6: Evaluation │
│ ┌────────────────────┐ ┌────────────────────────────┐ │
│ │ Additional Tests │ → │ Judge (LLM Evaluation) │ │
│ └────────────────────┘ └────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────┐
│ Local Storage │
│ Screenshots • Reports • Evidence • Markdown │
└─────────────────────────────────────────────────────────────┘Pipeline Components
Test Mode (6 Phases):
- Init: Opens browser, takes initial screenshot, runs DOM audits
- Discovery: Discovers site structure via sitemap.xml, robots.txt, or link crawling
- Planner: Analyzes page DOM + discovery results and creates intelligent test plans using LLM
- Traversal: Tests the site using coverage-guided exploration (default) or parallel page testing
- Execution: Runs additional planned tests from the LLM plan
- Evaluation: Judge (LLM) evaluates test evidence and generates scored reports with issues
Coverage-Guided Exploration (Default):
- Coverage Tracker: Monitors URLs, forms, dialogs, and interactions visited
- State Tracker: Fingerprints page states to detect revisits and transitions
- Budget Tracker: Enforces exploration limits (steps, states, time, stagnation)
- Action Selector: Scores actions by novelty, business criticality, risk, and branch factor
- Explorer Engine: Executes coverage-guided exploration with beam search and backtracking
Set COVERAGE_GUIDED=false to use parallel page testing instead.
Validation Mode:
- Parser: Parses specification documents (Markdown)
- Extractor: Uses LLM to extract testable requirements
- Rubric Generator: Creates evaluation rubrics with pass/fail conditions
- Planner: Creates requirement-linked test plans
- Executor: Runs tests with browser automation
- Cross-Validator: Validates test results against requirements
- Report Generator: Creates traceability reports linking requirements to evidence
Building for Distribution
# Build the CLI binary
bun run build
# The built CLI will be in dist/cli-ink.js
# You can run it with: node dist/cli-ink.jsPublishing to npm
# Build and publish
bun run prepublishOnly
npm publishAfter publishing, users can install and run:
npx @utsav/ui-qa https://example.comSafety
- Only uses dummy data for forms (
[email protected], "Test User") - Never submits payment forms
- Redacts sensitive data from snapshots before LLM processing
- Timeouts on all browser operations
Troubleshooting
Browser Installation
If browser commands fail:
# Reinstall browser
bunx playwright install chromiumAPI Key Issues
Make sure your .env file contains:
OPENROUTER_API_KEY=sk-or-v1-...Debug Mode
For verbose output:
DEBUG=true bun start https://example.comLicense
MIT
