@ganeshgaxy/noob-tester
v0.1.7
Automated manual QA tester — CLI data layer + Claude Code skills
noob-tester
An AI-powered QA testing system that integrates with Claude Code as a persistent data layer — turning your AI agent into a fully autonomous test engineer. Give it a ticket and a target URL; it reads requirements, analyzes the codebase, writes test cases, executes them via browser automation and direct API testing, finds bugs, and delivers a comprehensive report with root cause analysis.
What It Is
noob-tester is the CLI data layer for an AI testing agent. The CLI manages all persistent state — sessions, runs, test cases, run packs, UI maps, API maps, codebase indexes, coverage data, secrets, and issues — while Claude Code does the actual work using its tools (browser, file reading, API calls, MCP integrations).
The system is built around two core ideas:
- Skills — Markdown instruction files that tell Claude Code how to perform specific QA phases (analysis, test case generation, exploration, RCA, reporting). Each skill is self-contained and composable.
- Chain Commands — High-level CLI operations that combine multiple steps into a single robust command, reducing agent token usage and error rates compared to manual bash sequences.
Architecture
Claude Code (AI Agent)
│
│ uses skills (SKILL.md files)
│
▼
noob-tester CLI ←→ SQLite DB (~/.noob-tester/noob-tester.db)
│
├── Sessions & Runs (lifecycle tracking)
├── Run Packs (test case execution queue)
├── Test Cases (BDD / traditional format)
├── UI Maps (persistent selector knowledge base)
├── API Maps (endpoint registry + health)
├── Coverage Maps (source file → test case links)
├── Secrets (target-scoped credentials)
├── Codebase Index (FTS5 full-text search)
└── Artifacts (screenshots, HAR, snapshots)
│
▼
Watch Dashboard (React, port 4040)
Installation
Prerequisites:
| Dependency | Required | Purpose |
| -------------------------------------------------------------------------------- | -------- | -------------------------------------------------------------------------------- |
| Claude Code | Yes | The AI agent that runs all skills |
| agent-browser | Yes | Browser automation for UI tests (/noob-explore) |
| Atlassian MCP | Yes | Ticket reading, MR discovery, result updates, Confluence. Required by all skills |
| git, curl, jq | Yes | Repo cloning + API test execution + JSON parsing |
| glab | Optional | GitLab CLI — reading MR diffs and repo browsing |
| 1Password CLI | Optional | secrets import-op — import credentials from 1Password vaults |
Install the CLI:
npm install -g @ganeshgaxy/noob-tester
noob-tester setup   # verify all dependencies and database
Install skills into Claude Code:
cp -r skills/ ~/.claude/skills/
Configure the Atlassian MCP server in your Claude Code settings — without it, skills cannot read tickets or discover linked repos.
Quick Start
# Register your repos
noob-tester repos add frontend https://gitlab.com/org/frontend
noob-tester repos add backend https://gitlab.com/org/backend
noob-tester repos group add myapp --repos frontend,backend
# Sync and index the codebase
noob-tester repos sync myapp
noob-tester repos index myapp
# Set up credentials
noob-tester secrets target add staging --url https://staging.app.com
noob-tester secrets set LOGIN_EMAIL "[email protected]" --target staging --role admin
noob-tester secrets set LOGIN_PASSWORD "op:Private/MyApp/password" --target staging --role admin
# Open the live dashboard
noob-tester watch
In Claude Code:
> Use noob-tester to test PROJ-123 at https://staging.app.com
> /noob-analyze PROJ-123
> /noob-testcase PROJ-123
> /noob-explore test the login page at https://staging.app.com
> /noob-api-explore run the API tests for PROJ-123
> /noob-rca analyze the failures
> /noob-report generate a report for PROJ-123
Coverage and risk (CLI):
# Build coverage map and find gaps
noob-tester coverage build frontend
noob-tester coverage uncovered frontend
# Select tests affected by a branch
noob-tester testcase select --repo frontend --diff main
# Score and prioritize by risk
noob-tester testcase risk --ticket PROJ-123
AI Skills
Skills are markdown instruction files (SKILL.md) installed into Claude Code's skills directory. Each skill is self-contained and composable — use them standalone or chain them through a full QA pipeline.
Install location: ~/.claude/skills/<skill-name>/SKILL.md
Skill Reference
| Skill | Phase | Trigger |
| -------------------- | --------------- | --------------------------------------------------------------------- |
| /noob-tester | Orchestrator | "test PROJ-123" — routes to the right skill or runs the full pipeline |
| /noob-analyze | 1 — Analysis | "analyze the impact of PROJ-123" |
| /noob-plan | 2 — Planning | "PROJ-123 is ready for QA, plan the testing" |
| /noob-testcase | 3 — Generation | "write test cases for PROJ-123" |
| /noob-explore | 4 — UI Testing | "test the login page at https://staging.app.com" |
| /noob-api-explore | 4 — API Testing | "run the API tests for PROJ-123" |
| /noob-rca | 5 — RCA | "why did these tests fail?" |
| /noob-report | 5 — Report | "generate a report for PROJ-123" |
| /noob-mr-pr | Utility | "review the MR for PROJ-123" |
| /noob-repos-setup | Utility | "set up repos for PROJ-123" |
| /noob-ticket-cache | Utility | ticket context caching (called internally by other skills) |
Skill Details
/noob-tester — Main orchestrator. Routes to the right skill based on natural language. Runs the full 5-phase pipeline when asked to "test" a ticket end-to-end.
/noob-analyze — Phase 1. Reads the ticket (via Atlassian MCP), auto-discovers repos from MR links and ticket description, syncs + indexes the codebase, and produces four analysis types:
- Gap analysis — what's missing or unclear in the requirements
- Requirements analysis — structured requirements breakdown
- Feasibility analysis — what can and can't be automated
- Impact analysis — traces every requirement through the full codebase dependency chain (UI → API → service → database), finds regression risks, shared code concerns, config/feature-flag issues, and hidden edge cases
/noob-plan — Phase 2. Runs when a ticket is dev-complete and ready for QA. Fetches linked MRs/PRs, reads actual code diffs, syncs the correct branch, reads prior analysis and UI map, and generates an ordered test plan with: strategy, test steps (confident vs uncertain), test notes, blockers, coverage gaps, and MR references.
/noob-testcase — Phase 3. Generates BDD (Given/When/Then) and Traditional (Steps/Expected) test cases from the plan, analysis, and deep codebase reading. Produces three types: direct_functional → impact_regression → general_regression. Tags every test case with a layer (ui, api, ui_api, database, ai, unit, other) that determines which runner can execute it.
/noob-explore — Phase 4, UI layer. Browser automation (agent-browser / Playwright). Executes ui and ui_api test cases one at a time via run packs. On every page: takes snapshot + screenshot, scans elements into the UI map (stable selectors: role[name="text"]), runs axe-core WCAG audit, checks visual regression against baseline. Uses credentials from secrets store. Recovers from selector failures by checking UI map staleness and retrying from a fresh snapshot.
/noob-api-explore — Phase 4, API layer. Executes ALL api layer test cases in a single invocation using curl + jq. Reads codebase once, authenticates once per role, loops through every API test. Validates status codes, response bodies, timing. Tracks created resources for cleanup in reverse order after each test.
/noob-rca — Phase 5, RCA. Classifies every failed/blocked run pack entry into: env_issue, flaky_selector, actual_bug, test_data_issue, network, auth_issue, timeout, unknown. Assigns confidence scores and suggested actions. Updates failure patterns for future risk scoring.
/noob-report — Phase 5, Report. Pulls all data for a ticket (analyses, plan, test cases, run packs, issues, RCA, a11y, visual diffs, tech issues), generates a PASS/FAIL/PARTIAL verdict, posts results to the ticket, and notifies Slack.
/noob-mr-pr — Utility. Reviews a GitLab MR or Bitbucket PR for a ticket — reads the diff, checks against acceptance criteria, surfaces concerns.
/noob-repos-setup — Utility. Ensures all repos for a ticket are registered, cloned/pulled, and indexed. Called internally by analysis and planning skills.
/noob-ticket-cache — Utility. Manages the ticket context cache (ticket info, MR diffs, comments, linked tickets). Prevents redundant Atlassian MCP and glab calls across skills.
Example Pipeline
# In Claude Code — full 5-phase pipeline
/noob-analyze PROJ-123 # Phase 1: Analysis + impact
/noob-plan PROJ-123 # Phase 2: Test plan from MR diffs
/noob-testcase PROJ-123 # Phase 3: Generate test cases
/noob-explore https://staging.app.com # Phase 4: Browser test (ui/ui_api)
/noob-api-explore PROJ-123 # Phase 4: API tests (api layer, all at once)
/noob-rca # Phase 5: Classify failures
/noob-report PROJ-123                   # Phase 5: Final report + ticket update
CLI Reference
noob-tester repos — Manage repositories and codebase index
Register repos, group them, sync to local disk, and build a searchable index with BM25 full-text search + import dependency graph.
| Command | Description |
| -------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------- |
| repos add <name> <url> | Register a repository |
| repos list | List all registered repos |
| repos delete <name> --yes | Delete a repo and its index |
| repos path <name> | Get local path of a synced repo |
| repos group add <name> --repos a,b,c | Create a repo group |
| repos group list | List all groups |
| repos group delete <name> | Delete a group |
| repos discover --ticket <id> | Find all repos for a ticket (from runs, test cases, UI maps) + ensure them. --url <extra> to add more. Auto: register + clone/pull + diff-aware index |
| repos ensure <urls...> | Register + clone/pull + index repos. Accepts URLs or names. Uses glab for GitLab repos. All repos in ~/.noob-tester/repos/ |
| repos sync <name> | Clone or pull a repo or group. --branch <branch> to checkout a specific branch. --reindex to auto-re-index if commit changed |
| repos index <name> | Diff-aware re-index (only changed files since last indexed commit). --full for complete rebuild. Records branch + commit |
| repos search <query> | Search indexed code |
| repos search <query> --expand | Search + show related files via import graph |
| repos search <query> --repos a,b | Search specific repos |
# Register and group
noob-tester repos add frontend https://gitlab.com/org/frontend
noob-tester repos add backend https://gitlab.com/org/backend
noob-tester repos group add myapp --repos frontend,backend
# Sync and index
noob-tester repos sync myapp
noob-tester repos index myapp
# ✔ frontend: 342 files, 1205 imports (main @ a1b2c3d4)
# ✔ backend: 189 files, 567 imports (main @ e5f6g7h8)
# Sync a specific branch (e.g. for testing an MR)
noob-tester repos sync frontend --branch feature/PROJ-123 --reindex
# ✔ frontend: switched to feature/PROJ-123 @ 9i0j1k2l, re-indexed
# Search with import graph expansion
noob-tester repos search "authentication middleware" --expand
# Finds auth.ts + every file that imports it + every file it imports
# Claude Code can also read synced repos directly
noob-tester repos path frontend
# /Users/you/.noob-tester/repos/frontend
noob-tester run — Manage test runs
| Command | Description |
| --------------------------------------------------- | ---------------------------------------------------------------------------------------------------------- |
| run resolve --input-type <type> --input-ref <ref> | Resume or create a run. Reuses existing running/pending run for same input-ref. --fresh to force new |
| run create --input-type <type> --input-ref <ref> | Always create a new run (CLI only — skills should use resolve) |
| run create ... --target-url <url> | Set the target app URL |
| run create ... --repo <url> | Attach repo URLs (repeatable) |
| run create ... --reuse-run <id> | Reuse prior analysis/plan (skip Phase 1 & 2) |
| run create ... --fresh | Ignore all prior run data |
| run create ... --force | Override — regenerate analysis/plan/testcases even if they exist |
| run create ... --capture <types> | Comma-separated capture types: screenshot,snapshot,video,har,console,trace (default: all) |
| run create ... --secret-target <name> | Secret target name for login credentials |
| run create ... --secret-role <role> | Secret role within the target (default: "default") |
| run update <id> --phase <n> | Update current phase |
| run complete <id> --status <s> --summary <text> | Mark run completed/failed |
| run get <id> | Get run details as JSON |
Input types: ticket, confluence, text, file
# From ticket with repos, capture config, and credential reference
noob-tester run create --input-type ticket --input-ref PROJ-123 \
--target-url https://staging.app.com \
--repo https://gitlab.com/org/frontend \
--repo https://gitlab.com/org/backend \
--capture screenshot,snapshot,video,har,console,trace \
--secret-target staging --secret-role admin
# Reuse prior analysis
noob-tester run create --input-type text --input-ref "re-test login" \
  --target-url https://app.com --reuse-run <priorRunId>
noob-tester testcase — Generate and manage test cases
Test cases can be written in BDD or traditional format and are managed through a multi-session claim system.
Test case types (execution priority):
- direct_functional — core feature/fix tests (executed first)
- impact_regression — tests for impacted dependencies (executed second)
- general_regression — crucial flows not directly touched (executed last)
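As an illustrative sketch of this execution priority (the test-case shape and field names here are invented for the example, not the CLI's actual schema), the three types map naturally to a jq sort key:

```shell
# Map each type to its execution priority and order a sample case list by it.
# PRIORITY and CASES are hypothetical sample data, not live CLI output.
PRIORITY='{"direct_functional":0,"impact_regression":1,"general_regression":2}'
CASES='[{"id":"tc2","type":"general_regression"},{"id":"tc1","type":"direct_functional"}]'
ORDERED=$(echo "$CASES" | jq -c --argjson p "$PRIORITY" 'sort_by($p[.type]) | map(.id)')
echo "$ORDERED"
# → ["tc1","tc2"]
```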
Test layers (what kind of test):
| Layer | Description |
|---|---|
| ui | Pure UI interaction — clicks, forms, navigation (default) |
| api | Pure API — request/response, status codes, payloads |
| ui_api | Spans both — UI action triggers API call, verify both sides |
| database | Data persistence, queries, migrations, constraints |
| ai | AI/ML features — prompts, responses, model behavior |
| unit | Code-level unit test — functions, utilities, pure logic |
| other | Does not fit above categories |
/noob-explore only executes ui and ui_api tests (browser automation). Other layers need different runners.
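For instance, a quick jq filter (over a made-up test-case list, not the CLI's documented output format) shows which cases the browser runner would pick up:

```shell
# Keep only the layers /noob-explore can execute: ui and ui_api.
# CASES is sample data for illustration.
CASES='[{"id":"tc1","layer":"ui"},{"id":"tc2","layer":"api"},{"id":"tc3","layer":"ui_api"}]'
BROWSER_CASES=$(echo "$CASES" | jq -c '[.[] | select(.layer == "ui" or .layer == "ui_api")]')
echo "$BROWSER_CASES"
# → [{"id":"tc1","layer":"ui"},{"id":"tc3","layer":"ui_api"}]
```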
| Command | Description |
| ------------------------------------------------------------------------------------ | ---------------------------------------------------------------------------------------- |
| testcase create <runId> --ticket <ref> --type <type> --format <fmt> --title <text> | Create a test case (draft by default) |
| testcase create ... --layer <layer> | Set test layer: ui, api, ui_api, database, ai, unit, other (default: ui) |
| testcase create ... --ready | Create and mark as ready for execution |
| testcase mark-ready <id> | Mark a test case as ready |
| testcase mark-draft <id> | Mark a test case as draft (not executable) |
| testcase ready-all <ticketRef> | Mark all test cases for a ticket as ready |
| testcase draft-all <ticketRef> | Mark all test cases for a ticket as draft |
| testcase claim <ticketRef> <sessionId> | Claim next available ready case (priority order) |
| testcase claim ... --fresh | Also claim previously completed cases |
| testcase result <id> --status <s> --run <runId> | Record execution result |
| testcase release <id> | Release a claimed case |
| testcase release-session <sessionId> | Release all claims by a session |
| testcase list --ticket <ref> | List cases for a ticket |
| testcase stats <ticketRef> | Show counts by type/status |
| testcase select --repo <name> --diff <branch> | Select test cases affected by code changes (via coverage_map + import graph) |
| testcase risk --ticket <ref> | Compute risk scores from failure patterns, code churn, flakiness, recency |
| testcase audit --ticket <ref> | Audit: find duplicates, never-failed, stale. --duplicates, --orphaned, --stale |
# BDD format with test layer
noob-tester testcase create $RUN_ID \
--ticket PROJ-123 --type direct_functional --format bdd \
--title "Login with valid credentials" \
--bdd-feature "Login" --bdd-scenario "Valid login" \
--bdd-given '["user on /login"]' \
--bdd-when '["enters email","enters password","clicks Sign In"]' \
--bdd-then '["redirected to dashboard"]' \
--impacted-files '["src/auth/login.ts"]' \
--layer ui
# Traditional format — API test (won't be picked up by noob-explore)
noob-tester testcase create $RUN_ID \
--ticket PROJ-123 --type direct_functional --format traditional \
--title "POST /api/auth/login returns 200 with valid creds" \
--trad-steps '[{"step":"POST /api/auth/login with valid body","expected":"200 + session token"}]' \
--layer api
# UI + API test
noob-tester testcase create $RUN_ID \
--ticket PROJ-123 --type impact_regression --format traditional \
--title "Checkout still works after auth change" \
--trad-steps '[{"step":"Go to checkout","expected":"Page loads"}]' \
--layer ui_api
# Multi-session: each session claims one test case at a time
noob-tester testcase claim PROJ-123 $SESSION_ID # gets next unclaimed
noob-tester testcase claim PROJ-123 $SESSION_ID --fresh   # re-runs completed too
noob-tester runpack — Run packs (execution batches)
Run packs are the execution layer for /noob-explore. Each pack groups test case executions for a ticket, with stored target URL, credential references, and capture config.
Default behavior: /noob-explore resumes an existing pack with pending/failed entries. If none exist, it creates a new pack. If the user says "rerun" or "fresh", it forces a new pack.
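A sketch of how an agent might branch on the resolve result, based on the `{ runPackId, resumed }` return shape documented below (the JSON here is a canned sample, not a live CLI call):

```shell
# Parse the resolve output and decide whether we are resuming or starting fresh.
RESOLVE_JSON='{"runPackId":"rp_42","resumed":true}'   # sample of the documented shape
RUNPACK_ID=$(echo "$RESOLVE_JSON" | jq -r '.runPackId')
if [ "$(echo "$RESOLVE_JSON" | jq -r '.resumed')" = "true" ]; then
  echo "Resuming pack $RUNPACK_ID (pending/failed entries remain)"
else
  echo "Created fresh pack $RUNPACK_ID"
fi
```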
| Command | Description |
| ---------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| runpack resolve --ticket <id> --run <runId> | Resume or create a run pack. Checks for existing packs with pending/failed entries first. --fresh to force new. Optional: --target-url, --secret-target, --secret-role, --capture, --session |
| runpack create --ticket <id> --run <runId> | Create a new run pack (always fresh, CLI only — skills should use resolve). Optional: --target-url, --secret-target, --secret-role, --capture, --session |
| runpack meta <packId> | Get pack metadata (target, credentials, capture config) |
| runpack add <packId> <testCaseId> | Add a specific test case to a pack |
| runpack claim <packId> <sessionId> | Claim next pending entry already in the pack (resume mode) |
| runpack claim-next <packId> <ticketId> <sessionId> | Pick next test case not yet in the pack, add and claim it. --layer to filter by test layer. --runner to set runner type (ui/api, auto-detected from layer) |
| runpack populate <packId> <ticketId> --status <s> | Add ready test cases to pack with status: pending, blocked, skipped. --layer to filter by test layer (e.g. --layer api). --runner to stamp entries. Optional: --reason, --run, --session |
| runpack result <entryId> --status <s> | Record result: passed, failed, skipped, blocked. Optional: --results, --logs, --observations, --issues (all JSON) |
| runpack artifact <entryId> --type <t> --path <p> | Attach artifact: screenshot, snapshot, video, har, console, trace. Optional: --label, --step, --metadata |
| runpack observe <entryId> --text <t> | Add an observation |
| runpack log <entryId> --text <t> | Add a log entry |
| runpack list --ticket <id> | List run packs for a ticket (with pass/fail/pending counts) |
| runpack list --pack <packId> | List entries in a specific pack |
| runpack release <packId> | Release all claimed entries back to pending |
| runpack retry --entry <entryId> | Retry a specific entry (reset to pending) |
| runpack retry --name <text> --pack <packId> | Retry entries matching test case name (substring) |
| runpack retry --pack <packId> | Retry all failed/blocked entries |
| runpack retry --all <packId> | Retry ALL entries including passed (full rerun of same pack) |
| runpack delete --pack <packId> --yes | Delete a specific pack |
| runpack delete --ticket <id> --yes | Delete all packs for a ticket |
| runpack auto-retry <packId> | Mark all failed/blocked entries for auto-retry (max 1 retry per entry) |
| runpack classify-retry <entryId> --status <s> | Classify retry result: likely_false_positive if passed, confidence level if failed |
| runpack false-positives <packId> | Show false positive analysis (total, retried, false positives, confirmed, by confidence) |
# Resolve — resumes existing pack or creates new (always use this, not create)
noob-tester runpack resolve --ticket PROJ-123 --run $RUN_ID \
--target-url "https://staging.app.com" \
--secret-target staging --secret-role admin \
--capture screenshot,snapshot,har
# Returns: { runPackId, resumed: true/false }
# Claim one test case (resume pending first, then fresh)
noob-tester runpack claim $RUNPACK_ID $SESSION_ID # resume pending entry
noob-tester runpack claim-next $RUNPACK_ID PROJ-123 $SESSION_ID # or claim fresh
noob-tester runpack claim-next $RUNPACK_ID PROJ-123 $SESSION_ID --layer ui # only UI tests
# Record results and artifacts per entry
noob-tester runpack result $ENTRY_ID --status passed --results '{"summary":"all good"}'
noob-tester runpack artifact $ENTRY_ID --type screenshot --path ./step1.png --label "After login" --step 1
noob-tester capture — Per-action artifact storage
Stores snapshots, console logs, HAR network data, screenshots, and network errors per action, linked to run, runpack entry, page URL, and action number.
| Command | Description |
| ------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| capture store --run <runId> --type <type> | Store an artifact. Types: snapshot, screenshot, console, har, video, trace, network_error, api_request. --file <path> or --content <text>. Optional: --pack, --entry, --session, --ticket, --action <n>, --desc, --url <pageUrl> |
| capture list --run <runId> | List artifacts for a run. --entry <id> for a specific entry. --type to filter |
| capture stats --run <runId> | Show artifact counts by type |
# Store console logs for an action
noob-tester capture store --run $RUN_ID --type console --file ./evidence/console.txt \
--url "/dashboard" --action 3 --desc "After clicking Save" --ticket FEAT-7679
# List all HAR files for a run
noob-tester capture list --run $RUN_ID --type har
# Stats
noob-tester capture stats --run $RUN_ID
# {"snapshot":5,"screenshot":5,"console":5,"har":5}
noob-tester uimap — UI maps (persistent app knowledge)
UI maps are a persistent knowledge base of how an app's UI works — pages, selectors, navigation paths, forms, reliability tracking. Shared across targets with the same repos. Grows with every /noob-explore session.
A map is defined by repos, not targets. Multiple targets (staging, prod, dev) sharing the same codebase share the same map. Fetchable by ticket ID, repo URL, or target URL.
Stable selectors — uimap scan stores elements using role + text/label/placeholder/url (e.g. button[name="Sign In"], textbox[placeholder="Search"]), not ephemeral [ref=eN] refs. Each element records its selector strategy type (role+text, role+placeholder, role+url, ref). The map tells you WHAT elements to expect, the current browser snapshot tells you WHERE they are.
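To illustrate the idea (the snapshot line format below is assumed for the example, not the exact format agent-browser emits), a stable selector keeps the role and accessible name while dropping the ephemeral ref:

```shell
# Derive a stable role[name="text"] selector from an accessibility-snapshot line,
# discarding the session-specific [ref=eN] handle.
LINE='button "Sign In" [ref=e12]'                          # assumed snapshot format
ROLE=${LINE%% *}                                           # first token → role
NAME=$(echo "$LINE" | sed 's/^[^"]*"\([^"]*\)".*/\1/')     # quoted text → accessible name
SELECTOR="${ROLE}[name=\"${NAME}\"]"
echo "$SELECTOR"
# → button[name="Sign In"]
```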
| Command | Description |
| ---------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| uimap create --name <n> | Create a map. Optional: --repos, --targets, --tickets (comma-separated) |
| uimap get <id> | Get map details + stats |
| uimap list | List all maps with stats |
| uimap resolve --ticket <id> | Find a map by ticket ID, --repo, or --target. Returns first match |
| uimap update <id> | Add repos/targets/tickets: --add-repos, --add-targets, --add-tickets |
| uimap delete <id> --yes | Delete map and all its data |
| uimap page <mapId> --url <pattern> | Record/update a page (upserts by URL). Optional: --title, --snapshot, --screenshot, --auth-required, --auth-roles, --code, --repos, --tickets, --parity, --run, --session |
| uimap pages <mapId> | List all pages |
| uimap element <pageId> --selector <sel> --type <t> | Record/update an element (upserts by selector). Optional: --role, --text, --action, --result, --code, --tickets, --auth-roles, --run, --testcase |
| uimap elements <pageId> | List elements on a page |
| uimap lookup --map <id> --url <pattern> | Look up elements by URL. --type to filter. Sorted by reliability |
| uimap hit <elementId> | Record selector success. --run optional |
| uimap miss <elementId> | Record selector failure. Auto-updates status (working/flaky/broken) |
| uimap alt <elementId> --selector <sel> | Add alternative selector |
| uimap flaky <mapId> | List flaky/broken elements |
| uimap nav <mapId> --from <pageId> --to <pageId> | Record navigation. --via element, --type, --conditions |
| uimap path --map <id> --from <url> --to <url> | Find navigation path between URLs (BFS pathfinding) |
| uimap form <pageId> | Record/update a form. --selector, --fields (JSON), --submit, --success, --error, --sample-values |
| uimap scan <pageId> --snapshot <path> | Parse accessibility snapshot and bulk-record all elements + forms. Stores stable selectors: role[name="text"], role[placeholder], role[url], @ref fallback. Records selector strategy per element. --ticket, --run, --session optional |
| uimap stats <mapId> | Show map statistics |
# Create a map for the app (defined by repos, not target)
noob-tester uimap create --name "My App" \
--repos "https://gitlab.com/org/frontend,https://gitlab.com/org/backend" \
--targets "https://staging.app.com,https://prod.app.com" \
--tickets "PROJ-123"
# Find existing map by ticket, repo, or target
noob-tester uimap resolve --ticket PROJ-456
# Record pages and scan elements from snapshot (2 commands per page)
noob-tester uimap page $MAP_ID --url "/login" --title "Login" --ticket PROJ-123 --run $RUN_ID
noob-tester uimap scan $PAGE_ID --snapshot ./snapshot.txt --ticket PROJ-123 --run $RUN_ID
# Scan creates: button[name="Sign In"], textbox[name="Email"], link[url="/forgot-password"]
# Track selector reliability
noob-tester uimap hit $ELEMENT_ID --run $RUN_ID # worked
noob-tester uimap miss $ELEMENT_ID --run $RUN_ID # failed
# Navigation pathfinding
noob-tester uimap path --map $MAP_ID --from "/login" --to "/checkout"
# Track target parity (staging has it, prod doesn't)
noob-tester uimap page $MAP_ID --url "/beta-feature" \
  --parity '{"staging":true,"prod":false}'
noob-tester apimap — API maps (persistent endpoint registry)
What UI maps are to the frontend, API maps are to your backend: a persistent knowledge base of endpoints, parameters, response schemas, dependency chains, and health tracking, all visualized as a force-directed graph in the dashboard.
| Command | Description |
| ------------------------------ | --------------------------------------------------------------------------------------------- |
| apimap resolve <name> | Find or create an API map. --base-url, --tickets, --repos |
| apimap endpoint <mapId> | Register/update an endpoint. --method, --path, --summary, --auth-type, --auth-roles |
| apimap call <endpointId> | Record a call result. --status (HTTP code), --time (ms). Updates health automatically |
| apimap param <endpointId> | Add a parameter. --name, --in (path/query/body/header), --type, --required |
| apimap response <endpointId> | Register expected response. --status, --schema, --example |
| apimap chain <mapId> | Add dependency. --from, --to, --type (creates/reads/updates/deletes/cleanup) |
| apimap lookup <mapId> | Find endpoint by --method + --path |
| apimap list | List all API maps |
| apimap get <name> | Full map data (endpoints, params, responses, chains) |
| apimap stats <name> | Statistics (total, active, flaky, failing, avg response time) |
# Create or find an API map
APIMAP_ID=$(noob-tester apimap resolve "my-api" --base-url https://api.staging.com --tickets PROJ-123 | jq -r '.id')
# Register endpoints discovered from code analysis
EP_ID=$(noob-tester apimap endpoint $APIMAP_ID --method POST --path "/api/users" \
--summary "Create user" --auth-type bearer --auth-roles "admin" | jq -r '.endpointId')
# Add params and responses
noob-tester apimap param $EP_ID --map $APIMAP_ID --name email --in body --type string --required
noob-tester apimap response $EP_ID --map $APIMAP_ID --status 201 --schema '{"id":"string","email":"string"}'
# Record call results during testing (auto-updates health)
noob-tester apimap call $EP_ID --status 201 --time 150 --run $RUN_ID
# Add dependency chains
GET_EP_ID=$(noob-tester apimap endpoint $APIMAP_ID --method GET --path "/api/users/:id" | jq -r '.endpointId')
noob-tester apimap chain $APIMAP_ID --from $EP_ID --to $GET_EP_ID --type creates
Endpoint health updates automatically based on call results:
- active — no failures or low failure rate
- flaky — intermittent failures (some succeed, some fail)
- failing — consistently failing (3+ consecutive failures)
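A rough sketch of that health rule as a function (thresholds taken from the bullet points above; the actual implementation may differ):

```shell
# Classify endpoint health from call history: consecutive failures win,
# otherwise intermittent failures mean flaky, otherwise active.
classify_health() {
  local consecutive_failures=$1 total_calls=$2 failed_calls=$3
  if [ "$consecutive_failures" -ge 3 ]; then
    echo failing
  elif [ "$failed_calls" -gt 0 ] && [ "$failed_calls" -lt "$total_calls" ]; then
    echo flaky
  else
    echo active
  fi
}
classify_health 0 10 0   # → active
classify_health 1 10 3   # → flaky
classify_health 4 10 4   # → failing
```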
noob-tester coverage — Code-level coverage mapping
Link test cases to source files via impacted_files + import graph expansion. Find which source files have no test coverage.
| Command | Description |
| ------------------------------------- | ---------------------------------------------------------------------------------------------- |
| coverage build <repoName> | Build coverage map from test case impacted_files + 1-level import graph expansion |
| coverage stats <repoName> | Show coverage statistics (total/covered/uncovered files, coverage %) |
| coverage uncovered <repoName> | List files with no test case coverage, sorted by importer count (more importers = higher risk) |
| coverage file <repoName> <filePath> | Show which test cases cover a specific file (with link type and confidence) |
| coverage clear <repoName> | Clear coverage map for a repo (rebuild with coverage build) |
# Build coverage map (reads test_cases.impacted_files, expands via import_graph)
noob-tester coverage build frontend
# View stats
noob-tester coverage stats frontend
# Total: 342, Covered: 89, Uncovered: 253, Coverage: 26%
# Find highest-risk uncovered files
noob-tester coverage uncovered frontend --limit 20
# Which test cases cover auth.ts?
noob-tester coverage file frontend src/auth/login.ts
noob-tester rca — Root cause analysis
Classify failures from completed run packs. Used by /noob-rca skill or standalone.
| Command | Description |
| ------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| rca save | Save an RCA result. --pack, --entry, --testcase, --classification, --confidence, --cause required. Optional: --evidence, --pattern, --action |
| rca list --pack <id> | List RCA results for a run pack (with test case details) |
| rca summary --pack <id> | Summary counts by classification and suggested action |
| rca get <entryId> | Get RCA result for a specific entry |
| rca clear --pack <id> | Clear all RCA results for re-analysis |
Classifications: env_issue, flaky_selector, actual_bug, test_data_issue, network, auth_issue, timeout, unknown
Suggested actions: retry, fix_test, fix_app, fix_env, investigate, skip
# Save an RCA result
noob-tester rca save --pack $PACKID --entry $ENTRY_ID --testcase $TC_ID \
--classification actual_bug --confidence 0.9 \
--cause "Auth middleware doesn't pass session to downstream services" \
--evidence "Console shows 500 on /api/orders, HAR confirms missing session header" \
--action fix_app
# View summary
noob-tester rca summary --pack $PACKID
# { total: 8, byClassification: { actual_bug: 3, env_issue: 2, flaky_selector: 1, network: 2 } }
noob-tester a11y — Accessibility testing
Store and query axe-core WCAG audit results. Automatically populated by /noob-explore on every page load.
| Command | Description |
| ---------------------- | --------------------------------------------------------------------------------------------------------------------- |
| a11y scan <runId> | Store axe-core violations JSON. --url, --results (JSON array). Optional: --pack, --entry, --page-id |
| a11y add <runId> | Store a single a11y issue. --url, --rule, --impact, --description. Optional: --wcag, --selector, --html |
| a11y list | List a11y issues. --run, --pack, or --page to filter |
| a11y summary <runId> | Summary by impact level and rule, with page count |
Impact levels: critical, serious, moderate, minor (mapped from axe-core)
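The aggregation that a11y summary reports can be sketched as a group-by over stored violations. The violation shape here is an assumption for illustration:

```typescript
// Minimal sketch of a11y summary: counts grouped by axe-core impact level,
// plus the number of distinct pages affected.
type Impact = "critical" | "serious" | "moderate" | "minor";
interface Violation { rule: string; impact: Impact; url: string; }

function summarize(violations: Violation[]) {
  const byImpact: Partial<Record<Impact, number>> = {};
  const pages = new Set<string>();
  for (const v of violations) {
    byImpact[v.impact] = (byImpact[v.impact] ?? 0) + 1;
    pages.add(v.url);
  }
  return { total: violations.length, byImpact, pageCount: pages.size };
}
```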
# Store axe-core results from browser evaluation
noob-tester a11y scan $RUN_ID --url "/login" --results "$AXE_VIOLATIONS" \
--pack $PACKID --entry $ENTRY_ID
# View summary
noob-tester a11y summary $RUN_ID
# { total: 12, byImpact: { serious: 4, moderate: 6, minor: 2 }, pageCount: 3 }
# List all issues for a pack
noob-tester a11y list --pack $PACKID --json
noob-tester testcase select — Test selection by code changes
Given a git diff, find which test cases should run based on coverage map + import graph.
| Command | Description |
| ----------------------------------------------- | -------------------------------------------------------------------------------------------------------------- |
| testcase select --repo <name> --diff <branch> | Select test cases affected by changed files. --ticket to scope. --depth <n> for deeper expansion. --json |
# Which test cases should run for this branch?
noob-tester testcase select --repo frontend --diff main
# Changed: 5 files, Affected: 12 files (with imports), Test cases: 8
noob-tester testcase select --repo frontend --diff main --ticket PROJ-123 --json
Requires coverage build first — builds the file-to-testcase mapping.
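The selection logic can be sketched as two lookups: expand changed files through a reverse import graph (file → its importers) up to --depth hops, then map every affected file to test cases via the coverage map. The data shapes below are assumptions, not the CLI's schema:

```typescript
// Hypothetical sketch of diff-based test selection.
type ReverseGraph = Map<string, string[]>; // file → files that import it
type CoverageMap = Map<string, string[]>;  // file → test case ids

function selectTestCases(
  changed: string[],
  importers: ReverseGraph,
  coverage: CoverageMap,
  depth = 2,
): Set<string> {
  let frontier = new Set(changed);
  const affected = new Set(changed);
  // Expand outward: anything importing an affected file is also affected.
  for (let d = 0; d < depth; d++) {
    const next = new Set<string>();
    for (const file of frontier)
      for (const imp of importers.get(file) ?? [])
        if (!affected.has(imp)) { affected.add(imp); next.add(imp); }
    frontier = next;
  }
  // Map affected files to the test cases that cover them.
  const cases = new Set<string>();
  for (const file of affected)
    for (const tc of coverage.get(file) ?? []) cases.add(tc);
  return cases;
}
```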
noob-tester testcase risk — Risk-based prioritization
Compute risk scores from failure patterns, code churn, flakiness, and recency.
| Command | Description |
| ------------------------------ | ---------------------------------------------------------------- |
| testcase risk --ticket <ref> | Compute and store risk scores for all ready test cases. --json |
noob-tester testcase risk --ticket PROJ-123
# Computed: 15 test cases, Avg score: 0.42, High risk: 3
# Claim in risk order (highest risk first)
noob-tester runpack claim-next $PACKID PROJ-123 $SESSION --layer ui --risk
noob-tester runpack auto-retry / false-positives — False positive reduction
Auto-retry failed entries to distinguish real failures from transient issues.
| Command | Description |
| ----------------------------------------------- | ------------------------------------------------------------------------------ |
| runpack auto-retry <packId> | Mark all failed/blocked entries for retry (max 1 retry each) |
| runpack classify-retry <entryId> --status <s> | Classify retry result: likely_false_positive if passed, confirmed failure if it failed again |
| runpack false-positives <packId> | Show false positive stats. --json |
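The classification rule itself is simple; a sketch (names are illustrative, not the CLI's internals):

```typescript
// A failure that passes on retry was likely transient (false positive);
// one that fails again is treated as confirmed.
function classifyRetry(
  retryStatus: "passed" | "failed",
): "likely_false_positive" | "confirmed_failure" {
  return retryStatus === "passed" ? "likely_false_positive" : "confirmed_failure";
}
```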
# After execution completes with failures:
noob-tester runpack auto-retry $PACKID # resets failed → pending (retry_count++)
# After retry pass completes:
noob-tester runpack classify-retry $ENTRY_ID --status passed # → likely_false_positive
# View analysis
noob-tester runpack false-positives $PACKID
# Total failed: 8, Retried: 8, False positives: 3, Confirmed: 5
noob-tester testcase audit — Test suite cleanup & deduplication
Audit test cases for duplicates, never-failed, orphaned, and stale entries.
| Command | Description |
| ---------------------------------------------- | --------------------------------------------------------------------------------- |
| testcase audit --ticket <ref> | Full audit: duplicates + never-failed + stale. --json |
| testcase audit --duplicates --ticket <ref> | Only near-duplicate pairs (Jaccard similarity). --threshold <n> (default: 0.65) |
| testcase audit --never-failed --ticket <ref> | Test cases executed but never failed (potential low-value) |
| testcase audit --orphaned | Test cases with no run pack activity in 90 days (across all tickets) |
| testcase audit --stale --ticket <ref> | Test cases not executed in 30+ days |
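The duplicate check above uses Jaccard similarity; a minimal sketch over word tokens (the tokenization details are an assumption):

```typescript
// Jaccard similarity: |intersection| / |union| of two token sets.
function jaccard(a: string, b: string): number {
  const ta = new Set(a.toLowerCase().split(/\W+/).filter(Boolean));
  const tb = new Set(b.toLowerCase().split(/\W+/).filter(Boolean));
  const inter = [...ta].filter((t) => tb.has(t)).length;
  const union = new Set([...ta, ...tb]).size;
  return union === 0 ? 0 : inter / union;
}
```

Pairs scoring at or above the threshold (default 0.65) are flagged as near-duplicates.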
# Full audit
noob-tester testcase audit --ticket PROJ-123
# Total: 24, Duplicates: 2 pairs, Never failed: 5, Stale: 3
# Just duplicates with custom threshold
noob-tester testcase audit --duplicates --ticket PROJ-123 --threshold 0.7 --json
noob-tester visual — Visual regression testing
Compare screenshots against baselines per page/viewport. Hash-based quick check + Claude vision for detailed analysis.
| Command | Description |
| ----------------------------------------------------------------- | ----------------------------------------------------------------------------------------- |
| visual baseline --page <id> --url <pattern> --screenshot <path> | Set baseline. --viewport, --run, --entry |
| visual compare --page <id> --screenshot <path> | Compare against baseline (hash check). Returns { hasBaseline, hashMatch, baselinePath } |
| visual diff-save --baseline <id> --run <runId> --current <path> | Save diff result. --score, --description, --regression, --entry |
| visual list | List diffs. --run, --unreviewed, --json |
| visual accept <diffId> | Accept current screenshot as new baseline |
| visual review <diffId> | Mark reviewed: --regression or --ok |
| visual stats | Stats: baselines, diffs, regressions, reviewed/unreviewed. --run, --json |
# Set baseline on first passing run
noob-tester visual baseline --page $PAGE_ID --url "/login" \
--screenshot ./evidence/login.png --run $RUN_ID
# Compare on next run
noob-tester visual compare --page $PAGE_ID --screenshot ./evidence/login-current.png
# { hasBaseline: true, hashMatch: false, baselinePath: "...", baselineId: "..." }
# Save diff after Claude vision analysis
noob-tester visual diff-save --baseline $BASELINE_ID --run $RUN_ID \
--current ./evidence/login-current.png \
--score 0.6 --description "Submit button changed from blue to green" --regression
# Review
noob-tester visual list --unreviewed
noob-tester visual review $DIFF_ID --regression
noob-tester visual accept $DIFF_ID # promote as new baseline
noob-tester secrets — Manage credentials
Scoped to targets (environments/apps) and roles (admin, user, api). Supports literal values, environment variables (env:), and 1Password (op:).
| Command | Description |
| -------------------------------------------------------- | ----------------------------------- |
| secrets target add <name> --url <url> | Register a target |
| secrets target list | List all targets and roles |
| secrets target delete <name> --yes | Delete a target and all its secrets |
| secrets set <key> <value> --target <t> --role <r> | Set a secret |
| secrets get-profile --target <t> --role <r> | Get all resolved secrets |
| secrets get-profile --url <url> --role <r> | Get secrets by matching URL |
| secrets delete <key> --target <t> --role <r> | Delete a secret |
| secrets delete-role --target <t> --role <r> | Delete all secrets for a role |
| secrets list | List all (values masked) |
| secrets list --target <t> | Filter by target |
| secrets list --role <r> | Filter by role |
| secrets list --url <url> | Filter by URL |
| secrets find <search> | Find by key or value (e.g. email) |
| secrets import-op <vault/item> --target <t> --role <r> | Import from 1Password |
| secrets import-op ... --live | Store as op: refs (always fresh) |
| secrets import-op ... --map label=KEY | Custom field mapping |
| secrets import-op ... --prefix APP_ | Prefix all keys |
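Resolution of the three value kinds can be sketched like this — literals pass through, env: reads the environment, and op: would shell out to the 1Password CLI (stubbed here; the function name and shape are illustrative, not noob-tester's internals):

```typescript
// Hypothetical secret-resolution sketch.
function resolveSecret(
  value: string,
  readOp: (ref: string) => string = () => {
    throw new Error("1Password CLI not wired in this sketch");
  },
): string {
  if (value.startsWith("env:")) return process.env[value.slice(4)] ?? "";
  if (value.startsWith("op:")) return readOp(value); // defer to `op read`
  return value; // literal
}
```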
# Register targets
noob-tester secrets target add staging --url https://staging.app.com
noob-tester secrets target add prod --url https://prod.app.com
# Set credentials
noob-tester secrets set LOGIN_EMAIL "[email protected]" --target staging --role admin
noob-tester secrets set LOGIN_PASSWORD "op:Private/MyApp/password" --target staging --role admin
noob-tester secrets set API_TOKEN "env:STAGING_TOKEN" --target staging --role api
# Import all fields from 1Password at once
noob-tester secrets import-op "Private/MyApp" --target staging --role admin
noob-tester secrets import-op "Private/MyApp" --target staging --role admin --live # keep as op:// refs
# Vault names with slashes work — last segment is the item name
noob-tester secrets import-op "ENG/Development/TeamEnablementQA" --target staging --role admin --live
# Query
noob-tester secrets get-profile --target staging --role admin
noob-tester secrets get-profile --url https://staging.app.com --role admin
noob-tester secrets find "[email protected]"
noob-tester session — Track active sessions
| Command | Description |
| -------------------------------------------------------------------------- | ---------------------------------------------- |
| session start --task <text> --labels <a,b> --tickets <PROJ-123,PROJ-456> | Register a session with labels and ticket refs |
| session heartbeat <id> --phase <n> --run-id <id> --tickets <PROJ-789> | Keep alive, add tickets |
| session end <id> | Mark completed |
| session get <id> | Get details |
| session link <runId> <sessionId> | Link a run to a session |
| session list | List all (marks stale after 5min) |
| session list --active | Only active sessions |
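The five-minute staleness rule from session list can be sketched as a simple timestamp check:

```typescript
// A session with no heartbeat in the last 5 minutes is marked stale.
const STALE_AFTER_MS = 5 * 60 * 1000;

function isStale(lastHeartbeat: Date, now: Date = new Date()): boolean {
  return now.getTime() - lastHeartbeat.getTime() > STALE_AFTER_MS;
}
```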
noob-tester watch — Live web dashboard
noob-tester watch # http://localhost:4040
noob-tester watch --port 3000
noob-tester watch --session <id> # focus on one session
Layout: Left sidebar navigation with logo at top, nav links in middle, live stats at bottom. Content area fills remaining space. Breadcrumb navigation (clickable chips with | separator) on all detail pages. Split views with independent scroll per panel.
Pages:
- Dashboard — sessions grouped by ticket. Click a ticket → split view with sessions (left) and issues (right) for that ticket. Click a session → full session detail with breadcrumb (e.g. Dashboard | FEAT-7679 | abc123)
- Issues — all issues grouped by ticket → sortable table (click column headers to sort by severity, category, title, location, time). Click any issue → full detail modal
- Analyses — grouped by run, viewable per analysis type
- Explore — run packs grouped by ticket → pack detail with test case entries, results, per-action artifacts (snapshots, console logs, HAR, screenshots), logs, observations
- Test Cases — suites grouped by ticket → split view with BDD/traditional steps, ready/draft badges
- Plans — test plans by ticket → plan detail with Requirements, Steps, and Test Notes tabs. Steps linked to test cases, MRs, UI map pages. Blockers, coverage gaps, strategy
- Repos — registered repos with sync status, index stats, groups
- UI Maps — force-directed canvas sitemap (zoom, pan, drag nodes/clusters). Click a page → modal with element map canvas + screenshot + elements/forms/navigations
- Metrics — aggregate usage stats
- Secrets — targets → roles → secrets with reveal/add/delete, 1Password import
- Docs — tabbed CLI command reference (CLI Commands, Skills, Concepts)
Issue detail modal — click any issue anywhere → full modal with: severity/category badges, description, location, screenshot, console output, network data, per-action artifacts (from run_artifacts table), related run info, test case, analyses, technical issues with workarounds, UI map sitemap canvas (affected page highlighted), element list, metadata.
Updates live via SSE every 2 seconds. Zero external dependencies.
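The SSE wire format the dashboard consumes is just data: lines terminated by a blank line; a minimal parser sketch (the event payloads themselves are an implementation detail not shown here):

```typescript
// Split an SSE stream chunk into its data payloads: frames are separated by
// a blank line, and each frame's "data:" lines are joined with newlines.
function parseSseData(chunk: string): string[] {
  return chunk
    .split("\n\n")
    .map((frame) =>
      frame
        .split("\n")
        .filter((l) => l.startsWith("data:"))
        .map((l) => l.slice(5).trim())
        .join("\n"),
    )
    .filter((d) => d.length > 0);
}
```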
noob-tester tech-issue — Technical issue tracking
Track and manage technical difficulties (timeouts, crashes, env issues) encountered during testing. Serves as a knowledge base.
