@ganeshgaxy/noob-tester

v0.1.17

Published

a month ago

Automated manual QA tester — CLI data layer + Claude Code skills

0High
0Medium
0Low

noob-tester

An AI-powered QA testing system that integrates with Claude Code as a persistent data layer — turning your AI agent into a fully autonomous test engineer. Give it a ticket and a target URL; it reads requirements, analyzes the codebase, writes test cases, executes them via browser automation and direct API testing, finds bugs, and delivers a comprehensive report with root cause analysis.

What It Is

noob-tester is the CLI data layer for an AI testing agent. The CLI manages all persistent state — sessions, runs, test cases, run packs, UI maps, API maps, codebase indexes, coverage data, secrets, and issues — while Claude Code does the actual work using its tools (browser, file reading, API calls, MCP integrations).

The system is built around two core ideas:

Skills — Markdown instruction files that tell Claude Code how to perform specific QA phases (analysis, test case generation, exploration, RCA, reporting). Each skill is self-contained and composable.
Chain Commands — High-level CLI operations that combine multiple steps into a single robust command, reducing agent token usage and error rates compared to manual bash sequences.

Architecture

Claude Code (AI Agent)
       │
       │  uses skills (SKILL.md files)
       │
       ▼
noob-tester CLI  ←→  SQLite DB (~/.noob-tester/noob-tester.db)
       │
       ├── Sessions & Runs        (lifecycle tracking)
       ├── Run Packs              (test case execution queue)
       ├── Test Cases             (BDD / traditional format)
       ├── UI Maps                (persistent selector knowledge base)
       ├── API Maps               (endpoint registry + health)
       ├── Coverage Maps          (source file → test case links)
       ├── Secrets                (target-scoped credentials)
       ├── Codebase Index         (FTS5 full-text search)
       └── Artifacts              (screenshots, HAR, snapshots)
             │
             ▼
    Watch Dashboard (React, port 4040)

Installation

Prerequisites:

| Dependency | Required | Purpose | | -------------------------------------------------------------------------------- | -------- | -------------------------------------------------------------------------------- | | Claude Code | Yes | The AI agent that runs all skills | | agent-browser | Yes | Browser automation for UI tests (/noob-explore) | | Atlassian MCP | Yes | Ticket reading, MR discovery, result updates, Confluence. Required by all skills | | git, curl, jq | Yes | Repo cloning + API test execution + JSON parsing | | glab | Optional | GitLab CLI — reading MR diffs and repo browsing | | 1Password CLI | Optional | secrets import-op — import credentials from 1Password vaults |

Install the CLI:

npm install -g @ganeshgaxy/noob-tester
noob-tester setup          # verify all dependencies and database

Install skills into Claude Code:

cp -r skills/ ~/.claude/skills/

Configure the Atlassian MCP server in your Claude Code settings — without it, skills cannot read tickets or discover linked repos.

Quick Start

# Register your repos
noob-tester repos add frontend https://gitlab.com/org/frontend
noob-tester repos add backend https://gitlab.com/org/backend
noob-tester repos group add myapp --repos frontend,backend

# Sync and index the codebase
noob-tester repos sync myapp
noob-tester repos index myapp

# Set up credentials
noob-tester secrets target add staging --url https://staging.app.com
noob-tester secrets set LOGIN_EMAIL "[email protected]" --target staging --role admin
noob-tester secrets set LOGIN_PASSWORD "op:Private/MyApp/password" --target staging --role admin

# Open the live dashboard
noob-tester watch

In Claude Code:

> Use noob-tester to test PROJ-123 at https://staging.app.com
> /noob-analyze PROJ-123
> /noob-testcase PROJ-123
> /noob-explore test the login page at https://staging.app.com
> /noob-api-explore run the API tests for PROJ-123
> /noob-rca analyze the failures
> /noob-report generate a report for PROJ-123

Coverage and risk (CLI):

# Build coverage map and find gaps
noob-tester coverage build frontend
noob-tester coverage uncovered frontend

# Select tests affected by a branch
noob-tester testcase select --repo frontend --diff main

# Score and prioritize by risk
noob-tester testcase risk --ticket PROJ-123

AI Skills

Skills are markdown instruction files (SKILL.md) installed into Claude Code's skills directory. Each skill is self-contained and composable — use them standalone or chain them through a full QA pipeline.

Install location: ~/.claude/skills/<skill-name>/SKILL.md

Skill Reference

| Skill | Phase | Trigger | | -------------------- | --------------- | --------------------------------------------------------------------- | | /noob-tester | Orchestrator | "test PROJ-123" — routes to the right skill or runs the full pipeline | | /noob-analyze | 1 — Analysis | "analyze the impact of PROJ-123" | | /noob-plan | 2 — Planning | "PROJ-123 is ready for QA, plan the testing" | | /noob-testcase | 3 — Generation | "write test cases for PROJ-123" | | /noob-explore | 4 — UI Testing | "test the login page at https://staging.app.com" | | /noob-api-explore | 4 — API Testing | "run the API tests for PROJ-123" | | /noob-rca | 5 — RCA | "why did these tests fail?" | | /noob-report | 5 — Report | "generate a report for PROJ-123" | | /noob-mr-pr | Utility | "review the MR for PROJ-123" | | /noob-claim | 4 — Claiming | "claim the next test case for PROJ-123" | | /noob-pool | 4 — Parallel | "run pool agents for PROJ-123" — dispatch multiple agents in parallel | | /noob-repos-setup | Utility | "set up repos for PROJ-123" | | /noob-ticket-cache | Utility | ticket context caching (called internally by other skills) |

Skill Details

/noob-tester — Main orchestrator. Routes to the right skill based on natural language. Runs the full 5-phase pipeline when asked to "test" a ticket end-to-end.

/noob-analyze — Phase 1. Reads the ticket (via Atlassian MCP), auto-discovers repos from MR links and ticket description, syncs + indexes the codebase, and produces four analysis types:

Gap analysis — what's missing or unclear in the requirements
Requirements analysis — structured requirements breakdown
Feasibility analysis — what can and can't be automated
Impact analysis — traces every requirement through the full codebase dependency chain (UI → API → service → database), finds regression risks, shared code concerns, config/feature-flag issues, and hidden edge cases

/noob-plan — Phase 2. Runs when a ticket is dev-complete and ready for QA. Fetches linked MRs/PRs, reads actual code diffs, syncs the correct branch, reads prior analysis and UI map, and generates an ordered test plan with: strategy, test steps (confident vs uncertain), test notes, blockers, coverage gaps, and MR references.

/noob-testcase — Phase 3. Generates BDD (Given/When/Then) and Traditional (Steps/Expected) test cases from the plan, analysis, and deep codebase reading. Produces three types: direct_functional → impact_regression → general_regression. Tags every test case with a layer (ui, api, ui_api, database, ai, unit, other) that determines which runner can execute it.

/noob-explore — Phase 4, UI layer. Browser automation (agent-browser / Playwright). Executes ui and ui_api test cases one at a time via run packs. On every page: takes snapshot + screenshot, scans elements into the UI map (stable selectors: role[name="text"]), runs axe-core WCAG audit, checks visual regression against baseline. Uses credentials from secrets store. Recovers from selector failures by checking UI map staleness and retrying from a fresh snapshot.

/noob-api-explore — Phase 4, API layer. Executes ALL api layer test cases in a single invocation using curl + jq. Reads codebase once, authenticates once per role, loops through every API test. Validates status codes, response bodies, timing. Tracks created resources for cleanup in reverse order after each test.

/noob-rca — Phase 5, RCA. Classifies every failed/blocked run pack entry into: env_issue, flaky_selector, actual_bug, test_data_issue, network, auth_issue, timeout, unknown. Assigns confidence scores and suggested actions. Updates failure patterns for future risk scoring.

/noob-report — Phase 5, Report. Pulls all data for a ticket (analyses, plan, test cases, run packs, issues, RCA, a11y, visual diffs, tech issues), generates a PASS/FAIL/PARTIAL verdict, posts results to the ticket, and notifies Slack.

/noob-mr-pr — Utility. Reviews a GitLab MR or Bitbucket PR for a ticket — reads the diff, checks against acceptance criteria, surfaces concerns.

/noob-claim — Phase 4, Claiming. Claims test cases from run packs for execution. Three modes: claim next (uses claim-smart — retry failed → resume pending → claim new), claim by name (substring match with validation), and retry a specific test. Auto-creates session and run pack if not provided. Returns a $ENTRY JSON object for /noob-explore to execute.

/noob-pool — Phase 4, Parallel dispatch. Reads qa_pool_agents config for the ticket, enumerates all pending test cases (sorted by priority: direct_functional → impact_regression → general_regression), checks the most recent run pack to skip already-claimed/running/passed test cases, then fires one claude sub-agent per test case as a background process (fire-and-forget). Supports a MAX_SPAWNS limit (default 5, overridable). Each agent invocation targets a specific test case by name — eliminating all race conditions. Monitor results in the Pool page of the watch dashboard.

/noob-repos-setup — Utility. Ensures all repos for a ticket are registered, cloned/pulled, and indexed. Called internally by analysis and planning skills.

/noob-ticket-cache — Utility. Manages the ticket context cache (ticket info, MR diffs, comments, linked tickets). Prevents redundant Atlassian MCP and glab calls across skills.

Example Pipeline

# In Claude Code — full 5-phase pipeline
/noob-analyze PROJ-123                    # Phase 1: Analysis + impact
/noob-plan PROJ-123                       # Phase 2: Test plan from MR diffs
/noob-testcase PROJ-123                   # Phase 3: Generate test cases
/noob-claim PROJ-123                      # Phase 4: Claim next test case
/noob-explore https://staging.app.com     # Phase 4: Browser test (ui/ui_api)
/noob-api-explore PROJ-123                # Phase 4: API tests (api layer, all at once)
/noob-rca                                 # Phase 5: Classify failures
/noob-report PROJ-123                     # Phase 5: Final report + ticket update

CLI Reference

`noob-tester repos` — Manage repositories and codebase index

Register repos, group them, sync to local disk, and build a searchable index with BM25 full-text search + import dependency graph.

| Command | Description | | -------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------- | | repos add <name> <url> | Register a repository | | repos list | List all registered repos | | repos delete <name> --yes | Delete a repo and its index | | repos path <name> | Get local path of a synced repo | | repos group add <name> --repos a,b,c | Create a repo group | | repos group list | List all groups | | repos group delete <name> | Delete a group | | repos discover --ticket <id> | Find all repos for a ticket (from runs, test cases, UI maps) + ensure them. --url <extra> to add more. Auto: register + clone/pull + diff-aware index | | repos ensure <urls...> | Register + clone/pull + index repos. Accepts URLs or names. Uses glab for GitLab repos. All repos in ~/.noob-tester/repos/ | | repos sync <name> | Clone or pull a repo or group. --branch <branch> to checkout a specific branch. --reindex to auto-re-index if commit changed | | repos index <name> | Diff-aware re-index (only changed files since last indexed commit). --full for complete rebuild. Records branch + commit | | repos search <query> | Search indexed code | | repos search <query> --expand | Search + show related files via import graph | | repos search <query> --repos a,b | Search specific repos |

# Register and group
noob-tester repos add frontend https://gitlab.com/org/frontend
noob-tester repos add backend https://gitlab.com/org/backend
noob-tester repos group add myapp --repos frontend,backend

# Sync and index
noob-tester repos sync myapp
noob-tester repos index myapp
# ✔ frontend: 342 files, 1205 imports (main @ a1b2c3d4)
# ✔ backend: 189 files, 567 imports (main @ e5f6g7h8)

# Sync a specific branch (e.g. for testing an MR)
noob-tester repos sync frontend --branch feature/PROJ-123 --reindex
# ✔ frontend: switched to feature/PROJ-123 @ 9i0j1k2l, re-indexed

# Search with import graph expansion
noob-tester repos search "authentication middleware" --expand
# Finds auth.ts + every file that imports it + every file it imports

# Claude Code can also read synced repos directly
noob-tester repos path frontend
# /Users/you/.noob-tester/repos/frontend

`noob-tester run` — Manage test runs

| Command | Description | | --------------------------------------------------- | ---------------------------------------------------------------------------------------------------------- | | run resolve --input-type <type> --input-ref <ref> | Resume or create a run. Reuses existing running/pending run for same input-ref. --fresh to force new | | run create --input-type <type> --input-ref <ref> | Always create a new run (CLI only — skills should use resolve) | | run create ... --target-url <url> | Set the target app URL | | run create ... --repo <url> | Attach repo URLs (repeatable) | | run create ... --reuse-run <id> | Reuse prior analysis/plan (skip Phase 1 & 2) | | run create ... --fresh | Ignore all prior run data | | run create ... --force | Override — regenerate analysis/plan/testcases even if they exist | | run create ... --capture <types> | Comma-separated capture types: screenshot,snapshot,video,har,console,trace (default: all) | | run create ... --secret-target <name> | Secret target name for login credentials | | run create ... --secret-role <role> | Secret role within the target (default: "default") | | run update <id> --phase <n> | Update current phase | | run complete <id> --status <s> --summary <text> | Mark run completed/failed | | run get <id> | Get run details as JSON |

Input types: ticket, confluence, text, file

# From ticket with repos, capture config, and credential reference
noob-tester run create --input-type ticket --input-ref PROJ-123 \
  --target-url https://staging.app.com \
  --repo https://gitlab.com/org/frontend \
  --repo https://gitlab.com/org/backend \
  --capture screenshot,snapshot,video,har,console,trace \
  --secret-target staging --secret-role admin

# Reuse prior analysis
noob-tester run create --input-type text --input-ref "re-test login" \
  --target-url https://app.com --reuse-run <priorRunId>

`noob-tester testcase` — Generate and manage test cases

Test cases in BDD or traditional format, with multi-session claim system.

Test case types (execution priority):

direct_functional — core feature/fix tests (executed first)
impact_regression — tests for impacted dependencies (executed second)
general_regression — crucial flows not directly touched (executed last)

Test layers (what kind of test): | Layer | Description | |---|---| | ui | Pure UI interaction — clicks, forms, navigation (default) | | api | Pure API — request/response, status codes, payloads | | ui_api | Spans both — UI action triggers API call, verify both sides | | database | Data persistence, queries, migrations, constraints | | ai | AI/ML features — prompts, responses, model behavior | | unit | Code-level unit test — functions, utilities, pure logic | | other | Does not fit above categories |

/noob-explore only executes ui and ui_api tests (browser automation). Other layers need different runners.

| Command | Description | | ------------------------------------------------------------------------------------ | ---------------------------------------------------------------------------------------- | | testcase create <runId> --ticket <ref> --type <type> --format <fmt> --title <text> | Create a test case (draft by default) | | testcase create ... --layer <layer> | Set test layer: ui, api, ui_api, database, ai, unit, other (default: ui) | | testcase create ... --ready | Create and mark as ready for execution | | testcase mark-ready <id> | Mark a test case as ready | | testcase mark-draft <id> | Mark a test case as draft (not executable) | | testcase ready-all <ticketRef> | Mark all test cases for a ticket as ready | | testcase draft-all <ticketRef> | Mark all test cases for a ticket as draft | | testcase claim <ticketRef> <sessionId> | Claim next available ready case (priority order) | | testcase claim ... --fresh | Also claim previously completed cases | | testcase result <id> --status <s> --run <runId> | Record execution result | | testcase release <id> | Release a claimed case | | testcase release-session <sessionId> | Release all claims by a session | | testcase list --ticket <ref> | List cases for a ticket | | testcase stats <ticketRef> | Show counts by type/status | | testcase select --repo <name> --diff <branch> | Select test cases affected by code changes (via coverage_map + import graph) | | testcase risk --ticket <ref> | Compute risk scores from failure patterns, code churn, flakiness, recency | | testcase audit --ticket <ref> | Audit: find duplicates, never-failed, stale. --duplicates, --orphaned, --stale |

# BDD format with test layer
noob-tester testcase create $RUN_ID \
  --ticket PROJ-123 --type direct_functional --format bdd \
  --title "Login with valid credentials" \
  --bdd-feature "Login" --bdd-scenario "Valid login" \
  --bdd-given '["user on /login"]' \
  --bdd-when '["enters email","enters password","clicks Sign In"]' \
  --bdd-then '["redirected to dashboard"]' \
  --impacted-files '["src/auth/login.ts"]' \
  --layer ui

# Traditional format — API test (won't be picked up by noob-explore)
noob-tester testcase create $RUN_ID \
  --ticket PROJ-123 --type direct_functional --format traditional \
  --title "POST /api/auth/login returns 200 with valid creds" \
  --trad-steps '[{"step":"POST /api/auth/login with valid body","expected":"200 + session token"}]' \
  --layer api

# UI + API test
noob-tester testcase create $RUN_ID \
  --ticket PROJ-123 --type impact_regression --format traditional \
  --title "Checkout still works after auth change" \
  --trad-steps '[{"step":"Go to checkout","expected":"Page loads"}]' \
  --layer ui_api

# Multi-session: each session claims one test case at a time
noob-tester testcase claim PROJ-123 $SESSION_ID       # gets next unclaimed
noob-tester testcase claim PROJ-123 $SESSION_ID --fresh  # re-runs completed too

`noob-tester runpack` — Run packs (execution batches)

Run packs are the execution layer for /noob-explore. Each pack groups test case executions for a ticket, with stored target URL, credential references, and capture config.

Default behavior: /noob-explore resumes an existing pack with pending/failed entries. If none exist, it creates a new pack. If the user says "rerun" or "fresh", it forces a new pack.

| Command | Description | | ---------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | | runpack resolve --ticket <id> --run <runId> | Resume or create a run pack. Checks for existing packs with pending/failed entries first. --fresh to force new. Optional: --target-url, --secret-target, --secret-role, --capture, --session | | runpack create --ticket <id> --run <runId> | Create a new run pack (always fresh, CLI only — skills should use resolve). Optional: --target-url, --secret-target, --secret-role, --capture, --session | | runpack meta <packId> | Get pack metadata (target, credentials, capture config) | | runpack add <packId> <testCaseId> | Add a specific test case to a pack | | runpack claim <packId> <sessionId> | Claim next pending entry already in the pack (resume mode) | | runpack claim-next <packId> <ticketId> <sessionId> | Pick next test case not yet in the pack, add and claim it. --layer to filter by test layer. --runner to set runner type (ui/api, auto-detected from layer) | | runpack populate <packId> <ticketId> --status <s> | Add ready test cases to pack with status: pending, blocked, skipped. --layer to filter by test layer (e.g. --layer api). --runner to stamp entries. Optional: --reason, --run, --session | | runpack result <entryId> --status <s> | Record result: passed, failed, skipped, blocked. Optional: --results, --logs, --observations, --issues (all JSON) | | runpack artifact <entryId> --type <t> --path <p> | Attach artifact: screenshot, snapshot, video, har, console, trace. Optional: --label, --step, --metadata | | runpack observe <entryId> --text <t> | Add an observation | | runpack log <entryId> --text <t> | Add a log entry | | runpack list --ticket <id> | List run packs for a ticket (with pass/fail/pending counts) | | runpack list --pack <packId> | List entries in a specific pack | | runpack release <packId> | Release all claimed entries back to pending | | runpack retry --entry <entryId> | Retry a specific entry (reset to pending) | | runpack retry --name <text> --pack <packId> | Retry entries matching test case name (substring) | | runpack retry --pack <packId> | Retry all failed/blocked entries | | runpack retry --all <packId> | Retry ALL entries including passed (full rerun of same pack) | | runpack delete --pack <packId> --yes | Delete a specific pack | | runpack delete --ticket <id> --yes | Delete all packs for a ticket | | runpack auto-retry <packId> | Mark all failed/blocked entries for auto-retry (max 1 retry per entry) | | runpack classify-retry <entryId> --status <s> | Classify retry result: likely_false_positive if passed, confidence level if failed | | runpack false-positives <packId> | Show false positive analysis (total, retried, false positives, confirmed, by confidence) |

# Resolve — resumes existing pack or creates new (always use this, not create)
noob-tester runpack resolve --ticket PROJ-123 --run $RUN_ID \
  --target-url "https://staging.app.com" \
  --secret-target staging --secret-role admin \
  --capture screenshot,snapshot,har
# Returns: { runPackId, resumed: true/false }

# Claim one test case (resume pending first, then fresh)
noob-tester runpack claim $RUNPACK_ID $SESSION_ID         # resume pending entry
noob-tester runpack claim-next $RUNPACK_ID PROJ-123 $SESSION_ID  # or claim fresh
noob-tester runpack claim-next $RUNPACK_ID PROJ-123 $SESSION_ID --layer ui  # only UI tests

# Record results and artifacts per entry
noob-tester runpack result $ENTRY_ID --status passed --results '{"summary":"all good"}'
noob-tester runpack artifact $ENTRY_ID --type screenshot --path ./step1.png --label "After login" --step 1

`noob-tester capture` — Per-action artifact storage

Stores snapshots, console logs, HAR network data, screenshots, and network errors per action, linked to run, runpack entry, page URL, and action number.

| Command | Description | | ------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | | capture store --run <runId> --type <type> | Store an artifact. Types: snapshot, screenshot, console, har, video, trace, network_error, api_request. --file <path> or --content <text>. Optional: --pack, --entry, --session, --ticket, --action <n>, --desc, --url <pageUrl> | | capture list --run <runId> | List artifacts for a run. --entry <id> for a specific entry. --type to filter | | capture stats --run <runId> | Show artifact counts by type |

# Store console logs for an action
noob-tester capture store --run $RUN_ID --type console --file ./evidence/console.txt \
  --url "/dashboard" --action 3 --desc "After clicking Save" --ticket FEAT-7679

# List all HAR files for a run
noob-tester capture list --run $RUN_ID --type har

# Stats
noob-tester capture stats --run $RUN_ID
# {"snapshot":5,"screenshot":5,"console":5,"har":5}

`noob-tester uimap` — UI maps (persistent app knowledge)

UI maps are a persistent knowledge base of how an app's UI works — pages, selectors, navigation paths, forms, reliability tracking. Shared across targets with the same repos. Grows with every /noob-explore session.

A map is defined by repos, not targets. Multiple targets (staging, prod, dev) sharing the same codebase share the same map. Fetchable by ticket ID, repo URL, or target URL.

Stable selectors — uimap scan stores elements using role + text/label/placeholder/url (e.g. button[name="Sign In"], textbox[placeholder="Search"]), not ephemeral [ref=eN] refs. Each element records its selector strategy type (role+text, role+placeholder, role+url, ref). The map tells you WHAT elements to expect, the current browser snapshot tells you WHERE they are.

| Command | Description | | ---------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | uimap create --name <n> | Create a map. Optional: --repos, --targets, --tickets (comma-separated) | | uimap get <id> | Get map details + stats | | uimap list | List all maps with stats | | uimap resolve --ticket <id> | Find a map by ticket ID, --repo, or --target. Returns first match | | uimap update <id> | Add repos/targets/tickets: --add-repos, --add-targets, --add-tickets | | uimap delete <id> --yes | Delete map and all its data | | uimap page <mapId> --url <pattern> | Record/update a page (upserts by URL). Optional: --title, --snapshot, --screenshot, --auth-required, --auth-roles, --code, --repos, --tickets, --parity, --run, --session | | uimap pages <mapId> | List all pages | | uimap element <pageId> --selector <sel> --type <t> | Record/update an element (upserts by selector). Optional: --role, --text, --action, --result, --code, --tickets, --auth-roles, --run, --testcase | | uimap elements <pageId> | List elements on a page | | uimap lookup --map <id> --url <pattern> | Look up elements by URL. --type to filter. Sorted by reliability | | uimap hit <elementId> | Record selector success. --run optional | | uimap miss <elementId> | Record selector failure. Auto-updates status (working/flaky/broken) | | uimap alt <elementId> --selector <sel> | Add alternative selector | | uimap flaky <mapId> | List flaky/broken elements | | uimap nav <mapId> --from <pageId> --to <pageId> | Record navigation. --via element, --type, --conditions | | uimap path --map <id> --from <url> --to <url> | Find navigation path between URLs (BFS pathfinding) | | uimap form <pageId> | Record/update a form. --selector, --fields (JSON), --submit, --success, --error, --sample-values | | uimap scan <pageId> --snapshot <path> | Parse accessibility snapshot and bulk-record all elements + forms. Stores stable selectors: role[name="text"], role[placeholder], role[url], @ref fallback. Records selector strategy per element. --ticket, --run, --session optional | | uimap stats <mapId> | Show map statistics |

# Create a map for the app (defined by repos, not target)
noob-tester uimap create --name "My App" \
  --repos "https://gitlab.com/org/frontend,https://gitlab.com/org/backend" \
  --targets "https://staging.app.com,https://prod.app.com" \
  --tickets "PROJ-123"

# Find existing map by ticket, repo, or target
noob-tester uimap resolve --ticket PROJ-456

# Record pages and scan elements from snapshot (2 commands per page)
noob-tester uimap page $MAP_ID --url "/login" --title "Login" --ticket PROJ-123 --run $RUN_ID
noob-tester uimap scan $PAGE_ID --snapshot ./snapshot.txt --ticket PROJ-123 --run $RUN_ID
# Scan creates: button[name="Sign In"], textbox[name="Email"], link[url="/forgot-password"]

# Track selector reliability
noob-tester uimap hit $ELEMENT_ID --run $RUN_ID   # worked
noob-tester uimap miss $ELEMENT_ID --run $RUN_ID  # failed

# Navigation pathfinding
noob-tester uimap path --map $MAP_ID --from "/login" --to "/checkout"

# Track target parity (staging has it, prod doesn't)
noob-tester uimap page $MAP_ID --url "/beta-feature" \
  --parity '{"staging":true,"prod":false}'

`noob-tester apimap` — API maps (persistent endpoint registry)

Like UI maps for the frontend, API maps are a persistent knowledge base of your backend. Endpoints, parameters, response schemas, dependency chains, and health tracking — all visualized as a force-directed graph in the dashboard.

| Command | Description | | ------------------------------ | --------------------------------------------------------------------------------------------- | | apimap resolve <name> | Find or create an API map. --base-url, --tickets, --repos | | apimap endpoint <mapId> | Register/update an endpoint. --method, --path, --summary, --auth-type, --auth-roles | | apimap call <endpointId> | Record a call result. --status (HTTP code), --time (ms). Updates health automatically | | apimap param <endpointId> | Add a parameter. --name, --in (path/query/body/header), --type, --required | | apimap response <endpointId> | Register expected response. --status, --schema, --example | | apimap chain <mapId> | Add dependency. --from, --to, --type (creates/reads/updates/deletes/cleanup) | | apimap lookup <mapId> | Find endpoint by --method + --path | | apimap list | List all API maps | | apimap get <name> | Full map data (endpoints, params, responses, chains) | | apimap stats <name> | Statistics (total, active, flaky, failing, avg response time) |

# Create or find an API map
APIMAP_ID=$(noob-tester apimap resolve "my-api" --base-url https://api.staging.com --tickets PROJ-123 | jq -r '.id')

# Register endpoints discovered from code analysis
EP_ID=$(noob-tester apimap endpoint $APIMAP_ID --method POST --path "/api/users" \
  --summary "Create user" --auth-type bearer --auth-roles "admin" | jq -r '.endpointId')

# Add params and responses
noob-tester apimap param $EP_ID --map $APIMAP_ID --name email --in body --type string --required
noob-tester apimap response $EP_ID --map $APIMAP_ID --status 201 --schema '{"id":"string","email":"string"}'

# Record call results during testing (auto-updates health)
noob-tester apimap call $EP_ID --status 201 --time 150 --run $RUN_ID

# Add dependency chains
GET_EP_ID=$(noob-tester apimap endpoint $APIMAP_ID --method GET --path "/api/users/:id" | jq -r '.endpointId')
noob-tester apimap chain $APIMAP_ID --from $EP_ID --to $GET_EP_ID --type creates

Endpoint health updates automatically based on call results:

active — no failures or low failure rate
flaky — intermittent failures (some succeed, some fail)
failing — consistently failing (3+ consecutive failures)

`noob-tester coverage` — Code-level coverage mapping

Link test cases to source files via impacted_files + import graph expansion. Find which source files have no test coverage.

| Command | Description | | ------------------------------------- | ---------------------------------------------------------------------------------------------- | | coverage build <repoName> | Build coverage map from test case impacted_files + 1-level import graph expansion | | coverage stats <repoName> | Show coverage statistics (total/covered/uncovered files, coverage %) | | coverage uncovered <repoName> | List files with no test case coverage, sorted by importer count (more importers = higher risk) | | coverage file <repoName> <filePath> | Show which test cases cover a specific file (with link type and confidence) | | coverage clear <repoName> | Clear coverage map for a repo (rebuild with coverage build) |

# Build coverage map (reads test_cases.impacted_files, expands via import_graph)
noob-tester coverage build frontend

# View stats
noob-tester coverage stats frontend
# Total: 342, Covered: 89, Uncovered: 253, Coverage: 26%

# Find highest-risk uncovered files
noob-tester coverage uncovered frontend --limit 20

# Which test cases cover auth.ts?
noob-tester coverage file frontend src/auth/login.ts

`noob-tester rca` — Root cause analysis

Classify failures from completed run packs. Used by /noob-rca skill or standalone.

| Command | Description | | ------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------- | | rca save | Save an RCA result. --pack, --entry, --testcase, --classification, --confidence, --cause required. Optional: --evidence, --pattern, --action | | rca list --pack <id> | List RCA results for a run pack (with test case details) | | rca summary --pack <id> | Summary counts by classification and suggested action | | rca get <entryId> | Get RCA result for a specific entry | | rca clear --pack <id> | Clear all RCA results for re-analysis |

Classifications: env_issue, flaky_selector, actual_bug, test_data_issue, network, auth_issue, timeout, unknown

Suggested actions: retry, fix_test, fix_app, fix_env, investigate, skip

# Save an RCA result
noob-tester rca save --pack $PACKID --entry $ENTRY_ID --testcase $TC_ID \
  --classification actual_bug --confidence 0.9 \
  --cause "Auth middleware doesn't pass session to downstream services" \
  --evidence "Console shows 500 on /api/orders, HAR confirms missing session header" \
  --action fix_app

# View summary
noob-tester rca summary --pack $PACKID
# { total: 8, byClassification: { actual_bug: 3, env_issue: 2, flaky_selector: 1, network: 2 } }

`noob-tester a11y` — Accessibility testing

Store and query axe-core WCAG audit results. Automatically populated by /noob-explore on every page load.

| Command | Description | | ---------------------- | --------------------------------------------------------------------------------------------------------------------- | | a11y scan <runId> | Store axe-core violations JSON. --url, --results (JSON array). Optional: --pack, --entry, --page-id | | a11y add <runId> | Store a single a11y issue. --url, --rule, --impact, --description. Optional: --wcag, --selector, --html | | a11y list | List a11y issues. --run, --pack, or --page to filter | | a11y summary <runId> | Summary by impact level and rule, with page count |

Impact levels: critical, serious, moderate, minor (mapped from axe-core)

# Store axe-core results from browser evaluation
noob-tester a11y scan $RUN_ID --url "/login" --results "$AXE_VIOLATIONS" \
  --pack $PACKID --entry $ENTRY_ID

# View summary
noob-tester a11y summary $RUN_ID
# { total: 12, byImpact: { serious: 4, moderate: 6, minor: 2 }, pageCount: 3 }

# List all issues for a pack
noob-tester a11y list --pack $PACKID --json

`noob-tester testcase select` — Test selection by code changes

Given a git diff, find which test cases should run based on coverage map + import graph.

| Command | Description | | ----------------------------------------------- | -------------------------------------------------------------------------------------------------------------- | | testcase select --repo <name> --diff <branch> | Select test cases affected by changed files. --ticket to scope. --depth <n> for deeper expansion. --json |

# Which test cases should run for this branch?
noob-tester testcase select --repo frontend --diff main
# Changed: 5 files, Affected: 12 files (with imports), Test cases: 8

noob-tester testcase select --repo frontend --diff main --ticket PROJ-123 --json

Requires coverage build first — builds the file-to-testcase mapping.

`noob-tester testcase risk` — Risk-based prioritization

Compute risk scores from failure patterns, code churn, flakiness, and recency.

| Command | Description | | ------------------------------ | ---------------------------------------------------------------- | | testcase risk --ticket <ref> | Compute and store risk scores for all ready test cases. --json |

noob-tester testcase risk --ticket PROJ-123
# Computed: 15 test cases, Avg score: 0.42, High risk: 3

# Claim in risk order (highest risk first)
noob-tester runpack claim-next $PACKID PROJ-123 $SESSION --layer ui --risk

`noob-tester runpack auto-retry / false-positives` — False positive reduction

Auto-retry failed entries to distinguish real failures from transient issues.

| Command | Description | | ----------------------------------------------- | ------------------------------------------------------------------------------ | | runpack auto-retry <packId> | Mark all failed/blocked entries for retry (max 1 retry each) | | runpack classify-retry <entryId> --status <s> | Classify retry result: likely_false_positive if passed, confidence if failed | | runpack false-positives <packId> | Show false positive stats. --json |

# After execution completes with failures:
noob-tester runpack auto-retry $PACKID  # resets failed → pending (retry_count++)

# After retry pass completes:
noob-tester runpack classify-retry $ENTRY_ID --status passed  # → likely_false_positive

# View analysis
noob-tester runpack false-positives $PACKID
# Total failed: 8, Retried: 8, False positives: 3, Confirmed: 5

`noob-tester testcase audit` — Test suite cleanup & deduplication

Audit test cases for duplicates, never-failed, orphaned, and stale entries.

| Command | Description | | ---------------------------------------------- | --------------------------------------------------------------------------------- | | testcase audit --ticket <ref> | Full audit: duplicates + never-failed + stale. --json | | testcase audit --duplicates --ticket <ref> | Only near-duplicate pairs (Jaccard similarity). --threshold <n> (default: 0.65) | | testcase audit --never-failed --ticket <ref> | Test cases executed but never failed (potential low-value) | | testcase audit --orphaned | Test cases with no run pack activity in 90 days (across all tickets) | | testcase audit --stale --ticket <ref> | Test cases not executed in 30+ days |

# Full audit
noob-tester testcase audit --ticket PROJ-123
# Total: 24, Duplicates: 2 pairs, Never failed: 5, Stale: 3

# Just duplicates with custom threshold
noob-tester testcase audit --duplicates --ticket PROJ-123 --threshold 0.7 --json

`noob-tester visual` — Visual regression testing

Compare screenshots against baselines per page/viewport. Hash-based quick check + Claude vision for detailed analysis.

| Command | Description | | ----------------------------------------------------------------- | ----------------------------------------------------------------------------------------- | | visual baseline --page <id> --url <pattern> --screenshot <path> | Set baseline. --viewport, --run, --entry | | visual compare --page <id> --screenshot <path> | Compare against baseline (hash check). Returns { hasBaseline, hashMatch, baselinePath } | | visual diff-save --baseline <id> --run <runId> --current <path> | Save diff result. --score, --description, --regression, --entry | | visual list | List diffs. --run, --unreviewed, --json | | visual accept <diffId> | Accept current screenshot as new baseline | | visual review <diffId> | Mark reviewed: --regression or --ok | | visual stats | Stats: baselines, diffs, regressions, reviewed/unreviewed. --run, --json |

# Set baseline on first passing run
noob-tester visual baseline --page $PAGE_ID --url "/login" \
  --screenshot ./evidence/login.png --run $RUN_ID

# Compare on next run
noob-tester visual compare --page $PAGE_ID --screenshot ./evidence/login-current.png
# { hasBaseline: true, hashMatch: false, baselinePath: "...", baselineId: "..." }

# Save diff after Claude vision analysis
noob-tester visual diff-save --baseline $BASELINE_ID --run $RUN_ID \
  --current ./evidence/login-current.png \
  --score 0.6 --description "Submit button changed from blue to green" --regression

# Review
noob-tester visual list --unreviewed
noob-tester visual review $DIFF_ID --regression
noob-tester visual accept $DIFF_ID  # promote as new baseline

`noob-tester secrets` — Manage credentials

Scoped to targets (environments/apps) and roles (admin, user, api). Supports literal values, environment variables (env:), and 1Password (op:).

| Command | Description | | -------------------------------------------------------- | ----------------------------------- | | secrets target add <name> --url <url> | Register a target | | secrets target list | List all targets and roles | | secrets target delete <name> --yes | Delete a target and all its secrets | | secrets set <key> <value> --target <t> --role <r> | Set a secret | | secrets get-profile --target <t> --role <r> | Get all resolved secrets | | secrets get-profile --url <url> --role <r> | Get secrets by matching URL | | secrets delete <key> --target <t> --role <r> | Delete a secret | | secrets delete-role --target <t> --role <r> | Delete all secrets for a role | | secrets list | List all (values masked) | | secrets list --target <t> | Filter by target | | secrets list --role <r> | Filter by role | | secrets list --url <url> | Filter by URL | | secrets find <search> | Find by key or value (e.g. email) | | secrets import-op <vault/item> --target <t> --role <r> | Import from 1Password | | secrets import-op ... --live | Store as op: refs (always fresh) | | secrets import-op ... --map label=KEY | Custom field mapping | | secrets import-op ... --prefix APP_ | Prefix all keys |

# Register targets
noob-tester secrets target add staging --url https://staging.app.com
noob-tester secrets target add prod --url https://prod.app.com

# Set credentials
noob-tester secrets set LOGIN_EMAIL "[email protected]" --target staging --role admin
noob-tester secrets set LOGIN_PASSWORD "op:Private/MyApp/password" --target staging --role admin
noob-tester secrets set API_TOKEN "env:STAGING_TOKEN" --target staging --role api

# Import all fields from 1Password at once
noob-tester secrets import-op "Private/MyApp" --target staging --role admin
noob-tester secrets import-op "Private/MyApp" --target staging --role admin --live  # keep as op:// refs

# Vault names with slashes work — last segment is the item name
noob-tester secrets import-op "ENG/Development/TeamEnablementQA" --target staging --role admin --live

# Query
noob-tester secrets get-profile --target staging --role admin
noob-tester secrets get-profile --url https://staging.app.com --role admin
noob-tester secrets find "[email protected]"

`noob-tester session` — Track active sessions

| Command | Description | | -------------------------------------------------------------------------- | ---------------------------------------------- | | session start --task <text> --labels <a,b> --tickets <PROJ-123,PROJ-456> | Register a session with labels and ticket refs | | session heartbeat <id> --phase <n> --run-id <id> --tickets <PROJ-789> | Keep alive, add tickets | | session end <id> | Mark completed | | session get <id> | Get details | | session link <runId> <sessionId> | Link a run to a session | | session list | List all (marks stale after 5min) | | session list --active | Only active sessions |

`noob-tester qa-pool` — Agent Orchestration

Map multiple specialized agents to a single ticket and dispatch them in parallel with zero race conditions. Each agent is a Claude Code agent .md file (e.g., a UI tester, API tester, or security tester). The orchestrator enumerates all pending test cases first and assigns each one explicitly to an agent — no two agents compete for the same test case.

Why use this? When you have a large ticket with many test cases, you can run multiple agents in parallel — each targeting a specific test case — and monitor all results in the Pool dashboard.

| Command | Description | | ------------------------------------------ | ----------------------------------------------------------------------------- | | qa-pool add --ticket <id> --agent <path> | Associate an agent with a ticket. --target, --role, --file optional | | qa-pool list --ticket <id> | List all agents for a ticket. --json for raw output

Published

Vulnerabilities

Links

Maintainers

Keywords

Readme

noob-tester

Table of Contents

What It Is

Architecture

Installation

Quick Start

AI Skills

Skill Reference

Skill Details

Example Pipeline

CLI Reference

noob-tester repos — Manage repositories and codebase index

noob-tester run — Manage test runs

noob-tester testcase — Generate and manage test cases

noob-tester runpack — Run packs (execution batches)

noob-tester capture — Per-action artifact storage

noob-tester uimap — UI maps (persistent app knowledge)

noob-tester apimap — API maps (persistent endpoint registry)

noob-tester coverage — Code-level coverage mapping

noob-tester rca — Root cause analysis

noob-tester a11y — Accessibility testing

noob-tester testcase select — Test selection by code changes

noob-tester testcase risk — Risk-based prioritization

noob-tester runpack auto-retry / false-positives — False positive reduction

noob-tester testcase audit — Test suite cleanup & deduplication

noob-tester visual — Visual regression testing

noob-tester secrets — Manage credentials

noob-tester session — Track active sessions

noob-tester qa-pool — Agent Orchestration