@pyratzlabs/gadget

v0.2.5

Published

22 days ago

AI-powered E2E testing CLI tool

nicobuchet

Gadget

AI-powered E2E testing CLI that acts as an agentic beta tester.

Write tests in YAML, run them against any web application with Playwright, and get production readiness assessments powered by Claude.

Quick Start

1. Install

npm i -g @pyratzlabs/gadget
npx playwright install chromium

2. Initialize a project

gadget init

This creates a .gadgetrc.yaml config file and a sample test in tests/example.test.yaml.

3. Run an audit

Set your Anthropic API key, then run the audit command against your app:

export ANTHROPIC_API_KEY="sk-ant-..."
gadget audit tests/ --base-url https://staging.myapp.com

Gadget runs every test flow with screenshots at each step, then sends them to Claude to evaluate the UI from a real user's perspective. You get a verdict (ready, not-ready, or needs-attention), a quality score (0-100), and actionable findings.

To sync audit findings into Linear tickets:

export LINEAR_API_KEY="lin_api_..."
gadget audit tests/ --base-url https://staging.myapp.com --linear --linear-team <team-id>

Gadget creates tickets with a [Gadget Audit] prefix, includes the audit description and screenshots, and avoids opening a duplicate when it finds an existing open Gadget-created issue for the same finding.
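The duplicate check described above can be pictured with a small sketch. The `AuditFinding` and `TrackerIssue` shapes and the exact-title matching rule are assumptions made for illustration; this README does not show how Gadget's Linear integration actually identifies "the same finding".

```typescript
// Hypothetical shapes: Gadget's real finding and issue types are not shown in this README.
interface AuditFinding {
  title: string;
  severity: string;
}

interface TrackerIssue {
  title: string;
  isOpen: boolean;
}

// Prefix documented above for Gadget-created tickets.
const TICKET_PREFIX = "[Gadget Audit]";

// Build the ticket title for a finding using the documented prefix.
function ticketTitle(finding: AuditFinding): string {
  return `${TICKET_PREFIX} ${finding.title}`;
}

// Return only the findings that have no matching open, Gadget-created issue.
// Assumption: two tickets describe the same finding when their titles are identical.
function findingsNeedingTickets(
  findings: AuditFinding[],
  existingIssues: TrackerIssue[]
): AuditFinding[] {
  const openGadgetTitles = existingIssues
    .filter((issue) => issue.isOpen && issue.title.indexOf(TICKET_PREFIX) === 0)
    .map((issue) => issue.title);

  return findings.filter(
    (finding) => openGadgetTitles.indexOf(ticketTitle(finding)) === -1
  );
}

const sampleFindings: AuditFinding[] = [
  { title: "Login button overlaps footer", severity: "critical" },
  { title: "Low contrast on helper text", severity: "warning" },
];
const sampleIssues: TrackerIssue[] = [
  { title: "[Gadget Audit] Login button overlaps footer", isOpen: true },
  { title: "[Gadget Audit] Low contrast on helper text", isOpen: false },
];

// Only the low-contrast finding needs a new ticket: its earlier ticket is closed,
// while the login-button finding already has an open one.
const toCreate = findingsNeedingTickets(sampleFindings, sampleIssues);
```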

If Linear sync fails with the error "Linear GraphQL request failed with HTTP 400" on version 0.2.0 or 0.2.1, upgrade to 0.2.3 or later and rerun the same audit command.

Global install remediation:

npm i -g @pyratzlabs/gadget@0.2.3
gadget audit tests/ --base-url https://staging.myapp.com --linear --linear-team <team-id>

Project-local remediation:

pnpm add -D @pyratzlabs/gadget@0.2.3
pnpm exec gadget audit tests/ --base-url https://staging.myapp.com --linear --linear-team <team-id>

If your CI runner pins Gadget or caches global/package artifacts, update the pinned version and clear the relevant cache before rerunning the audit.
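A CI script could fail fast when a pinned version predates the fix. The sketch below is only an illustration: `isSafeForLinearSync` and the other helper names are hypothetical, the version string is passed in rather than read from the CLI, and pre-release suffixes are not handled.

```typescript
// First version this README recommends for Linear sync.
const MIN_FIXED_VERSION = "0.2.3";

// Split "major.minor.patch" into numbers (pre-release tags are ignored in this sketch).
function parseVersion(version: string): number[] {
  return version.split(".").map((part) => parseInt(part, 10));
}

// Negative if a < b, zero if equal, positive if a > b.
function compareVersions(a: string, b: string): number {
  const pa = parseVersion(a);
  const pb = parseVersion(b);
  for (let i = 0; i < 3; i++) {
    const diff = (pa[i] || 0) - (pb[i] || 0);
    if (diff !== 0) {
      return diff;
    }
  }
  return 0;
}

// True when the pinned version is at or above the recommended minimum.
function isSafeForLinearSync(installed: string): boolean {
  return compareVersions(installed, MIN_FIXED_VERSION) >= 0;
}

const pinnedIsSafe = isSafeForLinearSync("0.2.1"); // false: 0.2.1 is an affected version
const currentIsSafe = isSafeForLinearSync("0.2.5"); // true
```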

Writing Tests

Tests are YAML files with a name, optional config/variables, and a list of steps:

name: Login Flow
config:
  baseUrl: "https://myapp.com"
  timeout: 10000
  screenshot: on-failure

variables:
  username: "testuser"
  password: "secret123"

steps:
  - navigate: "/login"

  - fill:
      label: "Email"
      value: "{{ username }}"

  - fill:
      label: "Password"
      value: "{{ password }}"
      secure: true

  - click: "Sign In"

  - assert:
      url: "/dashboard"

  - assert:
      text: "Welcome back"
      visible: true

Use {{ variableName }} for test variables and {{ env.VAR_NAME }} for environment variables. Mark sensitive fields with secure: true to mask values in logs.
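To make the two placeholder forms concrete, here is a minimal sketch of how such interpolation and log masking could work. It is not Gadget's implementation: `interpolate` and `formatForLog` are hypothetical names, the environment is passed in as a plain object, and leaving unknown placeholders untouched is an assumption.

```typescript
type Vars = Record<string, string>;

// Replace {{ name }} with a test variable and {{ env.NAME }} with an environment variable.
// Assumption: placeholders with no matching value are left as written.
function interpolate(template: string, variables: Vars, env: Vars): string {
  return template.replace(
    /\{\{\s*([\w.]+)\s*\}\}/g,
    (match: string, key: string): string => {
      const isEnv = key.indexOf("env.") === 0;
      const source = isEnv ? env : variables;
      const lookup = isEnv ? key.slice(4) : key;
      const value = source[lookup];
      return value === undefined ? match : value;
    }
  );
}

// Mask a value for log output when the step is marked secure: true.
function formatForLog(value: string, secure: boolean): string {
  return secure ? "****" : value;
}

const filled = interpolate(
  "{{ username }} @ {{ env.BASE_URL }}",
  { username: "testuser" },
  { BASE_URL: "https://staging.myapp.com" }
);
// filled is "testuser @ https://staging.myapp.com"

const logged = formatForLog("secret123", true);
// logged is "****"
```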

Commands

| Command | Description |
|---------|-------------|
| gadget run <paths...> | Run E2E tests |
| gadget audit <paths...> | Run tests + AI production readiness assessment |
| gadget check | Auto-generate and run tests from git diff |
| gadget validate <paths...> | Validate test files without running |
| gadget init | Scaffold .gadgetrc.yaml and example test |
| gadget providers | List available AI providers and their status |

AI Audit & Quality Score

The audit command captures a screenshot after every step and sends them to Claude for review. The AI evaluates layout, readability, visual bugs, broken flows, and UX friction — exactly what a human tester would look at.

Findings are categorized by severity:

| Severity | Impact on score |
|----------|-----------------|
| Critical | -20 |
| Warning | -10 |
| Nitpick | -3 |
| Improvement | -1 |

A score of 80+ generally means production-ready. Use --min-score as a CI gate:

gadget audit tests/ --base-url https://staging.myapp.com --min-score 80
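As a worked example of the severity weights and the --min-score gate, the sketch below assumes the score starts at 100, subtracts each finding's penalty, and is clamped at 0. The README gives only the per-severity impact and the 0-100 range, so the starting value, the clamping, and the "greater than or equal" gate comparison are assumptions.

```typescript
type Severity = "critical" | "warning" | "nitpick" | "improvement";

// Penalties taken from the severity table above.
const PENALTY: Record<Severity, number> = {
  critical: 20,
  warning: 10,
  nitpick: 3,
  improvement: 1,
};

// Assumption: start from 100 and never drop below 0 (the documented range is 0-100).
function qualityScore(findings: Severity[]): number {
  const total = findings.reduce(
    (sum: number, severity: Severity) => sum + PENALTY[severity],
    0
  );
  return Math.max(0, 100 - total);
}

// Assumption: the gate passes when the score is at least the threshold.
function passesGate(score: number, minScore: number): boolean {
  return score >= minScore;
}

const exampleScore = qualityScore([
  "critical",
  "warning",
  "nitpick",
  "nitpick",
  "improvement",
]);
// 100 - (20 + 10 + 3 + 3 + 1) = 63

const exampleGate = passesGate(exampleScore, 80);
// false: under these assumptions, 63 would fail a --min-score 80 gate
```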

You can also sync findings to Linear during the audit:

gadget audit tests/ --base-url https://staging.myapp.com --linear --linear-team <team-id>

The unused-$projectId Linear lookup bug affected versions 0.2.0 and 0.2.1. Use 0.2.3 or later for Linear sync.

Reporters

Use --reporter to choose output formats (combine with commas):

- console: colored terminal output (default)
- html: self-contained HTML report with embedded screenshots
- junit: standard JUnit XML for CI/CD
- json: structured JSON for automation
- github: GitHub Actions annotations + step summary

gadget run tests/ --reporter console,html,junit
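The comma-separated form can be thought of as a list of names checked against the five reporters listed above. The sketch below is illustrative only: `parseReporters` is a hypothetical helper, and rejecting unknown names with an error is an assumption about behavior the README does not describe.

```typescript
// Reporter names listed in this README.
const KNOWN_REPORTERS = ["console", "html", "junit", "json", "github"];

// Split a comma-separated --reporter value into individual reporter names.
// Falls back to the documented default (console) when the value is empty.
function parseReporters(value: string): string[] {
  const names = value
    .split(",")
    .map((entry) => entry.trim())
    .filter((entry) => entry.length > 0);

  // Assumption: names outside the documented list are treated as errors.
  const unknown = names.filter((entry) => KNOWN_REPORTERS.indexOf(entry) === -1);
  if (unknown.length > 0) {
    throw new Error(`Unknown reporter(s): ${unknown.join(", ")}`);
  }

  return names.length > 0 ? names : ["console"];
}

const selectedReporters = parseReporters("console,html,junit");
// selectedReporters is ["console", "html", "junit"]
```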

CI Integration

Gadget is designed to run in CI pipelines. Example workflow files are available in the examples/ directory for both GitHub Actions and GitLab CI.

See the full documentation for detailed CI setup guides.

Claude Code Skills

Gadget ships with a set of Claude Code skills in the skills/ directory. These let you run Gadget commands conversationally inside Claude Code using slash commands:

| Skill | Slash Command | Description |
|-------|---------------|-------------|
| gadget-init | /gadget-init | Scaffold a project with guided setup |
| gadget-run | /gadget-run | Run E2E tests with file discovery and failure analysis |
| gadget-audit | /gadget-audit | AI-powered production readiness assessment |
| gadget-check | /gadget-check | Auto-generate tests from git diff |
| gadget-validate | /gadget-validate | Validate YAML test files with auto-fix |

Each skill wraps the corresponding npx @pyratzlabs/gadget command and adds intelligent parameter discovery, prerequisite checking, result interpretation, and follow-up suggestions.

Using the skills

Install skills into your project:
```
npx skills install @pyratzlabs/gadget
```
Invoke a skill in Claude Code:
```
/gadget-audit
```

Claude will guide you through the rest — checking prerequisites, discovering test files, running the command, and interpreting results.

Configuration

Project settings live in .gadgetrc.yaml (created by gadget init). CLI flags override config values. See the getting started guide for all options.
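The precedence rule (CLI flags win over the config file) can be sketched as a per-option fallback. The option names below are borrowed from the test-file config block earlier in this README purely for illustration; the actual .gadgetrc.yaml schema is covered in the getting started guide, and `resolveOptions` is a hypothetical helper.

```typescript
// Hypothetical option names, used only to illustrate precedence.
interface GadgetOptions {
  baseUrl?: string;
  timeout?: number;
  screenshot?: string;
}

// Prefer the CLI flag when it was provided, otherwise fall back to the file value.
function preferFlag<T>(flag: T | undefined, fromFile: T | undefined): T | undefined {
  return flag !== undefined ? flag : fromFile;
}

// Merge file config and CLI flags option by option, with flags taking priority.
function resolveOptions(fileConfig: GadgetOptions, cliFlags: GadgetOptions): GadgetOptions {
  return {
    baseUrl: preferFlag(cliFlags.baseUrl, fileConfig.baseUrl),
    timeout: preferFlag(cliFlags.timeout, fileConfig.timeout),
    screenshot: preferFlag(cliFlags.screenshot, fileConfig.screenshot),
  };
}

const resolved = resolveOptions(
  { baseUrl: "https://myapp.com", timeout: 10000 },
  { baseUrl: "https://staging.myapp.com" }
);
// resolved.baseUrl is "https://staging.myapp.com" (from the flag)
// resolved.timeout is 10000 (from the file)
```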

Documentation

- Getting Started: full guide with all step types, test suites, variable interpolation, CI examples, and exit codes
- API Reference: complete CLI options, configuration schema, test file format, and TypeScript types