tester-h

v0.5.0

Published

6 days ago

tester-h — run an H QA agent against your web app from the command line.

fredericlegrand

qa testing ai agent browser automation playwright hcompany

████████╗███████╗███████╗████████╗███████╗██████╗       ██╗  ██╗
╚══██╔══╝██╔════╝██╔════╝╚══██╔══╝██╔════╝██╔══██╗      ██║  ██║
   ██║   █████╗  ███████╗   ██║   █████╗  ██████╔╝█████╗███████║
   ██║   ██╔══╝  ╚════██║   ██║   ██╔══╝  ██╔══██╗╚════╝██╔══██║
   ██║   ███████╗███████║   ██║   ███████╗██║  ██║      ██║  ██║
   ╚═╝   ╚══════╝╚══════╝   ╚═╝   ╚══════╝╚═╝  ╚═╝      ╚═╝  ╚═╝
        QA for the web, driven by a vision agent.

tester-h is a QA agent in your terminal. Hand it a URL and a plain-English instruction; it drives a real browser, verifies the rendered product with a vision model, runs deterministic SEO / accessibility / link / hygiene audits on every page it touches, and returns a structured pass / fail report listing every issue and its severity.

Built on H Company's Holo3 vision-language model.

npx tester-h --url https://staging.example.com \
  "Sign up flow works end-to-end; check SEO and a11y while you're there"

  trajectory  6f5e0d…
  · navigate: https://staging.example.com
  · audit:    2 issues — [high] seo.meta_description.missing, [medium] a11y.img_missing_alt
  · click: Sign up · type: [email protected] · click: Submit
  · read_text("h1") → "Welcome, foo"
  ────────── findings ──────────
  [HIGH]   FAIL  seo.meta_description.missing — No <meta name="description">.
  [HIGH]   FAIL  a11y.form_control_no_name — 2 form controls without an accessible name.
  ✓ PASS (5 actions)

Exit 0 on pass, 1 on fail — drops straight into CI.

New here? → QUICKSTART.md gets you from install to a passing run in ~5 minutes.

Why · Install · Authenticate · doctor
Run a test · Record → replay → codegen · Audits · Chaos
Xray / Jira · Editor & agent integrations · Inspector
Configuration · Data flow & privacy · Exit codes · License

Why

QA agents that "look at a screenshot and decide" miss too much. tester-h splits the work into two lanes that play to their strengths:

Vision decides what only a human eye can: is the CTA the brand color, is the layout broken, did the hero image render, is the focus ring visible.
Deterministic audits decide everything objective: meta description present, images missing alt text, 404 links, broken heading hierarchy, mixed content, console errors.

Page text is read straight from the DOM, never OCR'd from pixels. The verdict comes back as structured data — every check that ran, every issue with a severity — so it pipes cleanly into CI, dashboards, or another agent.

Where it runs. Inference runs on H Company infrastructure with your API key; the model endpoint is fixed and not user-configurable. Two execution modes:

| Mode | Where the agent runs | Reaches | Use it for |
| --- | --- | --- | --- |
| Local (default) | Holo3 loop in-process; a browser on your machine (headless in CI) | localhost, intranet, public URLs | dev loop, CI on self-contained apps |
| Cloud (--cloud / --agent-id) | Fully server-side on the H Agent API | public URLs only | offloaded runs, shareable replays |

Install

Requires Node.js 20+.

npm install -g tester-h

First install downloads a Chromium build via Playwright (~150 MB). If that's blocked (proxy / CI), install it later — tester-h doctor prints the exact pinned command:

node "<path to bundled playwright-core>/cli.js" install chromium

(Set TESTER_H_SKIP_POSTINSTALL=1 to skip the auto-download at install time.)

Authenticate

tester-h login

login opens the H sign-in portal in your browser — sign in with SSO, mint an API key (hk-…), and paste it back (hidden input). The key is written to ./.env as HAI_API_KEY=… at mode 0600, and login adds .env (and .tester-h/runs/) to your .gitignore so it can't be committed by accident.

tester-h login --no-browser        # print the URL instead of opening it
tester-h login --api-key hk-…      # non-interactive (CI)
tester-h whoami                    # show the active key (masked) + source
tester-h logout                    # remove the key from ./.env

Key resolution order: --api-key flag → project YAML api_key → HAI_API_KEY env → stored ./.env. A key is required for every agent run, local or cloud, because both modes use it for model inference; only the deterministic audits run without one.
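The precedence above can be pictured with a minimal TypeScript sketch. The `KeySources` shape and the `resolveApiKey` / `maskKey` names are hypothetical, not part of tester-h's API, and the masking rule is only inferred from the `hk-e…22d6` style shown in the doctor output.

```typescript
// Hypothetical shape: each field holds the key found in that source, if any.
interface KeySources {
  flag?: string;        // --api-key
  projectYaml?: string; // api_key in the project YAML
  env?: string;         // HAI_API_KEY from the environment
  dotEnv?: string;      // HAI_API_KEY stored in ./.env
}

type KeySource = keyof KeySources;

// Precedence, highest first, in the order the README lists it.
const ORDER: KeySource[] = ["flag", "projectYaml", "env", "dotEnv"];

// Returns the first non-empty key and where it came from, or undefined.
function resolveApiKey(sources: KeySources): { key: string; source: KeySource } | undefined {
  for (const source of ORDER) {
    const key = (sources[source] ?? "").trim();
    if (key !== "") return { key, source };
  }
  return undefined;
}

// Assumed masking rule: keep the first and last four characters.
function maskKey(key: string): string {
  return key.length <= 8 ? "…" : key.slice(0, 4) + "…" + key.slice(-4);
}
```

Returning the source alongside the key mirrors what `tester-h whoami` reports (the masked key plus its source).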

Region

tester-h talks to two regions — global (default) and eu — following one rule: EU = .eu after the service name (agp.eu.hcompany.ai, portal.eu.hcompany.ai). Keys are region-scoped, so EU is opt-in at login:

tester-h login --region eu     # mints an EU key from the EU portal + pins eu
tester-h region                # show the active region + the hosts it resolves to
tester-h region eu             # switch the stored region (then re-login for an EU key)
tester-h --region eu …         # one-off override for a single command

doctor shows the active region and exactly which hosts it uses. Note: the EU Models API isn't live yet, so with region eu, local-mode inference falls back to the global endpoint (with a warning), while cloud mode (--cloud) runs fully on the EU Agent API. The fallback flips to the EU endpoint automatically once it ships.
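The ".eu after the service name" rule is simple enough to sketch. The `hostFor` helper below is hypothetical and covers only the naming rule; it does not model the temporary Models API fallback, and `tester-h region` remains the authoritative source for the hosts actually in use.

```typescript
type Region = "global" | "eu";

// Builds a host name from a service prefix such as "agp" or "portal".
// Illustrative only: the real resolution logic inside tester-h may differ.
function hostFor(service: string, region: Region): string {
  const suffix = region === "eu" ? ".eu" : "";
  return `${service}${suffix}.hcompany.ai`;
}

// Validates user input before it is used as a Region.
function parseRegion(value: string): Region | undefined {
  const v = value.trim().toLowerCase();
  return v === "global" || v === "eu" ? v : undefined;
}
```

For example, `hostFor("agp", "eu")` yields `agp.eu.hcompany.ai` and `hostFor("agp", "global")` yields `agp.hcompany.ai`, matching the hosts named in this README.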

`doctor` — is my setup ready?

The command to run before anything else (and as a CI preflight):

tester-h doctor

✓ Node.js          v20.x
✓ Chromium         installed
✓ H Models API     reachable (api.hcompany.ai → HTTP 404)
✓ Key auth         hk-e…22d6 authenticates
✓ Jira auth        authenticated as Jane Doe         (only if Xray creds set)
✓ Xray auth        authenticated against Xray Cloud
✓ .env safety      .env is git-ignored

Each check prints an actionable fix on failure. Exit 0 when ready, 1 otherwise. --json emits a structured report.

Run a test

tester-h --url https://staging.example.com \
  "On /pricing, the primary CTA is brand red and clicking it lands on /signup"

The agent navigates, observes, acts, and answers; every page it lands on is audited automatically. Output ends with a verdict line and a sorted findings table.

| Flag | Description |
| --- | --- |
| --url <url> | URL to test (required; or url: in your config) |
| --device <preset> | mobile \| tablet \| desktop (sets viewport, UA, touch) |
| --viewport <WxH> | Viewport size (default 1920x1080) |
| --browser <engine> | chromium \| firefox \| webkit (default chromium) |
| --headed | Show the browser (default headless) |
| --max-steps <n> | Cap on agent turns (default 60 local) |
| --timeout <s> | Wall-clock timeout (default 600) |
| --model <id> | Pick a Holo3 model (from the models H serves you) |
| --record <path> | Save a replayable trajectory (see below) |
| --cloud / --agent-id <id> | Run server-side on the H Agent API instead of locally |
| --local | Force local mode (overrides cloud: true in config) |
| --json | Emit NDJSON events on stdout (machine-readable) |
| --trace-dir <dir> | Write a run trace package (run.json, events.ndjson, screenshots) |
| --debug | Verbose: resolved config + every event |

Local is the default. Cloud is opt-in via --cloud or --agent-id — in cloud mode the target URL must be reachable from the internet (not localhost). --cloud on its own runs on an auto-provisioned default agent (tester-h-default, created on first use); pass --agent-id <id> to use a specific deployed agent instead.
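The mode rules can be summarized in a short sketch. The `ModeInputs` shape and both function names are hypothetical. Letting `--local` win over an explicit `--cloud` is an assumption, since the README only states that it overrides `cloud: true` in config. The loopback check is deliberately narrow and cannot detect intranet hosts.

```typescript
// Hypothetical inputs gathered from flags and config.
interface ModeInputs {
  localFlag?: boolean;   // --local
  cloudFlag?: boolean;   // --cloud
  agentIdFlag?: string;  // --agent-id <id>
  configCloud?: boolean; // cloud: true in the project config
}

type Mode = "local" | "cloud";

function resolveMode(inputs: ModeInputs): Mode {
  // Documented: --local overrides cloud: true in config.
  // Assumed: it also wins if --cloud is passed at the same time.
  if (inputs.localFlag) return "local";
  if (inputs.cloudFlag || inputs.agentIdFlag) return "cloud";
  return inputs.configCloud ? "cloud" : "local";
}

// Flags URLs that a server-side browser cannot reach (loopback only).
function isLoopbackUrl(url: string): boolean {
  return /^https?:\/\/(localhost|127\.0\.0\.1|\[::1\])(:\d+)?(\/|$)/i.test(url);
}

// Returns an error message, or undefined when the combination looks fine.
function checkTarget(mode: Mode, url: string): string | undefined {
  if (mode === "cloud" && isLoopbackUrl(url)) {
    return "cloud mode needs a URL reachable from the internet";
  }
  return undefined;
}
```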

Typo guard: a mistyped command (tester-h docter) is caught with a "did you mean?" suggestion instead of being run as a paid agent instruction.
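The README does not say how the guard matches commands. One common approach, shown here purely as an assumption, is to compare a single-word argument against the known command names by edit distance. The command list is taken from this README; the function names and the distance threshold are illustrative.

```typescript
// Command names that appear in this README.
const COMMANDS: string[] = [
  "doctor", "login", "logout", "whoami", "region", "replay", "replay-all",
  "codegen", "audit", "a11y", "lighthouse", "chaos", "xray", "mcp",
  "serve", "visualize", "init",
];

// Classic Levenshtein distance using two rolling rows.
function editDistance(a: string, b: string): number {
  let prev: number[] = [];
  for (let j = 0; j <= b.length; j++) prev.push(j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr.push(Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost));
    }
    prev = curr;
  }
  return prev[b.length];
}

// Returns the closest command within maxDistance edits, if any.
function suggestCommand(word: string, maxDistance: number = 2): string | undefined {
  let best: string | undefined;
  let bestDistance = maxDistance + 1;
  for (const command of COMMANDS) {
    const d = editDistance(word.toLowerCase(), command);
    if (d < bestDistance) {
      best = command;
      bestDistance = d;
    }
  }
  return best;
}
```

A full-sentence instruction is far from every command name, so it yields no suggestion and would proceed as a normal run.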

Record → replay → codegen

Author by intent once, then keep it running.

# 1. Record — the agent does the task and saves a hybrid trajectory
tester-h --url https://www.saucedemo.com --record checkout.json \
  "Log in as standard_user/secret_sauce, add a backpack, and check out"

# 2. Replay — re-runs the trajectory; self-heals when the UI drifts
tester-h replay checkout.json

# 3. Codegen — export to a standalone Playwright spec
tester-h codegen checkout.json --no-recovery --out tests/checkout.spec.ts

replay runs the recorded primitives deterministically; on a broken selector it falls back to the vision model to re-find the element by meaning (unless --strict), and writes the learned selector back (--no-update to disable).
replay --matrix runs one trajectory across every --browsers × --viewports combination and prints a pass/fail grid.
replay-all replays every trajectory in ./.tester-h/trajectories/ (CI mode); variant files (*-mobile.json) are auto-picked per viewport.
codegen emits idiomatic Playwright. Default keeps a self-healing withRecovery wrapper (import … from 'tester-h/playwright'); --no-recovery emits pure Playwright with zero tester-h dependency — commit it and run it with your own runner.
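The browser-by-viewport expansion behind replay --matrix can be pictured as a plain cross product. This is a sketch of the idea, not tester-h's implementation: `Cell`, `expandMatrix`, and `renderGrid` are hypothetical names, and the grid layout is invented for illustration.

```typescript
interface Cell {
  browser: string;  // e.g. "chromium"
  viewport: string; // WxH, e.g. "1920x1080"
}

// One cell per browser/viewport combination, browsers varying slowest.
function expandMatrix(browsers: string[], viewports: string[]): Cell[] {
  const cells: Cell[] = [];
  for (const browser of browsers) {
    for (const viewport of viewports) cells.push({ browser, viewport });
  }
  return cells;
}

// Renders a text grid: one row per browser, one column per viewport.
function renderGrid(
  browsers: string[],
  viewports: string[],
  passed: (cell: Cell) => boolean,
): string {
  let labelWidth = 0;
  for (const b of browsers) labelWidth = Math.max(labelWidth, b.length);
  labelWidth += 2;
  const lines: string[] = [];
  lines.push(" ".repeat(labelWidth) + viewports.map((v) => v.padEnd(12)).join(""));
  for (const browser of browsers) {
    const cols = viewports.map((viewport) =>
      (passed({ browser, viewport }) ? "PASS" : "FAIL").padEnd(12));
    lines.push(browser.padEnd(labelWidth) + cols.join(""));
  }
  return lines.join("\n");
}
```

Three browsers and two viewports give six replays, so the matrix grows multiplicatively with each added dimension.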

Deterministic audits

No agent, no model, no API key — same engines QA already trusts, folded into one report format.

tester-h audit https://example.com          # SEO + a11y + links + hygiene + console
tester-h a11y https://example.com           # deep WCAG audit (axe-core)
tester-h lighthouse https://example.com     # Core Web Vitals (LCP/TBT/CLS/FCP/TTFB)

| Command | Checks | Notable flags |
| --- | --- | --- |
| audit | SEO meta, accessible names, broken links, heading hierarchy, console errors | --categories seo,a11y,links,html_hygiene,console · --link-check-limit <n> |
| a11y | WCAG via axe-core | --tags wcag2a,wcag2aa,best-practice |
| lighthouse | 4 category scores + Core Web Vitals | --device mobile\|desktop |

audit exits 1 on any critical/high finding (CI-friendly). Add --json to any of them for NDJSON.
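For pipelines that post-process the NDJSON themselves, the gating rule can be reproduced in a few lines. The event schema is not documented in this README, so the `Finding` shape (an `id` plus a lowercase `severity`) and the `low` level are assumptions; check real output from `--json` before relying on them.

```typescript
// Assumed severity levels; only critical, high, and medium appear in this README.
type Severity = "critical" | "high" | "medium" | "low";

// Assumed shape of a finding event.
interface Finding {
  id: string;
  severity: Severity;
}

const SEVERITIES: string[] = ["critical", "high", "medium", "low"];

// Parses NDJSON text, skipping blank and malformed lines.
function parseNdjson(text: string): unknown[] {
  const events: unknown[] = [];
  for (const line of text.split(/\r?\n/)) {
    if (line.trim() === "") continue;
    try {
      events.push(JSON.parse(line));
    } catch {
      // Ignore lines that are not valid JSON.
    }
  }
  return events;
}

// Type guard: keeps only objects that look like findings.
function isFinding(value: unknown): value is Finding {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  const severity = record["severity"];
  return typeof record["id"] === "string"
    && typeof severity === "string"
    && SEVERITIES.indexOf(severity) >= 0;
}

// Mirrors the documented rule: 1 on any critical/high finding, else 0.
function auditExitCode(findings: Finding[]): 0 | 1 {
  return findings.some((f) => f.severity === "critical" || f.severity === "high") ? 1 : 0;
}
```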

Chaos — adversarial QA

Turn the agent loose to break the page within a fixed action budget. It watches for console errors, 4xx/5xx responses, and visual breakage, and saves a replayable trajectory when it finds something.

tester-h chaos https://example.com --budget 50 --hint "stress the checkout flow"

(chaos drives the agent, so it needs an API key — unlike the deterministic audits.)

Xray / Jira integration

Run QA test plans authored in Jira/Xray, push verdicts back, and auto-file a bug on every failure — no test code in your repo.

tester-h xray status                                    # verify Jira + Xray auth
tester-h xray pull --plan TH-1                          # fetch the plan's tests
tester-h xray run-plan TH-1 --url https://staging.example.com

| Subcommand | Does |
| --- | --- |
| xray status | Validates the 5 credentials, Jira auth and Xray auth, and the local manifest |
| xray pull --plan KEY | Fetches the plan's tests → ./.tester-h/xray.manifest.json |
| xray run TEST-KEY --url X | Runs one test, pushes a Test Run, auto-files a bug on FAIL |
| xray run-plan KEY --url X | Runs every test, pushes one Test Execution, auto-files bugs on FAILs |
| xray push <file> | Re-pushes a recorded result/trajectory to Xray |

Credentials (provide via flag, env var, or ./.env):

JIRA_BASE_URL   JIRA_USER   JIRA_TOKEN   XRAY_CLIENT_ID   XRAY_CLIENT_SECRET

Useful flags: --no-bugs (skip bug filing), --refresh (re-pull before run-plan), --concurrency N (run-plan: parallel runs, default 2), and --cloud / --agent-id to run the plan server-side. Set the bug issue type with --issue-type <name> or JIRA_BUG_ISSUE_TYPE; if your project has no "Bug" type, tester-h resolves a valid one automatically (e.g. Task) and tells you.

Exit codes: 0 pass · 1 setup error · 2 a test failed.

Editor & agent integrations

tester-h mcp        # MCP server (stdio) for Claude Code, Cursor, Codex
tester-h serve      # A2A HTTP server for cross-framework agent discovery

Over MCP, your editor can call run_qa, visual_check, replay, replay_all, codegen, dom_audit, axe, and lighthouse directly — just tell it to test your app. (MCP protocol 2024-11-05; A2A 0.3.0, default 127.0.0.1:18794.)

Inspect runs locally

tester-h visualize          # or: tester-h --visualize after any run

Opens a local Inspector (default 127.0.0.1:4321) that loads the current repo and surfaces the latest failed/flaky trace from .tester-h/runs/. Flags: --project <path>, --port <n>, --host <host>, --no-open.

Configuration

Everything lives inside your repo so test assets travel with the code:

<repo>/
├── .env                     # HAI_API_KEY=… (mode 0600, git-ignored)
└── .tester-h/
    ├── tester-h.yaml        # defaults: url, viewport, timeout, …
    ├── trajectories/        # recorded replay assets (commit these)
    │   ├── checkout.json
    │   └── checkout-mobile.json
    └── runs/                # local trace evidence (git-ignored)

tester-h init creates this layout and writes the .gitignore entries.

Environment variables:

| Var | Purpose |
| --- | --- |
| HAI_API_KEY | Your API key (required) |
| HAI_AGENT_ID | Default cloud agent id (does not by itself switch to cloud mode) |
| HAI_REGION | global (default) or eu. Set by login --region / region; --region overrides |
| JIRA_BASE_URL JIRA_USER JIRA_TOKEN XRAY_CLIENT_ID XRAY_CLIENT_SECRET | Xray/Jira integration |
| JIRA_BUG_ISSUE_TYPE | Override the issue type used for auto-filed bugs |
| TESTER_H_PROJECT_ROOT | Override where .tester-h/ / .env live |
| TESTER_H_CONFIG_DIR | Relocate just the .tester-h/ directory |
| TESTER_H_SKIP_POSTINSTALL=1 | Skip the Chromium download at install |
| TESTER_H_NO_UPDATE_CHECK | Disable the background update check |
| NO_COLOR / FORCE_COLOR | Control ANSI color output |

Data flow & privacy

What leaves your machine, by mode:

| Mode | Sent, and where |
| --- | --- |
| Local (default) | Per agent step, a page screenshot + URL + title go to the H Models API (api.hcompany.ai) for the vision model. The browser is local, so it can test localhost/intranet, but those page screenshots are transmitted. |
| Cloud (--cloud/--agent-id) | The task text + URL go to the H Agent API (agp.hcompany.ai); the browser runs server-side. |

The model endpoint is fixed to H Company infrastructure and not user-overridable. Per H Company's Privacy Policy, screenshots are processed transiently and not retained in persistent form. Note: if you pass --trace-dir (or use the Inspector's capture), screenshots of the pages you test are written to .tester-h/runs/ on your own disk — treat that directory as sensitive when testing authenticated apps, and keep it git-ignored.

Recorded artifacts contain typed text verbatim. A trajectory (--record) and codegen output store the values typed into fields. Don't record real passwords into committed trajectories — use a test account.

Exit codes

| Surface | Codes |
| --- | --- |
| run / replay / chaos | 0 pass · 1 fail or setup error |
| audit | 1 on any critical/high finding |
| xray … | 0 pass · 1 setup error · 2 a test failed |
| doctor | 0 ready · 1 a check failed |
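A CI wrapper that shells out to tester-h might translate these codes into log messages. The sketch below is illustrative: `Surface` and `describeExit` are hypothetical names, and treating audit's 0 as "no critical/high finding" is inferred from the table rather than stated in it.

```typescript
type Surface = "run" | "replay" | "chaos" | "audit" | "xray" | "doctor";

const AGENT_CODES: Record<number, string> = { 0: "pass", 1: "fail or setup error" };

// Code meanings per surface, transcribed from the table above.
const EXIT_MEANINGS: Record<Surface, Record<number, string>> = {
  run: AGENT_CODES,
  replay: AGENT_CODES,
  chaos: AGENT_CODES,
  // Assumption: 0 means no critical/high finding was reported.
  audit: { 0: "no critical/high finding", 1: "critical/high finding" },
  xray: { 0: "pass", 1: "setup error", 2: "a test failed" },
  doctor: { 0: "ready", 1: "a check failed" },
};

// Returns a human-readable meaning, or a fallback for undocumented codes.
function describeExit(surface: Surface, code: number): string {
  const meaning = EXIT_MEANINGS[surface][code];
  return meaning !== undefined ? meaning : `undocumented exit code ${code}`;
}
```

Only the xray commands distinguish a setup error (1) from a failed test (2); for run, replay, and chaos both conditions share exit code 1.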

License

Proprietary — see LICENSE.md. Use of tester-h and the H Company services it relies on is governed by H Company's Terms of Use. © H Company.
