@pcamarajr/scout
v0.8.0
Published
Self-healing browser QA: natural-language scenarios verified by an AI agent, replayed deterministically in CI
Downloads
1,518
Maintainers
Readme
🔭 Scout
Describe a test in plain English. Scout verifies it in a real browser, then replays it for free, forever.
Scout is self-healing browser QA. You write a scenario in one sentence; an AI agent drives a real browser (Playwright) to verify it and records a deterministic script. Every run after that is pure replay — no LLM, fast and free — and AI only steps back in when the UI changes and the script breaks.
- ✍️ Author in one sentence — no selectors, no code. Just the flow and what should happen.
- 🤖 Verified in a real browser — an agent judges behavior ("the paywall must not appear"), not just clicks.
- 🌐 Sees console & network — assert "no console errors" or "a POST to
/api/checkoutreturned 2xx" — not just the visible DOM. - ⚡ Replays for free — recorded runs are pure Playwright. ~zero cost, seconds per scenario, CI-ready.
- 🔧 Self-heals — when the UI changes, AI re-verifies and re-records; you review the diff.
- 🧠 Works with Claude, Gemini, or OpenAI — zero-config with Claude Code; one env var for the rest.
How it works
You write: "On the homepage, search for 'shoes' and results appear."
│
▼ scout go (first run)
AI agent runs it in a real browser, judges the outcome (verified / failed / …),
and records a deterministic script (steps + assertions).
│
▼ scout go (every run after)
Pure Playwright replay — no LLM, seconds per scenario, ~zero cost.
│
▼ UI changed and replay broke?
AI re-runs, re-judges, re-records the script (self-healing). You review the diff.| | Pure Playwright | Pure AI | Scout | |---|---|---|---| | Authoring | code + selectors | one sentence | one sentence | | Cost per CI run | ~zero | $$ + slow | ~zero (replay) | | Survives UI changes | breaks | adapts | breaks → AI re-records | | Judges behavior | only what you coded | yes | yes |
Quickstart — try it on any live URL (~2 min)
No codebase required. Point Scout at a site you already have and watch it verify a real flow.
npm install -g @pcamarajr/scout # or: npx @pcamarajr/scout <command>
npx playwright install chromium # the browser engine
scout init --base-url https://your-app.com # or run `scout init` and answer the prompt
scout doctor # check your AI credentials (see below)
scout create "Homepage search" \
-f home \
-c "On the homepage, search for 'shoes'; a list of results appears"
scout go # first run: the AI agent verifies it in a real browser
scout go # again: deterministic replay, no LLM, secondsThat's it — you have a verified test that replays for free.
AI credentials
The first scout go needs an AI provider. Run scout doctor anytime to check.
- Claude Code (zero-config) — the happy path. If you're signed in to Claude Code, Scout reuses that session automatically. Nothing to set.
- Or an API key:
export ANTHROPIC_API_KEY=…(Claude),export GEMINI_API_KEY=…(Gemini), orexport OPENAI_API_KEY=…(OpenAI). Scout picks the provider from themodelinscout.config.json.
Full provider setup → docs/providers/. Deterministic replay never uses an LLM, so CI needs no credentials (scout go --no-heal).
Use it in your codebase
Same flow, committed alongside your app — the suite travels with the branch and runs as a PR gate.
cd your-project
npm install --save-dev @pcamarajr/scout
npx playwright install chromium
scout init # writes scout.config.json, .scout/, and AI agent files (below).scout/specs/*.scout.md(your scenarios) and.scout/scripts/are committed; runs and sessions are gitignored.- Gate a PR on it:
scout report --checkexits non-zero if any scenario isn'tverified. → CI setup
Use it with your AI agent
scout init also scaffolds onboarding for coding agents, so the agent that builds a feature can write and verify its test:
AGENTS.md(repo root) — the canonical guide; the source of truth for how an agent uses Scout..claude/skills/scout/SKILL.md+.cursor/rules/scout.mdc— point your agent at it automatically.- MCP server (
scout mcp) — exposesscout_create_scenario,scout_run,scout_report, … so the agent runs the loop end-to-end.
The loop: you describe a flow → the agent writes the .scout.md → runs scout go → reports the real verdict (never claims success without running it) → iterates with you until it's green. Full guide → docs/ai-agents.md.
Docs
| Topic | |
|---|---|
| Writing scenarios | the .scout.md format, slugs, per-scenario overrides |
| Providers & credentials | Claude, Gemini, OpenAI — detection order + scout doctor |
| Auth profiles | logged-in flows, scout login, $ENV: secrets |
| Environments & CI | base-URL overrides, worktrees, the PR gate |
| Run artifacts | traces, screenshots, the preview video |
| AI agents & MCP | the co-author loop, the scaffolded files, MCP tools |
| CLI reference | every command, scout report --json/--check |
| Architecture | how it works inside + design decisions + limitations |
Verdicts
| | Meaning |
|---|---|
| ✅ verified | All expected behavior confirmed by assertions |
| ❌ failed | Broken behavior — the reason says what |
| ⚠️ partial | Only part of the expected behavior confirmed |
| 🚫 blocked | Couldn't reach the flow (app down, login broken) |
Early but functional, and published on npm. Issues and PRs welcome — see docs/architecture.md for internals. MIT licensed.
