@yasserkhanorg/impact-gate — v2.1.7
Impact Gate
@yasserkhanorg/impact-gate
Diff-aware E2E impact analysis, release-ready test planning, coverage gating, and hallucination-resistant AI generation for Playwright/Cypress teams. Optional AI features can suggest, generate, and heal tests once your project has a route-families.json manifest.
What It Does
impact-gate is built first for one painful CI job: given a git diff, tell us which E2E surface changed, whether the current suite already covers it, and what still needs testing before we merge or ship.
That same workflow works for:
- pull requests against `main`
- release branches against the previous release tag
- hotfixes against the last shipped version
- any "what changed between these refs?" release-readiness check
Product priorities:
- Primary: diff-aware E2E impact analysis and coverage gating
- Secondary: optional AI features can suggest, generate, and heal tests once your project has a route-families.json manifest
- Tertiary: crew workflows, MCP integrations, plugins, and the autonomous QA agent
The clearest path today is a Playwright or Cypress repository with a maintained route-families.json manifest. That is the path this package is optimized to make unusually clear and useful.
AI-specific note:
- Generated tests are grounded against the discovered local API surface and verified before they are trusted. The package is designed to reduce hallucinated page-object methods, fabricated helpers, and low-evidence test generation.
Transition note:
- The package and primary CLI are being renamed to `impact-gate`.
- Legacy CLI aliases (`e2e-ai-agents`, `e2e-qa-agent`, `e2e-agents-mcp`) still work during migration.
- Legacy config filenames are still supported.
- The `.e2e-ai-agents/` artifact directory remains unchanged for compatibility.
Product Shape
| Level | Commands | What They Are For |
|------|----------|-------------------|
| Core CI Workflow | impact, plan, gate | Decide what changed, what is covered, and whether a PR should pass or a release is ready |
| Optional AI Workflow | generate, heal, analyze, finalize-generated-tests | Suggest, create, or repair tests after impact analysis |
| Setup and Calibration | train, bootstrap, traceability-*, feedback, cost-report, llm-health | Build the manifest, feed execution data back in, and inspect cost/provider health |
| Advanced / Experimental | crew, MCP mode, plugins, impact-gate-qa | Deeper orchestration and browser-driven workflows beyond the core CI loop |
Known Limitations
- The clearest, most stable workflow is still Playwright/Cypress impact analysis and gating.
- AI generation and healing work best after the project has a good `route-families.json` manifest.
- Advanced features are improving, but they are not the best entry point if you only want dependable CI coverage decisions.
- The strict profile is the most opinionated path in the codebase. Most teams should start with the core CI workflow above and only opt into stricter heuristics once their mappings are mature.
Free Tier
These commands work with zero LLM cost and do not require an API key:
| Command | What It Does |
|---------|-------------|
| impact | Deterministic impact analysis from a git diff |
| plan | Coverage-gap detection and recommended run set |
| gate | CI coverage gate that exits non-zero below a threshold |
| train --no-enrich | Build route-families.json with the scanner only |
| bootstrap | Generate route-families.json from a knowledge graph |
| traceability-capture | Extract test-file relationships from Playwright JSON |
| traceability-ingest | Merge traceability mappings into rolling manifest |
| feedback | Ingest recommendation outcomes for calibration |
| cost-report | View LLM cost breakdown from past runs |
Optional AI features use Anthropic, OpenAI, or a local Ollama instance.
Start Here
The fastest way to evaluate the package is the deterministic CI path. These commands do not require an API key.
Install the package:
```bash
npm install -D @yasserkhanorg/impact-gate
```

Requires Node.js >= 20. Ships both CommonJS and ESM builds.
Verify the CLI:
```bash
npx impact-gate --help
```

Then run the core CI workflow:
```bash
# 1. See what changed in a PR or branch diff
npx impact-gate impact --path /path/to/project --since origin/main

# 2. Build a coverage plan and CI summary artifacts
npx impact-gate plan --path /path/to/project --since origin/main

# 3. Fail the job if coverage is below a threshold
npx impact-gate gate --path /path/to/project --threshold 80
```

Use the same `plan` command for release readiness:
```bash
# Compare the current branch or release candidate to the last shipped tag
npx impact-gate plan --path /path/to/project --since v2.1.0
```

That gives you a release-focused test plan showing impacted flows, current coverage, and where you still need tests or validation before shipping.
Notes:
- `impact` prints a deterministic summary to stdout.
- `plan` writes `.e2e-ai-agents/plan.json` and `.e2e-ai-agents/ci-summary.md`.
- `plan --since <old-release-tag>` is the simplest way to turn a release diff into a prioritized test plan.
- `gate` expects a threshold in the range 0-100 and exits 1 when the threshold is missed.
- Add the Optional AI Workflow only after your `route-families.json` manifest is useful enough to trust.
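The gate decision itself is simple. The sketch below is illustrative only (not the package's actual implementation), assuming coverage is reported as a percentage:

```typescript
// Illustrative sketch of a coverage gate: exit code 0 when coverage
// meets the threshold, 1 when it does not. Not the package's real code.
function gateExitCode(coveragePct: number, threshold: number): number {
  if (threshold < 0 || threshold > 100) {
    throw new RangeError(`threshold must be in 0-100, got ${threshold}`);
  }
  return coveragePct >= threshold ? 0 : 1;
}
```

Wiring this into CI is then just propagating the exit code, which is what `gate` does for you.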
Dogfood Proof
The current repo includes a full dogfood run at dogfood/2026-03-28/README.md.
- Playwright example: synthetic auth change -> `impact`, `plan`, and `gate` all behaved as expected
- Cypress example: synthetic dashboard change -> parity proof for the same deterministic flow
- Self dogfood: heuristic fallback grouped changes truthfully, but still read too optimistically for a package-style repo
Takeaway:
- the strongest product path is still an app-shaped Playwright/Cypress repo with a maintained manifest
- zero-config / heuristic fallback is useful for orientation, but it should not be treated as equally trustworthy for release decisions
Setup and Calibration
These commands help the core CI workflow become accurate and project-aware.
```bash
# Build the manifest from the repo structure
npx impact-gate train --path /path/to/project --no-enrich

# Or bootstrap it from an Understand-Anything knowledge graph
npx impact-gate bootstrap --path /path/to/project [--kg-path ./knowledge-graph.json]

# Feed execution data back into the manifest
npx impact-gate traceability-capture --path /path/to/project --traceability-report ./playwright-report.json
npx impact-gate traceability-ingest --path /path/to/project --traceability-input ./traceability-input.json

# Calibration and diagnostics
npx impact-gate feedback --path /path/to/project --feedback-input ./feedback.json
npx impact-gate cost-report --path /path/to/project
npx impact-gate llm-health
```

Optional AI Workflow
Once impact analysis is useful and the manifest is in place, you can layer on AI assistance.
```bash
# All-in-one wrapper: impact + coverage + optional generation/healing
npx impact-gate analyze --path /path/to/project [--generate] [--heal]

# Generate tests for uncovered gaps
npx impact-gate generate --path /path/to/project

# Heal flaky or failing specs from a Playwright report
npx impact-gate heal --path /path/to/project --traceability-report ./playwright-report.json

# Stage generated tests, commit, and optionally open a PR
npx impact-gate finalize-generated-tests --path /path/to/project --create-pr
```

`plan` and `suggest` are aliases. `analyze` is the convenience wrapper when you want the full path in one invocation.
How Hallucinations Are Tackled
The AI path is intentionally constrained instead of trusting raw LLM output.
- Deterministic first: impact analysis, coverage planning, and release-diff planning work without an LLM. The AI layer comes after the diff and coverage evidence are already established.
- Local API surface grounding: generation prompts are built from discovered page objects, helpers, method signatures, and inherited methods from your own repository.
- Prompt-level constraints: the generator is explicitly told to use only known methods and to fall back to raw Playwright selectors when a method is not available.
- Prompt sanitization: flow names, evidence, and user-action strings are sanitized before being injected into prompts.
- Hallucination detection gate: generated code is scanned for method calls that do not exist in the discovered API surface. Suspicious specs are blocked by default instead of being written into the main specs directory.
- Needs-review quarantine: blocked specs are written to `generated-needs-review/` so teams can inspect them manually rather than accidentally trusting them in CI.
- Verification after generation: written specs go through compile checks and smoke-run verification. Failing specs are moved out of the trusted path.
This is why the strongest product story is still: deterministic diff -> test plan -> optional AI assistance with guardrails.
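The core idea behind the hallucination detection gate can be sketched in a few lines. This is a deliberate simplification (the naming convention `*Page` and the regex are hypothetical; the package's real detection is richer):

```typescript
// Sketch of a hallucination gate: flag method calls on page objects
// that are not in the discovered API surface. The "*Page" naming
// convention here is an assumption for illustration only.
function findUnknownCalls(spec: string, knownMethods: Set<string>): string[] {
  const callPattern = /\b\w+Page\.(\w+)\s*\(/g;
  const unknown = new Set<string>();
  for (const match of spec.matchAll(callPattern)) {
    const method = match[1];
    if (!knownMethods.has(method)) unknown.add(method);
  }
  return [...unknown];
}
```

A spec with any unknown calls would be routed to quarantine instead of the trusted specs directory.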
Advanced / Experimental
These features are real, but they are not the clearest place to start if your goal is simple CI coverage decisions.
Multi-Agent Crew
The Crew orchestrates deeper multi-agent workflows on top of the same impact-analysis foundation. Use it when you want richer strategy output, structured test design, or end-to-end generation pipelines.
# Quick strategy recommendations
```bash
# Quick strategy recommendations
npx impact-gate crew --workflow quick-check --path /path/to/project --tests-root ./e2e-tests --since origin/master

# Full design-only workflow
npx impact-gate crew --workflow design-only --path /path/to/project --tests-root ./e2e-tests --since origin/master

# End-to-end workflow
npx impact-gate crew --workflow full-qa --path /path/to/project --tests-root ./e2e-tests --since origin/master
```

Built-in safeguards include budget enforcement, provider circuit breaking, and structured output for downstream tooling.
Plugins
External agents can register into crew workflows via the plugins config:
```ts
import type {AgentPlugin, AgentTask, AgentResult, CrewContext} from '@yasserkhanorg/impact-gate';

const myPlugin: AgentPlugin = {
  role: 'my-custom-analyzer',
  phase: 'understand',
  runAfter: ['impact-analyst'],
  async execute(task: AgentTask, ctx: CrewContext): Promise<AgentResult> {
    return {role: 'my-custom-analyzer', status: 'success', output: null, warnings: []};
  },
};

export default myPlugin;
```

```bash
npx impact-gate crew --plugins ./my-plugin.ts --workflow full-qa --path ./app
```

See docs/PLUGIN_API_STABILITY.md for the API contract and stability guarantees.
Programmatic API
```ts
import {
  CrewOrchestrator,
  ImpactAnalystAgent,
  CrossImpactAgent,
  RegressionAdvisorAgent,
  StrategistAgent,
  TestDesignerAgent,
} from '@yasserkhanorg/impact-gate';

const orchestrator = new CrewOrchestrator();
orchestrator.registerAgent(new ImpactAnalystAgent());
orchestrator.registerAgent(new CrossImpactAgent());
orchestrator.registerAgent(new RegressionAdvisorAgent());
orchestrator.registerAgent(new StrategistAgent());
orchestrator.registerAgent(new TestDesignerAgent());

const result = await orchestrator.run({
  appPath: './webapp',
  testsRoot: './e2e-tests',
  gitSince: 'origin/master',
  workflow: 'design-only',
});

console.log(result.context.strategyEntries);
console.log(result.context.testDesigns);
console.log(result.context.crossImpacts);
```

Route-Families Training
What it produces
The train command builds a knowledge map of your codebase — a single JSON file (route-families.json) that maps source files to features, test directories, and user flows. This is not ML training; no model is trained. It's building a structured manifest like:
```json
{
  "id": "channels",
  "routes": ["/{team}/channels/{channel}"],
  "priority": "P0",
  "webappPaths": ["src/components/channel_header/**"],
  "serverPaths": ["server/channels/api4/channel*.go", "server/channels/app/channel*.go"],
  "specDirs": ["specs/functional/channels/"],
  "userFlows": ["Create channel", "Archive channel", "Search in channel"],
  "components": ["ChannelHeader", "ChannelSidebar"]
}
```

Why the tool needs this
When a PR changes server/channels/app/channel.go, the tool needs to answer: "which E2E tests should I run?" Without the manifest, it has no idea. With it:
```text
channel.go changed
→ belongs to "channels" family
→ specs are in specs/functional/channels/
→ run those tests
→ flag if coverage is missing for the affected user flows
```

Every downstream command (`impact`, `plan`, `generate`, `heal`, `impact-gate-qa`) reads this manifest to understand the codebase.
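That lookup is essentially a pattern match of changed files against the manifest's path globs. The sketch below uses a deliberately simplified matcher (only `**` and `*` semantics) and hypothetical field access, not the package's internals:

```typescript
interface RouteFamily {
  id: string;
  webappPaths?: string[];
  serverPaths?: string[];
  specDirs: string[];
}

// Simplified glob match: "**" matches any suffix, "*" matches within
// one path segment. Real glob semantics are richer than this.
function matchesPattern(file: string, pattern: string): boolean {
  const regex = new RegExp(
    "^" +
      pattern
        .replace(/[.+^${}()|[\]\\]/g, "\\$&") // escape regex metachars
        .replace(/\*\*/g, "\u0000")           // placeholder for **
        .replace(/\*/g, "[^/]*")
        .replace(/\u0000/g, ".*") +
      "$"
  );
  return regex.test(file);
}

function familiesForFile(file: string, families: RouteFamily[]): string[] {
  return families
    .filter((f) =>
      [...(f.webappPaths ?? []), ...(f.serverPaths ?? [])].some((p) =>
        matchesPattern(file, p)
      )
    )
    .map((f) => f.id);
}
```

With the example manifest entry above, `server/channels/app/channel.go` resolves to the `channels` family, whose `specDirs` then tell the planner which tests to run.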
How scanning works
The scanner uses 4 strategies to build the file → family mapping:
- Directory matching — `src/channels/` + `tests/channels/` share a name → channels family
- Test-derived — `specs/functional/channels/drafts/` exists with spec files → drafts family (even if source code is scattered across components/actions/reducers)
- Server-derived — `api4/channel.go` + `app/channel.go` + `store/channel_store.go` span 3 backend tiers → channel family (related files like `channel_bookmark.go` are grouped under the parent)
- Name-matched — `src/utils/channels.ts` or `server/public/model/channel.go` basename matches → add to the channels family's paths
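The first strategy is easy to sketch: a directory name that appears under both the source tree and the tests tree suggests a family. This is a hypothetical simplification of the scanner, shown for intuition only:

```typescript
// Sketch of directory matching: intersect the basenames of source
// directories and test directories to propose family names.
function directoryMatchedFamilies(srcDirs: string[], testDirs: string[]): string[] {
  const testNames = new Set(
    testDirs.map((d) => d.split("/").filter(Boolean).pop())
  );
  const families = new Set<string>();
  for (const dir of srcDirs) {
    const name = dir.split("/").filter(Boolean).pop();
    if (name && testNames.has(name)) families.add(name);
  }
  return [...families].sort();
}
```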
What LLM enrichment adds
The scanner finds files. The LLM reads code samples and adds semantic metadata the scanner can't determine:
- Accurate URL routes (`/{team}/channels/{channel}` instead of a guessed `/channels`)
- Priority classification (P0 critical user flow vs P2 nice-to-have)
- Human-readable user flows ("Create channel", "Search messages")
- React component and page object names
This metadata makes impact analysis smarter — it can prioritize P0 flows and suggest specific test scenarios.
What validation does
The --validate flag measures manifest accuracy against real git history. It's not training data — it's a quality check:
```text
835 commits → 5105 changed files → 3223 bound to a family = 63% coverage
```

This tells you the manifest is complete enough. If coverage were 30%, impact analysis would be blind to most code changes.
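The coverage figure is just the share of changed files the manifest can bind to a family, rounded to a whole percent:

```typescript
// Validation coverage: bound files over changed files, as a rounded
// percentage. 3223 of 5105 files rounds to 63%.
function manifestCoveragePct(boundFiles: number, changedFiles: number): number {
  if (changedFiles === 0) return 0;
  return Math.round((boundFiles / changedFiles) * 100);
}
```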
Usage
```bash
# Scan your codebase + LLM enrichment (default)
npx impact-gate train --path /path/to/project

# Offline mode (no LLM, no API key needed)
npx impact-gate train --path /path/to/project --no-enrich

# Validate accuracy against recent git history
npx impact-gate train --path /path/to/project --validate --since HEAD~50

# Full pipeline: scan + enrich + validate
npx impact-gate train --path /path/to/project --validate --since HEAD~20
```

Why LLM enrichment is on by default: The manifest gives AI context for impact analysis, scenario suggestion, and bug detection. AI-generated context produces better AI reasoning downstream. Use `--no-enrich` for offline/free operation or to avoid sending code snippets to third-party LLM APIs.
Training loop: Run train → review route-families.json → run train --validate to check coverage % → fix gaps → repeat.
Additional flags:
- `--verbose` / `-v` — DEBUG-level output with timing for each phase
- `--json` — structured JSON log output (for CI pipelines)
- `--server-path` — explicit path to backend server root
- `--budget-usd` — max LLM spend (default: $0.50, max: $10)
Output:
- `<testsRoot>/.e2e-ai-agents/route-families.json` — the manifest
- `<testsRoot>/.e2e-ai-agents/train-report.json` — timing data, family counts, coverage stats, LLM metrics
Configuration
Create `impact-gate.config.json` in your project (auto-discovered):
```json
{
  "path": ".",
  "profile": "strict",
  "testsRoot": ".",
  "mode": "impact",
  "framework": "auto",
  "git": { "since": "origin/master" },
  "impact": {
    "dependencyGraph": { "enabled": true, "maxDepth": 3 },
    "traceability": { "enabled": true },
    "aiFlow": { "enabled": true, "provider": "anthropic" }
  },
  "pipeline": {
    "enabled": false,
    "scenarios": 3,
    "outputDir": "specs/functional/ai-assisted",
    "mcp": false
  },
  "policy": {
    "enforcementMode": "block",
    "blockOnActions": ["must-add-tests"]
  }
}
```

Analysis Profiles
Profiles are not the same thing as frameworks. They control analysis strictness and project-specific conventions.
| Profile | Description |
|---------|-------------|
| default | Standard analysis behavior for most repositories |
| strict | Stricter handling of heuristic-only mappings and more opinionated analysis defaults |
Framework detection is separate. The CLI can auto-detect Playwright, Cypress, pytest, supertest, and Selenium usage from the project structure and dependencies.
Key options
- `testsRoot` — path to tests when they live outside the app root
- `profile` — `default` or `strict`
- `impact.dependencyGraph` — static reverse dependency graph for transitive impact
- `impact.traceability` — file-to-test mapping from CI execution data
- `impact.aiFlow` — LLM-powered flow mapping through the configured provider
- `pipeline.mcp` — use Playwright MCP server for browser-aware generation/healing
- `policy.enforcementMode` — `advisory`, `warn`, or `block`
CI Integration
GitHub Actions
```yaml
- name: Run E2E coverage check
  run: |
    npx impact-gate plan \
      --config ./impact-gate.config.json \
      --since origin/${{ github.base_ref }} \
      --fail-on-must-add-tests \
      --github-output "$GITHUB_OUTPUT"
```

The plan command writes:
- `.e2e-ai-agents/plan.json` — structured plan with `runSet`, `confidence`, `decision`
- `.e2e-ai-agents/ci-summary.md` — markdown summary for PR comments
- `.e2e-ai-agents/metrics-summary.json` — run metrics
Use --fail-on-must-add-tests to exit non-zero when uncovered P0/P1 gaps exist. Use --github-output to expose outputs to subsequent workflow steps.
If you want AI enrichment on top of the deterministic plan, add your provider environment variables to the workflow separately.
See examples/github-actions/pr-impact.yml for a complete workflow template.
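A downstream workflow step can consume the plan artifact directly. The sketch below assumes field names (`decision`, `runSet`) as described in this README; the real `plan.json` schema may differ, so treat it as illustrative:

```typescript
// Sketch of a CI helper that decides pass/fail from a parsed plan.json.
// Field names follow this README's description, not a verified schema.
interface PlanFile {
  decision: string;
  runSet: string[];
}

function shouldFailBuild(plan: PlanFile, failOnMustAddTests: boolean): boolean {
  return failOnMustAddTests && plan.decision === "must-add-tests";
}
```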
Pipeline Modes
Package Native (default)
Strategy-based test templates with quality guardrails and iterative heal attempts. The strongest path today is still a repo whose impact analysis and manifest quality are already in good shape.
MCP Mode (--pipeline-mcp)
Uses the official Playwright Test Agent loop (planner/generator/healer) with Claude CLI orchestration. Validates generated specs against discovered local API surface to block hallucinated methods.
- `--pipeline-mcp-only` — fail if MCP setup fails (no silent fallback)
- `--pipeline-mcp-allow-fallback` — fall back to package-native if MCP unavailable
- `--pipeline-mcp-timeout-ms` — per-command timeout
- `--pipeline-mcp-retries` — retry count for transient failures
Agentic Generation (generate command)
LLM-powered generate-run-fix loop: generates a spec, runs it, analyzes failures, and iterates up to --max-attempts times.
LLM Providers
Used internally for AI enrichment, test generation, and healing.
```bash
# Anthropic
export ANTHROPIC_API_KEY=sk-ant-...

# OpenAI
export OPENAI_API_KEY=sk-...

# Ollama (free, local)
export OLLAMA_BASE_URL=http://localhost:11434
export OLLAMA_MODEL=deepseek-r1:7b
```

Programmatic provider usage:
```ts
import { AnthropicProvider } from '@yasserkhanorg/impact-gate';

const claude = new AnthropicProvider({
  apiKey: process.env.ANTHROPIC_API_KEY
});

const response = await claude.generateText('Analyze test failure');
```

Factory pattern with auto-detection, hybrid mode (free local + premium fallback), and custom OpenAI-compatible endpoints are also supported. See the provider API exports for full details.
Advanced / Experimental: MCP Server
Exposes 6 tools for test agents (Playwright v1.56+):
```ts
import { E2EAgentsMCPServer } from '@yasserkhanorg/impact-gate/mcp';

const server = new E2EAgentsMCPServer();
// Tools: discover_tests, read_file, write_file, run_tests, get_git_changes, get_repository_context
```

Security: write_file is restricted to test spec files (`*.spec.ts`, `*.test.ts`) and the `.e2e-ai-agents/` directory. Path traversal and symlink escape are blocked. Rate limited to 100 requests/minute.
Traceability
Build file-to-test mappings from CI execution data:
- Capture — extract test-file relationships from Playwright JSON reports
- Ingest — merge into a rolling manifest (`.e2e-ai-agents/traceability.json`)
- Query — impact analysis uses the manifest to map changed files to relevant tests
Tuning flags: --traceability-min-hits, --traceability-max-files-per-test, --traceability-max-age-days.
Schemas: schemas/traceability-input.schema.json
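One plausible reading of the tuning flags is a pruning pass over the rolling manifest: drop low-evidence or stale mappings and cap files per test. The field names below are hypothetical; see the input schema for the real shape:

```typescript
// Hypothetical mapping record; field names are assumptions, not the
// package's actual traceability schema.
interface TraceMapping {
  file: string;
  test: string;
  hits: number;
  lastSeenDaysAgo: number;
}

// Sketch of how --traceability-min-hits, --traceability-max-files-per-test,
// and --traceability-max-age-days could prune a manifest.
function pruneMappings(
  mappings: TraceMapping[],
  minHits: number,
  maxFilesPerTest: number,
  maxAgeDays: number
): TraceMapping[] {
  const fresh = mappings.filter(
    (m) => m.hits >= minHits && m.lastSeenDaysAgo <= maxAgeDays
  );
  const perTest = new Map<string, number>();
  return fresh.filter((m) => {
    const count = perTest.get(m.test) ?? 0;
    if (count >= maxFilesPerTest) return false;
    perTest.set(m.test, count + 1);
    return true;
  });
}
```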
Artifacts
| File | Written by | Purpose |
|------|-----------|---------|
| route-families.json | train | Route family manifest |
| train-report.json | train | Training timings, coverage, LLM metrics |
| plan.json | plan | Coverage plan with gaps, decisions, metrics |
| ci-summary.md | plan | Markdown for PR comments |
| metrics.jsonl | plan | Append-only run metrics |
| metrics-summary.json | plan | Aggregated metrics |
| traceability.json | traceability-ingest | File-to-test manifest |
| traceability-state.json | traceability-ingest | Rolling counts |
| feedback.json | feedback | Recommendation outcomes |
| calibration.json | feedback | Precision/recall calibration |
| flaky-tests.json | feedback | Flaky test scores |
| agentic-summary.json | generate | Agentic generation results |
All written under <testsRoot>/.e2e-ai-agents/.
Advanced / Experimental: Autonomous QA Agent (impact-gate-qa)
An autonomous QA engineer that can take a diff or a feature prompt, open a real browser, navigate changed features, hunt edge cases, generate follow-up specs, heal failures, and produce a findings report. Built on top of agent-browser and the Anthropic tool-use API.
If you want the full product story and the natural-language front door, start with the Autonomous Browser QA guide and the QA Skill Guide for Codex and Claude examples using /qa.
Quick Start
```bash
# PR mode — test features changed since origin/main
npx impact-gate-qa pr --since origin/main --base-url http://localhost:3000

# Hunt mode — deep-test a specific area
npx impact-gate-qa hunt "settings panel" --base-url http://localhost:3000

# Release mode — systematic exploration of all critical flows
npx impact-gate-qa release --base-url http://localhost:3000 --time 30

# Fix mode — verify healed specs
npx impact-gate-qa fix --base-url http://localhost:3000
```

Architecture
- Phase 1 (Script) — Runs `impact-gate impact`/`plan` to determine scope, then executes matched Playwright specs.
- Phase 2 (Explore) — LLM-driven browser loop: observe (accessibility snapshot) → think → act (click/fill/navigate) → record findings. Includes stuck detection, multi-user testing, console error capture, and vision-based analysis.
- Phase 3 (Report) — Generates a structured report with findings, per-flow sign-off, and a release-readiness verdict (go/no-go/conditional).
Options
| Flag | Default | Description |
|------|---------|-------------|
| --base-url | Required | Application URL |
| --time | 15 | Time limit in minutes |
| --budget | 2.00 | Max LLM spend in USD |
| --phase | all | Run only 1, 2, or 3 |
| --headed | off | Keep browser visible |
| --since | — | Git ref for diff-based scoping |
| --tests-root | — | Path to Playwright tests directory |
Requires agent-browser CLI (npm install -g agent-browser) and ANTHROPIC_API_KEY.
Production Usage
The strongest production story today is a repo that maintains a good route-families.json manifest, feeds traceability data back into the plan, and uses the deterministic impact -> plan -> gate loop in CI. That evidence-first workflow is the path to trust before layering in optional generation, healing, or autonomous QA.
License
Apache 2.0
