test-capabilities

v0.3.0

Fail-closed testing capability framework with CLI, browser, property, healing, and diagnostic root-cause surfaces


summary: "Project overview and operator guide for test-capabilities."
read_when:
  - "You are onboarding to test-capabilities"
  - "You need commands, structure, or current repo scope"
type: "reference"

test-capabilities

Fail-closed testing capability framework for CLI, browser, property, healing, and diagnostic root-cause workflows.

Previously tracked in this workspace as testers.

Vision

We don't build tests. We build the immune system of software.

See docs/project/vision.md for the durable north-star vision and docs/project/product-posture.md for the current product maturity snapshot.

Components

| Path | Description |
|------|-------------|
| src/ | TEST-CAPABILITIES testing framework (operation kernel, orchestrator, self-healing, quantum simulator, prediction engine) |
| bin/test-capabilities | TEST-CAPABILITIES CLI |
| external/bombadil | Parked repo-local Bombadil-compatible binary fallback, intentionally excluded from packed npm artifacts |
| examples/demo/ | Built-in zero-external-dependency demo fixture for first functional proof |
| prompts/ | LLM testing prompts (cli-tester, web-tester, api-tester) |
| docs/ | Testing guides and frameworks |

Documentation

| Doc | Description |
|-----|-------------|
| docs/project/vision.md | Durable product vision and strategic direction |
| docs/project/product-posture.md | Current product maturity, supported/unsupported boundary, and major gaps |
| docs/TEST-CAPABILITIES-FRAMEWORK.md | TEST-CAPABILITIES autonomous testing framework |
| docs/LLM-TESTING-GUIDE.md | Guide for LLM-driven testing |
| docs/DECISION-MATRIX.md | Tool selection decision matrix |
| docs/adoption/ | Greenfield, brownfield, minimal first-run, and Bombadil 0.5 adoption guides |
| docs/dev/ts-quality-screening.md | Repo-local deterministic screening proof path via ts-quality |
| docs/npm-publishing-checklist.md | Public npm release readiness checklist |
| docs/releases/release-workflow.md | GitHub Release → npm Trusted Publishing workflow |
| docs/api/ | TEST-CAPABILITIES API reference docs |

Polished core use case

The first public release is centered on one zero-external-dependency flow: CLI smoke + observation diagnostics.

node ./bin/test-capabilities doctor --json
node ./bin/test-capabilities demo --json

This proves the package can load, run a real CLI command through cli-tester, and emit observation.v1 diagnostic evidence without Surf Go, Bombadil, network access, or a target application. For the full doctor -> init -> demo -> test --json path, use Minimal CLI smoke walkthrough. For adoption strategy, use Greenfield bootstrap or Brownfield integration. See examples/demo/README.md for the packaged demo fixture.

Capability Contract

The runtime is now fail-closed. If a config section, agent, command, or flag is not wired to a real implementation path, the CLI errors instead of pretending success.
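As a rough illustration of what fail-closed dispatch means, here is a minimal sketch that assumes a hypothetical in-memory registry. The names `registry`, `dispatch`, `OperationResult`, and `UnsupportedRouteError` are invented for this example and are not taken from the package source.

```typescript
// Hypothetical sketch: not the package's actual operation kernel.
type OperationResult = { ok: boolean; route: string; detail: string };
type Executor = (args: string[]) => OperationResult;

class UnsupportedRouteError extends Error {
  constructor(route: string) {
    super(`Unsupported route "${route}": no implementation is wired`);
    this.name = "UnsupportedRouteError";
  }
}

// Only routes with a real executor are registered.
const registry = new Map<string, Executor>([
  [
    "doctor",
    (args) => ({ ok: true, route: "doctor", detail: `checked with ${args.length} arg(s)` }),
  ],
]);

// Fail closed: an unknown route throws instead of returning a fake pass.
function dispatch(route: string, args: string[] = []): OperationResult {
  const executor = registry.get(route);
  if (!executor) {
    throw new UnsupportedRouteError(route);
  }
  return executor(args);
}
```

In this sketch, an unregistered route such as `predict` raises an error rather than producing a success-shaped result, so a caller cannot mistake a missing implementation for a pass.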

First public happy path:

npx test-capabilities doctor
# or from a checkout
node ./bin/test-capabilities doctor --json

doctor has no Surf/Bombadil requirement: missing optional external runtimes are reported as warnings while package/runtime readiness checks remain required.

The shipped CLI verbs now run through a typed operation kernel exposed at src/core/operations.ts and implemented in trust-sized modules under src/core/operations/. That registry owns the supported routes, their input schemas, their executors, and their structured result shapes so the CLI wrapper stays thin.

For Surf-backed web exploration, Surf Go is the standard runtime. The supported orchestrator resolves it through TEST_CAPABILITIES_SURF_GO_BIN, a source checkout referenced by TEST_CAPABILITIES_SURF_GO_REPO, or surf-go on PATH. A Surf Go source checkout can run via go -C <repo>/go run ./cmd/surf-go; build surf-go first for faster runs. Explicit Surf Go repo env vars fail closed when invalid instead of silently switching to a different runtime.

surf explore now runs explicit browser-state/DOM/link probes, supports bounded same-origin --depth exploration from 1-3, and reports graded user-flow coverage from verified probe counts; empty output, help text, warning-only output, and target URLs without a matching browser-state probe fail closed as unverified coverage.

For Bombadil-backed web exploration, the supported orchestrator resolves the binary through TEST_CAPABILITIES_BOMBADIL_BIN, a built source checkout referenced by TEST_CAPABILITIES_BOMBADIL_REPO, repo-local external/bombadil, or bombadil on PATH. A Bombadil-compatible source checkout only overrides the parked repo-local fallback once it has a built target/release/bombadil or target/debug/bombadil; upstream Bombadil 0.5 centralizes builds and no longer requires esbuild, though local source builds may still need project-specific prerequisites such as trunk or the project Nix shell.

Bombadil 0.5 request headers, trace output paths, trace reproduction, viewport/instrumentation/permission knobs, and test-external debugger settings are exposed through agents.<name>.bombadil config. Bombadil's disabled-control skipping, quiescence timers, and dialog auto-accept behavior come from the resolved Bombadil binary itself.

A bounded experimental terminal-fuzzer agent wraps bombadil terminal test -- <command> [args...] and emits normalized observation.v1 runtime evidence whose subject is the resolved terminal command, without claiming production stability or autonomy.

Packed npm consumers should treat Bombadil as an external tool requirement: the package intentionally excludes external/bombadil, and npm run consumer:smoke verifies that a packed consumer without TEST_CAPABILITIES_BOMBADIL_BIN, TEST_CAPABILITIES_BOMBADIL_REPO, or bombadil on PATH receives a clear failing Bombadil finding instead of a fake pass. See docs/project/bombadil-distribution-posture.md and docs/adoption/bombadil-0.5-how-to.md. The same packed-consumer smoke also proves calibrated root_cause and low-calibration non-authoritative propagation observations survive through the distributed library API.
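The Surf Go lookup order described above can be sketched as a small pure function. This is a sketch under stated assumptions, not the package's resolver: `resolveSurfGo`, `Resolution`, and `Probes` are hypothetical names, the filesystem and PATH checks are injected so the example stays self-contained, and treating a missing explicit binary path as an error is an assumption (the README only states fail-closed behavior for the repo variable).

```typescript
// Hypothetical sketch of the documented Surf Go lookup order.
type Env = Record<string, string | undefined>;

type Resolution =
  | { kind: "binary"; path: string }
  | { kind: "source-checkout"; repo: string }
  | { kind: "path"; command: string }
  | { kind: "error"; reason: string };

interface Probes {
  exists(path: string): boolean; // injected filesystem check
  onPath(command: string): boolean; // injected PATH lookup
}

function resolveSurfGo(env: Env, probes: Probes): Resolution {
  const bin = env.TEST_CAPABILITIES_SURF_GO_BIN;
  if (bin) {
    // Assumption: an explicit binary path that does not exist is an error.
    return probes.exists(bin)
      ? { kind: "binary", path: bin }
      : { kind: "error", reason: `TEST_CAPABILITIES_SURF_GO_BIN not found: ${bin}` };
  }

  const repo = env.TEST_CAPABILITIES_SURF_GO_REPO;
  if (repo) {
    // Fail closed: an invalid explicit repo never falls through to PATH.
    return probes.exists(repo)
      ? { kind: "source-checkout", repo }
      : { kind: "error", reason: `TEST_CAPABILITIES_SURF_GO_REPO is invalid: ${repo}` };
  }

  if (probes.onPath("surf-go")) {
    return { kind: "path", command: "surf-go" };
  }
  return { kind: "error", reason: "Surf Go runtime not found" };
}
```

The point of the early returns is that an explicit but invalid setting ends resolution with an error, even when a working surf-go is available on PATH.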

Implemented today

| Surface | Status | Notes |
|---------|--------|-------|
| doctor command | Implemented | Zero-external-dependency package and environment diagnostics; optional Surf Go/Bombadil-compatible runtimes warn when absent instead of failing |
| init command | Implemented | Generates a minimal valid test-capabilities.yaml for the zero-external-dependency cli-tester path and refuses overwrites without --force |
| demo command | Implemented | Built-in zero-external-dependency functional demo for the polished cli-smoke-observation use case |
| test command | Implemented | Supports --config, --target, --quick; URL targets apply when quantum.enabled: true or a supported bombadil/surf agent is enabled, and they only replace targets.cli when no cli-tester smoke is enabled |
| bombadil orchestrator agent | Implemented | Runs a bounded Bombadil exploration budget against targets.web; resolves the binary through explicit env, a built source checkout, repo-local parked fallback, or PATH |
| terminal-fuzzer orchestrator agent | Implemented | Experimental bounded wrapper for bombadil terminal test -- <command> [args...]; emits observation.v1 runtime evidence for the resolved CLI/terminal command and fails closed on missing Bombadil or target command |
| surf orchestrator agent | Implemented | Runs the supported surf explore operation against targets.web; resolves Surf Go from explicit env, a source checkout, or surf-go on PATH, then reports graded user-flow coverage from verified browser-state/DOM probes |
| cli-tester orchestrator agent | Implemented | Executes <targets.cli> --help as a capability-backed smoke |
| quantum command | Implemented | Uses the shared simulator path |
| surf explore | Implemented | Runs Surf Go navigation plus explicit browser-state/DOM probes, optionally follows same-origin links with --depth 1..3, and fails closed unless the seed page verifies browser-state evidence |
| heal command | Implemented | Heuristic selector repair workflow; --findings-input accepts orchestrator findings JSON so proposals cite diagnostic evidence as triggeringFindingId; apply mode can consume a reviewed --proposal-input artifact and still requires an external --checkpoint-ref |
| normalized observations | Implemented | Supported orchestrator agents emit observation.v1 diagnostic events for Surf coverage, Bombadil property exploration, and CLI smoke execution; when correlation is enabled, runs can also include component-level semantic synthesis, suite-level observation correlation, deterministic root_cause observations for at least two same-component independent failed-or-errored observed evidence units that agree on the same failure class, and low-calibration non-authoritative propagation observations across configured dependency edges |
| finding correlation | Implemented | Cross-finding synthesis inside the orchestrator; observation-native synthesis and calibrated root-cause observations summarize multi-sensor meaning without becoming pass/fail authority or prediction |

Explicitly unsupported for now

These surfaces fail clearly when enabled or invoked:

  • orchestrator agents: api-fuzzer
  • orchestrator intelligence flags: self_healing, prediction, collective
  • chaos execution
  • CLI commands: predict, visualize, report
  • test flags: --autonomous, --self-heal, --predict, --fail-threshold, --upload-artifacts, --report
  • surf actions: flow, assert, compare, replay

Commands

# Quality gates
npm run check          # Full CI check (lint + test)
npm run lint           # Lint check
npm run fix            # Auto-fix lint issues
npm run consumer:smoke # Packed-artifact consumer contract smoke
npm run truth:gate       # Cross-check portable runtime/package/docs/passport truth surfaces
npm run release:check    # Release preflight (quality + truth-gated diagnostic corpora + packed-artifact verification)

# First-run proof (full walkthrough: docs/adoption/minimal-cli-smoke-walkthrough.md)
node ./bin/test-capabilities doctor
node ./bin/test-capabilities demo --json

# Build
npm run build          # TypeScript build

# TEST-CAPABILITIES CLI
npm run test-capabilities                # Run TEST-CAPABILITIES CLI
node ./bin/test-capabilities init --output ./test-capabilities.yaml --target node --force
node ./bin/test-capabilities doctor --config ./test-capabilities.yaml --target node --json
node ./bin/test-capabilities test --config ./test-capabilities.yaml --json
node ./bin/test-capabilities quantum --target https://example.com
node ./bin/test-capabilities surf explore --url https://example.com
node ./bin/test-capabilities heal --dir ./tests --dry-run
node ./bin/test-capabilities heal --dir ./tests --dry-run \
  --proposal-output artifacts/heal-proposals.json \
  --verification-output artifacts/heal-verification.json
node ./bin/test-capabilities heal --dir ./tests --dry-run \
  --findings-input artifacts/orchestrator-findings.json \
  --proposal-output artifacts/heal-proposals.json
node ./bin/test-capabilities heal --dir ./tests --checkpoint-ref checkpoint/test-capabilities/heal-001
node ./bin/test-capabilities heal --dir ./tests \
  --proposal-input artifacts/heal-proposals.json \
  --checkpoint-ref checkpoint/test-capabilities/heal-001

# Testing
npm test                  # Run node contract tests
npm run test:property     # fast-check invariant lane for config, route selection, and orchestrator behavior
npm run test:behavior     # cucumber-backed CLI workflow scenarios mapped to docs/examples
npm run test:ci-targeted  # CI-targeted smoke tests
npm run capability:drill  # Repo-local end-to-end drill for shipped capabilities
npm run root-cause:corpus        # Dogfood calibrated root-cause diagnosis invariants
npm run runtime-diagnostic:corpus # Dogfood calibrated diagnosis through real cli-tester subprocesses
npm run bombadil:smoke           # Richer Bombadil regression smoke against a deterministic local fixture

# Docs discovery
npm run docs:list            # List relevant docs for a task
npm run docs:list:workspace  # Workspace-wide doc scan

Screening proof path

Repo-local ts-quality screening currently starts with a narrow operation-kernel slice over the test-capabilities source surface. For the wrapper, commands, and changed-scope rules, see docs/dev/ts-quality-screening.md.

Capability drill

To exercise the shipped capabilities against deterministic local fixtures, run:

npm run capability:drill

What it checks today:

  • test succeeds on a real CLI smoke target and rejects inert URL overrides in quick mode
  • quantum succeeds on an explicit local URL and fails closed when --target is missing
  • heal proposes selector fixes without rewriting payload literals or custom-helper strings
  • heal --dry-run --proposal-output <file> --verification-output <file> writes durable proposal and in-memory verification artifacts for review or future replay-ledger follow-through without mutating files
  • heal requires --checkpoint-ref from an external checkpoint/restore authority before applying proposals that mutate files
  • heal --proposal-input <artifact> --checkpoint-ref <ref> applies proposals from a previously emitted proposal artifact instead of recomputing them during apply
  • surf explore runs through the shipped wrapper path and rejects invalid URLs
  • test runs a surf-backed orchestrator agent against a deterministic local fixture
  • library drills for orchestrator correlation and prediction input validation; contract tests cover calibrated root-cause observation synthesis
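The heal rules in the list above (dry runs never mutate files, apply mode needs a checkpoint reference, and a proposal artifact replaces recomputation) can be sketched as a guard function. This is an illustrative sketch only: `planHeal`, `HealOptions`, and `HealPlan` are hypothetical names, and the option fields simply mirror the documented CLI flags.

```typescript
// Hypothetical sketch of the documented heal apply guard.
interface HealOptions {
  dir: string;
  dryRun: boolean;
  checkpointRef?: string; // mirrors --checkpoint-ref
  proposalInput?: string; // mirrors --proposal-input
}

type HealPlan =
  | { mode: "dry-run"; mutatesFiles: false }
  | {
      mode: "apply";
      mutatesFiles: true;
      checkpointRef: string;
      source: "recomputed" | "artifact";
    };

function planHeal(options: HealOptions): HealPlan {
  // Dry runs only propose; they never touch files.
  if (options.dryRun) {
    return { mode: "dry-run", mutatesFiles: false };
  }
  // Fail closed: applying without a checkpoint reference is rejected.
  if (!options.checkpointRef) {
    throw new Error("heal apply requires --checkpoint-ref from an external checkpoint authority");
  }
  return {
    mode: "apply",
    mutatesFiles: true,
    checkpointRef: options.checkpointRef,
    // A reviewed artifact is applied as-is instead of recomputing proposals.
    source: options.proposalInput ? "artifact" : "recomputed",
  };
}
```

Modeling the plan as a discriminated union means an apply plan cannot be constructed without a checkpoint reference, which matches the fail-closed posture described for the CLI.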

Surf modes:

# Auto-detect: use Surf Go when available, otherwise a deterministic shim
npm run capability:drill

# Force the deterministic shim path
bash ./scripts/capability-drill.sh --surf-mode shim

# Require a real Surf Go runtime (surf-go or source-checkout Surf Go)
bash ./scripts/capability-drill.sh --surf-mode real

# Emit machine-readable JSON for automation
bash ./scripts/capability-drill.sh --json --surf-mode shim --skip-build

The JSON mode returns a structured summary with ok, surfMode, summary, and per-check status entries so CI or agent tooling can consume the drill result without scraping terminal text.

Root-cause calibration corpus

To dogfood the calibrated diagnostic layer against deterministic fixture cases, use:

npm run root-cause:corpus

To prove the same root-cause synthesis path through real supported cli-tester subprocess execution, use:

npm run runtime-diagnostic:corpus

What the fixture corpus checks today:

  • single-agent CLI or Surf failures do not emit root_cause
  • two independent observed CLI command-resolution or timeout failures classify as command_resolution or timeout_or_latency, including shell not-found wording, while app crashes do not masquerade as command resolution
  • two independent observation-only API signals can classify as contract_mismatch, including API contract violation, property-kind payload evidence, and response-payload/required-field wording that must not masquerade as Bombadil/property failure or selector drift
  • bounded non-contract classes include API authentication/authorization evidence as auth_or_permission, API/web connection/DNS/TLS evidence as network_connectivity, API/CLI/web rate-limit/quota/memory/disk/file-descriptor/pool-exhaustion evidence as resource_exhaustion, and API/CLI/web missing environment/configuration evidence as configuration_error
  • executable-resolution evidence still classifies as command_resolution even when the missing executable is named config or app-config; real config-file/value evidence remains configuration_error
  • generic API runtime, property-kind runtime, stack-trace, validation, or schema exceptions without contract/auth/network/resource/configuration evidence remain component_failure_surface; recommendation-only keywords are not classifying evidence
  • two independent observed Surf failures classify as browser_coverage_gap, including generic DOM coverage wording that must not masquerade as selector drift
  • two independent observed selector/DOM drift failures classify as selector_or_dom_drift, including selector-contract wording, while single-sensor or unobserved selector drift does not emit root_cause
  • two independent observed Bombadil failures classify as property_violation, including required-property validation wording that must not masquerade as API contract mismatch
  • two independent sensors linked to the same API finding classify as contract_mismatch, even when generic browser words appear in the observations
  • finding-only, mixed-class evidence, all-passing linked sensors, single-sensor multi-finding, unobserved conflicting findings, partially observed evidence pairs, linked finding/current-run evidence disagreement, and same-component mixed CLI/API failure classes do not emit root_cause
  • unrelated ambiguous signals in one component, including a suppressed same-component mixed-class ambiguity, do not suppress a calibrated same-run diagnosis for another component
  • independent CLI and API failures can emit simultaneous component-scoped root_cause observations
  • three-sensor agreement produces high-calibration root_cause with signalCount and sensorCount reflecting all three observers
  • independent Bombadil + CLI failures emit two component-scoped root_cause observations (property_violation + command_resolution)
  • three-way simultaneous Surf + CLI + API failures emit three component-scoped root_cause observations
  • propagation synthesis covers default api -> web, cli -> api, and cli -> web edges, including API latency links with web runtime failures, same-timeout shared-infra links, and API schema-drift-to-UI links with web runtime failures; it supports intelligence.propagationTopology overrides for custom edges, suppresses generic component-failure-only, non-latency same-class, and Surf evidence-gap overclaims, and stays low-calibration/non-authoritative
  • root-cause and propagation output exclude prediction language and synthetic corr-* IDs; auth-boundary, network-connectivity, resource-exhaustion, or configuration failures do not imply downstream propagation without a separately promoted bounded link

Machine-readable mode emits aggregate coverage floors, exact release truth-lock counts, per-case expected/actual classification, root-cause count, calibration counts, linked finding IDs, propagation counts, propagation subjects, propagation links, and no-propagation guardrail markers for automation without scraping terminal text. Runtime observations also expose structured semantics.failureClass for root_cause and semantics.propagationLink for propagation while retaining evidence strings for backward-compatible inspection:

npm run --silent root-cause:corpus -- --json
npm run --silent runtime-diagnostic:corpus -- --json

The runtime diagnostic corpus is intentionally narrower than the fixture corpus: it creates temporary local CLI fixtures and checks single-sensor suppression, two-sensor command_resolution, two-sensor timeout_or_latency, same-component mixed-class suppression, correlation-disabled suppression, unique observation IDs, and absence of prediction/causal/repair-order language without using network, databases, or external binaries.

Bombadil richer smoke fixture

To run a richer local Bombadil regression against a deterministic multi-control fixture, use:

npm run bombadil:smoke

What it does:

  • serves examples/bombadil-rich/site/ on a temporary local port
  • runs Bombadil directly and expects trace artifacts under a temporary output directory
  • runs test-capabilities test --quick with a Bombadil-backed config against the same local fixture

Useful options:

# Reuse an already-built dist/
bash ./scripts/bombadil-rich-smoke.sh --skip-build

# Run only the direct Bombadil phase
bash ./scripts/bombadil-rich-smoke.sh --direct-only

# Run only the TEST-CAPABILITIES wrapper phase
bash ./scripts/bombadil-rich-smoke.sh --tc-only

# Keep the generated fixture/output directory for inspection
bash ./scripts/bombadil-rich-smoke.sh --keep-temp

The richer fixture currently lives at examples/bombadil-rich/site/ and includes intra-origin navigation, toggles, select inputs, a form, and stateful UI so Bombadil can explore more than the minimal capability-drill page.

Structure

test-capabilities/
├── bin/               # TEST-CAPABILITIES CLI
├── docs/
│   ├── api/           # TEST-CAPABILITIES API reference
│   ├── project/       # Vision, goals
│   ├── _core/         # Immutable core docs
│   ├── org_context/   # Organizational context
│   ├── learnings/     # Crystallized patterns
│   └── system4d/      # System4D framework docs
├── examples/          # Test patterns, sample specs
├── external/          # Vendored tools (bombadil)
├── flows/             # Test flow definitions
├── ontology/          # Generated test artifacts
├── policy/            # Stack lane, security policies
├── prompts/           # LLM testing prompts
├── src/               # TEST-CAPABILITIES framework source
│   ├── core/          # Orchestrator
│   ├── healing/       # Self-healing
│   ├── integrations/  # External tool clients
│   ├── prediction/    # Prediction engine
│   └── quantum/       # Quantum simulator
├── scripts/           # CI, quality gates, tooling
└── tests/             # Test files