ai-reliability-layer

v0.0.1

Validates AI failure analysis against Playwright runtime evidence. Includes a shared core, NestJS HTTP server, and CLI.

Downloads

113

ai-reliability-layer

Validates AI failure analysis against real runtime evidence. Run a scenario, let an AI diagnose the failure, then re-run the scenario. The library cross-checks the AI's prediction against what actually happened in the browser and returns a structured verdict.

How it works

Scenario (selector + URL)
  │
  ├─ First attempt (Playwright) ──► AI analysis (GPT / Claude / etc.)
  │                                        │
  └─ Retry attempt (Playwright) ──────────┘
                                           │
                       ReliabilityEvaluator: compare prediction vs runtime evidence
                                           │
                                    AnalysisReport + Verdict

A verdict can be:

| action | Meaning |
|---|---|
| accept_ai | Runtime evidence confirms the AI's prediction |
| override_ai | Evidence contradicts the prediction |
| needs_more_evidence | Inconclusive; actual cause is unknown |
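
The comparison step can be pictured with a small sketch. This is not the library's ReliabilityEvaluator: the type names, the `inferCause` rules, and their ordering are assumptions made for illustration. Only the field names and literal values come from this README, and the `"passed"` retry status is itself assumed.

```typescript
// Illustrative only: NOT the library's ReliabilityEvaluator.
// Field names and literals follow the README; the inference rules are assumed.
type Cause = "invalid_selector" | "timeout" | "flaky_timing" | "loose_element" | "unknown";
type Action = "accept_ai" | "override_ai" | "needs_more_evidence";

interface Evidence {
  retryStatus: "passed" | "failed"; // "passed" is an assumed value
  selectorExists: boolean;
  failureSignature: string | undefined;
}

interface Verdict {
  actualCause: Cause;
  aiCorrect: boolean;
  action: Action;
}

// Hypothetical rule order: a missing selector wins, then a passing retry,
// then a timeout signature. Anything else stays "unknown".
function inferCause(evidence: Evidence): Cause {
  if (!evidence.selectorExists) return "invalid_selector";
  if (evidence.retryStatus === "passed") return "flaky_timing";
  if (evidence.failureSignature === "timeout") return "timeout";
  return "unknown";
}

function evaluate(predictedCause: Cause, evidence: Evidence): Verdict {
  const actualCause = inferCause(evidence);
  if (actualCause === "unknown") {
    // Nothing to compare against, so the prediction is neither accepted nor overridden.
    return { actualCause, aiCorrect: false, action: "needs_more_evidence" };
  }
  const aiCorrect = predictedCause === actualCause;
  return { actualCause, aiCorrect, action: aiCorrect ? "accept_ai" : "override_ai" };
}

const verdict = evaluate("invalid_selector", {
  retryStatus: "failed",
  selectorExists: false,
  failureSignature: "timeout",
});
console.log(verdict.action); // accept_ai
```

The key design point is the three-way split: a prediction is only accepted or overridden when the evidence points to a concrete cause.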


Requirements

  • Node.js ≥ 18
  • A Playwright-supported browser (Chromium is launched automatically)
  • An AI provider API key (or use the built-in mock provider for testing)

Installation

npm install ai-reliability-layer
npx playwright install chromium   # first time only

Quick start

1. Create a scenario file

TypeScript (scenarios/login-button.ts):

export default {
  id: "login-button",
  name: "Login button click",
  url: "https://your-app.com/login",
  selector: "#login-btn",
  expectedMode: "deterministic_fail",   // "deterministic_fail" | "flaky" | "loose_element"
  timeoutMs: 3000,
};

Markdown with YAML frontmatter (scenarios/login-button.md):

---
id: login-button
name: Login button click
url: https://your-app.com/login
selector: "#login-btn"
expectedMode: deterministic_fail
timeoutMs: 3000
---
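
To make the Markdown format concrete, here is a minimal sketch of reading such a frontmatter block. It is a hand-rolled reader for flat `key: value` lines only, not a YAML parser and not the library's loader; `parseFrontmatter` is a hypothetical name.

```typescript
// Illustrative only: a flat "key: value" reader, not a full YAML parser
// and not the library's scenario loader.
function parseFrontmatter(markdown: string): Record<string, string | number> {
  const lines = markdown.split(/\r?\n/);
  if (lines[0]?.trim() !== "---") throw new Error("missing opening ---");
  const end = lines.findIndex((line, index) => index > 0 && line.trim() === "---");
  if (end === -1) throw new Error("missing closing ---");

  const fields: Record<string, string | number> = {};
  for (const line of lines.slice(1, end)) {
    const colon = line.indexOf(":");
    if (colon === -1) continue; // skip blank or malformed lines
    const key = line.slice(0, colon).trim();
    // Split on the FIRST colon only, so "https://..." values stay intact.
    const raw = line.slice(colon + 1).trim();
    // Strip one pair of surrounding quotes, as in selector: "#login-btn".
    const quoted = /^(".*"|'.*')$/.test(raw);
    const value = quoted ? raw.slice(1, -1) : raw;
    // Unquoted digit-only values become numbers (e.g. timeoutMs).
    fields[key] = !quoted && /^\d+$/.test(value) ? Number(value) : value;
  }
  return fields;
}

const parsedScenario = parseFrontmatter(
  [
    "---",
    "id: login-button",
    "url: https://your-app.com/login",
    'selector: "#login-btn"',
    "timeoutMs: 3000",
    "---",
  ].join("\n"),
);
console.log(parsedScenario);
```

The selector is quoted in the frontmatter because an unquoted `#` after whitespace starts a comment in YAML.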

2. Set your AI provider

# OpenAI / OpenAI-compatible (Grok, Gemini, DeepSeek, Ollama…)
export OPENAI_BASE_URL=https://api.openai.com/v1
export OPENAI_API_KEY=sk-...

# Or Anthropic / Claude
export ANTHROPIC_API_KEY=sk-ant-...

# Default provider to use
export AI_PROVIDER=openai          # or: claude, grok, gemini, deepseek, ollama, mock
export AI_MODEL=gpt-4o

3. Run via CLI

# Analyse a single scenario by ID
npx ai-reliability-layer analyze --scenario login-button

# Analyse all scenarios in ./scenarios/
npx ai-reliability-layer analyze --all

# Run in async (non-blocking) mode and stream progress
npx ai-reliability-layer analyze --all --async

# Analyse a specific file
npx ai-reliability-layer analyze --file ./scenarios/login-button.ts

# Override AI provider/model for this run only
npx ai-reliability-layer analyze --scenario login-button --provider claude --model claude-3-7-sonnet

# Discover all scenarios (returns JSON)
npx ai-reliability-layer discover

4. Run via Node.js API

import { createCoreRuntime } from "ai-reliability-layer";

const runtime = createCoreRuntime();

try {
  // Analyse one scenario by ID
  const report = await runtime.analysisService.run({
    scenarioId: "login-button",
  });
  console.log(report.verdict?.action);        // "accept_ai" | "override_ai" | "needs_more_evidence"
  console.log(report.verdict?.actualCause);   // "invalid_selector" | "timeout" | "flaky_timing" | "loose_element" | "unknown"

  // Analyse all scenarios
  const reports = await runtime.analysisService.run({ runAll: true });

  // Inline — no file needed
  const inlineReport = await runtime.analysisService.run({
    scenario: {
      url: "https://your-app.com",
      selector: "#submit",
      expectedMode: "flaky",
    },
  });
} finally {
  await runtime.close();
}

5. Start the HTTP server

# Development (watch mode)
npm run start:server:dev

# Production (build first)
npm run build
npm run start:server

Default port is 3000. Override with PORT=8080 npm run start:server.


HTTP API

All requests and responses use application/json.

POST /analysis/run — synchronous

curl -X POST http://localhost:3000/analysis/run \
  -H "content-type: application/json" \
  -H "x-ai-provider: mock" \
  -H "x-ai-model: gpt-5.4" \
  --data '{"scenarioId":"invalid-selector"}'

Request body:

{
  "scenarioId": "login-button",           // run by ID
  // or "filePath": "./scenarios/x.ts"    // run from file (must be inside project root)
  // or "runAll": true                    // run all discovered scenarios
  // or "scenario": { "url": "https://...", "selector": "#x" }  // inline, no file needed
  "ai": { "provider": "claude", "model": "claude-3-7-sonnet" }  // optional — also accepted as headers
}

Response:

{
  "status": "completed",
  "result": {
    "reportId": "login-button-1744310400000",
    "scenario": { "id": "login-button", "selector": "#login-btn" },
    "firstRun":  { "status": "failed", "errorMessage": "Timeout 3000ms exceeded." },
    "retryRun":  { "status": "failed" },
    "aiDiagnosis": { "predictedCause": "invalid_selector", "confidence": 0.91, "summary": "..." },
    "validationEvidence": { "retryStatus": "failed", "selectorExists": false, "historicalPattern": "stable_fail", "failureSignature": "timeout" },
    "verdict": { "actualCause": "invalid_selector", "aiCorrect": true, "action": "accept_ai", "explanation": "..." },
    "createdAt": "2026-04-10T00:00:00.000Z"
  }
}

AI provider/model can be passed as HTTP headers instead of (or to override) the request body:

| Header | Body equivalent |
|---|---|
| x-ai-provider | ai.provider |
| x-ai-model | ai.model |
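
The two ways of passing the provider and model can be sketched with a small request builder. `buildRunRequest` and its types are hypothetical helpers, not part of the package. The function only assembles the request and does not send it.

```typescript
// Illustrative only: assembles a POST /analysis/run request without sending it.
interface AiOverride {
  provider?: string;
  model?: string;
}

interface RunRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

function buildRunRequest(
  baseUrl: string,
  scenarioId: string,
  ai: AiOverride = {},
  useHeaders = true,
): RunRequest {
  const headers: Record<string, string> = { "content-type": "application/json" };
  const payload: { scenarioId: string; ai?: AiOverride } = { scenarioId };

  if (useHeaders) {
    // Header form: x-ai-provider / x-ai-model
    if (ai.provider) headers["x-ai-provider"] = ai.provider;
    if (ai.model) headers["x-ai-model"] = ai.model;
  } else if (ai.provider || ai.model) {
    // Body form: ai.provider / ai.model
    payload.ai = ai;
  }

  return {
    url: `${baseUrl}/analysis/run`,
    method: "POST",
    headers,
    body: JSON.stringify(payload),
  };
}

const runRequest = buildRunRequest("http://localhost:3000", "login-button", { provider: "mock" });
console.log(runRequest.headers["x-ai-provider"]); // mock
```

On Node.js 18 or later the result can be sent with the global `fetch`, for example `fetch(runRequest.url, { method: runRequest.method, headers: runRequest.headers, body: runRequest.body })`.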

POST /analysis/runs — async

Returns immediately with a runId. Poll the status endpoint to track progress.

GET /analysis/runs/:runId

{
  "run": {
    "runId": "run-1744310400000",
    "status": "running",   // "queued" | "running" | "completed" | "failed"
    "total": 3, "completed": 1, "passed": 1, "failed": 0, "pending": 2
  }
}
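
A polling loop over this endpoint might look like the sketch below. `waitForRun`, its parameters, and the default interval and attempt limit are assumptions for illustration. The status fetcher is injected so the loop is independent of any particular HTTP client.

```typescript
// Illustrative only: polls until the run reaches a terminal status.
type RunStatus = "queued" | "running" | "completed" | "failed";

interface RunSummary {
  runId: string;
  status: RunStatus;
  total: number;
  completed: number;
}

// fetchRun is injected so the loop works with fetch, a test stub, or any HTTP client.
async function waitForRun(
  fetchRun: () => Promise<RunSummary>,
  intervalMs = 1000,
  maxAttempts = 60,
): Promise<RunSummary> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const run = await fetchRun();
    // "completed" and "failed" are the two terminal statuses listed above.
    if (run.status === "completed" || run.status === "failed") return run;
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`run did not finish after ${maxAttempts} polls`);
}
```

With `fetch`, the injected function would request `/analysis/runs/:runId`, parse the JSON, and return its `run` property, since the response wraps the summary in `{ "run": ... }`.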

GET /analysis/runs/:runId/results

Returns the full run object plus an array of all AnalysisReport objects for that run.

GET /analysis/reports/:reportId

Single report by ID.

GET /scenarios

List all discovered scenarios.


Scenario reference

| Field | Type | Required | Description |
|---|---|---|---|
| id | string | No | Stable identifier (alphanumeric, hyphens, underscores). Defaults to filename without extension. |
| name | string | No | Human-readable label. |
| url | string | Yes | Target URL. Must use http:, https:, or fixture: protocol. |
| selector | string | Yes | CSS / XPath selector Playwright will wait for and click. |
| expectedMode | string | No | "deterministic_fail", "flaky", or "loose_element". Used as a hint in the AI prompt. |
| timeoutMs | number | No | Per-operation timeout in milliseconds. Default: 1000. |
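
The defaults in the table can be expressed as a small normalization step for file-based scenarios. This is a sketch under assumptions: `ScenarioInput`, `NormalizedScenario`, and `normalizeScenario` are hypothetical names, and how the library assigns an id to inline scenarios is not covered here.

```typescript
// Illustrative only: applies the documented defaults to a file-based scenario.
interface ScenarioInput {
  id?: string;
  name?: string;
  url: string;
  selector: string;
  expectedMode?: "deterministic_fail" | "flaky" | "loose_element";
  timeoutMs?: number;
}

type NormalizedScenario = ScenarioInput & { id: string; timeoutMs: number };

function normalizeScenario(input: ScenarioInput, fileName: string): NormalizedScenario {
  if (!input.url) throw new Error("url is required");
  if (!input.selector) throw new Error("selector is required");

  // id defaults to the file name without directory or extension,
  // e.g. "scenarios/login-button.ts" -> "login-button".
  const fallbackId = fileName.replace(/^.*[\\/]/, "").replace(/\.[^.]+$/, "");
  const id = input.id ?? fallbackId;
  if (!/^[A-Za-z0-9_-]+$/.test(id)) throw new Error(`invalid scenario id: ${id}`);

  // timeoutMs defaults to 1000 when omitted.
  return { ...input, id, timeoutMs: input.timeoutMs ?? 1000 };
}

const normalized = normalizeScenario(
  { url: "https://your-app.com/login", selector: "#login-btn" },
  "scenarios/login-button.ts",
);
console.log(normalized.id, normalized.timeoutMs); // login-button 1000
```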


Configuration

| Variable | Default | Description |
|---|---|---|
| AI_PROVIDER | mock | Default AI provider for every run. |
| AI_MODEL | mock-reliability-v1 | Default model name. |
| OPENAI_BASE_URL | (none) | Base URL for OpenAI or compatible providers. |
| OPENAI_API_KEY | (none) | API key for OpenAI or compatible providers. |
| ANTHROPIC_API_KEY | (none) | API key for Anthropic. |
| ANTHROPIC_BASE_URL | https://api.anthropic.com/v1 | Anthropic base URL override. |
| AI_<PROVIDER>_BASE_URL | (none) | Per-provider base URL (e.g. AI_GROK_BASE_URL). |
| AI_<PROVIDER>_API_KEY | (none) | Per-provider API key (e.g. AI_GROK_API_KEY). |
| SCENARIO_DIR | scenarios | Directory to scan for scenario files. |
| BASE_OUTPUT_DIR | artifacts | Root directory for reports, runs, and screenshots. |
| RUN_CONCURRENCY | cpus/2 (max 2) | Number of scenarios that run in parallel. |
| ENABLE_TRACE | false | Save Playwright trace files on every run. |
| ENABLE_SUCCESS_SCREENSHOT | false | Save screenshots on passing runs too. |
| PORT | 3000 | HTTP server port. |
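
As a rough picture of how these defaults combine, the sketch below reads a few of the variables from a plain object. `loadConfig` and `positiveInt` are hypothetical helpers. The parsing rules, including the reading of "cpus/2 (max 2)" and the treatment of only the literal `true` as enabled, are assumptions rather than the package's documented behaviour.

```typescript
// Illustrative only: reads a few variables from the table with their documented defaults.
type Env = Record<string, string | undefined>;

interface Config {
  aiProvider: string;
  aiModel: string;
  scenarioDir: string;
  outputDir: string;
  runConcurrency: number;
  enableTrace: boolean;
  port: number;
}

// Assumed parsing rule: a positive integer is used as-is, anything else falls back.
function positiveInt(value: string | undefined, fallback: number): number {
  const parsed = Number(value);
  return value !== undefined && Number.isInteger(parsed) && parsed > 0 ? parsed : fallback;
}

function loadConfig(env: Env, cpuCount: number): Config {
  // One reading of "cpus/2 (max 2)": half the cores, at least 1, never more than 2.
  const defaultConcurrency = Math.min(2, Math.max(1, Math.floor(cpuCount / 2)));
  return {
    aiProvider: env["AI_PROVIDER"] ?? "mock",
    aiModel: env["AI_MODEL"] ?? "mock-reliability-v1",
    scenarioDir: env["SCENARIO_DIR"] ?? "scenarios",
    outputDir: env["BASE_OUTPUT_DIR"] ?? "artifacts",
    runConcurrency: positiveInt(env["RUN_CONCURRENCY"], defaultConcurrency),
    enableTrace: env["ENABLE_TRACE"] === "true", // assumed: only the literal "true" enables it
    port: positiveInt(env["PORT"], 3000),
  };
}

const config = loadConfig({ PORT: "8080" }, 8);
console.log(config.port, config.runConcurrency); // 8080 2
```

In a Node.js process the arguments would typically be `process.env` and `os.cpus().length`.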

Supported AI providers

| Name | Protocol | Required environment variables |
|---|---|---|
| mock | built-in | none |
| openai | OpenAI | OPENAI_BASE_URL, OPENAI_API_KEY |
| claude / anthropic | Anthropic Messages | ANTHROPIC_API_KEY |
| grok | OpenAI-compatible | AI_GROK_BASE_URL, AI_GROK_API_KEY |
| gemini | OpenAI-compatible | AI_GEMINI_BASE_URL, AI_GEMINI_API_KEY |
| deepseek | OpenAI-compatible | AI_DEEPSEEK_BASE_URL, AI_DEEPSEEK_API_KEY |
| ollama | OpenAI-compatible | AI_OLLAMA_BASE_URL, AI_OLLAMA_API_KEY |
| lmstudio | OpenAI-compatible | AI_LMSTUDIO_BASE_URL, AI_LMSTUDIO_API_KEY |
| local | OpenAI-compatible | AI_LOCAL_BASE_URL, AI_LOCAL_API_KEY |

Any other OpenAI-compatible provider can be added without code changes: set AI_<NAME>_BASE_URL and AI_<NAME>_API_KEY, where <NAME> is the provider name in upper case.
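
The naming convention can be illustrated with a lookup sketch. `resolveProvider` is a hypothetical function, not the package's resolver. It assumes the variable prefix is simply the provider name upper-cased, as the AI_GROK_* examples suggest.

```typescript
// Illustrative only: maps a provider name to the environment variables listed above.
type EnvVars = Record<string, string | undefined>;

interface ProviderSettings {
  name: string;
  baseUrl: string | undefined;
  apiKey: string | undefined;
}

function resolveProvider(name: string, env: EnvVars): ProviderSettings {
  switch (name) {
    case "mock":
      // Built-in provider: needs no variables.
      return { name, baseUrl: undefined, apiKey: undefined };
    case "openai":
      return { name, baseUrl: env["OPENAI_BASE_URL"], apiKey: env["OPENAI_API_KEY"] };
    case "claude":
    case "anthropic":
      return {
        name,
        baseUrl: env["ANTHROPIC_BASE_URL"] ?? "https://api.anthropic.com/v1",
        apiKey: env["ANTHROPIC_API_KEY"],
      };
    default: {
      // Assumed convention: AI_<NAME>_BASE_URL / AI_<NAME>_API_KEY with NAME upper-cased.
      const prefix = `AI_${name.toUpperCase()}`;
      return { name, baseUrl: env[`${prefix}_BASE_URL`], apiKey: env[`${prefix}_API_KEY`] };
    }
  }
}

const grok = resolveProvider("grok", {
  AI_GROK_BASE_URL: "https://example.test/v1", // placeholder URL
  AI_GROK_API_KEY: "placeholder-key",
});
console.log(grok.baseUrl); // https://example.test/v1
```

Because the default branch derives the variable names from the provider name, a new OpenAI-compatible provider needs only two environment variables and no new code path.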


Output structure

artifacts/
  reports/                        # one JSON file per AnalysisReport
  jobs/                           # one JSON file per async AnalysisRun
  runs/
    <scenario-id>/
      <run-id>/
        screenshot.png            # captured on failure (always), on pass (if ENABLE_SUCCESS_SCREENSHOT=true)
        trace.zip                 # if ENABLE_TRACE=true

Development

npm install
npm run check          # type-check (no emit)
npm run build          # compile to dist/
npm test               # unit + integration tests
npm run test:unit
npm run test:integration
npm run cli -- discover
npm run cli -- analyze --all
npm run start:server:dev

Security

  • provider is validated as lowercase alphanumeric (hyphens and underscores allowed, max 63 chars). Uppercase, spaces, and control characters are rejected.
  • model must not contain control characters (newlines, carriage returns, etc.) and is capped at 200 chars.
  • scenario.url is restricted to the http:, https:, and fixture: protocols. file://, javascript:, and other protocols are rejected, which blocks local-file and script URLs and narrows the SSRF surface.
  • filePath (CLI --file and API body) is resolved against process.cwd() and must remain inside the project directory. Paths like ../../etc/passwd are rejected.
  • reportId / runId are validated against [a-zA-Z0-9][a-zA-Z0-9_-]{0,199} before being used as file names on disk.
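
The rules above translate fairly directly into checks like the following. These are sketches derived from the wording of the list, not code copied from the package. The function names are hypothetical, and the provider pattern assumes any allowed character may appear first.

```typescript
// Illustrative only: validators derived from the rules listed above.
const PROVIDER_RE = /^[a-z0-9_-]{1,63}$/;
const ARTIFACT_ID_RE = /^[a-zA-Z0-9][a-zA-Z0-9_-]{0,199}$/;
const CONTROL_CHARS_RE = /[\u0000-\u001f\u007f]/;
const ALLOWED_PROTOCOLS = ["http:", "https:", "fixture:"];

function isValidProvider(provider: string): boolean {
  return PROVIDER_RE.test(provider);
}

function isValidModel(model: string): boolean {
  return model.length > 0 && model.length <= 200 && !CONTROL_CHARS_RE.test(model);
}

// Applies to both reportId and runId before they are used as file names.
function isValidArtifactId(id: string): boolean {
  return ARTIFACT_ID_RE.test(id);
}

function isAllowedUrl(url: string): boolean {
  try {
    // The WHATWG URL parser reports the scheme with its trailing colon.
    return ALLOWED_PROTOCOLS.includes(new URL(url).protocol);
  } catch {
    return false; // not parseable as an absolute URL
  }
}

console.log(isValidProvider("claude"), isAllowedUrl("file:///etc/passwd")); // true false
```

Anchoring the id pattern at both ends is what keeps values such as `../etc` from ever reaching the file system as a path segment.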

License

MIT