npm package discovery and stats viewer.

Discover Tips

  • General search

    [free text search, go nuts!]

  • Package details

    pkg:[package-name]

  • User packages

    @[username]

Sponsor

Optimize Toolset

I’ve always been into building performant and accessible sites, but lately I’ve been taking it extremely seriously. So much so that I’ve been building a tool to help me optimize and monitor the sites that I build to make sure that I’m making an attempt to offer the best experience to those who visit them. If you’re into performant, accessible and SEO friendly sites, you might like it too! You can check it out at Optimize Toolset.

About

Hi, 👋, I’m Ryan Hefner  and I built this site for me, and you! The goal of this site was to provide an easy way for me to check the stats on my npm packages, both for prioritizing issues and updates, and to give me a little kick in the pants to keep up on stuff.

As I was building it, I realized that I was actually using the tool to build the tool, and figured I might as well put this out there and hopefully others will find it to be a fast and useful way to search and browse npm packages as I have.

If you’re interested in other things I’m working on, follow me on Twitter or check out the open source projects I’ve been publishing on GitHub.

I am also working on a Twitter bot for this site to tweet the most popular, newest, random packages from npm. Please follow that account now and it will start sending out packages soon–ish.

Open Software & Tools

This site wouldn’t be possible without the immense generosity and tireless efforts from the people who make contributions to the world and share their work via open source initiatives. Thank you 🙏

© 2026 – Pkg Stats / Ryan Hefner

@mutagent/diagnostics

v0.1.0-alpha.8

Published

mutagent-diagnostics: AI agent diagnostics-on-tap skill for Claude Code, Codex, and other coding runtimes

Readme

MUTAGENT Diagnostics

 ███▄ ▄███▓ █    ██ ▄▄▄█████▓ ▄▄▄        ▄████ ▓█████  ███▄    █ ▄▄▄█████▓
▓██▒▀█▀ ██▒ ██  ▓██▒▓  ██▒ ▓▒▒████▄     ██▒ ▀█▒▓█   ▀  ██ ▀█   █ ▓  ██▒ ▓▒
▓██    ▓██░▓██  ▒██░▒ ▓██░ ▒░▒██  ▀█▄  ▒██░▄▄▄░▒███   ▓██  ▀█ ██▒▒ ▓██░ ▒░
▒██    ▒██ ▓▓█  ░██░░ ▓██▓ ░ ░██▄▄▄▄██ ░▓█  ██▓▒▓█  ▄ ▓██▒  ▐▌██▒░ ▓██▓ ░
▒██▒   ░██▒▒▒█████▓   ▒██▒ ░  ▓█   ▓██▒░▒▓███▀▒░▒████▒▒██░   ▓██░  ▒██▒ ░
░ ▒░   ░  ░░▒▓▒ ▒ ▒   ▒ ░░    ▒▒   ▓▒█░ ░▒   ▒ ░░ ▒░ ░░ ▒░   ▒ ▒   ▒ ░░
░  ░      ░░░▒░ ░ ░     ░      ▒   ▒▒ ░  ░   ░  ░ ░  ░░ ░░   ░ ▒░    ░
░      ░    ░░░ ░ ░   ░        ░   ▒   ░ ░   ░    ░      ░   ░ ░   ░
       ░      ░                    ░  ░      ░    ░  ░         ░

Diagnostics-on-Tap for AI agents. Pull evidence from your agent traces, translate user feedback into root-cause records, surface ranked remedies, and apply approved fixes — all from within your existing AI coding session.


CI License: Proprietary Version Bun Node Platforms


Table of Contents


What it does

@mutagent/diagnostics is an AI coding agent skill — a self-contained bundle that installs into your existing coding-agent runtime (Claude Code, Codex, Cursor, OpenCode) and gives it the ability to diagnose itself.

Invoke it naturally in chat: "diagnose my agents" or "why did my agent fail last night". The skill:

  1. Tier 0 static scan — free, instant pattern match on your traces to bound token cost
  2. Auto-extracts a rich entity context at ingest (system prompt · tool inventory · input sample) — deterministically, no LLM
  3. Slices trace history into up to 5 parallel analysis clusters
  4. Dispatches N Analyzers that run RCA against each cluster
  5. Classifies each finding on 3 axes: WHAT failed · WHY it failed · WHERE in the stack
  6. Builds a deterministic render input (Step 8.5 enricher) — derives the big-stat row, 24h latency heatmap, and signal census; fail-loud if the input is starved
  7. Renders a gold-standard HTML report — Methodology [INTERNAL] · Overview · one tab per finding · Decisions — with copy-back markdown approval workflow
  8. Spawns a Background apply-worker on an isolated git worktree that opens a PR (or issues idempotent REST mutations for cloud targets) — without touching your checkout

First invocation triggers interactive onboarding to configure your source + target platform. All subsequent invocations run the full diagnostic cycle automatically.


Quick Start

# First-time install — bootstraps config + runs onboarding wizard
$ pnpx @mutagent/diagnostics init

mutagent-diagnostics v0.1.0-alpha
Checking host runtime...  claude-code detected

? Select trace source platform:
  > Langfuse (cloud / self-hosted)
    OpenTelemetry (Jaeger / Tempo / Honeycomb)
    Local JSONL files
    Claude Code session transcripts
    Codex session transcripts

? Select target platform for fix delivery:
  > Claude Code (local agent config)
    Cursor (.cursorrules)
    OpenCode (config)
    Mastra.ai (TypeScript PR)
    Cloud REST (idempotent PUT)

Config written to .mutagent-diagnostics/config.yaml
Agents installed: .claude/agents/diagnostics-analyzer.md
                  .claude/agents/diagnostics-apply-worker.md

Ready. Invoke: "diagnose my agents" in your coding session.

After init, trigger diagnostics naturally from your AI session:

You: diagnose my agents

> Running Tier 0 static scan on 247 traces...
> 3 signal clusters detected (errors: 12, low-score: 8, feedback: 4)
> Dispatching 3 analyzers in parallel...
> RCA complete. 7 findings ranked by severity.
> Opening HTML report...

Platform Support

Source Platforms (trace evidence)

| Platform | Type | Notes | |----------|------|-------| | Langfuse | Cloud / self-hosted | Full filter/search coverage | | OpenTelemetry | Jaeger · Tempo · Honeycomb | OTLP endpoint | | Local JSONL | .jsonl / .ndjson | Filesystem read | | Claude Code | Session transcripts | Auto-detected from ~/.claude/projects/ | | Codex CLI | Session transcripts | Auto-detected from ~/.codex/sessions/ |

Target Platforms (apply fixes to)

| Platform | Class | Delivery | |----------|-------|---------| | Claude Code | local-agent | Edit ~/.claude/agents/*.md via PR | | Codex CLI | local-agent | Edit agent configs via PR | | Cursor | local-agent | Edit .cursorrules via PR | | OpenCode | local-agent | Edit config via PR | | Mastra.ai | local-code-construct | TypeScript PR to agent graph | | Cloud Agent SDK | local-code-construct | TypeScript PR to SDK integration | | Cloud REST | remote | Idempotent PUT mutations |


Architecture

%%{init: {'theme':'dark','themeVariables':{'primaryColor':'#a78bfa','primaryTextColor':'#e2e8f0','primaryBorderColor':'#7c3aed','lineColor':'#06b6d4','edgeLabelBackground':'#1e1b4b','background':'#0f0f23','clusterBkg':'#1e1b4b','clusterBorder':'#4c1d95','titleColor':'#a78bfa'}}}%%
flowchart TD
    INV["Invoke<br/>'diagnose my agents'"]
    DETECT{Config<br/>present?}
    ONB["Onboarding<br/>(8-phase)"]
    PROTO["Load orchestrator-protocol.md<br/>Parent session = orchestrator"]

    TIER0["Tier 0 static scan<br/>scripts/tier0-scan.ts"]
    SLICER["Dynamic-cluster slicer<br/>scripts/slicer.ts<br/>cap = 5"]
    ANL["N Analyzers ≤ 5<br/>assets/agents/diagnostics-analyzer.md"]
    RCA["RCA Layer<br/>WHAT / WHY / WHERE"]
    ENRICH["Step 8.5 — Build Render Input<br/>scripts/enrich/build-render-input.ts<br/>deterministic · fail-loud"]
    REPORT["Gold-standard HTML Report<br/>assets/templates/report.html.tpl<br/>copy-back markdown approval"]
    APPLY["BG apply-worker on worktree<br/>assets/agents/diagnostics-apply-worker.md<br/>PR or REST"]

    classDef purple fill:#4c1d95,stroke:#a78bfa,color:#e2e8f0
    classDef cyan   fill:#0e7490,stroke:#06b6d4,color:#e2e8f0
    classDef dim    fill:#1e293b,stroke:#475569,color:#94a3b8

    INV --> DETECT
    DETECT -->|missing| ONB
    DETECT -->|present| PROTO
    PROTO --> TIER0
    TIER0 --> SLICER
    SLICER --> ANL
    ANL --> RCA
    RCA --> ENRICH
    ENRICH --> REPORT
    REPORT --> APPLY

    class INV,DETECT purple
    class TIER0,SLICER,ANL,RCA,ENRICH cyan
    class ONB,PROTO,REPORT,APPLY dim

Full DAG and component dependency graph: references/reference.md


Features

| Feature | Detail | |---------|--------| | 3-axis failure taxonomy | Every finding classified on (WHAT, WHY, WHERE) — 10 WHAT types · 9 WHY types · 8 WHERE types | | Tier 0 static scan | Cost gate before any LLM call — pattern-match + signal count in milliseconds | | Dynamic cluster slicing | Groups traces by error / feedback / latency signal; window-based fallback when no a-priori signal | | Cap-of-5 fan-out | Never more than 5 parallel analyzers — keeps cost and complexity bounded | | Auto-extracted entity context | Every normalizer derives an EntityContext (system prompt · tool inventory with per-tool stats · input sample) at ingest — deterministic, no LLM; operator never hand-fills it. Fields > 1 KB render as collapsed ExpandableSection (system prompt always collapsed for PII) | | Gold-standard HTML report | 8-tab report — Methodology [INTERNAL] · Overview (entity card · 6-tile big-stat · 24h latency heatmap · signal census · scan funnel) · one tab per finding (taxonomy · evidence · why-chain · assumptions pills · ★-recommended remedies) · Decisions. Built by the deterministic Step 8.5 enricher; renderer is fail-loud on starved input (R1 §9.3) | | HTML report + copy-back | Human-in-the-loop review via rendered HTML; operator pastes approval markdown back into chat | | Structured contract mode | When a target ships a self-diagnosis-contract.yaml, the report becomes a structured 10-category pass/fail/pending scorecard against the declared success criteria | | Branch-hygiene apply | All local fixes land in an isolated git worktree → PR; operator's checkout is never touched | | Stale-target detection | Hash-compare before every write; re-diagnose prompt if drift detected | | Idempotent REST | Cloud-target PUTs include Idempotency-Key; safe to retry on 5xx | | Dual audit trail | Every apply emits pr-body.md + audit.json + audit.md | | Self-diagnostics | Post-session self-RCA (opt-in, default OFF); feeds skill's own transcripts through the same pipeline | | Platform-portable UX | AskUserQuestion on Claude Code; numbered chat-choice fallback on Codex / Cursor / OpenCode | | Platform-native install | pnpx @mutagent/diagnostics init auto-detects runtime and installs agent .md files |

Failure Taxonomy

WHAT  →  wrong-output · missing-output · loop · latency-spike · cost-overshoot
         format-violation · hallucination · user-complaint · low-score · missing-context

WHY   →  prompt-underspec · prompt-overspec · tool-misuse · tool-missing
         context-overflow · provider-limit · data-staleness · handoff-loss · dependency-failure

WHERE →  system-prompt · tool-definition · agent-config · routing-config
         upstream-data · provider-side · harness-side · user-input

Configuration

Config lives at <project>/.mutagent-diagnostics/config.yaml (generated by init). Secrets (API keys) live at <project>/.mutagentrc — gitignored, never committed.

# .mutagent-diagnostics/config.yaml — generated by pnpx @mutagent/diagnostics init

source:
  platform: "langfuse"           # langfuse | otel | local-jsonl | claude-code | codex
  config:
    host: "https://cloud.langfuse.com"
    # Keys: set LANGFUSE_SECRET_KEY + LANGFUSE_PUBLIC_KEY in .mutagentrc

target:
  platform: "local-claude"       # local-claude | local-codex | local-cursor | local-opencode
                                 # local-mastra | local-cloud-agent-sdk | cloud-rest
  config: {}

filters:
  time_window:
    from: "7daysAgo"
    to:   "now"
  score_below: null              # auto-scaled threshold when set
  limit: 100

ask_tool:
  runtime: "claude-code"         # claude-code | chat-multi-choice

self_diagnostics:
  enabled: false                 # default OFF

Full schema with doc strings: references/config.md


Design Principles

These principles are the audit surface for every PR. See references/principles.md for full text and scope annotations.

| # | Principle | Summary | |---|-----------|---------| | PR-001 | Tier 0 before LLM | Static scan runs before any LLM call | | PR-002 | RCA layer mandatory | Raw scores never shown; always WHAT+WHY+WHERE+evidence | | PR-003 | Read-before-write for REST | GET current state, diff, show diff, then PUT | | PR-004 | Branch hygiene | All local applies in isolated worktree → PR; operator checkout untouched | | PR-005 | Cap-of-5 fan-out | Never > 5 parallel analyzers | | PR-006 | Codebase-agnostic placeholders | Templates use {PLACEHOLDER}; no hardcoded paths | | PR-007 | Progressive disclosure | SKILL.md ≤ 500 lines; all detail in references/ | | PR-008 | Hyperlinked external docs | Links to upstream docs, not local copies | | PR-009 | Single source of truth for config | TypeBox schema in scripts/config/schema.ts | | PR-010 | Multi-choice picker UX | AskUserQuestion on Claude Code; chat fallback elsewhere | | PR-011 | Stale-target detection | Hash-compare before every write | | PR-012 | Idempotent retry for REST | Idempotency-Key header; max 2 retries | | PR-013 | Dual-emit audit | PR body + audit.json + audit.md on every apply | | PR-014 | HITL via HTML copy-back | HTML report is primary review surface | | PR-015 | Per-section rationale links | Reference docs link to the FRs they implement | | PR-016 | Adapter contract pattern | Unified interface per role (normalizer, apply-worker) | | PR-017 | Dynamic-cluster slicing | Cluster by signal when available; window-based fallback | | PR-018 | Translation grounds in evidence | Findings must cite trace message index or code line | | PR-019 | Scripts vs agent operations | Explicit Type A / B / C classification per operation | | PR-020 | Recursive whys to origin | No fixed depth; keep asking until isOrigin: true | | PR-021 | CLI install detection | which <cli> at onboarding; prompt install if missing | | PR-022 | Self-Diagnostics protocol | Post-session self-RCA; default OFF | | PR-023 | Clipboard payloads self-contained | Every remedy embeds ActionablePlan (v0.3+) | | PR-024 | Orchestration in parent session | Never delegate the loop to a coordinator sub-agent (DP-A) | | PR-025 | Self-diagnosis == client diagnosis | One RCA engine; only the subject differs (DP-B) | | PR-026 | Findings grounded inline | Read the definition or mark hypothesis-pending-source (DP-C) | | PR-027 | Declare scope; census-pick the primary signal | Disclose what was NOT assessed (DP-D) | | PR-028 | Cheap-first Tier 0 | Broad cheap checks; don't over-index on one signal (DP-E) | | PR-029 | Report IS the product surface | Template-stamp over procedural rendering (DP-F) | | PR-030 | Clipboard payloads carry full decision context | Apply-worker authors the PR without re-reading the session (DP-G, extends PR-023) | | PR-031 | Remedies declare apply-target + change-type + write-access | (DP-H) | | PR-032 | Migrations carry a SWEEP | grep + fix all refs in the same change (DP-I) | | PR-033 | Tag each finding PRODUCT/META/CORE | Self-diagnosis is a discovery lane for PRODUCT/CORE; --audience client NODE-STRIPs internal (DP-J) | | PR-034 | CLI-contract drift watch | Assert the built command string against the source-platform manual (DP-K) | | PR-035 | Fresh runs MUST LLM-deep-read | Caps bound the read, never skip it (promotes R2.1 +D1) | | PR-036 | Fix the MEASUREMENT layer | 5-trace awareness sample discovers what Tier-0 can't measure (promotes R2.2) | | PR-037 | Learn across runs | Approved findings accumulate into a class-memory library, matched first in Tier-0 (promotes R2.3 +D2) | | PR-038 | Make signal selection legible | Tier pie, selection-rule cards, mermaid selection trace (promotes R2.4) | | PR-039 | Prove the sample represents the population | 4-dim coverage proof + confidence grade (promotes R2.5) | | PR-040 | Operator-driven, not skill-driven | Single-arg NL invocation, verbatim brief preserved (promotes R2.6 +D2) | | PR-041 | Cost realism | Distrust skill-side cost tracking; cost cap INACTIVE by default (promotes D1) | | PR-042 | Preserve the verbatim operator invocation across runs | (promotes D2) | | PR-043 | Every methodology decision is recorded in runMeta as an append-only audit row | | | PR-044 | Normalize before analyze | Canonical schema (Step 3.7) before any Tier-0/LLM work | | PR-045 | Remedies carry two distinct rationales | rationale (why-real) vs whyWorks (why-the-fix-works) | | PR-046 | Feedback two-layered + fix-feedback | raw user signal → translated component-level; + fix-feedback on the solution | | PR-047 | No-code-access honesty | explicit assumption when source absent; never fabricate paths | | PR-048 | Trace Hungriness | scale the deep read with the population (tiers 100/250/500/1000) | | PR-049 | Signal detection + selection | one reconciled primary signal drives census · heatmap · funnel | | PR-050 | Inline-JS parse gate | every emitted <script> must parse (new Function); data gates can't catch dead JS | | PR-051 | Signal source-allowlist | only mechanical {loop·latency·cost} + deep-read-evidenced signals; hygiene metrics never emitted | | PR-052 | Remedy linkage | every remedy links to a target + origin; explicit diffStatus when source absent, never fabricate | | PR-053 | Methodology widgets thread into runMeta | codified threading; absence is a loud INTERNAL marker, not a blank |

PR-024..PR-033 were promoted from Wave-3 dogfood candidates (DP-A..DP-J) and operator-LOCKED in the 2026-05-30 Constitutional Review. PR-034 (DP-K) and PR-035..PR-043 were promoted from Wave-6 methodology layer (R2.1–R2.6 + D1/D2) — see references/principles.md for full text.


Project Layout

mutagent-diagnostics/
├── README.md                          ← this file
├── package.json                       ← @mutagent/diagnostics v0.1.0-alpha.0
├── tsconfig.json
├── eslint.config.js
├── docs/index.html                    ← gh-pages standalone artifact
└── .claude/skills/mutagent-diagnostics/
    ├── SKILL.md                       ← agentskills.io-compliant manifest (§0-§9)
    ├── .npmignore                     ← strips internal/ + *.test.ts on publish
    │
    ├── scripts/                       ← Type A: pure scripts (bun runtime)
    │   ├── tier0-scan.ts              ← static scan (free cost gate)
    │   ├── slicer.ts                  ← dynamic-cluster slicing
    │   ├── stale-detector.ts          ← hash compare before apply
    │   ├── cli/
    │   │   ├── init.ts                ← pnpx entrypoint
    │   │   ├── install-agents.ts      ← idempotent agent installer
    │   │   └── run.sh                 ← bun → pnpm → npm fallback selector
    │   ├── config/                    ← schema.ts · load.ts · validate.ts
    │   ├── contract/                  ← types.ts — SelfDiagnosisContract schema (structured-report mode)
    │   ├── fetch/                     ← langfuse.ts · claude-code-transcripts.sh
    │   ├── normalize/                 ← trace.ts · per-platform normalizers · platforms/entity-context.ts (Wave-5 R1.7)
    │   ├── enrich/build-render-input.ts ← Step 8.5 deterministic enricher → RenderInput (Wave-5 R1.4)
    │   ├── report/                    ← render.ts (gold-standard 8-tab HTML) · persist-selections.ts
    │   ├── lint/template-inline-js.ts ← R-007-B: reject TS in HTML scripts
    │   ├── setup/                     ← detect.ts · reconfigure.ts
    │   ├── tier0/                     ← langfuse.ts · claude-code.ts
    │   ├── validate/
    │   └── self-diagnostics/          ← [INTERNAL] probe.ts · dispatch.ts
    │
    ├── assets/
    │   ├── agents/
    │   │   ├── diagnostics-analyzer.md        ← dispatched per cluster (Step 6)
    │   │   └── diagnostics-apply-worker.md    ← spawned at apply gate (Step 11)
    │   ├── templates/
    │   │   ├── report.html.tpl        ← runtime report template (shipped)
    │   │   ├── config.yaml.tpl        ← onboarding config skeleton
    │   │   ├── pr-body.md.tpl         ← apply PR description
    │   │   ├── audit.json.tpl         ← structured audit trail
    │   │   └── audit.md.tpl           ← human-readable audit trail
    │   └── wireframes/                ← picker-UX prompts (onboarding / diagnostics)
    │
    ├── references/                    ← load-on-demand docs (not in SKILL.md)
    │   ├── reference.md               ← entry point + full DAG
    │   ├── principles.md              ← PR-001..PR-053 (audit surface)
    │   ├── operator-feedback-log.md   ← operator feedback on the report shape (Wave-5 R1.6)
    │   ├── config.md                  ← schema with doc strings
    │   ├── workflows/                 ← onboarding · orchestrator-protocol (Step 8.5) · diagnostics · apply · rca · schedule-prep
    │   ├── source-platforms/          ← per-platform fetch + filter examples
    │   ├── target-platforms/          ← per-target apply recipes
    │   └── internal/
    │       └── self-diagnostics.md    ← [INTERNAL] PR-022 playbook
    │
    ├── examples/
    │   └── sample-findings.json       ← example RCA output
    │
    └── internal/                      ← [stripped on publish — see .npmignore]
        └── templates/review/          ← dev review templates (maintainer use only)

For Skill Maintainers

Internal templates (internal/templates/review/)

Five reusable HTML templates for design reviews, status dashboards, and skill walkthroughs. These are never shipped to end users — the .npmignore at the skill root strips the entire internal/ directory on publish.

| Template | Use for | |----------|---------| | _brand-shell.html | Starting a new review/dashboard from scratch | | iteration-template.html | Multi-phase approval with per-phase Goal / Scope / Files / Risks | | review-template.html | Iterative spec lock-in with per-section approval checkboxes | | status-template.html | Progress reporting (scroll layout, PR banner, Kanban, timeline) | | skill-overview-template.html | Walking a colleague through SKILL.md §0-§9 visually |

Decision tree for which template to copy: see internal/templates/review/README.md.

Dispatch-gate pattern

The parent coding-agent session IS the orchestrator — no coordinator sub-agent. Sub-agents (diagnostics-analyzer.md, diagnostics-apply-worker.md) are dispatched by the parent session at well-defined steps (Step 6 / Step 11 of orchestrator-protocol.md). Dispatch gates prevent the *afkloop skill from auto-running diagnostics without operator review — any scheduled-diagnostics scenario must go through the HITL copy-back gate first.

Trunk-style PR model

#1047 is the long-lived trunk PR for the v0.1 → v1.0 lifecycle. Milestone commits accumulate on feat/mutagent-diagnostics-skill; squash-merge at v1.0 collapses to one main commit. Do not merge to main without operator sign-off.

Publish checklist

# Verify .npmignore strips internal/ correctly
npm pack --dry-run | grep -v "internal/"

# Lint + typecheck
bun run lint && bun run typecheck

# Test
bun test ./.claude/skills/mutagent-diagnostics/scripts/

Registry target: GitHub Packages (restricted access). Not published to public npmjs.org.


License

Proprietary. All rights reserved. See LICENSE.txt for full terms. @mutagent/diagnostics is published to GitHub Packages (restricted access) — not available on public npm registries.