npm package discovery and stats viewer.

Discover Tips

  • General search

    [free text search, go nuts!]

  • Package details

    pkg:[package-name]

  • User packages

    @[username]

Sponsor

Optimize Toolset

I’ve always been into building performant and accessible sites, but lately I’ve been taking it extremely seriously. So much so that I’ve been building a tool to help me optimize and monitor the sites that I build to make sure that I’m making an attempt to offer the best experience to those who visit them. If you’re into performant, accessible and SEO friendly sites, you might like it too! You can check it out at Optimize Toolset.

About

Hi, 👋, I’m Ryan Hefner  and I built this site for me, and you! The goal of this site was to provide an easy way for me to check the stats on my npm packages, both for prioritizing issues and updates, and to give me a little kick in the pants to keep up on stuff.

As I was building it, I realized that I was actually using the tool to build the tool, and figured I might as well put this out there and hopefully others will find it to be a fast and useful way to search and browse npm packages as I have.

If you’re interested in other things I’m working on, follow me on Twitter or check out the open source projects I’ve been publishing on GitHub.

I am also working on a Twitter bot for this site to tweet the most popular, newest, random packages from npm. Please follow that account now and it will start sending out packages soon–ish.

Open Software & Tools

This site wouldn’t be possible without the immense generosity and tireless efforts from the people who make contributions to the world and share their work via open source initiatives. Thank you 🙏

© 2026 – Pkg Stats / Ryan Hefner

claudemd-check

v0.1.0

Published

Regression tests for your CLAUDE.md. Probe each rule with real headless agent runs, get per-rule adherence rates, and catch the edit (or model update) that silently broke rule 7.

Readme

claudemd-check

Regression tests for your CLAUDE.md.

Your instructions file says "always use pnpm", "never commit to main", "every new file ends with a license header" — but do you actually know which rules your agent obeys, how often, and whether last week's edit (or a model update) silently broke rule 7?

npm node deps license


Every Claude Code user has a CLAUDE.md. Almost nobody tests it. Rules accumulate, contradict, decay — and the only feedback loop is vibes. There are eval harnesses for skills (skillgrade, promptfoo, Anthropic's skill-creator), but the instructions file itself — the thing every session loads — has no test runner.

claudemd-check is that runner. You write scenario probes (a prompt + the deterministic footprint an adherent run leaves behind), it executes each one N times against real headless Claude Code sessions in throwaway workspaces, and reports per-rule adherence rates — diffed against a saved baseline, wired for CI.

claudemd-check v0.1.0 · CLAUDE.md · 2 scenarios × 2 runs · model haiku

verified-trailer — New text files end with VERIFIED
  ✓ run 1 adhered
  ✓ run 2 adhered
  adherence 100% (2/2, required 100%)

no-delicious — Never use the word delicious
  ✗ run 1: pie.txt contains forbidden /delicious/
  ✓ run 2 adhered
  adherence 50% (1/2, required 100%)

vs baseline:
  ▼ no-delicious: 100% → 50%

1/2 scenarios met their adherence bar.

That ▼ 100% → 50% line is the product: the CLAUDE.md edit you made this morning just lost you a rule, and you found out from a red CI check instead of three weeks of subtle damage.

Install

npm install -g claudemd-check     # or npx claudemd-check

Requires Claude Code (claude on your PATH) and Node ≥ 18. Zero npm dependencies.

Cost honesty: every run is a real agent session billed to your Anthropic account/subscription. Write scenarios with --runs 1 --model haiku, then raise runs for statistically meaningful rates. Small scenarios on haiku cost on the order of a cent each.

Writing scenarios

claudemd.test.json, next to the CLAUDE.md it tests (full example):

{
  "runsPerScenario": 3,
  "model": "haiku",
  "allowedTools": ["Write", "Edit", "Read", "Bash"],
  "scenarios": [
    {
      "id": "always-pnpm",
      "rule": "Always use pnpm, never npm",
      "prompt": "add lodash to this project",
      "files": { "package.json": "{\"name\":\"demo\"}" },
      "assert": [
        { "type": "transcript_match", "pattern": "pnpm (add|install)" },
        { "type": "transcript_not_match", "pattern": "\\bnpm (install|i)\\b" },
        { "type": "file_absent", "path": "package-lock.json" }
      ]
    }
  ]
}

Each run gets a fresh temp workspace (inline files and/or a fixture directory, plus the CLAUDE.md under test), so runs never contaminate each other — or your repo.

Assertions (all deterministic — no LLM judges, no flaky grading):

| Type | Checks | | --- | --- | | transcript_match / transcript_not_match | Regex over the agent's own actions — its text and tool calls (commands run, files written). Deliberately excludes the rule text itself and tool results, so "never say X" isn't false-flagged by the agent reading the rule. | | file_exists / file_absent | Workspace state after the run. | | file_contains / file_not_contains | Regex over a produced file. | | command | Any shell command, run in the workspace; exit 0 = pass. The escape hatch that can verify anything. |

CI

- run: npx claudemd-check            # exits 1 if any rule is below its bar
  env:
    ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}

Save a baseline once (claudemd-check --save-baseline, commit it), and every subsequent run prints per-rule deltas — so a CLAUDE.md edit or a new model release that drops "never touch main" from 100% to 60% fails the build, with the rule named.

Flags

--spec <file> · --claudemd <file> · --runs N · --model <name> · --save-baseline · --baseline <file> · --json

Scope

claudemd-check tests instruction adherence in Claude Code. It does not test skill triggering (use skillgrade or promptfoo for SKILL.md files) and does not grade output quality — it checks the deterministic footprint of rule-following, which is what you can actually regress on. Hooks and slash-command coverage are the roadmap.

License

MIT