@avinashchby/promptbench

v0.1.1

The testing framework for AI coding assistant configuration files

Downloads

30

promptbench

The testing framework for AI coding assistant configuration files.

You spend hours writing your CLAUDE.md and .cursorrules. But does your AI assistant actually follow them? Does it use pnpm like you told it to? Does it avoid `any` types? You don't know until you waste tokens finding out.

promptbench analyzes your AI config files for quality, consistency, and completeness — then lets you write test cases to verify expected behavior.

Install

npm install -g promptbench
# or
npx promptbench

Quick Start

# Audit your CLAUDE.md
npx promptbench --audit

# Generate a test file
npx promptbench init

# Run tests
npx promptbench

How It Works

Mode 1: Config Analysis (no LLM needed)

Analyzes your config file itself for quality issues:

npx promptbench --audit
┌──────────────────────────────────────────────────┐
│  CLAUDE.md Audit Report                          │
├──────────────────────────────────────────────────┤
│  Overall Score: 72/100                           │
├──────────────────────────────────────────────────┤
│  ✅ Has project overview                         │
│  ✅ Has build commands                           │
│  ✅ Has coding style rules                       │
│  ⚠️  Missing testing section                     │
│  ⚠️  No error handling rules                     │
│  ❌ Contradicting rules found:                   │
│     Line 12: "use tabs"                          │
│     Line 45: "use 2 spaces"                      │
│  ❌ 3 vague instructions found                   │
│     Line 23: "write good code"                   │
│     Line 67: "be careful"                        │
├──────────────────────────────────────────────────┤
│  2 errors, 2 warnings, 3 info                    │
└──────────────────────────────────────────────────┘

Detectors:

  • Contradictions — finds conflicting rules (tabs vs spaces, npm vs pnpm)
  • Vagueness — flags weasel words ("best practices", "be careful", "write good code")
  • Completeness — checks for recommended sections (overview, build commands, testing, etc.)
  • Specificity — scores how actionable your instructions are
  • Metrics — line count, token estimate, section analysis
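To make the first two detectors concrete, here is a minimal sketch of how vagueness and contradiction checks could work on a config file's text. It is not promptbench's actual implementation: the names `Finding`, `findVagueLines`, `findContradictions`, and `EXCLUSIVE_GROUPS` are hypothetical, and the phrase and rule lists are limited to the examples mentioned above.

```typescript
// Hypothetical sketch: none of these names come from promptbench itself.
interface Finding {
  line: number; // 1-based line number in the config file
  text: string; // the matched phrase or rule
}

// Phrases taken from the Vagueness bullet above.
const VAGUE_PHRASES = ["best practices", "be careful", "write good code"];

// Flag every line that contains one of the vague phrases (case-insensitive).
function findVagueLines(config: string): Finding[] {
  const findings: Finding[] = [];
  config.split("\n").forEach((raw, index) => {
    const lower = raw.toLowerCase();
    for (const phrase of VAGUE_PHRASES) {
      if (lower.includes(phrase)) {
        findings.push({ line: index + 1, text: phrase });
      }
    }
  });
  return findings;
}

// Groups of mutually exclusive choices. A config that matches more than one
// pattern in the same group is reported as contradictory.
const EXCLUSIVE_GROUPS: RegExp[][] = [
  [/\buse tabs\b/i, /\buse \d+ spaces\b/i],
  [/\buse npm\b/i, /\buse pnpm\b/i],
];

// For each group, record the first line matching each pattern and report
// the group only when at least two different patterns matched.
function findContradictions(config: string): Finding[][] {
  const lines = config.split("\n");
  const conflicts: Finding[][] = [];
  for (const group of EXCLUSIVE_GROUPS) {
    const hits: Finding[] = [];
    for (const pattern of group) {
      for (let i = 0; i < lines.length; i++) {
        const text = lines[i] ?? "";
        if (pattern.test(text)) {
          hits.push({ line: i + 1, text: text.trim() });
          break;
        }
      }
    }
    if (hits.length > 1) conflicts.push(hits);
  }
  return conflicts;
}
```

Keeping the line number with each finding is what allows a report like the one above to point at "Line 12" and "Line 45" rather than only stating that a conflict exists.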

Mode 2: Behavioral Simulation (optional, needs API key)

Test if your config actually produces expected AI behavior:

export ANTHROPIC_API_KEY=sk-...
npx promptbench --simulate

Test File Format

Create .promptbench.yml in your project root:

config: ./CLAUDE.md

tests:
  - name: "Uses pnpm not npm"
    scenario: "Install a new package"
    expect:
      contains: ["pnpm add", "pnpm install"]
      not_contains: ["npm install", "yarn add"]

  - name: "No any types in TypeScript"
    scenario: "Create a TypeScript function"
    expect:
      not_contains: ["any"]
      contains: ["interface", "type"]

  - name: "Uses conventional commits"
    scenario: "Commit changes"
    expect:
      pattern: "^(feat|fix|chore|docs|refactor|test)"

  - name: "Has build commands"
    check: "config_contains"
    expect:
      config_has: ["npm run build", "npm run test"]

  - name: "Config not too long"
    check: "config_metrics"
    expect:
      max_lines: 500
      max_tokens: 8000

  - name: "No contradictions"
    check: "config_consistency"
    expect:
      no_contradictions: true
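The `contains`, `not_contains`, and `pattern` keys above could be evaluated against a model response roughly as follows. This is a sketch under stated assumptions, not promptbench's documented behavior: it assumes `contains` passes when any one entry appears, `not_contains` passes when no entry appears, and `pattern` is a regular expression source. The names `Expectation` and `checkResponse` are hypothetical.

```typescript
// Hypothetical type mirroring the `expect` keys used in the YAML above.
interface Expectation {
  contains?: string[];     // assumed: passes if ANY entry appears
  not_contains?: string[]; // assumed: passes if NO entry appears
  pattern?: string;        // assumed: a regular expression source
}

// Return a list of human-readable failures; an empty list means the test passed.
function checkResponse(response: string, expectation: Expectation): string[] {
  const failures: string[] = [];

  const wanted = expectation.contains;
  if (wanted && !wanted.some((s) => response.includes(s))) {
    failures.push(`expected one of: ${wanted.join(", ")}`);
  }

  for (const banned of expectation.not_contains ?? []) {
    if (response.includes(banned)) {
      failures.push(`found banned text: ${banned}`);
    }
  }

  const pattern = expectation.pattern;
  if (pattern !== undefined && !new RegExp(pattern).test(response)) {
    failures.push(`did not match pattern: ${pattern}`);
  }

  return failures;
}
```

One caveat applies to plain substring matching. The text "pnpm install" contains "npm install", and a word such as "company" contains "any". Under this sketch, the first two sample tests would therefore reject some acceptable responses, so a real matcher would probably need word-boundary or token-level checks.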

CLI

npx promptbench                        # Run all tests in .promptbench.yml
npx promptbench --config CLAUDE.md     # Analyze a specific config
npx promptbench --audit                # Full quality audit report
npx promptbench --score                # 0-100 quality score
npx promptbench --fix                  # Suggest improvements
npx promptbench --simulate             # Behavioral tests (needs API key)
npx promptbench --ci                   # CI mode: JSON output, exit 1 on failure
npx promptbench --format json          # JSON output
npx promptbench --format markdown      # Markdown report
npx promptbench init                   # Generate sample .promptbench.yml

CI Integration

# .github/workflows/promptbench.yml
name: Config Quality
on: [push, pull_request]
jobs:
  check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
      - run: npx promptbench --ci --min-score 70
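The gating logic behind a step like this can be sketched as a small function that turns a score and a threshold into JSON output plus an exit code. The `CiResult` shape and the `gate` function are hypothetical, since the real JSON schema is not described here. The sketch also assumes that a score equal to the threshold passes.

```typescript
// Hypothetical result shape; promptbench's real JSON output may differ.
interface CiResult {
  score: number;
  minScore: number;
  passed: boolean;
}

// Build the JSON payload and the process exit code for a CI run.
// Assumption: a score equal to the threshold counts as passing.
function gate(score: number, minScore: number): { json: string; exitCode: number } {
  const result: CiResult = { score, minScore, passed: score >= minScore };
  return { json: JSON.stringify(result), exitCode: result.passed ? 0 : 1 };
}
```

A non-zero exit code is what makes the GitHub Actions step fail, which in turn blocks the push or pull request check.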

Programmatic API

import { audit, analyze, parseConfig } from 'promptbench';

// One-step audit of a config file
const report = await audit('./CLAUDE.md');
console.log(report.score); // e.g. 85

// Or parse the file first, then analyze the parsed config
const config = parseConfig('./CLAUDE.md');
const analysis = analyze(config);

Supported Config Files

Auto-detected in order:

  1. CLAUDE.md
  2. .cursorrules
  3. .windsurfrules
  4. .github/copilot-instructions.md
  5. codex-instructions.md
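"Auto-detected in order" suggests a first-match lookup over the list above. A minimal sketch, assuming the paths are resolved relative to the project root; `CANDIDATES` and `detectConfig` are hypothetical names, not part of the package's API.

```typescript
// Candidate paths in the order listed above.
const CANDIDATES: string[] = [
  "CLAUDE.md",
  ".cursorrules",
  ".windsurfrules",
  ".github/copilot-instructions.md",
  "codex-instructions.md",
];

// Return the first candidate for which `exists` reports true, or undefined
// when none is present. `exists` is injected so the lookup can be exercised
// without touching the disk.
function detectConfig(exists: (path: string) => boolean): string | undefined {
  return CANDIDATES.find((candidate) => exists(candidate));
}
```

In Node, passing `existsSync` from `node:fs` as the `exists` argument would check the current working directory. Under this sketch, a project with both CLAUDE.md and .cursorrules would resolve to CLAUDE.md.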

vs "Just Hoping It Works"

| | Without promptbench | With promptbench |
|---|---|---|
| Config quality | Unknown | Scored 0-100 |
| Contradictions | Found after wasting tokens | Caught instantly |
| Vague rules | "It should work..." | Flagged with suggestions |
| Missing sections | Noticed weeks later | Detected immediately |
| CI enforcement | None | Exit code on failure |

License

MIT