@reasoningco/cse

v0.1.0

Testing and analytics toolkit for Claude Code skills — trigger simulation, regression testing, conflict detection, and live session logging

claude-skills-evalkit

Testing and analytics toolkit for Claude Code skills. Answers "will my skill actually trigger?" before you deploy.

Install

npm install -g claude-skills-evalkit

This installs the cse CLI globally and automatically sets up the /cse slash command in Claude Code.

Or use without installing:

npx claude-skills-evalkit simulate ./my-skill/SKILL.md -p "your prompt"

The CLI is available as both claude-skills-evalkit and cse (shorthand).

/cse in Claude Code

After installing, a /cse slash command is available directly inside Claude Code. Restart Claude Code after install, then:

/cse simulate ./skills/my-skill/SKILL.md "does this trigger?"
/cse test ./skill-tests/my-suite.yaml
/cse conflicts
/cse optimize ./skills/my-skill/SKILL.md
/cse help

You can also use natural language:

/cse does my code reviewer skill trigger for "review this PR"?
/cse are there any conflicts between my skills?

To install the slash command manually (if you already had cse installed):

cse install-command

Commands

cse simulate — Test skill triggering

# Single prompt
cse simulate ./skills/my-skill/SKILL.md -p "build me a dashboard"

# Batch from file (one prompt per line)
cse simulate ./skills/my-skill/SKILL.md -f prompts.txt

# Verbose — show per-signal score breakdowns
cse simulate ./skills/my-skill/SKILL.md -p "build a form" -v

# JSON output for scripting
cse simulate ./skills/my-skill/SKILL.md -f prompts.txt --json

# Show top N results (default: 5)
cse simulate ./skills/my-skill/SKILL.md -p "create a component" --top 10

Output:

Prompt: "build me a dashboard"

  Rank  Skill                  Plugin              Confidence
  ──────────────────────────────────────────────────────────
  1  ►  frontend-design        ui-toolkit          0.84  ████████
  2     data-viz-builder       analytics           0.61  ██████
  3     chart-generator        charting            0.42  ████

  ► = your target skill
  Scored in 4.24ms
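
The ranked table above can be approximated from a list of scored skills. The sketch below is illustrative only: the `ScoredSkill` shape, `rankSkills`, and `confidenceBar` are hypothetical names, and the bar length (one segment per 0.1 of confidence, rounded) is an assumption inferred from the sample output, not documented behavior.

```typescript
// Hypothetical row shape; the tool's real result type is not documented here.
interface ScoredSkill {
  skill: string;
  plugin: string;
  confidence: number; // assumed to lie in [0, 1]
}

// Sort by descending confidence and keep the top N rows (default 5, as in --top).
function rankSkills(scored: ScoredSkill[], top = 5): ScoredSkill[] {
  return [...scored].sort((a, b) => b.confidence - a.confidence).slice(0, top);
}

// Assumption: one bar segment per 0.1 of confidence, rounded to the nearest segment.
function confidenceBar(confidence: number): string {
  return "█".repeat(Math.round(confidence * 10));
}

const ranked = rankSkills([
  { skill: "chart-generator", plugin: "charting", confidence: 0.42 },
  { skill: "frontend-design", plugin: "ui-toolkit", confidence: 0.84 },
  { skill: "data-viz-builder", plugin: "analytics", confidence: 0.61 },
]);

// 1-based rank of the target skill (0 would mean "not in the top N").
const targetRank = ranked.findIndex((s) => s.skill === "frontend-design") + 1;

ranked.forEach((row, i) => {
  const marker = row.skill === "frontend-design" ? "►" : " ";
  console.log(`${i + 1} ${marker} ${row.skill} ${row.confidence.toFixed(2)} ${confidenceBar(row.confidence)}`);
});
```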

cse test — Regression test suites

Create a YAML test suite:

# skill-tests/frontend.yaml
name: frontend-triggers
skillPath: ../skills/frontend-design/SKILL.md
cases:
  - id: dashboard
    prompt: "build me a dashboard"
    expectedTrigger: true
    minConfidence: 0.5

  - id: sql-query
    prompt: "optimize SQL queries"
    expectedTrigger: false
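
To make the suite fields concrete, here is a minimal sketch of how one case might be judged. It assumes a case passes when the observed trigger matches `expectedTrigger` and, for positive cases, the confidence is at least `minConfidence`. The `Observation` type and `evaluateCase` function are hypothetical; the README does not describe the tool's actual pass/fail logic.

```typescript
// Mirrors the YAML fields shown in the suite above.
interface TestCase {
  id: string;
  prompt: string;
  expectedTrigger: boolean;
  minConfidence?: number;
}

// Hypothetical summary of one simulation run for the target skill.
interface Observation {
  triggered: boolean;
  confidence: number;
}

interface CaseResult {
  id: string;
  passed: boolean;
  reason: string;
}

function evaluateCase(tc: TestCase, obs: Observation): CaseResult {
  // The trigger decision must match the expectation.
  if (obs.triggered !== tc.expectedTrigger) {
    return { id: tc.id, passed: false, reason: `expected trigger=${tc.expectedTrigger}, got ${obs.triggered}` };
  }
  // Assumption: minConfidence only applies to cases that are expected to trigger.
  if (tc.expectedTrigger && tc.minConfidence !== undefined && obs.confidence < tc.minConfidence) {
    return { id: tc.id, passed: false, reason: `confidence ${obs.confidence} below minimum ${tc.minConfidence}` };
  }
  return { id: tc.id, passed: true, reason: "ok" };
}

const dashboard: TestCase = { id: "dashboard", prompt: "build me a dashboard", expectedTrigger: true, minConfidence: 0.5 };
const sqlQuery: TestCase = { id: "sql-query", prompt: "optimize SQL queries", expectedTrigger: false };

console.log(evaluateCase(dashboard, { triggered: true, confidence: 0.84 }));
console.log(evaluateCase(sqlQuery, { triggered: false, confidence: 0.12 }));
```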

Run it:

# Run all suites in ./skill-tests/
cse test

# Specific suite
cse test --suite ./skill-tests/frontend.yaml

# CI mode with JUnit XML
cse test --ci --reporter junit --output results.xml

# Filter by test ID
cse test --filter "dashboard"

cse conflicts — Detect overlapping skills

# Scan all installed skills for overlaps
cse conflicts

# Set overlap threshold (0-1)
cse conflicts --threshold 0.6

# JSON output
cse conflicts --format json
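
The idea behind a threshold-based overlap scan can be sketched as a pairwise comparison of skill descriptions. This example uses a plain Jaccard coefficient over lowercase word tokens as a stand-in; the metric cse actually uses for conflicts is not specified here, and `SkillDescription`, `tokenize`, `jaccard`, and `findOverlaps` are hypothetical names.

```typescript
interface SkillDescription {
  name: string;
  description: string;
}

// Split a description into a set of lowercase alphanumeric tokens.
function tokenize(text: string): Set<string> {
  return new Set<string>(text.toLowerCase().match(/[a-z0-9]+/g) ?? []);
}

// Jaccard coefficient: |A ∩ B| / |A ∪ B|, defined as 0 for two empty sets.
function jaccard(a: Set<string>, b: Set<string>): number {
  if (a.size === 0 && b.size === 0) return 0;
  let shared = 0;
  a.forEach((token) => {
    if (b.has(token)) shared++;
  });
  return shared / (a.size + b.size - shared);
}

// Compare every unordered pair once and keep pairs at or above the threshold.
function findOverlaps(skills: SkillDescription[], threshold: number) {
  const overlaps: { a: string; b: string; score: number }[] = [];
  for (let i = 0; i < skills.length; i++) {
    for (let j = i + 1; j < skills.length; j++) {
      const score = jaccard(tokenize(skills[i].description), tokenize(skills[j].description));
      if (score >= threshold) {
        overlaps.push({ a: skills[i].name, b: skills[j].name, score });
      }
    }
  }
  return overlaps;
}

const sampleSkills: SkillDescription[] = [
  { name: "dash", description: "build a dashboard" },
  { name: "chart", description: "build a chart" },
];

console.log(findOverlaps(sampleSkills, 0.5)); // one pair, score 0.5
console.log(findOverlaps(sampleSkills, 0.6)); // no pairs
```

Raising the threshold reports fewer, stronger overlaps; lowering it surfaces weaker ones at the cost of more noise.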

cse optimize — Improve skill descriptions

# Offline analysis (precision/recall scores + suggestions)
cse optimize ./skills/my-skill/SKILL.md \
  --positive should-trigger.txt \
  --negative should-not-trigger.txt

# AI-powered rewriting via claude -p (uses your existing Claude Code session)
cse optimize ./skills/my-skill/SKILL.md \
  --positive pos.txt --negative neg.txt --api
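
For readers unfamiliar with the precision/recall scores mentioned above, the sketch below computes them with the standard definitions from a list of should-trigger and should-not-trigger prompts. The `precisionRecall` function and the keyword-based `triggers` predicate are hypothetical stand-ins; the README does not show how cse computes these scores internally.

```typescript
// precision = TP / (TP + FP): of the prompts that triggered, how many should have.
// recall    = TP / (TP + FN): of the prompts that should trigger, how many did.
function precisionRecall(
  positives: string[],
  negatives: string[],
  triggers: (prompt: string) => boolean,
): { precision: number; recall: number } {
  const truePositives = positives.filter((p) => triggers(p)).length;
  const falsePositives = negatives.filter((p) => triggers(p)).length;
  const predicted = truePositives + falsePositives;
  return {
    precision: predicted === 0 ? 0 : truePositives / predicted,
    recall: positives.length === 0 ? 0 : truePositives / positives.length,
  };
}

// Stand-in predicate for illustration only: "triggers" on the word "dashboard".
const triggers = (prompt: string): boolean => prompt.toLowerCase().includes("dashboard");

const shouldTrigger = ["build me a dashboard", "create a sales dashboard", "make an admin panel"];
const shouldNotTrigger = ["optimize SQL queries", "dashboard camera reviews"];

// TP = 2, FP = 1, FN = 1, so precision and recall are both 2/3.
const scores = precisionRecall(shouldTrigger, shouldNotTrigger, triggers);
console.log(scores.precision.toFixed(2), scores.recall.toFixed(2));
```

Low precision suggests the description is too broad (it fires on prompts in the negative file); low recall suggests it is too narrow (it misses prompts in the positive file).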

cse dry-run — Test instruction adherence

cse dry-run ./skills/my-skill/SKILL.md "build me a todo app"
cse dry-run ./skills/my-skill/SKILL.md "build me a todo app" --json

Uses claude -p (print mode), so no API key is needed. If ANTHROPIC_API_KEY is set, it falls back to the direct API.

cse log — Session analytics

cse log status              # Show logging status
cse log query --since 7d    # Query last 7 days
cse log query --group-by skill --since 30d
cse log export --format csv --output analytics.csv

cse dashboard — Analytics dashboard

cse dashboard
cse dashboard --since 30d

cse init — Initialize config

cse init

Creates .evalkitrc.json and skill-tests/example.yaml.

Programmatic API

import {
  simulateTrigger,
  batchSimulate,
  parseSkill,
  scanPlugins,
  detectConflicts,
} from 'claude-skills-evalkit';

// Parse a skill
const skill = parseSkill('./my-skill/SKILL.md');

// Scan installed plugins
const plugins = await scanPlugins();
const allSkills = plugins.flatMap(p => p.skills);

// Simulate triggering
const result = simulateTrigger('build me a dashboard', {
  targetSkill: skill,
  competingSkills: allSkills,
});
console.log(result.targetSkill?.confidence); // 0.84
console.log(result.targetSkill?.rank);       // 1

// Detect conflicts
const report = detectConflicts(allSkills, 0.5);
console.log(report.overlaps);

Configuration

Create .evalkitrc.json in your project root (or run cse init):

{
  "scoring": {
    "weights": {
      "semantic": 0.35,
      "keywordOverlap": 0.20,
      "triggerPhrase": 0.25,
      "specificity": 0.10,
      "competition": 0.10
    }
  },
  "logging": {
    "redactPrompts": true,
    "storageBackend": "sqlite"
  },
  "regression": {
    "suiteDir": "./skill-tests",
    "reporters": ["console"]
  }
}
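
The default weights above sum to 1.00. If you edit them, a quick sanity check like the one below may help keep scores on a comparable scale. This is a sketch under an assumption: the README does not say whether cse requires or enforces weights that sum to 1, and `sumWeights`, `weightsSumToOne`, and `normalizeWeights` are hypothetical helpers.

```typescript
type Weights = Record<string, number>;

function sumWeights(weights: Weights): number {
  return Object.keys(weights).reduce((sum, key) => sum + weights[key], 0);
}

// Compare with a small tolerance because decimal fractions are not exact in floating point.
function weightsSumToOne(weights: Weights, tolerance = 1e-9): boolean {
  return Math.abs(sumWeights(weights) - 1) <= tolerance;
}

// Rescale weights proportionally so they sum to 1.
function normalizeWeights(weights: Weights): Weights {
  const total = sumWeights(weights);
  if (total <= 0) throw new Error("weights must have a positive sum");
  const normalized: Weights = {};
  for (const key of Object.keys(weights)) {
    normalized[key] = weights[key] / total;
  }
  return normalized;
}

// Values copied from the example .evalkitrc.json above.
const defaultWeights: Weights = {
  semantic: 0.35,
  keywordOverlap: 0.20,
  triggerPhrase: 0.25,
  specificity: 0.10,
  competition: 0.10,
};

console.log(weightsSumToOne(defaultWeights)); // true
console.log(weightsSumToOne({ ...defaultWeights, semantic: 0.5 })); // false
```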

Claude Code Plugin

Install as a Claude Code plugin for live session logging and MCP tools (in addition to the CLI):

.claude-plugin/plugin.json  — Plugin manifest
hooks/hooks.json            — Automatic session logging
skills/claude-skills-evalkit/SKILL.md — Invoke via "test my skill"
.mcp.json                   — MCP server with simulate_trigger, detect_conflicts, analyze_description

Scoring Algorithm

Five signals are combined using calibrated weights:

| Signal | Weight | What It Measures |
|--------|--------|------------------|
| Semantic (TF-IDF) | 0.35 | Topic relevance via cosine similarity |
| Trigger Phrase | 0.25 | Match against quoted phrases in description |
| Keyword Overlap | 0.20 | IDF-weighted Jaccard coefficient |
| Specificity | 0.10 | Description quality and precision |
| Competition | 0.10 | Score gap between top candidates |
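
One plausible way to combine the five signals is a linear weighted sum, sketched below. This is an assumption: the README lists the weights but does not state the exact combination formula, and `SignalScores` and `combineSignals` are hypothetical names. Each signal is assumed to be a score in [0, 1].

```typescript
interface SignalScores {
  semantic: number;
  triggerPhrase: number;
  keywordOverlap: number;
  specificity: number;
  competition: number;
}

// Weights taken from the table above.
const WEIGHTS: SignalScores = {
  semantic: 0.35,
  triggerPhrase: 0.25,
  keywordOverlap: 0.20,
  specificity: 0.10,
  competition: 0.10,
};

// Assumed formula: confidence = sum over signals of weight * score.
function combineSignals(scores: SignalScores, weights: SignalScores = WEIGHTS): number {
  return (Object.keys(weights) as (keyof SignalScores)[]).reduce(
    (total, key) => total + weights[key] * scores[key],
    0,
  );
}

// Made-up signal values for illustration:
// 0.35*0.8 + 0.25*1.0 + 0.20*0.5 + 0.10*0.6 + 0.10*0.4
//   = 0.28 + 0.25 + 0.10 + 0.06 + 0.04 = 0.73
const confidence = combineSignals({
  semantic: 0.8,
  triggerPhrase: 1.0,
  keywordOverlap: 0.5,
  specificity: 0.6,
  competition: 0.4,
});
console.log(confidence.toFixed(2)); // "0.73"
```

Because the weights sum to 1 and each signal is bounded by 1, the combined score under this formula also stays within [0, 1].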

Security

  • API keys: environment variables only (ANTHROPIC_API_KEY), never in config
  • Prompts: SHA-256 hashed by default, never stored in plaintext
  • Storage: database files created with 0600 permissions
  • No telemetry, no phone-home, no eval()
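
To illustrate what storing a SHA-256 hash instead of the prompt text looks like, here is a minimal sketch using Node's built-in crypto module. The `hashPrompt` helper is hypothetical; the tool's own redaction code is not shown in this README, and details such as salting or truncation are unknown.

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper: returns the hex-encoded SHA-256 digest of a prompt.
// The digest is 64 hex characters regardless of prompt length.
function hashPrompt(prompt: string): string {
  return createHash("sha256").update(prompt, "utf8").digest("hex");
}

// Identical prompts map to identical digests, so hashed logs can still be
// grouped and counted without keeping the original text.
const digest = hashPrompt("build me a dashboard");
console.log(digest.length); // 64
console.log(digest === hashPrompt("build me a dashboard")); // true
```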

License

MIT