skillscore

v2.0.2

A CLI tool that evaluates AI agent skills and produces quality scores

The universal quality standard for AI agent skills. Evaluate any SKILL.md — from skills.sh, ClawHub, GitHub, or your local machine.


✨ Features

  • 🎯 Comprehensive Evaluation: 7 Anthropic-aligned scoring categories with weighted importance
  • 🎨 Multiple Output Formats: Terminal (colorful), JSON, and Markdown reports
  • 🔍 Deterministic Analysis: Reliable, reproducible scoring without requiring API keys
  • 📋 Detailed Feedback: Specific findings and actionable recommendations
  • Fast & Reliable: Built with TypeScript for speed and reliability
  • 🌍 Cross-Platform: Works on Windows, macOS, and Linux
  • 🐙 GitHub Integration: Score skills directly from GitHub repositories
  • 📊 Batch Mode: Compare multiple skills with a summary table
  • 🗣️ Verbose Mode: See all findings, not just truncated summaries

📦 Installation

Global Installation (Recommended)

npm install -g skillscore

Local Installation

npm install skillscore
npx skillscore ./my-skill/

From Source

git clone https://github.com/joeynyc/skillscore.git
cd skillscore
npm install
npm run build
npm link

🚀 Quick Start

Evaluate a skill directory:

skillscore ./my-skill/

📖 Usage Examples

Basic Usage

# Evaluate a skill
skillscore ./skills/my-skill/

# Evaluate with verbose output (shows all findings)
skillscore ./skills/my-skill/ --verbose

GitHub Integration

# Full GitHub URL (always recognized)
skillscore https://github.com/FrancyJGLisboa/agent-skill-creator

# GitHub shorthand (requires -g/--github flag)
skillscore -g FrancyJGLisboa/agent-skill-creator

Output Formats

# JSON output
skillscore ./skills/my-skill/ --json

# Markdown report
skillscore ./skills/my-skill/ --markdown

# Save to file
skillscore ./skills/my-skill/ --output report.md
skillscore ./skills/my-skill/ --json --output score.json

Batch Mode

# Compare multiple skills (auto-enters batch mode)
skillscore ./skill1 ./skill2 ./skill3

# Explicit batch mode flag
skillscore ./skill1 ./skill2 --batch

# Compare GitHub skills
skillscore -g user/repo1/skill1 user/repo2/skill2 --json

Utility Commands

# Show version
skillscore --version

# Get help
skillscore --help

📊 Example Output

Terminal Output

📊 SKILLSCORE EVALUATION REPORT
============================================================

📋 Skill: weather-fetcher
   Fetches current weather data for any city when the user asks for forecasts or conditions.
   Path: ./weather-skill

🎯 OVERALL SCORE
   A- - 92.0% (9.2/10.0 points)

📝 CATEGORY BREAKDOWN
------------------------------------------------------------
Identity & Metadata ████████████████████ 100.0%
   YAML frontmatter with valid name/description, proper format, not vague
   Score: 10/10 (weight: 20%)
   ✓ Frontmatter name: "weather-fetcher" (+2)
   ✓ Name format valid (lowercase-hyphen, ≤64 chars) (+2)
   ✓ Frontmatter description present (+2)
   ... 3 more findings

Clarity & Instructions ██████████████████░░ 90.0%
   Workflow steps, consistent terminology, templates/examples, degrees of freedom
   Score: 9/10 (weight: 15%)
   ✓ Has structured workflow steps (numbered lists or checklists) (+3)
   ✓ Consistent terminology throughout (+2)
   ✓ 4 code blocks with templates/examples (+2)
   ... 2 more findings (use --verbose to see all)

Safety & Security ██████████████░░░░░░ 70.0%
   No destructive commands without confirmation, no secret exfil, no privilege escalation
   Score: 7/10 (weight: 15%)
   ✓ No dangerous destructive commands found (+3)
   ✓ No secret exfiltration risk detected (+2)
   ⚠ Privilege escalation with justification: sudo (+1)

📈 SUMMARY
------------------------------------------------------------
✅ Strengths: Identity & Metadata, Conciseness, Clarity & Instructions, Routing & Scope
❌ Areas for improvement: Safety & Security

Generated: 3/13/2026, 1:37:51 AM

Batch Mode Output

📊 BATCH SKILL EVALUATION
Evaluating 3 skill(s)...

[1/3] Processing: ./weather-skill
✅ Completed

[2/3] Processing: ./file-backup
✅ Completed

[3/3] Processing: user/repo/skill
✅ Completed

📋 COMPARISON SUMMARY

Skill                          Grade  Score    Identity Routing Safety Status
weather-fetcher                A-     92.0%    100%     100%    70%    OK
file-backup                    B+     87.0%    90%      80%     90%    OK
data-processor                 A      94.0%    100%     100%    85%    OK

📈 BATCH SUMMARY
✅ Successful: 3
📊 Average Score: 91.0%

🏆 Scoring System

SkillScore evaluates skills across 7 weighted categories aligned with Anthropic's official skill documentation:

| Category | Weight | Description |
|----------|--------|-------------|
| Identity & Metadata | 20% | YAML frontmatter name/description, lowercase-hyphen format, not vague |
| Conciseness | 15% | Body ≤500 lines, progressive disclosure, no over-explaining basics |
| Clarity & Instructions | 15% | Workflow steps, consistent terminology, templates/examples, degrees of freedom |
| Routing & Scope | 15% | WHAT+WHEN description, negative routing, domain vocabulary, third-person voice |
| Robustness | 10% | Error handling in code blocks, validation steps, dependency verification |
| Safety & Security | 15% | No destructive commands, proximity-based secret exfil detection, no privilege escalation |
| Portability & Standards | 10% | No platform-specific paths, MCP tool format, no time-sensitive info, relative paths |
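
As an illustration of how the weights combine, the following TypeScript sketch computes an overall percentage as a weight-averaged category score. The weights are copied from the table above; the averaging formula and the `overallPercent` helper are assumptions made for illustration, not SkillScore's actual implementation.

```typescript
// Weights copied from the category table; they sum to 100.
const WEIGHTS: Readonly<Record<string, number>> = {
  "Identity & Metadata": 20,
  "Conciseness": 15,
  "Clarity & Instructions": 15,
  "Routing & Scope": 15,
  "Robustness": 10,
  "Safety & Security": 15,
  "Portability & Standards": 10,
};

// Assumption: the overall score is the weighted average of the
// per-category scores (each 0-10), expressed as a percentage.
function overallPercent(scores: Readonly<Record<string, number>>): number {
  let weighted = 0;
  let totalWeight = 0;
  for (const [category, weight] of Object.entries(WEIGHTS)) {
    const score = scores[category];
    if (score === undefined || score < 0 || score > 10) {
      throw new RangeError(`Expected a 0-10 score for "${category}"`);
    }
    weighted += score * weight;
    totalWeight += weight;
  }
  // (weighted / totalWeight) is the 0-10 average; times 10 gives a percentage.
  return (weighted * 10) / totalWeight;
}

// Hypothetical category scores, chosen only to exercise the formula.
const exampleScores: Record<string, number> = {
  "Identity & Metadata": 10,
  "Conciseness": 10,
  "Clarity & Instructions": 9,
  "Routing & Scope": 10,
  "Robustness": 9,
  "Safety & Security": 7,
  "Portability & Standards": 9,
};

console.log(overallPercent(exampleScores)); // 92
```

Under this assumed formula, a low score in a 15% category such as Safety & Security costs more than the same drop in a 10% category such as Robustness.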

Scoring Methodology

Each category is scored from 0-10 points based on specific criteria:

  • Identity & Metadata: Validates YAML frontmatter name (lowercase-hyphen, ≤64 chars, no reserved words) and description (≤1024 chars, third person, no XML tags), rejects vague names/descriptions
  • Conciseness: Enforces the 500-line body limit, checks for progressive disclosure via file references, flags over-explaining basics Claude already knows
  • Clarity & Instructions: Checks for numbered steps or checklists, consistent terminology (no synonym pairs used interchangeably), code block examples, and a mix of imperative ("must") and flexible ("consider") guidance
  • Routing & Scope: Validates description has action verbs + trigger conditions, negative routing examples ("don't use when..."), domain-specific vocabulary, and third-person voice
  • Robustness: Scans code blocks for error handling (try/catch, ||, set -e), validates dependency verification commands (--version, command -v), flags magic constants
  • Safety & Security: Proximity-based secret exfil detection (secrets + network within 5 lines), destructive command scanning with confirmation check, privilege escalation detection, unbounded loop detection
  • Portability & Standards: Flags Windows-style paths, hardcoded absolute paths, validates MCP tool ServerName:tool_name format, detects time-sensitive info (dates, pinned versions)
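
The proximity idea behind the Safety & Security check can be sketched as follows. This is a minimal TypeScript illustration, assuming that "within 5 lines" means the line numbers differ by at most 5; the two regular expressions and the `findExfilRisks` helper are hypothetical and are not the patterns SkillScore ships with.

```typescript
// Hypothetical patterns, for illustration only.
const SECRET_PATTERN = /API_KEY|SECRET|TOKEN|PASSWORD/i;
const NETWORK_PATTERN = /\bcurl\b|\bwget\b|https?:\/\//i;

interface ExfilRisk {
  secretLine: number;  // 1-based line number mentioning a secret
  networkLine: number; // 1-based line number with network activity
}

// Reports every secret/network pair whose lines are at most `window` apart.
function findExfilRisks(content: string, window = 5): ExfilRisk[] {
  const secretLines: number[] = [];
  const networkLines: number[] = [];

  content.split(/\r?\n/).forEach((line, index) => {
    if (SECRET_PATTERN.test(line)) secretLines.push(index + 1);
    if (NETWORK_PATTERN.test(line)) networkLines.push(index + 1);
  });

  const risks: ExfilRisk[] = [];
  for (const secretLine of secretLines) {
    for (const networkLine of networkLines) {
      if (Math.abs(secretLine - networkLine) <= window) {
        risks.push({ secretLine, networkLine });
      }
    }
  }
  return risks;
}

const near = ["export API_KEY=abc", "echo start", "curl example.invalid"].join("\n");
console.log(findExfilRisks(near).length); // 1
```

A secret that appears far from any network call is not reported, which is the point of the proximity window: it cuts down on false positives compared with flagging every secret-like token in the file.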

v2.0.0: Anthropic-Aligned Rubric

Complete scoring redesign replacing the original 8 generic categories with 7 categories aligned to Anthropic's official skill documentation:

| Change | Details |
|--------|---------|
| Frontmatter validation | Skills must have YAML frontmatter with name and description fields |
| Name format checks | Names must be lowercase-hyphen (^[a-z0-9][a-z0-9-]*$), ≤64 chars, no reserved words |
| Conciseness scoring | New category enforcing 500-line limit, progressive disclosure, no over-explaining |
| Third-person detection | Descriptions should use third-person voice, not "I/We/My" |
| Proximity-based exfil | Secret + network pattern detection within 5-line proximity windows |
| MCP format validation | MCP tool references must use ServerName:tool_name format |
| Time-sensitive detection | Flags specific dates, "as of", and pinned version references |
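
The name format and third-person rules in the table can be approximated with a few lines of TypeScript. The regular expression and the 64-character limit are taken from the table; the `checkName` and `usesFirstPerson` helpers, the segment-based reserved-word match, and the caller-supplied reserved-word list are assumptions, since the README does not list the reserved words or describe how they are matched.

```typescript
const NAME_PATTERN = /^[a-z0-9][a-z0-9-]*$/; // from the table above
const MAX_NAME_LENGTH = 64;                  // from the table above
const FIRST_PERSON = /\b(?:I|we|my)\b/i;     // crude check for "I/We/My"

// Returns a list of problems; an empty list means the name passes.
// `reservedWords` is supplied by the caller because the actual list
// is not documented here.
function checkName(name: string, reservedWords: readonly string[] = []): string[] {
  const problems: string[] = [];
  if (!NAME_PATTERN.test(name)) {
    problems.push("not in lowercase-hyphen format");
  }
  if (name.length > MAX_NAME_LENGTH) {
    problems.push(`longer than ${MAX_NAME_LENGTH} characters`);
  }
  // Assumption: a reserved word counts only as a whole hyphen-separated segment.
  const segments = name.split("-");
  for (const word of reservedWords) {
    if (segments.includes(word)) {
      problems.push(`contains reserved word "${word}"`);
    }
  }
  return problems;
}

// True when the description contains a first-person pronoun.
function usesFirstPerson(description: string): boolean {
  return FIRST_PERSON.test(description);
}

console.log(checkName("weather-fetcher"));           // []
console.log(usesFirstPerson("I fetch the weather")); // true
```

The first-person check is deliberately simple and would also flag a standalone "i", so a production rule would likely need more context than a single regular expression.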

Grade Scale

| Grade | Score Range | Description |
|-------|-------------|-------------|
| A+ | 97-100% | Exceptional quality |
| A | 93-96% | Excellent |
| A- | 90-92% | Very good |
| B+ | 87-89% | Good |
| B | 83-86% | Above average |
| B- | 80-82% | Satisfactory |
| C+ | 77-79% | Acceptable |
| C | 73-76% | Fair |
| C- | 70-72% | Needs improvement |
| D+ | 67-69% | Poor |
| D | 65-66% | Very poor |
| D- | 60-64% | Failing |
| F | 0-59% | Unacceptable |
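
The grade scale can be read as a lookup on lower bounds. The sketch below is one way to express it in TypeScript; the `letterGrade` helper is hypothetical, and it assumes that a fractional score belongs to the band whose lower bound it meets (so 92.5% stays an A-), which the table does not state explicitly.

```typescript
// Lower bound of each band from the grade scale, highest first.
const GRADE_FLOORS: ReadonlyArray<readonly [number, string]> = [
  [97, "A+"], [93, "A"], [90, "A-"],
  [87, "B+"], [83, "B"], [80, "B-"],
  [77, "C+"], [73, "C"], [70, "C-"],
  [67, "D+"], [65, "D"], [60, "D-"],
];

// Maps a 0-100 percentage to a letter grade.
function letterGrade(percent: number): string {
  if (!Number.isFinite(percent) || percent < 0 || percent > 100) {
    throw new RangeError("percent must be between 0 and 100");
  }
  for (const [floor, grade] of GRADE_FLOORS) {
    if (percent >= floor) return grade;
  }
  return "F"; // everything below 60
}

console.log(letterGrade(92));   // A-
console.log(letterGrade(83.5)); // B
console.log(letterGrade(52));   // F
```

These results agree with the grades shown in the example output (92.0% as A-) and in the Example Scores section (83.5% as B, 52% as F).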

📁 What Makes a Good Skill?

Required Structure

my-skill/
├── SKILL.md           # Main skill definition (REQUIRED)
├── README.md          # Documentation (recommended)
├── package.json       # Dependencies (if applicable)
├── scripts/           # Executable scripts
│   ├── setup.sh
│   └── main.py
└── examples/          # Usage examples
    └── example.md

SKILL.md Template

---
name: my-awesome-skill
description: Performs [specific task] when the user needs to [trigger condition].
---

# My Awesome Skill

Performs [specific task] using [specific tools/inputs].

## When to Use

Use this skill when you need to [specific task] with [specific tools/inputs].

## When NOT to Use

Don't use this skill when:
- The task is [alternative scenario] — use [other skill] instead
- You need [different capability]

## Dependencies

- Tool 1: Installation instructions (`tool --version` to verify)
- API Key: How to obtain and configure
- Environment: OS requirements

## Workflow

1. Step-by-step instructions
2. Specific commands to run
3. Expected outputs

- [ ] Verify dependencies
- [ ] Confirm configuration

## Output

Results are written to `./output/` as JSON files.

## Error Handling

You must always validate output. Consider retrying on transient failures.

```bash
if ! result=$(./scripts/main.py --input "data"); then
  echo "Error: processing failed"
  exit 1
fi
```

## Examples

### Example Output

```json
{
  "status": "success",
  "result": "Example of what the skill produces"
}
```

## Limitations

- Known constraints
- Platform-specific notes
- Edge cases

See docs/advanced.md for more details.


🔧 API Usage

Use SkillScore programmatically in your Node.js projects:

```typescript
import { SkillParser, SkillScorer, TerminalReporter } from 'skillscore';
import type { Reporter, SkillScore } from 'skillscore';

const parser = new SkillParser();
const scorer = new SkillScorer();
const reporter: Reporter = new TerminalReporter();

async function evaluateSkill(skillPath: string): Promise<SkillScore> {
  const skill = await parser.parseSkill(skillPath);
  const score = await scorer.scoreSkill(skill);
  const report = reporter.generateReport(score);

  console.log(report);
  return score;
}
```

All three reporters (TerminalReporter, JsonReporter, MarkdownReporter) implement the Reporter interface.

ParsedSkill Fields (v2.0)

The parser now extracts additional metadata used by the new scoring rubric:

interface ParsedSkill {
  // Existing fields
  skillPath: string;
  skillMdExists: boolean;
  skillMdContent: string;
  name: string;
  description: string;
  files: string[];
  metadata: Record<string, unknown>;
  structure: FileStructure;

  // New in v2.0
  frontmatter: Record<string, unknown>;    // YAML frontmatter (same ref as metadata)
  bodyContent: string;                      // SKILL.md after stripping frontmatter
  bodyLineCount: number;                    // Line count of body
  nameSource: 'frontmatter' | 'heading' | 'fallback';
  descriptionSource: 'frontmatter' | 'inline' | 'inferred' | 'none';
  referencedFiles: string[];               // Markdown links extracted from SKILL.md
}
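
To make the `bodyContent` and `bodyLineCount` fields concrete, here is a minimal TypeScript sketch of separating YAML frontmatter from the body of a SKILL.md file. The `splitFrontmatter` helper and its `SplitResult` type are hypothetical, the frontmatter is returned as raw text rather than parsed YAML, and the line-counting convention (a single trailing newline is ignored) is an assumption; the parser in skillscore may behave differently.

```typescript
interface SplitResult {
  frontmatterText: string | null; // raw YAML between the --- markers, if any
  bodyContent: string;            // everything after the closing ---
  bodyLineCount: number;          // number of lines in bodyContent
}

function splitFrontmatter(skillMd: string): SplitResult {
  // Normalize Windows line endings so the pattern below stays simple.
  const normalized = skillMd.replace(/\r\n/g, "\n");

  // Frontmatter must start on the first line and end with a --- line.
  const match = /^---\n([\s\S]*?)\n---\n?/.exec(normalized);
  const frontmatterText = match?.[1] ?? null;
  const bodyContent = match ? normalized.slice(match[0].length) : normalized;

  // Ignore one trailing newline so "a\nb\n" counts as two lines.
  const trimmed = bodyContent.replace(/\n$/, "");
  const bodyLineCount = trimmed === "" ? 0 : trimmed.split("\n").length;

  return { frontmatterText, bodyContent, bodyLineCount };
}

const sample = "---\nname: my-awesome-skill\n---\n# My Awesome Skill\nBody text\n";
console.log(splitFrontmatter(sample).bodyLineCount); // 2
```

A count like this is what a check such as the 500-line Conciseness limit would compare against, since that limit applies to the body rather than to the whole file.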

🛠️ CLI Options

Usage: skillscore [options] <path...>

Arguments:
  path                   Path(s) to skill directory, GitHub URL, or shorthand

Options:
  -V, --version         Output the version number
  -j, --json            Output in JSON format
  -m, --markdown        Output in Markdown format
  -o, --output <file>   Write output to file
  -v, --verbose         Show ALL findings (not just truncated)
  -b, --batch           Batch mode for comparing multiple skills
  -g, --github          Treat shorthand paths as GitHub repos (user/repo/path)
  -h, --help            Display help for command

🧪 Testing

# Run all tests (watch mode)
npm test

# Run tests once
npm run test:run

# Lint code
npm run lint

# Build project
npm run build

🤝 Contributing

We welcome contributions! Here's how to get started:

Development Setup

git clone https://github.com/joeynyc/skillscore.git
cd skillscore
npm install
npm run build
npm link

# Run in development mode
npm run dev ./test-skill/

# Build for production
npm run build

Running Tests

npm test

Submitting Changes

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Make your changes
  4. Add tests for new functionality
  5. Ensure all tests pass (npm test)
  6. Lint your code (npm run lint)
  7. Commit your changes (git commit -m 'Add amazing feature')
  8. Push to the branch (git push origin feature/amazing-feature)
  9. Open a Pull Request

Coding Standards

  • Use TypeScript for all new code
  • Follow existing code style (enforced by ESLint)
  • Add tests for new features
  • Update documentation for API changes
  • Keep commits focused and descriptive

🐛 Troubleshooting

Common Issues

Error: "Path does not exist"

  • Check for typos in the path
  • Ensure you have permission to read the directory
  • Verify the path points to a directory, not a file

Error: "No SKILL.md file found"

  • Skills must contain a SKILL.md file
  • Check if you're pointing to the right directory
  • The file must be named exactly "SKILL.md"

Error: "Git is not available"

  • Install Git to clone GitHub repositories
  • macOS: xcode-select --install
  • Ubuntu: sudo apt-get install git
  • Windows: Download from git-scm.com

Scores seem too high/low

  • Scoring is calibrated against real-world skills
  • See the scoring methodology above
  • Consider the specific criteria for each category

Getting Help

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

  • Inspired by the need for quality assessment in AI agent skills
  • Built for the OpenClaw and Claude Code communities
  • Thanks to all contributors and skill creators
  • Scoring methodology aligned with Anthropic's official skill documentation

📊 Example Scores

Real-world skills scored with SkillScore v2.0:

  • FrancyJGLisboa/agent-skill-creator: 83.5% (B) - Perfect identity & robustness, needs negative routing and trimming (617 lines)
  • gapmiss/obsidian-plugin-skill: 52% (F) - No frontmatter, weak routing signals, missing structured workflow
  • skill-creator (local): 86% (B) - Strong identity & conciseness (353 lines, 6 file refs), needs error handling in code blocks