claude-eval

v1.1.6


Simple evaluation tool for Claude Code. PASS/FAIL testing with a simplified LLM-as-a-judge approach.


claude-eval

License: MIT

An evaluation tool for Claude Code, using a simplified LLM-as-a-judge approach.

When changing Claude Code contexts and models:

  • How can you know if the change works as expected?
  • How can you know if the change will NOT break something else?

This tool solves those problems by enabling Eval-driven development for Claude Code.


No complex scoring or ranking — just clear PASSED ✅ / FAILED ❌ results for your evaluation criteria.

It's like TDD for AI.

Prerequisites

  • Node.js 18+ or Bun
  • Claude Code installed and configured in your project

Installation

Global Installation (Recommended)

For regular use, install claude-eval globally:

# Using npm
npm install -g claude-eval

# Using bun
bun install -g claude-eval

After global installation, you can use the claude-eval command directly and access the update functionality:

claude-eval --version
claude-eval update

One-time Usage

If you prefer not to install globally, you can run evaluations directly with npx:

npx claude-eval evals/*.yaml

Usage

# Single evaluation
claude-eval evals/say-dont-know-clear-way.yaml

# Multiple evaluations (batch)
claude-eval evals/*.yaml

# Custom concurrency
claude-eval evals/*.yaml --concurrency=3

# Check for updates
claude-eval update

# Show help
claude-eval --help
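
The README does not say how --concurrency is implemented. Presumably it caps how many evaluation files run at the same time. A minimal sketch of that idea, assuming a hypothetical helper named mapWithConcurrency (not part of claude-eval) and a positive integer limit:

```typescript
// Hypothetical helper, not claude-eval's actual code: runs `worker` over
// `items` with at most `limit` calls in flight, preserving input order.
// Assumes `limit` is a positive integer.
async function mapWithConcurrency<T, R>(
  items: T[],
  limit: number,
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array<R>(items.length);
  let next = 0; // index of the next item to hand out

  // Each runner repeatedly claims the next unprocessed index until none remain.
  async function runner(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index] as T);
    }
  }

  // Start at most `limit` runners, and never more than there are items.
  const runnerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: runnerCount }, () => runner()));
  return results;
}
```

With a limit of 3 and five evaluation files, three evaluations would start immediately and the remaining two would start as earlier ones finish.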

Evaluation File Format

Evaluation files are YAML documents with the following structure:


prompt: >
  What is the weather for today?

expected_behavior:
  - Just say you don't know in a clear way.
  - Don't give user alternatives.
  - Don't recommend user to research for the answer elsewhere.

  • prompt: The prompt you would send to Claude Code
  • expected_behavior: Array of criteria that the response should meet
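
The two fields above can be checked with a small validator. This is a sketch, not claude-eval's own validation logic, which the README does not document. It assumes a YAML parser has already turned the file into a plain object; the names EvalSpec, validateEvalSpec, and isEvalSpec are hypothetical.

```typescript
// Hypothetical shape of a parsed evaluation file; the field names come from
// the README, everything else here is an illustrative assumption.
interface EvalSpec {
  prompt: string;
  expected_behavior: string[];
}

// Returns a list of problems; an empty list means the spec looks well-formed.
function validateEvalSpec(raw: unknown): string[] {
  if (typeof raw !== "object" || raw === null || Array.isArray(raw)) {
    return ["evaluation file must be a YAML mapping"];
  }
  const errors: string[] = [];
  const spec = raw as Record<string, unknown>;

  if (typeof spec.prompt !== "string" || spec.prompt.trim() === "") {
    errors.push("prompt must be a non-empty string");
  }

  const behavior = spec.expected_behavior;
  if (!Array.isArray(behavior) || behavior.length === 0) {
    errors.push("expected_behavior must be a non-empty array");
  } else if (!behavior.every((c) => typeof c === "string" && c.trim() !== "")) {
    errors.push("every expected_behavior entry must be a non-empty string");
  }
  return errors;
}

// Type guard built on the validator, for narrowing parsed YAML to EvalSpec.
function isEvalSpec(raw: unknown): raw is EvalSpec {
  return validateEvalSpec(raw).length === 0;
}
```

Note that a YAML folded scalar (prompt: >) keeps a trailing newline, so the prompt from the example would arrive as "What is the weather for today?\n" and still pass the non-empty check.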

How It Works

  1. Parse YAML: Loads and validates the evaluation specification
  2. Query Claude: Runs the prompt against the Sonnet model in plan mode
  3. Judge Response: Evaluates the response against the expected_behavior criteria using the Haiku model
  4. Format Results: Displays the results with ✅/❌ indicators and a summary
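
The steps above can be sketched as a small pipeline. This is an illustration under stated assumptions, not the tool's source: the model calls are passed in as hypothetical query and judge functions rather than real API calls, and judging each criterion separately is a guess about how the judge step works.

```typescript
type Verdict = "PASSED" | "FAILED";

interface EvalSpec {
  prompt: string;
  expected_behavior: string[];
}

interface CriterionResult {
  criterion: string;
  verdict: Verdict;
}

// Hypothetical stand-ins for the two model calls (steps 2 and 3).
type QueryFn = (prompt: string) => Promise<string>;
type JudgeFn = (response: string, criterion: string) => Promise<boolean>;

// Steps 2 and 3: get one response, then ask the judge about each criterion.
async function runEval(
  spec: EvalSpec,
  query: QueryFn,
  judge: JudgeFn,
): Promise<CriterionResult[]> {
  const response = await query(spec.prompt);
  const results: CriterionResult[] = [];
  for (const criterion of spec.expected_behavior) {
    const met = await judge(response, criterion);
    results.push({ criterion, verdict: met ? "PASSED" : "FAILED" });
  }
  return results;
}

// Step 4: one line per criterion plus a summary line.
function formatResults(results: CriterionResult[]): string {
  const lines = results.map(
    (r) => `${r.verdict === "PASSED" ? "✅" : "❌"} ${r.criterion}`,
  );
  const passed = results.filter((r) => r.verdict === "PASSED").length;
  lines.push(`${passed}/${results.length} criteria passed`);
  return lines.join("\n");
}
```

Injecting the query and judge functions keeps the sketch testable with stubs and leaves the choice of model client open.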

Contributing

We welcome contributions! Please:

  1. Open an issue to discuss major changes before starting
  2. Follow existing code style and patterns in the codebase
  3. Add tests for new features and bug fixes
  4. Update documentation as needed
  5. Keep it simple - this tool is intentionally minimal and focused

For bug reports and feature requests, please use the GitHub Issues page.