
@dankelleher/mcp-eval

v0.8.1

A CLI to evaluate MCP server performance

Downloads

1,109

Readme

A CLI to evaluate MCP server performance

Quick start

  • Export your OpenRouter API key as the OPENROUTER_API_KEY environment variable
$ export OPENROUTER_API_KEY=<your-key>
  • Write your myserver.yml test case
test_cases:
  - name: "Open a contribution PR on Github"
    input_prompt: "I'd like to contribute to mcp-eval. I want to enable ... feature. I'll let you go ahead and implement the feature as you see fit. Open a pull request with the proposed modification once you're done."
    expected_tool_call:
      tool_name: "open-pr"
      parameters:
        branch: "new-feature"
  • Run your test suite
$ npx -y @alpic-ai/mcp-eval@latest run --url=https://mcp.github.com ./myserver.yml

For servers requiring authentication, you can pass custom headers:

$ npx -y @alpic-ai/mcp-eval@latest run --url=https://nexus.civic.com/hub/mcp -h "Authorization: Bearer ACCESS_TOKEN" ./myserver.yml
  • Et voilà 🎉!

Requirements

  • Node.js >= 22
  • A StreamableHTTP- or SSE-compatible public MCP server

Usage

$ npm install -g @dankelleher/mcp-eval
$ mcp-eval COMMAND
running command...
$ mcp-eval (--version)
@dankelleher/mcp-eval/0.8.1 darwin-arm64 node-v22.17.0
$ mcp-eval --help [COMMAND]
USAGE
  $ mcp-eval COMMAND
...

Commands

mcp-eval run TESTFILE

Run the test suite described in the provided YAML file.

USAGE
  $ mcp-eval run TESTFILE -u <value> [-a anthropic/claude] [-h <value>...]

ARGUMENTS
  TESTFILE  YAML file path containing the test suite

FLAGS
  -a, --assistant=<option>  [default: anthropic/claude] Assistant configuration to use (impacts the model and system prompt)
                            <options: anthropic/claude>
  -h, --header=<value>...   Custom headers to send with requests (format: 'Header: value')
  -u, --url=<value>         (required) URL of the MCP server

DESCRIPTION
  Run the test suite described in the provided YAML file.

EXAMPLES
  $ mcp-eval run

See code: src/commands/run.ts

Test Suite Syntax

Test suites are written in YAML. A test suite file must have a root test_cases property containing at least one test.

Each test requires:

  • name: a convenient name for your test
  • input_conversation: the conversation to send to the assistant. This can be either a single user message or a multi-turn conversation containing both assistant and user messages, and it can include tool calls that already happened earlier in the conversation. For a single user message, the shorthand input_prompt may be used instead, as in the examples below.
  • expected_tool_call: an object describing the expected tool call, with:
    • tool_name: the tool's name as advertised by the MCP server
    • parameters: the set of parameters the tool is expected to be called with. Only the specified properties are checked against the actual tool call; extra properties set by the model will not cause the test to fail.
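The partial-matching rule above can be sketched as a small comparison function. Note that matchesExpected is a hypothetical helper for illustration, not part of mcp-eval's API: every property listed in the expectation must match the actual call, while extra properties on the actual call are ignored.

```typescript
// Sketch of the partial parameter comparison described above.
// `matchesExpected` is a hypothetical helper, not mcp-eval's actual code.
type Params = Record<string, unknown>;

function matchesExpected(expected: Params, actual: Params): boolean {
  return Object.entries(expected).every(([key, want]) => {
    const got = actual[key];
    const bothObjects =
      want !== null && typeof want === "object" && !Array.isArray(want) &&
      got !== null && typeof got === "object" && !Array.isArray(got);
    if (bothObjects) {
      // Recurse so nested parameter objects are also compared key-by-key.
      return matchesExpected(want as Params, got as Params);
    }
    // Primitives and arrays are compared by value; a missing key fails.
    return JSON.stringify(want) === JSON.stringify(got);
  });
}
```

Under this rule, an actual call with parameters { branch: "new-feature", title: "Add feature" } still satisfies an expectation of { branch: "new-feature" }.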

Simple user message example

test_cases:
  - name: "Find flights from Paris to Tokyo"
    input_prompt: "I'd like to plan a trip to Tokyo, Japan. Find me a flight from Paris to Tokyo on October 3rd and returning on October 5th."
    expected_tool_call:
      tool_name: "search-flight"
      parameters:
        flyFrom: Paris
        flyTo: Tokyo
        departureDate: 03/10/2025
        returnDate: 05/10/2025

Multi-turn conversation example

test_cases:
  - name: "Create issue in frontend team for login bug"
    input_conversation:
      - role: user
        content: "I'm seeing a bug where the login button doesn't work. Can you create an issue for this?"
      - role: assistant
        content: "Sure, first let me check which team to assign the issue to. Listing your teams now."
      - role: tool
        tool_name: list_teams
        parameters: {}
        response: |
          [
            {"id": "team_123", "name": "Frontend"},
            {"id": "team_456", "name": "Backend"}
          ]
      - role: assistant
        content: "Now that I see the available teams, I'll assign the issue to the Frontend team."
    expected_tool_call:
      tool_name: "create_issue"
      parameters:
        title: "Login button doesn't work"
        description: "User reports that the login button is not functioning."
        team_id: "team_123"
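To illustrate how a multi-turn input_conversation could be replayed to the assistant, the sketch below flattens the YAML turns into a flat message list. The ConversationTurn shape mirrors the YAML fields above, but toMessages and the tool-message encoding are assumptions made for illustration, not mcp-eval's actual internals.

```typescript
// Hypothetical sketch: flatten a YAML `input_conversation` into the
// message list sent to the assistant. The encoding of replayed tool
// calls is an assumption, not mcp-eval's real wire format.
interface ConversationTurn {
  role: "user" | "assistant" | "tool";
  content?: string;
  tool_name?: string;
  parameters?: Record<string, unknown>;
  response?: string;
}

function toMessages(turns: ConversationTurn[]): { role: string; content: string }[] {
  return turns.map((turn) => {
    if (turn.role === "tool") {
      // Replay a tool call that already happened earlier in the conversation.
      return {
        role: "tool",
        content: `${turn.tool_name}(${JSON.stringify(turn.parameters ?? {})}) -> ${turn.response ?? ""}`,
      };
    }
    return { role: turn.role, content: turn.content ?? "" };
  });
}
```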