mcp-fit

v0.1.1

MCP server agent-usability scorer — scores and auto-fixes tool descriptions

Score MCP servers for agent-usability — then auto-fix them.

Plenty of tools let you expose an MCP server. None tell you whether it is actually agent-friendly: clean namespacing, strict params, lean typed outputs, helpful errors, low tool-selection confusion. mcp-fit does.

It connects to a target MCP server, scores it across five contract-usability axes, runs real agent tasks against it, and — in fix mode — rewrites the server's tool and parameter descriptions to measurably raise that score, proving the gain with a before/after delta.

The scorecard axes are the provider-side dual of the RubricRefine tool-use contract taxonomy (arXiv 2605.09730): namespacing, tool-selection-confusion, param-strictness, output-leanness, error-helpfulness.

Quickstart

To scan your own server, no clone is needed: run npx mcp-fit scan -- <your-server-command>.

The walkthrough below scores the bundled strawman server (a deliberately bad in-memory note store). It uses a fixture that ships in the repo, so it is clone-based:

# 1. Clone and install
git clone <repo-url> mcp-fit && cd mcp-fit
npm install

# 2. Install strawman dependencies
cd fixtures/strawman-server && npm install && cd ../..

# 3. Build mcp-fit
npm run build

# 4. Scan the strawman — renders a scorecard and writes compat.json + evals.jsonl
node dist/cli.js scan \
  --out ./out \
  -- fixtures/strawman-server/node_modules/.bin/tsx fixtures/strawman-server/server.ts

Expected output (lint-only — the deterministic badge scores only the axes static lint can verify; behavioural axes are eval-only):

┌────────────────────────────────────────────────────────────┐
│  mcp-fit scorecard · strawman v0.1.0 (stdio)               │
├────────────────────────────────────────────────────────────┤
│  Axis                             Score   Grade Findings   │
├────────────────────────────────────────────────────────────┤
│  namespacing                      9  /10  A     0err 1warn │
│  tool-selection-confusion         —  /10  ·     eval-only  │
│  param-strictness                 1  /10  F     7err 0warn │
│  output-leanness                  —  /10  ·     eval-only  │
│  error-helpfulness                —  /10  ·     eval-only  │
├────────────────────────────────────────────────────────────┤
│  LINT SCORE (deterministic)   5.6 / 10                     │
│  WEIGHTED AGGREGATE           5.6 / 10  [grade: C]         │
└────────────────────────────────────────────────────────────┘

The axes marked — are eval-only: static lint cannot grade runtime output shape, error quality, or tool-selection confusion, so the deterministic badge does not claim a verdict on them. Run scan --eval (needs ANTHROPIC_API_KEY) to score them stochastically.

Keyless red→green (no API key)

fixtures/strawman-fixed-server is the strawman with clean contracts. Scan both and compare the deterministic LINT SCORE — a reproducible before/after with no LLM call:

# bad: 5.6 / 10  (param-strictness F)
node bin/mcp-fit scan -- fixtures/strawman-server/node_modules/.bin/tsx fixtures/strawman-server/server.ts
# fixed: 10 / 10  (A)
node bin/mcp-fit scan -- fixtures/strawman-fixed-server/node_modules/.bin/tsx fixtures/strawman-fixed-server/server.ts

See sample-artifacts/ for a pre-generated compat.json and evals.jsonl from the strawman run.

Auto-fix mode

Generate improved descriptions and show the before/after delta:

node dist/cli.js fix \
  --out ./out \
  -- fixtures/strawman-server/node_modules/.bin/tsx fixtures/strawman-server/server.ts

Note: fix calls the Claude API. Set ANTHROPIC_API_KEY in your environment, or copy .env.example to .env first.
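The before/after delta can be pictured as a per-axis subtraction of two scorecards. The sketch below is illustrative only: the AxisScores shape and the scoreDelta helper are hypothetical and are not taken from mcp-fit's source or from the compat.json schema. Unscored (eval-only) axes are modelled as null, and the "after" numbers are invented for the example.

```typescript
// Hypothetical shape: axis name -> score in 1..10, or null when the axis was not scored.
type AxisScores = Record<string, number | null>;

// Per-axis delta (after - before). An axis that is unscored or missing on either side yields null.
function scoreDelta(before: AxisScores, after: AxisScores): Record<string, number | null> {
  const delta: Record<string, number | null> = {};
  for (const axis of Object.keys(before)) {
    const b = before[axis];
    const a = after[axis];
    delta[axis] = typeof a === "number" && typeof b === "number" ? a - b : null;
  }
  return delta;
}

// "before" mirrors the strawman scorecard; "after" values are made up.
const before: AxisScores = { "namespacing": 9, "param-strictness": 1, "output-leanness": null };
const after: AxisScores = { "namespacing": 10, "param-strictness": 8, "output-leanness": null };

const delta = scoreDelta(before, after);
console.log(delta);
```

Keeping null distinct from 0 avoids reporting "no change" for an axis that was never measured.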

SSE / HTTP transport

node dist/cli.js scan --sse http://localhost:3001/sse
node dist/cli.js fix  --sse http://localhost:3001/sse --out ./out

After npm link or npm install -g mcp-fit

mcp-fit scan -- node my-server.js
mcp-fit fix  --out ./results -- npx -y @my-org/my-server
mcp-fit help

CLI reference

mcp-fit scan [--out <dir>] -- <command> [args...]
mcp-fit scan [--out <dir>] --sse <url>
mcp-fit fix  [--out <dir>] -- <command> [args...]
mcp-fit fix  [--out <dir>] --sse <url>
mcp-fit help

| Option | Default | Description |
|--------|---------|-------------|
| --out <dir> | . | Directory for emitted artifacts |
| --sse <url> | — | SSE transport URL (instead of -- cmd) |

Scorecard axes

| Axis | Lineage | Measures |
|------|---------|----------|
| namespacing | tool-choice | tools distinguishable; documented path obvious |
| tool-selection-confusion | tool-choice | overlapping / ambiguous tools that mislead selection |
| param-strictness | call-signature | unambiguous signatures; clear required args |
| output-leanness | output-contract | typed values vs labeled prose / token bloat |
| error-helpfulness | provider-only | errors that guide recovery vs opaque failures |
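To make the param-strictness axis concrete, here is a minimal sketch of the kind of static check it suggests. The rule set, the lintParams helper, and the Finding shape are assumptions made for illustration; they are not mcp-fit's actual lint rules. The ToolDef type covers only the name, description, and inputSchema fields of an MCP tool definition.

```typescript
// Minimal subset of an MCP tool definition: name, description, JSON Schema input.
interface ToolDef {
  name: string;
  description?: string;
  inputSchema: {
    type: "object";
    properties?: Record<string, { type?: string; description?: string }>;
    required?: string[];
  };
}

// Hypothetical finding shape, not mcp-fit's compat.json format.
interface Finding {
  tool: string;
  param: string;
  message: string;
}

// Hypothetical rule: every parameter declares a type and a description,
// and every name listed in `required` is a declared property.
function lintParams(tool: ToolDef): Finding[] {
  const findings: Finding[] = [];
  const props = tool.inputSchema.properties ?? {};
  for (const [param, schema] of Object.entries(props)) {
    if (!schema.type) findings.push({ tool: tool.name, param, message: "missing type" });
    if (!schema.description) findings.push({ tool: tool.name, param, message: "missing description" });
  }
  for (const param of tool.inputSchema.required ?? []) {
    if (!(param in props)) findings.push({ tool: tool.name, param, message: "required but not declared" });
  }
  return findings;
}

// A loosely specified tool and a strictly specified one (both invented).
const loose: ToolDef = {
  name: "save",
  inputSchema: { type: "object", properties: { data: {} }, required: ["id"] },
};
const strict: ToolDef = {
  name: "notes_create",
  description: "Create a note.",
  inputSchema: {
    type: "object",
    properties: { title: { type: "string", description: "Note title, plain text." } },
    required: ["title"],
  },
};

const looseFindings = lintParams(loose);
const strictFindings = lintParams(strict);
console.log(looseFindings.length, strictFindings.length);
```

Because a check like this only reads the declared schema, it gives the same result on every run, which is what lets a lint score be deterministic.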

Scores are 1–10 ordinal (10 = trivially correct; 1–4 = very easy to get wrong). The lint score is deterministic and badge-able. The eval score (stochastic, requires --eval) is reported with variance.
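Reporting a stochastic score with variance usually means running the evaluation several times and summarising the spread. The sketch below shows one conventional way to do that, using the sample variance with an n - 1 denominator. How mcp-fit aggregates its eval runs is not documented here, so treat the meanAndVariance helper and the sample numbers as assumptions.

```typescript
// Mean and sample variance (n - 1 denominator) of repeated scores.
function meanAndVariance(scores: number[]): { mean: number; variance: number } {
  if (scores.length === 0) throw new Error("need at least one score");
  const mean = scores.reduce((sum, s) => sum + s, 0) / scores.length;
  // A single run has no measurable spread; report zero rather than dividing by zero.
  if (scores.length === 1) return { mean, variance: 0 };
  const squared = scores.reduce((sum, s) => sum + (s - mean) * (s - mean), 0);
  return { mean, variance: squared / (scores.length - 1) };
}

// Made-up scores from four hypothetical eval runs of one axis.
const runs = [6, 8, 7, 7];
const stats = meanAndVariance(runs);
console.log(stats.mean, stats.variance.toFixed(3));
```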

Artifacts

| File | Schema | Content |
|------|--------|---------|
| compat.json | schemas/compat.schema.json | Full scorecard (all axes, findings, aggregate) |
| evals.jsonl | schemas/evals.schema.json | Per-task agent traces (one JSON object per line) |
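Since evals.jsonl holds one JSON object per line, it can be read with a line-by-line parser. The sketch below is a generic JSON Lines reader, not code from mcp-fit. The field names in the sample string are invented; the real ones are defined by schemas/evals.schema.json.

```typescript
// Parse JSON Lines text: one JSON value per non-empty line.
function parseJsonl(text: string): unknown[] {
  const records: unknown[] = [];
  text.split(/\r?\n/).forEach((line, index) => {
    const trimmed = line.trim();
    if (trimmed === "") return; // tolerate blank lines and a trailing newline
    try {
      records.push(JSON.parse(trimmed));
    } catch (err) {
      // Report the 1-based line number so a broken trace is easy to locate.
      throw new Error(`invalid JSON on line ${index + 1}: ${(err as Error).message}`);
    }
  });
  return records;
}

// Invented sample content; real traces follow schemas/evals.schema.json.
const sample = '{"task":"create-note","ok":true}\n{"task":"find-note","ok":false}\n';
const traces = parseJsonl(sample);
console.log(traces.length);
```

In practice the text would come from reading out/evals.jsonl from disk. The records are returned as unknown so that callers validate them against the schema before using any fields.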

Development

npm run typecheck   # tsc --noEmit
npm run build       # compile src/ → dist/
npm test            # vitest run

Security

  • ANTHROPIC_API_KEY — required only for fix mode and eval. Never committed; load from .env (gitignored).
  • mcp-fit spawns and queries servers only with your consent; it never auto-runs an untrusted server without an explicit command.

Architecture

src/connect/   MCP client, transports, introspect, proxy   B-001
src/lint/      deterministic rule engine + rules            B-002
fixtures/      strawman bad server + task corpus            B-003
src/report/    artifact emitter + schema validation         B-004
src/eval/      dynamic-eval runner (Claude SDK harness)     B-005
src/score/     scorer + contract-rubric loop                B-006
src/fix/       description rewriter + re-validate + delta   B-007
src/cli.ts     CLI entry point (this bead)                  B-008

Source of truth: specs/mcp-fit/spec.md · Implementation plan: plan.md · Issue tracking: tasks.md

License

Apache-2.0 — see LICENSE.