toolbudget

v0.1.1

Know what your MCP server's tool surface costs an agent — in tokens and tool-selection accuracy.

Every tool you expose over MCP is re-sent to the model on every call: its name, its description, and its JSON schema. That "tool surface" silently eats your context window and degrades tool-selection accuracy. toolbudget estimates the total token weight of your surface, flags the rule violations that make it worse, and gives you a single 0–100 score you can gate on in CI.

It leads with the metric that actually matters — token cost per call and its effect on agent performance — not generic schema linting.

$ npx toolbudget --stdio "npx -y @modelcontextprotocol/server-everything"

toolbudget
Score 82/100  |  13 tools  |  ~1015 tokens/call (est.)

Heaviest tools
  gzip-file-as-resource   200 tok (20%)
  simulate-research-query 127 tok (13%)
  get-annotated-message    99 tok (10%)
  ...

That's ~1015 tokens spent on tool definitions before the agent has done any work — on every turn.
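To make the per-call cost concrete, here is a minimal sketch of how a tool surface could be weighed. It is not toolbudget's implementation: the `ToolDef` shape, the function names, and the 4-characters-per-token heuristic are all assumptions for illustration, and a real tokenizer will give different numbers. The percentage share is each tool's tokens divided by the total and rounded, which matches the figures in the run above (200 of 1015 rounds to 20%).

```typescript
// Hypothetical shape of one tool entry in an MCP tools/list result.
interface ToolDef {
  name: string;
  description?: string;
  inputSchema: Record<string, unknown>;
}

// ASSUMPTION: ~4 characters per token, a rough rule of thumb.
// toolbudget's actual token estimate may be computed differently.
const CHARS_PER_TOKEN = 4;

// Estimate the tokens one tool definition costs when serialized as JSON.
function estimateTokens(tool: ToolDef): number {
  return Math.ceil(JSON.stringify(tool).length / CHARS_PER_TOKEN);
}

interface SurfaceReport {
  total: number;
  heaviest: { name: string; tokens: number; share: number }[];
}

// Sum the surface and rank tools from heaviest to lightest.
function surfaceReport(tools: ToolDef[]): SurfaceReport {
  const perTool = tools.map((t) => ({ name: t.name, tokens: estimateTokens(t) }));
  const total = perTool.reduce((sum, t) => sum + t.tokens, 0);
  const heaviest = perTool
    .sort((a, b) => b.tokens - a.tokens)
    .map((t) => ({
      ...t,
      share: total === 0 ? 0 : Math.round((t.tokens / total) * 100),
    }));
  return { total, heaviest };
}

// Illustrative tools only; not taken from any real server.
const demoTools: ToolDef[] = [
  {
    name: "echo",
    description: "Echoes back the input string",
    inputSchema: { type: "object", properties: { message: { type: "string" } } },
  },
  {
    name: "get-sum",
    description: "Returns the sum of two numbers. Use it for simple arithmetic.",
    inputSchema: {
      type: "object",
      properties: { a: { type: "number" }, b: { type: "number" } },
      required: ["a", "b"],
    },
  },
];

const report = surfaceReport(demoTools);
for (const t of report.heaviest) {
  console.log(`${t.name}  ~${t.tokens} tok (${t.share}%)`);
}
```

Because the estimate runs over the serialized definition, a verbose description and a deeply nested schema both raise a tool's weight.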

Install

No install required:

npx toolbudget --input tools.json

Or add it to a project:

npm install -D toolbudget

Requires Node 22+.

Usage

Three ways to point toolbudget at a tool surface:

# 1. A captured tools/list JSON file
toolbudget --input tools.json

# 2. Launch a stdio MCP server and introspect it live
toolbudget --stdio "npx -y @modelcontextprotocol/server-filesystem /tmp"

# 3. Connect to a Streamable HTTP MCP server
toolbudget --url https://your-host/mcp

Three output formats — pretty (default), json, markdown:

toolbudget --stdio "node server.js" --format markdown
toolbudget --input tools.json --format json

Gate it in CI — exits non-zero when the score drops below --min-score (default 80) or any error-level finding is present:

toolbudget --input tools.json --ci --min-score 80
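The gating rule described above can be sketched as a pure function. This is an illustration, not toolbudget's code: the `Finding` type and `ciExitCode` name are hypothetical, and returning 1 is just one possible non-zero exit code.

```typescript
type Severity = "error" | "warn" | "info";

// Hypothetical finding shape, reduced to what the gate needs.
interface Finding {
  rule: string;
  severity: Severity;
}

// Fail when the score is below the threshold or any error-level finding exists.
// ASSUMPTION: 1 stands in for "non-zero"; the real exit code is not documented here.
function ciExitCode(score: number, findings: Finding[], minScore = 80): number {
  const hasError = findings.some((f) => f.severity === "error");
  return score < minScore || hasError ? 1 : 0;
}

// A score of 82 with only warnings passes the default gate.
console.log(ciExitCode(82, [{ rule: "tool/description-too-short", severity: "warn" }]));
// A single error-level finding fails the gate even with a high score.
console.log(ciExitCode(95, [{ rule: "tool/missing-description", severity: "error" }]));
```

Note that warnings alone never fail the gate directly; they only matter through their effect on the score.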

Override budgets per run with --max-tools <n> and --token-budget <n>.

Example report

Live run against the official @modelcontextprotocol/server-everything reference server (--format markdown):

# toolbudget report

- **Score:** 82/100
- **Tools:** 13
- **Surface cost:** ~1015 tokens/call (est.)

## Heaviest tools
- `gzip-file-as-resource` — 200 tokens (20%)
- `simulate-research-query` — 127 tokens (13%)
- `get-annotated-message` — 99 tokens (10%)
- `get-resource-reference` — 82 tokens (8%)
- `trigger-long-running-operation` — 82 tokens (8%)
- `get-resource-links` — 77 tokens (8%)
- `get-structured-content` — 71 tokens (7%)
- `get-sum` — 66 tokens (7%)
- `echo` — 53 tokens (5%)
- `toggle-simulated-logging` — 43 tokens (4%)

## Findings
- **[warn]** `tool/description-too-short` (`echo`): Description is 5 words (<12). Add purpose + when to use it.
- **[warn]** `tool/description-too-short` (`get-annotated-message`): Description is 11 words (<12). Add purpose + when to use it.
- **[warn]** `tool/description-too-short` (`get-env`): Description is 10 words (<12). Add purpose + when to use it.
- **[warn]** `tool/param-missing-description` (`get-resource-reference`): Parameters lack descriptions: resourceType.

Rules

| Rule | What it catches |
| --- | --- |
| surface/too-many-tools | More tools than the budget; large surfaces hurt tool-selection accuracy. |
| surface/token-budget | Total tool-surface token cost over budget. |
| tool/missing-description | A tool with no description; agents can't choose what they can't read. |
| tool/description-too-short | Description too terse to convey purpose and when to use it. |
| tool/description-too-long | Bloated description burning tokens on every call. |
| tool/unclear-name | Non-descriptive name (e.g. a, do, tool1); rename to verb_object. |
| tool/param-missing-description | Parameters with no description. |
| tool/schema-too-large | A single tool's input schema is oversized. |
| tool/schema-deeply-nested | Schema nests deeper than is reliable for models to fill. |
| tool/freeform-should-enum | Free-text param that should be a constrained enum. |
| tool/no-examples | Oversized description (≥120 words) with no usage example to anchor correct calls. |
| redundancy/near-duplicate-tools | Two tools that look near-identical; merge to cut surface and ambiguity. |
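As an illustration of what a single rule might look like, here is a sketch of tool/description-too-short. The 12-word threshold and the message wording come from the example report; everything else is an assumption: the `Tool` and `Finding` types, the whitespace-based word count, and the choice to leave empty descriptions to tool/missing-description.

```typescript
// Hypothetical minimal shapes for a tool and a finding.
interface Tool {
  name: string;
  description?: string;
}

interface Finding {
  rule: string;
  severity: "warn";
  tool: string;
  message: string;
}

// ASSUMPTION: words are whitespace-separated runs; the real rule may count differently.
function countWords(text: string): number {
  const trimmed = text.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

// Returns a finding when the description is shorter than minWords, else null.
function descriptionTooShort(tool: Tool, minWords = 12): Finding | null {
  // ASSUMPTION: a missing or empty description belongs to tool/missing-description.
  if (tool.description === undefined || tool.description.trim() === "") return null;
  const words = countWords(tool.description);
  if (words >= minWords) return null;
  return {
    rule: "tool/description-too-short",
    severity: "warn",
    tool: tool.name,
    message: `Description is ${words} words (<${minWords}). Add purpose + when to use it.`,
  };
}

// Illustrative input only.
const finding = descriptionTooShort({ name: "echo", description: "Echoes back the input string" });
console.log(finding?.message);
```

Each rule in the table can be thought of the same way: a small check over one tool, or over the whole surface, that either returns a finding or stays silent.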

Free vs Pro

| | Free | Pro |
| --- | --- | --- |
| Full audit + token metric + 0–100 score | ✓ | ✓ |
| All reporters (pretty / json / markdown) | ✓ | ✓ |
| Basic --ci gating | ✓ | ✓ |
| --fix codemods (auto-trim & rewrite) (roadmap) | | ✓ |
| Custom rules + per-rule severity | | ✓ |
| --baseline drift tracking | | ✓ |
| SARIF reporter | | ✓ |

Pro is a one-time license: $9 for individuals, $99 for a team / commercial license (launch pricing). No subscription: buy once, own it. Pro is coming soon; the store isn't open yet.

How the score works

The score starts at 100 and subtracts a weighted penalty for every finding (error 10, warn 4, info 1). The penalty is normalized by the number of tools, so a large server isn't doomed simply for having more tools — it's judged on the quality of its surface. A score of ≥80 is healthy; lower means real, fixable cost in tokens or agent reliability.
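The weighted penalty is easy to sketch; the normalization is not, because the exact formula isn't given here. In the sketch below the weights (error 10, warn 4, info 1) come from the description above, while the per-tool division and the `scale` factor are assumptions chosen only to show the idea. It should not be expected to reproduce toolbudget's scores, including the 82/100 in the example report.

```typescript
type Severity = "error" | "warn" | "info";

// Weights as stated in the scoring description.
const WEIGHTS: Record<Severity, number> = { error: 10, warn: 4, info: 1 };

// Sum of weights over all findings, before any normalization.
function rawPenalty(severities: Severity[]): number {
  return severities.reduce((sum, s) => sum + WEIGHTS[s], 0);
}

// ASSUMPTION: normalize as (penalty per tool) * scale. Both the formula and
// the default scale of 10 are hypothetical, not toolbudget's actual math.
function sketchScore(severities: Severity[], toolCount: number, scale = 10): number {
  if (toolCount <= 0) return 100;
  const normalized = (rawPenalty(severities) / toolCount) * scale;
  return Math.max(0, Math.min(100, Math.round(100 - normalized)));
}

const fourWarns: Severity[] = ["warn", "warn", "warn", "warn"];
console.log(rawPenalty(fourWarns));
// The same findings cost a small server more than a large one.
console.log(sketchScore(fourWarns, 4), sketchScore(fourWarns, 13));
```

The point of normalizing is visible in the last line: four warnings spread across 13 tools hurt the score less than the same four warnings on a 4-tool server.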


Brought to you by Pancratic.