
@inferwise/mcp

v1.0.0

MCP server for Inferwise — gives AI agents tools to estimate LLM costs, suggest cheaper models, and audit codebases for cost optimization.

Works with any AI tool that supports the Model Context Protocol: Claude Code, Cursor, VS Code (1.99+), Windsurf, Cline, and more.

Setup

Claude Code

claude mcp add inferwise -- npx -y @inferwise/mcp

Cursor / VS Code / Windsurf

Add to your MCP settings (.cursor/mcp.json, .vscode/mcp.json, etc.):

{
  "mcpServers": {
    "inferwise": {
      "command": "npx",
      "args": ["-y", "@inferwise/mcp"]
    }
  }
}

Cline

Add to Cline MCP settings:

{
  "mcpServers": {
    "inferwise": {
      "command": "npx",
      "args": ["-y", "@inferwise/mcp"]
    }
  }
}

Tools

Once connected, the AI agent gets four tools:

suggest_model

Suggest the cheapest LLM model capable of handling a given task. Analyzes the task description using keyword-based pattern matching to infer required capabilities (code, reasoning, general, creative, vision, search, audio), then finds the cheapest model across all providers that has those capabilities. If no specific capabilities are detected, defaults to general.

Input:

  • task (string, required) — Description of what you want the LLM to do
  • provider (string, optional) — Restrict to a specific provider (anthropic, openai, google, xai, perplexity)
  • maxCostPerMillionTokens (number, optional) — Maximum acceptable cost per million output tokens (USD)

Returns: Recommended model with pricing, up to 3 cheaper alternatives, inferred capabilities, and reasoning.

Example: Agent asks "classify support tickets by category" → Inferwise infers ["general"] capability → returns gpt-4o-mini at $0.60/MTok instead of gpt-4o at $10/MTok.

See the main repo for details on capability inference and cross-provider ranking.
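Over MCP, a tool invocation is a JSON-RPC `tools/call` request. A `suggest_model` call for the ticket-classification example above might look like this sketch (argument values are illustrative):

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "suggest_model",
    "arguments": {
      "task": "classify support tickets by category",
      "maxCostPerMillionTokens": 1.0
    }
  }
}
```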

estimate_cost

Estimate the cost of an LLM API call given provider, model, and token counts. Optionally projects monthly costs based on daily request volume.

Input:

  • provider (string, required) — LLM provider
  • model (string, required) — Model ID (e.g., claude-sonnet-4-6, gpt-4o)
  • inputTokens (number, required) — Number of input tokens
  • outputTokens (number, required) — Number of output tokens
  • requestsPerDay (number, optional) — Daily volume for monthly projection
  • useBatch (boolean, optional) — Use Batch API pricing
  • useCache (boolean, optional) — Assume cache-hit pricing

Returns: Cost per call, monthly projection, and pricing breakdown.
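The per-call figure is straight per-token arithmetic: each token count divided by one million, multiplied by the model's input or output rate. A sketch of the tool arguments with illustrative values:

```json
{
  "provider": "openai",
  "model": "gpt-4o",
  "inputTokens": 2000,
  "outputTokens": 500,
  "requestsPerDay": 10000
}
```

At gpt-4o's published rates ($2.50 per million input tokens and $10 per million output tokens, at the time of writing), this comes to $0.005 + $0.005 = $0.01 per call, or about $3,000/month at 10,000 requests per day, assuming a 30-day month.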

audit

Scan a directory for LLM API calls, estimate costs, and suggest cheaper capable models for each call site.

Input:

  • directory (string, required) — Absolute path to the directory to scan
  • volume (number, optional) — Requests per day for monthly projection (default: 1000)

Returns: Per-call-site cost estimates, total monthly cost, unknown models, and smart model recommendations with savings percentages.

apply_recommendations

Apply model swap recommendations to source files. Replaces expensive model IDs with cheaper alternatives in-place. Can auto-detect recommendations via audit, or accept explicit swaps.

Input:

  • directory (string, required) — Absolute path to the project directory
  • volume (number, optional) — Requests per day for monthly projection (default: 1000)
  • dryRun (boolean, optional) — Preview changes without modifying files (default: false)
  • recommendations (array, optional) — Explicit swaps to apply. Each item has file, line, currentModel, suggestedModel. If omitted, runs audit and applies all recommendations.

Returns: Applied swaps, skipped swaps (with reasons), and estimated monthly savings.

Example flow:

  1. Agent calls audit → gets recommendations
  2. Agent calls apply_recommendations with those recommendations → files are rewritten
  3. Agent commits the changes

Or in one step: call apply_recommendations with just { directory: "." } — runs audit + applies all fixes automatically.
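A sketch of an explicit payload, using the documented `apply_recommendations` fields (the path, line number, and model IDs are illustrative):

```json
{
  "directory": "/home/user/project",
  "dryRun": true,
  "recommendations": [
    {
      "file": "src/llm.ts",
      "line": 12,
      "currentModel": "gpt-4o",
      "suggestedModel": "gpt-4o-mini"
    }
  ]
}
```

With `dryRun: true`, the swap is previewed but `src/llm.ts` is left untouched.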

How It Works

The MCP server runs locally as a subprocess — no hosted infrastructure, no API keys needed for basic usage. It communicates via stdio using the MCP protocol (JSON-RPC over stdin/stdout).
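Concretely, after the standard MCP initialize handshake, the host discovers the tools with a `tools/list` request and invokes them with `tools/call`, all as JSON-RPC 2.0 messages over the pipe:

```json
{ "jsonrpc": "2.0", "id": 2, "method": "tools/list" }
```

The server replies with definitions for the four tools described above.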


Non-MCP Alternatives

If your tool doesn't support MCP, use the CLI directly:

# JSON output for programmatic consumption
inferwise audit . --format json
inferwise price openai gpt-4o --format json

// SDK for embedding in pipelines (JavaScript)
import { estimate } from "inferwise/sdk";

License

Apache 2.0