mcp-token-optimizer

v0.1.1

Published

5 days ago

MCP server that cuts LLM token costs: accurate token counting, cost estimates across GPT/Claude/Gemini, rule-based prompt slimming with measured savings, and cheapest-model recommendations. Works with Claude, Cursor, and any MCP client.

Vulnerabilities: 0 high, 0 medium, 0 low

rileycraig

mcp model-context-protocol llm tokens token-counter cost-optimization prompt-compression openai anthropic claude gpt gemini ai llmops

mcp-token-optimizer

An MCP server that cuts your LLM token costs. Give your AI assistant (Claude, Cursor, or any MCP client) the ability to measure, price, and shrink prompts before they cost you money.

Teams routinely overspend 60–80% on LLM tokens, both by sending bloated prompts and by using a pricier model than the task needs. This server adds four tools that put those savings one call away. The optimization itself makes no LLM call, so it is free and instant.

Tools

| Tool | What it does |
|---|---|
| count_tokens | Token count for any text, plus input cost across common models (or one model). |
| estimate_cost | Per-call and monthly/yearly spend for a prompt, model, output size, and call volume. |
| slim_prompt | Safely compresses a prompt (shortens verbose phrases, drops filler, dedupes lines, normalizes whitespace) and reports tokens and dollars saved. |
| compare_model_costs | Costs the same prompt across many models and recommends the cheapest one. |
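To make the rule-based approach behind slim_prompt concrete, here is a minimal sketch of the four kinds of rules the table names. It is not the package's implementation: the phrase table, the filler list, and the function names `slimPrompt` and `wordCount` are all hypothetical, and the word count is only a rough stand-in for a real token count.

```javascript
// Hypothetical phrase table; the real rules in mcp-token-optimizer may differ.
// Note: the replacement does not preserve the casing of the original phrase.
const VERBOSE_PHRASES = [
  [/\bin order to\b/gi, "to"],
  [/\bdue to the fact that\b/gi, "because"],
  [/\bat this point in time\b/gi, "now"],
];

// Filler words dropped outright (illustrative list, not the package's).
// Only spaces and tabs are consumed, so line breaks are left alone.
const FILLER = /\b(?:basically|actually|really|very)[ \t]+/gi;

function slimPrompt(text) {
  let out = text;
  for (const [pattern, replacement] of VERBOSE_PHRASES) {
    out = out.replace(pattern, replacement);
  }
  out = out.replace(FILLER, "");

  // Normalize whitespace within each line, then drop exact duplicate lines.
  const seen = new Set();
  const lines = [];
  for (const raw of out.split("\n")) {
    const line = raw.replace(/[ \t]+/g, " ").trim();
    if (line !== "" && seen.has(line)) continue; // dedupe non-empty lines only
    seen.add(line);
    lines.push(line);
  }
  // Collapse runs of blank lines to a single blank line.
  return lines.join("\n").replace(/\n{3,}/g, "\n\n").trim();
}

// Rough size proxy: whitespace-separated words, NOT real tokens.
function wordCount(text) {
  return text.split(/\s+/).filter(Boolean).length;
}

const before = "In order to   answer, be very concise.\nBe polite.\nBe polite.";
const after = slimPrompt(before);
console.log(after);
console.log(`words: ${wordCount(before)} -> ${wordCount(after)}`);
```

Because every rule is a plain string transformation, the before and after sizes can be measured directly, which is what makes a "measured savings" report possible without calling a model.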

Token counts use gpt-tokenizer, which is exact for OpenAI models and a close estimate for Claude and Gemini. Prices are an editable mid-2026 snapshot in lib.js; verify them against each provider's pricing page before relying on the dollar figures.
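The arithmetic behind estimate_cost and compare_model_costs can be sketched as follows. This is an illustration under stated assumptions, not the code in lib.js: the model names, the prices, and the functions `estimateCost` and `cheapestModel` are all made up, and token counts are passed in as plain numbers rather than computed.

```javascript
// Placeholder prices in USD per 1M tokens. These numbers are invented for
// illustration; they are NOT the snapshot shipped in lib.js.
const PRICES = {
  "model-a": { input: 2.0, output: 8.0 },
  "model-b": { input: 0.5, output: 1.5 },
};

// Per-call cost = input tokens at the input rate + output tokens at the
// output rate; monthly and yearly spend scale that by call volume.
function estimateCost(model, inputTokens, outputTokens, callsPerMonth) {
  const p = PRICES[model];
  if (!p) throw new Error(`no price entry for ${model}`);
  const perCall = (inputTokens * p.input + outputTokens * p.output) / 1e6;
  const monthly = perCall * callsPerMonth;
  return { model, perCall, monthly, yearly: monthly * 12 };
}

// Price the same workload on every known model and pick the lowest per-call cost.
function cheapestModel(inputTokens, outputTokens, callsPerMonth) {
  return Object.keys(PRICES)
    .map((m) => estimateCost(m, inputTokens, outputTokens, callsPerMonth))
    .sort((a, b) => a.perCall - b.perCall)[0];
}

const est = estimateCost("model-a", 1000, 400, 50000);
console.log(est.perCall.toFixed(4), est.monthly.toFixed(2));
console.log(cheapestModel(1000, 400, 50000).model);
```

With these placeholder prices, a 1,000-token prompt with 400 output tokens costs (1000 × 2.0 + 400 × 8.0) / 1,000,000 = $0.0052 per call on model-a, or about $260 a month at 50,000 calls. Output tokens are usually priced higher than input tokens, so the expected output size can change which model comes out cheapest.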

Install

Add to your MCP client config:

{
  "mcpServers": {
    "token-optimizer": {
      "command": "npx",
      "args": ["-y", "mcp-token-optimizer"]
    }
  }
}

Claude Desktop: add the block above to claude_desktop_config.json.
Cursor: add it to .cursor/mcp.json.
Any MCP host: run the binary mcp-token-optimizer (stdio transport).
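For hosts that talk to the server directly over stdio, the sketch below shows what a tool invocation looks like on the wire. MCP uses JSON-RPC 2.0, and the stdio transport sends one JSON message per line. The helper name `toolCallMessage` is hypothetical, and the argument names `text` and `model` are assumptions; the tool's real input schema is what the server reports via `tools/list`. A real client must also complete the `initialize` handshake before calling tools, which is omitted here.

```javascript
// Build a JSON-RPC 2.0 "tools/call" request in the shape MCP defines.
// The argument names passed in below ("text", "model") are assumptions for
// illustration; check the tool's input schema via "tools/list".
function toolCallMessage(id, toolName, args) {
  const request = {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: toolName, arguments: args },
  };
  // The stdio transport is newline-delimited, so a message must not contain
  // raw newlines. JSON.stringify escapes newlines inside strings, which keeps
  // the whole frame on one line.
  return JSON.stringify(request) + "\n";
}

const frame = toolCallMessage(1, "count_tokens", {
  text: "Summarize this.\nKeep it short.",
  model: "gpt-4o",
});
process.stdout.write(frame);
```

In practice an MCP client library handles this framing, so the sketch is mainly useful for debugging what a host sends to the server's stdin.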

Then ask your assistant things like:

"Count the tokens in this prompt and tell me the cost on gpt-4o vs gpt-4o-mini."
"Slim this system prompt and show me what I'd save at 50,000 calls a month."
"Which model is cheapest for this prompt with ~400 tokens of output?"

Run locally

npm install
npm test
npx mcp-token-optimizer   # starts the stdio server

License

MIT
