@yawlabs/spend
v0.9.0
spend.sh MCP server — AI spend tracking, cost estimation, and provider comparison
spend.sh
AI-native FinOps for AI-native teams. See every dollar your agents spend across LLMs, vector DBs, inference hosts, observability, and tool APIs — from one place.
What it does
Bring your own keys. spend.sh pulls billing data directly from every AI-adjacent SaaS you run — Anthropic, OpenAI, Pinecone, Runpod, LangSmith, ElevenLabs, and 13 more — normalizes it into a unified spend ledger, and surfaces spikes, drift, budgets, and per-agent attribution in one dashboard. No proxy required: the numbers come from each provider's own billing API.
On top of the ledger: a local MCP server that answers "what did this session cost?" in your editor, a cost-estimation engine for pre-flight comparison across models, and an optional API gateway for smart routing + spend caps. Caps block requests that flow through the gateway; direct-to-provider traffic is observed and alerted on, not blocked. For MCP-native agents, pair with vend.sh to get structural enforcement at the payment layer.
What it is NOT: a metrics/logging platform. We do not compete with Datadog/Grafana. We track dollars, not latency traces.
Quick Start
Local MCP Server (no account needed)
```shell
npx @yawlabs/spend
```
Runs an MCP server locally with in-memory storage. Community tools work immediately — no API key, no sign-up. Pricing data for every major provider ships with the binary.
Add to Claude Code
```shell
claude mcp add spend -- npx @yawlabs/spend
```
Add to Cursor / VS Code
```json
{
  "mcpServers": {
    "spend": {
      "command": "npx",
      "args": ["@yawlabs/spend"]
    }
  }
}
```
Remote MCP (hosted, BYOK spend ingestion)
Sign up at spend.sh, connect your provider API keys, and add to your MCP config:
```json
{
  "mcpServers": {
    "spend": {
      "url": "https://mcp.spend.sh",
      "headers": {
        "Authorization": "Bearer sp_your_api_key"
      }
    }
  }
}
```
Spend data for connected keys is pulled every 3 hours by a background sync job; every dollar is reconcilable against the upstream provider's own billing API.
API Gateway (optional)
Swap your provider base URL for smart routing, semantic caching, and hard budget caps:
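To make the swap concrete, here is a hedged sketch of the request shape an OpenAI-compatible client sends once its base URL points at the gateway (the `/chat/completions` path follows from OpenAI compatibility; the key and model names are placeholders, not real credentials):

```typescript
// Sketch: what "swap your base URL" means at the HTTP level.
// An OpenAI-compatible client builds the same request either way;
// only the host changes, so the gateway sees every call.
interface ChatRequest {
  url: string;
  headers: Record<string, string>;
  body: string;
}

function buildChatRequest(
  baseUrl: string,
  apiKey: string,
  model: string,
  prompt: string
): ChatRequest {
  return {
    // Strip a trailing slash so the path joins cleanly.
    url: `${baseUrl.replace(/\/$/, "")}/chat/completions`,
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ model, messages: [{ role: "user", content: prompt }] }),
  };
}

// Same client code, gateway host instead of the provider's:
const req = buildChatRequest("https://gateway.spend.sh/v1", "sk-placeholder", "some-model", "hi");
// req.url === "https://gateway.spend.sh/v1/chat/completions"
```

Because existing SDK clients already build this shape of request, pointing `OPENAI_BASE_URL` at the gateway enables routing, caching, and caps without code changes.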
```shell
OPENAI_BASE_URL=https://gateway.spend.sh/v1
```
BYOK spend ingestion
spend.sh adapters pull completed daily spend buckets from each provider and normalize them into one unified spend_events table. Credentials are encrypted at rest with AES-256-GCM. Sync is idempotent — a mid-flight crash-and-retry will not double-count a single cent.
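The no-double-counting guarantee can be pictured as upserting on a deterministic key derived from the bucket's identity. A minimal in-memory sketch — the key fields and names here are illustrative assumptions, not the actual `spend_events` schema:

```typescript
// Idempotent ingestion sketch: each daily spend bucket maps to one
// deterministic key, so replaying the same bucket after a crash
// overwrites the prior row rather than adding a duplicate.
interface SpendEvent {
  provider: string;
  day: string; // e.g. "2026-02-01"
  usd: number;
}

class SpendLedger {
  private events = new Map<string, SpendEvent>();

  ingest(e: SpendEvent): void {
    // Same bucket -> same key -> upsert, never append.
    this.events.set(`${e.provider}:${e.day}`, e);
  }

  total(): number {
    let sum = 0;
    for (const e of this.events.values()) sum += e.usd;
    return sum;
  }
}

const ledger = new SpendLedger();
ledger.ingest({ provider: "openai", day: "2026-02-01", usd: 12.5 });
ledger.ingest({ provider: "openai", day: "2026-02-01", usd: 12.5 }); // crash-and-retry replay
// ledger.total() === 12.5, not 25
```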
| Category | Providers |
|----------|-----------|
| LLM | Anthropic · OpenAI · Together · AWS Bedrock · Azure OpenAI · GCP Vertex · OpenRouter · Portkey |
| Vector DB | Pinecone · Qdrant Cloud |
| Inference host | Runpod · Replicate |
| Tool / multimodal API | Tavily · Exa · ElevenLabs · Deepgram |
| Observability / eval | LangSmith · LangFuse · Helicone |
19 adapters. One ledger. All keys stay yours.
MCP tools
Community (free, bundled with the OSS package)
| Tool | Description |
|------|-------------|
| get_spend_summary | Total spend today, this week, this month — by provider |
| get_session_cost | Cost of the current conversation/session |
| get_model_pricing | Per-token pricing for any model |
| get_cost_estimate | Estimate cost for a workload across models |
| compare_models | Side-by-side cost comparison |
| get_budget_status | Current spend vs. configured limits |
| get_providers | List connected providers and sync status |
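Under the hood, estimation tools like these reduce to per-token arithmetic against the bundled pricing data. A sketch with illustrative prices (placeholders, not the shipped `models.yaml` values):

```typescript
// Per-token cost arithmetic of the kind behind get_cost_estimate.
// Prices below are made up for illustration.
interface ModelPricing {
  inputPerMTok: number;  // USD per 1M input tokens
  outputPerMTok: number; // USD per 1M output tokens
}

function estimateCostUsd(
  p: ModelPricing,
  inputTokens: number,
  outputTokens: number
): number {
  return (
    (inputTokens / 1_000_000) * p.inputPerMTok +
    (outputTokens / 1_000_000) * p.outputPerMTok
  );
}

// 200k input + 50k output tokens at $3 / $15 per MTok:
const usd = estimateCostUsd({ inputPerMTok: 3, outputPerMTok: 15 }, 200_000, 50_000);
// 0.2 * 3 + 0.05 * 15 = 1.35
```

Comparing models (`compare_models`) is then just running this per candidate and sorting.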
Platform (requires spend.sh account)
| Tool | Description |
|------|-------------|
| get_spend_breakdown | Detailed spend by provider, model, project |
| get_cost_trend | Daily spend over time + trend analysis |
| get_top_models | Rank models by spend or token volume |
| get_anomalies | Flag unusual spend spikes |
| set_budget_alert | Create spend threshold notifications |
| set_budget_cap | Hard spending caps that block requests |
| delete_budget_alert | Remove a budget alert |
| tag_session | Label sessions by project for attribution |
| export_report | Generate spend reports (JSON/CSV) |
| get_rate_limits | Rate limit status per provider |
| get_routing_status | Provider health + routing status |
| set_routing_rule | Configure smart routing (cheapest, fastest) |
| set_fallback_chain | Define provider failover order |
| set_project_quota | Per-project spend caps |
| get_latency_report | P50/P95/P99 latency per provider |
| set_model_alias | Map friendly names to provider/model |
| get_cache_stats | Semantic cache hit rate + savings |
| set_cache_config | Enable/disable cache, adjust TTL |
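A "cheapest" routing rule of the kind `set_routing_rule` configures boils down to ranking healthy candidates by unit price. A hedged sketch — model names, health flags, and prices are illustrative, not the platform's actual routing logic:

```typescript
// Sketch of a "cheapest" routing rule: among healthy candidates,
// pick the lowest blended per-MTok price.
interface Candidate {
  model: string;
  healthy: boolean;
  usdPerMTok: number; // blended input+output price
}

function routeCheapest(candidates: Candidate[]): Candidate | undefined {
  return candidates
    .filter((c) => c.healthy)
    .sort((a, b) => a.usdPerMTok - b.usdPerMTok)[0];
}

const pick = routeCheapest([
  { model: "provider-a/large", healthy: true, usdPerMTok: 9 },
  { model: "provider-b/small", healthy: true, usdPerMTok: 0.4 },
  { model: "provider-c/small", healthy: false, usdPerMTok: 0.2 }, // down, skipped
]);
// pick?.model === "provider-b/small"
```

A fallback chain (`set_fallback_chain`) is the same idea with an explicit ordering instead of a price sort.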
Team
| Tool | Description |
|------|-------------|
| get_team_spend | Aggregated spend across team members |
| set_team_budget | Organization-wide budget caps |
Development
```shell
npm install
npm run dev          # stdio MCP server
npm run dev:gateway  # API gateway
npm test
npm run build
```
Releasing
Releases run locally via ./release.sh. There is no CI release workflow.
```shell
npm login --auth-type=web   # once per machine; WebAuthn-backed session
./release.sh 0.6.8          # lint, test, bump, tag, push, publish, GH release
```
release.sh is idempotent — re-running with the same version resumes from
wherever it failed. If you hit a pre-flight "npm not authenticated" error,
run the npm login command above and re-run the release.
Pricing drift check
Upstream providers update per-token prices; scripts/check-pricing-drift.mjs
diffs the bundled src/data/models.yaml against the latest public pricing
pages and prints a report. Run it locally any time:
```shell
npm run check:pricing
```
Exit codes: 0 = no drift, 1 = drift (writes pricing-diff.md), 2 = fatal.
This also runs daily in CI via .github/workflows/pricing-drift.yml
(14:00 UTC, manually triggerable from the Actions tab). When drift is
detected the workflow opens (or comments on) a GitHub issue titled
"Pricing drift detected" with the diff inline — look under
open issues.
We open an issue rather than a PR because vendor pages occasionally typo
their own pricing during a release; a human reviews the diff before any
YAML change lands.
Architecture
```
src/
  index.ts                # Local MCP entry point (stdio)
  mcp/                    # Tool definitions, local + remote MCP wiring
  db/                     # MemoryStore (local) + PostgresStore (cloud)
  providers/              # BYOK spend ingestion adapters (19 providers)
    <provider>/ingest.ts  # pull + normalize provider billing into SpendEvents
    shared/               # credentials AES-256-GCM, idempotency keys
  pricing/                # Per-model pricing YAML + lookup engine
  tools/                  # MCP tool handlers
  gateway/                # Optional API gateway (routing, cache, caps)
  api/                    # REST API server (cloud)
  billing/                # LemonSqueezy integration
  auth/                   # Auth + team management
  sync-providers.ts       # Cronjob entry: runs every 3h, fans out to all BYOK adapters
  retention.ts            # Data retention cronjob
```
The local MCP server (src/index.ts) uses MemoryStore — zero external infrastructure, no database, no API keys required. Data is ephemeral.
The cloud service adds PostgresStore, Valkey-backed caching, the BYOK sync cronjob, the REST API, and the web dashboard.
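One way to read the MemoryStore/PostgresStore split is as two implementations of a single store interface, so the MCP tool handlers don't care which backend they run against. A hypothetical sketch — the interface and method names are assumptions, not the actual API:

```typescript
// Hypothetical shape of the store abstraction behind MemoryStore and
// PostgresStore: local and cloud modes share one interface, only the
// backing storage differs. Names are illustrative, not the real code.
interface SpendStore {
  record(provider: string, usd: number): void;
  totalUsd(): number;
}

class MemoryStore implements SpendStore {
  private totals = new Map<string, number>();

  record(provider: string, usd: number): void {
    this.totals.set(provider, (this.totals.get(provider) ?? 0) + usd);
  }

  totalUsd(): number {
    let sum = 0;
    for (const v of this.totals.values()) sum += v;
    return sum;
  }
}
// Ephemeral by design: process exit drops the Map, matching the
// zero-infrastructure local mode described above.
```

A PostgresStore would implement the same interface with SQL under the hood, which is what lets the cloud service add durability without changing the tool layer.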
Part of the Yaw Labs ecosystem
- Yaw Terminal — Multi-provider AI terminal
- mcp.hosting — Deploy MCP servers in one click
- ctxlint — Lint context to reduce token waste
- tailscale-mcp — MCP servers over private tailnets
- Token Limit News — Weekly AI tooling newsletter
License
Copyright (c) 2026 Yaw Labs LLC. All rights reserved.
