@yawlabs/spend
v0.9.0
spend.sh MCP server — AI spend tracking, cost estimation, and provider comparison
spend.sh
AI-native FinOps for AI-native teams. See every dollar your agents spend across LLMs, vector DBs, inference hosts, observability, and tool APIs — from one place.
What it does
Bring your own keys. spend.sh pulls billing data directly from every AI-adjacent SaaS you run — Anthropic, OpenAI, Pinecone, Runpod, LangSmith, ElevenLabs, and 13 more — normalizes it into a unified spend ledger, and surfaces spikes, drift, budgets, and per-agent attribution in one dashboard. No proxy required: the numbers come from each provider's own billing API.
On top of the ledger: a local MCP server that answers "what did this session cost?" in your editor, a cost-estimation engine for pre-flight comparison across models, and an optional API gateway for smart routing + spend caps. Caps block requests that flow through the gateway; direct-to-provider traffic is observed and alerted on, not blocked. For MCP-native agents, pair with vend.sh to get structural enforcement at the payment layer.
What it is NOT: a metrics/logging platform. We do not compete with Datadog/Grafana. We track dollars, not latency traces.
Quick Start
Local MCP Server (no account needed)
```shell
npx @yawlabs/spend
```
Runs an MCP server locally with in-memory storage. Community tools work immediately — no API key, no sign-up. Pricing data for every major provider ships with the binary.
Add to Claude Code
```shell
claude mcp add spend -- npx @yawlabs/spend
```
Add to Cursor / VS Code
```json
{
  "mcpServers": {
    "spend": {
      "command": "npx",
      "args": ["@yawlabs/spend"]
    }
  }
}
```
Remote MCP (hosted, BYOK spend ingestion)
Sign up at spend.sh, connect your provider API keys, and add to your MCP config:
```json
{
  "mcpServers": {
    "spend": {
      "url": "https://mcp.spend.sh",
      "headers": {
        "Authorization": "Bearer sp_your_api_key"
      }
    }
  }
}
```
Spend data for connected keys is pulled every 3 hours by a background sync job; every dollar is reconcilable against the upstream provider's own billing API.
API Gateway (optional)
Swap your provider base URL for smart routing, semantic caching, and hard budget caps:
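To make the swap concrete, here is a hedged sketch of the request shape an OpenAI-compatible client sends once its base URL points at the gateway (the `/chat/completions` path follows from OpenAI compatibility; the key and model names are placeholders, not real credentials):

```typescript
// Sketch: what "swap your base URL" means at the HTTP level.
// An OpenAI-compatible client builds the same request either way;
// only the host changes, so the gateway sees every call.
interface ChatRequest {
  url: string;
  headers: Record<string, string>;
  body: string;
}

function buildChatRequest(
  baseUrl: string,
  apiKey: string,
  model: string,
  prompt: string
): ChatRequest {
  return {
    // Strip a trailing slash so the path joins cleanly.
    url: `${baseUrl.replace(/\/$/, "")}/chat/completions`,
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ model, messages: [{ role: "user", content: prompt }] }),
  };
}

// Same client code, gateway host instead of the provider's:
const req = buildChatRequest("https://gateway.spend.sh/v1", "sk-placeholder", "some-model", "hi");
// req.url === "https://gateway.spend.sh/v1/chat/completions"
```

Because existing SDK clients already build this shape of request, pointing `OPENAI_BASE_URL` at the gateway enables routing, caching, and caps without code changes.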
```shell
OPENAI_BASE_URL=https://gateway.spend.sh/v1
```
BYOK spend ingestion
spend.sh adapters pull completed daily spend buckets from each provider and normalize them into one unified spend_events table. Credentials are encrypted at rest with AES-256-GCM. Sync is idempotent — a mid-flight crash-and-retry will not double-count a single cent.
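The no-double-counting guarantee can be pictured as upserting on a deterministic key derived from the bucket's identity. A minimal in-memory sketch — the key fields and names here are illustrative assumptions, not the actual `spend_events` schema:

```typescript
// Idempotent ingestion sketch: each daily spend bucket maps to one
// deterministic key, so replaying the same bucket after a crash
// overwrites the prior row rather than adding a duplicate.
interface SpendEvent {
  provider: string;
  day: string; // e.g. "2026-02-01"
  usd: number;
}

class SpendLedger {
  private events = new Map<string, SpendEvent>();

  ingest(e: SpendEvent): void {
    // Same bucket -> same key -> upsert, never append.
    this.events.set(`${e.provider}:${e.day}`, e);
  }

  total(): number {
    let sum = 0;
    for (const e of this.events.values()) sum += e.usd;
    return sum;
  }
}

const ledger = new SpendLedger();
ledger.ingest({ provider: "openai", day: "2026-02-01", usd: 12.5 });
ledger.ingest({ provider: "openai", day: "2026-02-01", usd: 12.5 }); // crash-and-retry replay
// ledger.total() === 12.5, not 25
```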
| Category | Providers |
|----------|-----------|
| LLM | Anthropic · OpenAI · Together · AWS Bedrock · Azure OpenAI · GCP Vertex · OpenRouter · Portkey |
| Vector DB | Pinecone · Qdrant Cloud |
| Inference host | Runpod · Replicate |
| Tool / multimodal API | Tavily · Exa · ElevenLabs · Deepgram |
| Observability / eval | LangSmith · LangFuse · Helicone |
19 adapters. One ledger. All keys stay yours.
MCP tools
Community (free, bundled with the OSS package)
| Tool | Description |
|------|-------------|
| get_spend_summary | Total spend today, this week, this month — by provider |
| get_session_cost | Cost of the current conversation/session |
| get_model_pricing | Per-token pricing for any model |
| get_cost_estimate | Estimate cost for a workload across models |
| compare_models | Side-by-side cost comparison |
| get_budget_status | Current spend vs. configured limits |
| get_providers | List connected providers and sync status |
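Under the hood, estimation tools like these reduce to per-token arithmetic against the bundled pricing data. A sketch with illustrative prices (placeholders, not the shipped `models.yaml` values):

```typescript
// Per-token cost arithmetic of the kind behind get_cost_estimate.
// Prices below are made up for illustration.
interface ModelPricing {
  inputPerMTok: number;  // USD per 1M input tokens
  outputPerMTok: number; // USD per 1M output tokens
}

function estimateCostUsd(
  p: ModelPricing,
  inputTokens: number,
  outputTokens: number
): number {
  return (
    (inputTokens / 1_000_000) * p.inputPerMTok +
    (outputTokens / 1_000_000) * p.outputPerMTok
  );
}

// 200k input + 50k output tokens at $3 / $15 per MTok:
const usd = estimateCostUsd({ inputPerMTok: 3, outputPerMTok: 15 }, 200_000, 50_000);
// 0.2 * 3 + 0.05 * 15 = 1.35
```

Comparing models (`compare_models`) is then just running this per candidate and sorting.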
Platform (requires spend.sh account)
| Tool | Description |
|------|-------------|
| get_spend_breakdown | Detailed spend by provider, model, project |
| get_cost_trend | Daily spend over time + trend analysis |
| get_top_models | Rank models by spend or token volume |
| get_anomalies | Flag unusual spend spikes |
| set_budget_alert | Create spend threshold notifications |
| set_budget_cap | Hard spending caps that block requests |
| delete_budget_alert | Remove a budget alert |
| tag_session | Label sessions by project for attribution |
| export_report | Generate spend reports (JSON/CSV) |
| get_rate_limits | Rate limit status per provider |
| get_routing_status | Provider health + routing status |
| set_routing_rule | Configure smart routing (cheapest, fastest) |
| set_fallback_chain | Define provider failover order |
| set_project_quota | Per-project spend caps |
| get_latency_report | P50/P95/P99 latency per provider |
| set_model_alias | Map friendly names to provider/model |
| get_cache_stats | Semantic cache hit rate + savings |
| set_cache_config | Enable/disable cache, adjust TTL |
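A "cheapest" routing rule of the kind `set_routing_rule` configures boils down to ranking healthy candidates by unit price. A hedged sketch — model names, health flags, and prices are illustrative, not the platform's actual routing logic:

```typescript
// Sketch of a "cheapest" routing rule: among healthy candidates,
// pick the lowest blended per-MTok price.
interface Candidate {
  model: string;
  healthy: boolean;
  usdPerMTok: number; // blended input+output price
}

function routeCheapest(candidates: Candidate[]): Candidate | undefined {
  return candidates
    .filter((c) => c.healthy)
    .sort((a, b) => a.usdPerMTok - b.usdPerMTok)[0];
}

const pick = routeCheapest([
  { model: "provider-a/large", healthy: true, usdPerMTok: 9 },
  { model: "provider-b/small", healthy: true, usdPerMTok: 0.4 },
  { model: "provider-c/small", healthy: false, usdPerMTok: 0.2 }, // down, skipped
]);
// pick?.model === "provider-b/small"
```

A fallback chain (`set_fallback_chain`) is the same idea with an explicit ordering instead of a price sort.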
Team
| Tool | Description |
|------|-------------|
| get_team_spend | Aggregated spend across team members |
| set_team_budget | Organization-wide budget caps |
Development
```shell
npm install
npm run dev          # stdio MCP server
npm run dev:gateway  # API gateway
npm test
npm run build
```
Releasing
Releases run locally via ./release.sh. There is no CI release workflow.
```shell
npm login --auth-type=web   # once per machine; WebAuthn-backed session
./release.sh 0.6.8          # lint, test, bump, tag, push, publish, GH release
```
release.sh is idempotent — re-running with the same version resumes from
wherever it failed. If you hit a pre-flight "npm not authenticated" error,
run the npm login command above and re-run the release.
Pricing drift check
Upstream providers update per-token prices; scripts/check-pricing-drift.mjs
diffs the bundled src/data/models.yaml against the latest public pricing
pages and prints a report. Run it locally any time:
```shell
npm run check:pricing
```
Exit codes: 0 = no drift, 1 = drift (writes pricing-diff.md), 2 = fatal.
This also runs daily in CI via .github/workflows/pricing-drift.yml
(14:00 UTC, manually triggerable from the Actions tab). When drift is
detected the workflow opens (or comments on) a GitHub issue titled
"Pricing drift detected" with the diff inline — look under
open issues.
We open an issue rather than a PR because vendor pages occasionally typo
their own pricing during a release; a human reviews the diff before any
YAML change lands.
Architecture
```
src/
  index.ts                # Local MCP entry point (stdio)
  mcp/                    # Tool definitions, local + remote MCP wiring
  db/                     # MemoryStore (local) + PostgresStore (cloud)
  providers/              # BYOK spend ingestion adapters (19 providers)
    <provider>/ingest.ts  # pull + normalize provider billing into SpendEvents
    shared/               # credentials AES-256-GCM, idempotency keys
  pricing/                # Per-model pricing YAML + lookup engine
  tools/                  # MCP tool handlers
  gateway/                # Optional API gateway (routing, cache, caps)
  api/                    # REST API server (cloud)
  billing/                # LemonSqueezy integration
  auth/                   # Auth + team management
  sync-providers.ts       # Cronjob entry: runs every 3h, fans out to all BYOK adapters
  retention.ts            # Data retention cronjob
```
The local MCP server (src/index.ts) uses MemoryStore — zero external infrastructure, no database, no API keys required. Data is ephemeral.
The cloud service adds PostgresStore, Valkey-backed caching, the BYOK sync cronjob, the REST API, and the web dashboard.
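One way to read the MemoryStore/PostgresStore split is as two implementations of a single store interface, so the MCP tool handlers don't care which backend they run against. A hypothetical sketch — the interface and method names are assumptions, not the actual API:

```typescript
// Hypothetical shape of the store abstraction behind MemoryStore and
// PostgresStore: local and cloud modes share one interface, only the
// backing storage differs. Names are illustrative, not the real code.
interface SpendStore {
  record(provider: string, usd: number): void;
  totalUsd(): number;
}

class MemoryStore implements SpendStore {
  private totals = new Map<string, number>();

  record(provider: string, usd: number): void {
    this.totals.set(provider, (this.totals.get(provider) ?? 0) + usd);
  }

  totalUsd(): number {
    let sum = 0;
    for (const v of this.totals.values()) sum += v;
    return sum;
  }
}
// Ephemeral by design: process exit drops the Map, matching the
// zero-infrastructure local mode described above.
```

A PostgresStore would implement the same interface with SQL under the hood, which is what lets the cloud service add durability without changing the tool layer.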
Part of the Yaw Labs ecosystem
- Yaw Terminal — Multi-provider AI terminal
- mcp.hosting — Deploy MCP servers in one click
- ctxlint — Lint context to reduce token waste
- tailscale-mcp — MCP servers over private tailnets
- Token Limit News — Weekly AI tooling newsletter
License
Copyright (c) 2026 Yaw Labs LLC. All rights reserved.
