@inferlane/mcp

v0.8.0

Published

25 days ago

Local-first compute fuel gauge for Claude Code + 50 MCP tools for model selection, spend tracking, budget firewall, credibility, routing, and the compute exchange. Auto-ingests real usage from Claude Code transcripts — no API key needed. Works with Claude

@inferlane/mcp

The cost intelligence layer for AI agents. 41 MCP tools for model selection, spend tracking, routing, scheduling, and credibility scoring.

Install

npx -y @inferlane/mcp

Or add to your MCP config:

{
  "mcpServers": {
    "inferlane": {
      "command": "npx",
      "args": ["-y", "@inferlane/mcp"]
    }
  }
}

Modes

Set INFERLANE_MODE to control how many tools register (and how much context they cost):

| Mode | Tools | Context | What you get |
|------|-------|---------|--------------|
| (unset) full | all | ~27K tokens | Everything below |
| core | 8 | ~6K tokens | Cost intelligence essentials: pick_model, log_request, session_cost, assess_routing, rate_recommendation, get_cost_comparison, suggest_savings, get_model_pricing |
| firewall | 10 | ~7K tokens | The spend-firewall profile: fuel_gauge, platform_budget, platform_spend, check_dispatch, log_request, session_cost, pick_model, get_model_pricing, ping, register_webhook |
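To make the trade-off in the table concrete, here is a minimal sketch that picks the smallest profile covering a set of tools you need. The tool lists and context estimates are copied from the table; `pickMode` and `covers` are hypothetical helpers for illustration and are not part of @inferlane/mcp.

```typescript
// Tool lists and context estimates are copied from the Modes table.
// pickMode and covers are hypothetical helpers, not part of the package.
type Mode = "core" | "firewall" | "full";

interface Profile {
  mode: Mode;
  contextTokens: number; // approximate context cost from the table
  tools: readonly string[] | "all";
}

const FULL: Profile = { mode: "full", contextTokens: 27000, tools: "all" };

const PROFILES: readonly Profile[] = [
  {
    mode: "core",
    contextTokens: 6000,
    tools: [
      "pick_model", "log_request", "session_cost", "assess_routing",
      "rate_recommendation", "get_cost_comparison", "suggest_savings",
      "get_model_pricing",
    ],
  },
  {
    mode: "firewall",
    contextTokens: 7000,
    tools: [
      "fuel_gauge", "platform_budget", "platform_spend", "check_dispatch",
      "log_request", "session_cost", "pick_model", "get_model_pricing",
      "ping", "register_webhook",
    ],
  },
  FULL,
];

// True when the profile registers every tool in `needed`.
function covers(profile: Profile, needed: readonly string[]): boolean {
  const tools = profile.tools;
  if (tools === "all") return true;
  return needed.every((t) => tools.indexOf(t) !== -1);
}

// Return the profile with the lowest context cost that covers `needed`.
// "full" always qualifies, so there is always an answer.
function pickMode(needed: readonly string[]): Mode {
  let best = FULL;
  for (const p of PROFILES) {
    if (covers(p, needed) && p.contextTokens < best.contextTokens) best = p;
  }
  return best.mode;
}
```

For example, an agent that only needs pick_model and log_request fits in core, while anything that calls check_dispatch needs at least firewall.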

Firewall mode is built for agent fleets. check_dispatch pre-flights every model call against the local budget (INFERLANE_BUDGET_TOTAL) first and the platform budget (INFERLANE_API_KEY) second. It returns a structured ALLOW/DENY JSON decision with the estimated cost, the remaining budgets, and, on a denial, a cheaper-model suggestion. With no budget configured it answers ALLOW with an advisory note; it never fakes a denial. Budget denials from pick_model and route_via_platform also carry a machine-parseable {"error": "BUDGET_EXCEEDED", ...} JSON block alongside the human-readable text.

{
  "mcpServers": {
    "inferlane": {
      "command": "npx",
      "args": ["-y", "@inferlane/mcp"],
      "env": { "INFERLANE_MODE": "firewall", "INFERLANE_BUDGET_TOTAL": "25" }
    }
  }
}
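The decision order described above (local budget first, platform budget second, advisory ALLOW when nothing is configured) can be sketched as follows. This is an illustration of the ordering only: the field names in `Budgets` and `Decision` are assumptions, since this README does not document the exact JSON that check_dispatch returns.

```typescript
// Hypothetical shapes: the real check_dispatch JSON fields are not documented here.
interface Budgets {
  localRemainingUsd?: number;    // set when a local budget is configured
  platformRemainingUsd?: number; // set when a platform budget is available
}

interface Decision {
  decision: "ALLOW" | "DENY";
  estimatedCostUsd: number;
  remaining: Budgets;
  deniedBy?: "local" | "platform";
  cheaperModel?: string;
  note?: string;
}

function checkDispatch(
  estimatedCostUsd: number,
  budgets: Budgets,
  cheaperModel?: string,
): Decision {
  const base = { estimatedCostUsd, remaining: budgets };

  // 1. The local budget is consulted first.
  if (budgets.localRemainingUsd !== undefined &&
      estimatedCostUsd > budgets.localRemainingUsd) {
    return { ...base, decision: "DENY", deniedBy: "local", cheaperModel };
  }

  // 2. The platform budget is consulted second.
  if (budgets.platformRemainingUsd !== undefined &&
      estimatedCostUsd > budgets.platformRemainingUsd) {
    return { ...base, decision: "DENY", deniedBy: "platform", cheaperModel };
  }

  // 3. With no budget at all, allow with an advisory note instead of denying.
  if (budgets.localRemainingUsd === undefined &&
      budgets.platformRemainingUsd === undefined) {
    return { ...base, decision: "ALLOW", note: "No budget configured; advisory only." };
  }

  return { ...base, decision: "ALLOW" };
}
```

A caller in an agent fleet would run a check like this before every dispatch and fall back to the suggested cheaper model, if any, when the decision is DENY.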

Tools

Cost Intelligence

pick_model — Choose the optimal model for any task
check_dispatch — Pre-flight a dispatch against local + platform budgets (structured ALLOW/DENY)
session_cost — Track session spend in real-time
log_request — Log API calls for cost tracking
get_model_pricing — Look up pricing across providers
get_cost_comparison — Compare costs across models
suggest_savings — Get cost optimization recommendations

Routing & Triage

triage — Auto-classify prompts by complexity, urgency, and cost
triage_settings — Configure routing preferences
assess_routing — Evaluate local vs cloud routing
route_to_cloud — Report routing decisions for credibility

Dispatch & Scheduling

dispatch — Send prompts to best available provider
dispatch_chain — Multi-provider sequential chains
dispatch_status — Check async task status
schedule_prompt — Schedule prompts for later
create_chain — Create multi-step chains
list_scheduled / cancel_scheduled / chain_status

Agent Intelligence

agent_status — Traffic light status (green/amber/red/blue)
set_agent_status — Manual status override
set_lifecycle_phase — Track coding→testing→CI→deploy phases
lifecycle_report — Cost-per-phase breakdown
token_tachometer — Real-time token velocity
state_of_compute — Full compute market report

Credibility

credibility_profile — View agent credibility score
credibility_leaderboard — Compete with other agents
rate_recommendation — Rate model quality
model_ratings — View community ratings
improvement_cycle — Run quality analysis

Platform (requires INFERLANE_API_KEY)

cost_savings — View savings from smart routing
check_promotions — Active provider promotions
platform_spend / platform_budget — Platform billing
route_via_platform — Route through InferLane platform
session_history — Cross-provider session tracking

Environment Variables

| Variable | Required | Description |
|----------|----------|-------------|
| INFERLANE_API_KEY | No | Enables platform features (dispatch, scheduling, savings) |
| INFERLANE_MODE | No | core or firewall to register a trimmed tool profile (see Modes) |
| INFERLANE_EVENTS_PORT | No | Enable SSE event stream on this port |
| INFERLANE_BUDGET_TOTAL | No | Monthly budget cap in USD; enforced by check_dispatch, pick_model, route_via_platform |
| OLLAMA_HOST | No | Local Ollama endpoint for routing assessment |
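When the INFERLANE_BUDGET_TOTAL cap is hit, the denial text from pick_model and route_via_platform carries a {"error": "BUDGET_EXCEEDED", ...} JSON block next to the human-readable message (see Modes). A client could pull that block out of the mixed text roughly like this. The sketch assumes only the documented error field; `extractBudgetExceeded` and `matchBrace` are hypothetical helper names, and any other fields in the block are passed through untouched.

```typescript
interface BudgetExceeded {
  error: "BUDGET_EXCEEDED";
  [key: string]: unknown; // other fields are not documented here
}

// Return the index of the "}" that closes the object opened at `start`, or -1.
// Braces inside JSON strings are ignored.
function matchBrace(text: string, start: number): number {
  let depth = 0;
  let inString = false;
  for (let i = start; i < text.length; i++) {
    const ch = text[i];
    if (inString) {
      if (ch === "\\") i++; // skip the escaped character
      else if (ch === '"') inString = false;
    } else if (ch === '"') {
      inString = true;
    } else if (ch === "{") {
      depth++;
    } else if (ch === "}" && --depth === 0) {
      return i;
    }
  }
  return -1;
}

// Scan every "{" in the text and return the first object whose
// error field is "BUDGET_EXCEEDED", or null if there is none.
function extractBudgetExceeded(text: string): BudgetExceeded | null {
  for (let start = text.indexOf("{"); start !== -1; start = text.indexOf("{", start + 1)) {
    const end = matchBrace(text, start);
    if (end === -1) continue;
    try {
      const parsed: unknown = JSON.parse(text.slice(start, end + 1));
      if (typeof parsed === "object" && parsed !== null &&
          (parsed as { error?: unknown }).error === "BUDGET_EXCEEDED") {
        return parsed as BudgetExceeded;
      }
    } catch {
      // Not valid JSON at this brace; keep scanning.
    }
  }
  return null;
}
```

Scanning for a balanced object rather than parsing the whole response keeps the human-readable text free to change without breaking the client.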

How It Works

Install once. Every AI agent session becomes cost-aware:

Before a task — pick_model recommends the cheapest viable model
After a task — log_request tracks what was spent
Over time — credibility score builds, recommendations improve
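The before/after loop above can be pictured with a small local sketch: choose the cheapest model that is good enough, then record what the call cost. Everything here is illustrative. The model names, prices, and tier numbers are made-up placeholders, and `pickCheapestViable` and `SessionLedger` are hypothetical stand-ins, not the logic behind pick_model or log_request.

```typescript
// All names, prices, and tiers below are made-up placeholders.
interface ModelInfo {
  name: string;
  usdPerMillionInput: number;
  usdPerMillionOutput: number;
  tier: number; // hypothetical capability score; higher handles harder tasks
}

const CATALOG: readonly ModelInfo[] = [
  { name: "small-model", usdPerMillionInput: 0.25, usdPerMillionOutput: 1, tier: 1 },
  { name: "medium-model", usdPerMillionInput: 3, usdPerMillionOutput: 15, tier: 2 },
  { name: "large-model", usdPerMillionInput: 15, usdPerMillionOutput: 75, tier: 3 },
];

function costUsd(m: ModelInfo, inputTokens: number, outputTokens: number): number {
  return (inputTokens * m.usdPerMillionInput + outputTokens * m.usdPerMillionOutput) / 1000000;
}

// Before a task: the cheapest model whose tier meets the requirement.
function pickCheapestViable(
  minTier: number,
  inputTokens: number,
  outputTokens: number,
): ModelInfo | undefined {
  let best: ModelInfo | undefined;
  for (const m of CATALOG) {
    if (m.tier < minTier) continue;
    if (best === undefined ||
        costUsd(m, inputTokens, outputTokens) < costUsd(best, inputTokens, outputTokens)) {
      best = m;
    }
  }
  return best;
}

// After a task: record what was spent so the session total stays current.
class SessionLedger {
  private entries: { model: string; usd: number }[] = [];

  log(model: ModelInfo, inputTokens: number, outputTokens: number): void {
    this.entries.push({ model: model.name, usd: costUsd(model, inputTokens, outputTokens) });
  }

  total(): number {
    return this.entries.reduce((sum, e) => sum + e.usd, 0);
  }
}
```

In a real session the agent would call the MCP tools instead; the sketch only shows the shape of the loop.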

No configuration needed for basic cost tracking. Add INFERLANE_API_KEY for platform features.

Data & Privacy

This package reads some files on your machine to do its job. Everything below stays local and is never transmitted — the watcher does zero network I/O — unless you set INFERLANE_API_KEY to opt into cloud sync. Without that key, InferLane has no visibility into any of it.

What it reads locally (read-only, on by default):

Your Claude Code transcripts. On startup the server reads Claude Code's JSONL transcript files at ~/.claude/projects/**/*.jsonl to pull the per-message token-usage blocks (message.usage: input / output / cache token counts + model name) so the fuel gauge reflects actual spend with no API key and no manual logging. It only reads these files — it never modifies them — and it does not send their contents anywhere unless you enable cloud sync. This watcher is on by default; opt out by setting INFERLANE_NO_CC_WATCH=1 (it also stays inactive if ~/.claude/projects doesn't exist).
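To show what "pulling the per-message token-usage blocks" amounts to, here is a minimal sketch that sums usage per model from JSONL text. It is not the package's watcher. The field names (input_tokens, output_tokens, cache_creation_input_tokens, cache_read_input_tokens, and message.model) are an assumption based on the Anthropic usage object, and `sumUsageByModel` is a hypothetical helper. It works on a string, so it touches no files.

```typescript
interface UsageTotals {
  input: number;
  output: number;
  cacheWrite: number;
  cacheRead: number;
}

function asRecord(v: unknown): Record<string, unknown> | undefined {
  return typeof v === "object" && v !== null ? (v as Record<string, unknown>) : undefined;
}

// Treat anything that is not a finite number as zero.
function num(v: unknown): number {
  return typeof v === "number" && isFinite(v) ? v : 0;
}

// Sum token usage per model from JSONL text (one JSON object per line).
// Field names are assumed, not confirmed by this README.
function sumUsageByModel(jsonl: string): Record<string, UsageTotals> {
  const totals: Record<string, UsageTotals> = {};
  for (const line of jsonl.split("\n")) {
    if (line.trim() === "") continue;

    let entry: unknown;
    try {
      entry = JSON.parse(line);
    } catch {
      continue; // skip partial or corrupt lines
    }

    const message = asRecord(asRecord(entry)?.message);
    if (!message) continue;
    const usage = asRecord(message.usage);
    if (!usage) continue; // entries without a usage block carry no token counts

    const rawModel = message.model;
    const model = typeof rawModel === "string" ? rawModel : "unknown";
    if (!(model in totals)) {
      totals[model] = { input: 0, output: 0, cacheWrite: 0, cacheRead: 0 };
    }
    const t = totals[model];
    t.input += num(usage.input_tokens);
    t.output += num(usage.output_tokens);
    t.cacheWrite += num(usage.cache_creation_input_tokens);
    t.cacheRead += num(usage.cache_read_input_tokens);
  }
  return totals;
}
```

Because the function only reads the text it is given, it mirrors the read-only behaviour described above: nothing is modified and nothing leaves the process.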

What it writes locally:

~/.inferlane/state.db — a SQLite database holding your request log, model ratings, credibility scores, and budget config so they survive restarts. It lives only on your machine. If SQLite can't initialize, the server falls back to in-memory mode and writes nothing.
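The fallback behaviour described above follows a common pattern: try the persistent store, and if it cannot be opened, continue with an in-memory one that writes nothing. A minimal sketch of that pattern, with the SQLite part abstracted away: `Store`, `createMemoryStore`, `openStore`, and the `openPersistent` callback are all hypothetical names, not the package's internals.

```typescript
interface Store {
  persistent: boolean; // false when running in in-memory mode
  logRequest(usd: number): void;
  totalUsd(): number;
}

// In-memory store: keeps data for the life of the process and writes nothing to disk.
function createMemoryStore(): Store {
  const entries: number[] = [];
  return {
    persistent: false,
    logRequest(usd: number): void {
      entries.push(usd);
    },
    totalUsd(): number {
      return entries.reduce((sum, usd) => sum + usd, 0);
    },
  };
}

// `openPersistent` stands in for whatever opens the on-disk database.
// If it throws, fall back to the in-memory store instead of failing startup.
function openStore(openPersistent: () => Store): Store {
  try {
    return openPersistent();
  } catch {
    return createMemoryStore();
  }
}
```

The trade-off is that in-memory mode keeps the server usable but loses the request log and ratings on restart.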

Optional persistence installer (separate tool, opt-in):

A separate, related InferLane tool — the persistence installer shipped as the inferlane-persist / inferlane-unpersist commands in the @inferlane/mcp-server package — can make the cost-awareness guidance persist across agent sessions. It is opt-in and prompts for confirmation before changing anything: a [y/N] prompt that defaults to No, so an unattended or piped run installs nothing unless you pass an explicit --yes flag. When you consent, it modifies only:

~/.claude/CLAUDE.md (and, if you choose them, the equivalent per-project instruction files it detects: project CLAUDE.md, .cursorrules, .github/copilot-instructions.md, .gemini/styleguide.md, CONVENTIONS.md, AGENTS.md) — it inserts a small, clearly marker-delimited activation block ( … ). Your existing content is left intact.
~/.claude/settings.json — with a separate explicit confirmation, it merges a single PreToolUse spend-guard hook without clobbering any other hooks or keys.

Every file that installer touches and every hook it wires is recorded in a manifest at ~/.inferlane/persist-manifest.json so the uninstaller can remove exactly what was added and nothing else.
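The marker-delimited approach is what makes a clean uninstall possible: the installer only ever owns the text between its two markers. A minimal sketch of the idea follows. The marker strings here are placeholders invented for the example, since the real delimiters are not shown in this README, and `upsertBlock` and `removeBlock` are hypothetical helpers, not the installer's code.

```typescript
// Placeholder markers for illustration only; the real delimiters are not shown here.
const START = "<!-- example-activation:start -->";
const END = "<!-- example-activation:end -->";

// Insert the block, or replace it in place if it is already present.
// Content outside the markers is never changed.
function upsertBlock(content: string, body: string): string {
  const block = START + "\n" + body + "\n" + END;
  const s = content.indexOf(START);
  const e = content.indexOf(END);
  if (s !== -1 && e > s) {
    return content.slice(0, s) + block + content.slice(e + END.length);
  }
  const sep = content === "" || content.slice(-1) === "\n" ? "" : "\n";
  return content + sep + block + "\n";
}

// Remove the block and the newline that follows it; leave everything else as is.
function removeBlock(content: string): string {
  const s = content.indexOf(START);
  const e = content.indexOf(END);
  if (s === -1 || e < s) return content; // nothing to remove
  let after = content.slice(e + END.length);
  if (after.charAt(0) === "\n") after = after.slice(1);
  return content.slice(0, s) + after;
}
```

For a file that ends with a newline, removing the block restores the original text exactly. Running the insert twice with the same body leaves the file unchanged, which is why a re-install does not stack duplicate blocks.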

This mirrors §1.3 ("Local-only data") of the InferLane Privacy Policy.

License

Apache-2.0
