inferwise
v1.0.0
FinOps CLI for LLM API costs: scan code, estimate token spend, and block over-budget PRs.
inferwise
Smart model selection and cost enforcement for LLM API calls.
Inferwise scans your codebase for LLM API calls, recommends the cheapest model that can handle each task, estimates per-token costs, and enforces budget guardrails. Works with any CI system or locally as a git hook.
Note: Inferwise tracks pay-as-you-go API costs (billed per token to your API key). It does not track flat-rate subscriptions like Claude Code, Cursor, Copilot, or ChatGPT Plus.
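To make per-token billing concrete, the sketch below shows the arithmetic behind a monthly estimate for a single call site. It is an illustration only, not Inferwise's actual estimator: the `ModelPrice` and `CallSite` shapes, the function name, and the prices (USD per million tokens) are hypothetical placeholders.

```typescript
// Hypothetical shapes for illustration; not part of the inferwise API.
interface ModelPrice {
  inputPerMTok: number;  // USD per 1,000,000 input tokens (placeholder value)
  outputPerMTok: number; // USD per 1,000,000 output tokens (placeholder value)
}

interface CallSite {
  inputTokens: number;   // estimated prompt tokens per call
  outputTokens: number;  // estimated completion tokens per call
  callsPerMonth: number; // assumed monthly call volume
}

// tokens * (USD per million tokens) gives micro-USD per call;
// scale by volume first, then convert to USD.
function estimateMonthlyCost(site: CallSite, price: ModelPrice): number {
  const microUsdPerCall =
    site.inputTokens * price.inputPerMTok +
    site.outputTokens * price.outputPerMTok;
  return (microUsdPerCall * site.callsPerMonth) / 1e6;
}

const examplePrice: ModelPrice = { inputPerMTok: 2, outputPerMTok: 8 };
const exampleSite: CallSite = {
  inputTokens: 1000,
  outputTokens: 500,
  callsPerMonth: 10000,
};

console.log(estimateMonthlyCost(exampleSite, examplePrice)); // 60
```

Output tokens are usually priced higher than input tokens, which is why the two are tracked separately in this sketch.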
Quick Start
# See what your LLM calls cost
npx inferwise estimate .
# Get smart model recommendations
npx inferwise audit .
# Set up guardrails: config + git hooks + CI
npx inferwise init
Or install globally:
npm install -g inferwise
Commands
inferwise init
Set up config file, git hooks (husky/lefthook/plain git), and print CI setup instructions for GitHub Actions, GitLab CI, Bitbucket, and more.
inferwise estimate [path]
Scan for LLM API calls and estimate costs.
inferwise estimate .
inferwise estimate ./src --volume 5000
inferwise estimate . --format json
inferwise diff [path]
Compare token costs between two git refs. Enforces budget policy from inferwise.config.json.
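Conceptually, such a comparison amounts to subtracting two cost totals and checking the difference against a threshold. The sketch below is an assumption about the general idea, not Inferwise's implementation: the `SiteCost` shape, the function names, the sample figures, and the reading of the threshold as a monthly dollar amount are all hypothetical.

```typescript
// Hypothetical: estimated monthly cost (USD) for one detected call site.
interface SiteCost {
  site: string;
  monthlyUsd: number;
}

function totalMonthlyUsd(costs: SiteCost[]): number {
  return costs.reduce((sum: number, c: SiteCost) => sum + c.monthlyUsd, 0);
}

// Returns the cost change from base to head and whether it exceeds the limit.
function compareRefs(
  base: SiteCost[],
  head: SiteCost[],
  failOnIncrease: number
): { delta: number; failed: boolean } {
  const delta = totalMonthlyUsd(head) - totalMonthlyUsd(base);
  return { delta, failed: delta > failOnIncrease };
}

const baseCosts: SiteCost[] = [
  { site: "src/chat.ts", monthlyUsd: 1200 },
  { site: "src/summarize.ts", monthlyUsd: 300 },
];
const headCosts: SiteCost[] = [
  { site: "src/chat.ts", monthlyUsd: 1200 },
  { site: "src/summarize.ts", monthlyUsd: 300 },
  { site: "src/classify.ts", monthlyUsd: 650 }, // new call added on the head ref
];

const refComparison = compareRefs(baseCosts, headCosts, 500);
console.log(refComparison.delta, refComparison.failed); // 650 true
```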
inferwise diff
inferwise diff --base main --head HEAD
inferwise diff --fail-on-increase 500
inferwise check [path]
Verify total LLM costs are within budget. Exits with code 1 if exceeded. For AI agents and automation.
inferwise check . --max-monthly-cost 10000
inferwise check . --max-cost-per-call 0.05
inferwise calibrate [path]
Fetch real usage from provider APIs (Anthropic, OpenAI, OpenRouter) and compute correction factors for more accurate estimates.
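A correction factor of this kind is commonly a ratio of observed to estimated usage. The sketch below illustrates that idea under stated assumptions; how Inferwise actually computes and applies its factors is not described here, and the function names and figures are hypothetical.

```typescript
// Hypothetical: ratio of tokens the provider reported to tokens estimated from code.
function correctionFactor(actualTokens: number, estimatedTokens: number): number {
  if (estimatedTokens <= 0) {
    return 1; // no basis for a correction; leave estimates unchanged
  }
  return actualTokens / estimatedTokens;
}

// Scale a static cost estimate by the observed-to-estimated ratio.
function applyCorrection(estimatedCostUsd: number, factor: number): number {
  return estimatedCostUsd * factor;
}

// Static analysis guessed 1,000,000 tokens; the provider reported 1,500,000.
const usageFactor = correctionFactor(1500000, 1000000);
console.log(usageFactor);                       // 1.5
console.log(applyCorrection(200, usageFactor)); // 300
```

A factor above 1 means the static estimate was too low; a factor below 1 means it was too high.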
ANTHROPIC_ADMIN_API_KEY=sk-ant-admin-... inferwise calibrate .
OPENROUTER_API_KEY=sk-or-... inferwise calibrate . # All providers via OpenRouter
inferwise calibrate . --dry-run
inferwise audit [path]
Find cost optimizations with smart, capability-aware model recommendations. Infers what each LLM call does from the prompts in your code and suggests cheaper models that can handle the task, with reasoning and a confidence level for each suggestion.
inferwise fix [path]
Auto-apply model swap recommendations from audit. Rewrites model IDs in source files.
inferwise fix . # Apply all recommendations
inferwise fix . --dry-run # Preview without modifying files
inferwise fix . --min-savings 500 # Only apply fixes saving >$500/mo
inferwise price [provider] [model]
Look up model pricing. Compare models side-by-side. Designed for humans and AI agents.
inferwise price anthropic claude-sonnet-4
inferwise price --compare anthropic/claude-sonnet-4 openai/gpt-4o
inferwise price --list-all
Budget Enforcement
Add budgets to inferwise.config.json (created by inferwise init):
{
  "budgets": {
    "warn": 2000,
    "block": 50000,
    "requireApproval": 10000,
    "approvers": ["platform-eng"]
  }
}
warn: flags the PR with a warning label
block: fails the CI check and blocks merge (emergency brake)
requireApproval: requests review from approvers before merge
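How the three thresholds interact can be sketched as a simple classification. This is an assumption about the semantics, not Inferwise's code: it supposes that each threshold is compared against a single estimated cost figure, that a threshold applies once the cost reaches it, and that the most severe matching action wins. The `Budgets` type mirrors the config keys; `classifyCost` and the action names are hypothetical.

```typescript
// Mirrors the numeric keys of the "budgets" config object.
interface Budgets {
  warn: number;
  requireApproval: number;
  block: number;
}

type BudgetAction = "ok" | "warn" | "requireApproval" | "block";

// Assumed semantics: check the most severe threshold first.
function classifyCost(cost: number, budgets: Budgets): BudgetAction {
  if (cost >= budgets.block) return "block";
  if (cost >= budgets.requireApproval) return "requireApproval";
  if (cost >= budgets.warn) return "warn";
  return "ok";
}

const sampleBudgets: Budgets = { warn: 2000, requireApproval: 10000, block: 50000 };

console.log(classifyCost(1500, sampleBudgets));  // "ok"
console.log(classifyCost(2500, sampleBudgets));  // "warn"
console.log(classifyCost(12000, sampleBudgets)); // "requireApproval"
console.log(classifyCost(60000, sampleBudgets)); // "block"
```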
SDK (Programmatic API)
import { estimateAndCheck } from "inferwise/sdk";
const result = await estimateAndCheck("./src", { maxMonthlyCost: 10000 });
if (!result.ok) {
  console.error("Over budget:", result.violations);
}
The SDK returns pure data, with no console output and no process.exit, so it is safe to embed in agent orchestration, pipelines, or automation.
Supported Providers
Anthropic, OpenAI, Google AI, xAI, and Perplexity, with pattern detection for LangChain, Vercel AI SDK, AWS Bedrock, Azure OpenAI, and LiteLLM.
File types: .ts, .tsx, .js, .jsx, .mjs, .cjs, .py
Documentation
See the main repo for full documentation, CI setup guides, and estimation methodology.
License
Apache 2.0
