@runcycles/openclaw-budget-guard
v0.7.7
OpenClaw plugin for budget-aware model and tool execution using Cycles.
cycles-openclaw-budget-guard
Why use this plugin?
AI agents make autonomous decisions — calling models, invoking tools, retrying on failure — with no human in the loop. Without runtime enforcement, several things go wrong:
Runaway spend. A single agent stuck in a tool loop or retrying failed calls can burn through hundreds of dollars in minutes. Provider spending caps are account-wide and too coarse. Rate limits don't account for cost. In-app counters don't survive restarts or coordinate across concurrent agents.
Uncontrolled side-effects. An agent can send 100 emails, trigger 50 deployments, or call dangerous APIs with nothing to stop it. Cost limits alone don't help — some actions are consequential regardless of price.
Noisy neighbors. In multi-tenant or multi-user setups, one agent can consume the entire team or tenant budget, starving other users. Without per-user scoping, there's no isolation.
No session-level cost visibility. When an agent session ends, you have no idea what it spent, which tools it called most, or whether it was cost-efficient. Debugging cost overruns after the fact is painful.
Abrupt failure. When budget runs out, the agent crashes instead of adapting — switching to cheaper models, reducing output length, or disabling expensive tools.
This plugin addresses those failure modes by checking model and tool execution before it runs, then degrading or blocking when budget conditions require it. It also tracks session-level cost breakdowns, tool usage, and budget transitions for debugging and operations.
Beyond enforcement, the plugin monitors for problems as they develop:
- Burn rate anomaly detection catches runaway tool loops — if spending spikes 3x above the session average, onBurnRateAnomaly fires immediately
- Predictive exhaustion warnings estimate when budget will run out and fire onExhaustionForecast before it happens
- Automatic retry with backoff on transient Cycles server errors (429/503) prevents spurious denials under load
- Reservation heartbeat auto-extends long-running tool reservations so cost tracking doesn't silently break
- Observability via metricsEmitter (Datadog, Prometheus, Grafana, OTLP) and opt-in session event logs
In typical OpenClaw setups, you can add enforcement without changing agent logic.
For deeper background, see Why Rate Limits Are Not Enough and Runaway Agents and Tool Loops.
Overview
This OpenClaw plugin integrates with a live Cycles server to enforce budget boundaries during agent execution. It hooks into the OpenClaw plugin lifecycle to:
- Reserve budget for model and tool calls using the reserve → commit → release protocol
- Downgrade models when budget is low (configurable fallback chains)
- Block execution when budget is exhausted (fail-closed by default)
- Inject budget hints into prompts so the model is budget-aware
- Detect budget transitions and fire callbacks/webhooks on level changes
- Control tool access with allowlists, blocklists, and per-tool call limits
- Apply graceful degradation strategies when budget is low
- Retry denied reservations and transient server errors with configurable backoff
- Keep long-running tools alive with automatic reservation heartbeat
- Detect anomalies — burn rate spikes and predictive exhaustion warnings
- Emit metrics to Datadog, Prometheus, Grafana, or any OTLP-compatible backend
- Record an event log of every budget decision for debugging and compliance
- Report unconfigured tools so you know which tools are using default cost estimates
- Support dry-run mode for testing without a live Cycles server
- Track per-tool cost breakdowns and session analytics with model cost reconciliation
- Support multi-currency budgets with per-tool/model overrides
- Support budget pools/hierarchies via parent budget visibility
The plugin uses the runcycles TypeScript client to communicate with a Cycles server.
Important: Budget exhaustion is enforced fail-closed by default, but Cycles server connectivity failures are handled fail-open — the plugin assumes healthy budget and allows execution to continue. See Fail-Open Behavior for details.
Prerequisites
- OpenClaw >= 0.1.0 with plugin support
- Node.js >= 20.0.0
- A running Cycles server with:
  - A base URL (e.g. http://localhost:7878)
  - An API key
  - A tenant configured with a budget scope
If you don't have a Cycles server yet, see the Cycles quickstart to set one up. Alternatively, use dry-run mode to test without a server.
To see budget enforcement in action before wiring up your own agent, run the Cycles Runaway Demo — it shows the exact failure mode this plugin prevents, with a live before/after comparison.
Quick Start
1. Install the plugin
openclaw plugins install @runcycles/openclaw-budget-guard
For local development:
openclaw plugins install -l ./cycles-openclaw-budget-guard
2. Enable the plugin
openclaw plugins enable openclaw-budget-guard
3. Add minimal configuration
Add the following to your OpenClaw config file (typically openclaw.json or openclaw.config.json):
{
"plugins": {
"entries": {
"openclaw-budget-guard": {
"config": {
"cyclesBaseUrl": "http://localhost:7878",
"cyclesApiKey": "cyc_your_api_key_here",
"tenant": "my-org"
}
}
}
}
}
That's it — the plugin uses sensible defaults for everything else. The agent will now enforce budget limits on every run.
Need an API key? API keys are created via the Cycles Admin Server (port 7979). See the deployment guide to create one, or see API Key Management for details.
4. (Optional) Keep secrets out of config files
Use OpenClaw's env var interpolation to avoid hardcoding API keys:
{
"plugins": {
"entries": {
"openclaw-budget-guard": {
"config": {
"cyclesBaseUrl": "${CYCLES_BASE_URL}",
"cyclesApiKey": "${CYCLES_API_KEY}",
"tenant": "my-org"
}
}
}
}
}
Then set the env vars in your shell or CI:
export CYCLES_BASE_URL="http://localhost:7878"
export CYCLES_API_KEY="cyc_your_api_key_here"
5. (Optional) Try dry-run mode
To test without a live Cycles server:
{
"plugins": {
"entries": {
"openclaw-budget-guard": {
"config": {
"tenant": "my-org",
"cyclesBaseUrl": "http://unused",
"cyclesApiKey": "unused",
"dryRun": true,
"dryRunBudget": 100000000
}
}
}
}
}
6. Verify it's working
After restarting OpenClaw, check the logs for:
Cycles Budget Guard for OpenClaw v0.6.1
https://runcycles.io
tenant: my-org
cyclesBaseUrl: http://localhost:7878
...
Run your agent and look for budget activity:
[openclaw-budget-guard] before_model_resolve: model=anthropic/claude-sonnet-4-20250514 level=healthy
If you see this, the plugin is actively checking budget on every model and tool call.
Understanding the cost model
The plugin uses a simple model: every model call and tool call reserves a fixed cost from the budget.
Currency. The default is USD_MICROCENTS: 1 unit = $0.00000001 (one hundred-millionth of a dollar, or one millionth of a cent). So:
| Amount (units) | USD equivalent |
|----------------|---------------|
| 100,000 | $0.001 (0.1 cents) |
| 1,000,000 | $0.01 (1 cent) |
| 10,000,000 | $0.10 (10 cents) |
| 100,000,000 | $1.00 |
Example. With a $5 budget (500,000,000 units):
- anthropic/claude-opus at 1,500,000/call = ~333 calls before exhaustion
- anthropic/claude-sonnet at 300,000/call = ~1,666 calls
- web_search at 500,000/call = ~1,000 calls
- lowBudgetThreshold: 10000000 triggers model downgrade when $0.10 remains
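The arithmetic in this example can be checked with a short sketch. The helper names (usdToUnits, unitsToUsd, callsBeforeExhaustion) are illustrative and not part of the plugin's API; the conversion assumes the rate shown in the table, 100,000,000 units per dollar.

```typescript
// Illustrative helpers; not part of the plugin's API.
// Assumes the table's rate: 100,000,000 USD_MICROCENTS units = $1.00.
const UNITS_PER_USD = 100_000_000;

function usdToUnits(usd: number): number {
  return Math.round(usd * UNITS_PER_USD);
}

function unitsToUsd(units: number): number {
  return units / UNITS_PER_USD;
}

// Whole calls that fit in a budget at a fixed per-call reservation cost.
function callsBeforeExhaustion(budgetUnits: number, costPerCall: number): number {
  if (costPerCall <= 0) throw new RangeError("costPerCall must be positive");
  return Math.floor(budgetUnits / costPerCall);
}

const budgetUnits = usdToUnits(5); // 500,000,000 units
const opusCalls = callsBeforeExhaustion(budgetUnits, 1_500_000); // 333
const sonnetCalls = callsBeforeExhaustion(budgetUnits, 300_000); // 1666
const searchCalls = callsBeforeExhaustion(budgetUnits, 500_000); // 1000
```

These counts assume a single model or tool consumes the whole budget; a real session mixes calls, so treat them as upper bounds.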
Model names. OpenClaw passes model identifiers in provider/model format (e.g., openai/gpt-4o, anthropic/claude-sonnet-4-20250514). Your modelBaseCosts, modelFallbacks, and defaultModelName must use the same format — bare model names like gpt-4o won't match. The plugin automatically strips the provider prefix when returning modelOverride to OpenClaw, so you can use provider/model consistently in all config fields without double-prefixing issues.
Setting toolBaseCosts. Start with the default (100,000 units per call). After your first session, check the unconfiguredTools list in the session summary — it tells you which tools need explicit costs. For tools that call external APIs, estimate higher (500K-1M). For lightweight tools, estimate lower (10K-50K).
Full Configuration Example
{
"plugins": {
"entries": {
"openclaw-budget-guard": {
"config": {
"enabled": true,
"cyclesBaseUrl": "http://localhost:7878",
"cyclesApiKey": "cyc_your_api_key_here",
"tenant": "my-org",
"budgetId": "my-app",
"currency": "USD_MICROCENTS",
"lowBudgetThreshold": 10000000,
"exhaustedThreshold": 0,
"modelFallbacks": {
"anthropic/claude-opus-4-20250514": ["anthropic/claude-sonnet-4-20250514", "anthropic/claude-haiku-4-5-20251001"],
"openai/gpt-4o": "openai/gpt-4o-mini"
},
"modelBaseCosts": {
"anthropic/claude-opus-4-20250514": 1500000,
"anthropic/claude-sonnet-4-20250514": 300000,
"openai/gpt-4o": 1000000,
"openai/gpt-4o-mini": 100000
},
"toolBaseCosts": {
"web_search": 500000,
"code_execution": 1000000
},
"toolCallLimits": {
"send_email": 10,
"deploy": 3
},
"injectPromptBudgetHint": true,
"maxPromptHintChars": 200,
"failClosed": true,
"logLevel": "info",
"reservationTtlMs": 60000,
"overagePolicy": "ALLOW_IF_AVAILABLE",
"lowBudgetStrategies": ["downgrade_model"],
"maxTokensWhenLow": 1024,
"retryOnDeny": false,
"dryRun": false
}
}
}
}
}
Config Presets
Common starting configurations for typical deployment scenarios.
Strict Enforcement
For production agents handling real spend. Blocks on exhaustion, downgrades models, caps tool calls:
{
"plugins": {
"entries": {
"openclaw-budget-guard": {
"config": {
"tenant": "my-org",
"failClosed": true,
"lowBudgetStrategies": ["downgrade_model", "disable_expensive_tools", "limit_remaining_calls"],
"modelFallbacks": {
"anthropic/claude-opus-4-20250514": ["anthropic/claude-sonnet-4-20250514", "anthropic/claude-haiku-4-5-20251001"]
},
"modelBaseCosts": {
"anthropic/claude-opus-4-20250514": 1500000,
"anthropic/claude-sonnet-4-20250514": 300000,
"anthropic/claude-haiku-4-5-20251001": 100000
},
"toolBaseCosts": {
"web_search": 500000,
"code_execution": 1000000
},
"toolCallLimits": {
"send_email": 10,
"deploy": 3
},
"maxRemainingCallsWhenLow": 5
}
}
}
}
}
Development / Testing
Dry-run mode with generous budget. No Cycles server needed:
{
"plugins": {
"entries": {
"openclaw-budget-guard": {
"config": {
"tenant": "dev",
"cyclesBaseUrl": "http://unused",
"cyclesApiKey": "unused",
"dryRun": true,
"dryRunBudget": 500000000,
"logLevel": "debug"
}
}
}
}
}
Cost-Conscious
Aggressive cost savings. Low thresholds, model downgrade with token limits, expensive tools disabled early:
{
"plugins": {
"entries": {
"openclaw-budget-guard": {
"config": {
"tenant": "my-org",
"lowBudgetThreshold": 5000000,
"exhaustedThreshold": 100000,
"lowBudgetStrategies": ["downgrade_model", "reduce_max_tokens", "disable_expensive_tools"],
"maxTokensWhenLow": 512,
"expensiveToolThreshold": 200000,
"modelFallbacks": {
"anthropic/claude-opus-4-20250514": "anthropic/claude-haiku-4-5-20251001",
"openai/gpt-4o": "openai/gpt-4o-mini"
}
}
}
}
}
}
Configure for your use case
Most users only need 5-10 config properties. Start with what you need:
I just want to stop runaway agents (3 required fields only):
{ "tenant": "my-org", "cyclesBaseUrl": "...", "cyclesApiKey": "..." }
The defaults (failClosed: true, lowBudgetThreshold: 10000000) will block agents that exhaust their budget and warn when it gets low.
I want cost-aware model selection — add:
{
"modelFallbacks": { "anthropic/claude-opus-4-20250514": ["anthropic/claude-sonnet-4-20250514", "anthropic/claude-haiku-4-5-20251001"] },
"modelBaseCosts": { "anthropic/claude-opus-4-20250514": 1500000, "anthropic/claude-sonnet-4-20250514": 300000, "anthropic/claude-haiku-4-5-20251001": 100000 }
}
I want to cap dangerous tool calls — add:
{ "toolCallLimits": { "send_email": 10, "deploy": 3, "delete_data": 1 } }
I want observability — add:
{ "otlpMetricsEndpoint": "http://localhost:4318/v1/metrics" }
I want to catch runaway loops — add:
{ "burnRateAlertThreshold": 3.0, "onBurnRateAnomaly": "..." }
I want full debugging — add:
{ "enableEventLog": true, "logLevel": "debug" }
Config Reference
Core Settings
| Field | Type | Default | Description |
|-------|------|---------|-------------|
| enabled | boolean | true | Master switch — set to false to disable the plugin |
| cyclesBaseUrl | string | — | Cycles server URL (required) |
| cyclesApiKey | string | — | Cycles API key (required) |
| tenant | string | — | Cycles tenant identifier (required) |
| budgetId | string | — | Optional app-level scope for balance queries and reservations |
| currency | string | USD_MICROCENTS | Default budget unit for all reservations |
| failClosed | boolean | true | Block model calls when budget is exhausted or reservation is denied (false = warn, allow, and track cost locally). See failClosed behavior. |
| logLevel | string | info | debug / info / warn / error |
Budget Thresholds
| Field | Type | Default | Description |
|-------|------|---------|-------------|
| lowBudgetThreshold | number | 10000000 | Remaining budget at or below this triggers "low" mode |
| exhaustedThreshold | number | 0 | Remaining budget at or below this triggers "exhausted" mode |
Note: exhaustedThreshold must be strictly less than lowBudgetThreshold.
Model Configuration
| Field | Type | Default | Description |
|-------|------|---------|-------------|
| modelFallbacks | object | {} | Map: model → fallback model or chain of fallbacks (string or string[]) |
| modelBaseCosts | object | {} | Map: model name → estimated cost per call |
| defaultModelCost | number | 500000 | Fallback cost when a model isn't in modelBaseCosts |
| defaultModelActionKind | string | llm.completion | Action kind for model reservations |
| modelCurrency | string | — | Override currency for model reservations (defaults to currency) |
Tool Configuration
| Field | Type | Default | Description |
|-------|------|---------|-------------|
| toolBaseCosts | object | {} | Map: tool name → estimated cost per call |
| defaultToolActionKindPrefix | string | tool. | Prefix for tool action kinds (e.g. tool.web_search) |
| toolAllowlist | string[] | — | Only these tools are permitted (supports * wildcards) |
| toolBlocklist | string[] | — | These tools are blocked (supports * wildcards, takes precedence over allowlist) |
| toolCurrencies | object | — | Map: tool name → currency override |
| toolReservationTtls | object | — | Map: tool name → TTL override in ms |
| toolOveragePolicies | object | — | Map: tool name → overage policy override |
| toolCallLimits | object | — | Map: tool name → max invocations per session (e.g. {"send_email": 10}) |
Prompt Hints
| Field | Type | Default | Description |
|-------|------|---------|-------------|
| injectPromptBudgetHint | boolean | true | Inject budget status into the system prompt |
| maxPromptHintChars | number | 200 | Max characters for the injected budget hint |
Reservation Settings
| Field | Type | Default | Description |
|-------|------|---------|-------------|
| reservationTtlMs | number | 60000 | Default TTL for tool reservations (ms) |
| overagePolicy | string | ALLOW_IF_AVAILABLE | Default overage policy (REJECT, ALLOW_IF_AVAILABLE, ALLOW_WITH_OVERDRAFT) |
| snapshotCacheTtlMs | number | 5000 | How long to cache budget snapshots (ms) |
Low Budget Strategies
When remaining budget is at or below lowBudgetThreshold, the plugin applies degradation strategies to reduce spend. Strategies only activate when explicitly listed in lowBudgetStrategies. The default is ["downgrade_model"].
| Field | Type | Default | Description |
|-------|------|---------|-------------|
| lowBudgetStrategies | string[] | ["downgrade_model"] | Strategies to apply when budget is low. Each strategy below only takes effect when listed here. |
| maxTokensWhenLow | number | 1024 | Token limit hint (requires "reduce_max_tokens" in lowBudgetStrategies) |
| expensiveToolThreshold | number | — | Cost threshold (requires "disable_expensive_tools" in lowBudgetStrategies) |
| maxRemainingCallsWhenLow | number | 10 | Max calls allowed (requires "limit_remaining_calls" in lowBudgetStrategies) |
downgrade_model — Switch to cheaper models when budget is low. Requires modelFallbacks to define the fallback chain. The plugin tries each candidate in order and picks the first one whose cost (from modelBaseCosts) fits within the remaining budget. If no candidate fits, the original model is used.
{
"lowBudgetStrategies": ["downgrade_model"],
"modelFallbacks": {
"anthropic/claude-opus-4-20250514": ["anthropic/claude-sonnet-4-20250514", "anthropic/claude-haiku-4-5-20251001"]
},
"modelBaseCosts": {
"anthropic/claude-opus-4-20250514": 1500000,
"anthropic/claude-sonnet-4-20250514": 300000,
"anthropic/claude-haiku-4-5-20251001": 100000
}
}
reduce_max_tokens — Append a token limit instruction to the system prompt hint (e.g., "Limit responses to 512 tokens"). This is advisory — the LLM may not obey it. Does not enforce a hard token cap at the API level.
{
"lowBudgetStrategies": ["downgrade_model", "reduce_max_tokens"],
"maxTokensWhenLow": 512
}
disable_expensive_tools — Block tools whose estimated cost exceeds a threshold. The threshold defaults to lowBudgetThreshold / 10 if not set explicitly. Tools are always hard-blocked (regardless of failClosed).
{
"lowBudgetStrategies": ["downgrade_model", "disable_expensive_tools"],
"expensiveToolThreshold": 200000,
"toolBaseCosts": {
"web_search": 500000,
"code_execution": 1000000,
"read_file": 50000
}
}
In this example, web_search (500K) and code_execution (1M) would be blocked when budget is low, but read_file (50K) would still be allowed.
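The threshold check can be sketched as a small predicate. This is a minimal illustration, not the plugin's code: the interface and function names are hypothetical, "exceeds" is read as strictly greater than, and tools missing from toolBaseCosts are assumed to fall back to the 100,000-unit default cost.

```typescript
// Hypothetical shape; mirrors the config fields discussed above.
interface ExpensiveToolConfig {
  lowBudgetThreshold: number;
  expensiveToolThreshold?: number;
  toolBaseCosts: Record<string, number>;
}

const DEFAULT_TOOL_COST = 100_000; // assumed default per-call tool cost

function isToolTooExpensive(tool: string, cfg: ExpensiveToolConfig): boolean {
  // Falls back to lowBudgetThreshold / 10 when no explicit threshold is set.
  const threshold = cfg.expensiveToolThreshold ?? cfg.lowBudgetThreshold / 10;
  const cost = cfg.toolBaseCosts[tool] ?? DEFAULT_TOOL_COST;
  return cost > threshold;
}

const expensiveCfg: ExpensiveToolConfig = {
  lowBudgetThreshold: 10_000_000,
  expensiveToolThreshold: 200_000,
  toolBaseCosts: { web_search: 500_000, code_execution: 1_000_000, read_file: 50_000 },
};

const blockedTools = Object.keys(expensiveCfg.toolBaseCosts).filter((tool) =>
  isToolTooExpensive(tool, expensiveCfg),
);
// blockedTools contains web_search and code_execution, but not read_file
```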
limit_remaining_calls — Cap the total number of model + tool calls allowed while budget is low. Both model and tool calls decrement a shared counter. When the counter reaches zero, models respect failClosed (block or warn) while tools are always blocked.
{
"lowBudgetStrategies": ["downgrade_model", "limit_remaining_calls"],
"maxRemainingCallsWhenLow": 5
}
Important: Each strategy's config parameters (e.g., maxTokensWhenLow, expensiveToolThreshold, maxRemainingCallsWhenLow) are ignored unless the corresponding strategy is listed in lowBudgetStrategies. The plugin warns at startup if it detects this misconfiguration.
Strategies can be combined. They run in different hooks:
- Model calls (before_model_resolve): downgrade_model → limit_remaining_calls
- Tool calls (before_tool_call): disable_expensive_tools → limit_remaining_calls
- Prompt build (before_prompt_build): reduce_max_tokens
Within each hook, an earlier strategy that blocks prevents later strategies from running.
A typical production config uses all four:
{
"lowBudgetStrategies": ["downgrade_model", "reduce_max_tokens", "disable_expensive_tools", "limit_remaining_calls"],
"maxTokensWhenLow": 512,
"expensiveToolThreshold": 200000,
"maxRemainingCallsWhenLow": 5
}
Retry on Deny
| Field | Type | Default | Description |
|-------|------|---------|-------------|
| retryOnDeny | boolean | false | Retry tool reservations after denial |
| retryDelayMs | number | 2000 | Delay between retries (ms) |
| maxRetries | number | 1 | Maximum retry attempts |
Dry-Run Mode
| Field | Type | Default | Description |
|-------|------|---------|-------------|
| dryRun | boolean | false | Use in-memory simulated budget (no Cycles server needed) |
| dryRunBudget | number | 100000000 | Starting budget for dry-run mode |
Cost Estimation
| Field | Type | Default | Description |
|-------|------|---------|-------------|
| costEstimator | function | — | Custom callback (context) => number \| undefined for dynamic tool cost estimation |
The costEstimator receives a context object with toolName, durationMs, estimate, and result and should return the actual cost or undefined to use the estimate.
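As a rough illustration, a costEstimator might charge one tool by wall-clock duration and leave every other tool on its estimate. The ToolCostContext interface below is an assumption built from the four fields named above (the plugin's actual exported types may differ), and the per-second price is a made-up figure.

```typescript
// Assumed context shape, based on the fields listed above.
interface ToolCostContext {
  toolName: string;
  durationMs: number;
  estimate: number;
  result: unknown;
}

// Hypothetical price: 200,000 units per second of code execution.
const UNITS_PER_SECOND = 200_000;

const costEstimator = (ctx: ToolCostContext): number | undefined => {
  // Returning undefined keeps the original estimate.
  if (ctx.toolName !== "code_execution") return undefined;
  return Math.ceil((ctx.durationMs / 1000) * UNITS_PER_SECOND);
};

// A 2.5 second run is committed at 500,000 units instead of the 1,000,000 estimate.
const actualCost = costEstimator({
  toolName: "code_execution",
  durationMs: 2500,
  estimate: 1_000_000,
  result: null,
});
```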
Budget Transitions
| Field | Type | Default | Description |
|-------|------|---------|-------------|
| onBudgetTransition | function | — | Callback fired when budget level changes (e.g. healthy → low) |
| budgetTransitionWebhookUrl | string | — | POST webhook URL for budget level transitions |
Per-User/Session Scoping
| Field | Type | Default | Description |
|-------|------|---------|-------------|
| userId | string | — | User ID for budget scoping (can be overridden via ctx.metadata.userId) |
| sessionId | string | — | Session ID for budget scoping (can be overridden via ctx.metadata.sessionId) |
Session Analytics
| Field | Type | Default | Description |
|-------|------|---------|-------------|
| onSessionEnd | function | — | Callback with session summary at agent end |
| analyticsWebhookUrl | string | — | POST webhook URL for session summary data |
Budget Pools
| Field | Type | Default | Description |
|-------|------|---------|-------------|
| parentBudgetId | string | — | Parent budget ID — when set, pool balance is included in hints |
Model Cost Reconciliation (v0.5.0)
| Field | Type | Default | Description |
|-------|------|---------|-------------|
| modelCostEstimator | function | — | Callback (ctx: { model, estimatedCost, turnIndex }) => number \| undefined to reconcile model cost at commit time |
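
A minimal sketch of a modelCostEstimator, assuming your own proxy or gateway records token counts per turn. The usageByTurn store, the pricing table, and the ModelCostContext interface are all hypothetical; only the three context fields come from the signature above. Returning undefined is assumed to keep the estimated cost, as it does for costEstimator.

```typescript
// Assumed context shape, taken from the callback signature above.
interface ModelCostContext {
  model: string;
  estimatedCost: number;
  turnIndex: number;
}

// Hypothetical store filled by your own proxy/gateway: turn index -> token usage.
const usageByTurn = new Map<number, { inputTokens: number; outputTokens: number }>();

// Hypothetical prices in budget units per 1,000 tokens.
const pricePer1kTokens: Record<string, { input: number; output: number }> = {
  "anthropic/claude-sonnet-4-20250514": { input: 30_000, output: 150_000 },
};

function modelCostEstimator(ctx: ModelCostContext): number | undefined {
  const usage = usageByTurn.get(ctx.turnIndex);
  const price = pricePer1kTokens[ctx.model];
  if (!usage || !price) return undefined; // no data: keep the estimate
  const units = usage.inputTokens * price.input + usage.outputTokens * price.output;
  return Math.ceil(units / 1000);
}

usageByTurn.set(0, { inputTokens: 2000, outputTokens: 500 });
const reconciled = modelCostEstimator({
  model: "anthropic/claude-sonnet-4-20250514",
  estimatedCost: 300_000,
  turnIndex: 0,
}); // 135,000 units
```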
Observability (v0.5.0)
| Field | Type | Default | Description |
|-------|------|---------|-------------|
| metricsEmitter | object | — | Object with gauge/counter/histogram methods for observability pipeline integration |
| aggressiveCacheInvalidation | boolean | true | Proactively refetch budget snapshot after every commit/release for fresher data |
| otlpMetricsEndpoint | string | — | OTLP HTTP endpoint for auto metrics export (e.g. http://localhost:4318/v1/metrics) |
| otlpMetricsHeaders | object | — | Custom HTTP headers for OTLP requests |
Resilience (v0.6.0)
| Field | Type | Default | Description |
|-------|------|---------|-------------|
| heartbeatIntervalMs | number | 30000 | Interval for auto-extending long-running tool reservations (ms). Set 0 to disable. |
| retryableStatusCodes | number[] | [429, 503, 504] | HTTP status codes that trigger automatic retry with exponential backoff |
| transientRetryMaxAttempts | number | 2 | Max retry attempts for transient Cycles server errors |
| transientRetryBaseDelayMs | number | 500 | Base delay for exponential backoff on retries (ms) |
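
To make the retry settings concrete, here is a sketch of how the delays could be derived. It assumes plain doubling from the base delay with no jitter, which the table does not specify; the function names are illustrative.

```typescript
// Default values taken from the table above.
const DEFAULT_RETRYABLE_STATUS_CODES = [429, 503, 504];

function isRetryableStatus(
  status: number,
  retryable: number[] = DEFAULT_RETRYABLE_STATUS_CODES,
): boolean {
  return retryable.includes(status);
}

// Delay before each retry attempt, assuming base * 2^attempt (no jitter).
function backoffDelaysMs(baseDelayMs: number, maxAttempts: number): number[] {
  return Array.from({ length: maxAttempts }, (_, attempt) => baseDelayMs * 2 ** attempt);
}

const defaultDelays = backoffDelaysMs(500, 2); // [500, 1000]
```

Under these assumptions the defaults add at most 1.5 seconds of waiting before a transient failure is surfaced.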
Anomaly Detection (v0.6.0)
| Field | Type | Default | Description |
|-------|------|---------|-------------|
| burnRateWindowMs | number | 60000 | Time window for burn rate anomaly detection (ms) |
| burnRateAlertThreshold | number | 3.0 | Alert when current window burn rate exceeds this multiple of the previous window |
| onBurnRateAnomaly | function | — | Callback (event: BurnRateAnomalyEvent) => void on burn rate spike |
| exhaustionWarningThresholdMs | number | 120000 | Warn when estimated time-to-exhaustion drops below this (ms) |
| onExhaustionForecast | function | — | Callback (event: ExhaustionForecastEvent) => void on exhaustion forecast |
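
The two checks can be sketched as follows. This follows the table's wording (current window compared with the previous window, strict comparisons) and assumes a linear projection of the current burn rate; the plugin's actual windowing and smoothing may differ, and the function names are illustrative.

```typescript
// True when the current window's spend exceeds `threshold` times the previous window's.
function isBurnRateAnomaly(
  currentWindowSpend: number,
  previousWindowSpend: number,
  threshold = 3.0,
): boolean {
  if (previousWindowSpend <= 0) return false; // no baseline to compare against
  return currentWindowSpend / previousWindowSpend > threshold;
}

// Linear projection: remaining budget divided by spend per millisecond.
function estimateMsToExhaustion(remaining: number, windowSpend: number, windowMs = 60_000): number {
  if (windowSpend <= 0) return Infinity;
  return remaining / (windowSpend / windowMs);
}

function shouldWarnExhaustion(
  remaining: number,
  windowSpend: number,
  windowMs = 60_000,
  warningThresholdMs = 120_000,
): boolean {
  return estimateMsToExhaustion(remaining, windowSpend, windowMs) < warningThresholdMs;
}

// 3,000,000 units spent in the last minute with 5,000,000 left: about 100 seconds remain.
const msLeft = estimateMsToExhaustion(5_000_000, 3_000_000); // 100000
```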
Debugging (v0.6.0)
| Field | Type | Default | Description |
|-------|------|---------|-------------|
| enableEventLog | boolean | false | Record every reserve/commit/deny/block decision in sessionSummary.eventLog |
How It Works
Budget Levels
| Level | Condition | What Happens |
|-------|-----------|--------------|
| healthy | remaining > lowBudgetThreshold | Pass through — no intervention |
| low | exhaustedThreshold < remaining <= lowBudgetThreshold | Apply low-budget strategies, inject warnings |
| exhausted | remaining <= exhaustedThreshold | Block execution (failClosed=true) or warn + track locally (failClosed=false) |
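
The conditions in the table translate directly into a small classifier. This is a sketch with illustrative names, not the plugin's internal function.

```typescript
type BudgetLevel = "healthy" | "low" | "exhausted";

interface BudgetThresholds {
  lowBudgetThreshold: number;
  exhaustedThreshold: number;
}

function classifyBudget(remaining: number, t: BudgetThresholds): BudgetLevel {
  if (t.exhaustedThreshold >= t.lowBudgetThreshold) {
    throw new RangeError("exhaustedThreshold must be strictly less than lowBudgetThreshold");
  }
  if (remaining <= t.exhaustedThreshold) return "exhausted";
  if (remaining <= t.lowBudgetThreshold) return "low";
  return "healthy";
}

// With the default thresholds:
const defaultThresholds: BudgetThresholds = { lowBudgetThreshold: 10_000_000, exhaustedThreshold: 0 };
const levelAtThreshold = classifyBudget(10_000_000, defaultThresholds); // "low"
const levelAtZero = classifyBudget(0, defaultThresholds); // "exhausted"
```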
Hook: before_model_resolve
Fetches budget state and reserves budget for the model call. The reservation is held open and committed later (in before_prompt_build or at agent_end), allowing the optional modelCostEstimator callback to reconcile estimated vs actual costs. When budget is low:
- Applies model fallbacks (including chained fallbacks like opus → [sonnet, haiku])
- Enforces limit_remaining_calls if configured
- Attaches budget status metadata to ctx.metadata["openclaw-budget-guard-status"]
When budget is exhausted and failClosed=true, the plugin blocks the model call by overriding the model name to __cycles_budget_exhausted__, which causes the LLM provider to reject the request. The user sees "Unknown model: openai/__cycles_budget_exhausted__". This is intentional: OpenClaw's before_model_resolve hook does not support { block: true } like before_tool_call does (feature request), so this workaround is the only way to prevent model execution when budget runs out.
Hook: before_prompt_build
Commits any pending model reservation from the previous turn (with modelCostEstimator reconciliation if configured). When injectPromptBudgetHint is enabled, injects a system context hint with:
- Current remaining balance and percentage
- Budget level warnings
- Forecast projections (estimated remaining tool/model calls based on average costs)
- Team pool balance (when parentBudgetId is configured)
- Token limit guidance (when reduce_max_tokens strategy is active)
Example hint:
Budget: 5000000 USD_MICROCENTS remaining. Budget is low — prefer cheaper models and avoid expensive tools. 50% of budget remaining. Est. ~10 tool calls and ~5 model calls remaining at current rate. Team pool: 50000000 remaining.
Hook: before_tool_call
- Checks tool permissions against allowlist/blocklist
- Applies disable_expensive_tools and limit_remaining_calls strategies
- Creates a Cycles reservation with configured TTL, overage policy, and currency
- On denial, optionally retries (when retryOnDeny=true)
- Blocks or allows based on the reservation decision
Hook: after_tool_call
Commits the reservation with actual cost. Uses the costEstimator callback if configured, otherwise uses the original estimate. Tracks per-tool cost breakdowns for the session summary.
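The reserve → commit → release lifecycle that the tool hooks follow can be sketched with an in-memory stand-in for the budget. InMemoryLedger is a hypothetical class for illustration only; it is not the runcycles client API and ignores TTLs, overage policies, and concurrency.

```typescript
// Hypothetical in-memory ledger; not the runcycles client.
class InMemoryLedger {
  private remaining: number;
  private held = new Map<string, number>();
  private nextId = 1;

  constructor(startingBudget: number) {
    this.remaining = startingBudget;
  }

  // Hold the estimated cost up front; undefined means the reservation is denied.
  reserve(estimate: number): string | undefined {
    if (estimate > this.remaining) return undefined;
    this.remaining -= estimate;
    const id = `res-${this.nextId++}`;
    this.held.set(id, estimate);
    return id;
  }

  // Settle with the actual cost; any difference from the estimate is returned or charged.
  commit(id: string, actual: number): void {
    const estimate = this.held.get(id);
    if (estimate === undefined) throw new Error(`unknown reservation ${id}`);
    this.held.delete(id);
    this.remaining += estimate - actual;
  }

  // Give the full hold back, e.g. for an orphaned reservation at agent end.
  release(id: string): void {
    const estimate = this.held.get(id);
    if (estimate === undefined) return;
    this.held.delete(id);
    this.remaining += estimate;
  }

  balance(): number {
    return this.remaining;
  }
}

const ledger = new InMemoryLedger(1_000_000);
const first = ledger.reserve(500_000); // before_tool_call
if (first !== undefined) ledger.commit(first, 300_000); // after_tool_call, actual cost
const denied = ledger.reserve(800_000); // exceeds the 700,000 left: denied
const orphan = ledger.reserve(100_000);
if (orphan !== undefined) ledger.release(orphan); // agent_end cleanup
```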
Hook: agent_end
- Releases orphaned reservations (defensive cleanup)
- Fetches final budget state
- Builds session summary with cost breakdown, forecasts, and timing
- Calls onSessionEnd callback and fires analytics webhook if configured
- Attaches summary to ctx.metadata["openclaw-budget-guard"]
Chained Model Fallbacks
Model fallbacks support both single values and ordered chains:
{
"modelFallbacks": {
"anthropic/claude-opus-4-20250514": ["anthropic/claude-sonnet-4-20250514", "anthropic/claude-haiku-4-5-20251001"],
"openai/gpt-4o": "openai/gpt-4o-mini"
}
}
When budget is low, the plugin tries each candidate in order and selects the first one whose cost fits within the remaining budget.
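The selection rule can be sketched as below. This is an illustration rather than the plugin's implementation: pickModel is a hypothetical name, "fits" is assumed to mean cost less than or equal to the remaining budget, and models missing from modelBaseCosts are assumed to use defaultModelCost (500,000).

```typescript
type ModelFallbacks = Record<string, string | string[]>;

function pickModel(
  requested: string,
  remaining: number,
  fallbacks: ModelFallbacks,
  baseCosts: Record<string, number>,
  defaultModelCost = 500_000,
): string {
  const entry: string | string[] | undefined = fallbacks[requested];
  if (entry === undefined) return requested; // no fallback configured
  const chain = Array.isArray(entry) ? entry : [entry];
  for (const candidate of chain) {
    const cost = baseCosts[candidate] ?? defaultModelCost;
    if (cost <= remaining) return candidate; // first candidate that fits
  }
  return requested; // nothing fits: keep the original model
}

const fallbackChains: ModelFallbacks = {
  "anthropic/claude-opus-4-20250514": [
    "anthropic/claude-sonnet-4-20250514",
    "anthropic/claude-haiku-4-5-20251001",
  ],
  "openai/gpt-4o": "openai/gpt-4o-mini",
};
const modelCosts: Record<string, number> = {
  "anthropic/claude-sonnet-4-20250514": 300_000,
  "anthropic/claude-haiku-4-5-20251001": 100_000,
  "openai/gpt-4o-mini": 100_000,
};

// With 200,000 units left, sonnet (300,000) does not fit but haiku (100,000) does.
const chosen = pickModel("anthropic/claude-opus-4-20250514", 200_000, fallbackChains, modelCosts);
```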
Tool Allowlists and Blocklists
Control which tools can be called using glob-style patterns:
{
"toolAllowlist": ["web_search", "code_*"],
"toolBlocklist": ["dangerous_*"]
}
- Blocklist takes precedence over allowlist
- Supports exact names and * wildcards (prefix: code_*, suffix: *_tool, all: *)
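A minimal sketch of how such patterns could be evaluated, assuming * matches any run of characters and that a configured allowlist rejects every tool it does not match. The function names are illustrative; the plugin's own matcher may differ in edge cases.

```typescript
// Convert a glob with * wildcards into an anchored regular expression.
function globToRegExp(pattern: string): RegExp {
  const escaped = pattern
    .replace(/[.+?^${}()|[\]\\]/g, "\\$&") // escape regex metacharacters except *
    .replace(/\*/g, ".*"); // * matches any run of characters
  return new RegExp(`^${escaped}$`);
}

function isToolPermitted(tool: string, allowlist?: string[], blocklist?: string[]): boolean {
  const matchesAny = (patterns: string[]): boolean =>
    patterns.some((pattern) => globToRegExp(pattern).test(tool));
  if (blocklist && matchesAny(blocklist)) return false; // blocklist wins
  if (allowlist) return matchesAny(allowlist); // allowlist set: must match
  return true; // no allowlist: everything not blocked is permitted
}

const toolAllowlist = ["web_search", "code_*"];
const toolBlocklist = ["dangerous_*"];

const canSearch = isToolPermitted("web_search", toolAllowlist, toolBlocklist); // true
const canDelete = isToolPermitted("dangerous_delete", ["*"], toolBlocklist); // false
```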
Tool Call Limits
Cap the number of times a specific tool can be invoked per session. Useful for consequential actions like sending emails or triggering deployments:
{
"toolCallLimits": {
"send_email": 10,
"deploy": 3
}
}
Once a tool reaches its limit, further calls are blocked with a descriptive reason. Tools without a limit are unrestricted. Limits reset on each new agent session.
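The per-session counting behaviour can be sketched like this. ToolCallCounter is a hypothetical class for illustration; the wording of the block reason is made up.

```typescript
// Hypothetical per-session counter mirroring toolCallLimits.
class ToolCallCounter {
  private readonly limits: Record<string, number>;
  private counts = new Map<string, number>();

  constructor(limits: Record<string, number>) {
    this.limits = limits;
  }

  // Records the call when allowed; returns a reason when the limit is reached.
  tryCall(tool: string): { allowed: boolean; reason?: string } {
    const limit: number | undefined = this.limits[tool];
    const used = this.counts.get(tool) ?? 0;
    if (limit !== undefined && used >= limit) {
      return { allowed: false, reason: `${tool} reached its limit of ${limit} calls per session` };
    }
    this.counts.set(tool, used + 1);
    return { allowed: true };
  }

  // Call at the start of each new agent session.
  reset(): void {
    this.counts.clear();
  }
}

const counter = new ToolCallCounter({ send_email: 2 });
const callResults = [1, 2, 3].map(() => counter.tryCall("send_email").allowed);
// callResults is [true, true, false]
```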
Budget Transition Alerts
Configure callbacks or webhooks to be notified when budget level changes:
{
"budgetTransitionWebhookUrl": "https://hooks.example.com/budget-alert"
}
Or programmatically:
{
onBudgetTransition: (event) => {
console.log(`Budget changed: ${event.previousLevel} → ${event.currentLevel}`);
}
}
Error Handling
The plugin exports two structured error types:
import { BudgetExhaustedError, ToolBudgetDeniedError } from "@runcycles/openclaw-budget-guard";
- BudgetExhaustedError (code: "BUDGET_EXHAUSTED") — thrown when budget is exhausted and failClosed=true. Includes remaining, tenant, and budgetId properties. The error message includes an actionable hint to increase budget via the Cycles API.
- ToolBudgetDeniedError (code: "TOOL_BUDGET_DENIED") — available as a structured error type for tool denials. Includes toolName property.
failClosed — Block vs. Allow on Budget Denial
The failClosed setting (default: true) controls what happens when a model reservation is denied — either because the budget is exhausted or because the Cycles server rejects the reservation (e.g., the estimated cost exceeds remaining budget).
failClosed: true — The plugin blocks the model call. It returns a synthetic model override (__cycles_budget_exhausted__) that causes the LLM provider to reject the request. The agent stops. Use this in production when overspend is unacceptable.
failClosed: false — The plugin logs a warning and allows the model call to proceed. The estimated cost is tracked locally (session summary, cost breakdown, forecasting) even though no server-side reservation was committed. Use this for shadow/monitoring mode — you see what would have been blocked without disrupting the agent.
| Scenario | failClosed: true | failClosed: false |
|---|---|---|
| Budget exhausted (cached snapshot) | Block | Warn + allow |
| Server denies reservation (estimate > remaining) | Block | Warn + allow + track cost locally |
| Low-budget call limit reached (model) | Block | Warn + allow |
| Low-budget call limit reached (tool) | Always block | Always block |
| Expensive tool threshold exceeded | Always block | Always block |
| Tool reservation denied | Always block | Always block |
Note: All tool-level enforcement (reservation denials, call limits, expensive tool threshold) always blocks regardless of failClosed — tools have no fallback mechanism. failClosed only affects model-level decisions.
Fail-Open Behavior (Network Errors)
Separately from failClosed, the plugin handles network/transient errors with a fail-open strategy:
- If the Cycles server is unreachable, the plugin assumes healthy budget and allows execution
- If a commit fails, execution continues (logged but non-blocking)
This is always fail-open regardless of failClosed — a transient network blip should not kill every agent. failClosed only controls behavior when the server confirms the budget is insufficient.
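Taken together, the failClosed table and the fail-open rules amount to a small decision function. The sketch below uses illustrative event names that are not plugin identifiers; it only restates the documented outcomes.

```typescript
type Outcome = "allow" | "warn_and_allow" | "block";

// Illustrative event names; not identifiers from the plugin.
type GuardEvent =
  | "server_unreachable"
  | "commit_failed"
  | "model_budget_exhausted"
  | "model_reservation_denied"
  | "model_call_limit_reached"
  | "tool_call_limit_reached"
  | "tool_too_expensive"
  | "tool_reservation_denied";

function decide(event: GuardEvent, failClosed: boolean): Outcome {
  switch (event) {
    // Connectivity problems are fail-open regardless of failClosed.
    case "server_unreachable":
    case "commit_failed":
      return "allow";
    // Model-level decisions respect failClosed.
    case "model_budget_exhausted":
    case "model_reservation_denied":
    case "model_call_limit_reached":
      return failClosed ? "block" : "warn_and_allow";
    // Tool-level enforcement always blocks.
    case "tool_call_limit_reached":
    case "tool_too_expensive":
    case "tool_reservation_denied":
      return "block";
  }
}

const shadowMode = decide("model_reservation_denied", false); // "warn_and_allow"
const networkBlip = decide("server_unreachable", true); // "allow"
```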
Troubleshooting
"Skipping registration" warning during install
- This is normal. OpenClaw loads the plugin during install before your config is written. The plugin detects the missing config, logs a warning, and skips registration. After you add your config and restart the gateway, the plugin will register normally.
Plugin not loading
- Verify the plugin is enabled: openclaw plugins list
- Check that openclaw.plugin.json is included in the installed package
"Unknown model: openai/__cycles_budget_exhausted__" or "Budget exhausted"
Your budget has run out. To resume:
1. Fund the budget via the Cycles Admin API:
curl -X POST "http://localhost:7979/v1/admin/budgets/fund?scope=tenant:my-org&unit=USD_MICROCENTS" \
  -H "X-Cycles-API-Key: your-admin-key" \
  -H "Content-Type: application/json" \
  -d '{"operation": "CREDIT", "amount": 50000000, "idempotency_key": "topup-001"}'
This adds 50,000,000 units ($0.50) to the budget. Adjust the scope to match your tenant (and budgetId if set).
2. Start a new agent session — the plugin fetches fresh budget state at the start of each session.
For details on budget management, see Budget Allocation and Management.
"cyclesBaseUrl is required" error
- Set cyclesBaseUrl in your plugin config (use "${CYCLES_BASE_URL}" for env var interpolation)
Budget always shows "healthy"
- Verify currency, tenant, and budgetId match your Cycles setup
- Set logLevel: "debug" to see raw balance responses
Tools not being blocked
- Check toolBaseCosts includes your tool (default cost is 100,000 units)
- Check failClosed is true (default)
Model not being downgraded
- The exact model name must match a key in modelFallbacks
- Check model costs in modelBaseCosts — fallback must be cheaper than remaining budget
Production checklist
Before deploying to production:
- [ ] API key stored as env var (CYCLES_API_KEY), not in config file
- [ ] failClosed: true (default — blocks on exhausted budget)
- [ ] dryRun: false (default — uses real Cycles server)
- [ ] modelBaseCosts set for each model your agent uses
- [ ] toolBaseCosts set for at least your top 5 tools by usage
- [ ] toolCallLimits set for dangerous tools (send_email, deploy, etc.)
- [ ] lowBudgetThreshold calibrated for your session duration (default 10M = $0.10)
- [ ] Budget transition monitoring via onBudgetTransition callback or budgetTransitionWebhookUrl
- [ ] Session analytics via onSessionEnd callback or analyticsWebhookUrl
- [ ] Run one test session with logLevel: "debug" and enableEventLog: true to verify costs
Known Limitations
| Limitation | Impact | Workaround |
|---|---|---|
| Model cost is estimated by default. OpenClaw has no after_model_resolve hook, so model costs are based on modelBaseCosts estimates. The modelCostEstimator callback can reconcile costs if you have a proxy or gateway with token counts. | Cost tracking for models is approximate unless you provide a modelCostEstimator. The plugin will never overspend — it may under-track slightly. | Use modelCostEstimator to reconcile costs. Or buffer modelBaseCosts estimates 10–20% higher than expected. |
| ALLOW_WITH_CAPS decisions are not enforced. If the Cycles server returns caps (max_tokens, tool allowlist) alongside an ALLOW decision, the plugin stores them but does not apply them downstream. | Low risk — v0 Cycles servers rarely return caps. | Monitor Cycles protocol updates. |
| Per-user/session scoping uses custom dimensions. User and session IDs are passed as dimensions.user / dimensions.session in the reservation subject. v0 Cycles servers may ignore custom dimensions for balance filtering. | Per-user budget isolation depends on server support for dimensions. | Verify scoping works with your Cycles server version before relying on it in production. |
| Heartbeat requires client support. Reservation auto-extension (heartbeatIntervalMs) calls client.extendReservation(). If the Cycles client does not implement this method, heartbeats are silently skipped. | Long-running tools may still lose cost tracking if the client lacks extendReservation. | Use per-tool TTL overrides via toolReservationTtls as fallback. |
| Model blocking uses a provider-error workaround. OpenClaw's before_model_resolve hook does not support { block: true } (feature request). When budget is exhausted, the plugin overrides the model to __cycles_budget_exhausted__, causing the provider to reject the call. The user sees "Unknown model" instead of a clean budget error. | Model calls are effectively blocked, but the error message is a provider error rather than a budget message. Tool blocking via before_tool_call works cleanly with { block: true }. | Pending OpenClaw adding block support to before_model_resolve. |
| OpenClaw does not pass model name in hook events. The before_model_resolve event only contains { prompt } — no model name (feature request). The plugin auto-detects the model from system config or falls back to defaultModelName. | Model-specific cost tracking requires defaultModelName to be set in plugin config. | Set defaultModelName to your agent's model (e.g. "openai/gpt-5-nano"). |
For project structure, architecture diagrams, and development workflow, see ARCHITECTURE.md.
Documentation
- Cycles Documentation — full docs site
- OpenClaw Integration Guide — detailed integration guide
- API Key Management — creating and managing API keys
License
Apache-2.0
