@runcycles/openclaw-budget-guard

v0.8.4

Published

2 months ago

Runtime budget, action, and audit authority for OpenClaw agents — enforce LLM cost limits, tool call caps, and audit trails before execution.

cycles-openclaw-budget-guard

OpenClaw plugin for budget-aware model and tool execution using Cycles.

Why use this plugin?

AI agents make autonomous decisions — calling models, invoking tools, retrying on failure — with no human in the loop. Without runtime enforcement, several things go wrong:

Runaway spend. A single agent stuck in a tool loop or retrying failed calls can burn through hundreds of dollars in minutes. Provider spending caps are account-wide and too coarse. Rate limits don't account for cost. In-app counters don't survive restarts or coordinate across concurrent agents.

Uncontrolled side-effects. An agent can send 100 emails, trigger 50 deployments, or call dangerous APIs with nothing to stop it. Cost limits alone don't help — some actions are consequential regardless of price.

Noisy neighbors. In multi-tenant or multi-user setups, one agent can consume the entire team or tenant budget, starving other users. Without per-user scoping, there's no isolation.

No session-level cost visibility. When an agent session ends, you have no idea what it spent, which tools it called most, or whether it was cost-efficient. Debugging cost overruns after the fact is painful.

Abrupt failure. When budget runs out, the agent crashes instead of adapting — switching to cheaper models, reducing output length, or disabling expensive tools.

This plugin addresses those failure modes by checking model and tool execution before it runs, then degrading or blocking when budget conditions require it. It also tracks session-level cost breakdowns, tool usage, and budget transitions for debugging and operations.

Beyond enforcement, the plugin monitors for problems as they develop:

Burn rate anomaly detection catches runaway tool loops — if spending spikes 3x above the session average, onBurnRateAnomaly fires immediately
Predictive exhaustion warnings estimate when budget will run out and fire onExhaustionForecast before it happens
Automatic retry with backoff on transient Cycles server errors (429, 503, and 504 by default) prevents spurious denials under load
Reservation heartbeat auto-extends long-running tool reservations so cost tracking doesn't silently break
Observability via metricsEmitter (Datadog, Prometheus, Grafana, OTLP) and opt-in session event logs

In typical OpenClaw setups, you can add enforcement without changing agent logic.

For deeper background, see Why Rate Limits Are Not Enough and Runaway Agents and Tool Loops.

Overview

This OpenClaw plugin integrates with a live Cycles server to enforce budget boundaries during agent execution. It hooks into the OpenClaw plugin lifecycle to:

Reserve budget for model and tool calls using the reserve → commit → release protocol
Downgrade models when budget is low (configurable fallback chains)
Block execution when budget is exhausted (fail-closed by default)
Inject budget hints into prompts so the model is budget-aware
Detect budget transitions and fire callbacks/webhooks on level changes
Control tool access with allowlists, blocklists, and per-tool call limits
Apply graceful degradation strategies when budget is low
Retry denied reservations and transient server errors with configurable backoff
Keep long-running tools alive with automatic reservation heartbeat
Detect anomalies — burn rate spikes and predictive exhaustion warnings
Emit metrics to Datadog, Prometheus, Grafana, or any OTLP-compatible backend
Record an event log of every budget decision for debugging and compliance
Report unconfigured tools so you know which tools are using default cost estimates
Support dry-run mode for testing without a live Cycles server
Track per-tool cost breakdowns and session analytics with model cost reconciliation
Support multi-currency budgets with per-tool/model overrides
Support budget pools/hierarchies via parent budget visibility

The plugin uses the runcycles TypeScript client to communicate with a Cycles server.

Important: Budget exhaustion is enforced fail-closed by default, but Cycles server connectivity failures are handled fail-open — the plugin assumes healthy budget and allows execution to continue. Set failClosedOnSnapshotError: true to flip this for hardened deployments. See Fail-Open Behavior for details.

Prerequisites

OpenClaw >= 0.1.0 with plugin support
Node.js >= 20.0.0
A running Cycles server with:
- A base URL (e.g. http://localhost:7878)
- An API key
- A tenant configured with a budget scope

If you don't have a Cycles server yet, see the Cycles quickstart to set one up. Alternatively, use dry-run mode to test without a server.

To see budget enforcement in action before wiring up your own agent, run the Cycles Runaway Demo — it shows the exact failure mode this plugin prevents, with a live before/after comparison.

Quick Start

1. Install the plugin

openclaw plugins install @runcycles/openclaw-budget-guard

For local development:

openclaw plugins install -l ./cycles-openclaw-budget-guard

2. Enable the plugin

openclaw plugins enable openclaw-budget-guard

3. Add minimal configuration

Add the following to your OpenClaw config file (typically openclaw.json or openclaw.config.json):

{
  "plugins": {
    "entries": {
      "openclaw-budget-guard": {
        "config": {
          "cyclesBaseUrl": "http://localhost:7878",
          "cyclesApiKey": "cyc_your_api_key_here",
          "tenant": "my-org"
        }
      }
    }
  }
}

That's it. The plugin uses sensible defaults for everything else and now enforces budget limits on every agent run.

Need an API key? API keys are created via the Cycles Admin Server (port 7979). See the deployment guide to create one, or see API Key Management for details.

4. (Optional) Keep secrets out of config files

Use OpenClaw's env var interpolation to avoid hardcoding API keys:

{
  "plugins": {
    "entries": {
      "openclaw-budget-guard": {
        "config": {
          "cyclesBaseUrl": "${CYCLES_BASE_URL}",
          "cyclesApiKey": "${CYCLES_API_KEY}",
          "tenant": "my-org"
        }
      }
    }
  }
}

Then set the env vars in your shell or CI:

export CYCLES_BASE_URL="http://localhost:7878"
export CYCLES_API_KEY="cyc_your_api_key_here"

5. (Optional) Try dry-run mode

To test without a live Cycles server:

{
  "plugins": {
    "entries": {
      "openclaw-budget-guard": {
        "config": {
          "tenant": "my-org",
          "cyclesBaseUrl": "http://unused",
          "cyclesApiKey": "unused",
          "dryRun": true,
          "dryRunBudget": 100000000
        }
      }
    }
  }
}

6. Verify it's working

After restarting OpenClaw, check the logs for:

  Cycles Budget Guard for OpenClaw v0.8.4
  https://runcycles.io
  tenant: my-org
  cyclesBaseUrl: http://localhost:7878
  ...

Run your agent and look for budget activity:

[openclaw-budget-guard] before_model_resolve: model=anthropic/claude-sonnet-4-20250514 level=healthy

If you see this, the plugin is actively checking budget on every model and tool call.

Understanding the cost model

The plugin uses a simple model: every model call and tool call reserves a fixed cost from the budget.

Currency. The default is USD_MICROCENTS: 1 unit = $0.00000001 (one hundred-millionth of a dollar, or one millionth of a cent). So:

| Amount (units) | USD equivalent |
|----------------|---------------|
| 100,000 | $0.001 (0.1 cents) |
| 1,000,000 | $0.01 (1 cent) |
| 10,000,000 | $0.10 (10 cents) |
| 100,000,000 | $1.00 |

Example. With a $5 budget (500,000,000 units):

anthropic/claude-opus at 1,500,000/call = ~333 calls before exhaustion
anthropic/claude-sonnet at 300,000/call = ~1,666 calls
web_search at 500,000/call = ~1,000 calls
lowBudgetThreshold: 10000000 triggers model downgrade when $0.10 remains
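
The arithmetic above can be reproduced with a few lines of TypeScript. This is a minimal sketch, not part of the plugin: the helper names (`unitsToUsd`, `usdToUnits`, `callsBeforeExhaustion`) are illustrative assumptions, and the only fact taken from this README is that 100,000,000 units equal $1.00.

```typescript
// Assumption: 100,000,000 USD_MICROCENTS = $1.00, per the conversion table above.
const UNITS_PER_USD = 100_000_000;

// Convert budget units to dollars.
function unitsToUsd(units: number): number {
  return units / UNITS_PER_USD;
}

// Convert dollars to whole budget units.
function usdToUnits(usd: number): number {
  return Math.round(usd * UNITS_PER_USD);
}

// Number of whole calls that fit in a budget at a fixed per-call cost.
function callsBeforeExhaustion(budgetUnits: number, costPerCall: number): number {
  if (costPerCall <= 0) {
    throw new RangeError("costPerCall must be positive");
  }
  return Math.floor(budgetUnits / costPerCall);
}

const fiveDollarBudget = usdToUnits(5); // 500,000,000 units
const opusCalls = callsBeforeExhaustion(fiveDollarBudget, 1_500_000);
const sonnetCalls = callsBeforeExhaustion(fiveDollarBudget, 300_000);
const searchCalls = callsBeforeExhaustion(fiveDollarBudget, 500_000);
```

Running the same helper against your own `modelBaseCosts` and `toolBaseCosts` is a quick way to sanity-check a budget before funding it.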

Model names. OpenClaw passes model identifiers in provider/model format (e.g., openai/gpt-4o, anthropic/claude-sonnet-4-20250514). Your modelBaseCosts, modelFallbacks, and defaultModelName must use the same format — bare model names like gpt-4o won't match. The plugin automatically strips the provider prefix when returning modelOverride to OpenClaw, so you can use provider/model consistently in all config fields without double-prefixing issues.
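
To illustrate why bare model names do not match, here is a small sketch of a cost lookup keyed by provider/model. The function names (`stripProviderPrefix`, `lookupModelCost`) are hypothetical and are not the plugin's internals; the 500,000 fallback mirrors the documented `defaultModelCost`.

```typescript
// Hypothetical helper: drops the leading "provider/" segment, if present.
function stripProviderPrefix(modelId: string): string {
  const slash = modelId.indexOf("/");
  return slash === -1 ? modelId : modelId.slice(slash + 1);
}

// Costs are keyed by the full provider/model identifier, as in modelBaseCosts.
const sampleModelCosts: Record<string, number> = {
  "openai/gpt-4o": 1_000_000,
};

// Exact-key lookup; unknown keys fall back to the default cost.
function lookupModelCost(modelId: string, defaultModelCost = 500_000): number {
  return sampleModelCosts[modelId] ?? defaultModelCost;
}

const configuredCost = lookupModelCost("openai/gpt-4o"); // finds the configured entry
const bareNameCost = lookupModelCost("gpt-4o"); // misses and falls back to the default
const overrideName = stripProviderPrefix("openai/gpt-4o-mini"); // "gpt-4o-mini"
```

A bare name silently falls through to the default cost, which is why a misnamed entry tends to show up as a skewed estimate rather than as an error.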

Setting toolBaseCosts. Start with the default (100,000 units per call). After your first session, check the unconfiguredTools list in the session summary — it tells you which tools need explicit costs. For tools that call external APIs, estimate higher (500K-1M). For lightweight tools, estimate lower (10K-50K).

Full Configuration Example

{
  "plugins": {
    "entries": {
      "openclaw-budget-guard": {
        "config": {
          "enabled": true,
          "cyclesBaseUrl": "http://localhost:7878",
          "cyclesApiKey": "cyc_your_api_key_here",
          "tenant": "my-org",
          "budgetScope": { "app": "my-app" },
          "currency": "USD_MICROCENTS",
          "lowBudgetThreshold": 10000000,
          "exhaustedThreshold": 0,
          "modelFallbacks": {
            "anthropic/claude-opus-4-20250514": ["anthropic/claude-sonnet-4-20250514", "anthropic/claude-haiku-4-5-20251001"],
            "openai/gpt-4o": "openai/gpt-4o-mini"
          },
          "modelBaseCosts": {
            "anthropic/claude-opus-4-20250514": 1500000,
            "anthropic/claude-sonnet-4-20250514": 300000,
            "openai/gpt-4o": 1000000,
            "openai/gpt-4o-mini": 100000
          },
          "toolBaseCosts": {
            "web_search": 500000,
            "code_execution": 1000000
          },
          "toolCallLimits": {
            "send_email": 10,
            "deploy": 3
          },
          "injectPromptBudgetHint": true,
          "maxPromptHintChars": 200,
          "failClosed": true,
          "logLevel": "info",
          "reservationTtlMs": 60000,
          "overagePolicy": "ALLOW_IF_AVAILABLE",
          "lowBudgetStrategies": ["downgrade_model"],
          "maxTokensWhenLow": 1024,
          "retryOnDeny": false,
          "dryRun": false
        }
      }
    }
  }
}

Config Presets

Common starting configurations for typical deployment scenarios.

Strict Enforcement

For production agents handling real spend. Blocks on exhaustion, downgrades models, caps tool calls:

{
  "plugins": {
    "entries": {
      "openclaw-budget-guard": {
        "config": {
          "tenant": "my-org",
          "failClosed": true,
          "lowBudgetStrategies": ["downgrade_model", "disable_expensive_tools", "limit_remaining_calls"],
          "modelFallbacks": {
            "anthropic/claude-opus-4-20250514": ["anthropic/claude-sonnet-4-20250514", "anthropic/claude-haiku-4-5-20251001"]
          },
          "modelBaseCosts": {
            "anthropic/claude-opus-4-20250514": 1500000,
            "anthropic/claude-sonnet-4-20250514": 300000,
            "anthropic/claude-haiku-4-5-20251001": 100000
          },
          "toolBaseCosts": {
            "web_search": 500000,
            "code_execution": 1000000
          },
          "toolCallLimits": {
            "send_email": 10,
            "deploy": 3
          },
          "maxRemainingCallsWhenLow": 5
        }
      }
    }
  }
}

Development / Testing

Dry-run mode with generous budget. No Cycles server needed:

{
  "plugins": {
    "entries": {
      "openclaw-budget-guard": {
        "config": {
          "tenant": "dev",
          "cyclesBaseUrl": "http://unused",
          "cyclesApiKey": "unused",
          "dryRun": true,
          "dryRunBudget": 500000000,
          "logLevel": "debug"
        }
      }
    }
  }
}

Cost-Conscious

Aggressive cost savings. Low thresholds, model downgrade with token limits, expensive tools disabled early:

{
  "plugins": {
    "entries": {
      "openclaw-budget-guard": {
        "config": {
          "tenant": "my-org",
          "lowBudgetThreshold": 5000000,
          "exhaustedThreshold": 100000,
          "lowBudgetStrategies": ["downgrade_model", "reduce_max_tokens", "disable_expensive_tools"],
          "maxTokensWhenLow": 512,
          "expensiveToolThreshold": 200000,
          "modelFallbacks": {
            "anthropic/claude-opus-4-20250514": "anthropic/claude-haiku-4-5-20251001",
            "openai/gpt-4o": "openai/gpt-4o-mini"
          }
        }
      }
    }
  }
}

Configure for your use case

Most users only need 5-10 config properties. Start with what you need:

I just want to stop runaway agents (3 required fields only):

{ "tenant": "my-org", "cyclesBaseUrl": "...", "cyclesApiKey": "..." }

The defaults (failClosed: true, lowBudgetThreshold: 10000000) will block agents that exhaust their budget and warn when it gets low.

I want cost-aware model selection — add:

{
  "modelFallbacks": { "anthropic/claude-opus-4-20250514": ["anthropic/claude-sonnet-4-20250514", "anthropic/claude-haiku-4-5-20251001"] },
  "modelBaseCosts": { "anthropic/claude-opus-4-20250514": 1500000, "anthropic/claude-sonnet-4-20250514": 300000, "anthropic/claude-haiku-4-5-20251001": 100000 }
}

I want to cap dangerous tool calls — add:

{ "toolCallLimits": { "send_email": 10, "deploy": 3, "delete_data": 1 } }

I want observability — add:

{ "otlpMetricsEndpoint": "http://localhost:4318/v1/metrics" }

I want to catch runaway loops — add:

{ "burnRateAlertThreshold": 3.0, "onBurnRateAnomaly": "..." }

I want full debugging — add:

{ "enableEventLog": true, "logLevel": "debug" }

Config Reference

Core Settings

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| enabled | boolean | true | Master switch — set to false to disable the plugin |
| cyclesBaseUrl | string | — | Cycles server URL (required) |
| cyclesApiKey | string | — | Cycles API key (required) |
| tenant | string | — | Cycles tenant identifier (required) |
| budgetScope | object | — | Scope segments for targeting a specific budget (e.g. { "workspace": "road", "app": "lane" }). See Budget Scoping. |
| budgetId | string | — | Deprecated — use budgetScope instead. Equivalent to budgetScope: { "app": "<value>" }. |
| currency | string | USD_MICROCENTS | Default budget unit for all reservations |
| failClosed | boolean | true | Block model calls when budget is exhausted or reservation is denied (false = warn, allow, and track cost locally). See failClosed behavior. |
| logLevel | string | info | debug / info / warn / error |

Budget Thresholds

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| lowBudgetThreshold | number | 10000000 | Remaining budget at or below this triggers "low" mode |
| exhaustedThreshold | number | 0 | Remaining budget at or below this triggers "exhausted" mode |

Note: exhaustedThreshold must be strictly less than lowBudgetThreshold.

Model Configuration

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| modelFallbacks | object | {} | Map: model → fallback model or chain of fallbacks (string or string[]) |
| modelBaseCosts | object | {} | Map: model name → estimated cost per call |
| defaultModelCost | number | 500000 | Fallback cost when a model isn't in modelBaseCosts |
| defaultModelActionKind | string | llm.completion | Action kind for model reservations |
| modelCurrency | string | — | Override currency for model reservations (defaults to currency) |

Tool Configuration

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| toolBaseCosts | object | {} | Map: tool name → estimated cost per call |
| defaultToolActionKindPrefix | string | tool. | Prefix for tool action kinds (e.g. tool.web_search) |
| toolAllowlist | string[] | — | Only these tools are permitted (supports * wildcards) |
| toolBlocklist | string[] | — | These tools are blocked (supports * wildcards, takes precedence over allowlist) |
| toolCurrencies | object | — | Map: tool name → currency override |
| toolReservationTtls | object | — | Map: tool name → TTL override in ms |
| toolOveragePolicies | object | — | Map: tool name → overage policy override |
| toolCallLimits | object | — | Map: tool name → max invocations per session (e.g. {"send_email": 10}) |

Prompt Hints

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| injectPromptBudgetHint | boolean | true | Inject budget status into the system prompt |
| maxPromptHintChars | number | 200 | Max characters for the injected budget hint |

Reservation Settings

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| reservationTtlMs | number | 60000 | Default TTL for tool reservations (ms). Capped at 1 hour. |
| overagePolicy | string | ALLOW_IF_AVAILABLE | Default overage policy (REJECT, ALLOW_IF_AVAILABLE, ALLOW_WITH_OVERDRAFT) |
| snapshotCacheTtlMs | number | 5000 | How long to cache budget snapshots (ms) |
| failClosedOnSnapshotError | boolean | false | Treat an unreachable Cycles control plane as exhausted instead of fail-open healthy. See Fail-Open Behavior. |

Low Budget Strategies

When remaining budget drops to or below lowBudgetThreshold, the plugin applies degradation strategies to reduce spend. Strategies only activate when explicitly listed in lowBudgetStrategies. The default is ["downgrade_model"].

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| lowBudgetStrategies | string[] | ["downgrade_model"] | Strategies to apply when budget is low. Each strategy below only takes effect when listed here. |
| maxTokensWhenLow | number | 1024 | Token limit hint (requires "reduce_max_tokens" in lowBudgetStrategies) |
| expensiveToolThreshold | number | — | Cost threshold (requires "disable_expensive_tools" in lowBudgetStrategies) |
| maxRemainingCallsWhenLow | number | 10 | Max calls allowed (requires "limit_remaining_calls" in lowBudgetStrategies) |

downgrade_model — Switch to cheaper models when budget is low. Requires modelFallbacks to define the fallback chain. The plugin tries each candidate in order and picks the first one whose cost (from modelBaseCosts) fits within the remaining budget. If no candidate fits, the original model is used.

{
  "lowBudgetStrategies": ["downgrade_model"],
  "modelFallbacks": {
    "anthropic/claude-opus-4-20250514": ["anthropic/claude-sonnet-4-20250514", "anthropic/claude-haiku-4-5-20251001"]
  },
  "modelBaseCosts": {
    "anthropic/claude-opus-4-20250514": 1500000,
    "anthropic/claude-sonnet-4-20250514": 300000,
    "anthropic/claude-haiku-4-5-20251001": 100000
  }
}

reduce_max_tokens — Append a token limit instruction to the system prompt hint (e.g., "Limit responses to 512 tokens"). This is advisory — the LLM may not obey it. Does not enforce a hard token cap at the API level.

{
  "lowBudgetStrategies": ["downgrade_model", "reduce_max_tokens"],
  "maxTokensWhenLow": 512
}

disable_expensive_tools — Block tools whose estimated cost exceeds a threshold. The threshold defaults to lowBudgetThreshold / 10 if not set explicitly. Tools are always hard-blocked (regardless of failClosed).

{
  "lowBudgetStrategies": ["downgrade_model", "disable_expensive_tools"],
  "expensiveToolThreshold": 200000,
  "toolBaseCosts": {
    "web_search": 500000,
    "code_execution": 1000000,
    "read_file": 50000
  }
}

In this example, web_search (500K) and code_execution (1M) would be blocked when budget is low, but read_file (50K) would still be allowed.
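
The blocking rule can be sketched as a single comparison. This is an illustrative approximation under stated assumptions, not the plugin's code: the function name `isToolTooExpensive` is hypothetical, "exceeds" is read as a strict greater-than, and tools missing from `toolBaseCosts` are assumed to use the 100,000-unit default mentioned earlier.

```typescript
interface ExpensiveToolOptions {
  lowBudgetThreshold: number;
  expensiveToolThreshold?: number; // falls back to lowBudgetThreshold / 10
  defaultToolCost?: number; // assumed 100,000 when a tool has no explicit cost
}

// Returns true when the tool's estimated cost exceeds the threshold.
function isToolTooExpensive(
  tool: string,
  toolBaseCosts: Record<string, number>,
  opts: ExpensiveToolOptions,
): boolean {
  const threshold = opts.expensiveToolThreshold ?? opts.lowBudgetThreshold / 10;
  const cost = toolBaseCosts[tool] ?? opts.defaultToolCost ?? 100_000;
  return cost > threshold;
}

const sampleToolCosts: Record<string, number> = {
  web_search: 500_000,
  code_execution: 1_000_000,
  read_file: 50_000,
};

// Explicit threshold of 200,000, as in the config above.
const explicitOpts: ExpensiveToolOptions = {
  lowBudgetThreshold: 10_000_000,
  expensiveToolThreshold: 200_000,
};
const blockedTools = Object.keys(sampleToolCosts).filter((tool) =>
  isToolTooExpensive(tool, sampleToolCosts, explicitOpts),
);
```

With `expensiveToolThreshold` omitted and the default `lowBudgetThreshold` of 10,000,000, the derived threshold is 1,000,000, so under the strict-comparison assumption none of these three tools would be blocked.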

limit_remaining_calls — Cap the total number of model + tool calls allowed while budget is low. Both model and tool calls decrement a shared counter. When the counter reaches zero, models respect failClosed (block or warn) while tools are always blocked.

{
  "lowBudgetStrategies": ["downgrade_model", "limit_remaining_calls"],
  "maxRemainingCallsWhenLow": 5
}
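
The shared counter can be pictured with a small class. This is a sketch of the behavior described above, assuming the counter starts at `maxRemainingCallsWhenLow` when budget first becomes low; the class and type names are hypothetical.

```typescript
type CallKind = "model" | "tool";
type CallDecision = "allow" | "warn" | "block";

class LowBudgetCallLimiter {
  private remainingCalls: number;
  private readonly failClosed: boolean;

  constructor(maxRemainingCallsWhenLow: number, failClosed: boolean) {
    this.remainingCalls = maxRemainingCallsWhenLow;
    this.failClosed = failClosed;
  }

  // Model and tool calls both decrement the same counter.
  check(kind: CallKind): CallDecision {
    if (this.remainingCalls > 0) {
      this.remainingCalls -= 1;
      return "allow";
    }
    // Counter exhausted: tools are always blocked.
    if (kind === "tool") {
      return "block";
    }
    // Models respect failClosed: block when true, warn and continue when false.
    return this.failClosed ? "block" : "warn";
  }
}
```

For example, with `new LowBudgetCallLimiter(5, true)` the sixth call of either kind is blocked; with `failClosed` set to false, a sixth model call only produces a warning while a sixth tool call is still blocked.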

Important: Each strategy's config parameters (e.g., maxTokensWhenLow, expensiveToolThreshold, maxRemainingCallsWhenLow) are ignored unless the corresponding strategy is listed in lowBudgetStrategies. The plugin warns at startup if it detects this misconfiguration.

Strategies can be combined. They run in different hooks:

Model calls (before_model_resolve): downgrade_model → limit_remaining_calls
Tool calls (before_tool_call): disable_expensive_tools → limit_remaining_calls
Prompt build (before_prompt_build): reduce_max_tokens

Within each hook, an earlier strategy that blocks prevents later strategies from running.

A typical production config uses all four:

{
  "lowBudgetStrategies": ["downgrade_model", "reduce_max_tokens", "disable_expensive_tools", "limit_remaining_calls"],
  "maxTokensWhenLow": 512,
  "expensiveToolThreshold": 200000,
  "maxRemainingCallsWhenLow": 5
}

Retry on Deny

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| retryOnDeny | boolean | false | Retry tool reservations after denial |
| retryDelayMs | number | 2000 | Delay between retries (ms). Capped at 60s. |
| maxRetries | number | 1 | Maximum retry attempts. Integer 0–10. |

Dry-Run Mode

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| dryRun | boolean | false | Use in-memory simulated budget (no Cycles server needed) |
| dryRunBudget | number | 100000000 | Starting budget for dry-run mode |

Cost Estimation

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| costEstimator | function | — | Custom callback (context) => number \| undefined for dynamic tool cost estimation |

The costEstimator receives a context object with toolName, durationMs, estimate, and result and should return the actual cost or undefined to use the estimate.
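
For library users (function-type options cannot be set through OpenClaw's JSON config), a `costEstimator` might look like the sketch below. The context fields come from the description above; the `ToolCostContext` type name, the `result` type, and the per-second pricing rule are assumptions made up for illustration.

```typescript
// Shape assumed from the documented fields: toolName, durationMs, estimate, result.
interface ToolCostContext {
  toolName: string;
  durationMs: number;
  estimate: number;
  result: unknown;
}

// Hypothetical pricing rule: bill code_execution by wall-clock time.
const CODE_EXECUTION_UNITS_PER_SECOND = 200_000;

function costEstimator(ctx: ToolCostContext): number | undefined {
  if (ctx.toolName === "code_execution") {
    return Math.ceil((ctx.durationMs / 1000) * CODE_EXECUTION_UNITS_PER_SECOND);
  }
  // Returning undefined keeps the original reserved estimate.
  return undefined;
}
```

Under this rule, a `code_execution` call that ran for 2.5 seconds commits 500,000 units regardless of its 1,000,000-unit estimate, and every other tool commits its estimate unchanged.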

Budget Transitions

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| onBudgetTransition | function | — | Callback fired when budget level changes (e.g. healthy → low) |
| budgetTransitionWebhookUrl | string | — | POST webhook URL for budget level transitions. Must be http/https (validated at config load). |

Per-User/Session Scoping

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| userId | string | — | User ID for budget scoping (can be overridden via ctx.metadata.userId) |
| sessionId | string | — | Session ID for budget scoping (can be overridden via ctx.metadata.sessionId) |

Session Analytics

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| onSessionEnd | function | — | Callback with session summary at agent end |
| analyticsWebhookUrl | string | — | POST webhook URL for session summary data. Must be http/https (validated at config load). |

Budget Scoping with `budgetScope`

By default, the plugin tracks all spend against the tenant-level budget. If you run multiple agents or applications under the same tenant, they share one budget pool — one agent can consume the entire budget and starve others.

budgetScope targets a specific budget in the Cycles scope hierarchy. It supports any combination of scope levels (workspace, app, workflow, agent, toolset):

tenant: "my-org"                        ← shared across all apps
  workspace: "team-a"                   ← team-level isolation
    app: "research-agent"               ← $5 budget
    app: "coding-agent"                 ← $10 budget

Step 1: Create and fund the budget in Cycles (via the Admin API):

curl -X POST "http://localhost:7979/v1/admin/budgets/fund?scope=tenant:my-org/workspace:team-a/app:research-agent&unit=USD_MICROCENTS" \
  -H "X-Cycles-API-Key: your-admin-key" \
  -H "Content-Type: application/json" \
  -d '{
    "operation": "CREDIT",
    "amount": 500000000,
    "idempotency_key": "fund-research-agent-001"
  }'

This creates the scope under tenant:my-org and funds it with 500,000,000 units ($5.00). The scope is created automatically if it doesn't exist.

Step 2: Set budgetScope in the plugin config:

{
  "tenant": "my-org",
  "budgetScope": {
    "workspace": "team-a",
    "app": "research-agent"
  }
}

The plugin then:

Queries balances filtered to workspace: "team-a", app: "research-agent"
Creates reservations scoped to all specified segments
Reports spend against that specific budget, not the tenant total

For simple app-only scoping, use just the app key:

{
  "tenant": "my-org",
  "budgetScope": { "app": "research-agent" }
}

Migration from budgetId: budgetId is deprecated but still works. "budgetId": "research-agent" is equivalent to "budgetScope": { "app": "research-agent" }. If both are set, budgetScope takes precedence.
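
The scope string used in the funding request and the budgetId precedence rule can be sketched as follows. This is an assumption-laden illustration: the function names are hypothetical, and the `level:value/level:value` format is inferred from the curl example above rather than from a protocol specification.

```typescript
// Scope levels below tenant, in hierarchy order.
const SCOPE_LEVELS = ["workspace", "app", "workflow", "agent", "toolset"] as const;
type ScopeLevel = (typeof SCOPE_LEVELS)[number];
type BudgetScope = Partial<Record<ScopeLevel, string>>;

interface ScopeConfig {
  tenant: string;
  budgetScope?: BudgetScope;
  budgetId?: string; // deprecated alias for budgetScope.app
}

// budgetScope wins when both are set; budgetId maps to { app: <value> }.
function resolveBudgetScope(cfg: ScopeConfig): BudgetScope {
  if (cfg.budgetScope !== undefined) {
    return cfg.budgetScope;
  }
  return cfg.budgetId !== undefined ? { app: cfg.budgetId } : {};
}

// Builds "tenant:my-org/workspace:team-a/app:research-agent" style paths.
function buildScopePath(cfg: ScopeConfig): string {
  const scope = resolveBudgetScope(cfg);
  const segments = [`tenant:${cfg.tenant}`];
  for (const level of SCOPE_LEVELS) {
    const value = scope[level];
    if (value !== undefined) {
      segments.push(`${level}:${value}`);
    }
  }
  return segments.join("/");
}
```

Segments are emitted in hierarchy order regardless of key order in the config, so `{ "app": "research-agent", "workspace": "team-a" }` yields the same path as the funding example.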

When to use budgetScope:

Multiple agents under the same tenant that need isolated budgets
Per-project or per-team spend tracking
Budgets with intermediate scope levels (workspace, workflow, etc.)
Preventing one agent from consuming the entire tenant budget

When to skip it:

Single agent setup — tenant-level budget is sufficient
You want all agents to share a single budget pool

Cycles scope hierarchy and what the plugin supports:

The Cycles protocol supports a full scope hierarchy: tenant → workspace → app → workflow → agent → toolset. The plugin supports all levels via budgetScope:

| Cycles scope | Plugin config | Used for |
|---|---|---|
| tenant | tenant (required) | Top-level budget boundary |
| workspace, app, workflow, agent, toolset | budgetScope (optional) | Budget isolation at any scope level |
| dimensions.user | userId | Per-user spend tracking within a scope |
| dimensions.session | sessionId | Per-session spend tracking within a scope |

Budget Pools (Team Visibility)

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| parentBudgetId | string | — | Parent budget scope — when set, the team/pool balance is included in prompt hints |

parentBudgetId is a read-only visibility feature. When set, the plugin fetches the parent scope's balance and includes it in the prompt hint (e.g., "Team pool: 50000000 remaining"). It does not enforce the parent budget — enforcement happens at the scoped level via budgetScope.

Model Cost Reconciliation (v0.5.0)

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| modelCostEstimator | function | — | Callback (ctx: { model, estimatedCost, turnIndex }) => number \| undefined to reconcile model cost at commit time |

Observability (v0.5.0)

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| metricsEmitter | object | — | Object with gauge/counter/histogram methods and optional flush() for observability pipeline integration. flush() is called at agent_end to ensure buffered metrics are sent. |
| aggressiveCacheInvalidation | boolean | true | Proactively refetch budget snapshot after every commit/release for fresher data |
| otlpMetricsEndpoint | string | — | OTLP HTTP endpoint for auto metrics export (e.g. http://localhost:4318/v1/metrics). Must be http/https (validated at config load). |
| otlpMetricsHeaders | object | — | Custom HTTP headers for OTLP requests |

Resilience (v0.6.0)

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| heartbeatIntervalMs | number | 30000 | Interval for auto-extending long-running tool reservations (ms). Set 0 to disable. Capped at 1 hour. |
| retryableStatusCodes | number[] | [429, 503, 504] | HTTP status codes that trigger automatic retry with exponential backoff |
| transientRetryMaxAttempts | number | 2 | Max retry attempts for transient Cycles server errors. Integer 0–10. |
| transientRetryBaseDelayMs | number | 500 | Base delay for exponential backoff on retries (ms). Capped at 60s. |
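
To make the retry settings concrete, here is a sketch of how the delay and retry decision could be computed. The doubling schedule (base × 2^attempt) is an assumption, since this README says only "exponential backoff"; whether the 60s cap applies to each computed delay is also assumed, and the function names are hypothetical.

```typescript
const MAX_BACKOFF_MS = 60_000; // delays are assumed to be capped at 60s each

// attempt is 0-based: 0 is the delay before the first retry.
function backoffDelayMs(attempt: number, baseDelayMs = 500): number {
  return Math.min(baseDelayMs * Math.pow(2, attempt), MAX_BACKOFF_MS);
}

// retriesSoFar counts retries already performed for this request.
function shouldRetryTransient(
  status: number,
  retriesSoFar: number,
  retryableStatusCodes: number[] = [429, 503, 504],
  transientRetryMaxAttempts = 2,
): boolean {
  return (
    retriesSoFar < transientRetryMaxAttempts &&
    retryableStatusCodes.indexOf(status) !== -1
  );
}
```

With the defaults, a request that keeps returning 503 would be retried after roughly 500 ms and then 1,000 ms before the error is surfaced; a 500 response would not be retried at all.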

Anomaly Detection (v0.6.0)

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| burnRateWindowMs | number | 60000 | Time window for burn rate anomaly detection (ms) |
| burnRateAlertThreshold | number | 3.0 | Alert when current window burn rate exceeds this multiple of the previous window |
| onBurnRateAnomaly | function | — | Callback (event: BurnRateAnomalyEvent) => void on burn rate spike |
| exhaustionWarningThresholdMs | number | 120000 | Warn when estimated time-to-exhaustion drops below this (ms) |
| onExhaustionForecast | function | — | Callback (event: ExhaustionForecastEvent) => void on exhaustion forecast |
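
The two checks can be approximated with simple ratios. This sketch follows the wording of the table above (current window compared with the previous window); the linear extrapolation for time-to-exhaustion and all function names are assumptions, and the plugin's actual estimator may differ.

```typescript
// True when spend in the current window exceeds threshold × previous window spend.
function isBurnRateAnomaly(
  currentWindowSpend: number,
  previousWindowSpend: number,
  burnRateAlertThreshold = 3.0,
): boolean {
  if (previousWindowSpend <= 0) {
    return false; // no baseline to compare against
  }
  return currentWindowSpend > previousWindowSpend * burnRateAlertThreshold;
}

// Linear extrapolation: how long the remaining budget lasts at the current rate.
function timeToExhaustionMs(
  remaining: number,
  currentWindowSpend: number,
  burnRateWindowMs = 60_000,
): number {
  if (currentWindowSpend <= 0) {
    return Infinity; // nothing is being spent
  }
  return (remaining / currentWindowSpend) * burnRateWindowMs;
}

function shouldWarnExhaustion(
  remaining: number,
  currentWindowSpend: number,
  exhaustionWarningThresholdMs = 120_000,
): boolean {
  return timeToExhaustionMs(remaining, currentWindowSpend) < exhaustionWarningThresholdMs;
}
```

For example, spending 1,000,000 units per minute with 1,000,000 units left gives an estimate of 60 seconds, which is under the default 120-second warning threshold.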

Debugging (v0.6.0)

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| enableEventLog | boolean | false | Record every reserve/commit/deny/block decision in sessionSummary.eventLog |

Function-Type Config — Not Available in OpenClaw

The config reference includes several function-type parameters (costEstimator, modelCostEstimator, onBudgetTransition, onSessionEnd, onBurnRateAnomaly, onExhaustionForecast, metricsEmitter). These cannot be used with OpenClaw. OpenClaw plugins are configured via JSON only — there is no mechanism to pass JavaScript functions.

Use these JSON-configurable alternatives instead:

| Instead of... | Use... | How it works |
|---|---|---|
| costEstimator | toolBaseCosts | Fixed cost per tool. Tune estimates using session summary data. |
| modelCostEstimator | modelBaseCosts | Fixed cost per model. |
| onBudgetTransition | budgetTransitionWebhookUrl | Sends HTTP POST with level change event to your endpoint. |
| onSessionEnd | analyticsWebhookUrl | Sends HTTP POST with full session summary to your endpoint. |
| onBurnRateAnomaly | otlpMetricsEndpoint | Emits cycles.budget.burn_rate_anomaly counter to your OTLP backend. |
| onExhaustionForecast | otlpMetricsEndpoint | Emits cycles.budget.exhaustion_forecast_ms gauge to your OTLP backend. |
| metricsEmitter | otlpMetricsEndpoint | Auto-creates an OTLP emitter — no custom code needed. |

Tuning cost estimates without a callback:

Start with rough values in toolBaseCosts / modelBaseCosts (or use defaults)
Set enableEventLog: true in your config
Run a few agent sessions
Check the session summary in the logs — it shows per-tool and per-model cost breakdowns, plus an unconfiguredTools list of tools using the default estimate
Adjust your cost values based on actual usage patterns and re-run

This iterative approach is more practical than writing a cost estimator function, since the estimates only need to be "close enough" — Cycles reservations lock the estimated amount and commits charge the actual.

Why do the function params exist? The plugin is also published as an npm package. The function API is available for developers who import the plugin as a library in custom agent frameworks or test harnesses — not for standard OpenClaw JSON config.

How It Works

Budget Levels

| Level | Condition | What Happens |
|-------|-----------|--------------|
| healthy | remaining > lowBudgetThreshold | Pass through — no intervention |
| low | exhaustedThreshold < remaining <= lowBudgetThreshold | Apply low-budget strategies, inject warnings |
| exhausted | remaining <= exhaustedThreshold | Block execution (failClosed=true) or warn + track locally (failClosed=false) |
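
The level conditions translate directly into code. The sketch below is illustrative only; the `classifyBudget` name and `BudgetThresholds` type are hypothetical, while the comparisons and the strict-ordering check come from this README.

```typescript
type BudgetLevel = "healthy" | "low" | "exhausted";

interface BudgetThresholds {
  lowBudgetThreshold: number;
  exhaustedThreshold: number;
}

// Defaults documented in the Budget Thresholds table.
const DEFAULT_THRESHOLDS: BudgetThresholds = {
  lowBudgetThreshold: 10_000_000,
  exhaustedThreshold: 0,
};

function classifyBudget(
  remaining: number,
  thresholds: BudgetThresholds = DEFAULT_THRESHOLDS,
): BudgetLevel {
  if (thresholds.exhaustedThreshold >= thresholds.lowBudgetThreshold) {
    throw new RangeError("exhaustedThreshold must be strictly less than lowBudgetThreshold");
  }
  if (remaining <= thresholds.exhaustedThreshold) {
    return "exhausted";
  }
  if (remaining <= thresholds.lowBudgetThreshold) {
    return "low";
  }
  return "healthy";
}
```

Note that both boundaries are inclusive on the lower level: a balance of exactly 10,000,000 is already "low", and exactly 0 is "exhausted".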

Hook: `before_model_resolve`

Fetches budget state and reserves budget for the model call. The reservation is held open and committed later (in before_prompt_build or at agent_end), allowing the optional modelCostEstimator callback to reconcile estimated vs actual costs. When budget is low:

Applies model fallbacks (including chained fallbacks like opus → [sonnet, haiku])
Enforces limit_remaining_calls if configured
Attaches budget status metadata to ctx.metadata["openclaw-budget-guard-status"]

When budget is exhausted and failClosed=true, the plugin blocks the model call by overriding the model name to __cycles_budget_exhausted__, which causes the LLM provider to reject the request. The user sees "Unknown model: openai/__cycles_budget_exhausted__"; this is intentional. OpenClaw's before_model_resolve hook does not support { block: true } like before_tool_call does (feature request), so this workaround is the only way to prevent model execution when budget runs out.

Hook: `before_prompt_build`

Commits any pending model reservation from the previous turn (with modelCostEstimator reconciliation if configured). When injectPromptBudgetHint is enabled, injects a system context hint with:

Current remaining balance and percentage
Budget level warnings
Forecast projections (estimated remaining tool/model calls based on average costs)
Team pool balance (when parentBudgetId is configured)
Token limit guidance (when reduce_max_tokens strategy is active)

Example hint:

Budget: 5000000 USD_MICROCENTS remaining. Budget is low — prefer cheaper models and avoid expensive tools. 50% of budget remaining. Est. ~10 tool calls and ~5 model calls remaining at current rate. Team pool: 50000000 remaining.
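The forecast figures in a hint like this can be approximated by dividing the remaining balance by the average cost per call. The sketch below assumes that simple formula; the plugin's actual forecasting logic is not documented here, and estimateRemainingCalls and the spend figures are hypothetical.

```typescript
// Hypothetical forecast helper: remaining balance divided by average cost per call.
function estimateRemainingCalls(
  remaining: number,
  totalSpent: number,
  callCount: number,
): number | undefined {
  // No history yet means there is no average to project from.
  if (callCount <= 0 || totalSpent <= 0) return undefined;
  const averageCost = totalSpent / callCount;
  return Math.floor(remaining / averageCost);
}

const hintRemaining = 5_000_000;
// Assumed session history: 5 tool calls costing 2.5M total, 5 model calls costing 5M total.
const hintToolCalls = estimateRemainingCalls(hintRemaining, 2_500_000, 5);
const hintModelCalls = estimateRemainingCalls(hintRemaining, 5_000_000, 5);
console.log(`Est. ~${hintToolCalls} tool calls and ~${hintModelCalls} model calls remaining`);
```

With the assumed history above, the numbers match the example hint (about 10 tool calls and 5 model calls).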

Hook: `before_tool_call`

Checks tool permissions against allowlist/blocklist
Applies disable_expensive_tools and limit_remaining_calls strategies
Creates a Cycles reservation with configured TTL, overage policy, and currency
On denial, optionally retries (when retryOnDeny=true)
Blocks or allows based on the reservation decision

Hook: `after_tool_call`

Commits the reservation with actual cost. Uses the costEstimator callback if configured, otherwise uses the original estimate. Tracks per-tool cost breakdowns for the session summary.
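The reserve-then-commit flow across before_tool_call and after_tool_call can be pictured with an in-memory stand-in for the Cycles server. Everything in this sketch (InMemoryLedger and its methods) is hypothetical and only illustrates the idea that a reservation locks the estimate while the commit charges the actual cost.

```typescript
// Hypothetical in-memory ledger; the real Cycles client API may differ.
class InMemoryLedger {
  private balance: number;
  private held = new Map<string, number>();
  private nextId = 1;

  constructor(initialBalance: number) {
    this.balance = initialBalance;
  }

  // Balance minus all amounts currently locked by open reservations.
  available(): number {
    let heldTotal = 0;
    this.held.forEach((amount) => {
      heldTotal += amount;
    });
    return this.balance - heldTotal;
  }

  // Lock the estimate; returns undefined when the estimate does not fit.
  reserve(estimate: number): string | undefined {
    if (estimate > this.available()) return undefined;
    const id = `r${this.nextId++}`;
    this.held.set(id, estimate);
    return id;
  }

  // Release the lock and charge the actual cost instead of the estimate.
  commit(id: string, actualCost: number): void {
    if (!this.held.has(id)) throw new Error(`unknown reservation ${id}`);
    this.held.delete(id);
    this.balance -= actualCost;
  }
}

const ledger = new InMemoryLedger(1_000_000);
const reservationId = ledger.reserve(100_000); // before_tool_call: lock the estimate
if (reservationId !== undefined) {
  ledger.commit(reservationId, 60_000); // after_tool_call: charge the actual cost
}
console.log(ledger.available()); // 940000
```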

Hook: `agent_end`

Releases orphaned reservations (defensive cleanup)
Fetches final budget state
Builds session summary with cost breakdown, forecasts, and timing
Calls onSessionEnd callback and fires analytics webhook if configured
Attaches summary to ctx.metadata["openclaw-budget-guard"]

Chained Model Fallbacks

Model fallbacks support both single values and ordered chains:

{
  "modelFallbacks": {
    "anthropic/claude-opus-4-20250514": ["anthropic/claude-sonnet-4-20250514", "anthropic/claude-haiku-4-5-20251001"],
    "openai/gpt-4o": "openai/gpt-4o-mini"
  }
}

When budget is low, the plugin tries each candidate in order and selects the first one whose cost fits within the remaining budget.
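As a rough sketch of that selection rule, the function below walks the candidates in order and returns the first one whose estimated cost fits. pickFallback and the cost figures are hypothetical, and skipping candidates with no known cost is an assumption, not documented plugin behavior.

```typescript
type FallbackMap = Record<string, string | string[]>;

// Hypothetical selector: first candidate whose estimated cost fits the remaining budget.
function pickFallback(
  model: string,
  fallbacks: FallbackMap,
  costs: Record<string, number>,
  remaining: number,
): string | undefined {
  const entry = fallbacks[model];
  if (entry === undefined) return undefined;
  // Normalize a single value into a one-element chain.
  const candidates = Array.isArray(entry) ? entry : [entry];
  return candidates.find((candidate) => {
    const cost = costs[candidate];
    // Assumption: candidates without a known cost are skipped.
    return cost !== undefined && cost <= remaining;
  });
}

const exampleFallbacks: FallbackMap = {
  "anthropic/claude-opus-4-20250514": [
    "anthropic/claude-sonnet-4-20250514",
    "anthropic/claude-haiku-4-5-20251001",
  ],
  "openai/gpt-4o": "openai/gpt-4o-mini",
};
// Illustrative per-call costs in USD_MICROCENTS (made-up values).
const exampleCosts: Record<string, number> = {
  "anthropic/claude-sonnet-4-20250514": 3_000_000,
  "anthropic/claude-haiku-4-5-20251001": 800_000,
  "openai/gpt-4o-mini": 500_000,
};
console.log(pickFallback("anthropic/claude-opus-4-20250514", exampleFallbacks, exampleCosts, 1_000_000));
```

With 1,000,000 units remaining, the first candidate does not fit under these made-up costs, so the second one in the chain is selected.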

Tool Allowlists and Blocklists

Control which tools can be called using glob-style patterns:

{
  "toolAllowlist": ["web_search", "code_*"],
  "toolBlocklist": ["dangerous_*"]
}

Blocklist takes precedence over allowlist
Supports exact names and * wildcards (prefix: code_*, suffix: *_tool, all: *)
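One way to picture these rules is a matcher that converts each pattern into an anchored regular expression, then checks the blocklist before the allowlist. matchesPattern and isToolAllowed are hypothetical names, and treating a missing or empty allowlist as "allow everything" is an assumption.

```typescript
// Hypothetical glob matcher: "*" matches any run of characters, everything else is literal.
function matchesPattern(pattern: string, toolName: string): boolean {
  // Escape regex metacharacters (except "*"), then expand "*" to ".*".
  const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  const regex = new RegExp("^" + escaped.replace(/\*/g, ".*") + "$");
  return regex.test(toolName);
}

function isToolAllowed(
  toolName: string,
  allowlist: string[] | undefined,
  blocklist: string[] | undefined,
): boolean {
  // Blocklist takes precedence over allowlist.
  if (blocklist && blocklist.some((p) => matchesPattern(p, toolName))) return false;
  // Assumption: no allowlist (or an empty one) means every tool is allowed.
  if (allowlist && allowlist.length > 0) {
    return allowlist.some((p) => matchesPattern(p, toolName));
  }
  return true;
}

console.log(isToolAllowed("code_run", ["web_search", "code_*"], ["dangerous_*"])); // true
console.log(isToolAllowed("dangerous_rm", ["*"], ["dangerous_*"])); // false
```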

Tool Call Limits

Cap the number of times a specific tool can be invoked per session. Useful for consequential actions like sending emails or triggering deployments:

{
  "toolCallLimits": {
    "send_email": 10,
    "deploy": 3
  }
}

Once a tool reaches its limit, further calls are blocked with a descriptive reason. Tools without a limit are unrestricted. Limits reset on each new agent session.
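The per-session counting behavior can be sketched with a small counter that is created fresh for each session. ToolCallCounter and the wording of its reason string are hypothetical; the plugin's actual messages may differ.

```typescript
// Hypothetical per-session counter; create a new instance for each agent session.
class ToolCallCounter {
  private limits: Record<string, number>;
  private counts = new Map<string, number>();

  constructor(limits: Record<string, number>) {
    this.limits = limits;
  }

  // Records the call when allowed; returns a reason when the limit is reached.
  tryCall(tool: string): { allowed: boolean; reason?: string } {
    const limit = this.limits[tool];
    const used = this.counts.get(tool) ?? 0;
    if (limit !== undefined && used >= limit) {
      return { allowed: false, reason: `${tool} reached its limit of ${limit} calls this session` };
    }
    this.counts.set(tool, used + 1);
    return { allowed: true };
  }
}

const counter = new ToolCallCounter({ send_email: 10, deploy: 3 });
for (let i = 0; i < 3; i++) counter.tryCall("deploy");
console.log(counter.tryCall("deploy").allowed); // false: the fourth deploy is blocked
```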

Budget Transition Alerts

Configure callbacks or webhooks to be notified when budget level changes:

{
  "budgetTransitionWebhookUrl": "https://hooks.example.com/budget-alert"
}

Or programmatically:

{
  onBudgetTransition: (event) => {
    console.log(`Budget changed: ${event.previousLevel} → ${event.currentLevel}`);
  }
}
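For intuition, a transition notifier only needs to remember the last level it saw and fire when the level changes. The sketch below is an assumption about how such a tracker could work: makeTransitionTracker is a hypothetical name, only previousLevel and currentLevel come from the example above, and starting from "healthy" is assumed.

```typescript
type Level = "healthy" | "low" | "exhausted";

interface TransitionEvent {
  previousLevel: Level;
  currentLevel: Level;
}

// Hypothetical tracker: invokes the callback only when the level actually changes.
function makeTransitionTracker(onTransition: (event: TransitionEvent) => void) {
  let lastLevel: Level = "healthy"; // assumed starting level
  return (currentLevel: Level): void => {
    if (currentLevel === lastLevel) return;
    onTransition({ previousLevel: lastLevel, currentLevel });
    lastLevel = currentLevel;
  };
}

const observe = makeTransitionTracker((event) => {
  console.log(`Budget changed: ${event.previousLevel} -> ${event.currentLevel}`);
});
observe("healthy"); // no output
observe("low"); // Budget changed: healthy -> low
observe("low"); // no output
observe("exhausted"); // Budget changed: low -> exhausted
```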

Error Handling

The plugin exports two structured error types:

import { BudgetExhaustedError, ToolBudgetDeniedError } from "@runcycles/openclaw-budget-guard";

BudgetExhaustedError (code: "BUDGET_EXHAUSTED") — thrown when budget is exhausted and failClosed=true. Includes remaining, tenant, and budgetId properties. The error message includes an actionable hint to increase budget via the Cycles API.
ToolBudgetDeniedError (code: "TOOL_BUDGET_DENIED") — available as a structured error type for tool denials. Includes toolName property.
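When handling these errors in a custom harness, one option is to branch on the documented code property rather than on instanceof, which also works across module boundaries. The sketch below is illustrative: describeBudgetError is a hypothetical helper, and the property types are assumptions since only the property names are documented.

```typescript
// Hypothetical handler that branches on the documented `code` values.
function describeBudgetError(err: unknown): string {
  if (typeof err !== "object" || err === null) return "not a budget error";
  // Property names come from the descriptions above; their types are assumed.
  const e = err as { code?: unknown; remaining?: unknown; budgetId?: unknown; toolName?: unknown };
  if (e.code === "BUDGET_EXHAUSTED") {
    return `budget ${String(e.budgetId)} exhausted (remaining: ${String(e.remaining)})`;
  }
  if (e.code === "TOOL_BUDGET_DENIED") {
    return `tool ${String(e.toolName)} denied by budget`;
  }
  return "not a budget error";
}

// Simulated error object with the documented shape (values are made up).
const simulated = Object.assign(new Error("Budget exhausted"), {
  code: "BUDGET_EXHAUSTED",
  remaining: 0,
  tenant: "my-org",
  budgetId: "b1",
});
console.log(describeBudgetError(simulated));
```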

`failClosed` — Block vs. Allow on Budget Denial

The failClosed setting (default: true) controls what happens when a model reservation is denied — either because the budget is exhausted or because the Cycles server rejects the reservation (e.g., the estimated cost exceeds remaining budget).

failClosed: true — The plugin blocks the model call. It returns a synthetic model override (__cycles_budget_exhausted__) that causes the LLM provider to reject the request. The agent stops. Use this in production when overspend is unacceptable.

failClosed: false — The plugin logs a warning and allows the model call to proceed. The estimated cost is tracked locally (session summary, cost breakdown, forecasting) even though no server-side reservation was committed. Use this for shadow/monitoring mode — you see what would have been blocked without disrupting the agent.

| Scenario | failClosed: true | failClosed: false |
|---|---|---|
| Budget exhausted (cached snapshot) | Block | Warn + allow |
| Server denies reservation (estimate > remaining) | Block | Warn + allow + track cost locally |
| Low-budget call limit reached (model) | Block | Warn + allow |
| Low-budget call limit reached (tool) | Always block | Always block |
| Expensive tool threshold exceeded | Always block | Always block |
| Tool reservation denied | Always block | Always block |

Note: All tool-level enforcement (reservation denials, call limits, expensive tool threshold) always blocks regardless of failClosed — tools have no fallback mechanism. failClosed only affects model-level decisions.
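The matrix above reduces to a two-input decision: tool-level denials always block, and model-level denials follow failClosed. A minimal sketch of that rule, with hypothetical names (resolveDenial, EnforcementScope, DenialOutcome):

```typescript
type EnforcementScope = "model" | "tool";
type DenialOutcome = "block" | "warn_and_allow";

// Hypothetical summary of the table: failClosed only matters for model-level denials.
function resolveDenial(scope: EnforcementScope, failClosed: boolean): DenialOutcome {
  if (scope === "tool") return "block"; // tools have no fallback mechanism
  return failClosed ? "block" : "warn_and_allow";
}

console.log(resolveDenial("model", false)); // "warn_and_allow"
console.log(resolveDenial("tool", false)); // "block"
```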

Fail-Open Behavior (Network Errors)

Separately from failClosed, the plugin handles network/transient errors with a fail-open strategy by default:

If the Cycles server is unreachable, the plugin assumes healthy budget and allows execution
If a commit fails, execution continues (logged but non-blocking)

failClosed only controls behavior when the server confirms the budget is insufficient — a transient network blip will not kill every agent.

Hardening with failClosedOnSnapshotError (v0.8.3+). Set failClosedOnSnapshotError: true to flip the snapshot-fetch path to fail-closed: an unreachable Cycles server is then treated as exhausted instead of healthy. Combined with failClosed: true, this prevents an attacker (or a bad day) from lifting budget caps by DoS-ing the Cycles control plane. Default remains false for back-compat — existing deployments keep their current fail-open behavior unless they opt in.

{
  "failClosed": true,
  "failClosedOnSnapshotError": true
}

Note: commit failures are still fail-open even with this flag. Once a reservation is created, the estimated amount is already held server-side, and local commit retries can't change that. Reservations expire via TTL if commits never reach the server.
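The snapshot-fetch behavior described in this section can be sketched as a function over the fetch result. This is a simplified illustration: the names (SnapshotResult, levelFromSnapshot) are hypothetical, and the "low" level is left out to keep the focus on the unreachable-server case.

```typescript
// Hypothetical result of a snapshot fetch: either a balance or a network failure.
type SnapshotResult =
  | { kind: "ok"; remaining: number }
  | { kind: "unreachable"; error: string };

function levelFromSnapshot(
  result: SnapshotResult,
  exhaustedThreshold: number,
  failClosedOnSnapshotError: boolean,
): "healthy" | "exhausted" {
  if (result.kind === "unreachable") {
    // Default is fail-open (assume healthy); the flag flips this to fail-closed.
    return failClosedOnSnapshotError ? "exhausted" : "healthy";
  }
  return result.remaining <= exhaustedThreshold ? "exhausted" : "healthy";
}

const outage: SnapshotResult = { kind: "unreachable", error: "ECONNREFUSED" };
console.log(levelFromSnapshot(outage, 0, false)); // "healthy"
console.log(levelFromSnapshot(outage, 0, true)); // "exhausted"
```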

Troubleshooting

"Skipping registration" warning during install

This is normal. OpenClaw loads the plugin during install before your config is written. The plugin detects the missing config, logs a warning, and skips registration. After you add your config and restart the gateway, the plugin will register normally.

Plugin not loading

Verify the plugin is enabled: openclaw plugins list
Check that openclaw.plugin.json is included in the installed package

"Unknown model: openai/__cycles_budget_exhausted__" or "Budget exhausted"

Your budget has run out. To resume:

Fund the budget via the Cycles Admin API:

curl -X POST "http://localhost:7979/v1/admin/budgets/fund?scope=tenant:my-org&unit=USD_MICROCENTS" \
  -H "X-Cycles-API-Key: your-admin-key" \
  -H "Content-Type: application/json" \
  -d '{"operation": "CREDIT", "amount": 50000000, "idempotency_key": "topup-001"}'

This adds 50,000,000 units ($0.50) to the budget. Adjust the scope to match your tenant and budgetScope.

Start a new agent session — the plugin fetches fresh budget state at the start of each session.

For details on budget management, see Budget Allocation and Management.
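When choosing a top-up amount, it can help to convert between dollars and USD_MICROCENTS explicitly. The helpers below are hypothetical and use the ratio implied by the figures in this README (50,000,000 units = $0.50, so 100,000,000 units per dollar).

```typescript
// Ratio implied by the README's figures: 100,000,000 USD_MICROCENTS per US dollar.
const MICROCENTS_PER_USD = 100_000_000;

// Hypothetical conversion helpers for reading and writing budget amounts.
function microcentsToUsd(units: number): number {
  return units / MICROCENTS_PER_USD;
}

function usdToMicrocents(usd: number): number {
  // Round so that fractional dollars map to whole units.
  return Math.round(usd * MICROCENTS_PER_USD);
}

console.log(microcentsToUsd(50_000_000)); // 0.5
console.log(usdToMicrocents(2)); // 200000000
```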

"cyclesBaseUrl is required" error

Set cyclesBaseUrl in your plugin config (use "${CYCLES_BASE_URL}" for env var interpolation)

Budget always shows "healthy"

Verify currency, tenant, and budgetScope match your Cycles setup
Set logLevel: "debug" to see raw balance responses

Tools not being blocked

Check toolBaseCosts includes your tool (default cost is 100,000 units)
Check failClosed is true (default)

Model not being downgraded

The exact model name must match a key in modelFallbacks
Check model costs in modelBaseCosts: the fallback's cost must fit within the remaining budget

Production checklist

Before deploying to production:

[ ] API key stored as env var (CYCLES_API_KEY), not in config file
[ ] failClosed: true (default — blocks on exhausted budget)
[ ] dryRun: false (default — uses real Cycles server)
[ ] modelBaseCosts set for each model your agent uses
[ ] toolBaseCosts set for at least your top 5 tools by usage
[ ] toolCallLimits set for dangerous tools (send_email, deploy, etc.)
[ ] lowBudgetThreshold calibrated for your session duration (default 10M = $0.10)
[ ] Budget transition monitoring via onBudgetTransition callback or budgetTransitionWebhookUrl
[ ] Session analytics via onSessionEnd callback or analyticsWebhookUrl
[ ] Run one test session with logLevel: "debug" and enableEventLog: true to verify costs

Known Limitations

| Limitation | Impact | Workaround |
|---|---|---|
| Model cost is estimated by default. OpenClaw has no after_model_resolve hook, so model costs are based on modelBaseCosts estimates. The modelCostEstimator callback can reconcile costs if you have a proxy or gateway with token counts. | Cost tracking for models is approximate unless you provide a modelCostEstimator. The plugin will never overspend; it may under-track slightly. | Use modelCostEstimator to reconcile costs. Or buffer modelBaseCosts estimates 10–20% higher than expected. |
| ALLOW_WITH_CAPS decisions are not enforced. If the Cycles server returns caps (max_tokens, tool allowlist) alongside an ALLOW decision, the plugin stores them but does not apply them downstream. | Low risk: v0 Cycles servers rarely return caps. | Monitor Cycles protocol updates. |
| Per-user/session scoping uses custom dimensions. User and session IDs are passed as dimensions.user / dimensions.session in the reservation subject. v0 Cycles servers may ignore custom dimensions for balance filtering. | Per-user budget isolation depends on server support for dimensions. | Verify scoping works with your Cycles server version before relying on it in production. |
| Heartbeat requires client support. Reservation auto-extension (heartbeatIntervalMs) calls client.extendReservation(). If the Cycles client does not implement this method, heartbeats are silently skipped. | Long-running tools may still lose cost tracking if the client lacks extendReservation. | Use per-tool TTL overrides via toolReservationTtls as fallback. |
| Model blocking uses a provider-error workaround. OpenClaw's before_model_resolve hook does not support { block: true } (feature request). When budget is exhausted, the plugin overrides the model to __cycles_budget_exhausted__, causing the provider to reject the call. The user sees "Unknown model" instead of a clean budget error. | Model calls are effectively blocked, but the error message is a provider error rather than a budget message. Tool blocking via before_tool_call works cleanly with { block: true }. | Pending OpenClaw adding block support to before_model_resolve. |
| OpenClaw does not pass model name in hook events. The before_model_resolve event only contains { prompt }, with no model name (feature request). The plugin auto-detects the model from system config or falls back to defaultModelName. | Model-specific cost tracking requires defaultModelName to be set in plugin config. | Set defaultModelName to your agent's model (e.g. "openai/gpt-5-nano"). |

For project structure, architecture diagrams, and development workflow, see ARCHITECTURE.md.

Documentation

Cycles Documentation — full docs site
OpenClaw Integration Guide — detailed integration guide
API Key Management — creating and managing API keys

License

Apache-2.0