@runcycles/openclaw-budget-guard

v0.7.7

OpenClaw plugin for budget-aware model and tool execution using Cycles.

cycles-openclaw-budget-guard

Why use this plugin?

AI agents make autonomous decisions — calling models, invoking tools, retrying on failure — with no human in the loop. Without runtime enforcement, several things go wrong:

Runaway spend. A single agent stuck in a tool loop or retrying failed calls can burn through hundreds of dollars in minutes. Provider spending caps are account-wide and too coarse. Rate limits don't account for cost. In-app counters don't survive restarts or coordinate across concurrent agents.

Uncontrolled side-effects. An agent can send 100 emails, trigger 50 deployments, or call dangerous APIs with nothing to stop it. Cost limits alone don't help — some actions are consequential regardless of price.

Noisy neighbors. In multi-tenant or multi-user setups, one agent can consume the entire team or tenant budget, starving other users. Without per-user scoping, there's no isolation.

No session-level cost visibility. When an agent session ends, you have no idea what it spent, which tools it called most, or whether it was cost-efficient. Debugging cost overruns after the fact is painful.

Abrupt failure. When budget runs out, the agent crashes instead of adapting — switching to cheaper models, reducing output length, or disabling expensive tools.

This plugin addresses those failure modes by checking model and tool execution before it runs, then degrading or blocking when budget conditions require it. It also tracks session-level cost breakdowns, tool usage, and budget transitions for debugging and operations.

Beyond enforcement, the plugin monitors for problems as they develop:

  • Burn rate anomaly detection catches runaway tool loops: if the burn rate in the current window exceeds 3x that of the previous window (the default burnRateAlertThreshold), onBurnRateAnomaly fires immediately
  • Predictive exhaustion warnings estimate when budget will run out and fire onExhaustionForecast before it happens
  • Automatic retry with backoff on transient Cycles server errors (429/503/504 by default) prevents spurious denials under load
  • Reservation heartbeat auto-extends long-running tool reservations so cost tracking doesn't silently break
  • Observability via metricsEmitter (Datadog, Prometheus, Grafana, OTLP) and opt-in session event logs

In typical OpenClaw setups, you can add enforcement without changing agent logic.

For deeper background, see Why Rate Limits Are Not Enough and Runaway Agents and Tool Loops.

Overview

This OpenClaw plugin integrates with a live Cycles server to enforce budget boundaries during agent execution. It hooks into the OpenClaw plugin lifecycle to:

  • Reserve budget for model and tool calls using the reserve → commit → release protocol
  • Downgrade models when budget is low (configurable fallback chains)
  • Block execution when budget is exhausted (fail-closed by default)
  • Inject budget hints into prompts so the model is budget-aware
  • Detect budget transitions and fire callbacks/webhooks on level changes
  • Control tool access with allowlists, blocklists, and per-tool call limits
  • Apply graceful degradation strategies when budget is low
  • Retry denied reservations and transient server errors with configurable backoff
  • Keep long-running tools alive with automatic reservation heartbeat
  • Detect anomalies — burn rate spikes and predictive exhaustion warnings
  • Emit metrics to Datadog, Prometheus, Grafana, or any OTLP-compatible backend
  • Record an event log of every budget decision for debugging and compliance
  • Report unconfigured tools so you know which tools are using default cost estimates
  • Support dry-run mode for testing without a live Cycles server
  • Track per-tool cost breakdowns and session analytics with model cost reconciliation
  • Support multi-currency budgets with per-tool/model overrides
  • Support budget pools/hierarchies via parent budget visibility
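
The reserve → commit → release flow in the first bullet can be pictured with a small in-memory ledger. This is only a sketch under stated assumptions: `BudgetLedger` and its methods are hypothetical names invented for illustration, and the real plugin delegates this bookkeeping to a Cycles server rather than doing it locally.

```typescript
// Hypothetical in-memory ledger; NOT the Cycles API. It only illustrates
// the reserve -> commit -> release idea described above.
class BudgetLedger {
  private remaining: number;
  private held = new Map<number, number>(); // reservation id -> reserved amount
  private nextId = 1;

  constructor(initialBudget: number) {
    this.remaining = initialBudget;
  }

  // Reserve an estimate up front; returns null when the budget cannot cover it.
  reserve(estimate: number): number | null {
    if (estimate > this.remaining) return null;
    this.remaining -= estimate;
    const id = this.nextId++;
    this.held.set(id, estimate);
    return id;
  }

  // Commit the actual cost; the difference to the reservation is refunded
  // (or charged on top, if the actual cost was higher than the estimate).
  commit(id: number, actual: number): void {
    const reserved = this.take(id);
    this.remaining += reserved - actual;
  }

  // Release a reservation that was never used, refunding it in full.
  release(id: number): void {
    this.remaining += this.take(id);
  }

  balance(): number {
    return this.remaining;
  }

  private take(id: number): number {
    const reserved = this.held.get(id);
    if (reserved === undefined) throw new Error(`unknown reservation ${id}`);
    this.held.delete(id);
    return reserved;
  }
}

const ledger = new BudgetLedger(1_000_000);
const firstCall = ledger.reserve(500_000);
if (firstCall !== null) ledger.commit(firstCall, 300_000); // refunds 200_000
const secondCall = ledger.reserve(800_000); // null: only 700_000 remains
console.log(ledger.balance(), secondCall);
```

Reserving before execution is what lets a denial happen before any money is spent; committing afterwards lets the estimate be corrected to the actual cost.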

The plugin uses the runcycles TypeScript client to communicate with a Cycles server.

Important: Budget exhaustion is enforced fail-closed by default, but Cycles server connectivity failures are handled fail-open — the plugin assumes healthy budget and allows execution to continue. See Fail-Open Behavior for details.

Prerequisites

  • OpenClaw >= 0.1.0 with plugin support
  • Node.js >= 20.0.0
  • A running Cycles server with:
    • A base URL (e.g. http://localhost:7878)
    • An API key
    • A tenant configured with a budget scope

If you don't have a Cycles server yet, see the Cycles quickstart to set one up. Alternatively, use dry-run mode to test without a server.

To see budget enforcement in action before wiring up your own agent, run the Cycles Runaway Demo — it shows the exact failure mode this plugin prevents, with a live before/after comparison.

Quick Start

1. Install the plugin

openclaw plugins install @runcycles/openclaw-budget-guard

For local development:

openclaw plugins install -l ./cycles-openclaw-budget-guard

2. Enable the plugin

openclaw plugins enable openclaw-budget-guard

3. Add minimal configuration

Add the following to your OpenClaw config file (typically openclaw.json or openclaw.config.json):

{
  "plugins": {
    "entries": {
      "openclaw-budget-guard": {
        "config": {
          "cyclesBaseUrl": "http://localhost:7878",
          "cyclesApiKey": "cyc_your_api_key_here",
          "tenant": "my-org"
        }
      }
    }
  }
}

That's it — the plugin uses sensible defaults for everything else. The agent will now enforce budget limits on every run.

Need an API key? API keys are created via the Cycles Admin Server (port 7979). See the deployment guide to create one, or see API Key Management for details.

4. (Optional) Keep secrets out of config files

Use OpenClaw's env var interpolation to avoid hardcoding API keys:

{
  "plugins": {
    "entries": {
      "openclaw-budget-guard": {
        "config": {
          "cyclesBaseUrl": "${CYCLES_BASE_URL}",
          "cyclesApiKey": "${CYCLES_API_KEY}",
          "tenant": "my-org"
        }
      }
    }
  }
}

Then set the env vars in your shell or CI:

export CYCLES_BASE_URL="http://localhost:7878"
export CYCLES_API_KEY="cyc_your_api_key_here"

5. (Optional) Try dry-run mode

To test without a live Cycles server:

{
  "plugins": {
    "entries": {
      "openclaw-budget-guard": {
        "config": {
          "tenant": "my-org",
          "cyclesBaseUrl": "http://unused",
          "cyclesApiKey": "unused",
          "dryRun": true,
          "dryRunBudget": 100000000
        }
      }
    }
  }
}

6. Verify it's working

After restarting OpenClaw, check the logs for:

  Cycles Budget Guard for OpenClaw v0.6.1
  https://runcycles.io
  tenant: my-org
  cyclesBaseUrl: http://localhost:7878
  ...

Run your agent and look for budget activity:

[openclaw-budget-guard] before_model_resolve: model=anthropic/claude-sonnet-4-20250514 level=healthy

If you see this, the plugin is actively checking budget on every model and tool call.

Understanding the cost model

The plugin uses a simple model: every model call and tool call reserves a fixed cost from the budget.

Currency. The default is USD_MICROCENTS: 1 unit = $0.00000001 (one hundred-millionth of a dollar, i.e. one millionth of a cent). So:

| Amount (units) | USD equivalent |
|----------------|----------------|
| 100,000 | $0.001 (0.1 cents) |
| 1,000,000 | $0.01 (1 cent) |
| 10,000,000 | $0.10 (10 cents) |
| 100,000,000 | $1.00 |

Example. With a $5 budget (500,000,000 units):

  • anthropic/claude-opus at 1,500,000/call = ~333 calls before exhaustion
  • anthropic/claude-sonnet at 300,000/call = ~1,666 calls
  • web_search at 500,000/call = ~1,000 calls
  • lowBudgetThreshold: 10000000 triggers model downgrade when $0.10 remains
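
The arithmetic behind these figures can be checked with a few helper functions. This is a minimal sketch: the function names are made up for illustration, and the conversion factor of 100,000,000 units per dollar is taken from the conversion table above.

```typescript
// Assumption: 100,000,000 USD_MICROCENTS units = $1.00, per the conversion table.
const UNITS_PER_USD = 100_000_000;

function usdToUnits(usd: number): number {
  return Math.round(usd * UNITS_PER_USD);
}

function unitsToUsd(units: number): number {
  return units / UNITS_PER_USD;
}

// Whole calls that fit into a budget at a fixed per-call reservation.
function callsBeforeExhaustion(budgetUnits: number, costPerCall: number): number {
  if (costPerCall <= 0) throw new RangeError("costPerCall must be positive");
  return Math.floor(budgetUnits / costPerCall);
}

const fiveDollars = usdToUnits(5); // 500_000_000 units
console.log(callsBeforeExhaustion(fiveDollars, 1_500_000)); // 333
console.log(callsBeforeExhaustion(fiveDollars, 300_000)); // 1666
console.log(callsBeforeExhaustion(fiveDollars, 500_000)); // 1000
console.log(unitsToUsd(10_000_000)); // 0.1
```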

Model names. OpenClaw passes model identifiers in provider/model format (e.g., openai/gpt-4o, anthropic/claude-sonnet-4-20250514). Your modelBaseCosts, modelFallbacks, and defaultModelName must use the same format — bare model names like gpt-4o won't match. The plugin automatically strips the provider prefix when returning modelOverride to OpenClaw, so you can use provider/model consistently in all config fields without double-prefixing issues.

Setting toolBaseCosts. Start with the default (100,000 units per call). After your first session, check the unconfiguredTools list in the session summary — it tells you which tools need explicit costs. For tools that call external APIs, estimate higher (500K-1M). For lightweight tools, estimate lower (10K-50K).

Full Configuration Example

{
  "plugins": {
    "entries": {
      "openclaw-budget-guard": {
        "config": {
          "enabled": true,
          "cyclesBaseUrl": "http://localhost:7878",
          "cyclesApiKey": "cyc_your_api_key_here",
          "tenant": "my-org",
          "budgetId": "my-app",
          "currency": "USD_MICROCENTS",
          "lowBudgetThreshold": 10000000,
          "exhaustedThreshold": 0,
          "modelFallbacks": {
            "anthropic/claude-opus-4-20250514": ["anthropic/claude-sonnet-4-20250514", "anthropic/claude-haiku-4-5-20251001"],
            "openai/gpt-4o": "openai/gpt-4o-mini"
          },
          "modelBaseCosts": {
            "anthropic/claude-opus-4-20250514": 1500000,
            "anthropic/claude-sonnet-4-20250514": 300000,
            "openai/gpt-4o": 1000000,
            "openai/gpt-4o-mini": 100000
          },
          "toolBaseCosts": {
            "web_search": 500000,
            "code_execution": 1000000
          },
          "toolCallLimits": {
            "send_email": 10,
            "deploy": 3
          },
          "injectPromptBudgetHint": true,
          "maxPromptHintChars": 200,
          "failClosed": true,
          "logLevel": "info",
          "reservationTtlMs": 60000,
          "overagePolicy": "ALLOW_IF_AVAILABLE",
          "lowBudgetStrategies": ["downgrade_model"],
          "maxTokensWhenLow": 1024,
          "retryOnDeny": false,
          "dryRun": false
        }
      }
    }
  }
}

Config Presets

Common starting configurations for typical deployment scenarios.

Strict Enforcement

For production agents handling real spend. Blocks on exhaustion, downgrades models, caps tool calls:

{
  "plugins": {
    "entries": {
      "openclaw-budget-guard": {
        "config": {
          "tenant": "my-org",
          "failClosed": true,
          "lowBudgetStrategies": ["downgrade_model", "disable_expensive_tools", "limit_remaining_calls"],
          "modelFallbacks": {
            "anthropic/claude-opus-4-20250514": ["anthropic/claude-sonnet-4-20250514", "anthropic/claude-haiku-4-5-20251001"]
          },
          "modelBaseCosts": {
            "anthropic/claude-opus-4-20250514": 1500000,
            "anthropic/claude-sonnet-4-20250514": 300000,
            "anthropic/claude-haiku-4-5-20251001": 100000
          },
          "toolBaseCosts": {
            "web_search": 500000,
            "code_execution": 1000000
          },
          "toolCallLimits": {
            "send_email": 10,
            "deploy": 3
          },
          "maxRemainingCallsWhenLow": 5
        }
      }
    }
  }
}

Development / Testing

Dry-run mode with generous budget. No Cycles server needed:

{
  "plugins": {
    "entries": {
      "openclaw-budget-guard": {
        "config": {
          "tenant": "dev",
          "cyclesBaseUrl": "http://unused",
          "cyclesApiKey": "unused",
          "dryRun": true,
          "dryRunBudget": 500000000,
          "logLevel": "debug"
        }
      }
    }
  }
}

Cost-Conscious

Aggressive cost savings. Low thresholds, model downgrade with token limits, expensive tools disabled early:

{
  "plugins": {
    "entries": {
      "openclaw-budget-guard": {
        "config": {
          "tenant": "my-org",
          "lowBudgetThreshold": 5000000,
          "exhaustedThreshold": 100000,
          "lowBudgetStrategies": ["downgrade_model", "reduce_max_tokens", "disable_expensive_tools"],
          "maxTokensWhenLow": 512,
          "expensiveToolThreshold": 200000,
          "modelFallbacks": {
            "anthropic/claude-opus-4-20250514": "anthropic/claude-haiku-4-5-20251001",
            "openai/gpt-4o": "openai/gpt-4o-mini"
          }
        }
      }
    }
  }
}

Configure for your use case

Most users only need 5-10 config properties. Start with what you need:

I just want to stop runaway agents (3 required fields only):

{ "tenant": "my-org", "cyclesBaseUrl": "...", "cyclesApiKey": "..." }

The defaults (failClosed: true, lowBudgetThreshold: 10000000) will block agents that exhaust their budget and warn when it gets low.

I want cost-aware model selection — add:

{
  "modelFallbacks": { "anthropic/claude-opus-4-20250514": ["anthropic/claude-sonnet-4-20250514", "anthropic/claude-haiku-4-5-20251001"] },
  "modelBaseCosts": { "anthropic/claude-opus-4-20250514": 1500000, "anthropic/claude-sonnet-4-20250514": 300000, "anthropic/claude-haiku-4-5-20251001": 100000 }
}

I want to cap dangerous tool calls — add:

{ "toolCallLimits": { "send_email": 10, "deploy": 3, "delete_data": 1 } }

I want observability — add:

{ "otlpMetricsEndpoint": "http://localhost:4318/v1/metrics" }

I want to catch runaway loops — add:

{ "burnRateAlertThreshold": 3.0, "onBurnRateAnomaly": "..." }

I want full debugging — add:

{ "enableEventLog": true, "logLevel": "debug" }

Config Reference

Core Settings

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| enabled | boolean | true | Master switch; set to false to disable the plugin |
| cyclesBaseUrl | string | - | Cycles server URL (required) |
| cyclesApiKey | string | - | Cycles API key (required) |
| tenant | string | - | Cycles tenant identifier (required) |
| budgetId | string | - | Optional app-level scope for balance queries and reservations |
| currency | string | USD_MICROCENTS | Default budget unit for all reservations |
| failClosed | boolean | true | Block model calls when budget is exhausted or reservation is denied (false = warn, allow, and track cost locally). See failClosed behavior. |
| logLevel | string | info | debug / info / warn / error |

Budget Thresholds

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| lowBudgetThreshold | number | 10000000 | Remaining budget at or below this triggers "low" mode |
| exhaustedThreshold | number | 0 | Remaining budget at or below this triggers "exhausted" mode |

Note: exhaustedThreshold must be strictly less than lowBudgetThreshold.

Model Configuration

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| modelFallbacks | object | {} | Map: model → fallback model or chain of fallbacks (string or string[]) |
| modelBaseCosts | object | {} | Map: model name → estimated cost per call |
| defaultModelCost | number | 500000 | Fallback cost when a model isn't in modelBaseCosts |
| defaultModelActionKind | string | llm.completion | Action kind for model reservations |
| modelCurrency | string | - | Override currency for model reservations (defaults to currency) |

Tool Configuration

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| toolBaseCosts | object | {} | Map: tool name → estimated cost per call |
| defaultToolActionKindPrefix | string | tool. | Prefix for tool action kinds (e.g. tool.web_search) |
| toolAllowlist | string[] | - | Only these tools are permitted (supports * wildcards) |
| toolBlocklist | string[] | - | These tools are blocked (supports * wildcards, takes precedence over allowlist) |
| toolCurrencies | object | - | Map: tool name → currency override |
| toolReservationTtls | object | - | Map: tool name → TTL override in ms |
| toolOveragePolicies | object | - | Map: tool name → overage policy override |
| toolCallLimits | object | - | Map: tool name → max invocations per session (e.g. {"send_email": 10}) |

Prompt Hints

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| injectPromptBudgetHint | boolean | true | Inject budget status into the system prompt |
| maxPromptHintChars | number | 200 | Max characters for the injected budget hint |

Reservation Settings

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| reservationTtlMs | number | 60000 | Default TTL for tool reservations (ms) |
| overagePolicy | string | ALLOW_IF_AVAILABLE | Default overage policy (REJECT, ALLOW_IF_AVAILABLE, ALLOW_WITH_OVERDRAFT) |
| snapshotCacheTtlMs | number | 5000 | How long to cache budget snapshots (ms) |

Low Budget Strategies

When the remaining budget drops to or below lowBudgetThreshold, the plugin applies degradation strategies to reduce spend. Strategies only activate when explicitly listed in lowBudgetStrategies. The default is ["downgrade_model"].

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| lowBudgetStrategies | string[] | ["downgrade_model"] | Strategies to apply when budget is low. Each strategy below only takes effect when listed here. |
| maxTokensWhenLow | number | 1024 | Token limit hint (requires "reduce_max_tokens" in lowBudgetStrategies) |
| expensiveToolThreshold | number | - | Cost threshold (requires "disable_expensive_tools" in lowBudgetStrategies) |
| maxRemainingCallsWhenLow | number | 10 | Max calls allowed (requires "limit_remaining_calls" in lowBudgetStrategies) |

downgrade_model — Switch to cheaper models when budget is low. Requires modelFallbacks to define the fallback chain. The plugin tries each candidate in order and picks the first one whose cost (from modelBaseCosts) fits within the remaining budget. If no candidate fits, the original model is used.

{
  "lowBudgetStrategies": ["downgrade_model"],
  "modelFallbacks": {
    "anthropic/claude-opus-4-20250514": ["anthropic/claude-sonnet-4-20250514", "anthropic/claude-haiku-4-5-20251001"]
  },
  "modelBaseCosts": {
    "anthropic/claude-opus-4-20250514": 1500000,
    "anthropic/claude-sonnet-4-20250514": 300000,
    "anthropic/claude-haiku-4-5-20251001": 100000
  }
}

reduce_max_tokens — Append a token limit instruction to the system prompt hint (e.g., "Limit responses to 512 tokens"). This is advisory — the LLM may not obey it. Does not enforce a hard token cap at the API level.

{
  "lowBudgetStrategies": ["downgrade_model", "reduce_max_tokens"],
  "maxTokensWhenLow": 512
}

disable_expensive_tools — Block tools whose estimated cost exceeds a threshold. The threshold defaults to lowBudgetThreshold / 10 if not set explicitly. Tools are always hard-blocked (regardless of failClosed).

{
  "lowBudgetStrategies": ["downgrade_model", "disable_expensive_tools"],
  "expensiveToolThreshold": 200000,
  "toolBaseCosts": {
    "web_search": 500000,
    "code_execution": 1000000,
    "read_file": 50000
  }
}

In this example, web_search (500K) and code_execution (1M) would be blocked when budget is low, but read_file (50K) would still be allowed.

limit_remaining_calls — Cap the total number of model + tool calls allowed while budget is low. Both model and tool calls decrement a shared counter. When the counter reaches zero, models respect failClosed (block or warn) while tools are always blocked.

{
  "lowBudgetStrategies": ["downgrade_model", "limit_remaining_calls"],
  "maxRemainingCallsWhenLow": 5
}

Important: Each strategy's config parameters (e.g., maxTokensWhenLow, expensiveToolThreshold, maxRemainingCallsWhenLow) are silently ignored unless the corresponding strategy is listed in lowBudgetStrategies. The plugin warns at startup if it detects this misconfiguration.

Strategies can be combined. They run in different hooks:

  • Model calls (before_model_resolve): downgrade_model, then limit_remaining_calls
  • Tool calls (before_tool_call): disable_expensive_tools, then limit_remaining_calls
  • Prompt build (before_prompt_build): reduce_max_tokens

Within each hook, an earlier strategy that blocks prevents later strategies from running.

A typical production config uses all four:

{
  "lowBudgetStrategies": ["downgrade_model", "reduce_max_tokens", "disable_expensive_tools", "limit_remaining_calls"],
  "maxTokensWhenLow": 512,
  "expensiveToolThreshold": 200000,
  "maxRemainingCallsWhenLow": 5
}

Retry on Deny

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| retryOnDeny | boolean | false | Retry tool reservations after denial |
| retryDelayMs | number | 2000 | Delay between retries (ms) |
| maxRetries | number | 1 | Maximum retry attempts |

Dry-Run Mode

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| dryRun | boolean | false | Use in-memory simulated budget (no Cycles server needed) |
| dryRunBudget | number | 100000000 | Starting budget for dry-run mode |

Cost Estimation

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| costEstimator | function | - | Custom callback (context) => number \| undefined for dynamic tool cost estimation |

The costEstimator receives a context object with toolName, durationMs, estimate, and result and should return the actual cost or undefined to use the estimate.
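
As an illustration, a costEstimator might bill one tool by runtime and leave everything else at its estimate. This is a sketch only: the `ToolCostContext` interface is an assumed shape built from the four fields named above (their exact types are not documented here), and the rate of 400 units per millisecond is an invented figure.

```typescript
// Assumed shape of the context object; field names come from the text above,
// the types are a guess.
interface ToolCostContext {
  toolName: string;
  durationMs: number;
  estimate: number;
  result?: unknown;
}

// Hypothetical pricing: code_execution is billed by runtime, everything else
// keeps its configured estimate (signalled by returning undefined).
const UNITS_PER_MS = 400; // invented rate, for illustration only

const exampleCostEstimator = (context: ToolCostContext): number | undefined => {
  if (context.toolName === "code_execution") {
    return Math.round(context.durationMs * UNITS_PER_MS);
  }
  return undefined; // fall back to the original estimate
};

console.log(
  exampleCostEstimator({ toolName: "code_execution", durationMs: 2000, estimate: 1_000_000 }),
); // 800000
```

Because the callback is a function, it cannot be expressed in a JSON config file and has to be supplied wherever the plugin accepts programmatic configuration.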

Budget Transitions

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| onBudgetTransition | function | - | Callback fired when budget level changes (e.g. healthy → low) |
| budgetTransitionWebhookUrl | string | - | POST webhook URL for budget level transitions |

Per-User/Session Scoping

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| userId | string | - | User ID for budget scoping (can be overridden via ctx.metadata.userId) |
| sessionId | string | - | Session ID for budget scoping (can be overridden via ctx.metadata.sessionId) |

Session Analytics

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| onSessionEnd | function | - | Callback with session summary at agent end |
| analyticsWebhookUrl | string | - | POST webhook URL for session summary data |

Budget Pools

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| parentBudgetId | string | - | Parent budget ID; when set, pool balance is included in hints |

Model Cost Reconciliation (v0.5.0)

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| modelCostEstimator | function | - | Callback (ctx: { model, estimatedCost, turnIndex }) => number \| undefined to reconcile model cost at commit time |

Observability (v0.5.0)

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| metricsEmitter | object | - | Object with gauge/counter/histogram methods for observability pipeline integration |
| aggressiveCacheInvalidation | boolean | true | Proactively refetch budget snapshot after every commit/release for fresher data |
| otlpMetricsEndpoint | string | - | OTLP HTTP endpoint for auto metrics export (e.g. http://localhost:4318/v1/metrics) |
| otlpMetricsHeaders | object | - | Custom HTTP headers for OTLP requests |

Resilience (v0.6.0)

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| heartbeatIntervalMs | number | 30000 | Interval for auto-extending long-running tool reservations (ms). Set 0 to disable. |
| retryableStatusCodes | number[] | [429, 503, 504] | HTTP status codes that trigger automatic retry with exponential backoff |
| transientRetryMaxAttempts | number | 2 | Max retry attempts for transient Cycles server errors |
| transientRetryBaseDelayMs | number | 500 | Base delay for exponential backoff on retries (ms) |
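
How the retry settings interact can be sketched as follows. The function below is not the plugin's implementation: `callWithRetry` and `HttpResult` are hypothetical names, and the schedule (delay doubling on each retry, no jitter) is an assumption, since the exact backoff formula is not documented here.

```typescript
// Minimal result shape needed for the retry decision (hypothetical).
interface HttpResult {
  status: number;
}

// Retry `send` while it returns a retryable status, waiting
// baseDelayMs * 2^retry between attempts (assumed schedule: 500 ms, 1000 ms, ...).
async function callWithRetry<T extends HttpResult>(
  send: () => Promise<T>,
  retryableStatusCodes: number[] = [429, 503, 504],
  maxRetries: number = 2,
  baseDelayMs: number = 500,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  let result = await send();
  for (
    let retry = 0;
    retry < maxRetries && retryableStatusCodes.includes(result.status);
    retry++
  ) {
    await sleep(baseDelayMs * 2 ** retry);
    result = await send();
  }
  // Either a non-retryable status or the last attempt's result.
  return result;
}
```

With the defaults, a request that keeps returning 503 is sent three times in total (one initial attempt plus two retries), with waits of 500 ms and 1000 ms in between. The `sleep` parameter exists only so the delay can be replaced in tests.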

Anomaly Detection (v0.6.0)

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| burnRateWindowMs | number | 60000 | Time window for burn rate anomaly detection (ms) |
| burnRateAlertThreshold | number | 3.0 | Alert when current window burn rate exceeds this multiple of the previous window |
| onBurnRateAnomaly | function | - | Callback (event: BurnRateAnomalyEvent) => void on burn rate spike |
| exhaustionWarningThresholdMs | number | 120000 | Warn when estimated time-to-exhaustion drops below this (ms) |
| onExhaustionForecast | function | - | Callback (event: ExhaustionForecastEvent) => void on exhaustion forecast |
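
The two checks described in this table can be approximated as below. This is a sketch under assumptions: all names are hypothetical, the window boundaries and the linear extrapolation of spend are my own simplifications, and the plugin's actual detection logic may differ.

```typescript
interface SpendSample {
  atMs: number; // timestamp of the committed cost
  amount: number; // cost in budget units
}

// Total spend with startMs < atMs <= endMs.
function windowSpend(samples: SpendSample[], startMs: number, endMs: number): number {
  return samples
    .filter((s) => s.atMs > startMs && s.atMs <= endMs)
    .reduce((sum, s) => sum + s.amount, 0);
}

// Ratio of current-window spend to previous-window spend, or null when
// there is no baseline to compare against.
function burnRateRatio(samples: SpendSample[], nowMs: number, windowMs: number): number | null {
  const current = windowSpend(samples, nowMs - windowMs, nowMs);
  const previous = windowSpend(samples, nowMs - 2 * windowMs, nowMs - windowMs);
  return previous === 0 ? null : current / previous;
}

function isBurnRateAnomaly(
  samples: SpendSample[],
  nowMs: number,
  windowMs: number = 60_000,
  threshold: number = 3.0,
): boolean {
  const ratio = burnRateRatio(samples, nowMs, windowMs);
  return ratio !== null && ratio > threshold;
}

// Linear forecast: at the current window's rate, how long until remaining hits zero?
function estimateMsToExhaustion(remaining: number, spendInWindow: number, windowMs: number): number | null {
  if (spendInWindow <= 0) return null;
  return (remaining / spendInWindow) * windowMs;
}

const history: SpendSample[] = [
  { atMs: 10_000, amount: 100_000 }, // previous window
  { atMs: 70_000, amount: 200_000 }, // current window
  { atMs: 100_000, amount: 200_000 }, // current window
];
console.log(isBurnRateAnomaly(history, 120_000)); // true (ratio 4 > 3)
console.log(estimateMsToExhaustion(800_000, 400_000, 60_000)); // 120000
```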

Debugging (v0.6.0)

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| enableEventLog | boolean | false | Record every reserve/commit/deny/block decision in sessionSummary.eventLog |

How It Works

Budget Levels

| Level | Condition | What Happens |
|-------|-----------|--------------|
| healthy | remaining > lowBudgetThreshold | Pass through; no intervention |
| low | exhaustedThreshold < remaining <= lowBudgetThreshold | Apply low-budget strategies, inject warnings |
| exhausted | remaining <= exhaustedThreshold | Block execution (failClosed=true) or warn + track locally (failClosed=false) |
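
The level conditions translate directly into a small classifier. This is an illustrative sketch; `classifyBudget` is a hypothetical name, not an export of the plugin.

```typescript
type BudgetLevel = "healthy" | "low" | "exhausted";

// Mirrors the conditions in the table above; thresholds use the same units
// as the remaining balance.
function classifyBudget(
  remaining: number,
  lowBudgetThreshold: number = 10_000_000,
  exhaustedThreshold: number = 0,
): BudgetLevel {
  if (exhaustedThreshold >= lowBudgetThreshold) {
    throw new RangeError("exhaustedThreshold must be strictly less than lowBudgetThreshold");
  }
  if (remaining <= exhaustedThreshold) return "exhausted";
  if (remaining <= lowBudgetThreshold) return "low";
  return "healthy";
}

console.log(classifyBudget(10_000_001)); // "healthy"
console.log(classifyBudget(10_000_000)); // "low"
console.log(classifyBudget(0)); // "exhausted"
```

Checking the exhausted condition first keeps the boundaries unambiguous: a balance exactly at a threshold belongs to the lower level.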

Hook: before_model_resolve

Fetches budget state and reserves budget for the model call. The reservation is held open and committed later (in before_prompt_build or at agent_end), allowing the optional modelCostEstimator callback to reconcile estimated vs actual costs. When budget is low:

  • Applies model fallbacks (including chained fallbacks like opus → [sonnet, haiku])
  • Enforces limit_remaining_calls if configured
  • Attaches budget status metadata to ctx.metadata["openclaw-budget-guard-status"]

When budget is exhausted and failClosed=true, the plugin blocks the model call by overriding the model name to __cycles_budget_exhausted__, which causes the LLM provider to reject the request. The user sees "Unknown model: openai/__cycles_budget_exhausted__"; this is intentional. OpenClaw's before_model_resolve hook does not support { block: true } like before_tool_call does (feature request), so this workaround is the only way to prevent model execution when budget runs out.

Hook: before_prompt_build

Commits any pending model reservation from the previous turn (with modelCostEstimator reconciliation if configured). When injectPromptBudgetHint is enabled, injects a system context hint with:

  • Current remaining balance and percentage
  • Budget level warnings
  • Forecast projections (estimated remaining tool/model calls based on average costs)
  • Team pool balance (when parentBudgetId is configured)
  • Token limit guidance (when reduce_max_tokens strategy is active)

Example hint:

Budget: 5000000 USD_MICROCENTS remaining. Budget is low — prefer cheaper models and avoid expensive tools. 50% of budget remaining. Est. ~10 tool calls and ~5 model calls remaining at current rate. Team pool: 50000000 remaining.

Hook: before_tool_call

  1. Checks tool permissions against allowlist/blocklist
  2. Applies disable_expensive_tools and limit_remaining_calls strategies
  3. Creates a Cycles reservation with configured TTL, overage policy, and currency
  4. On denial, optionally retries (when retryOnDeny=true)
  5. Blocks or allows based on the reservation decision

Hook: after_tool_call

Commits the reservation with actual cost. Uses the costEstimator callback if configured, otherwise uses the original estimate. Tracks per-tool cost breakdowns for the session summary.

Hook: agent_end

  1. Releases orphaned reservations (defensive cleanup)
  2. Fetches final budget state
  3. Builds session summary with cost breakdown, forecasts, and timing
  4. Calls onSessionEnd callback and fires analytics webhook if configured
  5. Attaches summary to ctx.metadata["openclaw-budget-guard"]

Chained Model Fallbacks

Model fallbacks support both single values and ordered chains:

{
  "modelFallbacks": {
    "anthropic/claude-opus-4-20250514": ["anthropic/claude-sonnet-4-20250514", "anthropic/claude-haiku-4-5-20251001"],
    "openai/gpt-4o": "openai/gpt-4o-mini"
  }
}

When budget is low, the plugin tries each candidate in order and selects the first one whose cost fits within the remaining budget.
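
The selection rule can be sketched as a pure function. The names here are hypothetical, and two details are assumptions rather than documented behavior: that "fits" means cost <= remaining, and that a candidate missing from modelBaseCosts is priced at defaultModelCost.

```typescript
type ModelFallbacks = Partial<Record<string, string | string[]>>;
type ModelCosts = Partial<Record<string, number>>;

// Walk the fallback chain in order and return the first candidate whose
// estimated cost fits the remaining budget; otherwise keep the original model.
function pickFallbackModel(
  model: string,
  remaining: number,
  fallbacks: ModelFallbacks,
  baseCosts: ModelCosts,
  defaultModelCost: number = 500_000,
): string {
  const entry = fallbacks[model];
  if (entry === undefined) return model; // no chain configured
  const candidates = Array.isArray(entry) ? entry : [entry];
  for (const candidate of candidates) {
    const cost = baseCosts[candidate] ?? defaultModelCost;
    if (cost <= remaining) return candidate;
  }
  return model; // no candidate fits, so the original model is used
}

const chain: ModelFallbacks = {
  "anthropic/claude-opus-4-20250514": [
    "anthropic/claude-sonnet-4-20250514",
    "anthropic/claude-haiku-4-5-20251001",
  ],
  "openai/gpt-4o": "openai/gpt-4o-mini",
};
const costs: ModelCosts = {
  "anthropic/claude-sonnet-4-20250514": 300_000,
  "anthropic/claude-haiku-4-5-20251001": 100_000,
  "openai/gpt-4o-mini": 100_000,
};

// With 200,000 units left, sonnet (300,000) does not fit but haiku (100,000) does.
console.log(pickFallbackModel("anthropic/claude-opus-4-20250514", 200_000, chain, costs));
```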

Tool Allowlists and Blocklists

Control which tools can be called using glob-style patterns:

{
  "toolAllowlist": ["web_search", "code_*"],
  "toolBlocklist": ["dangerous_*"]
}

  • Blocklist takes precedence over allowlist
  • Supports exact names and * wildcards (prefix: code_*, suffix: *_tool, all: *)
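
A permission check with these two rules could look like the sketch below. The function names are hypothetical and the glob handling (only * is special, matched against the whole tool name) is an assumption about how the patterns are interpreted.

```typescript
// Convert a pattern with * wildcards into an anchored RegExp.
function globToRegExp(pattern: string): RegExp {
  // Escape every regex metacharacter except *, then expand * to ".*".
  const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp("^" + escaped.replace(/\*/g, ".*") + "$");
}

function isToolPermitted(tool: string, allowlist?: string[], blocklist?: string[]): boolean {
  const matchesAny = (patterns: string[]): boolean =>
    patterns.some((pattern) => globToRegExp(pattern).test(tool));
  if (blocklist && matchesAny(blocklist)) return false; // blocklist wins
  if (allowlist) return matchesAny(allowlist); // only listed tools pass
  return true; // no allowlist configured: everything not blocked is allowed
}

const allow = ["web_search", "code_*"];
const block = ["dangerous_*"];
console.log(isToolPermitted("code_execution", allow, block)); // true
console.log(isToolPermitted("dangerous_delete", allow, block)); // false
console.log(isToolPermitted("read_file", allow, block)); // false (not on the allowlist)
```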

Tool Call Limits

Cap the number of times a specific tool can be invoked per session. Useful for consequential actions like sending emails or triggering deployments:

{
  "toolCallLimits": {
    "send_email": 10,
    "deploy": 3
  }
}

Once a tool reaches its limit, further calls are blocked with a descriptive reason. Tools without a limit are unrestricted. Limits reset on each new agent session.
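
Per-session limits amount to a counter per tool name. The class below is a hypothetical sketch of that idea, not the plugin's internal code.

```typescript
// Counts invocations per tool and refuses calls once a configured limit is hit.
class ToolCallLimiter {
  private counts = new Map<string, number>();
  private limits: Partial<Record<string, number>>;

  constructor(limits: Partial<Record<string, number>>) {
    this.limits = limits;
  }

  // Returns true (and records the call) if the tool may run, false if its limit is reached.
  tryCall(tool: string): boolean {
    const limit = this.limits[tool];
    const used = this.counts.get(tool) ?? 0;
    if (limit !== undefined && used >= limit) return false;
    this.counts.set(tool, used + 1);
    return true;
  }

  // Called at the start of a new agent session.
  reset(): void {
    this.counts.clear();
  }
}

const limiter = new ToolCallLimiter({ send_email: 10, deploy: 3 });
const deployResults = [1, 2, 3, 4].map(() => limiter.tryCall("deploy"));
console.log(deployResults); // [true, true, true, false]
```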

Budget Transition Alerts

Configure callbacks or webhooks to be notified when budget level changes:

{
  "budgetTransitionWebhookUrl": "https://hooks.example.com/budget-alert"
}

Or programmatically:

{
  onBudgetTransition: (event) => {
    console.log(`Budget changed: ${event.previousLevel} → ${event.currentLevel}`);
  }
}

Error Handling

The plugin exports two structured error types:

import { BudgetExhaustedError, ToolBudgetDeniedError } from "@runcycles/openclaw-budget-guard";

  • BudgetExhaustedError (code: "BUDGET_EXHAUSTED"): thrown when budget is exhausted and failClosed=true. Includes remaining, tenant, and budgetId properties. The error message includes an actionable hint to increase budget via the Cycles API.
  • ToolBudgetDeniedError (code: "TOOL_BUDGET_DENIED") — available as a structured error type for tool denials. Includes toolName property.

failClosed — Block vs. Allow on Budget Denial

The failClosed setting (default: true) controls what happens when a model reservation is denied — either because the budget is exhausted or because the Cycles server rejects the reservation (e.g., the estimated cost exceeds remaining budget).

failClosed: true — The plugin blocks the model call. It returns a synthetic model override (__cycles_budget_exhausted__) that causes the LLM provider to reject the request. The agent stops. Use this in production when overspend is unacceptable.

failClosed: false — The plugin logs a warning and allows the model call to proceed. The estimated cost is tracked locally (session summary, cost breakdown, forecasting) even though no server-side reservation was committed. Use this for shadow/monitoring mode — you see what would have been blocked without disrupting the agent.

| Scenario | failClosed: true | failClosed: false |
|---|---|---|
| Budget exhausted (cached snapshot) | Block | Warn + allow |
| Server denies reservation (estimate > remaining) | Block | Warn + allow + track cost locally |
| Low-budget call limit reached (model) | Block | Warn + allow |
| Low-budget call limit reached (tool) | Always block | Always block |
| Expensive tool threshold exceeded | Always block | Always block |
| Tool reservation denied | Always block | Always block |

Note: All tool-level enforcement (reservation denials, call limits, expensive tool threshold) always blocks regardless of failClosed — tools have no fallback mechanism. failClosed only affects model-level decisions.
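The table can also be read as a small decision function. The scenario identifiers and the `resolveOutcome` name below are hypothetical labels for the table rows, chosen for illustration; the plugin does not necessarily expose anything like them.

```typescript
// Hypothetical labels for the six rows of the table above.
type Scenario =
  | "budget_exhausted"
  | "model_reservation_denied"
  | "model_low_budget_limit"
  | "tool_low_budget_limit"
  | "tool_expensive_threshold"
  | "tool_reservation_denied";

type Outcome = "block" | "warn_and_allow";

// Only model-level scenarios are affected by failClosed.
const MODEL_LEVEL: ReadonlySet<Scenario> = new Set<Scenario>([
  "budget_exhausted",
  "model_reservation_denied",
  "model_low_budget_limit",
]);

function resolveOutcome(scenario: Scenario, failClosed: boolean): Outcome {
  if (!MODEL_LEVEL.has(scenario)) {
    return "block"; // tool-level enforcement always blocks
  }
  return failClosed ? "block" : "warn_and_allow";
}
```

For example, `resolveOutcome("tool_reservation_denied", false)` still yields `"block"`, which mirrors the note that tools have no fallback mechanism.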

Fail-Open Behavior (Network Errors)

Separately from failClosed, the plugin handles network/transient errors with a fail-open strategy:

  • If the Cycles server is unreachable, the plugin assumes healthy budget and allows execution
  • If a commit fails, execution continues (logged but non-blocking)

This is always fail-open regardless of failClosed — a transient network blip should not kill every agent. failClosed only controls behavior when the server confirms the budget is insufficient.
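To make the distinction concrete, here is a rough sketch that separates a confirmed denial from an unreachable server. `ReservationResult` and `allowModelCall` are hypothetical names, and the three-way result type is an assumption made for illustration rather than the plugin's actual data model.

```typescript
// Hypothetical outcome of asking the Cycles server for a model reservation.
type ReservationResult =
  | { kind: "approved" }
  | { kind: "denied" } // server confirmed the budget is insufficient
  | { kind: "unreachable" }; // network or other transient error

function allowModelCall(result: ReservationResult, failClosed: boolean): boolean {
  if (result.kind === "approved") {
    return true;
  }
  if (result.kind === "unreachable") {
    return true; // always fail-open: a network blip should not stop the agent
  }
  // Confirmed denial: this is the only case where failClosed matters.
  return !failClosed;
}
```

The design choice this reflects is that `failClosed` expresses a policy about known overspend, not about infrastructure availability.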

Troubleshooting

"Skipping registration" warning during install

  • This is normal. OpenClaw loads the plugin during install before your config is written. The plugin detects the missing config, logs a warning, and skips registration. After you add your config and restart the gateway, the plugin will register normally.

Plugin not loading

  • Verify the plugin is enabled: openclaw plugins list
  • Check that openclaw.plugin.json is included in the installed package

"Unknown model: openai/__cycles_budget_exhausted__" or "Budget exhausted"

Your budget has run out. To resume:

  1. Fund the budget via the Cycles Admin API:

    curl -X POST "http://localhost:7979/v1/admin/budgets/fund?scope=tenant:my-org&unit=USD_MICROCENTS" \
      -H "X-Cycles-API-Key: your-admin-key" \
      -H "Content-Type: application/json" \
      -d '{"operation": "CREDIT", "amount": 50000000, "idempotency_key": "topup-001"}'

    This adds 50,000,000 units ($0.50) to the budget. Adjust the scope to match your tenant (and budgetId if set).

  2. Start a new agent session — the plugin fetches fresh budget state at the start of each session.

For details on budget management, see Budget Allocation and Management.
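The unit arithmetic is easy to get wrong by a factor of 100, so a small conversion helper may be useful. It assumes the relationship implied by the figures in this README (50,000,000 units = $0.50, so 100,000,000 units per dollar); the function names are hypothetical.

```typescript
// 1 USD = 100 cents, and 1 cent = 1,000,000 microcents.
const MICROCENTS_PER_USD = 100_000_000;

// Convert a USD_MICROCENTS amount to dollars.
function microcentsToUsd(units: number): number {
  return units / MICROCENTS_PER_USD;
}

// Convert dollars to a whole number of USD_MICROCENTS units.
function usdToMicrocents(usd: number): number {
  return Math.round(usd * MICROCENTS_PER_USD);
}
```

For instance, `usdToMicrocents(5)` gives `500000000`, the `amount` to send when topping up by five dollars.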

"cyclesBaseUrl is required" error

  • Set cyclesBaseUrl in your plugin config (use "${CYCLES_BASE_URL}" for env var interpolation)

Budget always shows "healthy"

  • Verify currency, tenant, and budgetId match your Cycles setup
  • Set logLevel: "debug" to see raw balance responses

Tools not being blocked

  • Check toolBaseCosts includes your tool (default cost is 100,000 units)
  • Check failClosed is true (default)

Model not being downgraded

  • The exact model name must match a key in modelFallbacks
  • Check model costs in modelBaseCosts — fallback must be cheaper than remaining budget

Production checklist

Before deploying to production:

  • [ ] API key stored as env var (CYCLES_API_KEY), not in config file
  • [ ] failClosed: true (default — blocks on exhausted budget)
  • [ ] dryRun: false (default — uses real Cycles server)
  • [ ] modelBaseCosts set for each model your agent uses
  • [ ] toolBaseCosts set for at least your top 5 tools by usage
  • [ ] toolCallLimits set for dangerous tools (send_email, deploy, etc.)
  • [ ] lowBudgetThreshold calibrated for your session duration (default 10M = $0.10)
  • [ ] Budget transition monitoring via onBudgetTransition callback or budgetTransitionWebhookUrl
  • [ ] Session analytics via onSessionEnd callback or analyticsWebhookUrl
  • [ ] Run one test session with logLevel: "debug" and enableEventLog: true to verify costs
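Parts of this checklist can be checked mechanically. The following is a minimal sketch of such a check, assuming the config is available as a plain object with the option names used in this README; `GuardConfigLike` and `checkProductionReadiness` are hypothetical and not shipped with the plugin.

```typescript
// Subset of plugin options referenced by the checklist. The shape is an assumption.
interface GuardConfigLike {
  failClosed?: boolean;
  dryRun?: boolean;
  modelBaseCosts?: Record<string, number>;
  toolBaseCosts?: Record<string, number>;
  toolCallLimits?: Record<string, number>;
  budgetTransitionWebhookUrl?: string;
  onBudgetTransition?: unknown;
  analyticsWebhookUrl?: string;
  onSessionEnd?: unknown;
}

function isEmpty(map: Record<string, number> | undefined): boolean {
  return map === undefined || Object.keys(map).length === 0;
}

// Returns one warning per checklist item that the config does not satisfy.
function checkProductionReadiness(config: GuardConfigLike): string[] {
  const warnings: string[] = [];
  if (config.failClosed === false) warnings.push("failClosed is false: exhausted budgets will not block model calls");
  if (config.dryRun === true) warnings.push("dryRun is true: the real Cycles server is not used");
  if (isEmpty(config.modelBaseCosts)) warnings.push("modelBaseCosts is not set");
  if (isEmpty(config.toolBaseCosts)) warnings.push("toolBaseCosts is not set");
  if (isEmpty(config.toolCallLimits)) warnings.push("toolCallLimits is not set");
  if (!config.budgetTransitionWebhookUrl && !config.onBudgetTransition) {
    warnings.push("no budget transition monitoring configured");
  }
  if (!config.analyticsWebhookUrl && !config.onSessionEnd) {
    warnings.push("no session analytics configured");
  }
  return warnings;
}
```

An empty result means the mechanical items pass. Items such as calibrating `lowBudgetThreshold` or running a debug session still need a human check.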

Known Limitations

| Limitation | Impact | Workaround |
|---|---|---|
| Model cost is estimated by default. OpenClaw has no after_model_resolve hook, so model costs are based on modelBaseCosts estimates. The modelCostEstimator callback can reconcile costs if you have a proxy or gateway with token counts. | Cost tracking for models is approximate unless you provide a modelCostEstimator. The plugin will never overspend; it may under-track slightly. | Use modelCostEstimator to reconcile costs. Or buffer modelBaseCosts estimates 10–20% higher than expected. |
| ALLOW_WITH_CAPS decisions are not enforced. If the Cycles server returns caps (max_tokens, tool allowlist) alongside an ALLOW decision, the plugin stores them but does not apply them downstream. | Low risk: v0 Cycles servers rarely return caps. | Monitor Cycles protocol updates. |
| Per-user/session scoping uses custom dimensions. User and session IDs are passed as dimensions.user / dimensions.session in the reservation subject. v0 Cycles servers may ignore custom dimensions for balance filtering. | Per-user budget isolation depends on server support for dimensions. | Verify scoping works with your Cycles server version before relying on it in production. |
| Heartbeat requires client support. Reservation auto-extension (heartbeatIntervalMs) calls client.extendReservation(). If the Cycles client does not implement this method, heartbeats are silently skipped. | Long-running tools may still lose cost tracking if the client lacks extendReservation. | Use per-tool TTL overrides via toolReservationTtls as fallback. |
| Model blocking uses a provider-error workaround. OpenClaw's before_model_resolve hook does not support { block: true } (feature request). When budget is exhausted, the plugin overrides the model to __cycles_budget_exhausted__, causing the provider to reject the call. The user sees "Unknown model" instead of a clean budget error. | Model calls are effectively blocked, but the error message is a provider error rather than a budget message. Tool blocking via before_tool_call works cleanly with { block: true }. | Pending OpenClaw adding block support to before_model_resolve. |
| OpenClaw does not pass model name in hook events. The before_model_resolve event only contains { prompt }, with no model name (feature request). The plugin auto-detects the model from system config or falls back to defaultModelName. | Model-specific cost tracking requires defaultModelName to be set in plugin config. | Set defaultModelName to your agent's model (e.g. "openai/gpt-5-nano"). |

For project structure, architecture diagrams, and development workflow, see ARCHITECTURE.md.

Documentation

License

Apache-2.0