@sapience-ai-corporation/openclaw-middleware-suite

v1.0.3

Published

22 days ago

Sapience AI's 6-in-1 middleware suite for OpenClaw. These middlewares sit between your agent and the systems it can affect: intercepting every turn and every tool call, enforcing Human-in-the-Loop gates, redacting PII, blocking prompt injection, rout


npm-autobot

syed_ali

mfurqanazhar92

security middleware ai-agents openclaw human-in-the-loop runtime-interception zero-trust

The Problem

A single OpenClaw session can read your codebase, run arbitrary shell commands, manage your Google Drive, send emails on your behalf, and push code to production — all autonomously. And every turn along the way spends tokens, picks a model, and accumulates context.

Today there is no layer between "the agent decided to do something" and "it happened." That gap hides three distinct problems — and they rarely share a solution:

Safety gaps. Sandboxing stops the agent from escaping its container. Inside that container, it still holds the keys to the kingdom: API tokens, file system access, outbound network, email credentials. One hallucinated rm -rf, one prompt injection buried in a fetched document, one leaked SSN in a tool argument — and you're dealing with real damage.
Runaway cost. Every turn re-sends the full context window. Every request picks whatever model was wired in, whether the task needs it or not. A long coding session on a frontier model can burn through dollars in minutes, most of it spent re-reading stale history at premium rates.
Degrading context. As sessions grow, the agent loses its earliest instructions, forgets key decisions, and starts contradicting itself. Naive truncation throws away what matters; doing nothing blows the budget.

That's what this suite is for.

The Solution

Sapience AI acts as the control plane for OpenClaw. Six middlewares — each solving a distinct failure mode — work together in a single pipeline.

Four govern and protect the action surface:

HITL — Human approval for dangerous actions
Guardrail — Block prompt injection, data exfiltration, destructive commands
PII Sanitizer — Detect & redact sensitive data before it leaves
Tool Call Limit — Enforce session & request budgets

Two optimize the request itself:

Context Editing — Intelligent context window compaction, preserving what matters
Model Routing — Route requests to the right model for the job

Together, they ensure every turn is safe, observable, and cost-effective.

| # | Middleware | What It Does |
|:-:|---|---|
| 1 | :shield: HITL | Human approval for dangerous actions |
| 2 | :brain: Context Editing | Intelligent context window compaction |
| 3 | :zap: Model Routing | Route requests to the right model & provider |
| 4 | :lock: Guardrail | Block prompt injection, exfiltration, destructive commands |
| 5 | :detective: PII Sanitizer | Detect & redact personally identifiable information |
| 6 | :bar_chart: Tool Call Limit | Enforce session & request budgets per tool |

The suite ships in two consumption modes, and the rest of this README is organized around that split:

Plugin — install once, manage with the dashboard and sai CLI. Zero code. The OpenClaw gateway loads the middlewares for you and wires every hook.
Programmatic Usage — import { ... } from '@sapience-ai-corporation/openclaw-middleware-suite/<middleware>'. Embed any subset directly in a Node app, bring your own pipeline, mix and match.

Both modes share the same on-disk config store, so you can start with the plugin and graduate into programmatic embedding without re-learning anything.

The plugin is the zero-code path: install the npm package, configure once with sai init, and the OpenClaw gateway loads all six middlewares and wires every hook. Every operational concern — toggling middlewares on/off, editing policies, viewing audit logs, syncing model catalogs — happens through the dashboard or sai CLI.

Quick Start

1. Install

# From npm (published plugin)
openclaw plugins install @sapience-ai-corporation/openclaw-middleware-suite

# Or from source
git clone https://github.com/Sapience-AI/openclaw-middleware-suite.git
cd openclaw-middleware-suite
npm install && npm run build
openclaw plugins install --link .

# Expose the `sai` CLI on your PATH (Windows users — admin / Developer Mode required)
mkdir -p ~/.local/bin
chmod +x ~/.openclaw/extensions/sapience-ai-suite/dist/index.js
ln -sf ~/.openclaw/extensions/sapience-ai-suite/dist/index.js ~/.local/bin/sai

2. Configure

# Interactive wizard — walks you through security level, modules, and policies
sai init

# Start the gateway — dashboard served at http://localhost:9000/dashboard
openclaw gateway start

That's it. From here, every middleware can be toggled, configured, and inspected from the Dashboard or its dedicated sai <middleware> … CLI subcommand (links inside each middleware section below).

How the Plugin Works

A few facts that apply to all six middlewares once they're loaded as a plugin — worth knowing before you tweak anything:

One authoritative disk store. All six share the same config file: ~/.openclaw/sapience-ai-suite/sapience-ai-suite.json. The dashboard, the sai CLI, and the middlewares themselves all read and write the same JSON. There is no per-middleware config file scattered around the filesystem.
Plugin-level on/off flag. Each middleware has a boolean toggle at plugin_config.middlewares[name] in that file (managed via the dashboard's overview page or sai init). When the flag is false, the plugin runtime short-circuits the corresponding hook: the middleware code is never executed, so a disabled middleware adds no runtime overhead.
Operational config lives under each middleware's sub-tree. hitl, context_editing, model_routing, guardrail, pii_sanitizer, tool_call_limit — each owns its own block. (HITL nests its user config under hitl.policy; Context Editing under context_editing.configOverrides.) These are what the dashboard editors and sai <middleware> CLI subcommands manipulate.
The plugin runtime always passes {} at initialize(). It relies entirely on the disk overlay for configuration. The one exception is Context Editing, which receives { pluginApi } so its ICC LLM extraction can dispatch through OpenClaw's plugin API. This is what makes plugin behavior fully reproducible: defaults + whatever's in sapience-ai-suite.json, nothing else.
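
The short-circuit behavior described above can be pictured with a minimal sketch. The `SuiteConfig` interface, `isMiddlewareEnabled`, and `dispatchHook` are hypothetical names invented for illustration; only the `plugin_config.middlewares[name]` path and the sub-tree names come from this README. Treating a missing flag as "disabled" is an assumption.

```typescript
// Hypothetical shape of sapience-ai-suite.json, limited to keys this README names.
interface SuiteConfig {
  plugin_config?: { middlewares?: Record<string, boolean> };
  hitl?: { policy?: unknown };
  context_editing?: { configOverrides?: unknown };
  [subtree: string]: unknown;
}

// Assumption: a missing flag counts as disabled.
function isMiddlewareEnabled(config: SuiteConfig, name: string): boolean {
  return config.plugin_config?.middlewares?.[name] === true;
}

// Runs the hook handler only when the middleware's flag is on;
// otherwise the handler is never invoked.
function dispatchHook<T>(
  config: SuiteConfig,
  name: string,
  handler: () => T,
): T | undefined {
  if (!isMiddlewareEnabled(config, name)) return undefined;
  return handler();
}
```

With `guardrail: true` and `hitl: false` in the flag map, a call to `dispatchHook(config, "hitl", ...)` returns `undefined` without running the handler, while the `guardrail` handler runs normally.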

Middlewares

Quick links: HITL · Context Editing · Model Routing · Guardrail · PII Sanitizer · Tool Call Limit.

:shield: Human-in-the-Loop (HITL)

The last line of defense. Every action your agent takes is evaluated against a security policy. Dangerous actions require explicit human approval before they execute.

Why It Exists

AI agents make mistakes. They hallucinate file paths, misinterpret instructions, and occasionally try to do things that would be catastrophic in production. HITL ensures that a human reviews high-risk actions before they happen — not after.

How It Works

Tool Call Arrives
  │
  ├─ Policy Lookup ─── ALLOW ──→ Execute immediately
  │
  ├─ Policy Lookup ─── DENY ───→ Block with reason
  │
  └─ Policy Lookup ─── ASK ────→ Risk Assessment
                                    │
                                    ├─ Destructive classifier
                                    ├─ Irreversibility scorer (0-100)
                                    └─ Memory risk forecaster
                                    │
                                    ▼
                                 Approval Queue
                                    │
                                    ├─ /approve ──→ Execute
                                    ├─ /approve <TOTP> ──→ Execute (MFA verified)
                                    └─ /deny ──→ Block

Features — vs. ClawReins

ClawReins is the closest comparable HITL layer for OpenClaw. The two share a common foundation — three-decision policies, irreversibility scoring, destructive command detection, and a WhatsApp/Telegram approval channel. Sapience HITL extends that foundation in two places that matter most: a broader set of protected tools, and an approval mechanism the agent can't bypass.

Legend: ✅ supported · ❌ not present

| Feature | ClawReins | Sapience HITL |
| --- | :-: | :-: |
| Policy model: ALLOW / DENY / ASK per module & method, with allowPaths / denyPaths globs | ✅ | ✅ |
| Risk scoring: irreversibility (0–100), destructive classifier (HIGH / CATASTROPHIC), trajectory forecast | ✅ | ✅ |
| Approval queue: async WhatsApp/Telegram + TTY with TTL expiry & trust rate limiting | ✅ | ✅ |
| Immutable audit trail: append-only JSONL with full risk scores per decision | ✅ | ✅ |
| Catastrophic-action confirmation | On-screen CONFIRM-XXXX token | TOTP 2FA (RFC 6238) |
| Agent cannot read the approval code (self-approval prevention) | ❌ token appears in terminal/chat | ✅ code generated off-device |
| ArgsHash enforcement on retry (prevents param substitution) | ❌ logged only | ✅ verified on approval |
| Protects FileSystem, Shell, Browser, Network, Gateway | ✅ | ✅ |
| Gmail (list, send, draft, delete) | ❌ | ✅ |
| GoogleDrive (list, upload, download, delete, share, move) | ❌ | ✅ |
| Memory (search, add, delete) | ❌ | ✅ |
| Process (list, poll, log, write, kill, clear, remove) | ❌ | ✅ |
| Shell subcommand routing (gog, gdrive, rclone → Gmail/Drive policy) | ❌ | ✅ |
| Gateway endpoint reclassification (gateway.maton.ai/* auto-mapped) | ❌ | ✅ |
| MCP tool name mapping (mcp__google_workspace__*) | ❌ | ✅ |

Any unmapped tool falls through to defaultAction (ASK).
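
To make the three-decision lookup and the defaultAction fallback concrete, here is a minimal sketch. The `HitlPolicy` shape, the function names, and the sample policy entries are assumptions for illustration, not the suite's actual schema; only the ALLOW / DENY / ASK decisions, the module and method names, and the ASK default come from this README.

```typescript
type Decision = "ALLOW" | "DENY" | "ASK";
type Outcome = "execute" | "block" | "queue-for-approval";

// Hypothetical policy shape: module -> method -> decision.
interface HitlPolicy {
  defaultAction: Decision;
  modules: Record<string, Record<string, Decision>>;
}

// Any module or method missing from the policy falls through to defaultAction.
function lookupDecision(policy: HitlPolicy, moduleName: string, method: string): Decision {
  return policy.modules[moduleName]?.[method] ?? policy.defaultAction;
}

// Mirrors the three branches of the flow diagram.
function route(decision: Decision): Outcome {
  switch (decision) {
    case "ALLOW":
      return "execute";
    case "DENY":
      return "block";
    default:
      return "queue-for-approval"; // ASK: risk assessment, then approval queue
  }
}

// Illustrative policy only; the per-method decisions are made up.
const samplePolicy: HitlPolicy = {
  defaultAction: "ASK",
  modules: {
    Gmail: { list: "ALLOW", send: "ASK", delete: "DENY" },
  },
};
```

Under this sample policy, `Gmail.list` executes immediately, `Gmail.delete` is blocked, and a tool the policy has never heard of is queued for human approval.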

CLI

sai hitl policy           # View/manage security policies
sai hitl stats            # View approval statistics
sai hitl audit            # View decision audit trail
sai hitl reset            # Reset statistics

:brain: Context Editing

Intelligent context window compaction. Long sessions don't lose critical context — the middleware automatically compresses old messages while preserving what matters.

Why It Exists

LLM context windows are finite — and expensive. Every token in the context window costs money on every single request. As a session grows, you're paying to re-read thousands of tokens of stale conversation history that the agent no longer needs. A 120K-token session hitting Opus 4.7 on every turn can burn through dollars in minutes, most of it on context the model is barely using.

And it's not just cost. In long coding sessions, the agent gradually loses its earliest instructions, forgets key decisions, and starts contradicting itself as critical context gets pushed out by noise. Naive truncation throws away important context indiscriminately.

Context Editing solves both problems: it compresses old messages using LLM-powered extraction (ICC), keeping token counts — and costs — under control while preserving the context that actually matters.

How It Works

Turn Begins
  │
  └─ before_model_resolve hook ──→ (fires BEFORE OpenClaw's SessionManager opens the JSONL)
                                          │
                                          ├─ Walk session JSONL
                                          │    ├─ Count user messages
                                          │    └─ Sum assistant token usage
                                          │
                                          ├─ Evaluate triggers
                                          │    ├─ Token count > threshold?
                                          │    ├─ Message count > threshold?
                                          │    └─ Adaptive rules?
                                          │
                                          ▼  (if triggered)
                                        Run ICC extraction
                                          ├─ Priority Preservation
                                          ├─ Conflict Resolution
                                          └─ Entity Locks
                                          │
                                          ▼
                                        Append compaction entry to JSONL
                                          │
                                          ▼
  └─ OpenClaw opens SessionManager ──→  Reads compacted JSONL
                                          │
                                          ▼
                                        LLM call sees compacted history

The single-hook design does everything in one place: detect, extract, and write — all before OpenClaw's own SessionManager opens the JSONL, so there's no concurrent-SM-on-same-file race. The same turn's LLM call sees the compacted history (no one-turn lag). Per-turn assistant token usage is read directly from the persisted JSONL entries, so the tokensSaved metric stays provider-precise without needing a separate push hook.
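
The detect step can be sketched roughly as follows. This is an illustration under stated assumptions, not the middleware's code: the `SessionEntry` and `TriggerConfig` shapes and the function names are hypothetical, "both" is assumed to mean that either threshold may fire, and adaptive rules are left out. The 80k-token and 50-message defaults are the ones listed in the feature comparison in this section.

```typescript
type TriggerMode = "token" | "message" | "both";

interface TriggerConfig {
  mode: TriggerMode;
  tokenThreshold: number;
  messageThreshold: number;
}

// Hypothetical, simplified view of one persisted JSONL entry.
interface SessionEntry {
  role: "user" | "assistant";
  usageTokens?: number; // provider-reported usage on assistant entries
}

const DEFAULT_TRIGGERS: TriggerConfig = {
  mode: "both", // assumption: the default mode is not stated in this README
  tokenThreshold: 80000,
  messageThreshold: 50,
};

// Walk the session: count user messages and sum assistant token usage.
function summarize(entries: SessionEntry[]): { userMessages: number; assistantTokens: number } {
  let userMessages = 0;
  let assistantTokens = 0;
  for (const entry of entries) {
    if (entry.role === "user") userMessages += 1;
    else assistantTokens += entry.usageTokens ?? 0;
  }
  return { userMessages, assistantTokens };
}

function shouldCompact(entries: SessionEntry[], cfg: TriggerConfig = DEFAULT_TRIGGERS): boolean {
  const { userMessages, assistantTokens } = summarize(entries);
  const tokenHit = assistantTokens > cfg.tokenThreshold;
  const messageHit = userMessages > cfg.messageThreshold;
  if (cfg.mode === "token") return tokenHit;
  if (cfg.mode === "message") return messageHit;
  return tokenHit || messageHit; // assumption: "both" fires on either trigger
}
```

Because the check only reads persisted entries, it can run inside the hook before anything else opens the session file, which is the ordering the diagram depends on.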

ICC Pipeline (Intelligent Context Compression)

| Stage | What It Extracts |
| --- | --- |
| Priority Preservation | Critical objectives and instructions the agent must not forget |
| Conflict Resolution | Contradictions in the transcript are detected and resolved |
| Entity Locks | Key values (names, paths, config values) are preserved verbatim |

Features — vs. OpenClaw's Built-in Compaction

OpenClaw already ships a robust compaction pipeline: when a prompt would overflow the context window, it splits the transcript into chunks, summarizes each one, then merges the partials — a bulk summarizer built to survive oversized tool outputs, tool-call pairing constraints, and transient API failures. Sapience Context Editing does not replace that pipeline — it adds a cheaper, steerable fast path on top of it.

The fast path is a single LLM call whose output (the ICC extraction: entities, conflicts, priorities) is the compaction summary. One call means the prompt can be user-steered, the model can be swapped, and compaction can fire early on your own thresholds instead of waiting for overflow. If the ICC call ever fails — for example, on a transcript too large for the extraction model's window — the middleware simply skips that turn and OpenClaw's native overflow-triggered compaction handles it on the next prompt exactly as it would for a vanilla install. You never lose coverage; you just lose the steering on the overflow edge case.

Legend: ✅ supported · ❌ not present

| Feature | OpenClaw built-in | Sapience Context Editing |
| --- | :-: | :-: |
| Compaction pipeline: summarize old messages into a dense summary | ✅ two-stage chunk + merge | ✅ single ICC call as the summary |
| Overflow-triggered compaction (fires when next prompt exceeds context window) | ✅ | ✅ |
| Manual /compact command | ✅ | ✅ |
| Identifier preservation: UUIDs, hashes, URLs, file names kept verbatim | ✅ | ✅ |
| Early / proactive compaction before hitting the context limit | ❌ reactive only | ✅ threshold-triggered |
| Token-count threshold for early compaction (default 80k, configurable) | ❌ | ✅ |
| Message-count threshold for early compaction (default 50, configurable) | ❌ | ✅ |
| Trigger mode selector: token / message / both | ❌ | ✅ |
| Keep N recent messages verbatim before the compaction cut | ❌ fixed chunk ratio | ✅ user-configurable |
| Keep N recent tokens verbatim before the compaction cut | ❌ internal knob only | ✅ user-configurable |
| Custom compaction prompt: user-supplied instructions steer what the summary preserves | ❌ fixed merge prompt | ✅ injected on the single ICC call |
| Typed Entity Locks: API endpoints, file paths, variables, constants, model names, code identifiers | ❌ | ✅ |
| Conflict Resolution: detects instruction overrides ("use X instead of Y") and locks the resolved value | ❌ | ✅ |
| Priority Preservation: TODO / FIXME / REQUIREMENT / MUST segments flagged for verbatim carry-over | ❌ | ✅ |
| Custom compaction model: run summarization on a different model than the agent | ➕ raw openclaw.json key | ✅ first-class via sai ctx model --set |
| Session pruning toggle (cache-TTL for idle contexts) | ➕ raw openclaw.json key | ✅ first-class via sai ctx pruning |
| Interactive wizard for all of the above | ❌ | ✅ sai init |
| Per-compaction audit trail (JSONL: entities, conflicts, priorities, instruction hash) | ❌ | ✅ |
| Per-session cumulative token-savings stats | ❌ | ✅ |

CLI

sai ctx stats     # Compaction state, token savings, and compaction history
sai ctx reset     # Clear compaction state

:zap: Model Routing

Route every request to the right model. Simple tasks go to fast, cheap models. Complex reasoning goes to the best. Automatic fallbacks, multi-provider support, and real-time cost tracking.

Why It Exists

Not every request needs GPT-5 or Claude Opus. A simple "list files in this directory" doesn't need a $0.015/1K-token model — but without routing, that's what it gets. Model Routing scores request complexity in real time and routes to the optimal model tier, cutting costs by up to 70% without sacrificing quality where it matters.

How It Works

Incoming Request
  │
  ├─ Complexity Scorer
  │    ├─ Message length
  │    ├─ Instruction complexity
  │    ├─ Tool usage patterns
  │    └─ Reasoning depth signals
  │
  ├─ Tier Assignment ──→ simple | standard | complex | reasoning
  │
  ├─ Model Selection ──→ Primary model for tier
  │    └─ Fallback chain if primary fails
  │
  └─ Provider Routing ──→ Route to correct API endpoint
       ├─ OpenAI format
       ├─ Anthropic format
       └─ Google format

Tier System

| Tier | Use Case | Example Models |
| --- | --- | --- |
| Simple | File reads, listing, basic Q&A | GPT-4o-mini, Claude Haiku |
| Standard | Code generation, moderate reasoning | GPT-4o, Claude Sonnet |
| Complex | Architecture decisions, complex refactors | GPT-4, Claude Opus |
| Reasoning | Multi-step planning, novel problem solving | o1, Claude Opus (extended) |
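
As a rough illustration of the tier assignment and fallback steps, the sketch below maps a complexity score to a tier and then walks that tier's fallback chain. The score boundaries, the function names, and the availability callback are assumptions made for this example, not the suite's scorer or API; the model labels are taken from the tier table.

```typescript
type Tier = "simple" | "standard" | "complex" | "reasoning";

// Hypothetical boundaries on a 0..1 complexity score; the real cut-offs are not given here.
function assignTier(score: number): Tier {
  if (score < 0.25) return "simple";
  if (score < 0.5) return "standard";
  if (score < 0.75) return "complex";
  return "reasoning";
}

// First entry is the primary model for the tier; the rest form the fallback chain.
const chains: Record<Tier, string[]> = {
  simple: ["GPT-4o-mini", "Claude Haiku"],
  standard: ["GPT-4o", "Claude Sonnet"],
  complex: ["GPT-4", "Claude Opus"],
  reasoning: ["o1", "Claude Opus (extended)"],
};

// Returns the first model in the chain that the caller reports as available,
// or undefined when the whole chain is exhausted.
function selectModel(tier: Tier, isAvailable: (model: string) => boolean): string | undefined {
  return chains[tier].find(isAvailable);
}
```

If the primary model for a tier is reported unavailable, the next entry in the chain is returned instead, so a provider outage degrades to a sibling model rather than failing the request.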

Features — vs. Manifest

Manifest is the closest comparable complexity-based model router — both share the same 23-dimension scoring core derived from the same lineage. The two target different deployment models: Manifest ships as a Docker service with a multi-tenant web dashboard and Postgres backend; Sapience Model Routing is an OpenClaw plugin running on the developer's machine with a local JSON config and CLI. The table below focuses on what Sapience adds on top of the shared scoring foundation — in particular per-session model pinning and auto prompt-cache marker injection, which compound to keep the provider's cached prefix warm across every turn of a pinned session.

Legend: ✅ supported · ❌ not present

| Feature | Manifest | Sapience Model Routing |
| --- | :-: | :-: |
| 23-dimension scorer: Aho-Corasick trie, density clustering, sigmoid (k=8) with 0.45 ambiguity threshold, four-tier boundaries | ✅ parity | ✅ parity |
| Session momentum: length-weighted blend of the last 5 tier decisions | ✅ | ✅ |
| Request deduplication: concurrent retries / double-clicks share one upstream call (30s window) | ✅ trace-id / token-based | ✅ SHA-256 inflight + completed |
| Capability-filtered fallback chain: filters by tool support, vision, context window, exclusions | ✅ | ✅ max 5 per tier |
| Native provider adapters: OpenAI / Anthropic / Google with SSE streaming conversion | ✅ (+ ChatGPT Codex subscription) | ✅ |
| Hard overrides: reasoning keyword, short-message, tool-floor, large-context | ✅ 4 overrides | ✅ 6 overrides (+ structured-output floor + session-startup /new /reset → SIMPLE) |
| Multilingual keywords: 9 languages, 1,500+ keywords across all 14 keyword dimensions | ❌ English only | ✅ |
| Routing profiles: eco / premium / agentic switch the whole fallback chain per request, each with its own per-tier model map | ❌ single deterministic map | ✅ |
| Model pinning: the same session keeps the same model across every turn, with auto-release on complexity escalation | ❌ tier-only momentum | ✅ per-session high-water mark |
| Three-strike escalation: user retries an identical request 3× → auto-bump to next tier | ❌ | ✅ |
| Auto cache-marker injection: Anthropic cache_control on last system block + last tool, Google cachedContent token passthrough | ✅ Anthropic only | ✅ Anthropic + Google |
| Session-pinned prompt caching: pinning + injection compound so the provider's cached prefix survives across turns (up to 90% off cached input on Anthropic) | ❌ no pinning → prefix drops whenever the model drifts | ✅ |
| Deterministic response cache: LRU for temperature=0 non-streaming requests (bypasses the provider entirely on repeat identical prompts) | ❌ | ✅ opt-in, 200 entries / 10 min |
| Daily cost alerts: warn / critical thresholds that fire once per day on top of a 90-day ledger | ❌ notifications exist, no budget caps | ✅ $5 warn / $20 critical (configurable) |
| Per-step audit log: request-id-scoped JSONL trace of every routing decision | Postgres agent_message rows | ✅ local JSONL |
| Config hot-reload (sapience-ai-suite.json) | ❌ restart required | ✅ fs.watchFile, 2s debounce |
| Plugin hook system: onBeforeScore / onAfterScore / onBeforeForward / onAfterForward | ❌ | ✅ 4 hooks |
| Full CLI: sai router stats / config / tiers / test / exclude / models / reset | ❌ | ✅ |
| Interactive setup wizard: profile + tier customization + port + live catalog pull | Web dashboard signup | ✅ sai init |
| Specificity routing: task-type categories (coding / vision / trading / …) override complexity tier | ✅ 9 categories | ❌ |
| Subscription OAuth: ChatGPT Plus, Claude Max, MiniMax, GitHub Copilot | ✅ | ❌ API keys only |
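
One way to read the "per-session high-water mark" in the model pinning row is sketched below. This is an interpretation, not the plugin's implementation: the `SessionPins` class, its `resolve` method, and the `modelFor` callback are hypothetical. A session keeps its pinned model while later turns score at or below the pinned tier, and the pin is released and replaced only when a turn scores higher.

```typescript
type Tier = "simple" | "standard" | "complex" | "reasoning";

const TIER_RANK: Record<Tier, number> = { simple: 0, standard: 1, complex: 2, reasoning: 3 };

class SessionPins {
  private pins = new Map<string, { tier: Tier; model: string }>();

  // Returns the model to use for this turn, pinning or re-pinning as needed.
  resolve(sessionId: string, scoredTier: Tier, modelFor: (tier: Tier) => string): string {
    const pin = this.pins.get(sessionId);
    // Keep the pinned model unless this turn escalates past the high-water mark.
    if (pin && TIER_RANK[scoredTier] <= TIER_RANK[pin.tier]) return pin.model;
    const next = { tier: scoredTier, model: modelFor(scoredTier) };
    this.pins.set(sessionId, next);
    return next.model;
  }
}
```

Keeping the model stable across turns is what lets a provider-side cached prefix stay valid, since a prompt cache is only reusable when the same model sees the same prefix again.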

CLI

sai router stats       # Daily cost ledger + per-model breakdown + alerts
sai router config      # View / edit profile, thresholds, exclusions
sai router tiers       # Inspect tier-to-model assignments
sai router models      # Sync live model catalog (LiteLLM, 24h cache)
sai router test        # Score a sample request without forwarding
sai router reset       # Reset cost history / session state

Note: Toggling Model Routing on/off requires a gateway restart. The dashboard handles this automatically with a reconnection overlay.

:lock: Guardrail

Multi-layer defense against prompt injection, data exfiltration, and destructive commands. Scans both input and output surfaces with regex, heuristic, and entropy-based detection.

Why It Exists

Prompt injection is the #1 attack vector against AI agents. A single malicious instruction hidden in a document, email, or web page can hijack your agent's behavior — making it exfiltrate secrets, delete data, or execute arbitrary commands. Guardrail catches these attacks at multiple detection layers before they can cause harm.

Detection Layers

| Layer | Technique | What It Catches |
| --- | --- | --- |
| Regex Scanner | Pattern matching | Known injection patterns, role overrides |
| Prefix Scanner | Known-prefix detection | Common injection prefixes and escape sequences |
| Heuristic Scanner | Behavioral analysis | Unusual request patterns, multi-step attacks |
| Entropy Analyzer | Randomness detection | Encoded payloads, obfuscated data exfiltration |
| Unicode Normalizer | Canonicalization | Unicode escape attacks, homoglyph substitution |
| OpenAI Moderation API | ML content classifier (external) | Violence, hate, sexual, self-harm, illicit content. Result cached at before_agent_start, severity-tiered enforcement at before_message_write (default: rewrite on HIGH + CRITICAL; MEDIUM is audit-only) |
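
Two of these layers are easy to illustrate in isolation. The sketch below canonicalizes text with Unicode NFKC and strips invisible characters, then flags long tokens with high Shannon entropy. It is a simplified stand-in rather than the middleware's scanner: the function names are hypothetical, and the 4.0 threshold and 20-character minimum are borrowed from the figures quoted elsewhere in this section. Note that NFKC folds compatibility forms such as fullwidth letters, but it does not map cross-script look-alikes (for example Cyrillic letters) to Latin, so real homoglyph handling needs more than this.

```typescript
// Zero-width space/joiners, word joiner, BOM, and soft hyphen.
const INVISIBLE_CHARS = /[\u200B-\u200D\u2060\uFEFF\u00AD]/g;

// NFKC folds compatibility characters; invisible characters are stripped separately.
function canonicalize(text: string): string {
  return text.normalize("NFKC").replace(INVISIBLE_CHARS, "");
}

// Shannon entropy in bits per character.
function shannonEntropy(token: string): number {
  if (token.length === 0) return 0;
  const counts: Record<string, number> = {};
  for (let i = 0; i < token.length; i++) {
    const ch = token.charAt(i);
    counts[ch] = (counts[ch] ?? 0) + 1;
  }
  let entropy = 0;
  for (const ch of Object.keys(counts)) {
    const p = counts[ch] / token.length;
    entropy -= p * Math.log2(p);
  }
  return entropy;
}

// Flags whitespace-delimited tokens that are both long and high-entropy.
function findHighEntropyTokens(text: string, threshold = 4.0, minLength = 20): string[] {
  return canonicalize(text)
    .split(/\s+/)
    .filter((token) => token.length >= minLength && shannonEntropy(token) >= threshold);
}
```

Canonicalizing first matters: a payload split by zero-width characters or written in fullwidth letters would otherwise slip past pattern and entropy checks that expect plain ASCII.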

Guard Modules

| Guard | What It Protects |
| --- | --- |
| Sensitive Paths | Blocks access to ~/.ssh, ~/.aws, /etc/passwd, .env files |
| Egress Control | Prevents unauthorized data transmission to external endpoints |
| Destructive Commands | Catches rm -rf, DROP TABLE, kill -9, and similar patterns |
| Content Moderation | OpenAI Moderation API check on incoming prompts. Flags violence, hate, sexual, self-harm, illicit content (async → sync cache bridge; severity-tiered via moderation.rewriteThreshold, default HIGH) |
| Role Impersonation | Detects attempts to masquerade as system/admin roles |
| Canary Tracker | Honeypot tokens that trigger alerts if exposed in output |
| Output Scrubber | Removes middleware metadata from agent responses |
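
For a feel of what an egress guard has to decide, here is a minimal sketch of a default-deny domain allowlist combined with a private-address block. The function names and the allowlist entries are illustrative assumptions, and only IPv4 literals are handled; IPv6 ranges and DNS-based bypasses are out of scope for this example.

```typescript
// True for loopback, RFC 1918 private ranges, and the link-local range
// that contains the cloud metadata endpoint 169.254.169.254.
function isPrivateIPv4(host: string): boolean {
  const match = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(host);
  if (!match) return false;
  const a = Number(match[1]);
  const b = Number(match[2]);
  return (
    a === 10 ||
    a === 127 ||
    (a === 172 && b >= 16 && b <= 31) ||
    (a === 192 && b === 168) ||
    (a === 169 && b === 254)
  );
}

// Default-deny: a host passes only if it is not a private address and
// equals an allowlisted domain or is a subdomain of one.
function isEgressAllowed(host: string, allowlist: string[]): boolean {
  const normalized = host.toLowerCase();
  if (isPrivateIPv4(normalized)) return false;
  return allowlist.some(
    (domain) => normalized === domain || normalized.endsWith("." + domain),
  );
}
```

The private-address check runs before the allowlist on purpose, so a misconfigured allowlist entry cannot re-open access to a metadata endpoint.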

Key Features

Input + output scanning — Covers both prompt injection and data leakage
Configurable actions — BLOCK, WARN, REDACT per rule
Confidence filtering — Adjustable sensitivity to reduce false positives
Dry-run mode — Log detections without blocking (for tuning)
Custom patterns — Add your own regex rules for domain-specific threats

Features — vs. OpenClaw Shield & OpenGuardrails

The two closest comparables are OpenClaw Shield (Knostic) and OpenGuardrails / MoltGuard. OpenClaw Shield is a lightweight OpenClaw plugin with five independently toggleable policy layers, including an advisory "security gate" tool that relies on the agent obeying injected instructions. OpenGuardrails / MoltGuard is a full-stack platform whose agent-side plugin talks to a hosted Core service that runs a 10-scanner content model and a behavioral rule engine over tool-call sequences. Sapience Guardrail takes a different stance from both: everything runs in-process on OpenClaw's native hooks, and the before_message_write hook rewrites the persisted transcript so the LLM never sees the pre-redacted content, on the current turn as well as on later ones. The rows below focus on the capabilities that actually differ; shared basics (regex scanning, OpenClaw plugin integration, PII redaction) are omitted.

Legend: ✅ supported · ⚠️ partial · ❌ not present

| Feature | OpenClaw Shield | OpenGuardrails | Sapience Guardrail |
| --- | :-: | :-: | :-: |
| Prompt injection regex / pattern scanner | ✅ | ✅ (hosted Core S01) | ✅ (21 patterns) |
| Heuristic / Shannon-entropy detector for obfuscated payloads | ❌ | ❌ | ✅ (≥ 4.0 on 20+ char tokens) |
| Unicode NFKC normalization before scan: homoglyph + zero-width + soft hyphen | ❌ | ❌ | ✅ |
| External moderation API integration (OpenAI Moderation, Perspective, …) | ❌ | ⚠️ gateway sanitizer only | ✅ OpenAI Moderation, async → sync cache bridge |
| Sensitive file path blocking (.ssh, .env, .aws/credentials, …) | ✅ 18 patterns | ❌ | ✅ 52 patterns + symlink resolution |
| Outbound domain allowlist (default-deny) | ❌ | ❌ | ✅ 25 allowed (npm, PyPI, GitHub, AWS, Cloudflare…) |
| Private-IP / metadata-endpoint SSRF block (169.254.169.254, RFC 1918, IPv6 ULA, mapped ::ffff:) | ❌ | ❌ | ✅ |
| Destructive shell command blocking (rm -rf, DROP TABLE, git push --force main, fork bombs, chmod 777…) | ✅ 6 patterns | ❌ (behavioral only) | ✅ 22 built-in + custom regex |
| Role-impersonation / ChatML / fake [SYSTEM] neutralization | ❌ | ❌ | ✅ 16 patterns incl. Llama markers & tool-output tags |
| Canary / leakback re-redaction (re-detect previously-redacted content) | ❌ | ❌ | ✅ SHA-256 ring buffer, whitespace-normalized |
| Actual message rewrite that persists to transcript (vs log-only or post-persist redaction) | ❌ | ❌ | ✅ before_message_write returns { message } per OpenClaw 2026.4.x contract |
| Async → sync cache bridge: external API check in before_agent_start, severity-tiered rewrite in sync before_message_write (configurable threshold) | ❌ | ❌ | ✅ |
| Behavioral rule engine over tool-call sequences (e.g. file read → external write) | ❌ | ✅ hosted Core | ❌ |
| Advisory "security gate" LLM-policy tool the agent is prompted to call | ✅ L5 | ❌ | ❌ |
| Dry-run / shadow mode (log without blocking) | ✅ | ⚠️ unclear | ✅ |
| Per-decision JSONL audit log: timestamp, module, severity, sessionKey, agentId | ❌ | ✅ (hosted DB) | ✅ local append-only |
| Full CLI surface for runtime config | ❌ JSON edit only | ✅ /og_* slash commands | ✅ sai guardrail … |
| Web dashboard | ❌ | ✅ localhost:53668 | ❌ (planned) |
| Runs fully in-process (no external service dependency) | ✅ | ❌ requires Core API | ✅ |
| Multi-tenant managed service with billing / quotas | ❌ | ✅ | ❌ |

What's genuinely unique to each

OpenClaw Shield — per-layer toggles (L1–L5) so you can ship a degraded config when a host lacks a given hook; advisory L5 gate relies on the agent obeying injected policy rather than host enforcement.
OpenGuardrails / MoltGuard — hosted behavioral rule engine that catches multi-turn attack patterns (credential read → network write), a 10-scanner content model spanning NSFW / MCP poisoning / off-topic drift, and a managed dashboard with a quota system.
Sapience Guardrail — synchronous transcript rewrite so the LLM never sees pre-redacted content; Unicode NFKC + homoglyph + zero-width normalization pre-scan; entropy-based obfuscation detection; full-depth L2 stack (sensitive-paths + egress allowlist + private-IP SSRF + destructive commands) firing before any tool executes; confidence-filtered matching to suppress rephrasing false positives.

What We Adopted From Each

Sapience Guardrail did not emerge in a vacuum. Both comparables contributed foundational ideas we built on — we name them explicitly below.

| Capability | OpenGuardrails | OpenClaw Shield | Sapience Guardrail |
|---|:-:|:-:|:-:|
| Regex rule engine with category taxonomy (injection / PII / suspicious) | Adopted from | - | Extended: 53 rules across 3 engines (regex + prefix + heuristic) |
| Confidence tiers (HIGH / MEDIUM / LOW) | Adopted from | - | Extended: cross-category and same-category multi-match required for MEDIUM |
| L2 tool-call interception via before_tool_call hook | - | Adopted from | Extended: 6 guards (sensitive-paths, egress, destructive, shell-indirection, pre-read, param scan) |
| File path blocklist concept | - | Adopted from | Extended: 52 patterns + allowlist overrides + symlink resolution |
| L3 transcript scanning via before_message_write | - | - | Original: closes the gap Shield left open (tool results, file content entering after execution) |
| L1 prompt-guard policy injection into system prompt | - | - | Original: agent learns what is protected, never how |
| Egress / SSRF prevention (domain allowlist, IPv4+IPv6 private ranges, 169.254.169.254) | - | - | Original |
| Canary tracking / leakback re-redaction (SHA-256 ring buffer) | - | - | Original |
| Role impersonation (ChatML, Llama, fake [SYSTEM], tool-output tag injection; 16 patterns) | - | - | Original |
| Agent interrogation defense (defense-enumeration detection) | - | - | Original |
| OpenAI Moderation API integration with async → sync cache bridge | - | - | Original |
| CLI management surface (sai guardrail …) | - | - | Original |

The one thing neither comparable does: combine pre-execution interception with post-execution transcript scanning. OpenGuardrails scans text but can't block tools. OpenClaw Shield blocks tools but can't scan transcripts. Sapience does both in the same pipeline.

Configuration

{
  "dryRunMode": false,
  "entropyThreshold": 4.0,
  "sensitivePaths": { "enabled": true, "action": "BLOCK" },
  "egressControl": { "enabled": true, "blockDataSending": true },
  "destructiveCommands": { "enabled": true, "action": "BLOCK" },
  "moderation": { "rewriteThreshold": "HIGH" },
  "outputScrubber": { "enabled": false, "dryRunMode": false }
}

The middleware's on/off switch is the plugin-level flag (plugin_config.middlewares.guardrail in sapience-ai-suite.json, managed via the dashboard or sai init), not a field inside this config. The sub-feature enabled flags above (sensitivePaths, egressControl, destructiveCommands, outputScrubber) toggle individual guards within the guardrail middleware when it's already running. Output scrubber is double-gated: the master Guardrail toggle must be on AND outputScrubber.enabled must be true. Both default to false so a fresh install ships with neither active — opt in via the dashboard's Guardrail → Output tab or sai guardrail output toggle enable.

moderation.rewriteThreshold controls the severity bar for the async → sync cache bridge. Accepts MEDIUM, HIGH, or CRITICAL — default is HIGH. Flags at or above the threshold trigger a transcript rewrite in before_message_write (hard block); flags below are logged audit-only and pass through so the LLM's own safety layer can handle the gray zone without a synthetic [GUARDRAIL] marker replacing the user's prompt. Set to MEDIUM for maximum strictness, or CRITICAL to only hard-block the most severe categories.
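
The at-or-above rule can be pictured as a comparison over an ordered severity list. This is a minimal sketch, not the suite's implementation: the `SEVERITY_ORDER` array, the `decideModerationOutcome` function, and the outcome labels are hypothetical names chosen for illustration.

```typescript
// Severity values accepted by moderation.rewriteThreshold, lowest to highest.
// The array and function below are illustrative assumptions, not suite internals.
type Severity = 'MEDIUM' | 'HIGH' | 'CRITICAL';
type ModerationOutcome = 'REWRITE' | 'AUDIT_ONLY';

const SEVERITY_ORDER: Severity[] = ['MEDIUM', 'HIGH', 'CRITICAL'];

function decideModerationOutcome(
  flag: Severity,
  threshold: Severity = 'HIGH', // documented default
): ModerationOutcome {
  // At or above the threshold: hard block via transcript rewrite.
  // Below the threshold: log for audit and let the message pass through.
  return SEVERITY_ORDER.indexOf(flag) >= SEVERITY_ORDER.indexOf(threshold)
    ? 'REWRITE'
    : 'AUDIT_ONLY';
}
```

Under this reading, a MEDIUM flag passes through with the default HIGH threshold, while lowering the threshold to MEDIUM turns the same flag into a rewrite.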

CLI

sai guardrail status                      # Show guardrail state
sai guardrail toggle dry-run              # Toggle dry-run mode on/off
sai guardrail list [category]             # List rules (optionally filtered)
sai guardrail rule-add <name> <category>  # Add a custom regex rule
sai guardrail rule-toggle <name>          # Enable/disable a rule
sai guardrail paths block <pattern>       # Block a sensitive path
sai guardrail egress allow <domain>       # Whitelist an egress domain
sai guardrail egress data-sending <on|off> # Toggle outbound body blocking
sai guardrail destructive list            # List blocked command patterns
sai guardrail destructive add <pattern>   # Add a custom destructive pattern
sai guardrail config                      # Print resolved config
sai guardrail reset                       # Reset to defaults

Note: moderation.rewriteThreshold is currently file-level only — edit the guardrail key in sapience-ai-suite.json. A dedicated CLI subcommand is not yet wired up.

:detective: PII Sanitizer

Detect and redact personally identifiable information before it leaves your system. Field-level DLP policies with recursive deep scanning across all tool call arguments.

Why It Exists

AI agents process everything in their context — including sensitive data users paste into conversations. Without a PII layer, an agent can inadvertently pass SSNs, API keys, or email addresses to external APIs, log them to files, or include them in shell commands. The PII Sanitizer intercepts tool calls and applies data loss prevention policies before execution.

Detection Patterns

| Category | Examples | Default Severity |
| --- | --- | --- |
| Email addresses | user@example.com | LOW |
| Phone numbers | +1-555-0123 | MEDIUM |
| Social Security Numbers | 123-45-6789 | CRITICAL |
| Credit card numbers | 4111-1111-1111-1111 | CRITICAL |
| API keys & tokens | sk-proj-..., ghp_... | CRITICAL |
| IP addresses | 192.168.1.1 | LOW |

DLP Actions

| Action | Behavior |
| --- | --- |
| ALLOW | Pass through (validation only) |
| REDACT | Replace PII with [REDACTED_<TYPE>] placeholder |
| ESCALATE | Force HITL approval before proceeding |
| BLOCK | Block the tool call entirely |

Key Features

Recursive deep scanning — Traverses nested objects, arrays, and stringified JSON
Shell argument parsing — Extracts and scans literals from shell commands
Field-level policies — Different actions per PII type and severity
Severity classification — LOW, MEDIUM, HIGH, CRITICAL
Integrates with HITL — ESCALATE action routes to human approval
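
To make recursive deep scanning and the REDACT action concrete, here is a minimal sketch that walks nested objects, arrays, and stringified JSON and substitutes `[REDACTED_<TYPE>]` placeholders. The two regexes and the `redactDeep` / `redactString` helpers are simplified, hypothetical stand-ins; the suite's actual rule set and traversal are not shown in this README.

```typescript
// Hypothetical detectors: two simplified patterns, not the suite's rules.
const DETECTORS: { type: string; pattern: RegExp }[] = [
  { type: 'SSN', pattern: /\b\d{3}-\d{2}-\d{4}\b/g },
  { type: 'EMAIL', pattern: /\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b/g },
];

function redactString(text: string): string {
  // Stringified JSON: parse it, scan the parsed value, then re-serialize.
  const trimmed = text.trim();
  if (trimmed.startsWith('{') || trimmed.startsWith('[')) {
    try {
      return JSON.stringify(redactDeep(JSON.parse(trimmed)));
    } catch {
      // Not valid JSON after all; fall through and scan as plain text.
    }
  }
  return DETECTORS.reduce(
    (acc, d) => acc.replace(d.pattern, `[REDACTED_${d.type}]`),
    text,
  );
}

function redactDeep(value: unknown): unknown {
  if (typeof value === 'string') return redactString(value);
  if (Array.isArray(value)) return value.map(redactDeep);
  if (value !== null && typeof value === 'object') {
    const out: Record<string, unknown> = {};
    for (const [key, inner] of Object.entries(value as Record<string, unknown>)) {
      out[key] = redactDeep(inner);
    }
    return out;
  }
  return value; // numbers, booleans, null, and undefined pass through unchanged
}
```

Calling `redactDeep` on a tool call's argument object returns a copy with matches replaced, which is roughly what a REDACT policy would hand on to the tool in place of the original arguments.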

CLI

sai dlp info                                    # Show DLP status, toggles, rules, tool mappings
sai dlp toggle <enable|disable|dry-run>         # Toggle DLP settings
sai dlp rule-add <name> [options]               # Add or update a PII scanning rule
sai dlp rule-rm <name>                          # Remove a PII scanning rule by name
sai dlp policy-set <tool> <field> <action>      # Set scanning policy for a tool field

:bar_chart: Tool Call Limit

Budget enforcement for AI agent execution. Prevents runaway loops and resource exhaustion with per-session and per-request call limits.

Why It Exists

An agent stuck in a loop can call the same tool hundreds of times in a single session — burning through API quotas, racking up costs, and producing garbage output. Tool Call Limits enforce hard boundaries on how many times each tool can be called, at both the session and request level.

Enforcement Model

Tool Call Arrives
  │
  ├─ Check session counter ─── Under limit ──→ PASS
  │                        └── Soft limit ───→ WARN + PASS
  │                        └── Hard limit ───→ BLOCK
  │
  └─ Check request counter ─── Under limit ──→ PASS
                           └── Soft limit ───→ WARN + PASS
                           └── Hard limit ───→ BLOCK
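
The flow above can be sketched as two counters checked on every call, with the stricter verdict winning. This is an illustration under stated assumptions: the `CallBudgetTracker` class and its method names are hypothetical, the soft limits (80 and 15) are invented, and treating the documented global defaults (100 per session, 20 per request) as hard limits that block once exceeded is an assumption as well.

```typescript
type Verdict = 'PASS' | 'WARN' | 'BLOCK';

interface Budget {
  soft: number; // above this: warn but allow
  hard: number; // above this: block
}

// Assumed budgets: hard limits mirror the global defaults, soft limits are made up.
const SESSION_BUDGET: Budget = { soft: 80, hard: 100 };
const REQUEST_BUDGET: Budget = { soft: 15, hard: 20 };

function checkCounter(count: number, budget: Budget): Verdict {
  if (count > budget.hard) return 'BLOCK';
  if (count > budget.soft) return 'WARN';
  return 'PASS';
}

// Rank verdicts so the stricter of the two scopes decides the outcome.
const STRICTNESS: Record<Verdict, number> = { PASS: 0, WARN: 1, BLOCK: 2 };

class CallBudgetTracker {
  private sessionCalls = 0;
  private requestCalls = 0;

  // Call at the start of each request (turn); the session counter keeps running.
  startRequest(): void {
    this.requestCalls = 0;
  }

  recordCall(): Verdict {
    this.sessionCalls += 1;
    this.requestCalls += 1;
    const session = checkCounter(this.sessionCalls, SESSION_BUDGET);
    const request = checkCounter(this.requestCalls, REQUEST_BUDGET);
    return STRICTNESS[session] >= STRICTNESS[request] ? session : request;
  }
}
```

Evaluating both scopes on every call means a short burst inside one request can be stopped long before the session budget is close to exhausted.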

Default Budgets

| Scope | Global | Gmail/Drive Ops |
| --- | --- | --- |
| Session limit | 100 calls | 50 calls |
| Request limit | 20 calls | 10-20 calls |

Key Features

Two enforcement scopes — Session-level and request-level budgets
Soft + hard limits — Warn before blocking
Per-method granularity — Different limits for FileSystem.read vs Gmail.send
Rolling windows — 24-hour configurable window for counter resets
Session tracking — Maps virtual session IDs to real session keys

vs. OpenClaw `tools.loopDetection`

OpenClaw core ships a built-in tools.loopDetection guard that detects degenerate call patterns — same tool + same params repeated, known polling with no state change, ping-pong alternation — over a sliding window. It is pattern-based and disabled by default. Sapience Tool Call Limits is budget-based: it counts cumulative calls against a numeric quota. The two solve different failure modes and are designed to run together.

| Failure mode | OpenClaw loop detector | Sapience Limits |
| --- | :---: | :---: |
| Gmail.read polled 50× with identical params | ✅ | ✅ |
| Gmail.send to 50 different recipients in one session (spam/exfil) | ❌ params differ | ✅ |
| Agent paginates legitimately 100× with varying cursors | ⚠️ may false-positive | ✅ bounded |
| Ping-pong Read → Write → Read → Write | ✅ | ❌ |
| Cost blowup: 1000 cheap-looking calls, none repeated | ❌ no pattern | ✅ |
| Request-level runaway (20+ calls in a single turn, all different) | ❌ | ✅ |
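
The first two rows of the table can be reproduced with a toy comparison. Both functions below are hypothetical simplifications written for this example: `hasIdenticalRepeat` is not OpenClaw's detector and `exceedsBudget` is not the suite's limiter. They only show why a check keyed on identical parameters misses calls whose parameters vary, while a cumulative count does not.

```typescript
interface ToolCall {
  tool: string;
  params: Record<string, unknown>;
}

// Pattern-style check: fires when the same tool + params repeats
// `threshold` times in a row (threshold should be 2 or more).
function hasIdenticalRepeat(calls: ToolCall[], threshold: number): boolean {
  let run = 1;
  for (let i = 1; i < calls.length; i++) {
    const same =
      calls[i].tool === calls[i - 1].tool &&
      JSON.stringify(calls[i].params) === JSON.stringify(calls[i - 1].params);
    run = same ? run + 1 : 1;
    if (run >= threshold) return true;
  }
  return false;
}

// Budget-style check: fires when the cumulative count for a tool exceeds its quota.
function exceedsBudget(calls: ToolCall[], tool: string, limit: number): boolean {
  return calls.filter((c) => c.tool === tool).length > limit;
}

// 50 sends to 50 different recipients: no repeated params, but well over a quota of 20.
const spam: ToolCall[] = Array.from({ length: 50 }, (_, i) => ({
  tool: 'Gmail.send',
  params: { to: `user${i}@example.com` },
}));

// 50 identical polls: caught by both checks.
const poll: ToolCall[] = Array.from({ length: 50 }, () => ({
  tool: 'Gmail.read',
  params: { label: 'inbox' },
}));
```

Here `hasIdenticalRepeat(spam, 3)` stays false while `exceedsBudget(spam, 'Gmail.send', 20)` is true; for `poll`, both checks fire.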

Differentiators beyond pattern-vs-budget:

Dual enforcement scopes — session (lifetime) and request (single turn) budgets evaluated on every call
Per-module × per-method granularity — Gmail.send has a different budget than FileSystem.read
Soft + hard tiers — warn before blocking, so operators see approach-to-limit
Rolling 24h window — counters reset automatically rather than requiring manual intervention
Built-in observability — sai limits stats and the dashboard page expose live counters; OpenClaw's detector only logs when it fires

Recommended setup: enable both. OpenClaw's detector catches degenerate shapes cheaply; Sapience Limits catches budget overruns the pattern detector can't see (distributed loops, cost blowups, request-level runaway).

CLI

sai limits stats     # View current usage
sai limits show      # View limit policies
sai limits reset     # Reset counters

Real-time configuration and monitoring UI for all six middlewares.

The dashboard is a Preact application served by the OpenClaw gateway. It provides live configuration, status monitoring, and log streaming for every middleware in the suite.

Pages:

| Page | What You Can Do |
| --- | --- |
| Overview | Toggle middlewares on/off, view system-wide health stats |
| HITL | View pending approvals, decision audit trail, policy visualization |
| Context Editing | Session history, compaction statistics, entity locks |
| Model Routing | Route metrics, cost trends, tier configuration |
| Guardrail | Threat detection log, rule configuration, egress controls |
| PII Sanitizer | Detection patterns, DLP policy editor |
| Tool Call Limits | Usage tracking, limit configuration |

Tech stack: Preact + preact-router, @preact/signals for state, uPlot for charts, SSE for real-time streaming.

Update

Pull a newer published version of the plugin:

# 1. Recommended: disable every middleware first via the dashboard or CLI so
#    no in-flight requests are mid-pipeline when the plugin reloads. Toggling
#    them back on after the gateway restart re-runs the persist-defaults
#    handlers cleanly.
sai disable

# 2. Pull the latest published version. `openclaw plugins update` upgrades
#    the tracked plugin to the newest version from its registered source.
openclaw plugins update sapience-ai-suite

# 3. Restart the gateway so the new plugin code is loaded in-process.
openclaw gateway restart

After the restart, re-enable the middlewares you turned off in step 1 (dashboard toggles, or sai enable).

For local dev installs (openclaw plugins install --link .), pull the upstream changes you want, run npm run build, then restart: openclaw gateway restart. No re-link needed because --link follows the working tree.

Uninstall

Reverse the install in three steps. The order matters: drop the CLI shim first so nothing on $PATH can dangle, then unregister the plugin from the gateway, then restart so the runtime stops loading it.

# 1. Remove the `sai` CLI symlink (skip this line if you used `npm install -g` instead)
rm ~/.local/bin/sai

# 2. Unregister the plugin and remove its runtime files from ~/.openclaw/extensions/
openclaw plugins uninstall sapience-ai-suite

# 3. Reload the gateway so the plugin is no longer loaded in-process
openclaw gateway restart

Config and audit data are kept by default. The on-disk config store at ~/.openclaw/sapience-ai-suite/ (HITL policies, model routes, guardrail rules, audit trail JSONL, MFA secrets) survives the uninstall — reinstalling later picks it back up automatically. To wipe it:

# Only continue after user confirmation — this permanently removes audit logs, MFA secrets, and policies.
rm -rf ~/.openclaw/sapience-ai-suite

The suite is published as a single npm package with one subpath per middleware. Importing HITL doesn't pull in router or guardrail code, and each subpath is tree-shakeable on its own. You construct middlewares directly, optionally hand them to a MiddlewareRegistry, and call lifecycle methods yourself.

Quick Start

npm install @sapience-ai-corporation/openclaw-middleware-suite

import { MiddlewareRegistry } from '@sapience-ai-corporation/openclaw-middleware-suite';
import { HitlMiddleware } from '@sapience-ai-corporation/openclaw-middleware-suite/hitl';
import { GuardrailMiddleware } from '@sapience-ai-corporation/openclaw-middleware-suite/guardrail';

const hitl = new HitlMiddleware();
const guard = new GuardrailMiddleware();

await hitl.initialize({ defaultAction: 'ASK' });
await guard.initialize({ dryRunMode: false });

const registry = new MiddlewareRegistry();
registry.register(hitl);
registry.register(guard);

// In your tool-call dispatcher:
const result = await registry.runBeforeToolCall(ctx);
if (result.block) throw new Error(result.reason);

That's the full embedding shape: import → construct → initialize() → register → call lifecycle methods (runBeforeToolCall, runBeforeAgentStart, etc.). Everything else below is detail on the contract, the configuration paths, and the per-middleware programmatic surface.

Root Surface

The root package only exposes cross-cutting framework concerns — pipeline runner, base contract, plugin lifecycle. Middleware classes live under their own subpaths.

import {
  // Plugin lifecycle
  registerPlugin,
  unregisterPlugin,
  isPluginRegistered,

  // Pipeline runner
  MiddlewareRegistry,

  // Base pipeline contract
  Middleware,
  MiddlewareContext,
  MiddlewareResult,

  // Plugin entry (what OpenClaw's loader imports)
  SapienceMiddlewarePlugin,
  SapienceMiddlewareManifest,
} from '@sapience-ai-corporation/openclaw-middleware-suite';

Base pipeline contract:

interface Middleware {
  readonly name: string;
  readonly version: string;
  initialize(config: Record<string, unknown>): Promise<void>;

  // Tool-call pipeline
  beforeToolCall?(context: MiddlewareContext): Promise<MiddlewareResult>;
  afterToolCall?(context: MiddlewareContext, result: unknown): Promise<void>;

  // OpenClaw lifecycle events (implement only the surfaces you need)
  beforeAgentStart?(context: AgentStartContext): Promise<AgentStartResult | void>;
  beforeModelResolve?(context: ModelResolveContext): Promise<ModelResolveResult | void>;
  beforeMessageWrite?(
    context: MessageWriteContext
  ): MessageWriteResult | undefined | Promise<MessageWriteResult | undefined>;

  // Lifecycle / reporting
  getStatus(): { enabled: boolean; stats?: Record<string, unknown> };
  shutdown?(): Promise<void>;
}

interface MiddlewareResult {
  block: boolean;
  reason?: string;
  modifiedParams?: Record<string, unknown>;
  /** First-class "force human approval" signal — guardrail WARN and PII
   *  ESCALATE both surface here so orchestrators read one consistent field. */
  escalate?: boolean;
  escalateReason?: string;
  metadata?: Record<string, unknown>;
}

Every lifecycle context (MiddlewareContext, AgentStartContext, ModelResolveContext, MessageWriteContext) extends a shared LifecycleContext base, so session-scoped fields (sessionKey, agentId, runId, metadata) live in the same place regardless of which event fired.

MiddlewareResult.reason (pipeline-level) is distinct from BeforeToolCallResult.blockReason (the OpenClaw before_tool_call hook return contract). See the note under HITL → Programmatic API for why the two coexist.
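
For readers writing their own middleware against this contract, the sketch below shows the general shape of a class that satisfies it. It is self-contained on purpose: `SketchContext` and `SketchResult` are local stand-ins rather than the package's types, the `toolName` and `params` context fields are assumptions (this README does not list the fields of MiddlewareContext), and `DenyListMiddleware` with its `deniedTools` option is a hypothetical example, not part of the suite.

```typescript
// Local stand-ins so the sketch runs on its own. In a real project the
// contract types would come from the package root instead.
interface SketchContext {
  toolName: string; // assumed field name
  params: Record<string, unknown>; // assumed field name
}

interface SketchResult {
  block: boolean;
  reason?: string;
  escalate?: boolean;
  escalateReason?: string;
}

// Hypothetical middleware: blocks any tool named in `deniedTools`.
class DenyListMiddleware {
  readonly name = 'deny-list';
  readonly version = '0.0.1';
  private denied: string[] = [];
  private blockedCount = 0;

  async initialize(config: Record<string, unknown>): Promise<void> {
    const denied = config['deniedTools'];
    this.denied = Array.isArray(denied)
      ? denied.filter((t): t is string => typeof t === 'string')
      : [];
  }

  async beforeToolCall(context: SketchContext): Promise<SketchResult> {
    if (this.denied.indexOf(context.toolName) !== -1) {
      this.blockedCount += 1;
      return { block: true, reason: `Tool ${context.toolName} is on the deny list` };
    }
    return { block: false };
  }

  getStatus(): { enabled: boolean; stats?: Record<string, unknown> } {
    return { enabled: true, stats: { blocked: this.blockedCount } };
  }
}
```

Only `initialize` and `getStatus` are required by the contract shown above; the sketch adds `beforeToolCall` because that is the hook where a block decision is returned.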

Middleware Subpaths

| Middleware | Subpath | Primary class |
|---|---|---|
| HITL | @sapience-ai-corporation/openclaw-middleware-suite/hitl | HitlMiddleware |
| Context Editing | @sapience-ai-corporation/openclaw-middleware-suite/context-editing | ContextEditingMiddleware |
| Model Routing | @sapience-ai-corporation/openclaw-middleware-suite/model-routing | ModelRoutingMiddleware |
| Guardrail | @sapience-ai-corporation/openclaw-middleware-suite/guardrail | GuardrailMiddleware |
| PII Sanitizer | @sapience-ai-corporation/openclaw-middleware-suite/pii-sanitizer | PiiSanitizerMiddleware |
| Tool Call Limit | @sapience-ai-corporation/openclaw-middleware-suite/tool-call-limit | ToolCallLimitMiddleware |

Subpath exports require Node ≥ 18 and TypeScript ≥ 4.7 with moduleResolution: "node16" | "nodenext" | "bundler". A typesVersions fallback is provided in package.json for consumers on legacy moduleResolution: "node".

Precedence at `initialize(config)`

Every middleware in the suite resolves config the same way:

DEFAULTS  <  inline config  <  sapience-ai-suite.json disk overlay

Three things to know about how this plays out for embedded consumers:

The disk overlay is optional but always wins when present. If ~/.openclaw/sapience-ai-suite/sapience-ai-suite.json exists (because the user ran sai init, or the dashboard wrote to it, or another instance saved earlier), its values shadow whatever you pass inline. Apps that want fully reproducible config should ensure the file isn't on disk (see Hermetic embedding below).
The plugin runtime always passes {} (or { pluginApi } for Context Editing). That's why plugin behavior is "defaults + disk, nothing else" — there's no inline layer to compete with the dashboard's writes.
Three configuration paths. Each middleware supports the same three ways to set its config:
- Inline at initialize(config) — pass a partial of the middleware's config type. Applied on top of defaults, below the disk overlay.
- In-process updateConfig(partial) — patches the running instance in memory; no disk I/O, no cross-process visibility. Best for embedded apps that don't want to touch sapience-ai-suite.json.
- Disk-backed Store.update() / Store.save() + reload — persists to sapience-ai-suite.json, survives restarts, propagates to other plugin instances watching the same file.
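
The precedence order and the in-memory patch path can be modeled with plain object spreads. This is a sketch of the layering only: `resolveConfig`, `applyUpdate`, and the two-field `SketchConfig` type are hypothetical, and the shallow merge is an assumption, since this README does not say whether nested keys are merged deeply.

```typescript
interface SketchConfig {
  dryRunMode: boolean;
  entropyThreshold: number;
}

const DEFAULTS: SketchConfig = { dryRunMode: false, entropyThreshold: 4.0 };

// DEFAULTS < inline config < disk overlay: later spreads win.
function resolveConfig(
  inline: Partial<SketchConfig>,
  diskOverlay?: Partial<SketchConfig>,
): SketchConfig {
  return { ...DEFAULTS, ...inline, ...(diskOverlay ?? {}) };
}

// Models an in-process patch: applied to the already-resolved config,
// so it wins over whatever the disk overlay contributed.
function applyUpdate(resolved: SketchConfig, patch: Partial<SketchConfig>): SketchConfig {
  return { ...resolved, ...patch };
}

// Inline asks for dry-run, but an overlay on disk says otherwise: disk wins.
const shadowed = resolveConfig({ dryRunMode: true }, { dryRunMode: false });

// Patching after resolution restores the embedder's choice in memory.
const patched = applyUpdate(shadowed, { dryRunMode: true });
```

In this model `shadowed.dryRunMode` is false and `patched.dryRunMode` is true, which is why an in-memory patch after initialization is the safer route when the state of the deployment host's disk is unknown.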

Best Practices

The five callouts below apply to all six middlewares; they're the things that bite embedded consumers most often. The per-middleware sections after this only call out the bits that deviate from these defaults.

Hermetic Embedding

The disk overlay is what lets the dashboard and sai CLI manage the plugin transparently — but for an embedded app that ships its own config, it's a footgun: a sapience-ai-suite.json left on the deployment host will silently override your inline initialize() values. Two ways out:

Option 1 — No file on disk. If sapience-ai-suite.json doesn't exist on the deployment machine, there's no overlay → inline wins fully:

await mw.initialize({ dryRunMode: true });   // applies; nothing shadowing it

Option 2 — updateConfig after init. Patches the resolved config in memory, bypassing the disk overlay regardless of whether the file exists:

await mw.initialize({});                     // pulls disk if any
mw.updateConfig({ dryRunMode: true });       // overrides disk in memory

Pick option 2 if you don't know whether a sapience-ai-suite.json exists on the deployment machine; it works in both cases. The Model Routing (MR) proxy port and a few other "constructed-at-start" fields are special-cased; see the per-middleware notes.

`updateConfig` for In-Process Patches

mw.updateConfig(partial) is the primary in-process knob. It patches this.config directly, no I/O, no file-watcher event. Use it whenev
