openclaw-agentic-security
v1.5.1
Security gateway for AI agent API calls with interceptor hooks and runtime policy validation
@openclaw/agentic-security
Security isolation layer for LLM-powered applications. Guards against secret leakage, prompt injection, data exfiltration, and unauthorized tool execution.
Quick Start
import Anthropic from "@anthropic-ai/sdk";
import { wrapAnthropic } from "@openclaw/agentic-security";
const client = wrapAnthropic(new Anthropic());
// Secrets are redacted before reaching the LLM — use client normally
OpenAI equivalent:
import OpenAI from "openai";
import { wrapOpenAI } from "@openclaw/agentic-security";
const client = wrapOpenAI(new OpenAI());
Installation
npm install @openclaw/agentic-security
# Peer deps (only what you use):
npm install @anthropic-ai/sdk # for Anthropic
npm install openai # for OpenAI
Why Use This?
The Threat Model
LLM-powered applications face security risks that traditional web security tools do not address.
Secrets leaking into LLM prompts. API keys, database credentials, tokens, and PII flow into prompt content through user inputs, tool outputs, and retrieved documents. Once in the prompt, secrets travel to the LLM provider's infrastructure and may appear in model responses. @openclaw/agentic-security scans every request with entropy-based detection and regex patterns before the SDK sends it. Secrets are redacted or the request is rejected — configurable per policy.
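To make the combination of regex and entropy detection concrete, here is a minimal, self-contained sketch. It is not the library's implementation: the pattern list, the 4.0-bit threshold, the `[REDACTED]` placeholder, and the function names `shannonEntropy` and `redactSecrets` are all assumptions chosen for illustration.

```typescript
// Illustrative sketch only; names, patterns, and thresholds are assumptions,
// not the library's real detection rules.

// Shannon entropy in bits per character: random-looking tokens score high.
function shannonEntropy(text: string): number {
  if (text.length === 0) return 0;
  const counts: Record<string, number> = {};
  for (const ch of text.split("")) {
    counts[ch] = (counts[ch] || 0) + 1;
  }
  let entropy = 0;
  for (const ch of Object.keys(counts)) {
    const p = (counts[ch] || 0) / text.length;
    entropy -= p * (Math.log(p) / Math.LN2);
  }
  return entropy;
}

// Well-known credential shapes are matched directly.
const KNOWN_SECRET_PATTERNS: RegExp[] = [
  /AKIA[0-9A-Z]{16}/g, // AWS access key ID format
  /-----BEGIN [A-Z ]*PRIVATE KEY-----/g, // PEM private key header
];

// Anything else that is long and token-like is checked for entropy.
const CANDIDATE_TOKEN = /[A-Za-z0-9+\/_=-]{20,}/g;
const ENTROPY_THRESHOLD = 4.0; // assumed cutoff, in bits per character

function redactSecrets(text: string): { redacted: string; findings: number } {
  let findings = 0;
  let redacted = text;
  for (const pattern of KNOWN_SECRET_PATTERNS) {
    redacted = redacted.replace(pattern, () => {
      findings += 1;
      return "[REDACTED]";
    });
  }
  redacted = redacted.replace(CANDIDATE_TOKEN, (token) => {
    if (shannonEntropy(token) < ENTROPY_THRESHOLD) return token; // low entropy: keep
    findings += 1;
    return "[REDACTED]";
  });
  return { redacted, findings };
}
```

The regex pass catches credentials with a recognizable shape, while the entropy pass is a fallback for opaque tokens. A threshold that is too low flags ordinary identifiers, so a real deployment would tune it against its own traffic.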
Prompt injection via tool outputs and user content. Attacker-controlled content (web pages, files, user messages) can contain instructions that override the system prompt or escalate privileges. The library runs heuristic detection for role confusion, delimiter breaking, and encoding attacks on every request.
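The three heuristic categories named above can be sketched as a small rule table. This is a hypothetical illustration rather than the library's rule set: the rule names, the regular expressions, and the `detectInjection` function are assumptions, and real detectors would need far broader coverage.

```typescript
// Illustrative sketch only; the rules and names are assumptions,
// not the library's actual heuristics.

interface InjectionFinding {
  rule: string;
  match: string;
}

const INJECTION_RULES: { rule: string; pattern: RegExp }[] = [
  // Role confusion: content that tries to countermand or impersonate the system prompt.
  { rule: "role-confusion", pattern: /ignore (all )?(previous|prior) instructions/i },
  { rule: "role-confusion", pattern: /^\s*(system|assistant)\s*:/im },
  // Delimiter breaking: content that tries to close or open a privileged section.
  { rule: "delimiter-breaking", pattern: /<\/?(system|instructions)>/i },
  // Encoding attacks: a long base64-like run may hide instructions.
  { rule: "encoding", pattern: /[A-Za-z0-9+\/]{80,}={0,2}/ },
];

function detectInjection(content: string): InjectionFinding[] {
  const findings: InjectionFinding[] = [];
  for (const { rule, pattern } of INJECTION_RULES) {
    const m = pattern.exec(content); // non-global regexes, so exec is stateless
    if (m) findings.push({ rule, match: m[0] });
  }
  return findings;
}
```

Pattern-based rules like these only recognize phrasings someone has already seen, which is why the limitations listed later in this README apply.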
Data exfiltration through tool results and network calls. LLMs can be manipulated into exfiltrating data through tool calls, network requests, or by embedding sensitive content in responses. The library monitors egress channels, filters allowed domains, and detects DNS tunneling. PII anonymization tokenizes sensitive values before they reach the LLM and restores them after.
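As a rough illustration of egress domain filtering with raw IP blocking, the sketch below checks a URL against an allowlist. The function name `isEgressAllowed` and the choice to treat subdomains of an allowed domain as allowed are assumptions; the library's matching rules may differ.

```typescript
// Illustrative sketch only; isEgressAllowed is a hypothetical helper,
// not part of the library's API.

const RAW_IPV4 = /^\d{1,3}(\.\d{1,3}){3}$/;

function isEgressAllowed(url: string, allowedDomains: string[]): boolean {
  let hostname: string;
  try {
    hostname = new URL(url).hostname.toLowerCase();
  } catch {
    return false; // unparseable URLs are denied
  }
  // Deny raw IPv4 addresses and bracketed IPv6 literals outright.
  if (RAW_IPV4.test(hostname) || hostname.charAt(0) === "[") return false;

  return allowedDomains.some((domain) => {
    const d = domain.toLowerCase();
    // Exact match, or a subdomain of an allowed domain (assumed behavior).
    return hostname === d || hostname.slice(-(d.length + 1)) === "." + d;
  });
}
```

Matching on the parsed hostname rather than on the raw string matters: an allowed domain that appears only in the query string or as a prefix of an attacker's domain does not pass.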
Unauthorized tool execution. Claude Code and similar agents execute tools (Bash, file writes, web search) at the LLM's direction. Without controls, a compromised prompt can instruct the agent to run arbitrary commands. The library intercepts every tool call, checks it against an allowlist/denylist, validates parameters against a schema, and applies RBAC trust levels before execution.
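The sequence of checks described above (denylist, allowlist, trust level, parameter validation) could look roughly like the following. Every type and name here (`ToolPolicy`, `ToolRule`, `checkToolCall`, the three trust levels) is hypothetical and exists only to show the order of checks; it is not the library's API.

```typescript
// Illustrative sketch only; these types and names are assumptions, not the library's API.

type TrustLevel = "untrusted" | "standard" | "privileged";
const TRUST_RANK: Record<TrustLevel, number> = { untrusted: 0, standard: 1, privileged: 2 };

interface ToolRule {
  requiredTrust: TrustLevel;
  // Returns an error message for invalid parameters, or null when they are acceptable.
  validate: (input: Record<string, unknown>) => string | null;
}

interface ToolPolicy {
  denylist: string[];
  allowlist: Record<string, ToolRule>;
}

type ToolDecision = { action: "allow" } | { action: "reject"; reason: string };

function checkToolCall(
  policy: ToolPolicy,
  toolName: string,
  input: Record<string, unknown>,
  callerTrust: TrustLevel,
): ToolDecision {
  if (policy.denylist.indexOf(toolName) !== -1) {
    return { action: "reject", reason: `${toolName} is denylisted` };
  }
  // hasOwnProperty guards against inherited keys such as "constructor".
  if (!Object.prototype.hasOwnProperty.call(policy.allowlist, toolName)) {
    return { action: "reject", reason: `${toolName} is not on the allowlist` };
  }
  const rule = policy.allowlist[toolName];
  if (TRUST_RANK[callerTrust] < TRUST_RANK[rule.requiredTrust]) {
    return { action: "reject", reason: `${toolName} requires ${rule.requiredTrust} trust` };
  }
  const problem = rule.validate(input);
  return problem === null ? { action: "allow" } : { action: "reject", reason: problem };
}

const examplePolicy: ToolPolicy = {
  denylist: ["WebFetch"],
  allowlist: {
    Bash: {
      requiredTrust: "privileged",
      validate: (input) =>
        typeof input["command"] === "string" ? null : "command must be a string",
    },
    Read: {
      requiredTrust: "standard",
      validate: (input) =>
        typeof input["path"] === "string" ? null : "path must be a string",
    },
  },
};
```

Checking the denylist first means an explicit deny always wins, and a tool missing from the allowlist is rejected by default rather than allowed.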
No audit trail. Without observability, you cannot prove what the LLM did, detect anomalies, or meet compliance obligations. The library emits structured audit log entries and OpenTelemetry spans for every security event.
Comparison to Alternatives
| Approach                      | What it misses                                                                                                                    |
| ----------------------------- | --------------------------------------------------------------------------------------------------------------------------------- |
| Scan input before sending     | Does not cover secrets in LLM responses, tool outputs, or retrieved content                                                       |
| Request-level WAF             | Does not understand LLM semantics, cannot intercept tool calls, no session isolation                                              |
| Other LLM security libraries  | None provide session-per-workspace isolation + tool interception for Claude Code                                                  |
| Rolling your own interceptors | Interceptor priority ordering is subtle and error-prone; PII tokenization with de-anonymization is complex to implement correctly |
What This Library Does NOT Do
- It is not a network firewall. Use your infrastructure's egress controls for that.
- It is not a full compliance platform. It produces audit logs; you store and review them.
- It is not a WAF. It does not inspect HTTP traffic outside LLM SDK calls.
- It does not prevent all prompt injection. Heuristic detection covers known patterns; novel attacks may pass.
Feature Overview
| Capability | Description |
| ----------------------------- | ---------------------------------------------------------------------------------------- |
| Secret detection | Entropy + regex scanning of requests and responses; redact or reject mode |
| PII anonymization | Format-preserving tokenization (name, email, phone, SSN, card); de-anonymize on response |
| Prompt injection detection | Heuristic detection for role confusion, delimiter attacks, encoding tricks |
| Output validation | Schema validation and content sanitization for SQL/XSS/command injection patterns |
| Tool security (RBAC) | Allowlist/denylist, parameter schema validation, trust levels per tool |
| Network control | Egress domain filtering, DNS tunneling detection, raw IP blocking |
| Session hardening | Workspace-scoped session IDs, expiry, replay detection |
| Health check + graceful drain | healthCheck() for readiness probes; drain() for SIGTERM handling |
| OTel observability | Span per interceptor, structured audit log entries, Prometheus metrics |
| Compliance presets | HIPAA, SOC 2, GDPR preset factories with all options pre-configured |
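The PII anonymization row describes a round trip: values are tokenized on the way to the LLM and restored on the way back. The sketch below shows that idea for email addresses only. The `PiiVault` class, its token format, and the use of the reserved `.invalid` top-level domain are assumptions made for this example, not the library's vault design.

```typescript
// Illustrative sketch only; PiiVault is a hypothetical class, not the library's API.

const EMAIL = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;

class PiiVault {
  private tokenToValue: Record<string, string> = {};
  private valueToToken: Record<string, string> = {};
  private counter = 0;

  // Replace each email with a token that still looks like an email address,
  // so downstream text keeps a plausible format.
  anonymize(text: string): string {
    return text.replace(EMAIL, (email) => {
      const existing = this.valueToToken[email];
      if (existing) return existing; // same value always maps to the same token
      this.counter += 1;
      const token = `user${this.counter}@pii.invalid`;
      this.valueToToken[email] = token;
      this.tokenToValue[token] = email;
      return token;
    });
  }

  // Restore original values in a response; unknown addresses are left untouched.
  deanonymize(text: string): string {
    return text.replace(EMAIL, (candidate) => this.tokenToValue[candidate] || candidate);
  }
}
```

A usage example: `new PiiVault().anonymize("Contact alice@example.com")` yields `"Contact user1@pii.invalid"`, and calling `deanonymize` on the same vault instance maps the token back. Keeping the mapping stable within a vault lets the model refer to the same person consistently across a conversation.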
Security Presets
The library ships three built-in security levels and three industry-specific compliance presets. Each preset's full option values are documented in docs/user-guide/index.md.
| Preset | Secret Redaction | Prompt Injection | Output Filtering | Network Control | Tool Security | Observability | Use Case |
| ------------------------ | ---------------- | ---------------- | ---------------- | --------------- | ------------- | ------------- | ------------------------- |
| createMinimalPolicy() | ✓ (redact) | | | | | | Prototyping, evaluation |
| createBalancedPolicy() | ✓ (redact) | ✓ (warn) | ✓ (warn) | | | ✓ | Development, testing |
| createStrictPolicy() | ✓ (reject) | ✓ (reject) | ✓ (sanitize) | ✓ (enforce) | ✓ | ✓ | Production, high-security |
| createHIPAAPolicy() | ✓ (reject) | ✓ (reject) | ✓ (sanitize) | ✓ (enforce) | ✓ | ✓ (PHI) | Healthcare data |
| createSOC2Policy() | ✓ (reject) | ✓ (reject) | ✓ (sanitize) | ✓ (enforce) | ✓ | ✓ (audit) | Enterprise compliance |
| createGDPRPolicy() | ✓ (redact) | ✓ (warn) | ✓ (warn) | | | ✓ (erasure) | European data processing |
Modes:
- redact: Replace secrets/PII with placeholders, continue
- warn: Log violation, continue
- reject: Block request/response on violation
- sanitize: Remove dangerous patterns, continue
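The difference between the four modes is easiest to see side by side. The following sketch applies each mode to the same violation pattern. The `applyMode` function, the `ModeOutcome` shape, and the `[REDACTED]` placeholder are assumptions for illustration; the library's own result objects may look different.

```typescript
// Illustrative sketch only; applyMode and ModeOutcome are hypothetical names.

type Mode = "redact" | "warn" | "reject" | "sanitize";

interface ModeOutcome {
  blocked: boolean; // true only when the request/response is stopped
  content: string; // what continues down the pipeline
  warnings: string[]; // what would be written to the log
}

function applyMode(mode: Mode, content: string, violation: RegExp): ModeOutcome {
  // Work on a global copy so every occurrence is handled and the caller's regex is untouched.
  const flags = violation.flags.indexOf("g") === -1 ? violation.flags + "g" : violation.flags;
  const pattern = new RegExp(violation.source, flags);
  const hits = content.match(pattern);
  if (hits === null) return { blocked: false, content, warnings: [] };

  const note = `${hits.length} violation(s) handled with mode "${mode}"`;
  switch (mode) {
    case "redact": // replace with a placeholder, continue
      return { blocked: false, content: content.replace(pattern, "[REDACTED]"), warnings: [note] };
    case "sanitize": // strip the dangerous pattern, continue
      return { blocked: false, content: content.replace(pattern, ""), warnings: [note] };
    case "warn": // log only, content passes through unchanged
      return { blocked: false, content, warnings: [note] };
    case "reject": // block entirely
      return { blocked: true, content: "", warnings: [note] };
  }
}
```

Only `reject` stops the flow. The other three modes always let content continue, which is why the table pairs them with development and evaluation presets more often than with production ones.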
Use a preset as a starting point and override specific fields:
import { createStrictPolicy, wrapAnthropic } from "@openclaw/agentic-security";
import Anthropic from "@anthropic-ai/sdk";
const policy = createStrictPolicy();
policy.networkControl.egress.allowedDomains = ["api.myservice.com"];
const client = wrapAnthropic(new Anthropic(), policy);
Claude Code CLI Integration
createClaudeCodeSecurity() provides a purpose-built wrapper for Claude Code agents. It intercepts tool calls before execution and sanitizes tool results before they re-enter the conversation. Each workspace session is isolated by a derived session ID so replay attacks across workspaces are blocked.
import { createClaudeCodeSecurity, deriveSessionId } from "@openclaw/agentic-security";
const security = createClaudeCodeSecurity({ model: "claude-sonnet-4-6" });
const sessionId = deriveSessionId({ workspacePath: process.cwd() });
// Before the tool runs:
const result = await security.interceptToolCall({
sessionId,
toolType: "bash",
toolName: "Bash",
input: { command: "ls" },
timestamp: Date.now(),
});
if (result.action === "reject") throw new Error(result.errorMessage);
// Execute the tool, then scan the output:
const out = await security.interceptToolResult({
sessionId,
toolType: "bash",
toolName: "Bash",
output: stdout,
durationMs: 50,
timestamp: Date.now(),
});
const safe = out.action === "redact" ? out.sanitizedOutput : stdout;
Call security.healthCheck() before accepting traffic and security.drain() on SIGTERM.
For the full integration pattern — model presets, session configuration, OTel wiring — see docs/user-guide/index.md.
Health Check and Drain
Use healthCheck() to implement a readiness probe before your process accepts traffic. Use drain() on SIGTERM to let active sessions complete before shutdown.
// Before accepting traffic:
const health = security.healthCheck();
// {
// rateLimitHeadroom: 1,
// sessionCount: 0,
// killSwitchActive: true,
// circuitBreakerState: "closed"
// }
// On SIGTERM — drain active sessions before shutdown:
process.on("SIGTERM", () => {
security.drain();
process.exit(0);
});
Architecture Overview
The security library operates as an interceptor pipeline that wraps SDK clients. Every request and response flows through a prioritized chain of security checks.
graph LR
A[User Input] --> B[Interceptor Pipeline]
B --> C[Request Hooks]
C --> D[SDK Client]
D --> E[LLM API]
E --> F[Response]
F --> G[Response Hooks]
G --> H[User Output]
subgraph "Request Flow (Ascending Priority)"
C --> C1[1: Kill Switch]
C1 --> C2[3: Rate Limiting]
C2 --> C3[5: Tool Security]
C3 --> C4[7: Network Control]
C4 --> C5[10: Context Isolation]
C5 --> C6[20: Prompt Injection Detection]
C6 --> C7[30: PII Anonymization]
C7 --> C8[100: Secret Scanning]
end
subgraph "Response Flow (Descending Priority)"
G --> G1[200: Audit Logging]
G1 --> G2[100: Secret Scanning]
G2 --> G3[90: Schema Validation]
G3 --> G4[80: PII De-Anonymization]
G4 --> G5[70: Content Sanitization]
end
Interceptor Priority Stack
| Priority | Interceptor          | Request | Response | Purpose                                    |
| -------- | -------------------- | ------- | -------- | ------------------------------------------ |
| 1        | Kill Switch          | ✓       |          | Emergency halt mechanism                   |
| 3        | Rate Limiting        | ✓       |          | Prevent DoS/abuse                          |
| 5        | Tool Security        | ✓       |          | RBAC, allowlist/denylist, parameter checks |
| 7        | Network Control      | ✓       |          | Egress filtering, DNS security             |
| 10       | Context Isolation    | ✓       |          | Provider-specific wrapping                 |
| 20       | Prompt Injection     | ✓       |          | Detect role confusion, delimiter attacks   |
| 30       | PII Anonymization    | ✓       |          | Tokenize sensitive data (request)          |
| 70       | Content Sanitization |         | ✓        | Remove SQL/XSS/command injection patterns  |
| 80       | PII De-Anonymization |         | ✓        | Restore original PII (response)            |
| 90       | Schema Validation    |         | ✓        | Enforce response structure                 |
| 100      | Secret Scanning      | ✓       | ✓        | Detect leaked credentials                  |
| 200      | Audit Logging        | ✓       | ✓        | Record all security events                 |
Interceptors with lower priority numbers run first on requests and last on responses.
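The ordering rule can be expressed as a single sort whose direction depends on the phase. The sketch below uses a subset of the priorities from the table. The `InterceptorEntry` shape and the `executionOrder` function are hypothetical and only demonstrate the ordering, not how the library registers or runs interceptors.

```typescript
// Illustrative sketch only; InterceptorEntry and executionOrder are hypothetical names.

type Phase = "request" | "response";

interface InterceptorEntry {
  label: string;
  priority: number;
  phases: Phase[];
}

// Requests run in ascending priority order; responses run in descending order.
function executionOrder(entries: InterceptorEntry[], phase: Phase): string[] {
  const direction = phase === "request" ? 1 : -1;
  return entries
    .filter((entry) => entry.phases.indexOf(phase) !== -1) // filter copies, so sort is safe
    .sort((a, b) => direction * (a.priority - b.priority))
    .map((entry) => entry.label);
}

// A subset of the table above, deliberately listed out of order.
const sampleEntries: InterceptorEntry[] = [
  { label: "Secret Scanning", priority: 100, phases: ["request", "response"] },
  { label: "Kill Switch", priority: 1, phases: ["request"] },
  { label: "PII De-Anonymization", priority: 80, phases: ["response"] },
  { label: "PII Anonymization", priority: 30, phases: ["request"] },
  { label: "Content Sanitization", priority: 70, phases: ["response"] },
];
```

With this data, the request order is Kill Switch, PII Anonymization, Secret Scanning, and the response order is Secret Scanning, PII De-Anonymization, Content Sanitization. Mirroring the order on the way back means the response is scanned for secrets while PII is still tokenized, and original values are restored only afterwards.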
Documentation
- docs/user-guide/ — full integration patterns, split by topic
- docs/user-guide/index.md — overview, preset option values, and links to each topic
- docs/user-guide/session-hardening.md — workspace isolation, session expiry, replay detection
- docs/user-guide/pii-anonymization.md — format-preserving tokenization, vault retention, de-anonymization
- docs/user-guide/ner-bridge.md — subprocess NER bridge architecture, configuration, data flow
- docs/user-guide/otel.md — span names, attributes, example trace output, Prometheus metrics
- docs/api-reference.md — complete API reference for all exported functions and types
- docs/setup-guide.md — preset configuration reference with all option values
- docs/integration-guide.md — Express, Koa, raw API integration patterns
- docs/security-model.md — threat model deep dive and OWASP LLM Top 10 mapping
Development Prerequisites
To build and test this package from a clone of the repository:
- Node.js >= 22 — Required. Use nvm or fnm to manage Node versions.
- pnpm — Required for workspace-level installs from the repo root (npm install -g pnpm). The package itself can be used with npm or yarn as a consumer.
No other system dependencies are required for the core package. Optional features (NER subprocess bridge, OTel) have peer dependency requirements documented in the user guide.
Contributing
Contributions are welcome. Please open an issue or PR on GitHub.
License
MIT License — see LICENSE for details.
Built with OpenClaw — Security-first LLM application framework.
