@dortort/ai-tool-guard
v0.1.0
Published
Policy enforcement middleware for Vercel AI SDK tool calls — guards, approvals, rate limiting, output filtering, and observability.
Downloads
13
Maintainers
Readme
Policy enforcement middleware for Vercel AI SDK tool calls.
Guards, approvals, argument validation, rate limiting, output filtering, prompt-injection detection, MCP drift detection, and OpenTelemetry observability — as a composable middleware layer around your AI SDK tools.
npm install ai-tool-guardQuick start
import { createToolGuard, deny, requireApproval, defaultPolicy } from "ai-tool-guard";
import { generateText } from "ai";
import { openai } from "@ai-sdk/openai";
import { tool } from "ai";
import { z } from "zod";
// 1. Define your tools as usual.
const getWeather = tool({
description: "Get the weather for a city",
parameters: z.object({ city: z.string() }),
execute: async ({ city }) => `Weather in ${city}: sunny, 72°F`,
});
const deleteUser = tool({
description: "Delete a user account",
parameters: z.object({ userId: z.string() }),
execute: async ({ userId }) => `User ${userId} deleted`,
});
// 2. Create a guard with policy rules.
const guard = createToolGuard({
rules: defaultPolicy(),
onApprovalRequired: async (token) => {
console.log(`Approval needed for ${token.toolName}:`, token.originalArgs);
return { approved: true, approvedBy: "admin" };
},
onDecision: (record) => {
console.log(`[${record.verdict}] ${record.toolName}: ${record.reason}`);
},
});
// 3. Wrap tools with per-tool risk levels.
const tools = guard.guardTools({
getWeather: { tool: getWeather, riskLevel: "low" },
deleteUser: { tool: deleteUser, riskLevel: "high" },
});
// 4. Use with AI SDK as normal.
const result = await generateText({
model: openai("gpt-4o"),
tools,
prompt: "What's the weather in Tokyo?",
});Features
| Feature | Description | |---------|-------------| | Policy engine | Rule-based allow/deny/require-approval with glob patterns, risk levels, priorities, and async conditions | | External policy backends | Adapter interface for OPA/Rego, Cedar, or custom ABAC engines | | Decision records | Structured audit output for every evaluation (matched rules, risk category, attributes, redactions) | | Dry-run / simulation | Evaluate policies across recorded traces without executing tools | | Conversation-aware policies | Policies can incorporate session risk score, prior failures, recent approvals | | Approve with edits | Approval handler can patch arguments before execution | | Approval correlation | Payload-hash tokens with TTL prevent mismatch between request and resolution | | Argument guards | Zod schemas, allowlists, denylists, regex, PII scanning per field | | Injection detection | Heuristic prompt-injection detector that can deny or downgrade to approval | | Output filtering | Secrets stripping, PII redaction, custom filters on tool results | | Rate limiting | Sliding-window rate limits + concurrency caps with reject or queue backpressure | | OpenTelemetry | Opinionated spans for policy eval, approval wait, tool execution, redaction | | MCP drift detection | SHA-256 schema fingerprinting, drift detection, actionable remediation |
Architecture
Every guarded tool call passes through a 7-stage execution pipeline: injection detection, argument validation, policy evaluation, approval flow, rate limiting, tool execution, and output filtering. Each stage emits an OpenTelemetry span.
See the architecture overview for the full pipeline diagram.
API reference
createToolGuard(options)
Creates a ToolGuard instance. All options are optional.
interface GuardOptions {
rules?: PolicyRule[]; // Built-in policy rules
backend?: PolicyBackend; // External policy backend
defaultRiskLevel?: RiskLevel; // Default risk for unconfigured tools ("low")
onApprovalRequired?: ApprovalHandler; // Approval callback
injectionDetection?: InjectionDetectorConfig;
defaultRateLimit?: RateLimitConfig;
defaultMaxConcurrency?: number;
otel?: OtelConfig;
dryRun?: boolean; // Simulation mode
onDecision?: (record: DecisionRecord) => void | Promise<void>;
resolveUserAttributes?: () => Record<string, unknown> | Promise<Record<string, unknown>>;
resolveConversationContext?: () => ConversationContext | Promise<ConversationContext>;
}guard.guardTool(name, tool, config?)
Wrap a single AI SDK tool.
const guarded = guard.guardTool("sendEmail", sendEmailTool, {
riskLevel: "medium",
riskCategories: ["network", "pii"],
argGuards: [piiGuard("body")],
outputFilters: [secretsFilter()],
rateLimit: { maxCalls: 10, windowMs: 60_000 },
maxConcurrency: 2,
});guard.guardTools(map)
Wrap multiple tools at once. Returns a flat tools map compatible with generateText({ tools }).
const tools = guard.guardTools({
readFile: { tool: readFileTool, riskLevel: "low" },
writeFile: { tool: writeFileTool, riskLevel: "high", requireApproval: true },
search: { tool: searchTool },
});Policy rules
Built-in rule builders
import { allow, deny, requireApproval } from "ai-tool-guard";
const rules = [
allow({ tools: "read*", description: "Allow all read tools" }),
requireApproval({ tools: "write*", riskLevels: ["medium", "high"] }),
deny({
tools: "delete*",
condition: (ctx) => ctx.userAttributes.role !== "admin",
description: "Only admins can delete",
priority: 10,
}),
];Preset policies
import { defaultPolicy, readOnlyPolicy } from "ai-tool-guard";
// low → allow, medium → require-approval, high/critical → deny
const rules = defaultPolicy();
// Allow specific tools, deny everything else
const rules = readOnlyPolicy(["getUser", "listItems", "search*"]);External policy backend (OPA, Cedar, custom)
import type { PolicyBackend } from "ai-tool-guard";
const opaBackend: PolicyBackend = {
name: "opa",
async evaluate(ctx) {
const res = await fetch("http://opa:8181/v1/data/tool_policy", {
method: "POST",
body: JSON.stringify({ input: ctx }),
});
const data = await res.json();
return {
verdict: data.result.allow ? "allow" : "deny",
reason: data.result.reason,
matchedRules: data.result.matched_rules ?? [],
};
},
};
const guard = createToolGuard({ backend: opaBackend });Approval flow
The approval handler receives an ApprovalToken and returns an ApprovalResolution.
Basic approval
const guard = createToolGuard({
rules: [requireApproval({ tools: "payment*" })],
onApprovalRequired: async (token) => {
const answer = await askUser(
`Allow ${token.toolName} with args ${JSON.stringify(token.originalArgs)}?`
);
return { approved: answer === "yes" };
},
});Approve with edits (parameter patching)
onApprovalRequired: async (token) => {
// User can modify the amount before approving
const editedAmount = await showEditableForm(token.originalArgs);
return {
approved: true,
patchedArgs: { amount: editedAmount },
approvedBy: "finance-team",
};
},The ApprovalToken includes a payloadHash for correlation — the SHA-256 of the canonical { toolName, args } object. This prevents mismatch bugs when message history is reshaped.
Argument guards
Validate tool arguments before policy evaluation.
import {
zodGuard, allowlist, denylist, regexGuard, piiGuard
} from "ai-tool-guard";
import { z } from "zod";
const guarded = guard.guardTool("queryDb", queryTool, {
argGuards: [
// Zod schema validation
zodGuard({ field: "limit", schema: z.number().int().min(1).max(100) }),
// Allowlist
allowlist("table", ["users", "orders", "products"]),
// Denylist
denylist("operation", ["DROP", "TRUNCATE"]),
// Regex: must match allowed domain
regexGuard("url", /^https:\/\/.*\.example\.com/, {
message: "Only example.com URLs are allowed",
}),
// Regex: must NOT match forbidden pattern
regexGuard("query", /DROP\s+TABLE/i, {
mustMatch: false,
message: "SQL injection detected",
}),
// PII scanning
piiGuard("userInput", { allowedTypes: ["email"] }),
],
});Guards support dot-path field access for nested arguments:
allowlist("config.region", ["us-east-1", "eu-west-1"])Output filtering
Control what comes back from tool execution.
import { secretsFilter, piiOutputFilter, customFilter } from "ai-tool-guard";
const guarded = guard.guardTool("fetchData", fetchTool, {
outputFilters: [
// Strip AWS keys, GitHub tokens, JWTs, API keys, bearer tokens, private keys
secretsFilter(),
// Redact emails, SSNs, phone numbers, credit card numbers
piiOutputFilter({ allowedTypes: ["email"] }),
// Custom filter
customFilter("size-limit", async (result) => {
const str = JSON.stringify(result);
if (str.length > 100_000) {
return { verdict: "block", output: null };
}
return { verdict: "pass", output: result };
}),
],
});Filters run in order after tool execution. If any filter returns "block", the filter chain stops, the tool result is discarded, and a ToolGuardError is thrown.
Injection detection
Heuristic prompt-injection detection at the tool boundary.
const guard = createToolGuard({
injectionDetection: {
threshold: 0.5, // Suspicion score 0-1
action: "deny", // "deny" | "downgrade" | "log"
},
});deny— Block the tool call entirely.downgrade— Convert the call to require approval.log— Allow but flag in the decision record.
Custom detectors (e.g., LLM-as-judge):
injectionDetection: {
threshold: 0.7,
action: "downgrade",
detect: async (args) => {
const score = await myLlmJudge(JSON.stringify(args));
return score; // 0-1
},
},Rate limiting and concurrency
const guard = createToolGuard({
// Global defaults
defaultRateLimit: { maxCalls: 100, windowMs: 60_000, strategy: "reject" },
defaultMaxConcurrency: 5,
});
// Per-tool overrides
guard.guardTool("expensiveApi", tool, {
rateLimit: { maxCalls: 5, windowMs: 60_000, strategy: "queue" },
maxConcurrency: 1,
});reject— Immediately throwToolGuardErrorwith code"rate-limited".queue— Wait for a slot to become available (backpressure).
Dry-run / simulation mode
Evaluate policies without executing tools.
Global dry-run
const guard = createToolGuard({ dryRun: true, rules: [...] });
// All tool calls return { dryRun: true, toolName, args } instead of executing.Trace simulation
import { simulate } from "ai-tool-guard";
const result = await simulate(
[
{ toolName: "readFile", args: { path: "/etc/passwd" } },
{ toolName: "deleteUser", args: { id: "123" } },
{ toolName: "getWeather", args: { city: "NYC" } },
],
{ rules: defaultPolicy() },
{
readFile: { riskLevel: "medium" },
deleteUser: { riskLevel: "critical" },
getWeather: { riskLevel: "low" },
},
);
console.log(result.summary);
// { total: 3, allowed: 1, denied: 1, requireApproval: 1 }
console.log(result.blocked);
// [{ toolCall: { toolName: "deleteUser", ... }, decision: { verdict: "deny", ... } }, ...]Decision records
Every policy evaluation produces a structured DecisionRecord:
interface DecisionRecord {
id: string; // Unique correlation id
timestamp: string; // ISO-8601
verdict: "allow" | "deny" | "require-approval";
toolName: string;
matchedRules: string[]; // Rule ids that matched
riskLevel: RiskLevel;
riskCategories: RiskCategory[];
attributes: Record<string, unknown>; // User attributes consumed
reason: string; // Human-readable explanation
redactions?: string[]; // Fields redacted in output
evalDurationMs: number; // Policy eval time
dryRun: boolean;
}Subscribe via onDecision:
const guard = createToolGuard({
onDecision: (record) => {
auditLog.write(record);
if (record.verdict === "deny") {
alerting.fire("tool-denied", record);
}
},
});Conversation-aware policies
Policies can incorporate conversation metadata for contextual decisions.
const guard = createToolGuard({
resolveConversationContext: () => ({
sessionId: currentSession.id,
riskScore: currentSession.riskScore,
priorFailures: currentSession.failureCount,
recentApprovals: currentSession.approvedTools,
}),
rules: [
deny({
tools: "*",
condition: (ctx) => (ctx.conversation?.riskScore ?? 0) > 0.8,
description: "Block all tools when conversation risk is high",
}),
requireApproval({
tools: "*",
condition: (ctx) => (ctx.conversation?.priorFailures ?? 0) > 3,
description: "Require approval after repeated failures",
}),
],
});MCP drift detection
Pin tool schemas and detect when MCP servers change.
import {
pinFingerprint, detectDrift, FingerprintStore
} from "ai-tool-guard/mcp";
// Pin fingerprints for your MCP tools
const store = new FingerprintStore();
store.set(await pinFingerprint("readFile", "fs-server", readFileSchema, "production"));
store.set(await pinFingerprint("queryDb", "db-server", queryDbSchema, "production"));
// Before using tools, check for drift
const result = await detectDrift(store.getAll(), [
{ toolName: "readFile", serverId: "fs-server", schema: currentReadFileSchema },
{ toolName: "queryDb", serverId: "db-server", schema: currentQueryDbSchema },
]);
if (result.drifted) {
for (const change of result.changes) {
console.error(change.remediation);
// "Tool "queryDb" from server "db-server" has changed since it was pinned
// at 2025-01-15T... Expected hash: a1b2c3..., got: d4e5f6...
// Re-pin with pinFingerprint() after reviewing the schema change."
}
throw new Error("MCP schema drift detected. Aborting.");
}Persist fingerprints:
// Export to file
fs.writeFileSync("fingerprints.json", store.export());
// Import from file
store.import(fs.readFileSync("fingerprints.json", "utf-8"));OpenTelemetry
Automatic spans when @opentelemetry/api is installed.
const guard = createToolGuard({
otel: {
enabled: true,
tracerName: "my-app",
defaultAttributes: { "service.name": "ai-agent" },
},
});Spans emitted:
| Span name | When | Key attributes |
|-----------|------|---------------|
| ai_tool_guard.policy_eval | Every policy evaluation | tool.name, tool.risk_level, decision.verdict, decision.reason |
| ai_tool_guard.tool_execute | Tool execution | tool.name |
| ai_tool_guard.approval_wait | Waiting for approval | tool.name, approval.token_id |
| ai_tool_guard.injection_check | Injection suspected | injection.score, injection.suspected |
| ai_tool_guard.rate_limit | Rate limit hit | rate_limit.allowed |
| ai_tool_guard.output_filter | Output redacted/blocked | output.redacted, output.blocked |
All attribute keys are exported as ATTR for custom span creation.
Error handling
All guard failures throw ToolGuardError with a machine-readable code:
import { ToolGuardError } from "ai-tool-guard";
try {
await generateText({ model, tools, prompt: "..." });
} catch (err) {
// AI SDK wraps tool errors in ToolExecutionError — unwrap with .cause
const cause = err instanceof Error ? (err as { cause?: unknown }).cause : err;
if (cause instanceof ToolGuardError) {
switch (cause.code) {
case "policy-denied": // Policy rule blocked the call
case "approval-denied": // Human denied approval
case "no-approval-handler": // Approval required but no handler set
case "arg-validation-failed": // Argument guard failed
case "injection-detected": // Prompt injection suspected
case "rate-limited": // Rate limit exceeded
case "output-blocked": // Output filter blocked the result
case "mcp-drift": // MCP schema drift detected
}
console.log(cause.toolName); // Which tool
console.log(cause.decision); // Full DecisionRecord (if available)
}
}TypeScript
The library is written in TypeScript and exports all types:
import type {
// Core
RiskLevel, RiskCategory, DecisionVerdict, DecisionRecord,
PolicyContext, ConversationContext, GuardOptions,
// Policy
PolicyRule, PolicyBackend, PolicyBackendResult,
// Tools
ToolGuardConfig, AiSdkTool,
// Guards
ArgGuard, ZodArgGuard, OutputFilter, OutputFilterResult,
// Approval
ApprovalToken, ApprovalResolution, ApprovalHandler,
// Rate limiting
RateLimitConfig, RateLimitState,
// Injection
InjectionDetectorConfig,
// MCP
McpToolFingerprint, McpDriftResult, McpDriftChange,
// OTel
OtelConfig,
} from "ai-tool-guard";Subpath exports
import { evaluatePolicy, allow, deny } from "ai-tool-guard/policy";
import { ApprovalManager } from "ai-tool-guard/approval";
import { zodGuard, secretsFilter, RateLimiter } from "ai-tool-guard/guards";
import { createTracer, ATTR } from "ai-tool-guard/otel";
import { detectDrift, FingerprintStore } from "ai-tool-guard/mcp";Examples
Full worked examples are available in the documentation:
- Next.js Integration — App Router setup with per-tool config, approval flow, and error mapping
- Chatbot Safety — Multi-layered defense for a customer support chatbot (5 risk levels, injection detection, PII redaction)
- Multi-Tenant Policies — SaaS platform with plan/role-based access and per-tenant audit logs
- Audit Logging — Structured audit system with denial alerting and OpenTelemetry correlation
- MCP Drift Detection — Schema fingerprinting, drift detection, and environment-scoped pinning
- Simulation & Testing — Policy validation with recorded traces and CI/CD integration
License
MIT
