ai-agent-sentinel

v0.1.1

Published

2 days ago

Wrap AI agent tool functions with dry-run preview and user confirmation before execution


sharpbits

ai agent tool guard dry-run confirmation safety

ai-agent-sentinel

Wrap AI agent tool functions with dry-run preview and user confirmation before executing side effects.

Install

npm install ai-agent-sentinel

Quick start

import { guard, createCLIConfirm } from "ai-agent-sentinel";
import fs from "node:fs/promises";

const confirm = createCLIConfirm();

const writeFile = guard(
  async ({ path, content }: { path: string; content: string }) => {
    await fs.writeFile(path, content, "utf8");
    return { success: true };
  },
  {
    name: "write_file",
    riskLevel: "medium",
    describe: ({ path }) => `Write to ${path}`,
    confirm,
  },
);

// Shows confirmation prompt before writing
await writeFile({ path: "./output.txt", content: "Hello, world!" });

With dry-run diff preview

import {
  guard,
  createCLIConfirm,
  describeFileWrite,
} from "ai-agent-sentinel";
import fs from "node:fs/promises";

const confirm = createCLIConfirm();

const writeFile = guard(
  async ({ path, content }: { path: string; content: string }) => {
    await fs.writeFile(path, content, "utf8");
  },
  {
    name: "write_file",
    riskLevel: "medium",
    describe: ({ path }) => `Write to ${path}`,
    dryRun: async ({ path, content }) => {
      let oldContent = "";
      try {
        oldContent = await fs.readFile(path, "utf8");
      } catch {
        // new file
      }
      return describeFileWrite(path, oldContent, content);
    },
    confirm,
  },
);

Terminal output before execution:

Tool:   write_file
Risk:   MEDIUM
Action: Write to ./config.json

Diff:
--- old
+++ new
 {
-  "debug": false
+  "debug": true
 }

Proceed? [y/N]

API

`guard(toolFn, options)`

Wraps a tool function in a confirmation flow. The returned function takes the same input and always returns a Promise.

function guard<TInput, TOutput>(
  toolFn: (input: TInput) => TOutput | Promise<TOutput>,
  options: GuardOptions<TInput>,
): (input: TInput) => Promise<TOutput>;

Options:

| Option | Type | Required | Description |
|---|---|---|---|
| name | string | Yes | Tool name shown in prompts |
| describe | (input) => string | Yes | Human-readable description of what the call will do |
| dryRun | (input) => Promise<DryRunResult> | No | Preview execution without side effects |
| riskLevel | "low" \| "medium" \| "high" | No | Defaults to "medium" |
| confirm | (info: ConfirmRequest) => Promise<boolean> | Yes | Return true to approve, false to reject |
| autoApprove | (input) => boolean | No | Return true to skip confirmation |
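
None of the examples in this README exercise autoApprove, so here is a minimal sketch of a predicate that could be passed for that option. The scratch-directory policy and the name isInsideScratch are assumptions made for illustration; they are not part of the package.

```typescript
import * as path from "node:path";

// Hypothetical policy: only writes that stay inside ./scratch skip the prompt.
const SCRATCH_DIR = path.resolve("scratch");

function isInsideScratch({ path: target }: { path: string }): boolean {
  // Resolve first so "scratch/../secret.txt" cannot slip through.
  const resolved = path.resolve(target);
  return (
    resolved === SCRATCH_DIR || resolved.startsWith(SCRATCH_DIR + path.sep)
  );
}

// Intended use (assumed): guard(toolFn, { ..., autoApprove: isInsideScratch })
```

Resolving the path before comparing keeps traversal segments and look-alike directory names (such as scratchy/) from being auto-approved.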

Flow:

1. Call dryRun(input) if provided
2. If autoApprove(input) returns true → execute immediately
3. Otherwise call confirm(...):
   - true → execute toolFn(input) and return result
   - false → throw ToolCallRejectedError
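
The flow above can be sketched roughly as follows. This is an illustration written from the documented behavior, not the package's source: guardSketch and RejectedSketch are hypothetical names, and the types are re-declared locally so the snippet stands alone.

```typescript
type RiskLevel = "low" | "medium" | "high";

interface DryRunResult {
  description: string;
  diff?: string;
  warnings?: string[];
}

interface ConfirmRequest<TInput> {
  name: string;
  riskLevel: RiskLevel;
  description: string;
  dryRunResult?: DryRunResult;
  input: TInput;
}

interface GuardOptions<TInput> {
  name: string;
  describe: (input: TInput) => string;
  dryRun?: (input: TInput) => Promise<DryRunResult>;
  riskLevel?: RiskLevel;
  confirm: (info: ConfirmRequest<TInput>) => Promise<boolean>;
  autoApprove?: (input: TInput) => boolean;
}

// Hypothetical stand-in for ToolCallRejectedError.
class RejectedSketch extends Error {
  readonly toolName: string;
  constructor(toolName: string) {
    super(`Tool call rejected: ${toolName}`);
    this.name = "RejectedSketch";
    this.toolName = toolName;
  }
}

function guardSketch<TInput, TOutput>(
  toolFn: (input: TInput) => TOutput | Promise<TOutput>,
  options: GuardOptions<TInput>,
): (input: TInput) => Promise<TOutput> {
  return async (input) => {
    // Step 1: preview without side effects, if a dryRun was supplied.
    const dryRunResult = options.dryRun
      ? await options.dryRun(input)
      : undefined;

    // Step 2: autoApprove short-circuits the prompt.
    if (options.autoApprove?.(input)) return toolFn(input);

    // Step 3: ask for confirmation; a "no" surfaces as a thrown error.
    const approved = await options.confirm({
      name: options.name,
      riskLevel: options.riskLevel ?? "medium",
      description: options.describe(input),
      dryRunResult,
      input,
    });
    if (!approved) throw new RejectedSketch(options.name);
    return toolFn(input);
  };
}
```

Throwing on rejection, rather than returning a sentinel value, keeps the wrapped function's return type identical to the original tool's.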

`createCLIConfirm()`

Returns a ConfirmFn that renders a prompt in the terminal with ANSI colors.

const confirm = createCLIConfirm();
// risk colors: low=green, medium=yellow, high=red
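
As a rough idea of how the risk colors could be produced with raw ANSI escape codes, the sketch below maps each risk level to a standard SGR foreground code. RISK_COLOR and formatRiskLine are hypothetical names; the exact prompt layout used by createCLIConfirm is an assumption based on the sample terminal output in this README.

```typescript
type RiskLevel = "low" | "medium" | "high";

// Standard SGR foreground codes: 32 = green, 33 = yellow, 31 = red.
const RISK_COLOR: Record<RiskLevel, string> = {
  low: "\x1b[32m",
  medium: "\x1b[33m",
  high: "\x1b[31m",
};
const RESET = "\x1b[0m";

// Hypothetical helper: formats the "Risk:" line of the prompt.
function formatRiskLine(riskLevel: RiskLevel): string {
  return `Risk:   ${RISK_COLOR[riskLevel]}${riskLevel.toUpperCase()}${RESET}`;
}
```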

`diffText(oldContent, newContent)`

Compute a simple unified-style diff. Returns "" when content is identical.

const diff = diffText("foo\nbar", "foo\nbaz");
// --- old
// +++ new
//  foo
// -bar
// +baz
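
This README states that the diff is computed with an LCS algorithm. A minimal line-based version might look like the sketch below; diffTextSketch is a hypothetical re-implementation, and the real diffText may differ in details such as context trimming or trailing-newline handling.

```typescript
// Hypothetical re-implementation for illustration; not the package's source.
function diffTextSketch(oldContent: string, newContent: string): string {
  if (oldContent === newContent) return "";
  const a = oldContent.split("\n");
  const b = newContent.split("\n");

  // lcs[i][j] = length of the longest common subsequence of a[i..] and b[j..].
  const lcs: number[][] = Array.from({ length: a.length + 1 }, () =>
    new Array<number>(b.length + 1).fill(0),
  );
  for (let i = a.length - 1; i >= 0; i--) {
    for (let j = b.length - 1; j >= 0; j--) {
      lcs[i][j] =
        a[i] === b[j]
          ? lcs[i + 1][j + 1] + 1
          : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
    }
  }

  // Walk the table from the top-left, emitting kept, removed, and added lines.
  const out = ["--- old", "+++ new"];
  let i = 0;
  let j = 0;
  while (i < a.length && j < b.length) {
    if (a[i] === b[j]) {
      out.push(` ${a[i]}`);
      i++;
      j++;
    } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
      out.push(`-${a[i]}`);
      i++;
    } else {
      out.push(`+${b[j]}`);
      j++;
    }
  }
  while (i < a.length) out.push(`-${a[i++]}`);
  while (j < b.length) out.push(`+${b[j++]}`);
  return out.join("\n");
}
```

The table costs O(n × m) time and memory in the number of lines, which is usually acceptable for previewing a single file write.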

`describeFileWrite(path, oldContent, newContent)`

Returns a DryRunResult with description and diff for use in dryRun.

`describeFileDelete(path)`

Returns a DryRunResult with description and an irreversibility warning.
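
Because DryRunResult is a plain interface (see Types), a custom dryRun can also build one by hand for operations the bundled helpers do not cover. The sketch below shows one way to do that; describeFileMove is a hypothetical helper, not something the package exports, and the caller is assumed to have already checked whether the destination exists.

```typescript
interface DryRunResult {
  description: string;
  diff?: string;
  warnings?: string[];
}

// Hypothetical helper, not exported by the package.
function describeFileMove(
  from: string,
  to: string,
  destinationExists: boolean,
): DryRunResult {
  const result: DryRunResult = { description: `Move ${from} to ${to}` };
  // Only attach warnings when there is something the user should notice.
  if (destinationExists) {
    result.warnings = [`${to} already exists and will be overwritten`];
  }
  return result;
}
```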

`ToolCallRejectedError`

Thrown when the user rejects a confirmation. The rejected tool's name is available on the .toolName property.

try {
  await wrappedTool(input);
} catch (err) {
  if (err instanceof ToolCallRejectedError) {
    console.log(`User rejected: ${err.toolName}`);
  }
}

Types

type RiskLevel = "low" | "medium" | "high";

interface DryRunResult {
  description: string;
  diff?: string;
  warnings?: string[];
}

interface ConfirmRequest<TInput = unknown> {
  name: string;
  riskLevel: RiskLevel;
  description: string;
  dryRunResult?: DryRunResult;
  input: TInput;
}

type ConfirmFn<TInput = unknown> = (
  info: ConfirmRequest<TInput>,
) => Promise<boolean>;

interface GuardOptions<TInput = unknown> {
  name: string;
  describe: (input: TInput) => string;
  dryRun?: (input: TInput) => Promise<DryRunResult>;
  riskLevel?: RiskLevel;
  confirm: ConfirmFn<TInput>;
  autoApprove?: (input: TInput) => boolean;
}

Integration with Anthropic SDK

import Anthropic from "@anthropic-ai/sdk";
import fs from "node:fs/promises";
import {
  guard,
  createCLIConfirm,
  describeFileWrite,
  describeFileDelete,
  ToolCallRejectedError,
} from "ai-agent-sentinel";

const client = new Anthropic();
const confirm = createCLIConfirm();

// Guarded tool implementations
const writeFile = guard(
  async ({ path, content }: { path: string; content: string }) => {
    await fs.writeFile(path, content, "utf8");
    return `Wrote ${content.length} bytes to ${path}`;
  },
  {
    name: "write_file",
    riskLevel: "medium",
    describe: ({ path }) => `Write file: ${path}`,
    dryRun: async ({ path, content }) => {
      let oldContent = "";
      try { oldContent = await fs.readFile(path, "utf8"); } catch {}
      return describeFileWrite(path, oldContent, content);
    },
    confirm,
  },
);

const deleteFile = guard(
  async ({ path }: { path: string }) => {
    await fs.unlink(path);
    return `Deleted ${path}`;
  },
  {
    name: "delete_file",
    riskLevel: "high",
    describe: ({ path }) => `Delete file: ${path}`,
    dryRun: async ({ path }) => describeFileDelete(path),
    confirm,
  },
);

// Tool definitions for Claude
const tools: Anthropic.Tool[] = [
  {
    name: "write_file",
    description: "Write content to a file",
    input_schema: {
      type: "object" as const,
      properties: {
        path: { type: "string", description: "File path" },
        content: { type: "string", description: "File content" },
      },
      required: ["path", "content"],
    },
  },
  {
    name: "delete_file",
    description: "Delete a file",
    input_schema: {
      type: "object" as const,
      properties: {
        path: { type: "string", description: "File path to delete" },
      },
      required: ["path"],
    },
  },
];

async function runAgent(userMessage: string) {
  const messages: Anthropic.MessageParam[] = [
    { role: "user", content: userMessage },
  ];

  while (true) {
    const response = await client.messages.create({
      model: "claude-sonnet-4-6",
      max_tokens: 4096,
      tools,
      messages,
    });

    messages.push({ role: "assistant", content: response.content });

    if (response.stop_reason !== "tool_use") break;

    const toolResults: Anthropic.ToolResultBlockParam[] = [];

    for (const block of response.content) {
      if (block.type !== "tool_use") continue;

      let result: string;
      try {
        if (block.name === "write_file") {
          result = await writeFile(block.input as { path: string; content: string });
        } else if (block.name === "delete_file") {
          result = await deleteFile(block.input as { path: string });
        } else {
          result = `Unknown tool: ${block.name}`;
        }
      } catch (err) {
        if (err instanceof ToolCallRejectedError) {
          result = `Tool call rejected by user: ${err.toolName}`;
        } else {
          throw err;
        }
      }

      toolResults.push({
        type: "tool_result",
        tool_use_id: block.id,
        content: result,
      });
    }

    messages.push({ role: "user", content: toolResults });
  }

  const lastMessage = messages[messages.length - 1];
  if (lastMessage.role === "assistant" && Array.isArray(lastMessage.content)) {
    const textBlock = lastMessage.content.find((b) => b.type === "text");
    if (textBlock && textBlock.type === "text") {
      console.log(textBlock.text);
    }
  }
}

// Surface API and tool errors instead of leaving an unhandled rejection
runAgent("Create a file called hello.txt with 'Hello, world!' inside").catch(
  console.error,
);
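
As the number of tools grows, the if/else chain in the loop above can be swapped for a lookup table. The sketch below is one possible shape, with stand-in handlers so it runs on its own; makeDispatcher and ToolHandler are hypothetical names, and in the agent loop the handlers would wrap the guarded writeFile and deleteFile.

```typescript
// Hypothetical handler shape: takes the raw tool input and returns result text.
type ToolHandler = (input: unknown) => Promise<string>;

function makeDispatcher(handlers: Map<string, ToolHandler>) {
  return async (name: string, input: unknown): Promise<string> => {
    const handler = handlers.get(name);
    // Mirror the loop above: unknown tools produce a message instead of throwing.
    if (!handler) return `Unknown tool: ${name}`;
    return handler(input);
  };
}

// Stand-in handlers for illustration only.
const dispatch = makeDispatcher(
  new Map<string, ToolHandler>([
    ["write_file", async () => "wrote"],
    ["delete_file", async () => "deleted"],
  ]),
);
```

Errors thrown by a handler, including a rejection from the confirmation step, still propagate to the caller, so the existing try/catch around the call keeps working.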

Custom confirm (non-CLI)

import type { ConfirmFn } from "ai-agent-sentinel";

// Web app: send to frontend and wait for response
const webConfirm: ConfirmFn = async (info) => {
  const response = await fetch("/api/confirm", {
    method: "POST",
    body: JSON.stringify(info),
    headers: { "Content-Type": "application/json" },
  });
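  // Fail closed: a failed request or a non-boolean answer counts as a rejection
  if (!response.ok) return false;
  const { approved } = await response.json();
  return approved === true;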
};

// Always approve low-risk operations
const smartConfirm: ConfirmFn = async (info) => {
  if (info.riskLevel === "low") return true;
  return webConfirm(info);
};
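
Since a confirm function is just an async function returning a boolean, it can be wrapped like any other. The sketch below adds a timeout that treats an unanswered prompt as a rejection. withTimeout is a hypothetical helper, not part of the package, and the request type is trimmed to a minimal local stand-in.

```typescript
// Minimal local stand-ins for the package's types.
interface ConfirmRequest<TInput = unknown> {
  name: string;
  description: string;
  input: TInput;
}
type ConfirmFn<TInput = unknown> = (
  info: ConfirmRequest<TInput>,
) => Promise<boolean>;

// Hypothetical wrapper: resolves to false if `inner` has not answered in `ms`.
function withTimeout<TInput>(
  inner: ConfirmFn<TInput>,
  ms: number,
): ConfirmFn<TInput> {
  return (info) =>
    new Promise<boolean>((resolve, reject) => {
      const timer = setTimeout(() => resolve(false), ms);
      inner(info).then(
        (approved) => {
          clearTimeout(timer);
          resolve(approved);
        },
        (err) => {
          clearTimeout(timer);
          reject(err);
        },
      );
    });
}
```

Defaulting to false on timeout means an unattended agent stalls safely instead of proceeding with a side effect nobody approved.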

Zero dependencies

ai-agent-sentinel has no runtime dependencies. The diff is computed with a pure LCS (longest common subsequence) algorithm, and CLI colors use raw ANSI escape codes.
