mikoshi-sentinel
v1.0.1
Deterministic action verification for LLM agent security
Prompt injection is unsolved because LLMs process instructions and data in the same channel. Sentinel solves this by verifying actions, not prompts — using deterministic code that can't be manipulated by clever input.
The Problem
Every current defence against prompt injection — input filtering, system prompt hardening, dual-LLM checking — is probabilistic. An LLM-based check can be fooled by the same techniques it's trying to detect, because it processes instructions and data in a shared context.
The Solution
Separate the decision from the enforcement. Let the LLM decide what to do. Let deterministic code decide whether it's allowed to do it.
Sentinel sits between the LLM and the tools. Every proposed action passes through a pipeline of deterministic policy checks (pure code, not prompts) before execution. Code doesn't hallucinate. Code can't be prompt-injected.
┌──────────┐     ┌──────────────┐     ┌──────────┐     ┌──────────┐
│   LLM    │────▶│   Sentinel   │────▶│ Verdict  │────▶│ Execute  │
│ (Propose)│     │   (Verify)   │     │  Allow/  │     │   (or    │
│          │     │              │     │  Block   │     │  Block)  │
└──────────┘     │ ┌──────────┐ │     └──────────┘     └──────────┘
                 │ │ Policies │ │
                 │ │  (Code)  │ │
                 │ ├──────────┤ │
                 │ │  Intent  │ │
                 │ │ Verifier │ │
                 │ ├──────────┤ │
                 │ │  Audit   │ │
                 │ │  Logger  │ │
                 │ └──────────┘ │
                 └──────────────┘

Quick Start
npm install mikoshi-sentinel

import { Sentinel } from 'mikoshi-sentinel';

const sentinel = new Sentinel();

// Verify an action before executing it
const verdict = await sentinel.verify({
  tool: 'exec',
  args: { command: 'rm -rf /' }
});

console.log(verdict.allowed); // false
console.log(verdict.violations); // [{ policy: 'systemCommands', reason: '...', severity: 'critical' }]

How It Works
The Propose → Verify → Execute Pipeline
- Propose — The LLM decides on an action (tool call)
- Verify — Sentinel runs the action through deterministic policy checks
- Execute — Only if all policies pass does the action execute
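The three steps above can be sketched as a small gate function. This is a minimal illustration, not the library's implementation: `noRecursiveDelete`, `verify`, `runPipeline`, and `fakeExec` are hypothetical names, and the stand-in verifier is synchronous for brevity, whereas `sentinel.verify` in the Quick Start is awaited. The per-policy result shape (`pass`, `reason`, `severity`) follows the Custom Policies section.

```javascript
// Hypothetical policy: block an obviously destructive shell command.
// Real policies would be far more thorough; this only illustrates the shape.
function noRecursiveDelete(action) {
  const command = String(action.args?.command ?? '');
  if (/\brm\s+-rf\b/.test(command)) {
    return { pass: false, reason: 'Recursive delete is not allowed', severity: 'critical' };
  }
  return { pass: true, reason: 'No recursive delete found', severity: 'none' };
}

// Verify: run every policy and collect the failures as violations.
function verify(action, policies) {
  const violations = [];
  for (const [policy, check] of Object.entries(policies)) {
    const result = check(action);
    if (!result.pass) {
      violations.push({ policy, reason: result.reason, severity: result.severity });
    }
  }
  return { allowed: violations.length === 0, violations };
}

// Propose -> Verify -> Execute: the tool only runs when the verdict allows it.
function runPipeline(proposedAction, policies, execute) {
  const verdict = verify(proposedAction, policies);
  if (!verdict.allowed) {
    return { executed: false, verdict };
  }
  return { executed: true, verdict, result: execute(proposedAction) };
}

const policies = { noRecursiveDelete };
const fakeExec = (action) => `ran: ${action.args.command}`;

const blocked = runPipeline({ tool: 'exec', args: { command: 'rm -rf /' } }, policies, fakeExec);
const allowed = runPipeline({ tool: 'exec', args: { command: 'ls -la' } }, policies, fakeExec);

console.log(blocked.executed, blocked.verdict.violations.length); // false 1
console.log(allowed.executed, allowed.result); // true ran: ls -la
```

The point of the structure is that `execute` is only reachable through the verdict check, so nothing the model writes into the action can skip the policies.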
Built-in Policies
| Policy | What it blocks | Severity |
|--------|---------------|----------|
| Privilege Escalation | sudo, admin routes, config modifications | Critical |
| Data Exfiltration | Sending data to external URLs, webhook.site, ngrok | Critical |
| Internal Access | localhost, private IPs, cloud metadata (SSRF) | Critical |
| File Traversal | ../, ~/, null bytes, symlink attacks | Critical |
| System Commands | rm -rf, curl\|bash, reverse shells, fork bombs | Critical |
| Intent Alignment | Prompt injection patterns, DAN mode, context shifts | Critical |
| Rate Limiting | Rapid-fire tool calls, repeated identical actions | High |
| Scope Enforcement | Tool whitelists, path restrictions, permission scoping | High |
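To make "deterministic policy" concrete, here is a rough sketch of what a check in the spirit of the File Traversal row might look like. It is an assumption-laden illustration, not the built-in policy: `fileTraversalSketch` and the `args.path` field are hypothetical, and string inspection like this cannot detect symlink attacks, which need filesystem access.

```javascript
// Hypothetical file-traversal check. The patterns mirror the table row
// ("../", "~/", null bytes); the real built-in policy may differ.
function fileTraversalSketch(action) {
  const path = String(action.args?.path ?? '');
  const findings = [];

  // A ".." segment, using either slash style as the separator.
  if (path.split(/\\|\//).includes('..')) findings.push('parent-directory segment');
  // A leading "~/" expands to the user's home directory in many shells.
  if (path.startsWith('~/')) findings.push('home-directory shortcut');
  // A null byte can truncate the path in lower-level APIs.
  if (path.includes('\0')) findings.push('null byte');

  if (findings.length > 0) {
    return { pass: false, reason: `Path contains ${findings.join(', ')}`, severity: 'critical' };
  }
  return { pass: true, reason: 'No traversal pattern found', severity: 'none' };
}

const bad = fileTraversalSketch({ tool: 'readFile', args: { path: '../../etc/passwd' } });
const good = fileTraversalSketch({ tool: 'readFile', args: { path: 'reports/q1.txt' } });

console.log(bad.pass, bad.reason); // false Path contains parent-directory segment
console.log(good.pass); // true
```

The same input always produces the same result, which is the property the table relies on: there is no prompt for an attacker to influence.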
Custom Policies
sentinel.addPolicy('noWeekends', (action, context) => {
  const day = new Date().getDay();
  if (day === 0 || day === 6) {
    return { pass: false, reason: 'No deployments on weekends', severity: 'medium' };
  }
  return { pass: true, reason: 'Weekday', severity: 'none' };
});

Express Middleware
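As a rough mental model for the middleware used in this section, an Express-style gate around a verifier could look like the sketch below. It is not the library's implementation: `makeGate`, `verifyFn`, the 403 response, and reading the action from `req.body` are all assumptions. Only the `req.sentinelVerdict` property name comes from the handler shown in this section.

```javascript
// Hypothetical factory: wraps any async verifier in an Express-style middleware.
// `verifyFn` stands in for something like `(action) => sentinel.verify(action)`.
function makeGate(verifyFn) {
  return async function gate(req, res, next) {
    try {
      const verdict = await verifyFn(req.body);
      req.sentinelVerdict = verdict; // property name the route handler reads
      if (!verdict.allowed) {
        // Assumption: reply 403 with the violations; the real middleware may differ.
        res.status(403).json({ status: 'blocked', violations: verdict.violations });
        return;
      }
      next();
    } catch (err) {
      // Fail closed: a verifier error goes to the error handler, not the tool route.
      next(err);
    }
  };
}

// Exercise the gate with fake request/response objects (no Express needed).
async function demo() {
  const fakeVerify = async (action) =>
    action.tool === 'exec'
      ? { allowed: false, violations: [{ policy: 'demo', reason: 'exec is blocked', severity: 'critical' }] }
      : { allowed: true, violations: [] };
  const gate = makeGate(fakeVerify);
  const seen = [];
  const res = {
    status(code) { seen.push(`status ${code}`); return this; },
    json(body) { seen.push(`json ${body.status}`); return this; },
  };
  await gate({ body: { tool: 'exec' } }, res, () => seen.push('next'));
  await gate({ body: { tool: 'search' } }, res, () => seen.push('next'));
  return seen;
}

demo().then((seen) => console.log(seen)); // [ 'status 403', 'json blocked', 'next' ]
```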
import express from 'express';
import { Sentinel } from 'mikoshi-sentinel';

const app = express();
const sentinel = new Sentinel();

app.use('/api/tools', sentinel.middleware());

app.post('/api/tools', (req, res) => {
  // Only reaches here if Sentinel approved the action
  res.json({ status: 'executed', verdict: req.sentinelVerdict });
});

Intent Verification
Optional LLM-backed or heuristic intent checking:
const sentinel = new Sentinel({
  enableIntentVerification: true,
  llmFn: async (prompt) => await myLLM.complete(prompt), // Optional
});

const verdict = await sentinel.verify(action, {
  conversationHistory: messages // Recent conversation for context
});

console.log(verdict.intent); // { confidence: 0.95, aligned: true, method: 'heuristic' }

API
new Sentinel(config)
| Option | Type | Default | Description |
|--------|------|---------|-------------|
| useBuiltinPolicies | boolean | true | Load all 8 built-in policies |
| enableIntentVerification | boolean | true | Enable intent alignment checking |
| llmFn | function | null | Async LLM function for intent verification |
| intentThreshold | number | 0.5 | Minimum intent confidence score |
| audit | object | {} | Audit logger options |
| scope | object | {} | Default scope restrictions |
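The sketch below illustrates how options like these are commonly resolved against defaults, and one plausible reading of `intentThreshold` as a minimum confidence. The default values are copied from the table above; `resolveConfig` and `intentPasses` are hypothetical helpers, and the rule that an intent must be both aligned and at or above the threshold is an assumption rather than documented behaviour.

```javascript
// Defaults copied from the options table above.
const DEFAULTS = Object.freeze({
  useBuiltinPolicies: true,
  enableIntentVerification: true,
  llmFn: null,
  intentThreshold: 0.5,
  audit: {},
  scope: {},
});

// Hypothetical helper: shallow-merge caller options over the defaults.
function resolveConfig(config = {}) {
  return { ...DEFAULTS, ...config };
}

// Assumption: an intent result passes when it is aligned and its confidence
// is at least `intentThreshold`. The library may combine these differently.
function intentPasses(intent, config) {
  if (!config.enableIntentVerification) return true;
  return intent.aligned && intent.confidence >= config.intentThreshold;
}

const strict = resolveConfig({ intentThreshold: 0.9 });

console.log(intentPasses({ aligned: true, confidence: 0.95 }, strict)); // true
console.log(intentPasses({ aligned: true, confidence: 0.7 }, strict)); // false
console.log(intentPasses({ aligned: true, confidence: 0.7 }, resolveConfig())); // true
```

Under this reading, raising `intentThreshold` trades more blocked actions for fewer low-confidence approvals.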
sentinel.verify(action, context)
Returns:
{
  allowed: boolean,          // Final verdict
  confidence: number,        // 0.0 - 1.0
  violations: [{             // Policy violations (empty if allowed)
    policy: string,
    reason: string,
    severity: 'critical' | 'high' | 'medium' | 'low'
  }],
  intent: {                  // Intent verification result (if enabled)
    confidence: number,
    aligned: boolean,
    method: 'heuristic' | 'llm',
    reason: string
  },
  elapsed: string            // Verification time
}

Performance
- Policy checks: <5ms (deterministic function calls)
- Intent verification (heuristic): ~2ms
- Intent verification (LLM-backed): ~200ms
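- Overhead: about 7ms or less per action when using policy checks with heuristic intent verification, per the figures above; the LLM-backed check adds roughly 200ms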
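Figures like these depend on hardware and policy set, so it can be worth measuring locally. The following is a minimal timing sketch, assuming a Node.js version where `performance` is a global; `sampleCheck` is a hypothetical stand-in for whatever synchronous check is being timed, not a Sentinel function.

```javascript
// Hypothetical stand-in for a deterministic check; swap in the call to time.
function sampleCheck(action) {
  return !/\brm\s+-rf\b/.test(String(action.args?.command ?? ''));
}

// Time `fn` over many runs and return the mean cost per call in milliseconds.
function meanMillis(fn, runs = 10000) {
  const start = performance.now();
  for (let i = 0; i < runs; i++) fn();
  return (performance.now() - start) / runs;
}

const action = { tool: 'exec', args: { command: 'ls -la' } };
const mean = meanMillis(() => sampleCheck(action));

console.log(`mean per check: ${mean.toFixed(4)} ms`);
```

Timing an awaited call such as `sentinel.verify` would need an async loop instead, since each iteration has to resolve before the next starts.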
Research Paper
The architecture and evaluation of Mikoshi Sentinel are described in our paper:
Mikoshi Sentinel: Deterministic Action Verification as a Defence Against Prompt Injection in LLM Agents — Mikoshi Research, 2025
See paper/mikoshi-sentinel.tex
License
Apache 2.0 — Built by Mikoshi Ltd
