@stylusnexus/agentarmor-ml
v0.1.2

ML classifier add-on for Agent Armor. Runs a DeBERTa-v3-small ONNX model locally for deeper agent trap detection that catches threats regex patterns miss.
Why Use the ML Classifier?
Regex-based detection handles the obvious attacks: hidden HTML instructions, known jailbreak patterns, blatant exfiltration triggers. But sophisticated attacks use natural language to manipulate agent behavior through biased framing, subtle persona shifts, or contextual learning traps. These don't have a regex signature.
The ML classifier catches what patterns can't. It's trained on the full AI Agent Traps taxonomy, runs locally (no API calls, no data leaves your machine), and adds meaningful detection coverage on the semantic manipulation categories where regex falls short.
Install
npm install @stylusnexus/agentarmor @stylusnexus/agentarmor-ml

Usage
import { AgentArmor } from '@stylusnexus/agentarmor';
const armor = await AgentArmor.create({
ml: { enabled: true },
});
const result = await armor.scan(content);
// ML-detected threats have source: 'ml'
result.threats.filter(t => t.source === 'ml');

How It Works
On first use, the model (~140MB quantized ONNX) is downloaded from HuggingFace and cached locally:
- macOS: ~/Library/Caches/agentarmor/v1/
- Linux: ~/.cache/agentarmor/v1/
- Custom: set AGENTARMOR_CACHE_DIR or pass ml.modelDir in config
Subsequent runs load from cache with no network calls.
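The lookup order implied above (explicit config, then the environment variable, then the per-OS default) can be sketched as follows. This is illustrative only; the function name is ours, and the package's internals may differ:

```typescript
import * as os from 'node:os';
import * as path from 'node:path';

// Sketch of the cache-resolution precedence described above:
// ml.modelDir > AGENTARMOR_CACHE_DIR > per-OS default cache path.
function resolveModelDir(configModelDir?: string): string {
  if (configModelDir) return configModelDir;      // explicit config wins
  const envDir = process.env.AGENTARMOR_CACHE_DIR;
  if (envDir) return envDir;                      // env var override
  const base = process.platform === 'darwin'
    ? path.join(os.homedir(), 'Library', 'Caches')
    : path.join(os.homedir(), '.cache');
  return path.join(base, 'agentarmor', 'v1');     // default cache location
}
```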
Configuration
const armor = await AgentArmor.create({
ml: {
enabled: true,
// Point to a local model directory (skips download)
modelDir: './models/agentarmor',
// Behavior when model is unavailable
onUnavailable: 'warn-and-skip', // 'throw' | 'warn-and-skip' | 'silent-skip'
// Download options
download: {
timeoutMs: 120_000,
retries: 2,
onProgress: (received, total) => {
console.log(`${Math.round(received / total * 100)}%`);
},
},
},
});

CLI
Pre-download the model or manage the cache:
# Download model to cache (or custom directory)
agentarmor-ml download
agentarmor-ml download --dir ./models
# Show cache location and file sizes
agentarmor-ml cache-info
# Remove cached model
agentarmor-ml clear-cache

Inference Details
- Tokenizes input to 512 tokens (WordPiece)
- Runs ONNX inference with INT8 quantization via onnxruntime-node
- Applies sigmoid to logits with strictness-based thresholds: strict = 0.3, balanced = 0.5, permissive = 0.7
- scan() (sync) returns empty; ML inference is async-only via scanAsync()
Deployment Notes
- AWS Lambda: 140 MB model + ~40 MB onnxruntime = ~180 MB, which fits the 250 MB limit but is tight. Use modelDir to bundle the model in your deployment package.
- Vercel Edge: not supported (the ONNX runtime requires Node.js native bindings).
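For the Lambda case above, a minimal sketch of a handler module that bundles the model with the function (the ./model path is hypothetical; adjust to wherever your build step places the downloaded model):

```typescript
import { AgentArmor } from '@stylusnexus/agentarmor';

// Hypothetical Lambda setup: the model ships inside the deployment package
// under ./model, so cold starts never download from HuggingFace.
const armor = await AgentArmor.create({
  ml: {
    enabled: true,
    modelDir: './model',            // bundled locally, skips the network
    onUnavailable: 'warn-and-skip', // degrade to regex-only if model missing
  },
});
```

Pairing a bundled modelDir with onUnavailable: 'warn-and-skip' keeps the function serving (regex-only) even if the model files are misplaced in a deploy.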
Requirements
- Node.js >= 18
- Peer dependency: @stylusnexus/agentarmor >= 0.2.0
License
MIT
