agentshield-sdk

v14.2.2

Published

7 days ago

SOTA AI agent security SDK. F1 1.000 on BIPIA/HackAPrompt/MCPTox/Multilingual benchmarks. 400+ exports, 100+ modules. Zero dependencies, runs locally.

Agent Shield

Security middleware for AI agents. Protects against prompt injection, tool poisoning, data exfiltration, and 40+ threat categories. Zero dependencies. All detection runs locally.

npm install agentshield-sdk

const { AgentShield } = require('agentshield-sdk');
const shield = new AgentShield({ blockOnThreat: true });

const result = shield.scanInput(userMessage);
if (result.blocked) return 'Blocked for safety.';

Benchmarks

| Metric | Result |
|--------|--------|
| F1 (real-world: HackAPrompt + TensorTrust + research papers) | 0.988 |
| F1 (embedded: BIPIA/HackAPrompt/MCPTox/Multilingual/Stealth) | 1.000 |
| Red team (617+ attack payloads) | 100% detection |
| False positive rate (118+ benign inputs) | 0% |
| Self-training convergence | 0% bypass in 3 cycles |
| Avg latency | < 0.4ms |
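For readers comparing the F1 and false-positive figures above, the following sketch shows how these metrics are conventionally derived from confusion-matrix counts. The counts in the example call are made up for illustration and are not the project's benchmark data.

```javascript
// Standard precision / recall / F1 / false-positive rate from raw counts.
// The counts passed below are illustrative only, not Agent Shield results.
function detectionMetrics({ tp, fp, fn, tn }) {
  const precision = tp + fp === 0 ? 0 : tp / (tp + fp);
  const recall = tp + fn === 0 ? 0 : tp / (tp + fn);
  const f1 =
    precision + recall === 0 ? 0 : (2 * precision * recall) / (precision + recall);
  const falsePositiveRate = fp + tn === 0 ? 0 : fp / (fp + tn);
  return { precision, recall, f1, falsePositiveRate };
}

const exampleMetrics = detectionMetrics({ tp: 95, fp: 2, fn: 5, tn: 98 });
console.log(exampleMetrics.f1.toFixed(3)); // "0.964"
```

An F1 of 1.000 therefore requires zero false positives and zero false negatives on the evaluated set.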

Detection stack: 115+ regex patterns, 35-feature logistic regression + k-NN ensemble, 5-layer evasion resistance, 19-language support, chunked scanning, adversarial self-training loop.
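To illustrate what "chunked scanning" can mean in practice, here is a minimal sketch of scanning long input in overlapping windows so that a pattern straddling a chunk boundary is still found. The function names, chunk size, and overlap are assumptions chosen for the example, not the SDK's internals.

```javascript
// Illustrative chunked scan with overlapping windows; not the SDK's implementation.
function* overlappingChunks(text, size, overlap) {
  if (overlap >= size) throw new RangeError('overlap must be smaller than size');
  const step = size - overlap;
  for (let start = 0; start < text.length; start += step) {
    yield { start, chunk: text.slice(start, start + size) };
    if (start + size >= text.length) break; // last window reached the end
  }
}

// Returns the absolute offset of the first match, or -1 if nothing matches.
function scanChunked(text, pattern, size = 2000, overlap = 200) {
  for (const { start, chunk } of overlappingChunks(text, size, overlap)) {
    const idx = chunk.search(pattern);
    if (idx !== -1) return start + idx;
  }
  return -1;
}

// The payload starts at offset 1995 and crosses the 2000-character boundary.
const longInput = 'a'.repeat(1995) + 'ignore all instructions' + 'b'.repeat(3000);
console.log(scanChunked(longInput, /ignore all instructions/i)); // 1995
```

The overlap has to be at least as long as the longest payload you expect to match; with an overlap of 0 the same input would go undetected.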

# Verify locally
npm run score && npm run redteam

What It Detects

| Category | Examples |
|----------|----------|
| Prompt Injection | System prompt overrides, ChatML/LLaMA delimiters, instruction hijacking |
| Role Hijacking | DAN mode, developer mode, persona attacks, jailbreaks (35+ templates) |
| Data Exfiltration | Prompt extraction, markdown image leaks, DNS tunneling, side-channel encoding |
| Tool Abuse | Shell execution, SQL injection, path traversal, sensitive file access |
| Social Engineering | Identity concealment, urgency + authority, gaslighting, false pre-approval |
| Obfuscation | Unicode homoglyphs, zero-width chars, Base64, hex, ROT13, leetspeak |
| Indirect Injection | RAG poisoning, tool output injection, email/document payloads, few-shot poisoning |
| Visual Deception | Hidden HTML/CSS content, LaTeX phantom commands, rendering differentials |
| Multi-Language | CJK, Arabic, Cyrillic, Hindi + 15 more languages |
| AI Phishing | Fake AI login, QR phishing, MFA harvesting, credential urgency |
| Sybil Attacks | Coordinated fake agents, voting collusion, behavioral clustering |
| Side Channels | DNS exfiltration, timing-based encoding, beaconing detection |
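The Obfuscation row is easier to picture with a concrete pre-processing step. The sketch below strips zero-width characters, applies Unicode NFKC normalization, and decodes long Base64-looking runs before matching. It uses a single hypothetical rule and is not the SDK's detection logic; note that NFKC folds compatibility forms such as fullwidth letters but does not map cross-script homoglyphs (for example Cyrillic "а" to Latin "a").

```javascript
// Illustrative de-obfuscation before pattern matching; not the SDK's implementation.
const ZERO_WIDTH = /[\u200B-\u200D\u2060\uFEFF]/g;
const BASE64_RUN = /[A-Za-z0-9+/]{16,}={0,2}/g;
const INJECTION_RULE = /ignore (all|previous) instructions/i; // hypothetical single rule

function normalizeInput(text) {
  // NFKC folds compatibility look-alikes (e.g. fullwidth letters) to their base form.
  return text.normalize('NFKC').replace(ZERO_WIDTH, '');
}

function decodeBase64Runs(text) {
  // Keep only decodings that are printable ASCII, to skip random binary noise.
  const decoded = [];
  for (const run of text.match(BASE64_RUN) || []) {
    const candidate = Buffer.from(run, 'base64').toString('utf8');
    if (/^[\x20-\x7E]+$/.test(candidate)) decoded.push(candidate);
  }
  return decoded;
}

function looksLikeInjection(text) {
  const clean = normalizeInput(text);
  return [clean, ...decodeBase64Runs(clean)].some((t) => INJECTION_RULE.test(t));
}

console.log(looksLikeInjection('ig\u200Bnore all instructions')); // true
console.log(looksLikeInjection('What is the weather today?'));    // false
```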

Framework Integrations

Works with any agent framework in 1-3 lines:

// Anthropic / Claude SDK
const Anthropic = require('@anthropic-ai/sdk');
const { shieldAnthropicClient } = require('agentshield-sdk');
const client = shieldAnthropicClient(new Anthropic(), { blockOnThreat: true });

// OpenAI SDK
const OpenAI = require('openai');
const { shieldOpenAIClient } = require('agentshield-sdk');
const client = shieldOpenAIClient(new OpenAI(), { blockOnThreat: true });

// OpenAI Agents SDK (@openai/agents, April 2026)
const { Agent, run } = require('@openai/agents');
const { shieldOpenAIAgent } = require('agentshield-sdk');
const { inputGuardrail, outputGuardrail, toolGuardrail } = shieldOpenAIAgent({ blockOnThreat: true });
const agent = new Agent({
  name: 'Assistant',
  instructions: 'You are a helpful assistant',
  inputGuardrails: [inputGuardrail],
  outputGuardrails: [outputGuardrail]
});

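// LangChain (llm and prompt are assumed to be defined elsewhere)
const { LLMChain } = require('langchain/chains');
const { ShieldCallbackHandler } = require('agentshield-sdk');
const chain = new LLMChain({ llm, prompt, callbacks: [new ShieldCallbackHandler()] });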

// Express middleware
const { expressMiddleware } = require('agentshield-sdk');
app.use(expressMiddleware({ blockOnThreat: true }));

// MCP SDK (Model Context Protocol)
const { shieldMCPServer } = require('agentshield-sdk/mcp');
const server = shieldMCPServer(new Server({ name: 'my-server', version: '1.0' }));

// Generic agent wrapper
const { wrapAgent } = require('agentshield-sdk');
const safe = wrapAgent(myAgent, { blockOnThreat: true });

Also available for Python, Go, Rust, and WASM (browsers/edge).

MCP Security

17-layer security middleware for Model Context Protocol servers. Covers attestation, SSRF/path-traversal firewalls, OAuth, rate limiting, circuit breaker, behavioral baselines, ML classification, drift monitoring, and more.

const { MCPGuard } = require('agentshield-sdk/guard');

// One-line setup with presets: minimal | standard | recommended | strict | paranoid
const guard = MCPGuard.fromPreset('recommended');

guard.registerServer('my-server', toolDefinitions, oauthToken);
const result = guard.interceptToolCall('my-server', 'search', { query: input });
// { allowed: true, threats: [], anomalies: [] }

Supply chain scanning for MCP servers (11 CVEs, schema poisoning, SARIF output):

const { SupplyChainScanner } = require('agentshield-sdk/scanner');
const report = new SupplyChainScanner().scanServer({ name: 'server', tools: defs });
const sarif = report.toSARIF(); // CI/CD integration
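For readers unfamiliar with SARIF, the sketch below builds a minimal SARIF 2.1.0 log by hand so the shape consumed by CI tooling is visible. The `toSarif` helper and the finding fields (`ruleId`, `level`, `message`, `uri`) are hypothetical and chosen for illustration; the structure actually emitted by `toSARIF()` may differ.

```javascript
// Minimal hand-built SARIF 2.1.0 log. Input finding fields are hypothetical.
function toSarif(toolName, findings) {
  const ruleIds = [...new Set(findings.map((f) => f.ruleId))];
  return {
    $schema: 'https://json.schemastore.org/sarif-2.1.0.json',
    version: '2.1.0',
    runs: [
      {
        tool: { driver: { name: toolName, rules: ruleIds.map((id) => ({ id })) } },
        results: findings.map((f) => ({
          ruleId: f.ruleId,
          level: f.level, // SARIF levels: 'none' | 'note' | 'warning' | 'error'
          message: { text: f.message },
          locations: [{ physicalLocation: { artifactLocation: { uri: f.uri } } }],
        })),
      },
    ],
  };
}

const sarifLog = toSarif('example-scanner', [
  { ruleId: 'schema-poisoning', level: 'error', message: 'Hidden instruction in tool description', uri: 'tools/search.json' },
]);
console.log(JSON.stringify(sarifLog, null, 2));
```

A file with this shape can be uploaded to code-scanning services that accept SARIF 2.1.0.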

DeepMind AI Agent Trap Defenses

Comprehensive defenses for all 6 categories from Google DeepMind's "AI Agent Traps" research, built from first-principles analysis.

const { TrapDefenseV2 } = require('agentshield-sdk/traps');

const defense = new TrapDefenseV2();

// Content structure analysis (hidden HTML/CSS/ARIA payloads)
defense.structureAnalyzer.analyze(htmlContent);

// Retrieval-time scanning (catches RAG poisoning at query time)
defense.retrievalScanner.scanRetrieval(userQuery, ragResult);

// Few-shot validation (detect poisoned examples)
defense.fewShotValidator.validate(contextExamples);

// Sub-agent spawn gating (block privilege escalation)
defense.spawnGate.validateSpawn(parentPerms, childConfig);

// Escalating scrutiny (detect approval fatigue)
defense.scrutinyEngine.getScrutinyLevel();

// Cross-agent fragment assembly (split-payload attacks)
defense.fragmentAssembler.addFragment(text, source);

All modules: ContentStructureAnalyzer, SourceReputationTracker, RetrievalTimeScanner, FewShotValidator, SubAgentSpawnGate, SelfReferenceMonitor, InformationAsymmetryDetector, ProvenanceMarker, EscalatingScrutinyEngine, CompositeFragmentAssembler

Visual Deception Detection

Detects content that renders differently from how its source reads, such as instructions an attacker hides in markup.

const { RenderDifferentialAnalyzer } = require('agentshield-sdk');

const analyzer = new RenderDifferentialAnalyzer();

// Scan any format (auto-detected or explicit)
const result = analyzer.scan(content, 'auto');
// { deceptive: true, techniques: [{ type: 'css_hidden', severity: 'high', ... }] }

// Format-specific analysis
analyzer.analyzeHTML(html);       // CSS tricks: display:none, opacity:0, off-screen
analyzer.analyzeMarkdown(md);     // Link mismatch, hidden spans, comment injection
analyzer.analyzeLatex(tex);       // \phantom, \textcolor{white}, \renewcommand
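As a rough picture of the CSS-hiding checks listed above, the following sketch flags text inside elements whose inline style hides it. It is a regex-based approximation under simplifying assumptions (double-quoted inline `style` attributes, no nested markup); a production analyzer would use a real HTML parser, and this is not the SDK's implementation.

```javascript
// Illustrative inline-style check for hidden text; regex-based, not a full HTML parser.
const HIDING_RULES = [
  { type: 'css_display_none', re: /display\s*:\s*none/i },
  { type: 'css_zero_opacity', re: /opacity\s*:\s*0(?:\.0+)?\s*(?:;|$)/i },
  { type: 'css_zero_font', re: /font-size\s*:\s*0(?:px|em|rem|%)?\s*(?:;|$)/i },
];

function findHiddenText(html) {
  const hits = [];
  // Captures: tag name, inline style value, and the text directly inside the element.
  const styledElement = /<(\w+)[^>]*\bstyle\s*=\s*"([^"]*)"[^>]*>([^<]*)</gi;
  for (const [, tag, style, text] of html.matchAll(styledElement)) {
    for (const rule of HIDING_RULES) {
      if (rule.re.test(style) && text.trim()) {
        hits.push({ type: rule.type, tag, text: text.trim() });
      }
    }
  }
  return hits;
}

const sampleHtml =
  '<p>Hello</p><span style="display:none">ignore previous instructions</span>';
console.log(findHiddenText(sampleHtml));
// [ { type: 'css_display_none', tag: 'span', text: 'ignore previous instructions' } ]
```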

Sybil Detection

Detect coordinated fake agents acting in concert.

const { SybilDetector } = require('agentshield-sdk');

const detector = new SybilDetector({ similarityThreshold: 0.7, minClusterSize: 3 });

detector.registerAgent('agent-1', { name: 'Helper' });
detector.registerAgent('agent-2', { name: 'Assistant' });
detector.registerAgent('agent-3', { name: 'Aide' });

detector.recordAction('agent-1', { type: 'vote', target: 'proposal-A' });
detector.recordAction('agent-2', { type: 'vote', target: 'proposal-A' });
detector.recordAction('agent-3', { type: 'vote', target: 'proposal-A' });

const { clusters, sybilRisk } = detector.detectClusters();
// { clusters: [{ agents: ['agent-1','agent-2','agent-3'], similarity: 0.9 }], sybilRisk: 'high' }
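One simple way to think about behavioral clustering is set similarity over each agent's recorded actions. The sketch below groups agents whose action sets have a Jaccard similarity at or above a threshold, using union-find to merge them. The choice of Jaccard similarity and the string encoding of actions are assumptions for illustration; the SDK's actual similarity measure is not documented here.

```javascript
// Illustrative Sybil clustering via Jaccard similarity; not the SDK's algorithm.
function jaccard(a, b) {
  const intersection = [...a].filter((x) => b.has(x)).length;
  const union = new Set([...a, ...b]).size;
  return union === 0 ? 0 : intersection / union;
}

function findClusters(actionsByAgent, threshold = 0.7, minClusterSize = 3) {
  const ids = Object.keys(actionsByAgent);
  const sets = Object.fromEntries(ids.map((id) => [id, new Set(actionsByAgent[id])]));

  // Union-find: merge every pair of agents that behave similarly enough.
  const parent = Object.fromEntries(ids.map((id) => [id, id]));
  const find = (x) => (parent[x] === x ? x : (parent[x] = find(parent[x])));
  for (let i = 0; i < ids.length; i++) {
    for (let j = i + 1; j < ids.length; j++) {
      if (jaccard(sets[ids[i]], sets[ids[j]]) >= threshold) {
        parent[find(ids[i])] = find(ids[j]);
      }
    }
  }

  const groups = {};
  for (const id of ids) (groups[find(id)] ||= []).push(id);
  return Object.values(groups).filter((g) => g.length >= minClusterSize);
}

const sybilClusters = findClusters({
  'agent-1': ['vote:proposal-A'],
  'agent-2': ['vote:proposal-A'],
  'agent-3': ['vote:proposal-A'],
  'agent-4': ['vote:proposal-B'],
});
console.log(sybilClusters); // [ [ 'agent-1', 'agent-2', 'agent-3' ] ]
```

The pairwise comparison is quadratic in the number of agents, which is fine for a sketch but would need indexing or sampling at scale.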

Side-Channel Monitoring

Detect data exfiltration via covert channels.

const { SideChannelMonitor, BeaconDetector } = require('agentshield-sdk');

const monitor = new SideChannelMonitor();

// DNS exfiltration (high-entropy subdomains, base64 labels)
monitor.analyzeDNSQuery('aGVsbG8gd29ybGQ.attacker.com');

// Timing-based exfiltration (binary encoding in delays)
monitor.analyzeTimingPattern(timestamps);

// URL parameter exfiltration
monitor.analyzeURLParams('https://evil.com/log?d=c2VjcmV0');

// C2 beaconing detection
const beacon = new BeaconDetector();
beacon.addEvent(t1); beacon.addEvent(t2); beacon.addEvent(t3);
beacon.detectBeaconing(); // { beaconing: true, interval: 60000, confidence: 0.85 }
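Two of the signals named above have simple textbook forms: Shannon entropy of a DNS label, and regularity of inter-event gaps for beaconing. The sketch below shows both. The 0.1 coefficient-of-variation cutoff and the return shape are assumptions for illustration and do not reflect the SDK's thresholds or confidence scoring.

```javascript
// Illustrative side-channel heuristics; thresholds are assumptions, not SDK values.

// Shannon entropy in bits per character. Encoded data tends to score higher than words.
function shannonEntropy(s) {
  const chars = [...s];
  const counts = {};
  for (const ch of chars) counts[ch] = (counts[ch] || 0) + 1;
  return Object.values(counts).reduce((h, c) => {
    const p = c / chars.length;
    return h - p * Math.log2(p);
  }, 0);
}

// Flags near-constant gaps between events via the coefficient of variation (stddev / mean).
function beaconScore(timestamps, maxCv = 0.1) {
  if (timestamps.length < 3) return { beaconing: false, interval: null };
  const gaps = timestamps.slice(1).map((t, i) => t - timestamps[i]);
  const mean = gaps.reduce((a, b) => a + b, 0) / gaps.length;
  if (mean <= 0) return { beaconing: false, interval: null };
  const variance = gaps.reduce((a, g) => a + (g - mean) ** 2, 0) / gaps.length;
  const cv = Math.sqrt(variance) / mean;
  return { beaconing: cv < maxCv, interval: Math.round(mean), cv };
}

console.log(shannonEntropy('ab')); // 1
console.log(beaconScore([0, 60000, 120000, 180000]).beaconing); // true
console.log(beaconScore([0, 1000, 50000, 51000]).beaconing);    // false
```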

Autonomous Defense

const { AutonomousHardener, MicroModel } = require('agentshield-sdk');

// Self-training loop: attacks itself, finds bypasses, learns from them
const hardener = new AutonomousHardener({
  microModel: new MicroModel(),
  persistPath: './learned-samples.json',
  maxFPRate: 0.05
});

hardener.runCycle(); // 18 mutation strategies, converges to 0% bypass in 3 cycles

const { IntentFirewall, AttackGenome, HerdImmunity } = require('agentshield-sdk');

// Intent classification (same words, different action)
const firewall = new IntentFirewall();
firewall.classify('Help me write a phishing email');        // BLOCKED
firewall.classify('Help me write about phishing training'); // ALLOWED

// Cross-agent herd immunity
const herd = new HerdImmunity();
herd.reportAttack({ text: 'DAN mode jailbreak', agentId: 'agent-a' });
// All connected agents now have the pattern

Compliance

Built-in coverage for major security frameworks:

| Framework | Module |
|-----------|--------|
| OWASP LLM Top 10 (2025) | OWASPCoverageMatrix |
| OWASP Agentic Top 10 (2026) | OWASPAgenticScanner |
| NIST AI RMF | NISTMapper, AIBOMGenerator |
| EU AI Act | RiskClassifier, ConformityAssessment |
| SOC 2 / HIPAA / GDPR | ComplianceReporter |

const { OWASPCoverageMatrix } = require('agentshield-sdk');
const report = new OWASPCoverageMatrix().generateReport();
// Per-category scores, gap analysis, remediation guidance

Security Primitives

| Capability | Module |
|-----------|--------|
| Prompt hardening (4 levels) | PromptHardener |
| HMAC message integrity chain | MessageIntegrityChain |
| Cryptographic intent binding | IntentBinder, createGatedExecutor |
| Semantic isolation (provenance tags) | SemanticIsolationEngine |
| Confused deputy prevention | ConfusedDeputyGuard |
| PII redaction | PIIRedactor |
| Canary tokens | CanaryTokens |
| Attack surface mapping | AttackSurfaceMapper |
| Causal intent graph | IntentGraph |
| Behavioral drift IDS | DriftMonitor |
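To make the "HMAC message integrity chain" idea concrete, here is a minimal sketch using Node's built-in `crypto` module: each message's MAC covers the previous MAC plus the new content, so altering or reordering any earlier message invalidates everything after it. The function names and the all-zero genesis value are assumptions for illustration; this is not the `MessageIntegrityChain` implementation.

```javascript
// Illustrative HMAC chain; names and genesis value are assumptions, not the SDK's API.
const crypto = require('node:crypto');

const GENESIS = '0'.repeat(64); // fixed-length stand-in for "no previous MAC"

function macFor(key, prevMac, content) {
  return crypto.createHmac('sha256', key).update(prevMac).update(content).digest('hex');
}

function appendMessage(chain, key, content) {
  const prevMac = chain.length ? chain[chain.length - 1].mac : GENESIS;
  return [...chain, { content, mac: macFor(key, prevMac, content) }];
}

function verifyChain(chain, key) {
  let prevMac = GENESIS;
  for (const { content, mac } of chain) {
    const expected = Buffer.from(macFor(key, prevMac, content), 'hex');
    const actual = Buffer.from(mac, 'hex');
    // timingSafeEqual throws on length mismatch, so check lengths first.
    if (actual.length !== expected.length || !crypto.timingSafeEqual(actual, expected)) {
      return false;
    }
    prevMac = mac;
  }
  return true;
}

const chainKey = crypto.randomBytes(32);
let messageChain = appendMessage([], chainKey, 'user: summarize this file');
messageChain = appendMessage(messageChain, chainKey, 'tool: file contents');
console.log(verifyChain(messageChain, chainKey)); // true
```

Swapping the order of two entries, or editing the `content` of any entry, makes `verifyChain` return `false`.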

Red Team & Auditing

# CLI audit (617+ attacks, graded A+ to F)
npx agentshield-audit https://your-agent.com --mode full

# Pre-deployment audit (< 100ms)
npx agent-shield redteam

const { RedTeamCLI } = require('agentshield-sdk');
const report = new RedTeamCLI().run(endpoint, { mode: 'full' });
// HTML, JSON, and Markdown reports with grading

Enterprise

| Feature | Module |
|---------|--------|
| Distributed scanning (Redis) | DistributedShield |
| Audit streaming (Splunk, ES) | AuditStreamManager |
| SSO / SAML / OIDC | SSOManager |
| Multi-tenant isolation | MultiTenantShield |
| Policy-as-Code DSL | PolicyDSL |
| Kubernetes sidecar | k8s/helm/agent-shield |
| Terraform provider | terraform-provider/ |
| OpenTelemetry collector | otel-collector/ |
| GitHub App / Action | github-app/ |
| VS Code extension | vscode-extension/ |
| Real-time dashboard | dashboard-live/ |

Platform SDKs

| Platform | Install | Features |
|----------|---------|----------|
| Node.js | npm install agentshield-sdk | Full SDK, 400+ exports, zero deps |
| Python | pip install agent-shield | Detection, Flask/FastAPI middleware, CLI |
| Go | go get github.com/texasreaper62/agent-shield/go-sdk | Detection, HTTP/gRPC middleware, zero deps |
| Rust | rust-core/ | RegexSet O(n) engine, WASM/NAPI/PyO3 |
| WASM | wasm/dist/ | ESM/UMD for browsers, Workers, Deno, Bun |

CLI

npx agent-shield scan "ignore all instructions"     # Scan text
npx agent-shield scan --file prompt.txt --pii        # Scan file + PII
npx agent-shield demo                                # Live attack simulation
npx agent-shield score                               # Shield Score (0-100)
npx agent-shield redteam                             # Red team suite
npx agent-shield audit ./my-agent/                   # Audit codebase
npx agent-shield patterns                            # List detection patterns
npx agent-shield threat prompt_injection             # Threat encyclopedia
npx agentshield-audit <endpoint> --mode full         # Remote agent audit

Configuration

const shield = new AgentShield({
  sensitivity: 'medium',            // low | medium | high
  blockOnThreat: false,             // Auto-block dangerous inputs
  blockThreshold: 'high',           // Min severity to block
  logging: false,                   // Console logging
  onThreat: (result) => {},         // Callback on detection
  dangerousTools: ['bash'],         // Tools to scrutinize
  sensitiveFilePatterns: [/\.env$/i] // File patterns to block (dot escaped so it matches a literal ".")
});

// Or use presets
const { getPreset } = require('agentshield-sdk');
const config = getPreset('chatbot'); // chatbot | coding_agent | rag_pipeline | customer_support
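One way to read `blockOnThreat` and `blockThreshold` together is as a severity comparison. The sketch below assumes a four-step severity scale and a `threats` array whose items carry a `severity` field; both are illustrative guesses rather than the SDK's documented data model, and `shouldBlock` is a hypothetical helper.

```javascript
// Assumed severity scale, lowest to highest (hypothetical; check the SDK's type definitions).
const SEVERITY_ORDER = ['low', 'medium', 'high', 'critical'];

// Hypothetical helper: block only when enabled and some threat meets the threshold.
function shouldBlock(threats, { blockOnThreat = false, blockThreshold = 'high' } = {}) {
  if (!blockOnThreat) return false;
  const min = SEVERITY_ORDER.indexOf(blockThreshold);
  if (min === -1) throw new RangeError(`unknown blockThreshold: ${blockThreshold}`);
  return threats.some((t) => SEVERITY_ORDER.indexOf(t.severity) >= min);
}

const detected = [{ category: 'prompt_injection', severity: 'medium' }];
console.log(shouldBlock(detected, { blockOnThreat: true, blockThreshold: 'high' }));   // false
console.log(shouldBlock(detected, { blockOnThreat: true, blockThreshold: 'medium' })); // true
```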

Testing

npm test                  # Core + module tests
npm run test:all          # Full 40-feature suite
npm run test:full         # All test suites combined
npm run test:fp           # False positive accuracy (100%)
npm run redteam           # Attack simulation (100% detection)
npm run score             # Shield Score (100/100 A+)
npm run benchmark         # Performance benchmarks

3,400+ test assertions across 22 test suites, plus Python and VS Code extension tests.

Project Structure

src/                  100+ modules, 400+ exports (zero dependencies)
python-sdk/           Python SDK with Flask/FastAPI middleware
go-sdk/               Go SDK with HTTP/gRPC middleware
rust-core/            Rust high-perf engine (WASM/NAPI/PyO3)
wasm/                 Browser/edge bundles
dashboard-live/       Real-time WebSocket dashboard
github-app/           GitHub PR scanner & Action
benchmark-registry/   Standardized benchmark suite
k8s/                  Kubernetes operator + Helm chart
terraform-provider/   Terraform policy-as-code
otel-collector/       OpenTelemetry receiver & processor
vscode-extension/     VS Code inline diagnostics
research/             Attack research & threat intelligence
test/                 22 test suites
examples/             Quick start guides
types/                TypeScript definitions

CI/CD

GitHub Actions workflow at .github/workflows/ci.yml runs all tests across Node.js 18, 20, and 22 on every push and PR.

Why Free?

Agent Shield started as a paid SDK with Pro and Enterprise tiers. We removed all gating in v9.0. Every feature, including ML detection, compliance reporting, MCP security, and CORTEX autonomous defense, is now free and open source.

Security shouldn't have a paywall. If your agent is vulnerable, it doesn't matter what tier you're on.

Privacy

All detection runs locally. No data is sent to any external service. No API keys required. No cloud dependencies.

License

MIT -- see LICENSE.