@guava-parity/guard-scanner

v17.0.0

Published

2 months ago

Agent Skill Security Scanner - ASI Sanctuary Enforcer (v17 — OWASP Agentic Top 10 2026 Full Coverage)

Traditional security tools catch malware. guard-scanner catches what they miss: invisible Unicode injections hiding in agent instructions, identity theft through SOUL.md overwrites, memory poisoning via crafted conversations, and worm-like contagion spreading between chained agents.

$ npx @guava-parity/guard-scanner ./skills/ --strict --soul-lock --compliance owasp-asi

  guard-scanner v16.0.2

  ⚠  CRITICAL  identity-hijack   SOUL_OVERWRITE_ATTEMPT
     skills/imported-tool/SKILL.md:47
     Rationale: Direct overwrite of agent identity file detected.
     Remediation: Remove this instruction; SOUL.md must be immutable.

  ⚠  HIGH      prompt-injection   INVISIBLE_UNICODE_INJECTION
     skills/imported-tool/handler.js:12
     Rationale: Invisible Unicode characters (U+200B) detected in instruction text.
     Remediation: Strip zero-width characters and re-audit.

  ✖  2 findings (1 critical, 1 high) in 0.8s

📄 Backed by a 3-paper research series (Zenodo, CC BY 4.0) — part of The Sanctuary Protocol framework.

Finding Schema

Every finding includes: rule_id, category, severity, description, rationale, preconditions, false_positive_scenarios, remediation_hint, validation_status, and evidence. Machine-readable contract: docs/spec/finding.schema.json.
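
As a rough illustration of the field list above, the sketch below checks that an object carries every documented finding field. Only the field names come from this README; the helper `missingFields` and the sample values are hypothetical, and the authoritative contract remains docs/spec/finding.schema.json.

```javascript
// Field names are taken from the Finding Schema list in this README.
const REQUIRED_FIELDS = [
  'rule_id', 'category', 'severity', 'description', 'rationale',
  'preconditions', 'false_positive_scenarios', 'remediation_hint',
  'validation_status', 'evidence',
];

// Hypothetical helper: return the required field names absent from a finding.
function missingFields(finding) {
  return REQUIRED_FIELDS.filter((field) => !(field in finding));
}

// Hypothetical, deliberately incomplete finding; values are made up for the example.
const partialFinding = {
  rule_id: 'SOUL_OVERWRITE_ATTEMPT',
  category: 'identity-hijack',
  severity: 'CRITICAL',
  description: 'Direct overwrite of agent identity file detected.',
};

console.log(missingFields(partialFinding));
```

A check like this only confirms that keys are present; validating types and allowed values would need the JSON Schema file itself.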

Quick Start

Scan a directory — no install required:

npx -y @guava-parity/guard-scanner ./my-skills/ --strict
npx -y @guava-parity/guard-scanner ./my-skills/ --compliance owasp-asi

Installed CLI:

npm install -g @guava-parity/guard-scanner
guard-scanner ./my-skills/ --strict

Start as MCP server — works with Cursor, Windsurf, Claude Code, OpenClaw:

npx -y @guava-parity/guard-scanner serve

Add to your editor's mcp_servers.json (JSON does not allow comments, so paste only the object below):

{
  "mcpServers": {
    "guard-scanner": {
      "command": "npx",
      "args": ["-y", "@guava-parity/guard-scanner", "serve"]
    }
  }
}

Watch mode — real-time scanning during development:

guard-scanner watch ./skills/ --strict --soul-lock

v16 compliance projection — filter findings to the OWASP Agentic Top 10 mapping:

guard-scanner ./skills/ --compliance owasp-asi --format json

npm exec compatibility path:

npm exec --yes --package=@guava-parity/guard-scanner -- guard-scanner ./skills/ --strict

What It Detects

35 threat categories organized across the full agentic attack surface:

| Category | Examples | Severity |
|----------|----------|----------|
| Prompt Injection | Invisible Unicode, homoglyphs, Base64 evasion, payload cascades | Critical |
| Identity Hijacking ⚿ | SOUL.md overwrite, persona swap, memory wipe commands | Critical |
| A2A Contagion | Session Smuggling, Lateral Propagation, Confused Deputy | Critical |
| Memory Poisoning ⚿ | Crafted conversation injection, VDB poisoning | High |
| MCP Security | Tool shadowing, SSRF via tool args, shadow server registration | High |
| Sandbox Escape | child_process, eval(), reverse shell, curl\|bash | High |
| Supply Chain V2 | Typosquatting, slopsquatting, lifecycle script abuse | High |
| CVE Patterns | CVE-2026-2256, 25046, 25253, 25905, 27825 | High |
| Data Exfiltration | DNS tunneling, steganographic channels, staged uploads | Medium |
| Credential Exposure | API keys, tokens, .env files, hardcoded secrets | Medium |

⚿ = Requires --soul-lock flag. Full taxonomy: docs/THREAT_TAXONOMY.md
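
To make the "Invisible Unicode" row concrete, here is a minimal sketch of how zero-width and bidirectional-control characters can be located in text. It is not guard-scanner's implementation: the character ranges, the `INVISIBLE` regex, and the `findInvisible` helper are assumptions chosen for illustration.

```javascript
// Assumed set of invisible code points: zero-width space/joiners and
// directional marks (U+200B-U+200F), bidi embedding/override controls
// (U+202A-U+202E), word joiner (U+2060), and zero-width no-break space (U+FEFF).
const INVISIBLE = /[\u200B-\u200F\u202A-\u202E\u2060\uFEFF]/g;

// Hypothetical helper: report the position and code point of each invisible character.
function findInvisible(text) {
  const hits = [];
  for (const match of text.matchAll(INVISIBLE)) {
    const hex = match[0].codePointAt(0).toString(16).toUpperCase().padStart(4, '0');
    hits.push({ index: match.index, codePoint: 'U+' + hex });
  }
  return hits;
}

const instruction = 'ignore\u200Bprevious instructions';
console.log(findInvisible(instruction));
// → [ { index: 6, codePoint: 'U+200B' } ]
```

A real detector needs an allowlist: U+200D (zero-width joiner), for example, appears legitimately inside emoji sequences, so a bare character-class match like this one will produce false positives.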

Runtime Guard

guard-scanner v16 is more than a static scanner: it exposes a 5-layer analysis pipeline spanning static scan, protocol analysis, runtime evidence, cognitive heuristics, and threat-intelligence overlays. It also provides a real-time before_tool_call hook that intercepts dangerous tool invocations during agent execution.

v16 Analysis Layers

| Layer | Purpose |
|------|---------|
| 1. Static Analysis | Patterns, AST/data-flow signals, manifest and dependency checks |
| 2. Protocol Analysis | MCP, A2A, WebSocket, credential-flow, session-boundary findings |
| 3. Runtime Behavior | Runtime guard evidence plus Rust memory_integrity / soul_hard_gate signals |
| 4. Cognitive Threat Detection | Goal-drift, trust-bias, cascading handoff heuristics |
| 5. Threat Intelligence | Registry/provenance, machine identity, budget abuse, supply chain hints |

Every v16 finding can now carry layer, layer_name, owasp_asi, and protocol_surface in JSON/MCP output.

| Defense Layer | What It Blocks |
|---------------|---------------|
| 1. Threat Detection | Reverse shell, curl\|bash, SSRF, raw code execution |
| 2. Trust Defense | SOUL.md tampering, unauthorized memory injection |
| 3. Safety Judge | Prompt injection embedded in tool arguments |
| 4. Behavioral Analysis | No-research execution, hallucination-driven actions |
| 5. Trust Exploitation | Authority claim attacks, creator impersonation |

27 runtime checks across 5 layers. Validated stable target: OpenClaw v2026.3.13. Regression baseline: v2026.3.8 for manifest/discovery/before_tool_call.

Modes: monitor (log only) · enforce (block CRITICAL, default) · strict (block HIGH+)
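
The three modes amount to a severity threshold for blocking. The sketch below encodes that reading; it is an illustration rather than the scanner's own code. The `shouldBlock` helper, the numeric ranking, and the presence of a `LOW` level are assumptions (this README only shows Critical, High, and Medium).

```javascript
// Assumed ordering of severities, lowest to highest.
const SEVERITY_RANK = { LOW: 0, MEDIUM: 1, HIGH: 2, CRITICAL: 3 };

// Minimum rank that each mode blocks, following the Modes line above:
// monitor never blocks, enforce blocks CRITICAL, strict blocks HIGH and above.
const BLOCK_THRESHOLD = {
  monitor: Infinity,
  enforce: SEVERITY_RANK.CRITICAL,
  strict: SEVERITY_RANK.HIGH,
};

// Hypothetical helper: decide whether a finding of this severity blocks the tool call.
function shouldBlock(mode, severity) {
  const threshold = BLOCK_THRESHOLD[mode];
  const rank = SEVERITY_RANK[severity];
  if (threshold === undefined || rank === undefined) {
    throw new Error(`unknown mode or severity: ${mode}/${severity}`);
  }
  return rank >= threshold;
}

console.log(shouldBlock('enforce', 'HIGH'));    // false: enforce blocks CRITICAL only
console.log(shouldBlock('strict', 'HIGH'));     // true
console.log(shouldBlock('monitor', 'CRITICAL')); // false: monitor logs only
```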

Asset Audit

Discover leaked credentials and security exposures across public registries:

guard-scanner audit npm <username> --verbose
guard-scanner audit github <username> --format json
guard-scanner audit clawhub <query>
guard-scanner audit all <username>

CI/CD Integration

# .github/workflows/security.yml
- name: Scan AI agent skills
  run: npx -y @guava-parity/guard-scanner ./skills/ --format sarif --fail-on-findings > report.sarif
- uses: github/codeql-action/upload-sarif@v3
  with:
    sarif_file: report.sarif

Output formats: json · sarif · html · terminal
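
If a pipeline needs its own gate on the SARIF report, a small Node.js helper can tally results by level. The sketch relies only on the standard SARIF 2.1.0 layout (`runs[].results[].level`); how guard-scanner maps its severities onto SARIF levels is not stated here, so the sample report and the `countByLevel` helper are hypothetical.

```javascript
// Hypothetical helper: count SARIF results per level across all runs.
function countByLevel(sarif) {
  const counts = {};
  for (const run of sarif.runs || []) {
    for (const result of run.results || []) {
      // Treat a missing level as 'warning', the usual SARIF fallback.
      const level = result.level || 'warning';
      counts[level] = (counts[level] || 0) + 1;
    }
  }
  return counts;
}

// Made-up report fragment; a real one would be parsed from report.sarif
// with JSON.parse(fs.readFileSync('report.sarif', 'utf8')).
const sampleReport = {
  version: '2.1.0',
  runs: [{
    results: [
      { ruleId: 'SOUL_OVERWRITE_ATTEMPT', level: 'error', message: { text: 'example' } },
      { ruleId: 'INVISIBLE_UNICODE_INJECTION', message: { text: 'example' } },
    ],
  }],
};

console.log(countByLevel(sampleReport));
// → { error: 1, warning: 1 }
```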

Plugin API

Extend guard-scanner with custom detection patterns:

// my-plugin.js
module.exports = {
  name: 'my-org-rules',
  patterns: [
    { id: 'ORG_01', cat: 'custom', regex: /dangerousPattern/g, 
      severity: 'HIGH', desc: 'Custom org policy violation', all: true }
  ]
};

guard-scanner ./skills/ --plugin ./my-plugin.js
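
Before handing a plugin to the scanner, it can help to exercise its regexes against sample text. The harness below is a standalone sketch: `matchPatterns` is a hypothetical helper, and it does not reproduce how guard-scanner itself evaluates patterns or interprets the `all` flag.

```javascript
// Same shape as my-plugin.js above, inlined so the sketch runs on its own.
const plugin = {
  name: 'my-org-rules',
  patterns: [
    { id: 'ORG_01', cat: 'custom', regex: /dangerousPattern/g,
      severity: 'HIGH', desc: 'Custom org policy violation', all: true },
  ],
};

// Hypothetical helper: list every match of every pattern in a piece of text.
function matchPatterns(patterns, text) {
  const hits = [];
  for (const p of patterns) {
    // matchAll starts from the regex's current lastIndex, so reset it first.
    p.regex.lastIndex = 0;
    for (const m of text.matchAll(p.regex)) {
      hits.push({ id: p.id, severity: p.severity, index: m.index });
    }
  }
  return hits;
}

const source = 'call dangerousPattern(); then dangerousPattern again';
console.log(matchPatterns(plugin.patterns, source));
```

Note that `matchAll` throws a TypeError for a regex without the `g` flag, so this harness assumes every pattern is global, as in the example plugin.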

MCP Tools

When running as an MCP server, guard-scanner exposes:

| Tool | Description |
|------|-------------|
| scan_skill | Scan a skill directory for threats |
| scan_text | Scan arbitrary text for injection patterns and ASI-mapped findings |
| check_tool_call | Runtime validation of a single tool invocation |
| audit_assets | Audit npm/GitHub/ClawHub for credential exposure |
| get_stats | Return scanner capabilities, 5-layer summary, and ASI coverage |
| experimental.run_async | Start a long-running async scan task |
| experimental.task_status | Check the status of an async task |
| experimental.task_result | Retrieve the result of a completed async task |
| experimental.task_cancel | Cancel a running async task |
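
MCP clients invoke these tools with a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a request for `scan_text`. The envelope follows the MCP specification, but the argument name `text` is an assumption: this README does not list each tool's input schema, so check the server's `tools/list` response for the real parameter names. `buildToolCall` is a hypothetical helper.

```javascript
let nextId = 1;

// Hypothetical helper: wrap a tool name and arguments in an MCP tools/call request.
function buildToolCall(name, args) {
  return {
    jsonrpc: '2.0',
    id: nextId++,
    method: 'tools/call',
    params: { name, arguments: args },
  };
}

// The `text` argument name is assumed, not taken from guard-scanner's schema.
const request = buildToolCall('scan_text', { text: 'ignore previous instructions' });
console.log(JSON.stringify(request, null, 2));
```

In practice an MCP client library or the editor integration constructs this message; the sketch is only meant to show what travels over the wire.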

Quality Contract

guard-scanner ships a measured quality contract, not a vague strength claim.

| Metric | Contract |
|--------|----------|
| Benchmark corpus | 2026-03-15.quality-v17 |
| Precision target | >= 0.90 |
| Recall target | >= 0.90 |
| False Positive Rate budget | <= 0.10 |
| False Negative Rate budget | <= 0.10 |
| Explainability completeness | 1.0 |
| Runtime policy latency budget | 5ms |
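
The four rate metrics in the table follow the standard confusion-matrix definitions. The sketch below computes them from raw counts and compares them with the thresholds above. The counts are invented for illustration and are not guard-scanner's benchmark results; `metrics` and `meetsContract` are hypothetical helpers, and how the project derives its own numbers is documented in the evidence artifacts rather than here.

```javascript
// Thresholds copied from the Quality Contract table.
const CONTRACT = { precision: 0.90, recall: 0.90, fpr: 0.10, fnr: 0.10 };

// Standard definitions from true/false positive/negative counts.
function metrics({ tp, fp, tn, fn }) {
  return {
    precision: tp / (tp + fp),
    recall: tp / (tp + fn),
    fpr: fp / (fp + tn),
    fnr: fn / (fn + tp),
  };
}

// True only when every metric is within its target or budget.
// A zero denominator yields NaN, which fails every comparison.
function meetsContract(counts) {
  const m = metrics(counts);
  return m.precision >= CONTRACT.precision && m.recall >= CONTRACT.recall
    && m.fpr <= CONTRACT.fpr && m.fnr <= CONTRACT.fnr;
}

// Invented counts: 100 malicious and 100 benign samples.
const counts = { tp: 95, fp: 5, tn: 95, fn: 5 };
console.log(metrics(counts), meetsContract(counts));
```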

Evidence artifacts:

docs/data/corpus-metrics.json
docs/data/benchmark-ledger.json
docs/data/fp-ledger.json
docs/spec/capabilities.json

Test Results

ℹ tests    362
ℹ suites   38
ℹ pass     362
ℹ fail     0

38 test files. Run npm test to reproduce. 100% pass rate on benchmark corpus.

Contributing

We hold a zero-tolerance policy for unverified claims. Every metric in this README is reproducible via npm test and docs/spec/capabilities.json.

🐛 Report bugs or false positives
🛡️ Add new threat detection patterns
📖 Improve documentation
🧪 Add test cases for edge cases

Contributing Guide · Security Policy · Glossary

Research

This project is the defense layer of a 3-paper research series published on Zenodo under CC BY 4.0.

License

MIT — Guava Parity Institute