skillshield

v2.1.0

Runtime security for AI Agent Skills — Scan, sandbox & enforce. Detect prompt injection, memory poisoning, supply chain attacks. 72+ patterns, 14 categories. The firewall Snyk and Cisco don't build.

  ███████╗██╗  ██╗██╗██╗     ██╗     ███████╗██╗  ██╗██╗███████╗██╗     ██████╗
  ██╔════╝██║ ██╔╝██║██║     ██║     ██╔════╝██║  ██║██║██╔════╝██║     ██╔══██╗
  ███████╗█████╔╝ ██║██║     ██║     ███████╗███████║██║█████╗  ██║     ██║  ██║
  ╚════██║██╔═██╗ ██║██║     ██║     ╚════██║██╔══██║██║██╔══╝  ██║     ██║  ██║
  ███████║██║  ██╗██║███████╗███████╗███████║██║  ██║██║███████╗███████╗██████╔╝
  ╚══════╝╚═╝  ╚═╝╚═╝╚══════╝╚══════╝╚══════╝╚═╝  ╚═╝╚═╝╚══════╝╚══════╝╚═════╝

Runtime Security for AI Agent Skills — Scan, Sandbox & Enforce.

The first open-source tool that scans AND stops malicious AI skills at runtime. Network interception, filesystem jail, kill switch, and cryptographic audit trail — in one developer-first CLI.

The Problem

"The industry has invested in watching. It hasn't invested in stopping." — Bessemer Venture Partners

Every existing tool for AI skill security does the same thing: scan before install, then hope for the best. Snyk agent-scan, Cisco skill-scanner, VirusTotal — they all stop at detection. Once a skill passes their checks (or bypasses them), there's zero protection at runtime.

Meanwhile: 36% of ClawHub skills have security flaws. 12% are actual malware. And the most dangerous attacks — sleeper agents, time-delayed exfiltration, polymorphic payloads — are invisible to pre-install scanners.

The Solution: SkillShield

SkillShield is the first tool that combines pre-execution scanning with runtime enforcement in a single CLI. It doesn't just detect threats — it prevents them from executing.

npm install -g skillshield

# Scan a skill (72+ patterns, 14 threat categories)
skillshield scan suspicious-skill.md

# Scan + Shield + Execute (the full pipeline)
skillshield run my-skill.md --input "Hello world"

# Save cryptographic audit trail for compliance
skillshield run my-skill.md --audit-file trail.json

How It Works

┌──────────────────────────────────────────────────────────────┐
│                    skillshield run                            │
├──────────┬───────────────────┬──────────────┬────────────────┤
│  PHASE 1 │     PHASE 2       │   PHASE 3    │    PHASE 4     │
│  SCAN    │     SHIELD        │   EXECUTE    │    REPORT      │
│          │                   │              │                │
│ 72+      │ Network Policy    │ Enforcement  │ Shield Report  │
│ patterns │ Filesystem Jail   │ wrapper      │ Audit chain    │
│ 14 cats  │ Kill Switch       │ injected     │ Violations     │
│          │ Audit Trail       │              │ Chain hash     │
└──────────┴───────────────────┴──────────────┴────────────────┘

Phase 1: Pre-Scan (SkillGuard)

72+ regex patterns across 14 threat categories — including 3 categories nobody else detects:

| Category | Patterns | What It Catches |
|----------|----------|-----------------|
| Memory Poisoning | 7 | SOUL.md/MEMORY.md manipulation, sleeper agents, cross-session persistence |
| Prompt Injection | 6 | Ignore instructions, fake [SYSTEM] tags, context reset, privilege escalation |
| Sensitive Data | 10 | OpenAI/Anthropic/AWS/Groq keys, JWT tokens, private keys, SSNs, credit cards |
| Supply Chain | 6 | npm/pip install in skills, pipe-to-shell, postinstall hooks, remote imports |
| Code Injection | 8 | eval(), exec(), spawn(), dynamic require, innerHTML, child_process |
| Data Exfiltration | 8 | fetch POST, XMLHttpRequest, curl, wget, sendBeacon, cloud storage copy |
| Credential Theft | 7 | process.env, .ssh/.aws files, .env files, hardcoded passwords, git credentials |
| File System Abuse | 7 | rm -rf, chmod, disk destruction, fs.writeFile to system paths |
| Crypto Mining | 4 | Mining pools, wallet addresses, coinhive, WebWorker mining |
| Keylogger | 4 | keydown/keyup listeners, clipboard access, keyboard simulation |
| Obfuscation | 4 | Base64 decode, String.fromCharCode, hex/unicode escapes |
| Network Abuse | 4 | Port scanning, DNS exfiltration, SSRF, SSH/Telnet |
| Privilege Escalation | 2 | sudo/su, SUID/SGID bits |
| Malware | 4 | Reverse shells, fork bombs, encoded PowerShell, exploitation frameworks |
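To make the pre-scan idea concrete, here is a minimal sketch of a line-by-line regex scanner. The `ThreatPattern` shape, the `scanSkill` function, and the two sample patterns are hypothetical illustrations, not the rules shipped in src/guard/patterns.ts.

```typescript
// Hypothetical pattern shape; the real definitions live in src/guard/patterns.ts.
interface ThreatPattern {
  id: string;
  category: string;
  regex: RegExp; // no "g" flag, so RegExp.test() stays stateless
}

interface Finding {
  id: string;
  category: string;
  line: number; // 1-based line number of the match
}

// Two illustrative patterns only, not the shipped rule set.
const SAMPLE_PATTERNS: ThreatPattern[] = [
  {
    id: "pi-ignore",
    category: "Prompt Injection",
    regex: /ignore (all )?(previous|prior) instructions/i,
  },
  {
    id: "sc-pipe-shell",
    category: "Supply Chain",
    regex: /curl\s+[^|\n]*\|\s*(ba)?sh/i,
  },
];

// Tests every line of the skill source against every pattern.
function scanSkill(
  source: string,
  patterns: ThreatPattern[] = SAMPLE_PATTERNS,
): Finding[] {
  const findings: Finding[] = [];
  source.split("\n").forEach((text, index) => {
    for (const p of patterns) {
      if (p.regex.test(text)) {
        findings.push({ id: p.id, category: p.category, line: index + 1 });
      }
    }
  });
  return findings;
}
```

Reporting the line number alongside the category keeps findings actionable. How SkillShield turns findings into a score is not documented here, so the sketch stops at detection.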

Phase 2: Runtime Shield (The Differentiator)

This is what makes SkillShield unique. Four enforcement layers activate during skill execution:

Network Policy Engine — Default-deny networking. Skills can only reach explicitly allowed domains. Blocks known malicious domains (ngrok.io, webhook.site, requestbin.com) and crypto mining pools. Intercepts dns.lookup and https.request at the Node.js level.

# Only allow specific domains
skillshield run my-skill.md --allow-domains api.openai.com,github.com
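The decision behind default-deny networking can be sketched as a pure function. The blocklist entries come from the description above; the function names and the choice to treat subdomains of an allowed domain as allowed are assumptions, and the real engine applies its policy by hooking dns.lookup and https.request rather than being called directly.

```typescript
// Blocklist entries taken from this README; names below are hypothetical.
const BLOCKED_DOMAINS = ["ngrok.io", "webhook.site", "requestbin.com"];

// True when host equals the domain or is a subdomain of it.
function matchesDomain(host: string, domain: string): boolean {
  return host === domain || host.endsWith("." + domain);
}

// Default deny: a host is reachable only if it is allowlisted and not blocklisted.
function isHostAllowed(hostname: string, allowDomains: string[]): boolean {
  // Normalise case and strip a trailing dot (fully qualified form).
  const host = hostname.toLowerCase().replace(/\.$/, "");
  // The blocklist wins even if the domain was also allowlisted.
  if (BLOCKED_DOMAINS.some((d) => matchesDomain(host, d))) return false;
  return allowDomains.some((d) => matchesDomain(host, d.toLowerCase()));
}
```

Matching on a leading dot avoids the common suffix bug where `notgithub.com` would pass a naive `endsWith("github.com")` check.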

Filesystem Jail — Skills cannot read or write sensitive paths. Protects ~/.ssh, ~/.aws, .env, SOUL.md, MEMORY.md, IDENTITY.md, private keys, and credentials. Monkey-patches fs.readFileSync, fs.writeFileSync, and fs.unlinkSync.
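A rough sketch of the path check behind a filesystem jail is shown below. Instead of patching `fs` globally, it wraps any path-taking function, which keeps the example self-contained. The `isProtectedPath` and `jail` names and the exact matching rules are assumptions; only the protected locations come from the list above.

```typescript
import * as os from "node:os";
import * as path from "node:path";

// Protected locations listed in this README; the matching rules are an assumption.
const PROTECTED_DIRS = [".ssh", ".aws"].map((d) => path.join(os.homedir(), d));
const PROTECTED_FILES = new Set([".env", "SOUL.md", "MEMORY.md", "IDENTITY.md"]);

function isProtectedPath(target: string): boolean {
  // Resolve first so "../" segments cannot step around the check.
  const resolved = path.resolve(target);
  if (PROTECTED_FILES.has(path.basename(resolved))) return true;
  return PROTECTED_DIRS.some(
    (dir) => resolved === dir || resolved.startsWith(dir + path.sep),
  );
}

// Wraps a path-taking function so it throws before touching a protected path.
function jail<A extends unknown[], R>(
  fn: (target: string, ...rest: A) => R,
): (target: string, ...rest: A) => R {
  return (target: string, ...rest: A): R => {
    if (isProtectedPath(target)) {
      throw new Error(`Blocked access to protected path: ${target}`);
    }
    return fn(target, ...rest);
  };
}
```

A wrapper such as `jail((p: string) => fs.readFileSync(p, "utf8"))` would then refuse reads under `~/.ssh`. This sketch does not resolve symlinks; a stricter version would compare real paths.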

Kill Switch — Real-time monitoring of skill output. If the skill produces malicious patterns during execution (not just in source code), SkillShield kills the process immediately. Triggers on: timeout (60s default), memory limit (512MB), output flooding (10MB), critical threat patterns, or max violation count.
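The trigger conditions above amount to a small decision function, sketched here. The type names, the check order, and the `maxViolations` value of 3 are assumptions (no default violation count is documented); the other defaults mirror the numbers given above, with 10MB read as 10 × 1024 × 1024 bytes.

```typescript
interface Limits {
  timeoutMs: number;
  maxMemoryMb: number;
  maxOutputBytes: number;
  maxViolations: number;
}

interface RunState {
  elapsedMs: number;
  memoryMb: number;
  outputBytes: number;
  violations: number;
  criticalThreat: boolean; // set when output matches a critical pattern
}

type KillReason =
  | "critical-threat"
  | "timeout"
  | "memory"
  | "output-flood"
  | "max-violations";

// Defaults from this README, except maxViolations, which is a placeholder.
const DEFAULT_LIMITS: Limits = {
  timeoutMs: 60_000,
  maxMemoryMb: 512,
  maxOutputBytes: 10 * 1024 * 1024,
  maxViolations: 3,
};

// Returns why the process should be killed, or null to let it keep running.
function killReason(
  state: RunState,
  limits: Limits = DEFAULT_LIMITS,
): KillReason | null {
  if (state.criticalThreat) return "critical-threat";
  if (state.elapsedMs > limits.timeoutMs) return "timeout";
  if (state.memoryMb > limits.maxMemoryMb) return "memory";
  if (state.outputBytes > limits.maxOutputBytes) return "output-flood";
  if (state.violations >= limits.maxViolations) return "max-violations";
  return null;
}
```

A monitor would call a function like this on each output chunk or timer tick and terminate the child process on the first non-null result.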

Cryptographic Audit Trail — Every action during execution (scan, network request, file access, kill switch activation) is recorded in a SHA-256 hash-chained log. Each entry links to the previous via hash, creating a tamper-evident chain. Export to JSON for compliance.

# Save the full audit trail
skillshield run my-skill.md --audit-file audit.json

# The audit trail is hash-chained (blockchain-style)
# Tampering with any entry breaks the chain verification
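For readers unfamiliar with hash chaining, the following sketch shows why editing one entry invalidates the log. The `AuditEntry` fields and function names are hypothetical and do not describe the format of the exported JSON; timestamps are left out to keep the example deterministic.

```typescript
import { createHash } from "node:crypto";

// Hypothetical entry shape, not SkillShield's actual audit format.
interface AuditEntry {
  index: number;
  action: string;
  detail: string;
  prevHash: string; // hash of the previous entry, or GENESIS for the first
  hash: string;     // SHA-256 over this entry's fields plus prevHash
}

const GENESIS = "0".repeat(64);

function entryHash(index: number, action: string, detail: string, prevHash: string): string {
  // JSON-encoding the tuple avoids ambiguity between adjacent fields.
  return createHash("sha256")
    .update(JSON.stringify([index, action, detail, prevHash]))
    .digest("hex");
}

// Returns a new chain with one more entry linked to the current tail.
function appendEntry(chain: AuditEntry[], action: string, detail: string): AuditEntry[] {
  const index = chain.length;
  const prevHash = index > 0 ? chain[index - 1].hash : GENESIS;
  const hash = entryHash(index, action, detail, prevHash);
  return [...chain, { index, action, detail, prevHash, hash }];
}

// Recomputes every hash and link; any edited entry makes this return false.
function verifyChain(chain: AuditEntry[]): boolean {
  return chain.every((e, i) => {
    const expectedPrev = i === 0 ? GENESIS : chain[i - 1].hash;
    return (
      e.index === i &&
      e.prevHash === expectedPrev &&
      e.hash === entryHash(e.index, e.action, e.detail, e.prevHash)
    );
  });
}
```

Because each hash covers the previous hash, changing any entry forces every later hash to change as well. A chain like this is tamper-evident rather than tamper-proof: someone who can rewrite the whole file can rebuild it, so the latest hash needs to be stored or published separately.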

Phase 4: Shield Report

After every execution, SkillShield prints a complete security report:

  ────────────────────────────────────────────────────
  SHIELD REPORT
  ────────────────────────────────────────────────────
  Status: CLEAN EXECUTION

  Pre-Scan Score:  95/100 (APPROVED)
  Network:         0 violations
  Filesystem:      0 violations
  Runtime Threats: 0 detected
  Duration:        1247ms

  Audit Chain:     6 entries
  Latest Hash:     a3f8b2c1d4e5f6a7b8c9...
  Chain Integrity: VERIFIED
  ────────────────────────────────────────────────────

Why Not Just Use...

| Tool | What It Does | What It Doesn't Do |
|------|--------------|--------------------|
| Snyk agent-scan | LLM judges + regex, pre-install | No runtime enforcement. Scan-only. |
| Cisco skill-scanner | YARA + AST + policy engine | No runtime enforcement. Pre-install only. |
| NVIDIA OpenShell | Linux runtime sandboxing | Enterprise-only. Linux-only. No pre-scan. |
| Aegis | LLM API call proxy | Only intercepts API calls, not filesystem/network. |
| rohitg00/skillkit | 46 rules + skill translation | No runtime. No enforcement. No audit trail. |
| SkillShield | Scan + Network + Filesystem + Kill Switch + Audit | The full pipeline in one CLI. Cross-platform. |

Security Badge

Show the world your skills are verified:

skillshield badge my-skill.md                    # Generate badge
skillshield badge my-skill.md --output README.md  # Auto-append to README

| Score | Badge | Status |
|-------|-------|--------|
| 90-100 | | SAFE |
| 80-89 | | APPROVED |
| 50-79 | | REVIEW REQUIRED |
| 0-49 | | BLOCKED |
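The score bands above map to a status as in this small sketch. The `badgeStatus` name and the range check are assumptions; the thresholds are taken directly from the table.

```typescript
type BadgeStatus = "SAFE" | "APPROVED" | "REVIEW REQUIRED" | "BLOCKED";

// Maps a 0-100 scan score to the status bands listed in the table.
function badgeStatus(score: number): BadgeStatus {
  if (!Number.isFinite(score) || score < 0 || score > 100) {
    throw new RangeError(`Score must be between 0 and 100, got ${score}`);
  }
  if (score >= 90) return "SAFE";
  if (score >= 80) return "APPROVED";
  if (score >= 50) return "REVIEW REQUIRED";
  return "BLOCKED";
}
```

Checking bands from the top down means each threshold appears exactly once.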

Architecture

skillshield/
├── src/
│   ├── guard/          # SkillGuard — 72+ threat patterns, 14 categories
│   ├── shield/         # Runtime enforcement engine
│   │   ├── network-policy.ts   # DNS interception + domain allowlist
│   │   ├── filesystem-jail.ts  # Sensitive path protection + fs monkey-patch
│   │   ├── runtime-monitor.ts  # Kill switch + real-time output scanning
│   │   ├── audit-trail.ts      # SHA-256 hash-chained audit log
│   │   └── index.ts            # SkillShield orchestrator
│   ├── sandbox/        # Process + Docker sandbox with shell:false isolation
│   ├── core/           # SKILL.md parser (Zod validated), runtime engine
│   ├── router/         # Multi-model router — 11 providers, 39+ models
│   ├── cli/            # CLI: scan, run, badge, init, search, install, deploy
│   ├── hub/            # ClawHub client + local skill registry
│   ├── channels/       # WhatsApp, Telegram, Discord, Slack adapters
│   ├── tools/          # Tool system (search, extract, crawl)
│   └── i18n/           # EN, ES, ZH, PT translations
├── .github/workflows/  # GitHub Action for automated scanning
├── examples/           # Example skills
└── tests/              # Test suite

CLI Reference

# Scanning
skillshield scan <skill.md>              # Full security audit
skillshield scan <skill.md> --json       # JSON output for CI/CD

# Runtime (Scan + Shield + Execute)
skillshield run <skill> --input "..."    # Full pipeline
skillshield run <skill> --no-shield      # Pre-scan only, no runtime enforcement
skillshield run <skill> --no-scan        # Skip pre-scan (not recommended)
skillshield run <skill> --timeout 30000  # Custom timeout (ms)
skillshield run <skill> --max-memory 256 # Custom memory limit (MB)
skillshield run <skill> --allow-domains api.openai.com,github.com
skillshield run <skill> --audit-file trail.json
skillshield run <skill> --verbose        # Show all shield activity

# Badge
skillshield badge <skill.md>             # Generate shields.io badge
skillshield badge <skill.md> --output README.md

# Skill management
skillshield init                         # Interactive setup
skillshield search "data analysis"       # Find skills
skillshield install <name>               # Install from hub
skillshield list                         # List installed

Contributing

We welcome contributions! The most impactful areas right now:

- New threat patterns: Found a new attack vector? Add it to src/guard/patterns.ts
- Shield bypass testing: Try to break the runtime enforcement. If you succeed, file an issue.
- CI/CD integrations: GitHub Actions, GitLab CI, Jenkins plugins
- Platform-specific enforcement: Windows, macOS, Linux edge cases

git clone https://github.com/artefactforge/skillshield.git
cd skillshield
npm install
npm run build

License

MIT License - See LICENSE for details.

Built by ArtefactForge

The industry invested in watching. We invested in stopping.