skillshield
v2.1.0
Runtime security for AI Agent Skills — Scan, sandbox & enforce. Detect prompt injection, memory poisoning, supply chain attacks. 72+ patterns, 14 categories. The firewall Snyk and Cisco don't build.
███████╗██╗ ██╗██╗██╗ ██╗ ███████╗██╗ ██╗██╗███████╗██╗ ██████╗
██╔════╝██║ ██╔╝██║██║ ██║ ██╔════╝██║ ██║██║██╔════╝██║ ██╔══██╗
███████╗█████╔╝ ██║██║ ██║ ███████╗███████║██║█████╗ ██║ ██║ ██║
╚════██║██╔═██╗ ██║██║ ██║ ╚════██║██╔══██║██║██╔══╝ ██║ ██║ ██║
███████║██║ ██╗██║███████╗███████╗███████║██║ ██║██║███████╗███████╗██████╔╝
╚══════╝╚═╝ ╚═╝╚═╝╚══════╝╚══════╝╚══════╝╚═╝ ╚═╝╚═╝╚══════╝╚══════╝╚═════╝
Runtime Security for AI Agent Skills: Scan, Sandbox & Enforce.
The first open-source tool that scans AND stops malicious AI skills at runtime. Network interception, filesystem jail, kill switch, and cryptographic audit trail — in one developer-first CLI.
The Problem
"The industry has invested in watching. It hasn't invested in stopping." — Bessemer Venture Partners
Every existing tool for AI skill security does the same thing: scan before install, then hope for the best. Snyk agent-scan, Cisco skill-scanner, VirusTotal — they all stop at detection. Once a skill passes their checks (or bypasses them), there's zero protection at runtime.
Meanwhile: 36% of ClawHub skills have security flaws. 12% are actual malware. And the most dangerous attacks — sleeper agents, time-delayed exfiltration, polymorphic payloads — are invisible to pre-install scanners.
The Solution: SkillShield
SkillShield is the first tool that combines pre-execution scanning with runtime enforcement in a single CLI. It doesn't just detect threats — it prevents them from executing.
npm install -g skillshield
# Scan a skill (72+ patterns, 14 threat categories)
skillshield scan suspicious-skill.md
# Scan + Shield + Execute (the full pipeline)
skillshield run my-skill.md --input "Hello world"
# Save cryptographic audit trail for compliance
skillshield run my-skill.md --audit-file trail.json
How It Works
┌──────────────────────────────────────────────────────────────┐
│                       skillshield run                        │
├──────────┬───────────────────┬──────────────┬────────────────┤
│ PHASE 1  │ PHASE 2           │ PHASE 3      │ PHASE 4        │
│ SCAN     │ SHIELD            │ EXECUTE      │ REPORT         │
│          │                   │              │                │
│ 72+      │ Network Policy    │ Enforcement  │ Shield Report  │
│ patterns │ Filesystem Jail   │ wrapper      │ Audit chain    │
│ 14 cats  │ Kill Switch       │ injected     │ Violations     │
│          │ Audit Trail       │              │ Chain hash     │
└──────────┴───────────────────┴──────────────┴────────────────┘
Phase 1: Pre-Scan (SkillGuard)
72+ regex patterns across 14 threat categories — including 3 categories nobody else detects:
| Category | Patterns | What It Catches |
|----------|---------|----------------|
| Memory Poisoning | 7 | SOUL.md/MEMORY.md manipulation, sleeper agents, cross-session persistence |
| Prompt Injection | 6 | Ignore instructions, fake [SYSTEM] tags, context reset, privilege escalation |
| Sensitive Data | 10 | OpenAI/Anthropic/AWS/Groq keys, JWT tokens, private keys, SSNs, credit cards |
| Supply Chain | 6 | npm/pip install in skills, pipe-to-shell, postinstall hooks, remote imports |
| Code Injection | 8 | eval(), exec(), spawn(), dynamic require, innerHTML, child_process |
| Data Exfiltration | 8 | fetch POST, XMLHttpRequest, curl, wget, sendBeacon, cloud storage copy |
| Credential Theft | 7 | process.env, .ssh/.aws files, .env files, hardcoded passwords, git credentials |
| File System Abuse | 7 | rm -rf, chmod, disk destruction, fs.writeFile to system paths |
| Crypto Mining | 4 | Mining pools, wallet addresses, coinhive, WebWorker mining |
| Keylogger | 4 | keydown/keyup listeners, clipboard access, keyboard simulation |
| Obfuscation | 4 | Base64 decode, String.fromCharCode, hex/unicode escapes |
| Network Abuse | 4 | Port scanning, DNS exfiltration, SSRF, SSH/Telnet |
| Privilege Escalation | 2 | sudo/su, SUID/SGID bits |
| Malware | 4 | Reverse shells, fork bombs, encoded PowerShell, exploitation frameworks |
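A regex-based pre-scan of this kind can be sketched in a few lines. The types, pattern IDs, and the three sample patterns below are hypothetical and chosen for illustration only; they are not SkillShield's actual rule set in src/guard/patterns.ts.

```typescript
// Illustrative sketch only: names, IDs, and regexes are assumptions,
// not the patterns SkillShield actually ships.
type Severity = "low" | "medium" | "high" | "critical";

interface ThreatPattern {
  id: string;
  category: string;
  severity: Severity;
  regex: RegExp; // must not use the /g flag, so exec() stays stateless
}

interface Finding {
  patternId: string;
  category: string;
  severity: Severity;
  line: number; // 1-based line number in the skill source
  match: string;
}

const PATTERNS: ThreatPattern[] = [
  {
    id: "PI-001",
    category: "Prompt Injection",
    severity: "high",
    regex: /ignore\s+(all\s+)?(previous|prior)\s+instructions/i,
  },
  {
    id: "SC-001",
    category: "Supply Chain",
    severity: "critical",
    regex: /\b(curl|wget)\b[^|\n]*\|\s*(ba)?sh\b/i, // pipe-to-shell
  },
  {
    id: "CI-001",
    category: "Code Injection",
    severity: "high",
    regex: /\beval\s*\(/,
  },
];

// Test every line of the skill source against every pattern.
function scanSkill(source: string, patterns: ThreatPattern[] = PATTERNS): Finding[] {
  const findings: Finding[] = [];
  source.split(/\r?\n/).forEach((text, index) => {
    for (const p of patterns) {
      const m = p.regex.exec(text);
      if (m) {
        findings.push({
          patternId: p.id,
          category: p.category,
          severity: p.severity,
          line: index + 1,
          match: m[0],
        });
      }
    }
  });
  return findings;
}
```

Scanning line by line keeps each finding tied to a line number, at the cost of missing payloads split across lines.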
Phase 2: Runtime Shield (The Differentiator)
This is what makes SkillShield unique. Four enforcement layers activate during skill execution:
Network Policy Engine: Default-deny networking. Skills can reach only explicitly allowed domains. It also blocks known malicious domains (ngrok.io, webhook.site, requestbin.com) and crypto mining pools, and it intercepts dns.lookup and https.request at the Node.js level.
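The default-deny decision can be illustrated with a small pure function. This is a sketch under stated assumptions: the function names are hypothetical, the blocklist holds only the three domains named above, and treating subdomains as matching their parent domain is an assumption rather than documented SkillShield behavior.

```typescript
// Illustrative sketch only: helper names and subdomain matching are assumptions.
const BLOCKED_DOMAINS = ["ngrok.io", "webhook.site", "requestbin.com"];

type Verdict = { allowed: boolean; reason: string };

// True when host equals the domain or is a subdomain of it.
function matchesDomain(host: string, domain: string): boolean {
  return host === domain || host.endsWith("." + domain);
}

// Blocklist wins over allowlist; anything not explicitly allowed is denied.
function checkHost(hostname: string, allowlist: string[]): Verdict {
  const host = hostname.toLowerCase().replace(/\.$/, ""); // normalize case and trailing dot
  if (BLOCKED_DOMAINS.some((d) => matchesDomain(host, d))) {
    return { allowed: false, reason: "blocklisted domain" };
  }
  if (allowlist.some((d) => matchesDomain(host, d.toLowerCase()))) {
    return { allowed: true, reason: "allowlisted domain" };
  }
  return { allowed: false, reason: "default deny" };
}

// Parse a comma-separated list such as the value given to --allow-domains.
function parseAllowDomains(value: string): string[] {
  return value
    .split(",")
    .map((d) => d.trim())
    .filter((d) => d.length > 0);
}
```

Checking the blocklist first means an overly broad allowlist entry cannot re-enable a blocked domain.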
# Only allow specific domains
skillshield run my-skill.md --allow-domains api.openai.com,github.com
Filesystem Jail: Skills cannot read or write sensitive paths. Protects ~/.ssh, ~/.aws, .env, SOUL.md, MEMORY.md, IDENTITY.md, private keys, and credentials. Monkey-patches fs.readFileSync, fs.writeFileSync, and fs.unlinkSync.
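The path check behind a filesystem jail might look like the following sketch. The rule set is a hypothetical subset built from the paths listed above, and `isProtectedPath` and `guard` are illustrative names. Instead of patching the `fs` module in place, the sketch wraps a file operation in a guard function, which shows the same idea without mutating a shared module.

```typescript
import * as os from "node:os";
import * as path from "node:path";

// Illustrative sketch only: the protected lists are assumptions drawn from
// the paths named in this README; the real rule set may differ.
const PROTECTED_DIRS = [".ssh", ".aws"].map((d) => path.join(os.homedir(), d));
const PROTECTED_BASENAMES = new Set([".env", "SOUL.md", "MEMORY.md", "IDENTITY.md"]);

function isProtectedPath(target: string, cwd: string = process.cwd()): boolean {
  // path.resolve collapses ".." segments, so traversal tricks are normalized first.
  const resolved = path.resolve(cwd, target);
  if (PROTECTED_BASENAMES.has(path.basename(resolved))) return true;
  return PROTECTED_DIRS.some(
    (dir) => resolved === dir || resolved.startsWith(dir + path.sep)
  );
}

// Wrap a file operation so protected paths are rejected before the real call runs.
function guard<A extends unknown[], R>(
  op: (file: string, ...rest: A) => R
): (file: string, ...rest: A) => R {
  return (file, ...rest) => {
    if (isProtectedPath(file)) {
      throw new Error(`filesystem jail: access to ${file} blocked`);
    }
    return op(file, ...rest);
  };
}
```

This sketch does not resolve symlinks or expand `~`, so a production check would need to handle both.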
Kill Switch: Real-time monitoring of skill output. If the skill produces malicious patterns during execution (not just in its source code), SkillShield kills the process immediately. It triggers on a timeout (60 s by default), the memory limit (512 MB), output flooding (10 MB), critical threat patterns, or the maximum violation count.
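The trigger logic can be sketched as a monitor that is fed each chunk of skill output and returns a reason to kill the process, or null to continue. The class and type names are hypothetical, the maximum violation count of 3 is an assumption (this README does not state it), and the memory check is left out because it depends on how the child process is measured.

```typescript
// Illustrative sketch only: names are assumptions; timeout and output limits
// mirror the defaults quoted above, maxViolations is a made-up value.
interface Limits {
  timeoutMs: number;
  maxOutputBytes: number;
  maxViolations: number;
}

const DEFAULT_LIMITS: Limits = {
  timeoutMs: 60000,
  maxOutputBytes: 10 * 1024 * 1024,
  maxViolations: 3, // assumption, not documented
};

type KillReason = "timeout" | "output-flood" | "critical-pattern" | "max-violations";

class OutputMonitor {
  private bytes = 0;
  private violations = 0;
  private readonly criticalPatterns: RegExp[];
  private readonly limits: Limits;
  private readonly startedAt: number;

  // Patterns should not use the /g flag, so test() stays stateless.
  constructor(criticalPatterns: RegExp[], limits: Limits = DEFAULT_LIMITS, startedAt: number = Date.now()) {
    this.criticalPatterns = criticalPatterns;
    this.limits = limits;
    this.startedAt = startedAt;
  }

  // Called by other layers (network, filesystem) when they block something.
  recordViolation(): void {
    this.violations += 1;
  }

  // Returns the reason the process should be killed, or null to keep running.
  inspect(chunk: string, now: number = Date.now()): KillReason | null {
    this.bytes += Buffer.byteLength(chunk, "utf8");
    if (now - this.startedAt > this.limits.timeoutMs) return "timeout";
    if (this.bytes > this.limits.maxOutputBytes) return "output-flood";
    if (this.criticalPatterns.some((p) => p.test(chunk))) return "critical-pattern";
    if (this.violations >= this.limits.maxViolations) return "max-violations";
    return null;
  }
}
```

A caller would attach `inspect` to the child process's stdout and stderr streams and terminate the process on the first non-null result.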
Cryptographic Audit Trail: Every action during execution (scan, network request, file access, kill switch activation) is recorded in a SHA-256 hash-chained log. Each entry includes the hash of the previous entry, so modifying any entry breaks verification of everything after it, which makes the chain tamper-evident. Export to JSON for compliance.
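A hash-chained log is straightforward to sketch with Node's built-in crypto module. The entry shape, the all-zero genesis hash, and the function names below are assumptions made for illustration; SkillShield's actual audit schema may differ.

```typescript
import { createHash } from "node:crypto";

// Illustrative sketch only: the entry fields and genesis value are assumptions.
interface AuditEntry {
  index: number;
  timestamp: string;
  action: string;
  detail: string;
  prevHash: string; // hash of the previous entry, or GENESIS_HASH for the first
  hash: string;     // SHA-256 over all other fields
}

const GENESIS_HASH = "0".repeat(64);

function hashEntry(e: Omit<AuditEntry, "hash">): string {
  // A fixed-order array keeps the serialization deterministic.
  const payload = JSON.stringify([e.index, e.timestamp, e.action, e.detail, e.prevHash]);
  return createHash("sha256").update(payload).digest("hex");
}

// Returns a new chain with one entry appended; the input chain is not mutated.
function appendEntry(
  chain: AuditEntry[],
  action: string,
  detail: string,
  timestamp: string = new Date().toISOString()
): AuditEntry[] {
  const last = chain[chain.length - 1];
  const prevHash = last ? last.hash : GENESIS_HASH;
  const partial = { index: chain.length, timestamp, action, detail, prevHash };
  return [...chain, { ...partial, hash: hashEntry(partial) }];
}

// Recompute every hash and check every link; any edit to any entry fails here.
function verifyChain(chain: AuditEntry[]): boolean {
  return chain.every((entry, i) => {
    const prev = chain[i - 1];
    const expectedPrev = prev ? prev.hash : GENESIS_HASH;
    const { hash, ...rest } = entry;
    return entry.index === i && entry.prevHash === expectedPrev && hash === hashEntry(rest);
  });
}
```

Because each hash covers the previous hash, changing one entry invalidates that entry and every link after it, which is what verification detects.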
# Save the full audit trail
skillshield run my-skill.md --audit-file audit.json
# The audit trail is hash-chained (blockchain-style)
# Tampering with any entry breaks the chain verification
Phase 4: Shield Report
After every execution, SkillShield prints a complete security report:
────────────────────────────────────────────────────
SHIELD REPORT
────────────────────────────────────────────────────
Status: CLEAN EXECUTION
Pre-Scan Score: 95/100 (APPROVED)
Network: 0 violations
Filesystem: 0 violations
Runtime Threats: 0 detected
Duration: 1247ms
Audit Chain: 6 entries
Latest Hash: a3f8b2c1d4e5f6a7b8c9...
Chain Integrity: VERIFIED
────────────────────────────────────────────────────
Why Not Just Use...
| Tool | What It Does | What It Doesn't Do |
|------|-------------|-------------------|
| Snyk agent-scan | LLM judges + regex, pre-install | No runtime enforcement. Scan-only. |
| Cisco skill-scanner | YARA + AST + policy engine | No runtime enforcement. Pre-install only. |
| NVIDIA OpenShell | Linux runtime sandboxing | Enterprise-only. Linux-only. No pre-scan. |
| Aegis | LLM API call proxy | Only intercepts API calls, not filesystem/network. |
| rohitg00/skillkit | 46 rules + skill translation | No runtime. No enforcement. No audit trail. |
| SkillShield | Scan + Network + Filesystem + Kill Switch + Audit | The full pipeline in one CLI. Cross-platform. |
Security Badge
Show the world your skills are verified:
skillshield badge my-skill.md # Generate badge
skillshield badge my-skill.md --output README.md # Auto-append to README
| Score | Badge | Status |
|-------|-------|--------|
| 90-100 | | SAFE |
| 80-89 | | APPROVED |
| 50-79 | | REVIEW REQUIRED |
| 0-49 | | BLOCKED |
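The score bands in the table translate directly into a lookup. The function and type names here are hypothetical; only the thresholds and status labels come from the table.

```typescript
// Illustrative sketch only: names are assumptions; thresholds follow the table above.
type BadgeStatus = "SAFE" | "APPROVED" | "REVIEW REQUIRED" | "BLOCKED";

function statusForScore(score: number): BadgeStatus {
  if (!Number.isFinite(score) || score < 0 || score > 100) {
    throw new RangeError(`score must be between 0 and 100, got ${score}`);
  }
  if (score >= 90) return "SAFE";
  if (score >= 80) return "APPROVED";
  if (score >= 50) return "REVIEW REQUIRED";
  return "BLOCKED";
}
```

Checking thresholds from highest to lowest keeps each band's lower bound in one place.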
Architecture
skillshield/
├── src/
│ ├── guard/ # SkillGuard — 72+ threat patterns, 14 categories
│ ├── shield/ # Runtime enforcement engine
│ │ ├── network-policy.ts # DNS interception + domain allowlist
│ │ ├── filesystem-jail.ts # Sensitive path protection + fs monkey-patch
│ │ ├── runtime-monitor.ts # Kill switch + real-time output scanning
│ │ ├── audit-trail.ts # SHA-256 hash-chained audit log
│ │ └── index.ts # SkillShield orchestrator
│ ├── sandbox/ # Process + Docker sandbox with shell:false isolation
│ ├── core/ # SKILL.md parser (Zod validated), runtime engine
│ ├── router/ # Multi-model router — 11 providers, 39+ models
│ ├── cli/ # CLI: scan, run, badge, init, search, install, deploy
│ ├── hub/ # ClawHub client + local skill registry
│ ├── channels/ # WhatsApp, Telegram, Discord, Slack adapters
│ ├── tools/ # Tool system (search, extract, crawl)
│ └── i18n/ # EN, ES, ZH, PT translations
├── .github/workflows/ # GitHub Action for automated scanning
├── examples/ # Example skills
└── tests/ # Test suite
CLI Reference
# Scanning
skillshield scan <skill.md> # Full security audit
skillshield scan <skill.md> --json # JSON output for CI/CD
# Runtime (Scan + Shield + Execute)
skillshield run <skill> --input "..." # Full pipeline
skillshield run <skill> --no-shield # Scan only, no enforcement
skillshield run <skill> --no-scan # Skip pre-scan (not recommended)
skillshield run <skill> --timeout 30000 # Custom timeout (ms)
skillshield run <skill> --max-memory 256 # Custom memory limit (MB)
skillshield run <skill> --allow-domains api.openai.com,github.com
skillshield run <skill> --audit-file trail.json
skillshield run <skill> --verbose # Show all shield activity
# Badge
skillshield badge <skill.md> # Generate shields.io badge
skillshield badge <skill.md> --output README.md
# Skill management
skillshield init # Interactive setup
skillshield search "data analysis" # Find skills
skillshield install <name> # Install from hub
skillshield list # List installed
Contributing
We welcome contributions! The most impactful areas right now:
- New threat patterns: Found a new attack vector? Add it to src/guard/patterns.ts
- Shield bypass testing: Try to break the runtime enforcement. If you succeed, file an issue.
- CI/CD integrations: GitHub Actions, GitLab CI, Jenkins plugins
- Platform-specific enforcement: Windows, macOS, Linux edge cases
git clone https://github.com/artefactforge/skillshield.git
cd skillshield
npm install
npm run build
License
MIT License - See LICENSE for details.
Built by ArtefactForge
The industry invested in watching. We invested in stopping.
