clawhub-scanner
v1.0.0
Security scanner for ClawHub / OpenClaw AI skills — static analysis + LLM-based evaluation
Implements the same security scanning logic as ClawHub's built-in Security Scan — run the exact same checks locally before publishing your skill. Zero dependencies.
How It Relates to ClawHub
ClawHub has a built-in Security Scan that reviews every skill before it reaches users. This package implements the same multi-layer scanning logic, so you can run the exact same checks locally — catch issues before publishing, or integrate them into your own CI/CD pipeline:
- Static Regex Scan — Fast, offline pattern matching against known malicious signatures (dangerous exec, data exfiltration, crypto mining, prompt injection, etc.)
- LLM Security Evaluation — An LLM-as-judge evaluator that assesses the skill across 5 security dimensions, checking whether the skill's actual behavior is consistent with its stated purpose
- Prompt Injection Detection — Pre-scan filters that catch attempts to manipulate the LLM evaluator itself
This is not a generic security scanner. It is purpose-built for the ClawHub / OpenClaw skill format and understands the specific threat model of AI agent plugins: skills that can instruct AI agents to run shell commands, access credentials, read files, and send data over the network.
Features
Static Regex Scan — Detects dangerous patterns without any API key:
- Shell command execution (`child_process`, `exec`, `spawn`)
- Dynamic code execution (`eval`, `new Function`)
- Crypto mining indicators (`stratum+tcp`, `coinhive`, `xmrig`)
- Data exfiltration (file read + network send)
- Credential harvesting (env var access + network send)
- Obfuscated code (hex sequences, long base64 payloads)
- Prompt injection patterns in markdown
- Suspicious URLs in config files (URL shorteners, raw IPs)
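To picture how a signature-based static scan like this works, here is a small self-contained sketch. The signature list and the `staticScanSketch` helper are illustrative only, not the package's actual rule set or API:

```javascript
// Illustrative static-scan sketch: match file contents against signature
// regexes and collect findings. Reason codes mirror the table documented
// below, but the patterns here are simplified examples.
const SIGNATURES = [
  { code: 'suspicious.dangerous_exec', severity: 'critical',
    pattern: /\bchild_process\b|\bexec(?:Sync)?\s*\(|\bspawn\s*\(/ },
  { code: 'suspicious.dynamic_code_execution', severity: 'critical',
    pattern: /\beval\s*\(|new\s+Function\s*\(/ },
  { code: 'malicious.crypto_mining', severity: 'critical',
    pattern: /stratum\+tcp|coinhive|xmrig/i },
];

function staticScanSketch(files) {
  const findings = [];
  for (const { path, content } of files) {
    for (const sig of SIGNATURES) {
      const match = content.match(sig.pattern);
      if (match) {
        findings.push({ code: sig.code, severity: sig.severity, file: path, evidence: match[0] });
      }
    }
  }
  // Any "malicious.*" finding escalates the overall status.
  const status = findings.length === 0 ? 'clean'
    : findings.some(f => f.code.startsWith('malicious.')) ? 'malicious'
    : 'suspicious';
  return { status, findings };
}
```

The real scanner layers many more signatures on top of this idea (exfiltration, credential harvesting, obfuscation), but the core loop is the same: regex signatures in, findings plus an aggregate status out.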
LLM Security Evaluation — 5-dimension deep analysis:
- Purpose & capability alignment
- Instruction scope analysis
- Install mechanism risk
- Credential proportionality
- Persistence & privilege assessment
Prompt Injection Detection — Identifies common injection patterns:
- "Ignore previous instructions"
- "You are now a..."
- System prompt override attempts
- Hidden base64 blocks
- Unicode control characters
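This layer can be pictured as a table of regexes mapped to signal names. A hedged sketch (the rule list and `detectInjectionSketch` helper are illustrative; the package's actual patterns and signal names may differ, apart from the documented `ignore-previous-instructions`):

```javascript
// Illustrative injection-pattern detector: each rule maps a regex to a
// signal name; matching rules contribute their signal to the result.
const INJECTION_RULES = [
  { signal: 'ignore-previous-instructions', pattern: /ignore\s+(all\s+)?previous\s+instructions/i },
  { signal: 'role-override', pattern: /you\s+are\s+now\s+a\b/i },
  // Long unbroken base64-alphabet runs suggest a hidden payload.
  { signal: 'hidden-base64', pattern: /[A-Za-z0-9+/]{120,}={0,2}/ },
  // Zero-width and bidi control characters can hide instructions from humans.
  { signal: 'unicode-control-chars', pattern: /[\u200B-\u200F\u202A-\u202E]/ },
];

function detectInjectionSketch(text) {
  return INJECTION_RULES.filter(r => r.pattern.test(text)).map(r => r.signal);
}
```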
Install
# Global install
npm install -g clawhub-scanner
# Or use directly with npx
npx clawhub-scanner --help
# Or install as project dependency
npm install clawhub-scanner
CLI Usage
# Static scan (no API key needed)
clawhub-scan --static ./my-skill
# Full scan (static + LLM evaluation)
OPENAI_API_KEY=sk-xxx clawhub-scan ./my-skill
# Scan multiple skills at once
clawhub-scan --static ./skill-a ./skill-b ./skill-c
# JSON output (for CI/CD pipelines)
clawhub-scan --static --json ./my-skill
# Save results to a directory
clawhub-scan --static --output ./results ./my-skill
# Use a custom LLM model
clawhub-scan --model gpt-4o ./my-skill
CLI Options
| Option | Description |
|--------|-------------|
| --static | Static scan only (no LLM, no API key needed) |
| --json | Output results as JSON |
| --output <dir> | Save individual JSON results to directory |
| --model <name> | LLM model name (default: gpt-5-mini) |
| --api-key <key> | OpenAI API key (default: OPENAI_API_KEY env) |
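In a CI pipeline, the `--json` output can be parsed to decide pass/fail. A minimal sketch of such a gate (the `gate` helper is hypothetical; the field names follow the Scan Result Format documented below):

```javascript
// Hypothetical CI gate: given the JSON printed by
// `clawhub-scan --static --json ./my-skill`, fail on anything not clean.
function gate(resultJson) {
  const r = JSON.parse(resultJson);
  if (r.staticScan.status === 'malicious') {
    return { ok: false, reason: 'malicious findings' };
  }
  if (r.staticScan.status === 'suspicious') {
    return { ok: false, reason: r.staticScan.reasonCodes.join(', ') };
  }
  return { ok: true, reason: 'clean' };
}
```

A wrapper script could feed the CLI's stdout into `gate` and call `process.exit(1)` on failure; teams that want to allow warnings could choose to pass on `suspicious` and fail only on `malicious`.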
Environment Variables
| Variable | Description |
|----------|-------------|
| OPENAI_API_KEY | Required for LLM evaluation |
| OPENAI_EVAL_MODEL | Override default LLM model |
Library Usage
import {
scanSkill,
scanSkillContent,
runStaticScan,
detectInjectionPatterns,
} from 'clawhub-scanner'
// Scan a skill directory
const result = await scanSkill('/path/to/skill', { staticOnly: true })
console.log(result.staticStatus) // 'clean' | 'suspicious' | 'malicious'
console.log(result.staticScan.findings) // Array of findings
// Scan raw content (no disk I/O)
const scan = runStaticScan(skillMdContent, [
{ path: 'helper.js', content: 'const { exec } = require("child_process")...' },
])
// => { status: 'suspicious', reasonCodes: [...], findings: [...] }
// Detect prompt injection patterns
const signals = detectInjectionPatterns('Ignore all previous instructions')
// => ['ignore-previous-instructions']
// Full scan with LLM evaluation
const fullResult = await scanSkill('/path/to/skill', {
apiKey: 'sk-xxx',
model: 'gpt-4o',
})
console.log(fullResult.verdict) // 'benign' | 'suspicious' | 'malicious'
console.log(fullResult.confidence) // 'high' | 'medium' | 'low'
API Reference
scanSkill(dirPath, options?) → Promise<ScanResult | null>
Scan a skill directory. Returns null if no SKILL.md is found.
scanSkillContent(skillMd, files?, options?) → Promise<ScanResult>
Scan from raw content without reading from disk.
runStaticScan(skillMd, files) → StaticScanResult
Run regex-based static analysis only. Synchronous, no API key needed.
detectInjectionPatterns(text) → string[]
Detect prompt injection patterns in text content.
Scan Result Format
{
"skill": "my-skill",
"timestamp": "2026-03-13T08:00:00.000Z",
"staticScan": {
"status": "suspicious",
"reasonCodes": ["suspicious.dangerous_exec"],
"findings": [
{
"code": "suspicious.dangerous_exec",
"severity": "critical",
"file": "helper.js",
"line": 5,
"message": "Shell command execution detected.",
"evidence": "exec('rm -rf /')"
}
]
},
"injectionSignals": [],
"llmResult": {
"verdict": "suspicious",
"confidence": "high",
"summary": "...",
"dimensions": { ... },
"user_guidance": "..."
}
}
Reason Codes
| Code | Severity | Description |
|------|----------|-------------|
| suspicious.dangerous_exec | critical | Shell command execution via child_process |
| suspicious.dynamic_code_execution | critical | eval() or new Function() usage |
| malicious.crypto_mining | critical | Crypto mining indicators |
| malicious.env_harvesting | critical | Env var access + network send |
| suspicious.potential_exfiltration | warn | File read + network send |
| suspicious.obfuscated_code | warn | Obfuscated/encoded payloads |
| suspicious.prompt_injection_instructions | warn | Prompt injection in markdown |
| suspicious.install_untrusted_source | warn | URL shorteners or raw IPs |
How It Works — The ClawHub Security Pipeline
┌─────────────────────────────────────────────────────────┐
│ Skill Published │
│ (SKILL.md + code files) │
└─────────────────┬───────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────┐
│ Layer 1: Static Regex Scan │
│ ───────────────────────── │
│ • Pattern match against known malicious signatures │
│ • Check code files for dangerous APIs │
│ • Check configs for suspicious URLs │
│ • Check markdown for prompt injection │
│ Result: clean / suspicious / malicious │
└─────────────────┬───────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────┐
│ Layer 2: Prompt Injection Pre-filter │
│ ──────────────────────────────── │
│ • Detect "ignore previous instructions" patterns │
│ • Catch system prompt override attempts │
│ • Flag hidden base64 blocks & unicode control chars │
│ • Feed signals to LLM as adversarial context │
└─────────────────┬───────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────┐
│ Layer 3: LLM-as-Judge Evaluation │
│ ──────────────────────────── │
│ 5-dimension coherence analysis: │
│ 1. Purpose ↔ Capability alignment │
│ 2. Instruction scope boundaries │
│ 3. Install mechanism risk │
│ 4. Credential proportionality │
│ 5. Persistence & privilege assessment │
│ │
│ Verdict: benign / suspicious / malicious │
│ + confidence level + plain-language user guidance │
└─────────────────────────────────────────────────────────┘
The key insight of this scanner is coherence detection, not malware classification. A `child_process.exec()` call is normal in a deployment skill but suspicious in a markdown formatter. The LLM evaluator understands context and asks: "does this capability belong here?"
What is a Skill?
A skill is a plugin for AI agents (like Claude Code or OpenClaw). Each skill contains:
- `SKILL.md` — Instructions that tell the AI agent what to do
- Optional code files (`.js`, `.py`, `.sh`, etc.)
- Optional metadata (YAML frontmatter) declaring dependencies, env vars, install specs
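For orientation, a minimal skill might look like the following. This is an illustrative sketch only; the exact frontmatter fields ClawHub expects are not specified in this README:

```markdown
---
name: my-skill
description: Format markdown tables
---

# my-skill

When the user asks to tidy a table, reflow the markdown so that
all columns are aligned. Do not run shell commands.
```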
Skills can be powerful — and dangerous. They can instruct AI agents to run shell commands, access your credentials, read your files, and send data over the network. Always scan before installing.
Related Projects
- ClawHub — The skill marketplace where this scanner runs in production
- OpenClaw — Open-source AI agent framework
- anygen-skills — A collection of AI skills scanned by this tool
License
MIT
