agent-red-team
v0.1.0
Security scanner for AI coding agents — test your setup against prompt injection, credential exposure, identity tampering, and more.
How secure is your AI coding agent? Find out in 3 seconds.
npx agent-red-team
Zero-install security scanner that tests your AI coding agent setup against real attack vectors. Works with Claude Code, Cursor, OpenClaw, NemoClaw, and any MCP-connected agent.
What It Tests
| Category | Weight | What's Checked |
|---|---|---|
| Injection Resistance | 25% | Prompt injection scanning, input validation, base64 rescan, unicode normalization |
| Credential Exposure | 20% | SSH keys, AWS creds, .env files, sandbox isolation, env var leaks |
| Identity Tampering | 15% | CLAUDE.md, .cursorrules, SOUL.md write protection, file ownership |
| Behavioral Evasion | 20% | Session tracking, trifecta detection (read-process-exfil), policy escalation |
| Network Isolation | 10% | Sandbox network rules, SSRF protection, domain allowlisting, egress proxy |
| Audit Integrity | 10% | Audit logging, hash chain integrity, tamper-evident config, export API |
Sample Output
Agent Red Team v0.1.0
Target: Claude Code (settings.json, CLAUDE.md, mcp-configured)
Runtime: darwin
INJECTION RESISTANCE ████████████░░░░ 60/100
✗ Injection scanner configured (high)
✓ Input validation present
✗ Base64 decode and rescan (medium)
✗ Unicode normalization (medium)
✓ Injection pattern coverage
CREDENTIAL EXPOSURE █████████████████ 95/100
✓ SSH keys protected
✓ AWS credentials protected
✗ Project .env file exposure (medium)
✓ Sandbox blocks credential reads
IDENTITY TAMPERING ████████████░░░░ 67/100
✗ CLAUDE.md write-protected (high)
✓ Identity file guard active
✓ Identity file ownership
BEHAVIORAL EVASION ░░░░░░░░░░░░░░░░ 0/100
✗ Session tracking configured (high)
✗ Trifecta detection (high)
✗ Policy escalation (high)
✗ Multi-step chain guard (medium)
NETWORK ISOLATION ████████░░░░░░░░ 50/100
✓ Sandbox network restrictions
✗ SSRF protection (high)
✗ Domain allowlisting (medium)
✗ Egress proxy configured (medium)
AUDIT INTEGRITY ░░░░░░░░░░░░░░░░ 0/100
✗ Audit logging enabled (high)
✗ Hash chain integrity (high)
✗ Tamper-evident configuration (medium)
✗ Audit export API available (low)
---
OVERALL SCORE ██████████░░░░░░ 63/100
Grade: C
Share: "My AI agent scored 63/100 on agent-red-team ⚠️"
CLI Reference
Usage: agent-red-team [options]
Options:
  --target <agent>     Override auto-detection (claude-code|openclaw|nemoclaw|cursor)
  --mcp-url <url>      MCP server URL for generic testing
  --active             Enable active probing mode
  --json               Output JSON report to stdout
  --category <name>    Run only one category:
                       injection | credentials | identity
                       behavioral | network | audit
  --verbose            Show individual test details and explanations
  --no-color           Disable colored output
  -V, --version        Output version number
  -h, --help           Display help
Examples
# Scan everything (auto-detects your agent)
npx agent-red-team
# Target a specific agent
npx agent-red-team --target claude-code
# Only check credential exposure
npx agent-red-team --category credentials
# Verbose output with all details
npx agent-red-team --verbose
# JSON output for CI pipelines
npx agent-red-team --json > report.json
# Active probing mode (attempts real attack sequences)
npx agent-red-team --active
Scoring Methodology
Each category produces a score from 0-100 based on the ratio of passed checks.
Categories that cannot be tested (e.g., no agent detected for that check) are marked N/A and their weight is redistributed proportionally across testable categories.
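The redistribution rule can be sketched as a weighted average over the testable categories only. This is a minimal sketch, not the tool's actual code: the weights come from the "What It Tests" table and the category keys reuse the `--category` names, while `overallScore`, `CategoryScores`, and the use of `null` for N/A are illustrative assumptions. Rounding is also left out because the README does not describe it.

```typescript
// Category weights from the "What It Tests" table (percent).
type Category =
  | "injection"
  | "credentials"
  | "identity"
  | "behavioral"
  | "network"
  | "audit";

const WEIGHTS: Record<Category, number> = {
  injection: 25,
  credentials: 20,
  identity: 15,
  behavioral: 20,
  network: 10,
  audit: 10,
};

// Assumption: null marks a category that could not be tested (N/A).
type CategoryScores = Record<Category, number | null>;

// Weighted average over testable categories. Dividing by the sum of the
// remaining weights is the same as redistributing the N/A weight
// proportionally across the categories that were tested.
function overallScore(scores: CategoryScores): number | null {
  let weightedSum = 0;
  let testableWeight = 0;
  for (const category of Object.keys(WEIGHTS) as Category[]) {
    const score = scores[category];
    if (score === null) continue; // N/A: leave its weight out entirely
    weightedSum += score * WEIGHTS[category];
    testableWeight += WEIGHTS[category];
  }
  // Nothing testable means there is no overall score to report.
  return testableWeight === 0 ? null : weightedSum / testableWeight;
}

// Network is N/A, so the other five categories share its 10% weight.
const example = overallScore({
  injection: 80,
  credentials: 80,
  identity: 80,
  behavioral: 80,
  network: null,
  audit: 80,
});
console.log(example); // 80
```

Under this reading, an N/A category neither helps nor hurts: a setup that scores 80 in every testable category still gets 80 overall.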
Letter grades:
| Grade | Score Range |
|---|---|
| A+ | 90-100 |
| A | 80-89 |
| B | 70-79 |
| C | 60-69 |
| D | 50-59 |
| D- | 40-49 |
| F | 0-39 |
The CLI exits with code 1 if the overall score is below 40 (grade F), making it suitable for CI gates.
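The grade bands and the exit-code rule can be expressed as two small functions. This is an illustrative sketch: `letterGrade` and `exitCode` are hypothetical names, not part of the package, and treating each band's lower bound as the cutoff for fractional scores (for example 89.5 counting as an A) is an assumption.

```typescript
// Maps an overall score (0-100) to the letter grade from the table above.
// Assumption: each band's lower bound is the cutoff for fractional scores.
function letterGrade(score: number): string {
  if (score >= 90) return "A+";
  if (score >= 80) return "A";
  if (score >= 70) return "B";
  if (score >= 60) return "C";
  if (score >= 50) return "D";
  if (score >= 40) return "D-";
  return "F";
}

// Mirrors the documented CI behaviour: exit code 1 only for grade F.
function exitCode(score: number): number {
  return score < 40 ? 1 : 0;
}

console.log(letterGrade(63), exitCode(63)); // C 0
```

Because the built-in gate only fails on grade F, a pipeline that wants a stricter bar would need to apply its own threshold to the reported score.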
How to Improve Your Score
- Injection Resistance -- Deploy Gatekeeper for injection scanning middleware
- Credential Exposure -- Use sandbox profiles, restrict file permissions, avoid env var secrets
- Identity Tampering -- Make CLAUDE.md and .cursorrules read-only, use Gatekeeper's identity guard
- Behavioral Evasion -- Enable session tracking with trifecta detection
- Network Isolation -- Configure sandbox network rules and egress proxy
- Audit Integrity -- Enable tamper-evident audit logging with hash chains
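As one concrete example of the identity tampering advice, the following sketch removes write permission from identity files using Node's `fs` module. `makeReadOnly` is a hypothetical helper, not part of agent-red-team, and the README does not say whether file mode alone is enough to pass the scanner's write-protection check, so treat it as a starting point.

```typescript
import * as fs from "node:fs";

// Removes write permission from each listed file that exists and
// returns the paths that were changed. Missing files are skipped.
function makeReadOnly(paths: string[]): string[] {
  const changed: string[] = [];
  for (const p of paths) {
    if (!fs.existsSync(p)) continue;
    fs.chmodSync(p, 0o444); // r--r--r--: readable by all, writable by none
    changed.push(p);
  }
  return changed;
}

// Example call from a project root:
//   makeReadOnly(["CLAUDE.md", ".cursorrules"]);
```

On POSIX systems the file owner can still restore write permission with another `chmod`, so this guards against accidental or unsophisticated edits rather than a process running with the owner's full privileges.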
Contributing
Contributions are welcome. Please open an issue first to discuss what you would like to change.
# Development setup
git clone https://github.com/knowledge2ai/agent-red-team.git
cd agent-red-team
npm install
npm run build
npm test
# Run locally
node dist/cli.js --verbose
When adding new attack checks:
- Create or modify the relevant module in `src/attacks/`
- Each check returns an `AttackResult` with `name`, `passed`, `severity`, and `detail`
- Add tests in `test/`
- Run `npm test` before submitting a PR
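A new check might look roughly like the sketch below. The four field names come from the list above and the severity levels match those shown in the sample output, but the field types, the `HypotheticalAgentConfig` shape, and the `checkAuditLogging` function are assumptions for illustration; consult the existing modules in `src/attacks/` for the real definitions.

```typescript
// Severity levels as they appear in the sample output.
type Severity = "low" | "medium" | "high";

// Assumed shape: the field names are documented, the types are guesses.
interface AttackResult {
  name: string;
  passed: boolean;
  severity: Severity;
  detail: string;
}

// Hypothetical input; the real checks may read agent settings differently.
interface HypotheticalAgentConfig {
  auditLogging?: boolean;
}

// Example check: passes only when audit logging is explicitly enabled.
function checkAuditLogging(config: HypotheticalAgentConfig): AttackResult {
  const passed = config.auditLogging === true;
  return {
    name: "Audit logging enabled",
    passed,
    severity: "high",
    detail: passed
      ? "Audit logging is enabled."
      : "No audit logging setting was found in the agent configuration.",
  };
}

console.log(checkAuditLogging({}).passed); // false
```

Keeping each check a pure function from configuration to `AttackResult` makes it straightforward to cover both the passing and failing paths in `test/`.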
License
Apache 2.0
Built by the team at Knowledge2 -- enterprise AI infrastructure with live knowledge feeds.
