xswarm-ai-sanitize
v2.0.0
Secret detection for AI agents — 600+ patterns, plugins for LangChain, LlamaIndex, Vercel AI, OpenClaw, Nanobot, and more.
Why This Matters
AI agents are increasingly given access to sensitive data sources: email inboxes, cloud storage, internal documents, and databases. This creates a critical security vulnerability:
User: "Search my emails for 'deployment'"
Agent: *searches Gmail*
Email contains: "Deploy with AWS_KEY=AKIAIOSFODNN7EXAMPLE"
Agent: *stores in memory/logs*
→ API key now persists in agent memory forever
xswarm-ai-sanitize sits between your AI agent and external data sources to automatically detect and redact secrets before they reach your agent's memory.
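To make the idea concrete, here is a minimal sketch of pattern-based redaction applied to the email from the scenario above. It is not the library's implementation: the single hand-written regex and the `redactSecrets` helper are assumptions for illustration only. The placeholder follows the `[REDACTED:type]` convention this README describes for sanitize mode, and the type label matches the `aws_access_key_id` name shown in the detect output.

```javascript
// Illustrative only: one hand-written pattern, not the library's 600+ rule set.
// Long-term AWS access key IDs start with "AKIA" and are 20 characters long.
const PATTERNS = [
  { type: 'aws_access_key_id', regex: /\bAKIA[0-9A-Z]{16}\b/g },
];

// Hypothetical helper: replace every match with a [REDACTED:type] placeholder
// and report how many replacements were made.
function redactSecrets(text) {
  let sanitized = text;
  let count = 0;
  for (const { type, regex } of PATTERNS) {
    sanitized = sanitized.replace(regex, () => {
      count += 1;
      return `[REDACTED:${type}]`;
    });
  }
  return { sanitized, count };
}

const email = 'Deploy with AWS_KEY=AKIAIOSFODNN7EXAMPLE';
const { sanitized, count } = redactSecrets(email);
console.log(sanitized); // Deploy with AWS_KEY=[REDACTED:aws_access_key_id]
console.log(count);     // 1
```

Running the redaction before the text is handed to the agent means that only the placeholder can end up in memory or logs.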
Quick Start
Interactive Setup Wizard
npx xswarm-ai-sanitize init
This launches an interactive wizard that:
- Detects which AI frameworks you have installed
- Shows integration options for each framework
- Provides copy-paste code examples
- Auto-installs plugins where supported (OpenClaw)
File/Directory Scanning
# Scan a directory for secrets
npx xswarm-ai-sanitize detect src/
# Scan with JSON output (perfect for CI/CD)
npx xswarm-ai-sanitize detect --json .
# Scan specific files
npx xswarm-ai-sanitize detect config.yml .env
The command exits with code 1 if secrets are found, which makes it useful in pre-commit hooks and CI/CD pipelines.
Text Sanitization
# Redact secrets from piped input
cat file.txt | npx xswarm-ai-sanitize sanitize
# Or from a file
npx xswarm-ai-sanitize sanitize config.yml
# Block mode (exit 1 if secrets found)
npx xswarm-ai-sanitize sanitize --block .env
Framework Integrations
| Framework | Plugin | Status |
|-----------|--------|--------|
| LangChain | xswarm-ai-sanitize/plugins/langchain | ✅ Ready |
| LlamaIndex | xswarm-ai-sanitize/plugins/llamaindex | ✅ Ready |
| Vercel AI SDK | xswarm-ai-sanitize/plugins/vercel-ai | ✅ Ready |
| OpenClaw | xswarm-ai-sanitize/plugins/openclaw | ✅ Ready |
| Nanobot | xswarm-ai-sanitize/plugins/nanobot | ✅ Ready |
| xSwarm | xswarm-ai-sanitize/plugins/xswarm | 🔜 Coming |
LangChain
import { LLMChain } from 'langchain/chains';
import { createSanitizeCallback, wrapTool } from 'xswarm-ai-sanitize/plugins/langchain';
// Option 1: Use callback handler (assumes llm and prompt are already defined)
const chain = new LLMChain({
  llm,
  prompt,
  callbacks: [createSanitizeCallback({ mode: 'sanitize' })]
});
// Option 2: Wrap individual tools (myTool is any existing LangChain tool)
const safeTool = wrapTool(myTool, { mode: 'sanitize' });
LlamaIndex
import { createSanitizePostprocessor } from 'xswarm-ai-sanitize/plugins/llamaindex';
const queryEngine = index.asQueryEngine({
  nodePostprocessors: [createSanitizePostprocessor({ mode: 'sanitize' })]
});
Vercel AI SDK
import { generateText } from 'ai';
import { sanitizeMiddleware, sanitizeTool } from 'xswarm-ai-sanitize/plugins/vercel-ai';
// Option 1: Use middleware (assumes model and prompt are already defined)
const result = await generateText({
  model,
  prompt,
  experimental_middleware: sanitizeMiddleware({ mode: 'sanitize' })
});
// Option 2: Wrap tools
const tools = {
  searchEmails: sanitizeTool(emailSearchTool, { mode: 'sanitize' })
};
OpenClaw
import createSanitizePlugin from 'xswarm-ai-sanitize/plugins/openclaw';
export default createSanitizePlugin({ mode: 'sanitize' });
Nanobot (MCP)
import { createSanitizeFilter } from 'xswarm-ai-sanitize/plugins/nanobot';
export default createSanitizeFilter({ mode: 'sanitize' });
CLI Commands
detect - Scan Files/Directories
Scan files or directories for secrets with detailed reporting.
# Scan a directory (respects .gitignore)
npx xswarm-ai-sanitize detect src/
# Scan specific files
npx xswarm-ai-sanitize detect config.yml .env secrets.txt
# JSON output for CI/CD pipelines
npx xswarm-ai-sanitize detect --json . > report.json
# Ignore .gitignore patterns
npx xswarm-ai-sanitize detect --no-gitignore .
Output Format:
src/config.js:23:15 [CRITICAL] aws_access_key_id
...const AWS_KEY = "AKIA...EXAMPLE"...
.env:5:12 [HIGH] database_url_postgres
DATABASE_URL=postgres://user:pass@host/db
2 secret(s) found
JSON Output:
{
  "version": "1.0.0",
  "timestamp": "2026-02-06T12:00:00.000Z",
  "summary": {
    "totalFiles": 2,
    "totalFindings": 2,
    "criticalCount": 1,
    "highCount": 1
  },
  "results": [...]
}
Exit Codes:
- 0: No secrets found (safe)
- 1: Secrets detected (use in pre-commit hooks)
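A CI job may want a finer-grained policy than the exit code alone, for example tolerating one high-severity finding while still failing on anything critical. The sketch below works only from the `summary` fields shown in the sample JSON report above; the `evaluateReport` helper and its `maxCritical` / `maxHigh` options are hypothetical names, and nothing is assumed about the layout of `results`.

```javascript
// Sample shaped like the JSON output shown above. Only `summary` is read,
// because the layout of `results` is not documented in this README.
const report = {
  version: '1.0.0',
  timestamp: '2026-02-06T12:00:00.000Z',
  summary: { totalFiles: 2, totalFindings: 2, criticalCount: 1, highCount: 1 },
  results: [],
};

// Hypothetical policy helper: decide whether the pipeline should fail.
function evaluateReport({ summary }, { maxCritical = 0, maxHigh = 0 } = {}) {
  const reasons = [];
  if (summary.criticalCount > maxCritical) {
    reasons.push(`${summary.criticalCount} critical finding(s)`);
  }
  if (summary.highCount > maxHigh) {
    reasons.push(`${summary.highCount} high finding(s)`);
  }
  return { pass: reasons.length === 0, reasons };
}

const verdict = evaluateReport(report, { maxHigh: 1 });
console.log(verdict.pass);    // false
console.log(verdict.reasons); // [ '1 critical finding(s)' ]
```

In a real pipeline the `report` object would come from something like `JSON.parse(fs.readFileSync('report.json', 'utf8'))`.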
sanitize - Redact Secrets from Text
Process text from stdin or files and redact secrets.
# Pipe text through sanitizer
cat .env | npx xswarm-ai-sanitize sanitize -q
# From file
npx xswarm-ai-sanitize sanitize config.yml
# Block mode (exit 1 if secrets found, don't redact)
npx xswarm-ai-sanitize sanitize --block --secrets 1 .env
Options:
- -b, --block: Block mode (exit 1 if secrets found)
- -s, --secrets N: Block threshold for secret count (default: 3)
- -q, --quiet: Suppress statistics
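The block threshold can be read as a simple decision rule. The sketch below is one plausible reading, not the library's code: `shouldBlock` is a hypothetical helper, findings are assumed to carry a `severity` label like the ones in the detect output, and treating both HIGH and CRITICAL as "high severity" is an assumption. The defaults mirror the values in this README (3 secrets for `--secrets`, and the `highSeverity: 1` setting from the Node.js API example).

```javascript
// Assumption: findings carry a `severity` label like those in the detect output,
// and both HIGH and CRITICAL count toward the high-severity threshold.
const HIGH_SEVERITY = new Set(['HIGH', 'CRITICAL']);

// Hypothetical decision rule: block when either threshold is reached.
function shouldBlock(findings, { secrets = 3, highSeverity = 1 } = {}) {
  const highCount = findings.filter((f) => HIGH_SEVERITY.has(f.severity)).length;
  if (highCount >= highSeverity) {
    return { blocked: true, reason: `${highCount} high-severity finding(s)` };
  }
  if (findings.length >= secrets) {
    return { blocked: true, reason: `${findings.length} secret(s) found` };
  }
  return { blocked: false, reason: null };
}

console.log(shouldBlock([{ severity: 'LOW' }, { severity: 'MEDIUM' }]).blocked); // false
console.log(shouldBlock([{ severity: 'CRITICAL' }]).blocked);                    // true
```

Under this reading, passing `--secrets 1` makes block mode fail on the first secret of any severity.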
init - Interactive Setup Wizard
Launch the interactive wizard to integrate with AI frameworks.
npx xswarm-ai-sanitize init
Detects installed frameworks and provides integration instructions.
Node.js API
import sanitize from 'xswarm-ai-sanitize';
// BLOCK Mode - Reject content with too many secrets
const blockResult = sanitize(emailContent, {
  mode: 'block',
  blockThreshold: {
    secrets: 3, // Block if 3+ secrets found
    highSeverity: 1 // Always block high-severity threats
  }
});
if (blockResult.blocked) {
  throw new Error(`Secrets detected: ${blockResult.reason}`);
}
// SANITIZE Mode - Always clean, never block
const cleanResult = sanitize(content, { mode: 'sanitize' });
console.log(cleanResult.sanitized); // Secrets replaced with [REDACTED:type]
Detection Capabilities
Secret Patterns (600+)
| Category | Examples | Count |
|----------|----------|-------|
| AI/ML Providers | OpenAI, Anthropic, Hugging Face, Groq, Cohere | 25+ |
| Cloud Providers | AWS, Azure, GCP, DigitalOcean, Linode, Vultr | 40+ |
| Version Control | GitHub, GitLab, Bitbucket, Gitea | 25+ |
| CI/CD | CircleCI, Travis, Jenkins, Buildkite, Vercel | 25+ |
| Payment | Stripe, PayPal, Square, Plaid, Coinbase | 25+ |
| Communication | Slack, Discord, Telegram, Twilio, SendGrid | 30+ |
| Databases | MongoDB, PostgreSQL, MySQL, Redis, Supabase | 30+ |
| Auth/Identity | Auth0, Okta, Clerk, Keycloak, Firebase | 20+ |
| And more... | CRM, Analytics, Maps, Blockchain, IoT, etc. | 300+ |
Entropy Analysis
Detects high-randomness strings that may be secrets without known prefixes:
- Shannon entropy calculation (threshold: 4.5)
- Minimum length filter (16 chars)
- Used as secondary validation for generic patterns
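The entropy check above can be sketched as follows. This is an illustration under stated assumptions rather than the library's code: entropy is taken as Shannon entropy in bits per character, the helper names `shannonEntropy` and `looksHighEntropy` are hypothetical, and whether the threshold comparison is inclusive is a guess. The 4.5 threshold and 16-character minimum come from the list above.

```javascript
// Shannon entropy in bits per character: H = -sum(p_i * log2(p_i)),
// where p_i is the relative frequency of each distinct character.
function shannonEntropy(str) {
  const chars = [...str];
  if (chars.length === 0) return 0;
  const counts = new Map();
  for (const ch of chars) {
    counts.set(ch, (counts.get(ch) ?? 0) + 1);
  }
  let entropy = 0;
  for (const n of counts.values()) {
    const p = n / chars.length;
    entropy -= p * Math.log2(p);
  }
  return entropy;
}

// Hypothetical filter combining the two documented limits.
// Assumption: the threshold comparison is inclusive.
function looksHighEntropy(str, { threshold = 4.5, minLength = 16 } = {}) {
  return str.length >= minLength && shannonEntropy(str) >= threshold;
}

console.log(shannonEntropy('aaaaaaaaaaaaaaaa'));                     // 0
console.log(shannonEntropy('abcdefghijklmnopqrstuvwxyz012345'));     // 5
console.log(looksHighEntropy('abcdefghijklmnopqrstuvwxyz012345'));   // true
```

Under this per-character definition a string needs at least 23 distinct characters to reach 4.5 bits (2^4.5 is about 22.6), so a 16-character string tops out at 4 bits and can never pass. The library may therefore combine the two filters differently than this sketch does.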
CI/CD Integration
GitHub Actions
name: Secret Detection
on: [push, pull_request]
jobs:
  scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Scan for secrets
        run: npx xswarm-ai-sanitize detect --json . > report.json
      - name: Upload report
        if: failure()
        uses: actions/upload-artifact@v4
        with:
          name: secret-scan-report
          path: report.json
Pre-commit Hook
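#!/bin/sh
# .git/hooks/pre-commit
# Scan staged files (added, copied, or modified) for secrets.
# -z and -0 keep file names that contain spaces intact.
git diff --cached --name-only --diff-filter=ACM -z | xargs -0 npx xswarm-ai-sanitize detect
if [ $? -ne 0 ]; then
  echo "❌ Commit blocked: secrets detected"
  exit 1
fi
GitLab CI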
secret_scan:
  stage: test
  script:
    - npx xswarm-ai-sanitize detect --json . > report.json
  artifacts:
    when: on_failure
    paths:
      - report.json
Key Features
- Zero Dependencies — Uses only Node.js built-ins
- Fully Synchronous — No async, no Promises, no network calls
- Fast — <5ms for typical documents
- Privacy-First — All processing local, zero external API calls
Performance
- 1KB content: <1ms
- 10KB content: <5ms
- 100KB content: <50ms
- Pattern compilation: one-time at module load
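These figures will vary with hardware and content, so it can be worth measuring on your own inputs. The harness below is a minimal sketch: `medianMs` is a hypothetical helper, and the stand-in workload is a single regex replace so that the snippet runs without the package installed. To measure the library itself, pass the real `sanitize` function instead of `standIn`.

```javascript
// Hypothetical micro-benchmark helper: median wall-clock time of fn(input) in ms.
// `performance` is a global in current Node.js releases.
function medianMs(fn, input, runs = 50) {
  const samples = [];
  for (let i = 0; i < runs; i += 1) {
    const start = performance.now();
    fn(input);
    samples.push(performance.now() - start);
  }
  samples.sort((a, b) => a - b);
  return samples[Math.floor(samples.length / 2)];
}

// Stand-in workload (an assumption for illustration, not the library):
// the regex is compiled once here, outside the timed loop.
const KEY_PATTERN = /\bAKIA[0-9A-Z]{16}\b/g;
const standIn = (text) => text.replace(KEY_PATTERN, '[REDACTED]');

const input = 'x'.repeat(10 * 1024); // roughly 10KB of content
console.log(`median: ${medianMs(standIn, input).toFixed(3)} ms`);
```

Using the median rather than the mean keeps one-off pauses (JIT warm-up, garbage collection) from skewing the result.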
Installation
npm install xswarm-ai-sanitize
Testing
npm test
Migration from v1.x to v2.0
Breaking Changes:
- Default behavior changed: Running npx xswarm-ai-sanitize without arguments now shows help instead of running the wizard
- New command required: The wizard is now behind the init command
Migration:
| v1.x | v2.0 |
|------|------|
| npx xswarm-ai-sanitize (wizard) | npx xswarm-ai-sanitize init |
| cat file \| npx xswarm-ai-sanitize (still works!) | cat file \| npx xswarm-ai-sanitize sanitize |
| N/A | npx xswarm-ai-sanitize detect src/ (new!) |
Backward Compatibility:
- Piped input without an explicit sanitize command still works for compatibility
- All Node.js API functions remain unchanged
- All framework plugins remain unchanged
New Features in v2.0:
- detect command for file/directory scanning
- JSON output for CI/CD integration
- Structured finding reports with file:line:column
- .gitignore support for directory scanning
- Binary file filtering
License
MIT
