ai-warden v1.1.1
AI security scanner - Detect prompt injection attacks and PII with user settings
🛡️ AI-Warden
Production-ready AI security scanner for Node.js and Python. Detect prompt injection attacks and PII leaks with dual-mode operation.
🎯 Two Modes, One Package
AI-Warden works in two modes to fit your needs:
🆓 Offline Mode (Free Forever)
Fast local pattern matching. No API key required. Perfect for:
- CI/CD pipelines and pre-commit hooks
- Privacy-sensitive applications (no data leaves your server)
- Quick local validation
- Testing and development
🚀 API Mode (Subscription)
Full Aegis 3-layer cascade protection via our API. Includes:
- Self-learning Vector DB (958+ attack patterns, growing daily)
- ML-powered semantic detection
- LLM validation for zero-day threats
- User-configurable settings
- Real-time pattern updates
- PII masking preferences
Get your API key: ai-warden.io/signup
Free tier: 5,000 validations/month (no credit card required)
📦 Installation
```bash
npm install ai-warden
```

🚀 Quick Start
Offline Mode (Free)
No signup required. Works completely offline with local pattern matching.
```javascript
const AIWarden = require('ai-warden');

// No API key = offline mode
const scanner = new AIWarden();

// Fast local validation (<1ms)
const result = scanner.scan('Ignore all previous instructions');
console.log(result.safe);      // false
console.log(result.riskScore); // 85
console.log(result.patterns);  // ['instruction_override']
```

What you get in offline mode:
- ✅ 100+ prompt injection patterns
- ✅ 34+ PII detection patterns (email, SSN, credit cards, IBAN, IP)
- ✅ Risk scoring (0-1000)
- ✅ Pattern categorization
- ✅ Works completely offline
- ✅ <1ms response time
- ✅ Zero cost
API Mode (Subscription)
Get full Aegis cascade protection with Vector DB, ML, and LLM validation.
```javascript
const AIWarden = require('ai-warden');

// With API key = API mode
const warden = new AIWarden(process.env.AI_WARDEN_API_KEY);

// Full Aegis cascade validation
const result = await warden.validate('Ignore all previous instructions');
console.log(result.blocked);    // true
console.log(result.layer);      // 'vector_db'
console.log(result.confidence); // 0.95
console.log(result.layer_name); // 'PERIMETER DEFENSE'
```

What you get in API mode:
- ✅ All offline features PLUS:
- ✅ Self-learning Vector DB (semantic similarity)
- ✅ ML-powered detection (ProtectAI DeBERTa model)
- ✅ LLM validation (Azure OpenAI gpt-4o-mini)
- ✅ User settings (custom whitelist, masking preferences)
- ✅ Real-time pattern updates
- ✅ Auto-capture of new attack variants
- ✅ 95% of requests resolved by the two fastest layers (Vector DB + patterns)
Pricing:
- FREE: 5,000 validations/month
- STARTER: €19/month (50K validations)
- GROWTH: €89/month (500K validations)
- ENTERPRISE: €599/month (unlimited)
📚 Usage Examples
Offline Mode (scan)
```javascript
const AIWarden = require('ai-warden');
const scanner = new AIWarden();

// Basic scan
const result = scanner.scan('User input text');
if (!result.safe) {
  console.log('⚠️ Threat detected');
  console.log('Risk score:', result.riskScore);
  console.log('Patterns:', result.patterns);
  console.log('Severity:', result.severity); // 'LOW', 'MEDIUM', 'HIGH', 'CRITICAL'
}

// With options
const strictResult = scanner.scan('Text to check', {
  mode: 'strict',  // 'strict' | 'balanced' | 'permissive'
  threshold: 75,   // Custom risk threshold
  verbose: true    // Detailed output
});
```

API Mode (validate)
```javascript
const AIWarden = require('ai-warden');
const warden = new AIWarden(process.env.AI_WARDEN_API_KEY);
const scanner = new AIWarden(); // keyless instance for offline fallback

try {
  // Full Aegis cascade validation
  const result = await warden.validate('User input text');
  if (result.blocked) {
    return res.status(400).json({
      error: 'Input rejected by security scanner',
      reason: result.reason
    });
  }
  // Process safe input (use cleanText if PII masking is enabled)
  processUserInput(result.cleanText || result.text);
} catch (error) {
  if (error.message.includes('API key required')) {
    console.error('Please sign up at https://ai-warden.io/signup');
  } else if (error.message.includes('API unavailable')) {
    // Fall back to offline mode
    const result = scanner.scan('User input text');
  }
}
```

Hybrid Approach (Best Practice)
Combine both modes for optimal performance and cost:
```javascript
const AIWarden = require('ai-warden');
const scanner = new AIWarden();
const warden = new AIWarden(process.env.AI_WARDEN_API_KEY);

async function validateInput(text) {
  // Step 1: Fast local pre-filter (offline, free)
  const quickCheck = scanner.scan(text);
  if (quickCheck.riskScore > 200) {
    // Obviously malicious, reject immediately
    return { blocked: true, reason: 'High-risk patterns detected' };
  }
  if (quickCheck.riskScore < 50) {
    // Obviously safe, accept immediately
    return { blocked: false, text };
  }
  // Step 2: Borderline case - send to the API for deep analysis
  const deepCheck = await warden.validate(text);
  return deepCheck;
}

// This approach saves API calls while maintaining security
const result = await validateInput(userInput);
```

PII Detection & Handling (3 Modes)
AI-Warden v1.0.3+ includes powerful PII detection with 3 handling modes:
- `ignore` - Detect PII but don't modify the text (just report findings)
- `mask` - Replace PII with labeled placeholders (`[EMAIL]`, `[SSN]`, etc.)
- `remove` - Remove PII from the text completely
```javascript
const { PIIDetector, PII_MODES } = require('ai-warden/src/pii');

const text = 'Contact: [email protected], SSN: 123-45-6789, Card: 5425-2334-3010-9903';

// Mode 1: IGNORE (detect only, don't modify)
const detector1 = new PIIDetector({ mode: PII_MODES.IGNORE });
const result1 = detector1.detect(text);
console.log(result1.hasPII);   // true
console.log(result1.count);    // 3
console.log(result1.findings); // Array of detected PII
console.log(result1.modified); // Original text (unchanged)

// Mode 2: MASK (replace with labels)
const detector2 = new PIIDetector({ mode: PII_MODES.MASK });
const result2 = detector2.detect(text);
console.log(result2.modified);
// "Contact: [EMAIL], SSN: [SSN], Card: [CREDIT_CARD]"

// Mode 3: REMOVE (delete PII completely)
const detector3 = new PIIDetector({ mode: PII_MODES.REMOVE });
const result3 = detector3.detect(text);
console.log(result3.modified);
// "Contact: , SSN: , Card: "

// Convenience methods (use the default mode from config)
const detector = new PIIDetector();
console.log(detector.hasPII(text));    // true
console.log(detector.maskPII(text));   // Quick mask
console.log(detector.removePII(text)); // Quick remove
```

Supported PII types:
- Credit Cards (Visa, Mastercard, Amex, Discover) - with Luhn validation
- US SSN (Social Security Numbers)
- Emails (RFC 5322 compliant)
- Phone numbers (US + international formats)
- IPv4/IPv6 addresses
- Swedish Personnummer, Norwegian Fødselsnummer, Danish CPR, Finnish Henkilötunnus
- IBAN (European bank accounts)
- US Passports & Driver Licenses
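The Luhn checksum mentioned above for credit cards (and Swedish personnummer) can be sketched in a few lines. This is the standard algorithm, not ai-warden's internal implementation:

```javascript
// Standard Luhn check: from the rightmost digit, double every second
// digit, subtract 9 from results above 9, and require the sum to be
// divisible by 10.
function luhnCheck(number) {
  const digits = number.replace(/\D/g, '').split('').reverse().map(Number);
  const sum = digits.reduce((acc, d, i) => {
    if (i % 2 === 1) {
      d *= 2;
      if (d > 9) d -= 9;
    }
    return acc + d;
  }, 0);
  return digits.length > 0 && sum % 10 === 0;
}

luhnCheck('5425-2334-3010-9903'); // the example card above passes
luhnCheck('5425-2334-3010-9904'); // a single changed digit fails
```

Rejecting digit strings that fail the checksum is what lets a detector skip look-alike numbers (order IDs, timestamps) and cut false positives.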
🎮 CLI Usage
AI-Warden includes a command-line tool for file, directory, and skill repo scanning.
```bash
# Install globally
npm install -g ai-warden
```

🆓 Offline Commands (Free — no API key needed)
These run entirely on your machine. No data leaves your system.
```bash
# Scan local files and directories
aiwarden scan file.txt
aiwarden scan ./src
aiwarden scan ./src --mode strict --verbose
aiwarden scan ./src --interactive

# Scan remote skill repos (local pattern matching)
aiwarden scan-skill https://github.com/user/skill --offline
aiwarden scan-skill https://github.com/user/skill --offline --json
aiwarden scan-skill https://github.com/user/skill --offline --strict
```

🔑 API Key Setup (required for API-powered features)
```bash
# Login (opens browser, saves key locally)
aiwarden login

# Check your current key and tier
aiwarden whoami

# Or set the key manually
export AI_WARDEN_API_KEY=sk_live_xxx
```

Get your API key: ai-warden.io/signup (free tier: 5,000 validations/month)
🚀 API-Powered Commands (requires API key)
Full Judge Mars ML analysis — near-zero false positives, deeper detection.
```bash
# Validate text input
aiwarden validate "User input text"
aiwarden validate "Text to check" --json

# Scan skill repos with the ML engine
aiwarden scan-skill https://github.com/user/skill
aiwarden scan-skill https://github.com/user/skill --json --strict
```

CLI Options
Scan options:
- `--mode <strict|balanced|permissive>` - Detection sensitivity
- `--verbose` - Detailed output
- `--interactive` - Interactive whitelist mode
- `--ignore-file <path>` - Custom .aiwardenignore file
Scan-skill options:
- `--offline` - Use local scanner only (free, no API key)
- `--json` - Machine-readable JSON output
- `--strict` - Exit code 1 unless verdict is SAFE
🔍 Skill Scanner — Example Output
```
🔍 AI-Warden Skill Scan
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Skill: smart-web-search
Source: github.com/davidme6/smart-web-search
Files: 4 scanned
Mode: offline

LICENSE    ✅ Safe (0.00)
README.md  ❌ CRITICAL (1.00)
├─ P102: Data Forwarding Instructions [CRITICAL] — "Email**: [email protected]"
└─ H003: Excessive External URLs [LOW] — "Found 11 external URLs"
SKILL.md   ✅ Safe (0.19)
└─ H003: Excessive External URLs [LOW] — "Found 20 external URLs"
_meta.json ✅ Safe (0.00)

Verdict: ❌ DANGEROUS
Trust Score: 0/100
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
```

Verdicts

| Verdict | Trust Score | Meaning |
|---------|-------------|---------|
| ✅ SAFE | 70-100 | No threats detected |
| ⚠️ WARNING | 25-69 | Suspicious patterns found, review recommended |
| ❌ DANGEROUS | 0-24 | Active threats detected, do not install |
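The trust-score bands can be expressed as a tiny helper. The band edges come from the verdict table; the function itself is ours, not part of the CLI:

```javascript
// Map a 0-100 trust score to the verdict bands from the table:
// 70-100 SAFE, 25-69 WARNING, 0-24 DANGEROUS.
function verdictFromTrustScore(score) {
  if (score >= 70) return 'SAFE';
  if (score >= 25) return 'WARNING';
  return 'DANGEROUS';
}

verdictFromTrustScore(85); // 'SAFE'
verdictFromTrustScore(40); // 'WARNING'
verdictFromTrustScore(0);  // 'DANGEROUS'
```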
Offline vs API Mode
| | Offline (free) | API (metered) |
|---|---|---|
| Detection | Regex patterns | Judge Mars ML + patterns |
| Speed | Instant | ~150ms/file |
| False positives | Higher | Lower |
| Zero-day threats | ❌ | ✅ |
| Requires API key | No | Yes |
🔧 Configuration
Constructor Options
```javascript
const warden = new AIWarden('sk_live_xxx', {
  apiUrl: 'https://api.ai-warden.io', // API endpoint
  mode: 'balanced',                   // Scanner mode
  threshold: 150,                     // Custom risk threshold
  verbose: false,                     // Verbose logging
  context: 'user'                     // Content context
});
```

Scanner Modes
| Mode | Threshold | Use Case |
|------|-----------|----------|
| strict | 75 | High-security apps (financial, healthcare) |
| balanced | 150 | General production use (default) |
| permissive | 250 | Creative AI apps, lower false positives |
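As an illustration of what the thresholds mean (this is our sketch, not the library's internals), a mode's threshold plausibly acts as the cut-off between a safe and a flagged risk score:

```javascript
// Hypothetical decision rule: a scan is "safe" while its risk score
// stays below the active mode's threshold from the table above.
const MODE_THRESHOLDS = { strict: 75, balanced: 150, permissive: 250 };

function isSafeAt(riskScore, mode = 'balanced') {
  return riskScore < MODE_THRESHOLDS[mode];
}

isSafeAt(100, 'strict');     // flagged under strict
isSafeAt(100, 'balanced');   // passes the default mode
isSafeAt(200, 'permissive'); // passes the permissive mode
```

The same input can therefore be blocked in `strict` and allowed in `permissive`, which is why creative AI apps pick the higher threshold.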
API Methods
scan(text, options) - Offline Mode
Local pattern matching. No API key required.
```javascript
scanner.scan(text, {
  mode: 'balanced',
  threshold: 150,
  verbose: false
});
```

Returns:

```javascript
{
  safe: boolean,
  riskScore: number,    // 0-1000
  patterns: string[],   // Matched pattern names
  severity: string,     // 'SAFE', 'LOW', 'MEDIUM', 'HIGH', 'CRITICAL'
  findings: object[],   // Detailed findings
  piiFindings: object[] // Detected PII
}
```

validate(text, options) - API Mode
Full Aegis cascade via API. Requires API key.
```javascript
await warden.validate(text, {
  threatModel: 'prompt_injection',
  context: 'user'
});
```

Returns:

```javascript
{
  safe: boolean,
  blocked: boolean,
  layer: string,          // 'vector_db' | 'pattern' | 'ml' | 'llm'
  layer_name: string,     // Human-readable layer name
  confidence: number,     // 0.0-1.0
  reason: string,         // Block reason
  cleanText: string,      // PII-masked text (if enabled)
  appliedSettings: object // User settings applied
}
```

Throws: Error if no API key provided
detectPII(text, options) - PII Detection
Detect personally identifiable information.
```javascript
scanner.detectPII(text, {
  types: ['email', 'ssn', 'credit_card'] // Optional filter
});
```

Returns:

```javascript
{
  types: string[],    // PII types found
  findings: object[]  // Detailed findings with positions
}
```

maskPII(text, findings, options) - PII Masking
Mask detected PII in text.
```javascript
scanner.maskPII(text, findings, {
  maskChar: '*',
  preserveLength: true
});
```

🎯 Use Cases
1. Production API Input Validation
```javascript
app.post('/api/chat', async (req, res) => {
  const { message } = req.body;

  // Validate with the API
  const result = await warden.validate(message);
  if (result.blocked) {
    return res.status(400).json({
      error: 'Message rejected',
      reason: result.reason
    });
  }

  // Safe to send to the LLM (cleanText is the PII-masked text, if enabled)
  const response = await openai.chat.completions.create({
    messages: [{ role: 'user', content: result.cleanText || message }]
  });
  res.json({ response: response.choices[0].message.content });
});
```

2. CI/CD Pre-commit Hook
```bash
#!/bin/bash
# .git/hooks/pre-commit
npx aiwarden scan ./prompts --mode strict
if [ $? -ne 0 ]; then
  echo "❌ Prompt injection detected in prompts/"
  exit 1
fi
```

3. Privacy-First PII Scrubbing
```javascript
const scanner = new AIWarden();

function sanitizeUserData(data) {
  const pii = scanner.detectPII(data);
  if (pii.findings.length > 0) {
    return scanner.maskPII(data, pii.findings);
  }
  return data;
}

// Logs safe to store
const cleanLog = sanitizeUserData(userMessage);
db.logs.insert({ message: cleanLog });
```

4. Real-time Chat Moderation
```javascript
// Fast pre-filter with offline mode
const quickCheck = scanner.scan(message);
if (quickCheck.riskScore > 200) {
  socket.emit('message_blocked', { reason: 'Security policy' });
  return;
}

// Deep check with the API (async, doesn't block the user)
warden.validate(message).then(result => {
  if (result.blocked) {
    moderationQueue.add({ message, user, result });
  }
});
```

🔐 Supported PII Types
| Type | Examples | Validation |
|------|----------|------------|
| Email | [email protected] | RFC 5322 |
| Phone | +1-555-123-4567 | International formats |
| SSN (US) | 123-45-6789 | Checksum |
| SSN (SE) | 19900101-1234 | Luhn algorithm |
| Credit Card | 4532-1111-2222-3333 | Luhn algorithm |
| IBAN | DE89370400440532013000 | Mod-97 checksum |
| IP Address | 192.168.1.1 | IPv4 & IPv6 |
| API Keys | sk_live_xxx | Common patterns |
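The Mod-97 check used for IBANs is the standard ISO 13616 validation: move the first four characters to the end, replace letters with numbers (A=10 … Z=35), and require the resulting integer mod 97 to equal 1. A minimal sketch (not ai-warden's implementation):

```javascript
// ISO 13616 Mod-97 IBAN validation, using BigInt since the numeric
// form easily exceeds Number.MAX_SAFE_INTEGER.
function ibanCheck(iban) {
  const s = iban.replace(/\s+/g, '').toUpperCase();
  const rearranged = s.slice(4) + s.slice(0, 4);
  const numeric = rearranged.replace(/[A-Z]/g, c => (c.charCodeAt(0) - 55).toString());
  return BigInt(numeric) % 97n === 1n;
}

ibanCheck('DE89 3704 0044 0532 0130 00'); // the example IBAN above validates
ibanCheck('DE89370400440532013001');      // one changed digit fails
```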
📊 Performance
| Mode | Avg Response Time | API Calls | Cost |
|------|-------------------|-----------|------|
| Offline (scan) | <1ms | 0 | FREE |
| API (validate) - Vector DB | 50-80ms | 1 | ~€0.001 |
| API (validate) - Pattern | <1ms | 1 | ~€0.001 |
| API (validate) - ML | ~400ms | 1 | ~€0.002 |
| API (validate) - LLM | ~1200ms | 1 | ~€0.005 |
Aegis Cascade Intelligence:
- 60% of attacks caught by Vector DB (50-80ms)
- 35% caught by Pattern layer (<1ms)
- 4% require ML validation (~400ms)
- 1% require LLM validation (~1200ms)
Result: 95% of requests are resolved by the two fastest layers, without ever touching the ML or LLM stages.
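A back-of-envelope check of the cascade numbers above (using the midpoint of the 50-80ms Vector DB range; this is arithmetic on the published figures, not a measurement):

```javascript
// Expected per-request latency = sum of (layer share × layer latency).
const layers = [
  { name: 'vector_db', share: 0.60, ms: 65 },   // midpoint of 50-80ms
  { name: 'pattern',   share: 0.35, ms: 0.5 },
  { name: 'ml',        share: 0.04, ms: 400 },
  { name: 'llm',       share: 0.01, ms: 1200 },
];

const expectedMs = layers.reduce((acc, l) => acc + l.share * l.ms, 0);
// ≈ 67 ms expected latency, with 95% of requests never reaching ML or LLM
```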
🛡️ Security Best Practices
- Never trust user input - Always validate before sending to LLMs
- Use hybrid approach - Local pre-filter + API for borderline cases
- Mask PII - Enable PII masking in your dashboard settings
- Monitor false positives - Use interactive whitelist mode in dev
- Keep patterns updated - Run `npm update ai-warden` regularly
- Rate limit - Protect your API quota with rate limiting
- Log blocked attempts - Track attack patterns in your logs
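The rate-limiting advice can be sketched as a simple token bucket guarding your validation quota. The class and numbers are illustrative, not part of ai-warden:

```javascript
// Token bucket: holds up to `capacity` tokens, refilled continuously at
// `refillPerSec`; each call that gets a token may hit the API, the rest
// should fall back to the offline scanner.
class TokenBucket {
  constructor(capacity, refillPerSec) {
    this.capacity = capacity;
    this.tokens = capacity;
    this.refillPerSec = refillPerSec;
    this.last = Date.now();
  }

  tryRemove() {
    const now = Date.now();
    this.tokens = Math.min(
      this.capacity,
      this.tokens + ((now - this.last) / 1000) * this.refillPerSec
    );
    this.last = now;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}

const bucket = new TokenBucket(5, 1); // burst of 5, refills 1 token/sec
// if (bucket.tryRemove()) { await warden.validate(text); } else { scanner.scan(text); }
```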
🔗 Links
- Website: ai-warden.io
- Dashboard: ai-warden.io/dashboard
- Pricing: ai-warden.io/pricing
- NPM Package: npmjs.com/package/ai-warden
- GitHub: github.com/ai-warden/scanner
- Support: [email protected]
📝 License
MIT License - see LICENSE file for details
🙏 Credits
Built with ❤️ by the AI-Warden team
Powered by:
- ProtectAI - ML detection model
- Azure OpenAI - LLM validation
- FAISS - Vector similarity search
