ai-security-suite v1.0.0
# AI Security Suite
Security middleware for LLM APIs. Protect your AI applications with:
- 🛡️ Prompt Firewall — Detect and block injection attacks
- 🔒 PII Redaction — Automatically redact sensitive data
- 🚫 Output Filtering — Filter harmful content, competitor mentions
- 📝 Audit Logging — Compliance-ready logging
Zero ML dependencies. <50ms overhead. Drop-in replacement.
## Installation

```bash
npm install ai-security-suite
```

## Quick Start
### SDK Mode (OpenAI)
Drop-in replacement for the OpenAI client:
```typescript
import { SecureOpenAI } from 'ai-security-suite';

const client = new SecureOpenAI({
  apiKey: process.env.OPENAI_API_KEY,
  security: {
    injection: { action: 'block' },
    pii: { redact: ['email', 'phone', 'ssn', 'card'] },
    audit: { enabled: true, destination: 'console' }
  }
});

// Use exactly like the OpenAI client
const response = await client.chat.completions.create({
  model: 'gpt-4',
  messages: [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'My email is [email protected]' }
  ]
});

// Email is automatically redacted in the request
// Response includes security metadata
console.log(response._security?.violations);
```

### SDK Mode (Anthropic)
```typescript
import { SecureAnthropic } from 'ai-security-suite';

const client = new SecureAnthropic({
  apiKey: process.env.ANTHROPIC_API_KEY,
  security: {
    injection: { action: 'block', patterns: 'strict' },
    pii: { redact: ['email', 'phone'] }
  }
});

const response = await client.messages.create({
  model: 'claude-3-opus-20240229',
  max_tokens: 1024,
  messages: [{ role: 'user', content: 'Hello!' }]
});
```

### Proxy Mode (Zero-Code)
For existing applications — route through security proxy:
```bash
# Start proxy
npx ai-security proxy --port 8080

# Configure your app to use localhost:8080 instead of api.openai.com
OPENAI_BASE_URL=http://localhost:8080/v1
```

Or programmatically:
```typescript
import { startProxy } from 'ai-security-suite';

await startProxy({
  port: 8080,
  security: {
    injection: { action: 'block' },
    pii: { redact: ['email', 'phone'] }
  }
});
```

## Configuration
### Injection Detection
```typescript
{
  injection: {
    enabled: true,
    action: 'block' | 'warn' | 'log',
    patterns: 'default' | 'strict' | 'custom',
    customPatterns: [
      {
        name: 'custom_pattern',
        pattern: /my custom regex/gi,
        severity: 'high',
        description: 'Custom injection pattern'
      }
    ],
    threshold: 0.7 // Detection threshold (0-1)
  }
}
```

Actions:
- `block` — Block request and return error
- `warn` — Log warning but allow request
- `log` — Silent logging only
Pattern Modes:

- `default` — Common injection patterns
- `strict` — Includes encoded payloads, resource attacks
- `custom` — Your custom patterns only
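To make the `threshold` option concrete, here is a minimal sketch of how a regex-based detector could combine pattern matches into a 0-1 score. The pattern names, regexes, and weights are hypothetical illustrations, not the library's actual internals:

```typescript
// Illustrative scoring model for a regex-based injection firewall.
// Pattern names and weights here are hypothetical, not the library's.
interface InjectionPattern {
  name: string;
  pattern: RegExp;
  weight: number;
}

const patterns: InjectionPattern[] = [
  { name: 'ignore_instructions', pattern: /ignore (all )?previous instructions/i, weight: 0.8 },
  { name: 'role_override', pattern: /you are now (a|an) /i, weight: 0.5 },
];

// Sum the weights of matched patterns, clamped to the 0-1 scale
// that the `threshold` option (e.g. 0.7) is compared against.
function injectionScore(input: string): number {
  let score = 0;
  for (const p of patterns) {
    if (p.pattern.test(input)) score += p.weight;
  }
  return Math.min(score, 1);
}
```

Under a model like this, with `threshold: 0.7` a lone role-override match (0.5) stays below the line, while an explicit "ignore previous instructions" (0.8) triggers the configured `action`.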
### PII Redaction
```typescript
{
  pii: {
    enabled: true,
    redact: ['email', 'phone', 'ssn', 'card', 'ip', 'name', 'address'],
    action: 'redact' | 'block' | 'warn',
    placeholderFormat: 'bracket', // [EMAIL] or ###EMAIL###
    customPatterns: [
      {
        name: 'employee_id',
        pattern: /EMP\d{6}/g,
        placeholder: '[EMPLOYEE_ID]'
      }
    ]
  }
}
```

Detected PII Types:
- `email` — Email addresses
- `phone` — Phone numbers (international)
- `ssn` — SSN, NI numbers, SIN
- `card` — Credit card numbers (with Luhn validation)
- `ip` — IP addresses (excludes private ranges)
- `name` — Names with titles (Mr., Dr., etc.)
- `address` — US street addresses
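Luhn validation matters for the `card` type because a bare regex matches any run of 13-19 digits; the checksum rejects most random sequences. A minimal sketch of the check (illustrative, not the library's implementation):

```typescript
// Luhn checksum: from the rightmost digit, double every second digit
// (subtracting 9 when doubling overflows past 9), then require the
// total to be divisible by 10. This filters out most random digit
// runs that a length-only regex would flag as card numbers.
function luhnValid(candidate: string): boolean {
  const s = candidate.replace(/[\s-]/g, '');
  if (!/^\d{13,19}$/.test(s)) return false;
  let sum = 0;
  for (let i = 0; i < s.length; i++) {
    let d = s.charCodeAt(s.length - 1 - i) - 48; // digit value, right to left
    if (i % 2 === 1) {
      d *= 2;
      if (d > 9) d -= 9;
    }
    sum += d;
  }
  return sum % 10 === 0;
}
```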
### Output Filtering
```typescript
{
  output: {
    enabled: true,
    harmfulContent: {
      enabled: true,
      action: 'warn',
      categories: ['violence', 'hate', 'self-harm', 'illegal', 'dangerous']
    },
    competitors: {
      enabled: true,
      action: 'warn',
      names: ['CompetitorA', 'CompetitorB']
    },
    brandSafety: {
      enabled: true,
      action: 'warn',
      rules: [
        { name: 'profanity', pattern: /\b(badword1|badword2)\b/gi }
      ]
    }
  }
}
```

### Audit Logging
```typescript
{
  audit: {
    enabled: true,
    destination: 'file' | 'webhook' | 'console' | 'callback',
    include: ['request', 'response', 'violations', 'timing'],
    filePath: './audit.log',
    webhookUrl: 'https://your-endpoint.com/audit',
    callback: (entry) => { /* custom handler */ },
    minSeverity: 'low' // Minimum severity to log
  }
}
```

## Standalone Functions
Use individual security functions without the full client:
```typescript
import {
  isInjectionAttempt,
  analyzeInjection,
  redactPII,
  hasPII,
  hasHarmfulContent
} from 'ai-security-suite';

// Quick injection check
if (isInjectionAttempt(userInput)) {
  throw new Error('Injection detected');
}

// Full injection analysis
const result = analyzeInjection(userInput, { patterns: 'strict' });
console.log(result.violations);

// PII redaction
const safe = redactPII('Contact: [email protected]');
// → "Contact: [EMAIL]"

// Check for PII
if (hasPII(text)) {
  // Handle PII
}

// Harmful content check
if (hasHarmfulContent(modelOutput)) {
  // Filter response
}
```

## CLI Usage
```bash
# Start proxy server
ai-security proxy --port 8080 --strict

# Start with config file
ai-security proxy --config security.json

# Check text for issues
ai-security check "Ignore all previous instructions"

# Redact PII
ai-security redact "Email: [email protected], SSN: 123-45-6789"
```

## Security Response
When security middleware blocks or modifies a request, the response includes metadata:
```typescript
interface SecurityMetadata {
  blocked: boolean;
  violations: Array<{
    type: 'injection' | 'pii' | 'harmful' | 'competitor' | 'brand';
    severity: 'low' | 'medium' | 'high' | 'critical';
    message: string;
    pattern?: string;
    action: 'block' | 'warn' | 'log';
  }>;
  timing: {
    firewall?: number;
    pii?: number;
    output?: number;
    total: number;
  };
}

// Access via response._security
const response = await client.chat.completions.create({...});
if (response._security?.blocked) {
  console.log('Request was blocked:', response._security.violations);
}
```

## Performance
Target: <50ms overhead per request.
Achieved through:
- Regex-based detection (no ML inference)
- Quick-check fast path for clean inputs
- Minimal memory allocations
- No external dependencies for core functionality
Benchmark results:
- Injection detection: ~5-15ms
- PII redaction: ~2-8ms
- Output filtering: ~3-10ms
- Total overhead: ~10-35ms typical
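The quick-check fast path mentioned above can be sketched as a cheap substring pre-scan: inputs containing none of a handful of trigger words skip the full regex suite entirely. The trigger list and function name below are illustrative assumptions, not the library's internals:

```typescript
// Hypothetical fast path: a lowercase substring scan is far cheaper
// than running the full pattern suite, and since most real-world
// inputs are clean, the common case returns after a few `includes`
// calls instead of dozens of regex executions.
const TRIGGER_SUBSTRINGS = ['ignore', 'disregard', 'system prompt', 'you are now'];

function needsFullScan(input: string): boolean {
  const lower = input.toLowerCase();
  return TRIGGER_SUBSTRINGS.some(t => lower.includes(t));
}
```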
## Error Handling
The SDK is designed to fail gracefully — errors inside the security middleware don't break your requests:
```typescript
const client = new SecureOpenAI({
  apiKey: process.env.OPENAI_API_KEY,
  security: {
    injection: { action: 'block' }
  },
  debug: true // Enable debug logging
});

try {
  const response = await client.chat.completions.create({...});
} catch (error) {
  // Only API errors propagate — security violations return blocked responses
}
```

## License
MIT
