@bernierllc/content-validator
v1.2.2
Content validation engine with comprehensive rules and sanitization
Comprehensive content validation engine with security scanning, sanitization, and multi-format support.
Installation
npm install @bernierllc/content-validator
Features
- Multi-format Support: Validates plain text, HTML, JSON, Markdown, XML, YAML, and CSV
- Security Scanning: Detects XSS, SQL injection, code injection, PII, and secrets
- Content Sanitization: Removes malicious content while preserving formatting
- Content Analysis: Analyzes structure, language, links, media, and accessibility
- Event-driven: Subscribe to validation events for real-time monitoring
- Customizable: Pre-configured validators for common use cases
- Type-safe: Full TypeScript support with strict typing
Usage
Basic Validation
import { ContentValidator, createContentValidator } from '@bernierllc/content-validator';

// Create a validator with the 'basic' and 'security' rulesets
const validator = createContentValidator(['basic', 'security']);

// Validate content
const result = await validator.validate('<p>Hello, world!</p>');
if (result.isValid) {
  console.log('Content is valid!');
  console.log('Sanitized:', result.sanitized);
} else {
  console.log('Validation errors:', result.errors);
}
Pre-configured Validators
import { validators } from '@bernierllc/content-validator';
// Basic validator for general text content (minimal security)
const basicValidator = validators.basic();
// Security-focused validator for user-generated content
const securityValidator = validators.security();
// CMS validator for published blog/article content
const cmsValidator = validators.cms();
// Code validator for development content (preserves formatting)
const codeValidator = validators.code();
const result = await securityValidator.validate(userContent);
Custom Configuration
import { ContentValidator } from '@bernierllc/content-validator';
const validator = new ContentValidator({
  rulesets: ['basic', 'security', 'content-quality'],
  maxContentLength: 1024 * 1024, // 1MB
  timeout: 5000,
  includeWarnings: true,
  includeSanitized: true,
  includeMetadata: true,
  sanitization: {
    enabled: true,
    preserveFormatting: true,
    removeComments: true,
    normalizeWhitespace: true,
    allowedTags: ['p', 'strong', 'em', 'a'],
    allowedAttributes: ['href', 'title'],
    maxLength: undefined
  },
  security: {
    checkXSS: true,
    checkSQLInjection: true,
    checkCodeInjection: true,
    checkMaliciousUrls: true,
    checkPhishing: true,
    scanBinaryContent: false,
    maxFileSize: 10 * 1024 * 1024, // 10MB
    checkPII: true,
    checkSecrets: true
  }
});
const result = await validator.validate(content);
Content Analysis
// Analyze content structure, language, links, media, and accessibility
const analysis = await validator.analyze(content);
console.log('Structure:', analysis.structure);
console.log('Language quality:', analysis.language);
console.log('Links found:', analysis.links);
console.log('Accessibility score:', analysis.accessibility.score);
Aspect-specific Validation
// Validate specific aspects of content
const securityResult = await validator.validateAspect(content, 'security');
const structureResult = await validator.validateAspect(content, 'structure');
const a11yResult = await validator.validateAspect(content, 'accessibility');
const languageResult = await validator.validateAspect(content, 'language');
Event Monitoring
// Subscribe to validation events
validator.on((event) => {
  console.log('Event:', event.type);
  if (event.type === 'security-threat-detected') {
    console.log('Security threat:', event.data);
  }
});

// Validate with event monitoring
const result = await validator.validate(content);
Content Type Detection
// Automatically detect content type
const contentType = validator.detectContentType(content);
console.log('Detected type:', contentType);
// Output: 'html', 'json', 'markdown', 'xml', 'yaml', 'csv', or 'plain-text'
API Reference
ContentValidator
Main validation class.
Methods
- validate(content: string, contentId?: string): Promise<ValidationResult> - Validate content with comprehensive rule checking
- analyze(content: string): Promise<ContentAnalysis> - Analyze content structure, language, links, media, and accessibility
- validateAspect(content: string, aspect: string): Promise<Partial<ValidationResult>> - Validate a specific aspect (security, structure, accessibility, language)
- detectContentType(content: string): ContentType - Detect content type
- updateConfig(config: Partial<ContentValidatorConfig>): void - Update validator configuration
- on(callback: (event: ValidationEvent) => void): void - Subscribe to validation events
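Since validate() resolves to a ValidationResult whose errors carry a severity, callers will often want to triage them. The following is a minimal sketch of one way to do that. It does not import the package: the interfaces are local copies of the shapes documented under Types, and countBySeverity, shouldReject, and the error codes in the sample are hypothetical names invented for illustration.

```typescript
// Local copies of the documented shapes, so the sketch runs without the package.
type Severity = 'critical' | 'high' | 'medium' | 'low';

interface ErrorShape {
  code: string;
  message: string;
  line?: number;
  column?: number;
  severity: Severity;
  suggestion?: string;
}

interface ResultShape {
  isValid: boolean;
  errors: ErrorShape[];
}

// Hypothetical helper (not part of the package): count errors per severity.
function countBySeverity(errors: ErrorShape[]): Record<Severity, number> {
  const counts: Record<Severity, number> = { critical: 0, high: 0, medium: 0, low: 0 };
  for (const error of errors) {
    counts[error.severity] += 1;
  }
  return counts;
}

// Hypothetical policy: reject content that has any critical or high error.
function shouldReject(result: ResultShape): boolean {
  const counts = countBySeverity(result.errors);
  return counts.critical > 0 || counts.high > 0;
}

// Hand-written sample; the error codes are made up for this example.
const sampleResult: ResultShape = {
  isValid: false,
  errors: [
    { code: 'EXAMPLE_XSS', message: 'Script tag found', line: 3, severity: 'critical' },
    { code: 'EXAMPLE_STYLE', message: 'Line too long', severity: 'low' },
  ],
};

console.log(countBySeverity(sampleResult.errors));
console.log('Reject?', shouldReject(sampleResult));
```

In an application, the object returned by validator.validate(content) would take the place of sampleResult, assuming it matches the documented ValidationResult shape.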
createContentValidator
Factory function for creating validators with common configurations.
createContentValidator(
  rulesets?: string[],
  options?: Partial<ContentValidatorConfig>
): ContentValidator
Pre-configured Validators
- validators.basic() - Basic validator for general text content
- validators.security() - Security-focused validator for user-generated content
- validators.cms() - CMS validator for published blog/article content
- validators.code() - Code validator for development content
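One way to decide between the four presets is to map each kind of incoming content to a preset name. The sketch below is an assumption about how an application might organize that choice, based only on the one-line descriptions above. ContentSource, presetBySource, and pickPreset are hypothetical and not part of the package.

```typescript
// Preset names match the factory functions listed above.
type PresetName = 'basic' | 'security' | 'cms' | 'code';

// Hypothetical categories an application might assign to incoming content.
type ContentSource = 'user-comment' | 'blog-post' | 'code-snippet' | 'internal-note';

// Hypothetical mapping that follows the preset descriptions in this README.
const presetBySource: Record<ContentSource, PresetName> = {
  'user-comment': 'security', // user-generated content
  'blog-post': 'cms',         // published blog/article content
  'code-snippet': 'code',     // development content
  'internal-note': 'basic',   // general text content
};

function pickPreset(source: ContentSource): PresetName {
  return presetBySource[source];
}

console.log(pickPreset('user-comment'));
```

With the package installed, the returned name could be used to call the matching factory, for example validators[pickPreset('user-comment')]().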
Types
ValidationResult
interface ValidationResult {
  isValid: boolean;
  contentType: ContentType;
  errors: ValidationError[];
  warnings: ValidationWarning[];
  sanitized?: string;
  metadata?: ValidationMetadata;
}
ValidationError
interface ValidationError {
  code: string;
  message: string;
  line?: number;
  column?: number;
  severity: 'critical' | 'high' | 'medium' | 'low';
  suggestion?: string;
}
ValidationWarning
interface ValidationWarning {
  code: string;
  message: string;
  line?: number;
  column?: number;
  suggestion?: string;
}
ContentAnalysis
interface ContentAnalysis {
  structure: StructureAnalysis;
  language: LanguageAnalysis;
  links: LinkAnalysis;
  media: MediaAnalysis;
  accessibility: AccessibilityAnalysis;
}
Configuration Options
Sanitization
- enabled: boolean - Enable content sanitization
- preserveFormatting: boolean - Preserve original formatting
- removeComments: boolean - Remove HTML/code comments
- normalizeWhitespace: boolean - Normalize whitespace
- allowedTags?: string[] - Allowed HTML tags
- allowedAttributes?: string[] - Allowed HTML attributes
- maxLength?: number - Maximum content length after sanitization
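When several validators share most sanitization settings, it can help to keep one baseline and override only what differs. The sketch below illustrates that pattern under stated assumptions: SanitizationOptions is a local copy of the option list above, the baseline values are taken from the Custom Configuration example (they are not confirmed to be the library's defaults), and withSanitization is a hypothetical helper.

```typescript
// Local copy of the sanitization options listed above.
interface SanitizationOptions {
  enabled: boolean;
  preserveFormatting: boolean;
  removeComments: boolean;
  normalizeWhitespace: boolean;
  allowedTags?: string[];
  allowedAttributes?: string[];
  maxLength?: number;
}

// Baseline copied from the Custom Configuration example in this README.
// Assumption: these are example values, not necessarily the library defaults.
const baseSanitization: SanitizationOptions = {
  enabled: true,
  preserveFormatting: true,
  removeComments: true,
  normalizeWhitespace: true,
  allowedTags: ['p', 'strong', 'em', 'a'],
  allowedAttributes: ['href', 'title'],
};

// Hypothetical helper: shallow-merge overrides onto the baseline.
// Arrays such as allowedTags are replaced as a whole, not concatenated.
function withSanitization(overrides: Partial<SanitizationOptions>): SanitizationOptions {
  return { ...baseSanitization, ...overrides };
}

// A stricter variant, for example for short user comments.
const commentSanitization = withSanitization({
  allowedTags: ['p', 'em'],
  maxLength: 2000,
});

console.log(commentSanitization);
```

The resulting object could then be passed as the sanitization field of a ContentValidator configuration.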
Security
- checkXSS: boolean - Check for XSS vulnerabilities
- checkSQLInjection: boolean - Check for SQL injection patterns
- checkCodeInjection: boolean - Check for code injection
- checkMaliciousUrls: boolean - Check for suspicious URLs
- checkPhishing: boolean - Check for phishing attempts
- scanBinaryContent: boolean - Scan binary content for malware
- maxFileSize: number - Maximum file size for binary scanning
- checkPII: boolean - Check for personally identifiable information
- checkSecrets: boolean - Check for API keys, tokens, credentials
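With nine security options, it is easy to lose track of which checks a given configuration actually turns on. As a rough sketch, the helper below lists the enabled flags for logging or review. SecurityOptions is a local copy of the list above, the sample values come from the Custom Configuration example, and enabledChecks is a hypothetical helper, not a package API.

```typescript
// Local copy of the security options listed above.
interface SecurityOptions {
  checkXSS: boolean;
  checkSQLInjection: boolean;
  checkCodeInjection: boolean;
  checkMaliciousUrls: boolean;
  checkPhishing: boolean;
  scanBinaryContent: boolean;
  maxFileSize: number;
  checkPII: boolean;
  checkSecrets: boolean;
}

// Every option except the numeric maxFileSize is an on/off flag.
type SecurityFlag = Exclude<keyof SecurityOptions, 'maxFileSize'>;

const securityFlags: SecurityFlag[] = [
  'checkXSS',
  'checkSQLInjection',
  'checkCodeInjection',
  'checkMaliciousUrls',
  'checkPhishing',
  'scanBinaryContent',
  'checkPII',
  'checkSecrets',
];

// Hypothetical helper: names of the flags that are switched on.
function enabledChecks(options: SecurityOptions): SecurityFlag[] {
  return securityFlags.filter((flag) => options[flag]);
}

// Values from the Custom Configuration example in this README.
const exampleSecurity: SecurityOptions = {
  checkXSS: true,
  checkSQLInjection: true,
  checkCodeInjection: true,
  checkMaliciousUrls: true,
  checkPhishing: true,
  scanBinaryContent: false,
  maxFileSize: 10 * 1024 * 1024, // 10MB
  checkPII: true,
  checkSecrets: true,
};

console.log(enabledChecks(exampleSecurity).join(', '));
```

For the example configuration, every flag except scanBinaryContent is reported.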
Integration Status
- Logger: Not applicable - Core validation utility
- Docs-suite: Ready - Comprehensive API documentation with examples
- NeverHub: Not applicable - Core validation utility
Dependencies
- @bernierllc/markdown-detector - Markdown flavor detection
- @bernierllc/crypto-utils - Cryptographic utilities for secret detection
License
Copyright (c) 2025 Bernier LLC. All rights reserved.
This file is licensed to the client under a limited-use license. The client may use and modify this code only within the scope of the project it was delivered for. Redistribution or use in other products or commercial offerings is not permitted without written consent from Bernier LLC.
