@bernierllc/content-validator

Version: v1.2.2
Published: 2 months ago

Content validation engine with comprehensive rules and sanitization

Vulnerabilities: 0 High, 0 Medium, 0 Low
Maintainers: alikhan410, mkbernier
Keywords: content, validation, sanitization, security, atomic, core

Comprehensive content validation engine with security scanning, sanitization, and multi-format support.

Installation

npm install @bernierllc/content-validator

Features

Multi-format Support: Validates plain text, HTML, JSON, Markdown, XML, YAML, and CSV
Security Scanning: Detects XSS, SQL injection, code injection, PII, and secrets
Content Sanitization: Removes malicious content while preserving formatting
Content Analysis: Analyzes structure, language, links, media, and accessibility
Event-driven: Subscribe to validation events for real-time monitoring
Customizable: Pre-configured validators for common use cases, plus full custom configuration
Type-safe: Full TypeScript support with strict typing

Usage

Basic Validation

import { createContentValidator } from '@bernierllc/content-validator';

// Create a validator with the 'basic' and 'security' rulesets and default options
const validator = createContentValidator(['basic', 'security']);

// Validate content
const result = await validator.validate('<p>Hello, world!</p>');

if (result.isValid) {
  console.log('Content is valid!');
  console.log('Sanitized:', result.sanitized);
} else {
  console.log('Validation errors:', result.errors);
}

Pre-configured Validators

import { validators } from '@bernierllc/content-validator';

// Basic validator for general text content (minimal security)
const basicValidator = validators.basic();

// Security-focused validator for user-generated content
const securityValidator = validators.security();

// CMS validator for published blog/article content
const cmsValidator = validators.cms();

// Code validator for development content (preserves formatting)
const codeValidator = validators.code();

const result = await securityValidator.validate(userContent);
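
The four presets above map naturally onto where the content comes from. The sketch below shows one way to make that choice explicit in application code. The `ContentSource` categories and the `pickPreset` helper are hypothetical names invented for this illustration; only the preset names `basic`, `security`, `cms`, and `code` come from this README.

```typescript
// The four preset names documented in this README.
type PresetName = "basic" | "security" | "cms" | "code";

// Hypothetical content categories for an application; not part of the package.
type ContentSource = "internal-note" | "user-comment" | "blog-post" | "code-snippet";

// One explicit table, so adding a new source forces a decision about its preset.
const PRESET_FOR_SOURCE: Record<ContentSource, PresetName> = {
  "internal-note": "basic",
  "user-comment": "security",
  "blog-post": "cms",
  "code-snippet": "code",
};

// Return the name of the preset to use for a given content source.
function pickPreset(source: ContentSource): PresetName {
  return PRESET_FOR_SOURCE[source];
}
```

Assuming `validators` is a plain object keyed by these names, as the calls above suggest, `validators[pickPreset('user-comment')]()` would produce the security-focused validator.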

Custom Configuration

import { ContentValidator } from '@bernierllc/content-validator';

const validator = new ContentValidator({
  rulesets: ['basic', 'security', 'content-quality'],
  maxContentLength: 1024 * 1024, // 1MB
  timeout: 5000,
  includeWarnings: true,
  includeSanitized: true,
  includeMetadata: true,
  sanitization: {
    enabled: true,
    preserveFormatting: true,
    removeComments: true,
    normalizeWhitespace: true,
    allowedTags: ['p', 'strong', 'em', 'a'],
    allowedAttributes: ['href', 'title'],
    maxLength: undefined
  },
  security: {
    checkXSS: true,
    checkSQLInjection: true,
    checkCodeInjection: true,
    checkMaliciousUrls: true,
    checkPhishing: true,
    scanBinaryContent: false,
    maxFileSize: 10 * 1024 * 1024, // 10MB
    checkPII: true,
    checkSecrets: true
  }
});

const result = await validator.validate(content);

Content Analysis

// Analyze content structure, language, links, media, and accessibility
const analysis = await validator.analyze(content);

console.log('Structure:', analysis.structure);
console.log('Language quality:', analysis.language);
console.log('Links found:', analysis.links);
console.log('Accessibility score:', analysis.accessibility.score);

Aspect-specific Validation

// Validate specific aspects of content
const securityResult = await validator.validateAspect(content, 'security');
const structureResult = await validator.validateAspect(content, 'structure');
const a11yResult = await validator.validateAspect(content, 'accessibility');
const languageResult = await validator.validateAspect(content, 'language');
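
The four calls above are independent of each other, so they can run concurrently. The following is a minimal sketch of that idea, not part of the package. `AspectValidator` is a hypothetical structural type that covers only the documented `validateAspect` method, and `validateAspects` is a helper name made up for this example. It assumes the partial result includes `isValid` when the aspect was actually checked.

```typescript
// Hypothetical minimal shape; the package's ContentValidator is assumed to satisfy it.
interface AspectValidator {
  validateAspect(content: string, aspect: string): Promise<{ isValid?: boolean }>;
}

// The aspect names listed in this README.
type Aspect = "security" | "structure" | "accessibility" | "language";

// Run several aspect checks concurrently and collect their isValid flags.
async function validateAspects(
  validator: AspectValidator,
  content: string,
  aspects: Aspect[],
): Promise<Partial<Record<Aspect, boolean>>> {
  const pairs = await Promise.all(
    aspects.map(async (aspect) => {
      const partial = await validator.validateAspect(content, aspect);
      return [aspect, partial.isValid] as const;
    }),
  );

  const summary: Partial<Record<Aspect, boolean>> = {};
  for (const [aspect, isValid] of pairs) {
    // Skip aspects whose partial result carried no isValid flag.
    if (isValid !== undefined) summary[aspect] = isValid;
  }
  return summary;
}
```

`Promise.all` rejects as soon as any single check rejects, so callers that want every result regardless of failures would need to catch errors per aspect.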

Event Monitoring

// Subscribe to validation events
validator.on((event) => {
  console.log('Event:', event.type);

  if (event.type === 'security-threat-detected') {
    console.log('Security threat:', event.data);
  }
});

// Validate with event monitoring
const result = await validator.validate(content);
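
Because `on` takes a single callback and the README documents no unsubscribe method, one simple pattern is to register a listener once and let it accumulate the events of interest. The sketch below assumes only what is shown above: events carry a `type` and a `data` field, and `'security-threat-detected'` is one possible type. `ValidationEventLike`, `ValidationEventSource`, and `collectThreats` are hypothetical names for this illustration.

```typescript
// Hypothetical minimal shapes mirroring the usage shown in this README.
interface ValidationEventLike {
  type: string;
  data?: unknown;
}

interface ValidationEventSource {
  on(callback: (event: ValidationEventLike) => void): void;
}

// Subscribe once and return an array that fills up as threat events arrive.
function collectThreats(source: ValidationEventSource): ValidationEventLike[] {
  const threats: ValidationEventLike[] = [];
  source.on((event) => {
    if (event.type === "security-threat-detected") {
      threats.push(event);
    }
  });
  return threats;
}
```

A caller would invoke `collectThreats(validator)` before `validator.validate(content)` and inspect the returned array afterwards.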

Content Type Detection

// Automatically detect content type
const contentType = validator.detectContentType(content);

console.log('Detected type:', contentType);
// One of: 'html', 'json', 'markdown', 'xml', 'yaml', 'csv', or 'plain-text'
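
The README does not describe how `detectContentType` decides. To give a feel for what such detection can involve, here is a deliberately simplified heuristic. It is an illustration only and not the package's algorithm: `guessContentType` is a hypothetical function, it never returns `'yaml'` or `'csv'`, and it labels XML that lacks an `<?xml` declaration as HTML.

```typescript
type ContentType = "html" | "json" | "markdown" | "xml" | "yaml" | "csv" | "plain-text";

// Simplified guess based on surface patterns; checks run from most to least specific.
function guessContentType(content: string): ContentType {
  const text = content.trim();
  if (text === "") return "plain-text";

  // JSON: must look like an object or array and parse without error.
  if (text.charAt(0) === "{" || text.charAt(0) === "[") {
    try {
      JSON.parse(text);
      return "json";
    } catch (_err) {
      // Not JSON; fall through to the remaining checks.
    }
  }

  // XML: only recognized here when it starts with an XML declaration.
  if (text.indexOf("<?xml") === 0) return "xml";

  // HTML: any opening or closing tag such as <p> or </div>.
  if (/<\/?[a-z][a-z0-9]*(\s[^<>]*)?>/i.test(text)) return "html";

  // Markdown: ATX headings, list items, block quotes, or inline links.
  if (/^(#{1,6}\s|[-*]\s|>\s)|\[[^\]]+\]\([^)]+\)/m.test(text)) return "markdown";

  return "plain-text";
}
```

The ordering matters: a Markdown link such as `[docs](https://example.com)` starts with `[`, fails to parse as JSON, and only then reaches the Markdown check.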

API Reference

ContentValidator

Main validation class.

Methods

validate(content: string, contentId?: string): Promise<ValidationResult> - Validate content with comprehensive rule checking
analyze(content: string): Promise<ContentAnalysis> - Analyze content structure, language, links, media, and accessibility
validateAspect(content: string, aspect: string): Promise<Partial<ValidationResult>> - Validate a single aspect of the content ('security', 'structure', 'accessibility', or 'language')
detectContentType(content: string): ContentType - Detect content type
updateConfig(config: Partial<ContentValidatorConfig>): void - Update validator configuration
on(callback: (event: ValidationEvent) => void): void - Subscribe to validation events

createContentValidator

Factory function for creating validators with common configurations.

createContentValidator(
  rulesets?: string[],
  options?: Partial<ContentValidatorConfig>
): ContentValidator

Pre-configured Validators

validators.basic() - Basic validator for general text content
validators.security() - Security-focused validator for user-generated content
validators.cms() - CMS validator for published blog/article content
validators.code() - Code validator for development content

Types

ValidationResult

interface ValidationResult {
  isValid: boolean;
  contentType: ContentType;
  errors: ValidationError[];
  warnings: ValidationWarning[];
  sanitized?: string;
  metadata?: ValidationMetadata;
}
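
Since `sanitized` is optional, callers that want the cleaned string need to handle both an invalid result and a missing `sanitized` field. The sketch below shows one way to do that. `sanitizedOrThrow` and `ResultLike` are hypothetical names, and the guess that `sanitized` is populated only when `includeSanitized` is enabled is an assumption based on the configuration example, not something this README states.

```typescript
// Hypothetical subset of ValidationResult containing only the fields used here.
interface ResultLike {
  isValid: boolean;
  errors: { code: string; message: string }[];
  sanitized?: string;
}

// Return the sanitized content, or throw with a message explaining why it is unavailable.
function sanitizedOrThrow(result: ResultLike): string {
  if (!result.isValid) {
    const codes = result.errors.map((e) => e.code).join(", ");
    throw new Error(`Content failed validation: ${codes}`);
  }
  if (result.sanitized === undefined) {
    // Assumption: sanitized output is omitted unless includeSanitized is enabled.
    throw new Error("No sanitized output on result; check the includeSanitized option");
  }
  return result.sanitized;
}
```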

ValidationError

interface ValidationError {
  code: string;
  message: string;
  line?: number;
  column?: number;
  severity: 'critical' | 'high' | 'medium' | 'low';
  suggestion?: string;
}
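
The `severity`, `line`, `column`, and `suggestion` fields make it straightforward to produce a readable report. As a minimal sketch, the helper below sorts errors from most to least severe and formats each one on a single line. `formatErrors` is a hypothetical helper, and the error code in the usage comment is invented, since this README does not list the codes the package emits.

```typescript
type Severity = "critical" | "high" | "medium" | "low";

// Same shape as the ValidationError interface documented above.
interface ValidationError {
  code: string;
  message: string;
  line?: number;
  column?: number;
  severity: Severity;
  suggestion?: string;
}

// Lower number means more severe, so sorting ascending puts critical first.
const SEVERITY_ORDER: Record<Severity, number> = { critical: 0, high: 1, medium: 2, low: 3 };

// Return one formatted line per error, most severe first, without mutating the input.
function formatErrors(errors: ValidationError[]): string[] {
  return errors
    .slice()
    .sort((a, b) => SEVERITY_ORDER[a.severity] - SEVERITY_ORDER[b.severity])
    .map((e) => {
      let where = "";
      if (e.line !== undefined) {
        where = e.column !== undefined ? ` (line ${e.line}, column ${e.column})` : ` (line ${e.line})`;
      }
      const hint = e.suggestion ? ` Suggestion: ${e.suggestion}` : "";
      return `[${e.severity}] ${e.code}: ${e.message}${where}${hint}`;
    });
}

// Usage: formatErrors(result.errors).forEach((line) => console.error(line));
```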

ValidationWarning

interface ValidationWarning {
  code: string;
  message: string;
  line?: number;
  column?: number;
  suggestion?: string;
}

ContentAnalysis

interface ContentAnalysis {
  structure: StructureAnalysis;
  language: LanguageAnalysis;
  links: LinkAnalysis;
  media: MediaAnalysis;
  accessibility: AccessibilityAnalysis;
}

Configuration Options

Sanitization

enabled: boolean - Enable content sanitization
preserveFormatting: boolean - Preserve original formatting
removeComments: boolean - Remove HTML/code comments
normalizeWhitespace: boolean - Normalize whitespace
allowedTags?: string[] - Allowed HTML tags
allowedAttributes?: string[] - Allowed HTML attributes
maxLength?: number - Maximum content length after sanitization

Security

checkXSS: boolean - Check for XSS vulnerabilities
checkSQLInjection: boolean - Check for SQL injection patterns
checkCodeInjection: boolean - Check for code injection
checkMaliciousUrls: boolean - Check for suspicious URLs
checkPhishing: boolean - Check for phishing attempts
scanBinaryContent: boolean - Scan binary content for malware
maxFileSize: number - Maximum file size for binary scanning
checkPII: boolean - Check for personally identifiable information
checkSecrets: boolean - Check for API keys, tokens, credentials
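
The README does not say how the `checkPII` and `checkSecrets` checks are implemented. Purely to illustrate the general idea of pattern-based scanning, the sketch below looks for two well-known shapes: email addresses and strings that look like AWS access key IDs. `scanForSensitive`, `Finding`, and the pattern list are hypothetical and far narrower than what a real scanner would need.

```typescript
interface Finding {
  kind: "email" | "aws-access-key-id";
  match: string;
  index: number;
}

// Illustrative patterns only; real detection needs many more rules and fewer false positives.
const SENSITIVE_PATTERNS: { kind: Finding["kind"]; source: string }[] = [
  { kind: "email", source: "[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}" },
  { kind: "aws-access-key-id", source: "\\bAKIA[0-9A-Z]{16}\\b" },
];

// Return every match of every pattern, grouped by pattern in list order.
function scanForSensitive(text: string): Finding[] {
  const findings: Finding[] = [];
  for (const pattern of SENSITIVE_PATTERNS) {
    // A fresh global regex per call avoids sharing lastIndex state between scans.
    const re = new RegExp(pattern.source, "g");
    let m: RegExpExecArray | null = re.exec(text);
    while (m !== null) {
      findings.push({ kind: pattern.kind, match: m[0], index: m.index });
      m = re.exec(text);
    }
  }
  return findings;
}
```

Pattern matching of this kind reports candidates, not confirmed leaks, which is one reason a validator might surface such hits as warnings for review.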

Integration Status

Logger: Not applicable - Core validation utility
Docs-suite: Ready - Comprehensive API documentation with examples
NeverHub: Not applicable - Core validation utility

Dependencies

@bernierllc/markdown-detector - Markdown flavor detection
@bernierllc/crypto-utils - Cryptographic utilities for secret detection

License

This file is licensed to the client under a limited-use license. The client may use and modify this code only within the scope of the project it was delivered for. Redistribution or use in other products or commercial offerings is not permitted without written consent from Bernier LLC.
