WordSensor v2.0.0 🚀
WordSensor is a powerful and flexible word filtering library for JavaScript/TypeScript. It helps you detect, replace, or remove forbidden words from text with advanced features like regex patterns, statistics, batch processing, and more.
✨ Features
- 🔍 Advanced Detection: Detect prohibited words with precise positioning
- 🚫 Multiple Filtering Modes: Replace, remove, or highlight forbidden words
- 🎭 Smart Masking: Full, partial, or smart masking options
- 📊 Statistics & Analytics: Track detections and get detailed insights
- 🔧 Regex Support: Use custom regex patterns for complex filtering
- 📦 Batch Processing: Process multiple texts efficiently
- 🎯 Preset Filters: Ready-to-use profanity, spam, and phishing filters
- 🔄 Custom Replacers: Create custom replacement functions
- 📈 Real-time Monitoring: Log and track all detections
- 🌐 API Integration: Load forbidden words from external APIs
- 📁 File Support: Import word lists from files
- ⚡ High Performance: Optimized for speed and memory efficiency
- 🎨 Emoji Replacers: Replace words with emojis
- 🔒 Word Boundaries: Configurable word boundary detection
- 📝 TypeScript Support: Full TypeScript definitions included
📦 Installation
npm install word-sensor
or
yarn add word-sensor
🚀 Quick Start
Basic Usage
import { WordSensor } from 'word-sensor';
// Create a sensor with forbidden words
const sensor = new WordSensor({
words: ['badword', 'offensive', 'rude'],
maskChar: '*',
caseInsensitive: true,
logDetections: true
});
// Filter text
const result = sensor.filter('This is a badword test.');
console.log(result); // "This is a ******* test."
Using Preset Filters
import { createProfanityFilter, createSpamFilter, createPhishingFilter } from 'word-sensor';
// Create specialized filters
const profanityFilter = createProfanityFilter();
const spamFilter = createSpamFilter();
const phishingFilter = createPhishingFilter();
// Use them
console.log(profanityFilter.filter('This is badword content.')); // "This is ******* content."
console.log(spamFilter.filter('Buy now! Free money!')); // "#### now! #### money!"
📚 API Reference
WordSensor Class
Constructor
new WordSensor(config?: WordSensorConfig)
Configuration Options:
- words?: string[] - Initial list of forbidden words
- maskChar?: string - Character used for masking (default: "*")
- caseInsensitive?: boolean - Case-insensitive matching (default: true)
- logDetections?: boolean - Enable detection logging (default: false)
- enableRegex?: boolean - Enable regex pattern support (default: false)
- wordBoundary?: boolean - Use word boundaries (default: true)
- customReplacer?: (word: string, context: string) => string - Custom replacement function
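To make the effect of caseInsensitive and wordBoundary concrete, here is a minimal standalone sketch of how such options could translate into a regular expression. The names `MatcherOptions`, `escapeForRegex`, and `buildMatcher` are hypothetical and are not part of word-sensor; the library's real matching logic may differ.

```typescript
// Hypothetical helper types and functions for illustration only;
// none of these names come from word-sensor itself.
interface MatcherOptions {
  caseInsensitive: boolean;
  wordBoundary: boolean;
}

// Escape characters that have special meaning inside a regular expression.
function escapeForRegex(word: string): string {
  return word.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build one global RegExp that matches any word in a non-empty list.
function buildMatcher(words: string[], options: MatcherOptions): RegExp {
  const alternatives = words.map(escapeForRegex).join("|");
  const body = options.wordBoundary
    ? `\\b(?:${alternatives})\\b`
    : `(?:${alternatives})`;
  const flags = options.caseInsensitive ? "gi" : "g";
  return new RegExp(body, flags);
}

const strict = buildMatcher(["bad"], { caseInsensitive: true, wordBoundary: true });
const loose = buildMatcher(["bad"], { caseInsensitive: true, wordBoundary: false });

console.log("A BAD badge".replace(strict, "***")); // "A *** badge"
console.log("A BAD badge".replace(loose, "***")); // "A *** ***ge"
```

With word boundaries on, "badge" is left alone; with them off, the "bad" inside it is masked as well, which is the usual trade-off between false negatives and false positives.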
Core Methods
filter(text: string, mode?: "replace" | "remove" | "highlight", maskType?: "full" | "partial" | "smart"): string
Filter text with specified mode and masking type.
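As a rough illustration of what the three modes do to a matched word, here is a standalone sketch. `applyMode` and `FilterMode` are hypothetical names written for this README, not the library's implementation; the sketch only reproduces the outputs documented for this method and does not cover the partial or smart mask types.

```typescript
type FilterMode = "replace" | "remove" | "highlight";

// Hypothetical stand-in for the library's internals: handles a single word,
// case-insensitively, with word boundaries.
function applyMode(
  text: string,
  word: string,
  mode: FilterMode,
  maskChar: string = "*"
): string {
  const escaped = word.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const pattern = new RegExp(`\\b${escaped}\\b`, "gi");
  return text.replace(pattern, (match: string): string => {
    switch (mode) {
      case "remove":
        return ""; // drop the word, leaving surrounding text untouched
      case "highlight":
        return `[FILTERED: ${match}]`;
      default:
        return maskChar.repeat(match.length); // full masking
    }
  });
}

console.log(applyMode("This is badword.", "badword", "replace")); // "This is *******."
console.log(applyMode("This is badword.", "badword", "remove")); // "This is ."
console.log(applyMode("This is badword.", "badword", "highlight")); // "This is [FILTERED: badword]."
```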
// Replace with full masking
sensor.filter('This is badword.'); // "This is *******."
// Remove forbidden words
sensor.filter('This is badword.', 'remove'); // "This is ."
// Highlight forbidden words
sensor.filter('This is badword.', 'highlight'); // "This is [FILTERED: badword]."
// Smart masking
sensor.filter('This is badword.', 'replace', 'smart'); // "This is b****d."
detect(text: string): string[]
Detect all forbidden words in text.
const detected = sensor.detect('This contains badword and offensive content.');
console.log(detected); // ["badword", "offensive"]
detectWithPositions(text: string): Array<{word: string, start: number, end: number}>
Detect forbidden words with their positions.
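Position reporting of this kind can be sketched with a global regular expression and `RegExp.prototype.exec`. The following is a standalone illustration under that assumption; `WordPosition` and `findPositions` are hypothetical names, and the library may compute positions differently. The `end` offset is treated as exclusive, which agrees with the documented values for this method.

```typescript
interface WordPosition {
  word: string;
  start: number;
  end: number; // exclusive, so text.slice(start, end) is the matched word
}

// Hypothetical helper: scan the text once and record every match with its offsets.
function findPositions(text: string, words: string[]): WordPosition[] {
  const escaped = words
    .filter((w) => w.length > 0) // empty words would produce zero-length matches
    .map((w) => w.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"));
  if (escaped.length === 0) return [];

  const pattern = new RegExp(`\\b(?:${escaped.join("|")})\\b`, "gi");
  const found: WordPosition[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(text)) !== null) {
    found.push({
      word: match[0],
      start: match.index,
      end: match.index + match[0].length,
    });
  }
  return found;
}

const positionsFound = findPositions("This badword is offensive.", ["badword", "offensive"]);
console.log(positionsFound);
// [ { word: "badword", start: 5, end: 12 }, { word: "offensive", start: 16, end: 25 } ]
```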
const positions = sensor.detectWithPositions('This badword is offensive.');
console.log(positions);
// [
// { word: "badword", start: 5, end: 12 },
// { word: "offensive", start: 16, end: 25 }
// ]
Word Management
// Add words
sensor.addWord('newbadword', '###'); // With custom mask
sensor.addWords(['word1', 'word2']);
// Remove words
sensor.removeWord('badword');
sensor.removeWords(['word1', 'word2']);
// Check words
sensor.hasWord('badword'); // true/false
sensor.getWords(); // Get all forbidden words
sensor.clearWords(); // Clear all words
Regex Patterns
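The patterns in this section are written as strings with doubled backslashes. Assuming the library passes such strings to the `RegExp` constructor (an assumption, not something this README states), the standalone sketch below shows why the doubling is needed: the string form and the regex literal form compile to the same pattern.

```typescript
// Inside a string literal, "\\b" is the two characters \b, which RegExp reads as a word boundary.
const cardFromString = new RegExp("\\b\\d{4}-\\d{4}-\\d{4}-\\d{4}\\b", "g");
const cardFromLiteral = /\b\d{4}-\d{4}-\d{4}-\d{4}\b/g;

console.log(cardFromString.source === cardFromLiteral.source); // true

const maskedCard = "Card 1234-5678-9012-3456 on file".replace(cardFromString, "[CARD]");
console.log(maskedCard); // "Card [CARD] on file"
```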
// Enable regex support
const regexSensor = new WordSensor({ enableRegex: true });
// Add regex patterns
regexSensor.addRegexPattern('\\b\\w+@\\w+\\.\\w+\\b', '[EMAIL]');
regexSensor.addRegexPattern('\\b\\d{4}-\\d{4}-\\d{4}-\\d{4}\\b', '[CARD]');
// Filter with regex
const result = regexSensor.filter('Contact me at user@example.com');
console.log(result); // "Contact me at [EMAIL]"
Statistics & Monitoring
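The statistics object shown in this section can be thought of as a tally over the detection log. Here is a minimal standalone sketch of that idea; `DetectionSummary` and `summarizeLog` are hypothetical names, the timestamp field is left out, and the library's own bookkeeping may differ.

```typescript
// Hypothetical shape, modeled on the fields shown in this section (without the timestamp).
interface DetectionSummary {
  totalDetections: number;
  uniqueWords: string[];
  detectionCounts: { [word: string]: number };
}

function summarizeLog(log: string[]): DetectionSummary {
  // Object.create(null) avoids collisions with inherited keys such as "constructor".
  const detectionCounts: { [word: string]: number } = Object.create(null);
  const uniqueWords: string[] = [];
  for (const word of log) {
    if (detectionCounts[word] === undefined) {
      detectionCounts[word] = 0;
      uniqueWords.push(word); // keeps first-seen order
    }
    detectionCounts[word] += 1;
  }
  return { totalDetections: log.length, uniqueWords, detectionCounts };
}

const summary = summarizeLog(["badword", "offensive", "badword", "badword", "offensive"]);
console.log(summary.totalDetections); // 5
console.log(summary.uniqueWords.join(", ")); // "badword, offensive"
console.log(summary.detectionCounts["badword"]); // 3
```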
// Get detection statistics
const stats = sensor.getStats();
console.log(stats);
// {
// totalDetections: 5,
// uniqueWords: ["badword", "offensive"],
// detectionCounts: { "badword": 3, "offensive": 2 },
// lastDetectionTime: Date
// }
// Get detection logs
const logs = sensor.getDetectionLogs();
console.log(logs); // ["badword", "offensive", "badword", ...]
// Reset statistics
sensor.resetStats();
Configuration Methods
// Update configuration
sensor.setMaskChar('#');
sensor.setCaseInsensitive(false);
sensor.setLogDetections(true);
sensor.setCustomReplacer((word) => `[${word.toUpperCase()}]`);
Utility Methods
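One plausible reading of the clean percentage is the share of words in the text that are not forbidden. The standalone sketch below works under that assumption and yields 50 for the four-word example in this section. `cleanPercentage` is a hypothetical function, its tokenizer is ASCII-only, and the library may count differently.

```typescript
// Hypothetical word-level ratio: (words not flagged / total words) * 100.
function cleanPercentage(text: string, forbidden: string[]): number {
  // Simple ASCII tokenizer; real text may need Unicode-aware splitting.
  const tokens = text.toLowerCase().match(/[a-z0-9']+/g) || [];
  if (tokens.length === 0) return 100; // nothing to flag in empty text

  const banned = forbidden.map((w) => w.toLowerCase());
  let flagged = 0;
  for (const token of tokens) {
    if (banned.indexOf(token) !== -1) flagged += 1;
  }
  return ((tokens.length - flagged) / tokens.length) * 100;
}

console.log(cleanPercentage("This badword is offensive.", ["badword", "offensive"])); // 50
console.log(cleanPercentage("This is clean text.", ["badword", "offensive"])); // 100
```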
// Check if text is clean
sensor.isClean('This is clean text.'); // true
sensor.isClean('This has badword.'); // false
// Get clean percentage
sensor.getCleanPercentage('This badword is offensive.'); // 50
// Sanitize text (quick filter)
sensor.sanitizeText('This is badword.'); // "This is *******."
Utility Functions
Preset Filters
import {
createProfanityFilter,
createSpamFilter,
createPhishingFilter,
PRESET_WORDS
} from 'word-sensor';
// Create specialized filters
const profanityFilter = createProfanityFilter('*');
const spamFilter = createSpamFilter('#');
const phishingFilter = createPhishingFilter('!');
// Access preset word lists
console.log(PRESET_WORDS.profanity);
console.log(PRESET_WORDS.spam);
console.log(PRESET_WORDS.phishing);
Batch Processing
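Conceptually, batch detection maps a detector over a list of texts and then aggregates the per-text results. The standalone sketch below illustrates that shape with a deliberately naive detector. `Detector`, `detectAll`, `totalsFor`, and `naiveDetect` are hypothetical names, and the average clean percentage reported by the library is omitted because this README does not say how it is computed.

```typescript
type Detector = (text: string) => string[];

interface BatchEntry {
  text: string;
  detected: string[];
}

interface BatchTotals {
  totalTexts: number;
  cleanTexts: number;
  dirtyTexts: number;
  totalDetections: number;
}

// Run the detector on every text, keeping each text next to its findings.
function detectAll(texts: string[], detect: Detector): BatchEntry[] {
  return texts.map((text) => ({ text, detected: detect(text) }));
}

// Aggregate per-text results into batch-level counts.
function totalsFor(entries: BatchEntry[]): BatchTotals {
  let dirtyTexts = 0;
  let totalDetections = 0;
  for (const entry of entries) {
    if (entry.detected.length > 0) dirtyTexts += 1;
    totalDetections += entry.detected.length;
  }
  return {
    totalTexts: entries.length,
    cleanTexts: entries.length - dirtyTexts,
    dirtyTexts,
    totalDetections,
  };
}

// Naive stand-in detector for the words "bad" and "offensive".
const naiveDetect: Detector = (text) =>
  text.toLowerCase().match(/\b(?:bad|offensive)\b/g) || [];

const batchTotals = totalsFor(
  detectAll(["This is bad.", "This is offensive.", "This is clean."], naiveDetect)
);
console.log(batchTotals);
// { totalTexts: 3, cleanTexts: 1, dirtyTexts: 2, totalDetections: 2 }
```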
import { batchFilter, batchDetect, getBatchStats } from 'word-sensor';
const texts = [
'This is bad.',
'This is offensive.',
'This is clean.'
];
// Batch filter
const filtered = batchFilter(texts, sensor);
console.log(filtered);
// ["This is ***.", "This is *********.", "This is clean."]
// Batch detect
const detected = batchDetect(texts, sensor);
console.log(detected);
// [
// { text: "This is bad.", detected: ["bad"] },
// { text: "This is offensive.", detected: ["offensive"] },
// { text: "This is clean.", detected: [] }
// ]
// Get batch statistics
const stats = getBatchStats(texts, sensor);
console.log(stats);
// {
// totalTexts: 3,
// cleanTexts: 1,
// dirtyTexts: 2,
// totalDetections: 2,
// averageCleanPercentage: 66.67
// }
Custom Replacers
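A custom replacer is a function of type (word: string, context: string) => string, as listed under the configuration options. As a hedged illustration of how a map-based replacer could be written by hand, here is a standalone sketch. `Replacer` and `makeMapReplacer` are hypothetical names, and the fallback mask is an assumption rather than documented library behavior.

```typescript
type Replacer = (word: string, context: string) => string;

// Hypothetical factory: look the word up in a table, fall back to a fixed mask.
function makeMapReplacer(
  replacements: { [word: string]: string },
  fallback: string = "***"
): Replacer {
  return (word: string, _context: string): string => {
    const key = word.toLowerCase();
    // hasOwnProperty guards against inherited keys such as "toString".
    return Object.prototype.hasOwnProperty.call(replacements, key)
      ? replacements[key]
      : fallback;
  };
}

const polite = makeMapReplacer({ bad: "good", rude: "polite" });
console.log(polite("BAD", "That was BAD.")); // "good"
console.log(polite("awful", "That was awful.")); // "***"
```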
import { createCustomReplacer, createEmojiReplacer } from 'word-sensor';
// Create custom replacer
const customReplacer = createCustomReplacer({
'bad': 'good',
'offensive': 'appropriate',
'rude': 'polite'
});
// Create emoji replacer
const emojiReplacer = createEmojiReplacer();
// Use with sensor
sensor.setCustomReplacer(customReplacer);
sensor.setCustomReplacer(emojiReplacer);
Regex Utilities
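For readers curious how utilities like these are commonly built, here is a minimal standalone sketch: validation by attempting to construct a `RegExp`, and escaping by prefixing special characters with a backslash. `isValidPattern` and `escapeSpecialChars` are hypothetical stand-ins, not the library's source.

```typescript
// A pattern is considered valid if the RegExp constructor accepts it.
function isValidPattern(pattern: string): boolean {
  try {
    new RegExp(pattern);
    return true;
  } catch (_error) {
    return false; // the constructor throws a SyntaxError on malformed patterns
  }
}

// Prefix every regex metacharacter with a backslash so it matches literally.
function escapeSpecialChars(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

console.log(isValidPattern("\\b\\w+\\b")); // true
console.log(isValidPattern("invalid[")); // false
console.log(escapeSpecialChars("test.com")); // test\.com
console.log(escapeSpecialChars("test*test")); // test\*test
```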
import { validateRegexPattern, escapeRegexSpecialChars } from 'word-sensor';
// Validate regex pattern
validateRegexPattern('\\b\\w+\\b'); // true
validateRegexPattern('invalid['); // false
// Escape special characters
escapeRegexSpecialChars('test.com'); // "test\\.com"
escapeRegexSpecialChars('test*test'); // "test\\*test"
API Integration
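The second argument in the API call that follows, 'data.words', looks like a dotted path into the JSON response. Assuming that reading (this README does not spell it out), the standalone sketch below shows how such a path could be resolved. `readPath`, `extractWords`, and `sampleResponse` are hypothetical and not part of word-sensor.

```typescript
// Walk a dotted path such as "data.words" through a parsed JSON value.
function readPath(payload: unknown, path: string): unknown {
  let current: unknown = payload;
  for (const key of path.split(".")) {
    if (typeof current !== "object" || current === null) return undefined;
    current = (current as { [k: string]: unknown })[key];
  }
  return current;
}

// Return the strings found at the path, or an empty list if the path does not hold an array.
function extractWords(payload: unknown, path: string): string[] {
  const value = readPath(payload, path);
  if (!Array.isArray(value)) return [];
  return value.filter((item): item is string => typeof item === "string");
}

// Hypothetical response body, for illustration only.
const sampleResponse = { data: { words: ["badword", "offensive", 42] } };

console.log(extractWords(sampleResponse, "data.words")); // ["badword", "offensive"]
console.log(extractWords(sampleResponse, "data.missing")); // []
```

Filtering out non-string entries keeps a malformed response from putting unexpected values into the word list.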
import { loadForbiddenWordsFromAPI, loadWordsFromFile } from 'word-sensor';
// Load from API
await loadForbiddenWordsFromAPI(
'https://api.example.com/forbidden-words',
'data.words',
sensor
);
// Load from file (browser)
const fileInput = document.getElementById('file') as HTMLInputElement;
const file = fileInput.files?.[0]; // files can be null, so use optional chaining
if (file) {
const words = await loadWordsFromFile(file);
sensor.addWords(words);
}
🎯 Advanced Examples
Content Moderation System
import { WordSensor, createProfanityFilter, createSpamFilter } from 'word-sensor';
class ContentModerator {
private profanityFilter: WordSensor;
private spamFilter: WordSensor;
private customFilter: WordSensor;
constructor() {
this.profanityFilter = createProfanityFilter();
this.spamFilter = createSpamFilter();
this.customFilter = new WordSensor({
enableRegex: true,
wordBoundary: false
});
// Add custom patterns
this.customFilter.addRegexPattern('\\b\\w+@\\w+\\.\\w+\\b', '[EMAIL]');
this.customFilter.addRegexPattern('\\b\\d{10,}\\b', '[PHONE]');
}
moderateContent(content: string): {
isClean: boolean;
filteredContent: string;
violations: string[];
stats: any;
} {
// Apply all filters
let filteredContent = content;
const violations: string[] = [];
// Check profanity
const profanityDetected = this.profanityFilter.detect(content);
if (profanityDetected.length > 0) {
violations.push('profanity');
filteredContent = this.profanityFilter.filter(filteredContent);
}
// Check spam
const spamDetected = this.spamFilter.detect(content);
if (spamDetected.length > 0) {
violations.push('spam');
filteredContent = this.spamFilter.filter(filteredContent);
}
// Apply custom filters
filteredContent = this.customFilter.filter(filteredContent);
return {
isClean: violations.length === 0,
filteredContent,
violations,
stats: {
profanity: this.profanityFilter.getStats(),
spam: this.spamFilter.getStats(),
custom: this.customFilter.getStats()
}
};
}
}
// Usage
const moderator = new ContentModerator();
const result = moderator.moderateContent('This is badword spam content with user@example.com');
console.log(result);
Real-time Chat Filter
import { WordSensor, createEmojiReplacer } from 'word-sensor';
class ChatFilter {
private sensor: WordSensor;
private messageHistory: string[] = [];
constructor() {
this.sensor = new WordSensor({
words: ['badword', 'offensive'],
logDetections: true,
customReplacer: createEmojiReplacer()
});
}
processMessage(message: string, userId: string): {
filteredMessage: string;
isClean: boolean;
warning: string | null;
} {
const filteredMessage = this.sensor.filter(message);
const isClean = this.sensor.isClean(message);
// Check user history
const userViolations = this.messageHistory.filter(msg =>
msg.includes(userId) && !this.sensor.isClean(msg)
).length;
let warning = null;
if (!isClean) {
if (userViolations >= 3) {
warning = 'You have been warned multiple times. Further violations may result in a ban.';
} else {
warning = 'Please keep the chat appropriate.';
}
}
// Log message
this.messageHistory.push(`${userId}: ${message}`);
return { filteredMessage, isClean, warning };
}
getModerationStats() {
return this.sensor.getStats();
}
}
Batch Content Analysis
import { WordSensor, batchDetect, getBatchStats } from 'word-sensor';
class ContentAnalyzer {
private sensor: WordSensor;
constructor() {
this.sensor = new WordSensor({
words: ['inappropriate', 'spam', 'offensive'],
logDetections: true
});
}
analyzeBatch(contentList: string[]): {
summary: any;
details: Array<{
content: string;
isClean: boolean;
detectedWords: string[];
cleanPercentage: number;
}>;
} {
const batchResults = batchDetect(contentList, this.sensor);
const batchStats = getBatchStats(contentList, this.sensor);
const details = contentList.map((content, index) => ({
content,
isClean: batchResults[index].detected.length === 0,
detectedWords: batchResults[index].detected,
cleanPercentage: this.sensor.getCleanPercentage(content)
}));
return {
summary: {
...batchStats,
sensorStats: this.sensor.getStats()
},
details
};
}
}
🧪 Testing
# Run tests
npm test
# Run tests in watch mode
npm run test:watch
# Run tests with coverage
npm run test:coverage
📦 Build
# Build for production
npm run build
# Build in watch mode
npm run dev
# Clean build artifacts
npm run clean
🤝 Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
- Fork the repository
- Create your feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add some amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
📄 License
This project is licensed under the MIT License - see the LICENSE file for details.
👨‍💻 Author
Developed by Asrul Harahap.
🙏 Acknowledgments
- Thanks to all contributors who helped improve this library
- Inspired by the need for better content moderation tools
- Built with TypeScript for better developer experience
📈 Changelog
v2.0.0
- ✨ Major Release: Complete rewrite with advanced features
- 🔧 New Constructor: Config-based initialization
- 📊 Statistics: Comprehensive detection tracking
- 🔍 Regex Support: Custom regex pattern filtering
- 📦 Batch Processing: Efficient multi-text processing
- 🎯 Preset Filters: Ready-to-use specialized filters
- 🎨 Custom Replacers: Flexible replacement functions
- 📈 Position Detection: Get exact word positions
- 🔄 Smart Masking: Intelligent masking algorithms
- 🌐 API Integration: External word list loading
- 📁 File Support: Import word lists from files
- 🎨 Emoji Replacers: Fun emoji-based replacements
- 📝 Enhanced Types: Better TypeScript support
- 🧪 Comprehensive Tests: 36 test cases covering all features
v1.0.5
- 🐛 Bug fixes and improvements
- 📝 Better documentation
⭐ Star this repository if you find it useful!
