@muppit/shield-engine
v0.1.5
Deterministic PII detection and redaction for transcripts and text files. Lightweight NLP verification, no cloud services, no network access.
Deterministic PII detection and redaction for transcripts and text files. No AI, no network access — everything runs locally.
Name detection uses a 50K+ name dictionary with local grammar-based verification (compromise.js) for context-aware precision. No cloud services, no machine learning models.
Install
npm install @muppit/shield-engine
Supported Formats
- VTT: WebVTT with speaker labels and voice tags
- SRT: SubRip with speaker labels
- JSON: Transcript array with speaker, text, start, end
- TXT: Plain text, optionally with a `Speaker:` prefix
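To illustrate the TXT format's optional `Speaker:` prefix, here is a minimal sketch of splitting a line into a speaker label and its text. It is not the package's parser: the `Segment` shape, the `parseTxtLine` helper, and the rule for what counts as a label are all assumptions made for this example.

```typescript
// Hypothetical segment shape, loosely based on the JSON format fields above.
// The field types (string speaker, numeric start/end) are assumptions.
interface Segment {
  speaker?: string;
  text: string;
  start?: number;
  end?: number;
}

// Split a plain-text line into an optional speaker label and the spoken text.
// Assumption: a label starts with a letter, is at most 40 characters of
// letters, spaces, dots, apostrophes or hyphens, and ends at the first colon.
function parseTxtLine(line: string): Segment {
  const match = /^([A-Za-z][A-Za-z .'-]{0,39}):\s+(.*)$/.exec(line);
  if (match) {
    return { speaker: match[1].trim(), text: match[2] };
  }
  // No recognisable label: keep the whole line as text.
  return { text: line };
}

const segments: Segment[] = [
  "Sarah: How did the week go?",
  "No label on this line.",
].map(parseTxtLine);
```

A rule this simple also treats a line such as `Note: ...` as a speaker label, so a real parser would likely need stricter checks.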
Quick Start
import { analyseFile, redact } from '@muppit/shield-engine';
const result = await analyseFile(vttContent, 'session.vtt');
// result.candidates — detected PII
// result.suggestedNameMappings — { "Sarah": "Coach", "John": "Client" }
const redacted = redact(result, {
  nameMappings: [{ type: 'name', original: 'Sarah', replacement: 'Coach' }],
  structuredMode: 'label',
  preserveTimestamps: true,
});
// redacted.content: cleaned VTT with PII replaced
Detection
| PII Type | Method |
|---|---|
| Name | Dictionary + grammar verification + speaker labels |
| Email | Regex |
| Phone | libphonenumber-js (international) |
| URL | Regex |
| Credit card | Regex + Luhn validation |
| SSN | openredaction (government patterns) |
| TFN | openredaction (Australian Tax File Number) |
| National ID | openredaction (passports, NI numbers) |
| Organisation | Custom names (user-supplied) |
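The "Regex + Luhn validation" approach listed for credit cards can be sketched as below. This is an independent illustration of the general technique, not the package's code: the helper names `passesLuhn` and `findCardNumbers`, the 13 to 19 digit range, and the candidate regex are assumptions.

```typescript
// Luhn checksum: walking from the rightmost digit, double every second
// digit, subtract 9 from any result above 9, and require the total to be
// divisible by 10.
function passesLuhn(candidate: string): boolean {
  const digits = candidate.replace(/[\s-]/g, "");
  // Assumption: card numbers are 13 to 19 digits long.
  if (!/^\d{13,19}$/.test(digits)) return false;
  let sum = 0;
  let doubleNext = false;
  for (let i = digits.length - 1; i >= 0; i--) {
    let d = digits.charCodeAt(i) - 48; // '0' is char code 48
    if (doubleNext) {
      d *= 2;
      if (d > 9) d -= 9;
    }
    sum += d;
    doubleNext = !doubleNext;
  }
  return sum % 10 === 0;
}

// Find digit runs (optionally separated by single spaces or hyphens) that
// look like card numbers, then keep only those that pass the Luhn check.
function findCardNumbers(text: string): string[] {
  const pattern = /\b\d(?:[ -]?\d){12,18}\b/g;
  return (text.match(pattern) ?? []).filter((c) => passesLuhn(c));
}

const cards = findCardNumbers(
  "Card 4111 1111 1111 1111, ref 1234 5678 9012 3456"
);
```

The checksum step is what keeps the regex from flagging every long digit run: here the reference number matches the pattern but fails Luhn, so only the test card number is kept.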
Custom Names
Pass known participant names for high-confidence detection:
import { analyseFile } from '@muppit/shield-engine';
const result = await analyseFile(content, 'session.vtt', {
  customNames: ['Sarah', 'Tom'],
  customOrganisations: ['Acme Corp'],
});
Custom names always match with high confidence, bypassing dictionary and grammar checks.
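As a rough picture of what matching user-supplied terms directly could look like, the sketch below scans text for whole-word occurrences and tags each hit as high confidence. The `Candidate` shape, the `matchCustomTerms` helper, and the case-sensitive whole-word rule are hypothetical and may differ from what the package does internally.

```typescript
// Hypothetical candidate shape for this sketch only.
interface Candidate {
  type: "name" | "organisation";
  text: string;
  index: number;
  confidence: "high";
}

// Escape regex metacharacters so a term is matched literally.
function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Find every whole-word, case-sensitive occurrence of each term.
// Every hit is marked high confidence, with no dictionary or grammar check.
function matchCustomTerms(
  text: string,
  terms: string[],
  type: Candidate["type"]
): Candidate[] {
  const found: Candidate[] = [];
  for (const term of terms) {
    if (term.trim() === "") continue; // an empty pattern would never advance
    const pattern = new RegExp(`\\b${escapeRegExp(term)}\\b`, "g");
    let m: RegExpExecArray | null;
    while ((m = pattern.exec(text)) !== null) {
      found.push({ type, text: m[0], index: m.index, confidence: "high" });
    }
  }
  return found.sort((a, b) => a.index - b.index);
}

const sample = "Sarah met Tom at Acme Corp.";
const nameHits = matchCustomTerms(sample, ["Sarah", "Tom"], "name");
const orgHits = matchCustomTerms(sample, ["Acme Corp"], "organisation");
```

Because the pattern relies on `\b`, a term that begins or ends with punctuation would need different boundary handling.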
CLI
npx shield-engine session.vtt
npx shield-engine session.vtt --json
API
| Function | Async | Description |
|---|---|---|
| parseFile(content, filename) | No | Parse transcript into segments |
| detect(document, options?) | Yes | Detect PII in a parsed document |
| analyseFile(content, filename, options?) | Yes | Parse + detect in one call |
| detectInSegments(segments, options?) | Yes | Detect PII in raw segments |
| redact(result, config) | No | Apply redaction to a detection result |
License
MIT
