datanonym
v1.9.0
Lightweight JavaScript library for anonymizing datasets for testing and development. Zero dependencies. Supports string, number, and datetime anonymization with deterministic seeding.
data-anonymizer (v2.0)
🔐 Zero-dependency lightweight JavaScript library for anonymizing datasets for testing, development, and privacy compliance. Supports string, number, and datetime anonymization with deterministic seeding.
Key Features
✨ Zero External Dependencies - Pure Node.js implementation
🔐 Deterministic Seeding - Same seed = same output
📊 Multiple Formats - JSON, CSV, JSONL
🎯 Flexible Rules - Declarative anonymization rules
💾 Mapping Persistence - Ensure consistent anonymization across batches
⚡ Stream Processing - Handle large files efficiently
🛠️ CLI Support - Command-line interface included
Installation
npm install datanonym
Quick Start
Basic Usage
const { anonymize } = require('datanonym');
const data = [
  { id: 1, name: 'Alice Johnson', email: 'alice@example.com', age: 32 },
  { id: 2, name: 'Bob Smith', email: 'bob@example.com', age: 28 }
];
const rules = {
  id: { type: 'number', min: 1000, max: 9999 },
  name: { type: 'string', method: 'name' },
  email: { type: 'string', method: 'email', preserveDomain: true },
  age: { type: 'number', jitter: 0.1, integer: true }
};
// Deterministic output with seed
const anonymized = anonymize(data, rules, { seed: 'my-seed' });
console.log(JSON.stringify(anonymized, null, 2));
CLI
The command-line interface makes anonymization easy for files:
Usage:
node src/cli.js <data.json> <rules.json> [out.json] [--seed <seed>]
# or, after installing globally/binary
data-anonymizer data.json rules.json out.json --seed "seed123"
Example:
node src/cli.js ./data.json ./rules.json out.json --seed "stable-1"
--seed or -s: optional global seed for deterministic output.
Rules schema (fields)
Each field rule is an object with a type and type-specific options. If a field has no rule, its original value is left unchanged.
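The pass-through behaviour described above can be sketched as follows. This is a minimal illustration, not the library's source: applyRules and the handlers map are hypothetical names, and the string handler is a stand-in that simply masks every character.

```javascript
// Hypothetical sketch: NOT datanonym's actual implementation.
// Applies a handler to fields that have a rule and copies all other fields through.
function applyRules(record, rules, handlers) {
  const out = {};
  for (const [field, value] of Object.entries(record)) {
    const rule = rules[field];
    // No rule for this field: the original value is left unchanged.
    if (!rule) {
      out[field] = value;
      continue;
    }
    const handler = handlers[rule.type];
    if (!handler) {
      throw new Error(`Unknown rule type: ${rule.type}`);
    }
    out[field] = handler(value, rule);
  }
  return out;
}

// Stand-in handler for illustration only: replaces every character with maskChar.
const handlers = {
  string: (value, rule) => (rule.maskChar || '*').repeat(String(value).length),
};

const record = { id: 7, name: 'Alice' };
const fieldRules = { name: { type: 'string', maskChar: '#' } };
console.log(applyRules(record, fieldRules, handlers)); // { id: 7, name: '#####' }
```

Here id has no rule, so it comes back untouched, while name is rewritten by the handler selected through rule.type.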
Common rule root:
- type: 'string' | 'number' | 'datetime' (or 'date')
- seed: optional per-field seed (string or number). When present, the anonymizer temporarily uses this seed for that field, which is useful for reproducible per-field results.
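One way a per-field seed could temporarily replace the active generator is a stack of seeded PRNGs. The sketch below is an assumption about how such a mechanism might look, not the package's code: the names pushSeed and popSeed match helpers this README lists for random.js, but their bodies here, the FNV-1a-style hash, the mulberry32 generator, and withFieldSeed are all illustrative choices.

```javascript
// Hypothetical sketch of a seedable RNG with a push/pop stack; not the library's source.

// Hash a string or number seed to a 32-bit unsigned integer (FNV-1a style).
function hashSeed(seed) {
  let h = 0x811c9dc5;
  for (const ch of String(seed)) {
    h ^= ch.codePointAt(0);
    h = Math.imul(h, 0x01000193);
  }
  return h >>> 0;
}

// mulberry32: a small 32-bit PRNG that returns floats in [0, 1).
function mulberry32(a) {
  return function next() {
    a = (a + 0x6d2b79f5) | 0;
    let t = Math.imul(a ^ (a >>> 15), a | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// The generator on top of the stack is the active one.
const stack = [mulberry32(hashSeed('global-seed'))];
const random = () => stack[stack.length - 1]();
const pushSeed = (seed) => stack.push(mulberry32(hashSeed(seed)));
const popSeed = () => {
  if (stack.length > 1) stack.pop();
};

// Run fn under a temporary per-field seed, then restore the outer generator.
function withFieldSeed(seed, fn) {
  pushSeed(seed);
  try {
    return fn();
  } finally {
    popSeed();
  }
}

const a = withFieldSeed('name-seed', () => [random(), random()]);
random(); // advances only the global stream
const b = withFieldSeed('name-seed', () => [random(), random()]);
console.log(a[0] === b[0] && a[1] === b[1]); // true
```

Because the field generator is rebuilt from its seed each time it is pushed, draws made under the same field seed do not depend on how far the global stream has advanced.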
String options (rule.type = 'string'):
- method (default: 'mask'): 'mask' | 'name' | 'email' | 'pattern' | 'keep-length'
- mask:
  - maskChar (default '*')
  - keepStart (count of leading chars to keep)
  - keepEnd (count of trailing chars to keep)
- name: generate a random first + last name
- email:
  - preserveDomain (boolean). If true, the domain of the original is kept and the local part is replaced.
- pattern:
  - pattern: string that uses custom pattern tokens:
    - A = random letter
    - # = random digit
    - * = random alphanumeric
    - other characters are preserved
- keep-length: keep length and replace letters/digits with same class
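To make the mask and pattern options concrete, here is a rough sketch of how they could behave. maskString and fromPattern are hypothetical helpers written for this illustration, not the package's functions. Two details are assumptions: letters are drawn from uppercase A to Z, and a value is returned unmasked when keepStart + keepEnd covers its whole length.

```javascript
// Hypothetical helpers illustrating the documented options; not the library's source.
const LETTERS = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'; // assumption: uppercase letters only
const DIGITS = '0123456789';

// mask: keep `keepStart` leading and `keepEnd` trailing characters, mask the rest.
function maskString(value, { maskChar = '*', keepStart = 0, keepEnd = 0 } = {}) {
  const s = String(value);
  // Assumption: if the kept portions cover the whole string, nothing is masked.
  if (keepStart + keepEnd >= s.length) return s;
  const middle = maskChar.repeat(s.length - keepStart - keepEnd);
  return s.slice(0, keepStart) + middle + s.slice(s.length - keepEnd);
}

// pattern: A = random letter, # = random digit, * = random alphanumeric.
// `rng` is any function returning a float in [0, 1); it defaults to Math.random.
function fromPattern(pattern, rng = Math.random) {
  const pick = (chars) => chars[Math.floor(rng() * chars.length)];
  let out = '';
  for (const ch of pattern) {
    if (ch === 'A') out += pick(LETTERS);
    else if (ch === '#') out += pick(DIGITS);
    else if (ch === '*') out += pick(LETTERS + DIGITS);
    else out += ch; // any other character is preserved
  }
  return out;
}

console.log(maskString('alice', { keepStart: 1, keepEnd: 1 })); // a***e
console.log(fromPattern('AA-###')); // two letters, a dash, three digits
```

Passing a seeded generator as rng instead of Math.random would make fromPattern reproducible.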
Examples:
{ "username": { "type": "string", "method": "mask", "keepStart": 1, "keepEnd": 1 } }
{ "email": { "type": "string", "method": "email", "preserveDomain": true } }
{ "code": { "type": "string", "method": "pattern", "pattern": "AA-###" } }
Number options (rule.type = 'number'):
- min, max: sample or clamp to range
- integer: boolean (round result)
- jitter: numeric; if |jitter| < 1 it is treated as a fraction of the value (e.g. 0.1); otherwise it is an absolute amount added or subtracted
Example:
{ "salary": { "type": "number", "min": 30000, "max": 90000, "integer": true } }
{ "age": { "type": "number", "jitter": 0.15, "integer": true } }
Datetime options (rule.type = 'datetime' or 'date'):
- shiftDays: integer days to add (negative to subtract)
- jitterDays: integer days to jitter randomly within ±value
- randomBetween: { "start": "YYYY-MM-DD", "end": "YYYY-MM-DD" }, pick a random date in range
- format: if true, the result is returned as an ISO string; otherwise a Date object is returned when used programmatically
Example:
{ "dob": { "type": "datetime", "shiftDays": -3650, "format": true } }
{ "createdAt": { "type": "datetime", "randomBetween": { "start": "2020-01-01", "end": "2023-01-01" } } }
Deterministic seeding:
- Global: call anonymize(data, rules, { seed: 'global-seed' }) or use CLI --seed.
- Per-field: add "seed": "field-seed" to a rule. The per-field seed temporarily overrides the RNG while that field is anonymized (push/pop), so the same field with the same rule and seed always produces the same anonymized values.
Examples:
{
  "name": { "type": "string", "method": "name", "seed": "name-seed" },
  "email": { "type": "string", "method": "email" }
}
API reference
Exports from the package:
- anonymize(input, rules, opts):
  - input: object or array of objects (dataset)
  - rules: object keyed by field name (see rules schema)
  - opts:
    - seed: optional global seed (string or number)
  - Returns: anonymized object or array
- Also exports string, number, datetime anonymizer modules for direct usage/extension: require('data-anonymizer').string.anonymizeString(...) etc.
Folder structure and file explanations
Top-level:
- package.json: package metadata, entry points
- README.md: this documentation
- src/: source code
  - index.js: main engine and API; handles global/per-field seeding and traversal
  - cli.js: minimal CLI wrapper (read JSON, apply rules, optional seed flag)
  - anonymizers/
    - string.js: string anonymizers (mask, name, email, pattern, keep-length)
    - number.js: number anonymizers (range, jitter, integer)
    - datetime.js: date/time anonymizers (shift, jitter, randomBetween)
  - utils/
    - pattern.js: helpers to generate strings from patterns and keep-length replacement
    - random.js: RNG utilities; includes deterministic PRNG (seed, pushSeed, popSeed) and helpers like randomName, randomEmail
  - example/example.js: runnable example demonstrating typical usage
Examples
Programmatic with global seed:
const { anonymize } = require('datanonym');
const out = anonymize(data, rules, { seed: 'global-123' });
Per-field deterministic:
{ "email": { "type": "string", "method": "email", "seed": "email-1" } }
CLI:
node src/cli.js data.json rules.json --seed "stable"
Testing & local development
- Use node src/example/example.js to try the example dataset.
- For deterministic behavior, verify that the output is stable when the same --seed or opts.seed is used.
- Consider adding unit tests (Jest/Mocha) for CI.
License
MIT (include a LICENSE file in the repository).
