redact-phi
v1.0.3
Assists in the redaction of PHI in structured or semi-structured data, supporting Excel (XLSX), CSV, or JSON.
redact-PHI
A command-line utility to remove, redact and fabricate PHI, PII or protected data (CSV, JSON, XLSX).
Online Demo
Local Installation
Install globally using npm i -g redact-phi
Usage

redact [options] <infile> [outfile]
Options:
  -V, --version            output the version number
  --delimiter <delimiter>  sets delimiter type (default: ",")
  --skipRows <n>           number of rows to skip (pass through unchanged) before processing (default: "0")
  -y                       Skips all prompts
  --strategy <override>    Provide the location of your redaction specification
  -h, --help               display help for command
Supported File Formats
- .csv - Comma-separated values (or custom delimiter via --delimiter)
- .txt - Text files with delimited data (use --delimiter to specify)
Skipping Header / Metadata Rows
Some files include metadata rows before the actual data headers. Use --skipRows to pass these rows through unchanged:
redact --skipRows 2 data.csv
The first 2 rows will be written to the output file as-is, and redaction begins from row 3.
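The pass-through behaviour can be pictured with a small sketch. This is not redact-phi's code: processLines, the sample rows, and the stand-in redactLine callback are all hypothetical, and how the tool treats the header row after the skipped rows is not shown here.

```javascript
// Hypothetical helper, for illustration only (not part of redact-phi).
// Lines with an index below `skipRows` are returned unchanged;
// every later line is handed to `redactLine`.
function processLines(lines, skipRows, redactLine) {
  return lines.map((line, index) =>
    index < skipRows ? line : redactLine(line)
  );
}

// Made-up sample input: two metadata rows followed by data.
const input = [
  "Export of orders",
  "Generated by billing",
  "id,name",
  "1,Example",
];

// Stand-in for real redaction: it only marks the lines that were processed.
const output = processLines(input, 2, (line) => `[processed] ${line}`);
console.log(output.join("\n"));
```

Only the last two lines carry the [processed] marker, which mirrors --skipRows 2.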
Custom Strategy Location
To use a strategy file with a name different from the input file, use --strategy:
redact --strategy my-strategy.json data.csv
The custom .js file (if any) is resolved relative to the strategy JSON, not the input file. For example, --strategy my-strategy.json will look for my-strategy.js alongside it.
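As a rough sketch of that lookup rule, the .js path can be derived from the strategy path by swapping the extension. The helper name customRedactorPath is hypothetical and the tool's real resolution logic may differ; path.posix is used only so the result looks the same on every platform.

```javascript
const path = require("path");

// Hypothetical helper (not redact-phi's API): given the path to a strategy
// JSON file, return the path of the .js file expected to sit alongside it.
function customRedactorPath(strategyPath) {
  const { dir, name } = path.posix.parse(strategyPath);
  return path.posix.join(dir, `${name}.js`);
}

console.log(customRedactorPath("my-strategy.json"));        // my-strategy.js
console.log(customRedactorPath("config/my-strategy.json")); // config/my-strategy.js
```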
Unused Column Warnings
After processing, redact-phi checks whether all columns defined in the strategy were matched to columns in the input file. If any strategy columns were not found, a warning is printed:
WARNING! Strategy columns unused: Full Name, SSN
The input file may not match the strategy, or column headers may differ.
This helps catch typos in column names or mismatched strategy files.
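A check of this kind could look roughly like the sketch below. The function unusedStrategyColumns is hypothetical, it only considers columns identified by columnKey, and it assumes an exact, case-sensitive header match. Whether redact-phi compares headers this way is not stated here.

```javascript
// Hypothetical helper (not redact-phi's API): list the columnKey values
// from the strategy that do not appear among the input file's headers.
// Columns identified only by columnNum (empty columnKey) are ignored here.
function unusedStrategyColumns(strategyColumns, headers) {
  const present = new Set(headers);
  return strategyColumns
    .filter((col) => col.columnKey !== "" && !present.has(col.columnKey))
    .map((col) => col.columnKey);
}

// Headers taken from example/example.csv.
const headers = [
  "id", "email", "ssn", "first name", "last name",
  "full address", "order date", "order amount",
];

// Illustrative strategy: two keys do not match any header exactly.
const strategy = {
  columns: [
    { redactWith: "name.findName", columnNum: "", columnKey: "Full Name", tracked: false },
    { redactWith: "constant", columnNum: "", columnKey: "SSN", tracked: false },
    { redactWith: "internet.email", columnNum: "", columnKey: "email", tracked: false },
  ],
};

const unused = unusedStrategyColumns(strategy.columns, headers);
if (unused.length > 0) {
  console.warn(`WARNING! Strategy columns unused: ${unused.join(", ")}`);
}
```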
Redaction Example
redact-PHI includes an example (./example/) :
- example.csv - CSV source file (data to be redacted)
- example.json - JSON redaction strategy (defines how columns will be redacted with: faker, a custom redactor or a constant)
- example.js - JavaScript custom redactors (this file is optional)
By default, the .csv, .json and .js files must all be named the same (e.g. data.csv, data.json, data.js). Use --strategy to point at a strategy file with a different name — see Custom Strategy Location.
example/example.csv :
id,email,ssn,first name,last name,full address,order date,order amount
783,[email protected],706-25-4558,Caroline,Goodwin,82703 Yasmeen Corner Apt. 379,"10/9/2020, 4:48:52 AM",252.94
784,[email protected],282-57-6226,Izabella,Bosco,2930 Alisa Heights Apt. 572,"3/27/2021, 11:51:26 PM",689.40
784,[email protected],282-57-6226,Izabella,Bosco,2930 Alisa Heights Apt. 572,"6/17/2021, 9:44:54 PM",446.29
786,[email protected],677-86-2303,Celine,Dicki,68305 Labadie Shoal Suite 608,"4/23/2021, 6:49:34 PM",889.37
...
example/example.json :
{
  "columns": [
    {
      "redactWith": "random.number",
      "columnNum": "",
      "columnKey": "order id",
      "tracked": false
    },
    {
      "redactWith": "customGenerateId",
      "columnNum": 0,
      "columnKey": "",
      "tracked": true
    },
    ...
The columns array contains objects with properties that describe how each column should be redacted:
- columnNum - (integer) Identifies the column to be redacted by its zero-based index. (columnNum or columnKey must be set)
- columnKey - (string) Identifies the column to be redacted by its header in the file. (columnNum or columnKey must be set)
- redactWith - (string) The strategy used to redact a value. There are four options:
  - faker function - Available functions: name.firstName or internet.email
  - faker template - Faker methods in a mustache template: {{address.streetAddress(true)}} or {{address.city}}, {{address.state}} {{address.zip}}
  - custom JavaScript - For more complicated redacted values you can call a JavaScript function defined in your .js file (see example.js)
  - constant value - If you wish to replace every value in this column with a constant, e.g. John Doe
- tracked - Tracking preserves the relationships within your data while de-identifying. With tracking enabled, the redaction engine remembers a column's original value and reuses the same redacted value whenever that original is encountered again. To see this in action, run redact ./example/example.csv and note how columns with tracking enabled (id, ssn, email) receive the same redacted value for repeated originals in example_redacted.csv.
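The idea behind tracked can be sketched with a lookup table keyed by the original value. This is a minimal illustration under assumptions, not the redaction engine itself: makeTrackedRedactor and the counter-based generator are hypothetical names.

```javascript
// Hypothetical helper (not redact-phi's API): wrap a value generator so
// that a repeated original value always maps to the same redacted value.
function makeTrackedRedactor(generate) {
  const seen = new Map(); // original value -> redacted value
  return (original) => {
    if (!seen.has(original)) {
      seen.set(original, generate(original));
    }
    return seen.get(original);
  };
}

// Illustrative generator: hands out sequential ids starting at 1000.
let nextId = 1000;
const redactId = makeTrackedRedactor(() => nextId++);

// The id column of example/example.csv contains 784 twice.
const ids = ["783", "784", "784", "786"].map(redactId);
console.log(ids); // [ 1000, 1001, 1001, 1002 ]
```

Both rows with the original id 784 end up with the same replacement, so rows that belonged together before redaction still belong together afterwards.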
example/example.js :
module.exports = {
  // currentId is not declared in this excerpt; it is assumed to be a counter defined elsewhere in example.js
  customGenerateId: () => {
    return ++currentId;
  },
  ...
Custom redactors provide more control over your data. A custom redactor is referenced from your .json file using the redactWith property. For example: "redactWith": "customGenerateId"
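For reference, a self-contained custom redactor file might look like the sketch below. The starting value of currentId and the name-based lookup at the end are assumptions made for illustration. How redact-phi actually loads and calls these functions, and which arguments it passes, is not documented here.

```javascript
// Module-level counter shared by every call. Starting at 0 is an assumption.
let currentId = 0;

const redactors = {
  // Returns a new sequential id on every call.
  customGenerateId: () => {
    return ++currentId;
  },
};

module.exports = redactors;

// Illustration only: look a redactor up by the name given in "redactWith".
const column = { redactWith: "customGenerateId", columnNum: 0, columnKey: "", tracked: true };
const first = redactors[column.redactWith]();
const second = redactors[column.redactWith]();
console.log(first, second); // 1 2
```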
De-identification Examples
These JSON examples will remove the following PII:
- First Name: "redactWith": "name.firstName"
- Last Name: "redactWith": "name.lastName"
- Full Name: "redactWith": "name.findName"
- Social Security number: "redactWith": "{{datatype.number({\"min\":100,\"max\":999})}}-{{datatype.number({\"min\":10,\"max\":99})}}-{{datatype.number({\"min\":1000,\"max\":9999})}}"
- Email Address: "redactWith": "internet.email"
- Phone / Fax number: "redactWith": "phone.phoneNumber"
- Street Address: "redactWith": "address.streetAddress"
- City: "redactWith": "address.city"
- Zip Code: "redactWith": "address.zipCode"
- City, State, Zip: "redactWith": "{{address.city()}}, {{address.stateAbbr()}} {{address.zipCode()}}"
- Full Address: "redactWith": "{{address.streetAddress(true)}}"
- IP: "redactWith": "internet.ip"
- IP v6: "redactWith": "internet.ipv6"
Full De-identification example.json:
{
  "columns": [
    {
      "redactWith": "name.firstName",
      "columnNum": "",
      "columnKey": "first name",
      "tracked": false
    },
    {
      "redactWith": "name.lastName",
      "columnNum": "",
      "columnKey": "last name",
      "tracked": false
    },
    {
      "redactWith": "name.findName",
      "columnNum": "",
      "columnKey": "full name",
      "tracked": false
    },
    {
      "redactWith": "{{datatype.number({\"min\":100,\"max\":999})}}-{{datatype.number({\"min\":10,\"max\":99})}}-{{datatype.number({\"min\":1000,\"max\":9999})}}",
      "columnNum": "",
      "columnKey": "social security number",
      "tracked": false
    },
    {
      "redactWith": "internet.email",
      "columnNum": "",
      "columnKey": "email",
      "tracked": false
    },
    {
      "redactWith": "phone.phoneNumber",
      "columnNum": "",
      "columnKey": "phone",
      "tracked": false
    },
    {
      "redactWith": "address.city",
      "columnNum": "",
      "columnKey": "city",
      "tracked": false
    },
    {
      "redactWith": "address.zipCode",
      "columnNum": "",
      "columnKey": "zip",
      "tracked": false
    },
    {
      "redactWith": "address.streetAddress",
      "columnNum": "",
      "columnKey": "street address",
      "tracked": false
    },
    {
      "redactWith": "{{address.city()}}, {{address.stateAbbr()}} {{address.zipCode()}}",
      "columnNum": "",
      "columnKey": "city state zip",
      "tracked": false
    },
    {
      "redactWith": "{{address.streetAddress(true)}}",
      "columnNum": "",
      "columnKey": "full address",
      "tracked": false
    },
    {
      "redactWith": "internet.ip",
      "columnNum": "",
      "columnKey": "ip",
      "tracked": false
    },
    {
      "redactWith": "internet.ipv6",
      "columnNum": "",
      "columnKey": "ipv6",
      "tracked": false
    }
  ]
}
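To make the shape of the Social Security number template easier to read, here is a plain JavaScript sketch that produces values in the same ranges (100-999, 10-99, 1000-9999). It does not use Faker and is not how redact-phi evaluates templates. The names randomInt and fakeSsn are hypothetical.

```javascript
// Returns a random integer between min and max, both inclusive.
function randomInt(min, max) {
  return min + Math.floor(Math.random() * (max - min + 1));
}

// Hypothetical helper: builds a value shaped like the SSN template,
// three digits, two digits, then four digits, separated by hyphens.
function fakeSsn() {
  return `${randomInt(100, 999)}-${randomInt(10, 99)}-${randomInt(1000, 9999)}`;
}

console.log(fakeSsn()); // output varies from run to run
```

Because each part has a minimum with the full number of digits, the result always matches the NNN-NN-NNNN layout without padding.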