@undp-data/data-utils
v0.1.0
Published
Library with data utils for UNDP
Readme
UNDP Data Utils 
Getting Started
UNDP Data Utils is a lightweight utility library for working with data files (CSV, JSON, XLSX, etc.), with built-in support for validating, parsing, and formatting data.
We also assumes that you have intermediate knowledge about JavaScript/TypeScript.
NPM Package can be found here
✨ Features
- ✅ Parse CSV, JSON, and Excel (XLSX) files
- ✅ Validate data against a defined schema
- ✅ Convert between formats (e.g., CSV ↔ JSON)
- ✅ Merge or split data files based on a column
- ✅ Enrich data with country information (by name or ISO codes)
- ✅ Compatible with Node.js and modern browsers
Installation
Using npm
npm i @undp-data/data-utilsUsing yarn
yarn add @undp-data/data-utilsAPI Reference
Country Code & Info Utilities
- getISO2FromCountryName
- getISO3FromCountryName
- getISO2FromISO3
- getISO3FromISO2
- getCountryDetailsFromISO3
- getCountryDetailsFromISO2
- getCountryDetailsFromName
- addISO3CountryCodeFromISO2
- addISO3CountryCodeFromName
Data Format Conversion
Data Download Utilities
Remote Data Fetching
General Data Utilities
- addColumns
- removeColumns
- cleanData
- mergeData
- splitData
- renameHeader
- trimStringInData
- validateData
- parseData
- parseCsvString
- parseNumberAsFloatInData
- getDataSummary
getISO2FromCountryName
getISO2FromCountryName(name: string, threshold?: number): string | undefined
Returns the ISO 3166-1 Alpha-2 (2-letter) country code from a given country name. It attempts exact match, cleaned string match, and fuzzy match (using fuse.js) in order.
Parameters
- name: The name of the country (e.g., "Finland" or "United States of America").
- threshold: (optional) Fuzzy match sensitivity (default: 0.3, lower is stricter).
Example
getISO2FromCountryName('Finland'); // 'FI'
getISO2FromCountryName('U S A'); // 'US'
getISO2FromCountryName('Brzil'); // 'BR' (fuzzy match for 'Brazil')getISO3FromCountryName
getISO3FromCountryName(name: string, threshold?: number): string | undefined
Returns the ISO 3166-1 Alpha-3 (3-letter) country code from a given country name. It attempts exact match, cleaned string match, and fuzzy match (using fuse.js) in order.
Parameters
- name: The name of the country (e.g., "Finland" or "United States of America").
- threshold: (optional) Fuzzy match sensitivity (default: 0.3, lower is stricter).
Example
getISO2FromCountryName('Finland'); // 'FIN'
getISO2FromCountryName('U S A'); // 'USA'
getISO2FromCountryName('Argntina'); // 'ARG' (fuzzy match for 'Brazil')getISO2FromISO3
getISO2FromISO3(code: string): string | undefined
Converts a 3-letter ISO country code (Alpha-3) to a 2-letter ISO code (Alpha-2).
Parameters
- code: ISO 3166-1 Alpha-3 code (e.g., "SWE")
Example
getISO2FromISO3('SWE'); // 'SE'
getISO2FromISO3('DEU'); // 'DE'getISO3FromISO2
getISO3FromISO2(code: string): string | undefined
Converts a 2-letter ISO country code (Alpha-2) to a 3-letter ISO code (Alpha-3).
Parameters
- code: ISO 3166-1 Alpha-2 code (e.g., "SE")
Example
getISO3FromISO2('SE'); // 'SWE'
getISO3FromISO2('DE'); // 'DEU'getCountryDetailsFromISO3
getCountryDetailsFromISO3(code: string): object | undefined
Returns full country details for a given ISO 3-letter code.
Parameters
- code: ISO 3166-1 Alpha-3 code (e.g., "SWE")
Example
getCountryDetailsFromISO3('SWE'); // 'SWE'
/*
{
"Alpha-3 code":"SWE",
"Country or Area (official name)":"Sweden",
"Alpha-2 code":"SE",
"Numeric code":"752",
"Latitude (average)":"62.0",
"Longitude (average)":"15.0",
"Group 1":"Europe",
"Group 2":"Northern Europe",
"Group 3":"",
"LDC":false,
"LLDC":false,
"SIDS":false,
"UNDP Bureau": null
}
*/getCountryDetailsFromISO2
getCountryDetailsFromISO2(code: string): object | undefined
Returns full country details for a given ISO 2-letter code.
Parameters
- code: ISO 3166-1 Alpha-2 code (e.g., "SE")
Example
getCountryDetailsFromISO2('SE'); // 'SWE'
/*
{
"Alpha-3 code":"SWE",
"Country or Area (official name)":"Sweden",
"Alpha-2 code":"SE",
"Numeric code":"752",
"Latitude (average)":"62.0",
"Longitude (average)":"15.0",
"Group 1":"Europe",
"Group 2":"Northern Europe",
"Group 3":"",
"LDC":false,
"LLDC":false,
"SIDS":false,
"UNDP Bureau": null
}
*/getCountryDetailsFromName
getCountryDetailsFromName(name: string): object | undefined
Returns full country details from a given country name. It attempts exact match, cleaned string match, and fuzzy match (using fuse.js) in order.
Parameters
- name: The name of the country (e.g., "Finland" or "United States of America").
Example
getCountryDetailsFromName('Sweden');
/*
{
"Alpha-3 code":"SWE",
"Country or Area (official name)":"Sweden",
"Alpha-2 code":"SE",
"Numeric code":"752",
"Latitude (average)":"62.0",
"Longitude (average)":"15.0",
"Group 1":"Europe",
"Group 2":"Northern Europe",
"Group 3":"",
"LDC":false,
"LLDC":false,
"SIDS":false,
"UNDP Bureau": null
}
*/addISO3CountryCodeFromISO2
addISO3CountryCodeFromISO2(data, countryColumnName, iso3ColumnName?): object[]
Adds an ISO3 country code to each object based on a column with ISO2 codes.
Parameters
- data: object[] – array of records
- countryColumnName: string – column name with ISO2 codes
- iso3ColumnName: string (optional) – name for the new column (default: 'ISO3')
Example
const data = [
{ iso2: 'US', metric: 75 },
{ iso2: 'FR', metric: 60 },
];
const enriched = addISO3CountryCodeFromISO2(data, 'iso2', 'iso3_code');
console.log(enriched);
/*
[
{ iso2: 'US', metric: 75, iso3_code: 'USA' },
{ iso2: 'FR', metric: 60, iso3_code: 'FRA' }
]
*/addISO3CountryCodeFromName
addISO3CountryCodeFromName(data, countryColumnName, iso3ColumnName?, threshold?): object[]
Adds an ISO3 country code to each object in the dataset based on a country name column.
Parameters
- data: object[] – your array of records
- countryColumnName: string – name of the field containing country names
- iso3ColumnName: string (optional) – name for the new ISO3 column (default: 'ISO3')
- threshold: number (optional) – fuzzy match sensitivity (default: 0.3)
Example
const data = [
{ country: 'India', value: 100 },
{ country: 'Brzil', value: 50 },
];
const enriched = addISO3CountryCodeFromName(data, 'country');
console.log(enriched);
/*
[
{ country: 'India', value: 100, ISO3: 'IND' },
{ country: 'Brzil', value: 50, ISO3: 'BRA' } // fuzzy match to Brazil
]
*/csvToJson
csvToJson(data: string, delimiter = ',', fileName = 'data.json')
Parses a CSV string and downloads the result as a JSON file.
Parameters
- data: string – CSV string input
- delimiter: string (optional) – delimiter to use (default: ,)
- fileName: string (optional) – name for the downloaded file (default: 'data.json')
Example
const csv = `name,age,country
Alice,30,Canada
Bob,25,USA`;
csvToJson(csv); // Triggers download of "data.json"jsonToCsv
jsonToCsv(data: object[] | string, fileName = 'data.csv')
Converts a JSON array (or valid JSON string) to CSV and downloads it.
Parameters
- data: object[] | string – JSON data or JSON string
- fileName: string (optional) – name for the downloaded file (default: 'data.csv')
Example
const jsonData = [
{ name: 'Alice', age: 30 },
{ name: 'Bob', age: 25 },
];
jsonToCsv(jsonData); // Triggers download of "data.csv"xlsxToCSV
xlsxToCSV(file: File | string, combineSheets = false, filename = 'data.csv', ignoreSheets?: string[])
Converts an Excel file to CSV and triggers a file download.
Parameters
- file: File | string – XLSX file (uploaded or URL)
- combineSheets: boolean – if true, all sheets are merged into one CSV (default: false)
- filename: string – name for the downloaded CSV file (default: 'data.csv')
- ignoreSheets: string[] – sheet names to exclude from processing
Example
// From file upload
xlsxToCSV(fileInput.files[0], true, 'merged.csv', ['Sheet2']);
// From remote URL
xlsxToCSV('https://example.com/data.xlsx', false, 'first-sheet.csv');xlsxToJSON
xlsxToJSON(file: File | string, combineSheets = false, filename = 'data.json', ignoreSheets?: string[])
Converts an Excel file to JSON and triggers a file download.
Parameters
- file: File | string – XLSX file (uploaded or URL)
- combineSheets: boolean – if true, all sheets are merged into one array (default: false)
- filename: string – name for the downloaded JSON file (default: 'data.csv')
- ignoreSheets: string[] – sheet names to exclude from processing
Example
// From file upload
xlsxToJSON(fileInput.files[0], true, 'merged.json', ['Sheet2']);
// From remote URL
xlsxToJSON('https://example.com/data.xlsx', false, 'first-sheet.json');downloadCSVFromData
downloadCSVFromData(data: Record<string, any>[], filename = 'data.csv')
Converts an array of objects into a CSV file and triggers a download.
Parameters
- data: Record<string, any>[] – Array of objects to convert and download.
- filename: string (optional) – Filename for download (default: 'data.csv').
Example
const users = [
{ name: 'Alice', age: 30 },
{ name: 'Bob', age: 25 },
];
downloadCSVFromData(users, 'users.csv');downloadMultipleCSVFromData
downloadMultipleCSVFromData(data: Record<string, any>[][], filename = 'data')
Triggers download of multiple CSV files—one for each dataset in the array.
Parameters
- data: Record<string, any>[][] – Array of arrays of objects
- filename: string (optional) – Base name for the files (e.g., data_1.csv, data_2.csv...)
Example
const dataset1 = [{ id: 1 }, { id: 2 }];
const dataset2 = [{ id: 3 }, { id: 4 }];
downloadMultipleCSVFromData([dataset1, dataset2], 'batch');downloadJsonFile
downloadJsonFile(data: Record<string, any>[], filename = 'data.json')
Serializes the data to JSON and downloads it.
Parameters
- data: Record<string, any>[] – array of objects
- filename: string (optional) – name of the file to be downloaded (default: 'data.json')
Example
const report = [{ date: '2024-01-01', score: 80 }];
downloadJsonFile(report, 'report.json');getCsvFromFile
getCsvFromFile(file: File, delimiter = ','): Promise<object[]>
Parses a CSV file (e.g. from <input type="file" />) into an array of objects.
Parameters
- file: File – A file object (usually from an element)
- delimiter: string (optional) – CSV delimiter (default: ',')
Example
const file = fileInput.files[0];
const data = await getCsvFromFile(file);
console.log(data);
/*
[
{ name: 'Alice', age: '30' },
{ name: 'Bob', age: '25' },
]
*/getCsvFromUrl
getCsvFromUrl(url: string, delimiter = ','): Promise<object[]>
Fetches and parses a remote CSV file into structured data.
Parameters
- url: string – URL to a publicly accessible CSV file
- delimiter: string (optional) – CSV delimiter (default: ',')
Example
const url = 'https://example.com/data.csv';
const data = await getCsvFromUrl(url);getJsonFromUrl
getJsonFromUrl(url: string): Promise<object[]>
Fetches and parses a remote JSON file.
Parameters
- url: string – URL to a public JSON file
Example
const data = await getJsonFromUrl('https://example.com/data.json');getXLSXFromUrl
getXLSXFromUrl(url: string, combineSheets = false, ignoreSheets?: string[]): Promise<any[]>
Fetches and parses a .xlsx file from a remote URL.
Parameters
- url: string – Publicly accessible XLSX file URL
- combineSheets: boolean – if true, all sheets are merged into one array (default: false)
- filename: string – name for the downloaded JSON file (default: 'data.csv')
- ignoreSheets: string[] – sheet names to exclude from processing
Example
const data = await getXLSXFromUrl('https://example.com/data.xlsx', true);
console.log(data);getXLSXFromFile
getXLSXFromFile(file: File, combineSheets = false, ignoreSheets?: string[]): Promise<any[]>
Reads and parses a .xlsx file selected in the browser (via <input type="file" />).
Parameters
- file: File – XLSX file (uploaded or URL)
- combineSheets: boolean – if true, all sheets are merged into one array (default: false)
- ignoreSheets: string[] – sheet names to exclude from processing
Example
const file = fileInput.files[0];
const data = await getXLSXFromFile(file, true); // Combined data from all sheets
console.log(data);addColumns
addColumns(data: Record<string, any>[], columns: { header: string; value: any }[]): Record<string, any>[]
This utility adds one or more new columns with fixed values to each row in a dataset. Ideal for tagging data with metadata like source, sheet name, or version.
Parameters
- data: Record<string, any>[] – array of objects representing tabular data
- columns: { header: string; value: any }[] – list of column to add (with fixed values)
Example
const input = [
{ name: 'Alice', age: 30 },
{ name: 'Bob', age: 25 },
];
const updated = addColumns(input, [
{ header: 'source', value: 'Survey A' },
{ header: 'verified', value: true },
]);
/*
[
{ name: 'Alice', age: 30, source: 'Survey A', verified: true },
{ name: 'Bob', age: 25, source: 'Survey A', verified: true }
]
*/removeColumns
removeColumns(data: Record<string, any>[], columns: string[]): Record<string, any>[]
This utility Removes one or more specified columns from each object in a dataset. Useful for cleaning or anonymizing tabular data.
Parameters
- data: Record<string, any>[] – array of objects representing tabular data
- columns: string[] – list of column names (keys) to remove
Example
const file = fileInput.files[0];
const data = await getXLSXFromFile(file, true); // Combined data from all sheets
console.log(data);cleanData
cleanData(data: Record<string, any>[]): Record<string, any>[]
This utility cleans raw tabular data by trimming strings and parsing valid numbers into floats. Ideal as a pre-processing step before validation or transformation.
Parameters
- data: Record<string, any>[] – array of objects representing tabular data
Example
const rawData = [
{ name: ' Alice ', age: '30', country: ' USA ' },
{ name: 'Bob', age: '25.5', country: 'Canada' },
];
const cleaned = cleanData(rawData);
/*
[
{ name: 'Alice', age: 30, country: 'USA' },
{ name: 'Bob', age: 25.5, country: 'Canada' }
]
*/mergeData
mergeData(dataArray: Record<string, any>[][], joinKeys?: (keyof T)[]): Record<string,any>[]
This utility merges multiple arrays of objects into a single unified dataset. Optionally joins the data on specified key(s).
Parameters
- dataArray: Record<string, any>[] – array of datasets (arrays of objects) to be merged
- joinKeys: string[] – keys to use for grouping and merging rows; if not provided, data is flattened
Example
const a = [{ id: 1, name: 'Alice' }];
const b = [{ age: 30 }];
mergeData([a, b]);
// Output:
// [
// { id: 1, name: 'Alice' },
// { age: 30 }
// ]
const a = [
{ id: 1, name: 'Alice' },
{ id: 2, name: 'Bob' },
];
const b = [
{ id: 1, age: 30 },
{ id: 2, age: 25 },
];
mergeData([a, b], ['id']);
// Output:
// [
// { id: '1', name: 'Alice', age: 30 },
// { id: '2', name: 'Bob', age: 25 }
// ]splitData
splitData(data: Record<string, any>[], keyToSplit: string): Record<string, any>[][]
This utility splits a dataset (array of objects) into multiple groups based on the unique values of a specified key. Useful for segmenting data by category, region, user, etc.
Parameters
- data: Record<string, any>[] – array of objects representing tabular data
- keyToSplit: string – key name used to group the data
Example
const input = [
{ country: 'USA', name: 'Alice' },
{ country: 'India', name: 'Raj' },
{ country: 'USA', name: 'Bob' },
];
const grouped = splitData(input, 'country');
console.log(grouped);
/*
[
[
{ country: 'USA', name: 'Alice' },
{ country: 'USA', name: 'Bob' }
],
[
{ country: 'India', name: 'Raj' }
]
]
*/renameHeader
renameHeader(data: Record<string, any>[], headers: { oldHeader: string; newHeader: string }[]): Record<string, any>[]
This utility renames one or more column headers (object keys) in a dataset — typically used for tabular data like parsed CSV, JSON, or Excel rows.
Parameters
- data: Record<string, any>[] – array of objects representing tabular data
- headers: { oldHeader: string; newHeader: string }[] – list of header renaming instructions
Example
const input = [
{ Name: 'Alice', Age: 30 },
{ Name: 'Bob', Age: 25 },
];
const renamed = renameHeader(input, [
{ oldHeader: 'Name', newHeader: 'Full Name' },
{ oldHeader: 'Age', newHeader: 'Years' },
]);
console.log(renamed);
/*
[
{ 'Full Name': 'Alice', Years: 30 },
{ 'Full Name': 'Bob', Years: 25 },
]
*/trimStringInData
trimStringInData(data: Record<string, any>[]): Record<string, any>[]
This utility function trims leading and trailing whitespace from all string values across an array of objects (typically tabular data). It ensures full column coverage—even when keys are inconsistent across rows.
Parameters
- data: Record<string, any>[] – array of objects representing tabular data
Example
const input = [
{ name: ' Alice ', age: 25 },
{ name: 'Bob\n', country: ' USA ' },
{ name: ' Charlie', extra: null },
];
const cleaned = trimStringInData(input);
console.log(cleaned);
/*
[
{ name: 'Alice', age: 25 },
{ name: 'Bob', country: 'USA' },
{ name: 'Charlie', extra: null },
]
*/validateData
validateData(data: Record<string, any>[], schema: SchemaField[]): ValidationError[]
This utility validates data object based on provided schema.
Parameters
- data: Record<string, any>[] – array of raw row objects
- schema: SchemaField[] – validation schema for expected types
SchemaField
{
columnName: string;
type: 'string' | 'number' | 'Alpha 3 code' | 'dateTime' | 'boolean';
required?: boolean;
enum?: string[]; //only applicable if type is string
range?: [number, number]; //only applicable if type is number
dateFormat?: string; //only applicable if type is dateTime
};Output schema
{
index: number;
column: string;
error: string;
}[];Example
const schema = [
{ columnName: 'country', type: 'Alpha 3 code', required: true },
{ columnName: 'age', type: 'number', range: [18, 65] },
{ columnName: 'joinedAt', type: 'dateTime', dateFormat: 'yyyy-MM-dd' },
{ columnName: 'isActive', type: 'boolean' },
];
const data = [
{ country: 'USA', age: 22, joinedAt: '2024-01-01', isActive: 'Yes' },
{ country: 'XYZ', age: 70, joinedAt: 'invalid', isActive: 'maybe' },
];
const errors = validateData(data, schema);
console.log(errors);
/*
[
{ index: 1, column: 'country', error: 'country must be a valid Alpha-3 code. Received: "XYZ"' },
{ index: 1, column: 'age', error: 'age must be between 18 and 65. Received: "70"' },
{ index: 1, column: 'joinedAt', error: 'joinedAt must be a valid ISO 8601 datetime.Expected format: yyyy-MM-dd. Received: "invalid"' },
{ index: 1, column: 'isActive', error: 'isActive must be a true, True, TRUE, false, False, FALSE, Yes, YES, yes, No, no, NO. Received: "maybe"' },
]
*/parseData
parseData(data: Record<string, any>[], schema: SchemaField[]): Record<string, any>[] | ErrorMessage
This utility function parses data object based on provided schema.
If the data is not valid based on the schema the function throws an error
Parameters
- data: Record<string, any>[] – array of raw row objects
- schema: SchemaField[] – validation schema for expected types
SchemaField
{
columnName: string;
type: 'string' | 'number' | 'Alpha 3 code' | 'dateTime' | 'boolean';
dateFormat?: string; //only applicable if type is dateTime
};Example
const schema = [
{ columnName: 'age', type: 'number' },
{ columnName: 'isActive', type: 'boolean' },
{ columnName: 'joinedAt', type: 'dateTime', dataFormat: 'YYYY-mm-dd' },
];
const data = [
{ age: '25', isActive: 'Yes', joinedAt: '2024-01-01' },
];
const result = parseData(data, schema);
console.log(result)
/*
[
{ age: 25, isActive: true, joinedAt: "2021-01-01T00:00:00.000Z"}
]
*/parseCsvString
parseCsvString(csv: string, delimiter = ','): Record<string, any>[]
This utility function parses CSV data as string to object.
Parameters
- csv: string – CSV data as a string
- delimiter: string (optional) – CSV delimiter (default: ',')
Example
const file = fileInput.files[0];
const data = await getXLSXFromFile(file, true); // Combined data from all sheets
console.log(data);parseNumberAsFloatInData
getDataSummary(data: Record<string, any>[]): Record<string, any>[]
This utility function scans a dataset (array of objects) and converts any valid numeric values into JavaScript floating-point numbers (number type).
Parameters
- data: Record<string, any>[] – array of objects representing tabular data
Example
const csv = `name,age\nAlice,30\nBob,25`;
const data = parseCsvString(csv);
console.log(data);
/*
[
{ name: 'Alice', age: '30' },
{ name: 'Bob', age: '25' }
]
*/getDataSummary
getDataSummary(data: Record<string, any>[]): Summary[];
The getDataSummary function analyzes a dataset (array of objects) and returns a summary of each column, including:
- Data type detection
- Range and percentiles for numeric columns
- Histogram distribution for numeric data
- Unique values and frequency for categorical data
Parameters
- data: Record<string, any>[] – Array of objects representing tabular data
Output format
For numeric columns
{
columnName: 'age',
type: 'number',
range: [18, 65],
percentiles: {
p25: 25,
p50: 30,
p75: 45
},
histogram: [3, 7, 10, 5, 2, ...] // 10 bins
}For string (categorical) columns
{
columnName: 'country',
type: 'string',
enum: [
{ value: 'India', noOfRowsWithValue: 20 },
{ value: 'USA', noOfRowsWithValue: 15 },
]
}Example
const data = [
{ name: 'Alice', age: 25, country: 'India' },
{ name: 'Bob', age: 30, country: 'USA' },
{ name: 'Charlie', age: 25, country: 'India' },
];
const summary = getDataSummary(data);
console.log(summary);