@cerios/csv-nested-json
A powerful TypeScript CSV parser that transforms flat CSV data into nested JSON objects with support for dot notation, automatic array detection, and complex hierarchical structures.
🚀 Features
- Zero Dependencies - No external CSV parsing libraries
- Nested Objects - Use dot notation in headers (e.g., address.city)
- Automatic Array Detection - Smart array creation for grouped rows
- Multi-Level Nesting - Support for deeply nested structures
- Multiple Input Methods - Parse from files (sync/async), strings, or streams
- True Streaming Parser - Memory-efficient parsing for very large files
- Bidirectional Conversion - Convert CSV to JSON and JSON back to CSV
- Column Selection - Include or exclude specific columns during parsing
- Duplicate Header Handling - Smart strategies for duplicate column names
- Limit Records - Stop parsing after N records for previews or pagination
- Progress Monitoring - Track parsing progress with callbacks for large files
- Batch Processing - Process records in configurable batches for memory efficiency
- Value Transformations - Auto-parse numbers, booleans, dates, or use custom transformers
- Header Transformations - Transform and map column names
- Row Filtering - Filter rows during parsing for memory efficiency
- RFC 4180 Compliant - Handles quoted fields, escaped quotes, and various line endings
- Flexible Delimiters - Support for comma, semicolon, tab, pipe, and custom delimiters
- Custom Encodings - Handle different file encodings (UTF-8, Latin1, etc.)
- BOM Handling - Automatic Byte Order Mark detection and removal
- TypeScript & JavaScript - Full type definitions included
- CommonJS & ESM - Works in both module systems
- Validation Modes - Flexible error handling for malformed data
- Custom Error Classes - Detailed error information for debugging
📦 Installation
npm install @cerios/csv-nested-json
🎯 Quick Start
import { CsvParser } from '@cerios/csv-nested-json';
// Parse CSV file
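// data.csv is assumed to look like this (matching the output below):
// id,name,address.street,address.city,address.zip
// 1,John Doe,123 Main St,New York,10001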
const result = CsvParser.parseFileSync('data.csv');
console.log(result);
// Output:
// [
// {
// id: "1",
// name: "John Doe",
// address: {
// street: "123 Main St",
// city: "New York",
// zip: "10001"
// }
// }
// ]
📖 API Reference
| Method | Description |
|--------|-------------|
| CsvParser.parseFileSync() | Parse CSV file synchronously |
| CsvParser.parseFile() | Parse CSV file asynchronously |
| CsvParser.parseString() | Parse CSV string content |
| CsvParser.parseStream() | Parse CSV from readable stream |
| CsvStreamParser | True streaming parser for very large files |
| JsonToCsv.stringify() | Convert JSON objects to CSV string |
| JsonToCsv.writeFileSync() | Write JSON objects to CSV file (sync) |
| JsonToCsv.writeFile() | Write JSON objects to CSV file (async) |
🔧 Basic Usage
1. Parse File (Synchronous)
import { CsvParser } from '@cerios/csv-nested-json';
const result = CsvParser.parseFileSync('./data.csv');
When to use: Small to medium files (<10MB), synchronous workflows, simple scripts.
2. Parse File (Asynchronous)
import { CsvParser } from '@cerios/csv-nested-json';
const result = await CsvParser.parseFile('./data.csv');
When to use: Medium to large files, async/await workflows, web servers, non-blocking operations.
3. Parse String
import { CsvParser } from '@cerios/csv-nested-json';
const csvString = `id,name,age
1,Alice,30
2,Bob,25`;
const result = CsvParser.parseString(csvString);
When to use: API responses, in-memory CSV data, testing, dynamic CSV generation.
4. Parse Stream
import { CsvParser } from '@cerios/csv-nested-json';
import { createReadStream } from 'node:fs';
const stream = createReadStream('./large-file.csv');
const result = await CsvParser.parseStream(stream);
When to use: Very large files (>100MB), memory-constrained environments, real-time processing.
5. True Streaming Parser (Memory Efficient)
For very large files where you want to process records one at a time without loading everything into memory:
import { CsvStreamParser } from '@cerios/csv-nested-json';
import { createReadStream } from 'node:fs';
const parser = new CsvStreamParser({
autoParseNumbers: true,
autoParseBooleans: true
});
// Using async iteration
const stream = createReadStream('./very-large-file.csv');
for await (const record of stream.pipe(parser)) {
console.log('Parsed record:', record);
// Process each record as it's parsed
}
// Or using events
createReadStream('./very-large-file.csv')
.pipe(new CsvStreamParser())
.on('data', (record) => {
console.log('Record:', record);
})
.on('end', () => {
console.log('Done!');
})
.on('error', (err) => {
console.error('Error:', err);
});
When to use: Files too large to fit in memory, real-time processing, ETL pipelines.
Progress Monitoring
Track parsing progress for large files:
import { CsvStreamParser, ProgressInfo } from '@cerios/csv-nested-json';
import { createReadStream } from 'node:fs';
const parser = new CsvStreamParser({
progressCallback: (progress: ProgressInfo) => {
console.log(`Processed ${progress.recordsEmitted} records`);
console.log(`Bytes: ${progress.bytesProcessed}`);
console.log(`Elapsed: ${progress.elapsedMs}ms`);
},
progressInterval: 1000 // Call every 1000 records (default: 100)
});
for await (const record of createReadStream('./large.csv').pipe(parser)) {
// Process record
}
The ProgressInfo object contains:
- bytesProcessed: Total bytes read so far
- recordsEmitted: Number of records emitted
- headersProcessed: Whether headers have been parsed
- elapsedMs: Milliseconds since parsing started
Batch Processing
Process records in batches for memory-efficient streaming:
import { CsvStreamParser } from '@cerios/csv-nested-json';
import { createReadStream } from 'node:fs';
const parser = new CsvStreamParser({
batchSize: 100 // Emit arrays of 100 records
});
for await (const batch of createReadStream('./large.csv').pipe(parser)) {
// batch is an array of up to 100 records
await processBatch(batch);
}
// Note: parseStream() always returns a flat array regardless of batchSize
const allRecords = await CsvStreamParser.parseStream(
createReadStream('./data.csv'),
{ batchSize: 100 } // Batching used internally, result is flattened
);
6. Convert JSON to CSV
import { JsonToCsv } from '@cerios/csv-nested-json';
const data = [
{
id: '1',
name: 'Alice',
address: { city: 'NYC', zip: '10001' }
},
{
id: '2',
name: 'Bob',
address: { city: 'LA', zip: '90001' }
}
];
// Convert to CSV string
const csvString = JsonToCsv.stringify(data);
console.log(csvString);
// Output:
// id,name,address.city,address.zip
// 1,Alice,NYC,10001
// 2,Bob,LA,90001
// Write directly to file
JsonToCsv.writeFileSync('./output.csv', data);
// Or async
await JsonToCsv.writeFile('./output.csv', data);
🎯 Advanced Examples
Simple Flat CSV
Input CSV:
id,name,email
1,John Doe,[email protected]
2,Jane Smith,[email protected]
Output JSON:
[
{
"id": "1",
"name": "John Doe",
"email": "[email protected]"
},
{
"id": "2",
"name": "Jane Smith",
"email": "[email protected]"
}
]
Nested Objects with Dot Notation
Input CSV:
id,name,address.street,address.city,address.zip
1,John Doe,123 Main St,New York,10001
Code:
const result = CsvParser.parseFileSync('./nested-data.csv');
Output JSON:
[
{
"id": "1",
"name": "John Doe",
"address": {
"street": "123 Main St",
"city": "New York",
"zip": "10001"
}
}
]
Arrays from Grouped Rows
Rows without a value in the first column are treated as continuation rows and automatically create arrays:
Input CSV:
id,name,phones.type,phones.number
1,Alice,mobile,555-0001
,,home,555-0002
,,work,555-0003
Code:
const result = CsvParser.parseFileSync('./grouped-data.csv');
Output JSON:
[
{
"id": "1",
"name": "Alice",
"phones": [
{ "type": "mobile", "number": "555-0001" },
{ "type": "home", "number": "555-0002" },
{ "type": "work", "number": "555-0003" }
]
}
]
Deeply Nested Structures
Input CSV:
id,user.name,user.profile.age,user.profile.address.city,user.profile.address.zip
1,Alice,30,New York,10001
Code:
const result = CsvParser.parseString(csvContent);
Output JSON:
[
{
"id": "1",
"user": {
"name": "Alice",
"profile": {
"age": "30",
"address": {
"city": "New York",
"zip": "10001"
}
}
}
}
]
Forced Array Fields with [] Suffix
Use the [] suffix in headers to force a field to always be an array, even with a single value:
Input CSV:
id,name,tags[]
1,Alice,javascript
2,Bob,python
Code:
const result = CsvParser.parseString(csvContent);
Output JSON:
[
{ "id": "1", "name": "Alice", "tags": ["javascript"] },
{ "id": "2", "name": "Bob", "tags": ["python"] }
]
Auto-Parse Numbers and Booleans
const csvContent = `id,name,age,price,active,verified
1,Alice,30,19.99,true,FALSE
2,Bob,25,29.99,false,TRUE`;
const result = CsvParser.parseString(csvContent, {
autoParseNumbers: true,
autoParseBooleans: true
});
// Result:
// [
// { id: 1, name: "Alice", age: 30, price: 19.99, active: true, verified: false },
// { id: 2, name: "Bob", age: 25, price: 29.99, active: false, verified: true }
// ]
Note: Strings with leading zeros (like "007") are preserved as strings to avoid data loss.
Auto-Parse Dates
const csvContent = `id,name,createdAt,updatedAt
1,Alice,2024-01-15,2024-06-30T10:30:00Z`;
const result = CsvParser.parseString(csvContent, {
autoParseDates: true
});
// Result:
// [
// {
// id: "1",
// name: "Alice",
// createdAt: Date("2024-01-15"),
// updatedAt: Date("2024-06-30T10:30:00Z")
// }
// ]
Custom Value Transformer
const csvContent = `id,name,email
1,alice,[email protected]
2,bob,[email protected]`;
const result = CsvParser.parseString(csvContent, {
valueTransformer: (value, header) => {
// Uppercase names
if (header === 'name' && typeof value === 'string') {
return value.toUpperCase();
}
return value;
}
});
// Result:
// [
// { id: "1", name: "ALICE", email: "[email protected]" },
// { id: "2", name: "BOB", email: "[email protected]" }
// ]
Header Transformation
const csvContent = `User ID,First Name,Last Name,Email Address
1,John,Doe,[email protected]`;
const result = CsvParser.parseString(csvContent, {
// Convert headers to camelCase
headerTransformer: (header) => {
return header
.toLowerCase()
.replace(/\s+(.)/g, (_, c) => c.toUpperCase());
}
});
// Result:
// [{ userId: "1", firstName: "John", lastName: "Doe", emailAddress: "[email protected]" }]
Column Mapping
const csvContent = `user_id,first_name,last_name
1,John,Doe`;
const result = CsvParser.parseString(csvContent, {
columnMapping: {
'user_id': 'id',
'first_name': 'firstName',
'last_name': 'lastName'
}
});
// Result:
// [{ id: "1", firstName: "John", lastName: "Doe" }]Row Filtering
Filter rows during parsing for better memory efficiency:
const csvContent = `id,name,status
1,Alice,active
2,Bob,deleted
3,Charlie,active
4,Diana,pending`;
const result = CsvParser.parseString(csvContent, {
rowFilter: (record, rowIndex) => {
// Only include active records
return record.status === 'active';
}
});
// Result:
// [
// { id: "1", name: "Alice", status: "active" },
// { id: "3", name: "Charlie", status: "active" }
// ]
Column Selection
Include or exclude specific columns during parsing:
const csvContent = `id,name,email,password,role
1,Alice,[email protected],secret123,admin
2,Bob,[email protected],password456,user`;
// Include only specific columns
const result1 = CsvParser.parseString(csvContent, {
includeColumns: ['id', 'name', 'email']
});
// Result: [{ id: "1", name: "Alice", email: "[email protected]" }, ...]
// Exclude sensitive columns
const result2 = CsvParser.parseString(csvContent, {
excludeColumns: ['password']
});
// Result: [{ id: "1", name: "Alice", email: "[email protected]", role: "admin" }, ...]Duplicate Header Handling
Handle CSV files with duplicate column names:
const csvContent = `id,name,value,value,value
1,Test,A,B,C`;
// Keep first occurrence
const result1 = CsvParser.parseString(csvContent, {
duplicateHeaders: 'first'
});
// Result: [{ id: "1", name: "Test", value: "A" }]
// Keep last occurrence
const result2 = CsvParser.parseString(csvContent, {
duplicateHeaders: 'last'
});
// Result: [{ id: "1", name: "Test", value: "C" }]
// Combine into comma-separated string
const result3 = CsvParser.parseString(csvContent, {
duplicateHeaders: 'combine'
});
// Result: [{ id: "1", name: "Test", value: "A,B,C" }]
// Rename duplicates with suffix
const result4 = CsvParser.parseString(csvContent, {
duplicateHeaders: 'rename'
});
// Result: [{ id: "1", name: "Test", value: "A", value_1: "B", value_2: "C" }]
// Throw error on duplicates (default)
const result5 = CsvParser.parseString(csvContent, {
duplicateHeaders: 'error'
});
// Throws CsvDuplicateHeaderError
Limit Records
Limit the number of records parsed (useful for previews or pagination):
const csvContent = `id,name
1,Alice
2,Bob
3,Charlie
4,Diana
5,Eve`;
const result = CsvParser.parseString(csvContent, {
limit: 3
});
// Result: [{ id: "1", ... }, { id: "2", ... }, { id: "3", ... }]
// Parsing stops after 3 records - efficient for large files
Skip Rows (Metadata Headers)
const csvContent = `Report generated on 2024-01-15
Source: Production Database
id,name,email
1,Alice,[email protected]
2,Bob,[email protected]`;
const result = CsvParser.parseString(csvContent, {
skipRows: 2 // Skip the first 2 metadata rows
});
// Result:
// [
// { id: "1", name: "Alice", email: "[email protected]" },
// { id: "2", name: "Bob", email: "[email protected]" }
// ]
Default Values
const csvContent = `id,name,status,country
1,Alice,,
2,Bob,active,USA`;
const result = CsvParser.parseString(csvContent, {
defaultValues: {
status: 'pending',
country: 'Unknown'
}
});
// Result:
// [
// { id: "1", name: "Alice", status: "pending", country: "Unknown" },
// { id: "2", name: "Bob", status: "active", country: "USA" }
// ]
Null Value Handling
const csvContent = `id,name,nickname
1,Alice,N/A
2,Bob,null
3,Charlie,Bobby`;
const result = CsvParser.parseString(csvContent, {
nullValues: ['null', 'NULL', 'N/A', 'n/a', ''],
nullRepresentation: 'null' // or 'undefined', 'empty-string', 'omit'
});
// Result with nullRepresentation: 'null':
// [
// { id: "1", name: "Alice", nickname: null },
// { id: "2", name: "Bob", nickname: null },
// { id: "3", name: "Charlie", nickname: "Bobby" }
// ]
// Result with nullRepresentation: 'omit' (default):
// [
// { id: "1", name: "Alice" },
// { id: "2", name: "Bob" },
// { id: "3", name: "Charlie", nickname: "Bobby" }
// ]
BOM Handling
The parser automatically strips UTF-8 and UTF-16 BOM by default:
// BOM is automatically handled
const result = CsvParser.parseFileSync('./windows-excel-export.csv');
// Disable BOM stripping if needed
const result2 = CsvParser.parseString(csvContent, {
stripBom: false
});
Complex Multi-Group Example
Input CSV:
id,username,profile.firstName,profile.lastName,addresses.type,addresses.city
1,johndoe,John,Doe,home,New York
,,,,work,Boston
2,janedoe,Jane,Doe,home,Chicago
Code:
const result = CsvParser.parseFileSync('./complex-data.csv');
Output JSON:
[
{
"id": "1",
"username": "johndoe",
"profile": {
"firstName": "John",
"lastName": "Doe"
},
"addresses": [
{ "type": "home", "city": "New York" },
{ "type": "work", "city": "Boston" }
]
},
{
"id": "2",
"username": "janedoe",
"profile": {
"firstName": "Jane",
"lastName": "Doe"
},
"addresses": {
"type": "home",
"city": "Chicago"
}
}
]
Note: The first record has an array of addresses (multiple entries), while the second has a single address object.
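If downstream code expects a consistent shape, one option is to normalize such fields after parsing. A minimal sketch (the normalizeToArray helper is illustrative and not part of the library):
import { CsvParser } from '@cerios/csv-nested-json';
function normalizeToArray<T>(value: T | T[] | undefined): T[] {
  // Wrap single objects, pass arrays through, treat missing values as empty
  if (value === undefined) return [];
  return Array.isArray(value) ? value : [value];
}
const records = CsvParser.parseFileSync('./complex-data.csv');
for (const record of records) {
  const addresses = normalizeToArray(record.addresses);
  // addresses is now always an array, however many rows the group had
}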
Custom Delimiters
Semicolon-separated values:
const csvSemicolon = `id;name;city
1;Alice;NYC
2;Bob;LA`;
const result = CsvParser.parseString(csvSemicolon, {
delimiter: ';'
});
Tab-separated values:
const csvTab = `id\tname\tcity
1\tAlice\tNYC`;
const result = CsvParser.parseString(csvTab, {
delimiter: '\t'
});
Pipe-separated values:
const csvPipe = `id|name|city
1|Alice|NYC`;
const result = CsvParser.parseString(csvPipe, {
delimiter: '|'
});
Custom Quote Character
const csvSingleQuote = `id,name,message
1,Alice,'Hello, World'
2,Bob,'It''s working'`;
const result = CsvParser.parseString(csvSingleQuote, {
quote: "'"
});
Custom Encoding
// Latin1 encoding
const result = await CsvParser.parseFile('./data-latin1.csv', {
encoding: 'latin1'
});
// UTF-16LE encoding
const result2 = await CsvParser.parseFile('./data-utf16.csv', {
encoding: 'utf16le'
});
Validation Modes
// Ignore extra columns silently
const result1 = CsvParser.parseString(csvData, {
validationMode: 'ignore'
});
// Warn about extra columns (default)
const result2 = CsvParser.parseString(csvData, {
validationMode: 'warn'
});
// Throw error on extra columns
try {
const result3 = CsvParser.parseString(csvData, {
validationMode: 'error'
});
} catch (error) {
console.error('Validation error:', error.message);
}
Parse CSV from API Response
async function parseApiCsv() {
const response = await fetch('https://api.example.com/data.csv');
const csvString = await response.text();
const data = CsvParser.parseString(csvString, {
validationMode: 'ignore'
});
return data;
}
Parse Large File with Streams
import { createReadStream } from 'node:fs';
async function parseLargeFile(filePath: string) {
const stream = createReadStream(filePath, {
highWaterMark: 64 * 1024 // 64KB chunks
});
const data = await CsvParser.parseStream(stream, {
validationMode: 'warn',
encoding: 'utf-8'
});
return data;
}
European CSV Format
European CSV files typically use semicolon delimiters and comma as decimal separator:
const europeanCsv = `id;name;price;location.city;location.country
1;Product A;12,50;Paris;France
2;Product B;8,99;Berlin;Germany`;
const result = CsvParser.parseString(europeanCsv, {
delimiter: ';',
validationMode: 'error'
});
// Result:
// [
// {
// id: "1",
// name: "Product A",
// price: "12,50",
// location: { city: "Paris", country: "France" }
// },
// ...
// ]
Parse Multiple Files Concurrently
const files = ['data1.csv', 'data2.csv', 'data3.csv'];
const results = await Promise.all(
files.map(file => CsvParser.parseFile(file))
);
🧪 Options Reference
CsvParserOptions
interface CsvParserOptions {
// Validation
validationMode?: 'ignore' | 'warn' | 'error'; // Default: 'warn'
// Parsing
delimiter?: string; // Default: ','
quote?: string; // Default: '"'
// File I/O
encoding?: BufferEncoding; // Default: 'utf-8'
// Row handling
skipRows?: number; // Default: 0
stripBom?: boolean; // Default: true
rowFilter?: (record, rowIndex) => boolean; // Filter rows during parsing
limit?: number; // Max records to parse
// Column selection
includeColumns?: string[]; // Include only these columns
excludeColumns?: string[]; // Exclude these columns
// Duplicate header handling
duplicateHeaders?: DuplicateHeaderStrategy; // Default: 'error'
// Value transformations
autoParseNumbers?: boolean; // Default: false
autoParseBooleans?: boolean; // Default: false
autoParseDates?: boolean; // Default: false
valueTransformer?: (value, header) => any; // Custom value transformer
// Header transformations
headerTransformer?: (header) => string; // Transform header names
columnMapping?: Record<string, string>; // Rename columns
// Array handling
arraySuffixIndicator?: string; // Default: '[]'
emptyArrayBehavior?: 'empty-array' | 'omit'; // Default: 'omit'
// Null handling
nullValues?: string[]; // Values to treat as null
nullRepresentation?: 'null' | 'undefined' | 'empty-string' | 'omit'; // Default: 'omit'
// Default values
defaultValues?: Record<string, string>; // Default values for empty cells
// Row grouping
identifierColumn?: string; // Column for grouping continuation rows
}
// Streaming-specific options (CsvStreamParser)
interface CsvStreamParserOptions extends CsvParserOptions {
nested?: boolean; // Emit nested objects (default: true)
batchSize?: number; // Emit records in batches
progressCallback?: ProgressCallback; // Progress tracking callback
progressInterval?: number; // Records between callbacks (default: 100)
}
Option Details
validationMode
Controls how the parser handles rows with more values than headers:
- 'ignore': Silently ignore extra values
- 'warn' (default): Log a warning to the console
- 'error': Throw a CsvValidationError
delimiter
Field delimiter character. Common values:
- ',' (default) - Comma-separated values
- ';' - Semicolon-separated values (common in Europe)
- '\t' - Tab-separated values
- '|' - Pipe-separated values
quote
Quote character for escaping fields containing delimiters or newlines:
'"'(default) - Double quotes"'"- Single quotes
encoding
File encoding when reading from files or streams:
- 'utf-8' (default)
- 'utf-16le'
- 'latin1'
- 'ascii'
skipRows
Number of rows to skip before the header row. Useful for files with metadata at the top.
stripBom
Automatically remove BOM (Byte Order Mark) from the beginning of content. Default: true
autoParseNumbers
Automatically convert numeric strings to numbers. Strings with leading zeros (like "007") are preserved.
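For example (a small sketch of the rule above):
const result = CsvParser.parseString('code,qty\n007,12', { autoParseNumbers: true });
// Expected: [{ code: "007", qty: 12 }] (the leading-zero value stays a string)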
autoParseBooleans
Automatically convert 'true'/'false' strings to booleans (case-insensitive).
autoParseDates
Automatically convert date strings to JavaScript Date objects using Date.parse().
valueTransformer
Custom function to transform values after parsing. Called after auto-parse options.
valueTransformer: (value, header) => {
if (header === 'email') return value.toLowerCase();
return value;
}
headerTransformer
Transform header names before processing. Useful for converting to camelCase, lowercase, etc.
headerTransformer: (header) => header.toLowerCase().replace(/\s+/g, '_')
columnMapping
Map/rename column headers. Applied after headerTransformer.
columnMapping: { 'user_id': 'id', 'first_name': 'firstName' }
rowFilter
Filter rows during parsing. More memory-efficient than filtering after parsing.
rowFilter: (record, rowIndex) => record.status === 'active'
limit
Maximum number of records to parse. Parsing stops after this limit is reached, which is efficient for large files when you only need a preview or first N records.
limit: 100 // Stop after 100 records
includeColumns
Array of column names to include. Only these columns will be in the output.
includeColumns: ['id', 'name', 'email'] // Only include these columns
excludeColumns
Array of column names to exclude. All other columns will be included.
excludeColumns: ['password', 'secret'] // Exclude sensitive columns
duplicateHeaders
Strategy for handling duplicate column names in CSV headers. Default: 'error'
duplicateHeaders: 'rename' // 'error' | 'rename' | 'combine' | 'first' | 'last'
- 'error' (default): Throw CsvDuplicateHeaderError on duplicates
- 'rename': Rename duplicates with a suffix (e.g., value, value_1, value_2)
- 'combine': Combine values into a comma-separated string
- 'first': Keep only the first occurrence of duplicate headers
- 'last': Keep only the last occurrence
identifierColumn
Column to use as the identifier for grouping continuation rows. By default, the first column is used to identify new records. When this column has an empty value, the row is treated as a continuation of the previous record.
// Use 'productId' instead of first column to group rows
identifierColumn: 'productId'
arraySuffixIndicator
Suffix in headers to force array type. Default: '[]'
emptyArrayBehavior
How to handle forced array fields with no values:
- 'omit' (default): Don't include the field
- 'empty-array': Include as []
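A small sketch of the difference, reusing the tags[] header from the earlier example (expected shapes, not verified output):
const csv = `id,tags[]
1,js
2,`;
const withEmpty = CsvParser.parseString(csv, { emptyArrayBehavior: 'empty-array' });
// Row 2 keeps the field as an empty array: { id: "2", tags: [] }
const omitted = CsvParser.parseString(csv, { emptyArrayBehavior: 'omit' });
// Row 2 drops the field entirely: { id: "2" }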
nullValues
Strings to interpret as null values. Default: ['null', 'NULL', 'nil', 'NIL', '']
nullRepresentation
How to represent null values in output:
- 'omit' (default): Remove the field
- 'null': Use JavaScript null
- 'undefined': Use JavaScript undefined
- 'empty-string': Use an empty string ''
defaultValues
Default values for columns when cells are empty.
defaultValues: { status: 'pending', country: 'Unknown' }
Complete Example with All Options
const result = await CsvParser.parseFile('./data.csv', {
// Validation
validationMode: 'error',
// Parsing
delimiter: ',',
quote: '"',
encoding: 'utf-8',
// Row handling
skipRows: 2,
stripBom: true,
rowFilter: (record) => record.status !== 'deleted',
limit: 1000,
// Column selection
excludeColumns: ['password', 'secret'],
// Duplicate header handling
duplicateHeaders: 'rename',
// Value transformations
autoParseNumbers: true,
autoParseBooleans: true,
autoParseDates: true,
valueTransformer: (value, header) => {
if (header === 'email') return value.toLowerCase();
return value;
},
// Header transformations
headerTransformer: (h) => h.toLowerCase().replace(/\s+/g, '_'),
columnMapping: { 'user_id': 'id' },
// Row grouping
identifierColumn: 'id',
// Array handling
arraySuffixIndicator: '[]',
emptyArrayBehavior: 'empty-array',
// Null handling
nullValues: ['null', 'N/A', '-'],
nullRepresentation: 'null',
// Defaults
defaultValues: { status: 'pending' }
});
📚 API Reference
CsvParser Class
parseFileSync<T>(filePath: string, options?: CsvParserOptions): T[]
Parses a CSV file synchronously and returns an array of nested JSON objects.
parseFile<T>(filePath: string, options?: CsvParserOptions): Promise<T[]>
Parses a CSV file asynchronously.
parseString<T>(csvContent: string, options?: CsvParserOptions): T[]
Parses CSV string content.
parseStream<T>(stream: Readable, options?: CsvParserOptions): Promise<T[]>
Parses CSV from a readable stream.
CsvStreamParser Class
A Transform stream that parses CSV data chunk by chunk, emitting records as they become available.
import { CsvStreamParser, ProgressInfo } from '@cerios/csv-nested-json';
const parser = new CsvStreamParser({
nested: true, // Emit nested objects (default: true)
autoParseNumbers: true,
limit: 1000, // Stop after 1000 records
batchSize: 100, // Emit in batches of 100
progressCallback: (info: ProgressInfo) => {
console.log(`Progress: ${info.recordsEmitted} records, ${info.elapsedMs}ms`);
},
progressInterval: 500, // Call progress every 500 records
// ... other CsvParserOptions
});
createReadStream('./large.csv')
.pipe(parser)
.on('data', (record) => console.log(record))
.on('end', () => console.log('Done'));
Static Promise API
// Parse stream and collect all records
const records = await CsvStreamParser.parseStream(
createReadStream('./data.csv'),
{ autoParseNumbers: true, limit: 100 }
);
JsonToCsv Class
stringify(data: object[], options?: JsonToCsvOptions): string
Convert array of objects to CSV string.
writeFileSync(filePath: string, data: object[], options?: JsonToCsvOptions): void
Write objects to CSV file synchronously.
writeFile(filePath: string, data: object[], options?: JsonToCsvOptions): Promise<void>
Write objects to CSV file asynchronously.
import { JsonToCsv } from '@cerios/csv-nested-json';
const data = [
{ id: 1, user: { name: 'Alice', age: 30 }, tags: ['js', 'ts'] }
];
const csv = JsonToCsv.stringify(data, {
delimiter: ',',
quote: '"',
arrayMode: 'rows' // 'rows' (continuation rows) or 'json' (stringify arrays)
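// With 'rows', array items (like tags above) would become continuation rows;
// with 'json', arrays would be stringified into a single CSV cell.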
});
Error Classes
The library provides custom error classes for better error handling:
import {
CsvParseError,
CsvFileNotFoundError,
CsvValidationError,
CsvEncodingError,
CsvDuplicateHeaderError
} from '@cerios/csv-nested-json';
try {
const result = CsvParser.parseFileSync('./data.csv', {
validationMode: 'error'
});
} catch (error) {
if (error instanceof CsvFileNotFoundError) {
console.error(`File not found: ${error.filePath}`);
} else if (error instanceof CsvDuplicateHeaderError) {
console.error(`Duplicate headers: ${error.duplicateHeaders.join(', ')}`);
} else if (error instanceof CsvValidationError) {
console.error(`Validation error at row ${error.row}`);
console.error(`Expected ${error.expectedColumns}, got ${error.actualColumns}`);
} else if (error instanceof CsvEncodingError) {
console.error(`Encoding error: ${error.encoding}`);
} else if (error instanceof CsvParseError) {
console.error(`Parse error at row ${error.row}, column ${error.column}`);
}
}
💡 How It Works
1. Row Grouping
Records are grouped by the first column (identifier) by default, or by the column specified in identifierColumn. When this column is empty, the row is treated as a continuation of the previous group:
id,name,item
1,Alice,Book
,,Pen
,,Notebook
2,Bob,Laptop
Groups:
- Group 1: Rows with id=1 and the two continuation rows
- Group 2: Row with id=2
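A minimal sketch of keying groups off a different column via identifierColumn (illustrative data; the exact handling of repeated values in other columns may vary):
import { CsvParser } from '@cerios/csv-nested-json';
const csv = `region,orderId,items.sku
EU,A-100,SKU-1
,,SKU-2
US,A-200,SKU-3`;
const result = CsvParser.parseString(csv, { identifierColumn: 'orderId' });
// A new record starts whenever orderId is non-empty; the row with an empty
// orderId is folded into order A-100, giving it items for SKU-1 and SKU-2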
2. Dot Notation Parsing
Column headers with dots create nested object structures:
user.profile.name,user.profile.age
Alice,30Creates:
{
"user": {
"profile": {
"name": "Alice",
"age": "30"
}
}
}
3. Automatic Array Detection
When the same key path appears multiple times within a group, an array is automatically created:
id,contact.type,contact.value
1,email,[email protected]
,phone,555-1234
Creates:
{
"id": "1",
"contact": [
{ "type": "email", "value": "[email protected]" },
{ "type": "phone", "value": "555-1234" }
]
}
4. Empty Value Handling
Empty or null values are omitted from the output:
id,name,optional
1,Alice,
2,Bob,Value
Creates:
[
{ "id": "1", "name": "Alice" },
{ "id": "2", "name": "Bob", "optional": "Value" }
]
🆚 Comparison
When to Use Each Method
| Method | Best For | File Size | Blocking |
|--------|----------|-----------|----------|
| parseFileSync() | Scripts, small files | <10MB | Yes |
| parseFile() | Web servers, medium files | 10MB-100MB | No |
| parseString() | API responses, testing | Any (in-memory) | Yes |
| parseStream() | Large files, memory efficiency | >100MB | No |
| CsvStreamParser | Very large files, ETL pipelines | Any size | No |
Traditional CSV Parsing
// ❌ Manual parsing - tedious and error-prone
const fs = require('fs');
const data = fs.readFileSync('data.csv', 'utf-8');
const lines = data.split('\n');
const headers = lines[0].split(',');
const result = [];
for (let i = 1; i < lines.length; i++) {
const values = lines[i].split(',');
const obj = {};
for (let j = 0; j < headers.length; j++) {
const keys = headers[j].split('.');
let current = obj;
// Manually handle nesting...
// ... complex nested object logic
}
result.push(obj);
}
With @cerios/csv-nested-json
// ✅ Simple, type-safe, and powerful
const result = CsvParser.parseFileSync('data.csv');
// ✅ Automatic nested object creation
// ✅ Automatic array detection
// ✅ RFC 4180 compliant parsing
// ✅ Flexible configuration options
📋 CSV Format Support
The library is fully RFC 4180 compliant and supports:
- ✅ Quoted Fields with Commas: "value, with, commas"
- ✅ Quoted Fields with Newlines: Multi-line values within quotes
- ✅ Escaped Quotes: "He said ""Hello""" → He said "Hello"
- ✅ Various Line Endings: Windows (CRLF), Unix (LF), Mac (CR)
- ✅ BOM Handling: UTF-8 and UTF-16 BOM automatically stripped
- ✅ Empty Lines: Automatically skipped
- ✅ Flexible Column Counts: Continuation rows can have different column counts
- ✅ Custom Delimiters: Comma, semicolon, tab, pipe, or any character
- ✅ Custom Quote Characters: Double quotes, single quotes, or any character
- ✅ Multiple Encodings: UTF-8, Latin1, UTF-16, and more
Quoted Fields Examples
id,name,description
1,Alice,"Product with, comma"
2,Bob,"Product with ""quotes"""
3,Charlie,"Multi-line
description here"All of these are correctly parsed!
💻 TypeScript Support
Full TypeScript support with comprehensive type definitions:
import {
CsvParser,
CsvStreamParser,
JsonToCsv,
CsvParserOptions,
CsvStreamParserOptions,
CsvParseError,
CsvValidationError,
NestedObject,
ProgressInfo,
ProgressCallback,
DuplicateHeaderOptions
} from '@cerios/csv-nested-json';
// Generic type support
interface Person {
id: number;
name: string;
address: {
city: string;
zip: string;
};
}
const result = CsvParser.parseFileSync<Person>('people.csv', {
autoParseNumbers: true
});
// result is typed as Person[]
console.log(result[0].address.city);
Exported Types
// Options
type ValidationMode = 'ignore' | 'warn' | 'error';
type EmptyArrayBehavior = 'empty-array' | 'omit';
type NullRepresentation = 'null' | 'undefined' | 'empty-string' | 'omit';
type ArrayMode = 'rows' | 'json';
type DuplicateHeaderStrategy = 'error' | 'rename' | 'combine' | 'first' | 'last';
// Function types
type ValueTransformer = (value: unknown, header: string) => unknown;
type HeaderTransformer = (header: string) => string;
type RowFilter = (record: CsvRecord, rowIndex: number) => boolean;
type ProgressCallback = (info: ProgressInfo) => void | Promise<void>;
// Progress tracking
interface ProgressInfo {
bytesProcessed: number; // Total bytes read
recordsEmitted: number; // Records emitted so far
headersProcessed: boolean; // Whether headers have been parsed
elapsedMs: number; // Milliseconds since start
}
// Data types
type CsvRecord = Record<string, string>;
type NestedObject = { [key: string]: NestedValue };
type NestedValue = string | number | boolean | Date | null | NestedObject | NestedValue[];
// Options interfaces
interface CsvParserOptions { /* ... */ }
interface CsvStreamParserOptions extends CsvParserOptions { /* ... */ }
🎯 Best Practices
Choose the Right Method:
- Use parseFileSync() for small files in scripts
- Use parseFile() for web servers and async workflows
- Use parseString() for API responses and testing
- Use parseStream() for large files
- Use CsvStreamParser for very large files or when you need to process records one at a time
Use Appropriate Validation Mode:
- Use 'ignore' when you trust the data source
- Use 'warn' (default) during development
- Use 'error' for strict validation in production
Enable Auto-Parsing When Appropriate:
const result = CsvParser.parseFileSync('./data.csv', {
  autoParseNumbers: true,
  autoParseBooleans: true,
  autoParseDates: true
});
Handle Errors Gracefully:
import { CsvParseError, CsvValidationError } from '@cerios/csv-nested-json';
try {
  const result = CsvParser.parseFileSync('./data.csv', { validationMode: 'error' });
} catch (error) {
  if (error instanceof CsvValidationError) {
    console.error(`Row ${error.row}: expected ${error.expectedColumns} columns`);
  } else {
    throw error;
  }
}
Use Streaming for Large Files:
// ✅ Good for very large files
const parser = new CsvStreamParser({ autoParseNumbers: true });
for await (const record of createReadStream('./huge.csv').pipe(parser)) {
  await processRecord(record);
}
// ❌ May cause memory issues with large files
const result = CsvParser.parseFileSync('./huge.csv');
Use Row Filtering for Memory Efficiency:
// ✅ Filter during parsing - uses less memory
const result = CsvParser.parseFileSync('./data.csv', {
  rowFilter: (record) => record.status === 'active'
});
// ❌ Filter after parsing - loads everything into memory first
const all = CsvParser.parseFileSync('./data.csv');
const filtered = all.filter(r => r.status === 'active');
Specify Encoding for Non-UTF8 Files:
const result = await CsvParser.parseFile('./data.csv', { encoding: 'latin1' });
Use Consistent Column Headers:
- Ensure the first column is always the identifier for grouping
- Use consistent dot notation for nested structures
- Keep header names descriptive and use headerTransformer for normalization
🤝 Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
📄 License
MIT © Ronald Veth - Cerios
