excel-sheet-cleaner-combiner
v1.0.0
Published
Clean and combine multiple sheets from Excel (.xlsx) files with data cleaning capabilities
Downloads
90
Maintainers
Readme
Excel Sheet Cleaner & Combiner 📊
A powerful Node.js application to clean and combine multiple sheets from Excel (.xlsx) files.
✨ Features
- Clean and combine multiple Excel sheets into one
- Remove empty rows and columns automatically
- Trim whitespace from all cells
- Remove duplicate rows (optional)
- Track source sheet for each row
- Add processing timestamps
- Detailed processing reports
🚀 Quick Start
📦 Installation
Install globally to use as a CLI tool:
npm install -g excel-sheet-cleaner-combinerOr install locally in your project:
npm install excel-sheet-cleaner-combiner- Install dependencies:
npm install- Create sample data:
npm test- View file information:
excel-cleaner info sample-data/sample_sales_data.xlsx- Process Excel file:
excel-cleaner process -i sample-data/sample_sales_data.xlsx📝 Usage Examples
Basic processing:
excel-cleaner process -i input.xlsx -o output.xlsxProcess specific sheets:
excel-cleaner process -i input.xlsx -s Sheet1 Sheet2Remove duplicates:
excel-cleaner process -i input.xlsx --remove-duplicatesDisable source tracking:
excel-cleaner process -i input.xlsx --no-source💻 Programmatic Usage
Use it programmatically in your Node.js application:
Quick Reference:
| What You Want | Options to Use |
|---------------|----------------|
| Get data only (no file) | returnData: true, no outputPath |
| Save file only (memory efficient) | outputPath: 'file.xlsx', returnData: false |
| Both save and get data | outputPath: 'file.xlsx', returnData: true |
| Skip first row (header) | processing: { skipFirstRow: true } |
| Fill empty cells | cleaning: { fillEmptyWith: 'N/A' } |
| Process specific sheets | sheets: ['Sheet1', 'Sheet2'] |
| Track source sheet | output: { includeSourceSheet: true } |
| Remove duplicates | cleaning: { removeDuplicates: true } |
const { processExcelFile, getFileInfo, cleanData, excelToJSON, jsonToExcel } = require('excel-sheet-cleaner-combiner');
// Example 1: Get data only (no file output)
const result = await processExcelFile('input.xlsx', {
returnData: true // Returns data without saving to file
});
console.log('Data:', result.data);
// Example 2: Save to file only (memory efficient)
await processExcelFile('input.xlsx', {
outputPath: 'output.xlsx',
returnData: false // Don't return data (saves memory)
});
// Example 3: Both save file AND get data
const result2 = await processExcelFile('input.xlsx', {
outputPath: 'output.xlsx',
returnData: true, // Get data in response too
sheets: ['Sheet1', 'Sheet2'], // Process specific sheets
cleaning: {
removeEmptyRows: true,
removeEmptyColumns: true,
trimWhitespace: true,
removeDuplicates: true,
fillEmptyWith: 'N/A' // Fill empty cells
},
output: {
sheetName: 'Combined Data',
includeSourceSheet: true // Track source sheet
},
processing: {
addTimestamp: true,
skipFirstRow: false
}
});
console.log('Success:', result2.success);
console.log('Data:', result2.data); // Array of objects
console.log('Output file:', result2.outputPath);
console.log('Stats:', result2.stats);📖 API Reference
processExcelFile(inputPath, options)
Main function to process Excel files.
Parameters:
inputPath(string) - Path to input Excel fileoptions(object) - Configuration options
Options:
| Option | Type | Default | Description |
|--------|------|---------|-------------|
| outputPath | string | - | Output file path (optional, if not provided no file is saved) |
| returnData | boolean | true | Return data in response (set to false to save memory) |
| sheets | string[] | all sheets | Specific sheets to process |
| cleaning | object | See below | Cleaning configuration |
| output | object | See below | Output configuration |
| processing | object | See below | Processing configuration |
Cleaning Options:
cleaning: {
removeEmptyRows: true, // Remove completely empty rows
removeEmptyColumns: true, // Remove completely empty columns
removeDuplicates: false, // Remove duplicate rows
trimWhitespace: true, // Trim whitespace from strings
removeNullValues: false, // Remove null/undefined values
fillEmptyWith: null // Fill empty cells with value (e.g., 'N/A', 0)
}Output Options:
output: {
sheetName: 'Combined Data', // Name of output sheet
includeSourceSheet: true // Add column tracking source sheet
}Processing Options:
processing: {
addTimestamp: true, // Add processing timestamp to each row
skipFirstRow: false, // Skip first row (header)
preserveHeaders: true // Preserve headers from first sheet
}Returns:
{
success: true, // Whether operation succeeded
data: [...], // Array of objects (if returnData: true)
outputPath: 'out.xlsx', // Output file path (if provided)
rowsProcessed: 100, // Total rows processed
stats: [...], // Per-sheet statistics
error: 'msg' // Error message (if success: false)
}getFileInfo(filePath)
Get information about an Excel file.
Parameters:
filePath(string) - Path to Excel file
Returns:
{
filePath: 'input.xlsx',
totalSheets: 3,
sheets: [
{
name: 'Sheet1',
rows: 100,
columns: 5,
columnNames: ['A', 'B', 'C', 'D', 'E']
}
]
}cleanData(data, config)
Clean an array of data objects.
Parameters:
data(array) - Array of objects to cleanconfig(object) - Cleaning configuration (same ascleaningoption above)
Example:
const data = [
{ name: ' John ', age: 30 },
{ name: '', age: '' },
{ name: 'John', age: 30 }
];
const cleaned = cleanData(data, {
removeEmptyRows: true,
trimWhitespace: true,
removeDuplicates: true
});
// Returns: [{ name: 'John', age: 30 }]excelToJSON(filePath, options)
Convert Excel sheet to JSON array.
Parameters:
filePath(string) - Path to Excel fileoptions(object)sheet(string) - Sheet name (default: first sheet)
Example:
const data = excelToJSON('input.xlsx', { sheet: 'Sheet1' });
console.log(data); // Array of objectsjsonToExcel(data, outputPath, options)
Convert JSON array to Excel file.
Parameters:
data(array) - Array of objectsoutputPath(string) - Output file pathoptions(object)sheetName(string) - Sheet name (default: 'Sheet1')
Example:
const data = [
{ Product: 'Laptop', Price: 1200 },
{ Product: 'Mouse', Price: 25 }
];
const result = jsonToExcel(data, 'output.xlsx', {
sheetName: 'Products'
});
console.log('Saved:', result.outputPath);combineSheets(filePath, options)
Combine specific sheets from an Excel file.
Parameters:
filePath(string) - Path to Excel fileoptions(object) - Same asprocessExcelFileoptions
Example:
const result = combineSheets('input.xlsx', {
sheets: ['Q1', 'Q2', 'Q3'],
returnData: true,
cleaning: { removeEmptyRows: true }
});🎯 Common Usage Scenarios
Scenario 1: Get Data for Further Processing (No File Output)
const { processExcelFile } = require('excel-sheet-cleaner-combiner');
const result = await processExcelFile('sales.xlsx', {
returnData: true, // Get data in response
// No outputPath = no file created
cleaning: {
removeEmptyRows: true,
trimWhitespace: true
}
});
// Use the data
const totalSales = result.data.reduce((sum, row) => sum + row.Amount, 0);
console.log('Total:', totalSales);Scenario 2: Clean and Save (Memory Efficient)
// For large files, don't return data to save memory
await processExcelFile('large-file.xlsx', {
outputPath: 'cleaned.xlsx',
returnData: false, // Don't return data
cleaning: {
removeEmptyRows: true,
removeDuplicates: true
}
});
// File saved, but data not loaded into memoryScenario 3: Both Save and Use Data
const result = await processExcelFile('data.xlsx', {
outputPath: 'cleaned.xlsx', // Save to file
returnData: true, // Also get data
sheets: ['January', 'February', 'March'],
cleaning: {
removeEmptyRows: true,
trimWhitespace: true,
removeDuplicates: true,
fillEmptyWith: 0
},
output: {
sheetName: 'Q1 Data',
includeSourceSheet: true
}
});
// File is saved AND you have the data
console.log('Saved to:', result.outputPath);
console.log('Processing:', result.data);Scenario 4: Custom Cleaning Logic
const { excelToJSON, cleanData, jsonToExcel } = require('excel-sheet-cleaner-combiner');
// Read Excel
const data = excelToJSON('input.xlsx');
// Clean with package
const cleaned = cleanData(data, {
removeEmptyRows: true,
trimWhitespace: true
});
// Apply your custom logic
const processed = cleaned.map(row => ({
...row,
Total: row.Quantity * row.Price,
Category: row.Product.includes('Laptop') ? 'Electronics' : 'Accessories'
}));
// Save result
jsonToExcel(processed, 'output.xlsx', { sheetName: 'Processed' });🛠️ Available Commands
info <file>- Display Excel file informationprocess -i <file>- Clean and combine sheets--help- Show all options
⚙️ Configuration
Edit config.js to customize:
- Which cleaning operations to perform
- Output file settings
- Processing options
📦 Dependencies
- xlsx - Excel file processing
- commander - CLI framework
- chalk - Terminal styling
Made with ❤️ for efficient Excel data processing
