@wuyuchentr/find-duplicate-files
v1.0.0
Find duplicate files in a directory using MD5 hash. Supports auto-delete and dry-run.
Downloads
96
Find duplicate files in a directory using MD5 hash. Supports auto-delete and dry-run. Zero dependencies.
Install
npm install -g @wuyuchentr/find-duplicate-files
CLI
# Scan current directory
find-duplicate-files
# Scan a specific directory
find-duplicate-files ./downloads
# Dry-run (show what would be deleted)
find-duplicate-files --dry-run ./downloads
# Delete duplicates (keeps the first file in each group)
find-duplicate-files --delete ./downloads
# Skip small files and show progress
find-duplicate-files --min-size 1024 --progress ./downloads
# Machine-readable output
find-duplicate-files --json ./downloads
API
const { findDuplicates, deleteDuplicates } = require('@wuyuchentr/find-duplicate-files');

// CommonJS has no top-level await, so the calls run inside an async function
async function main() {
  const dups = await findDuplicates('./downloads', {
    minSize: 1024,              // skip files < 1 KB
    maxSize: 100 * 1024 * 1024, // skip files > 100 MB
    onProgress(done, total) {
      console.log(`Hashed ${done}/${total}`);
    },
  });
  // dups → [{ hash, files: ['a.jpg', 'b.jpg'], size: 12345 }]

  // Dry run: show what would be deleted without removing anything
  const preview = deleteDuplicates(dups, { dryRun: true });

  // Delete all but the first file in each group
  const result = deleteDuplicates(dups);
  // → { count: 3, bytes: 54321 }
}

main().catch(console.error);
How it works
- Walk all files recursively (skips node_modules, .git, hidden files)
- Group by size: quick first pass, only same-size files can be duplicates
- MD5 hash: streaming hash for memory efficiency (handles huge files)
- Group by hash: files with identical hashes are duplicates
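
The four steps above can be sketched in plain Node.js. This is an illustration only, not the package's actual source: the names walk, md5OfFile, groupBy and findDuplicatesSketch are hypothetical, and the sketch uses synchronous chunked reads in place of a stream to stay short, which still keeps memory use flat.

```javascript
const crypto = require('node:crypto');
const fs = require('node:fs');
const path = require('node:path');

// Assumption: these are the only directory names skipped by name.
const SKIPPED_DIRS = new Set(['node_modules', '.git']);

// Step 1: recursively collect regular files, skipping dotfiles and SKIPPED_DIRS.
function walk(dir, out = []) {
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    if (entry.name.startsWith('.') || SKIPPED_DIRS.has(entry.name)) continue;
    const full = path.join(dir, entry.name);
    if (entry.isDirectory()) walk(full, out);
    else if (entry.isFile()) out.push(full);
  }
  return out;
}

// Step 3: hash a file in fixed-size chunks so huge files never sit in memory.
function md5OfFile(file, chunkSize = 64 * 1024) {
  const hash = crypto.createHash('md5');
  const buffer = Buffer.alloc(chunkSize);
  const fd = fs.openSync(file, 'r');
  try {
    let bytesRead;
    while ((bytesRead = fs.readSync(fd, buffer, 0, chunkSize, null)) > 0) {
      hash.update(buffer.subarray(0, bytesRead));
    }
  } finally {
    fs.closeSync(fd);
  }
  return hash.digest('hex');
}

// Group items into a Map keyed by keyFn(item).
function groupBy(items, keyFn) {
  const groups = new Map();
  for (const item of items) {
    const key = keyFn(item);
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(item);
  }
  return groups;
}

// Steps 2 and 4: bucket by size first, then hash only the same-size candidates.
function findDuplicatesSketch(root) {
  const results = [];
  const bySize = groupBy(walk(root), (file) => fs.statSync(file).size);
  for (const [size, sameSize] of bySize) {
    if (sameSize.length < 2) continue; // a unique size cannot be a duplicate
    for (const [hash, files] of groupBy(sameSize, (file) => md5OfFile(file))) {
      if (files.length > 1) results.push({ hash, files, size });
    }
  }
  return results;
}
```

Calling findDuplicatesSketch('./downloads') would return groups in the same { hash, files, size } shape shown in the API example. Grouping by size first means a file with a unique size is never read at all, which is where most of the time is saved on large directories.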
