npm package discovery and stats viewer.

Discover Tips

  • General search

    [free text search, go nuts!]

  • Package details

    pkg:[package-name]

  • User packages

    @[username]

Sponsor

Optimize Toolset

I’ve always been into building performant and accessible sites, but lately I’ve been taking it extremely seriously. So much so that I’ve been building a tool to help me optimize and monitor the sites that I build to make sure that I’m making an attempt to offer the best experience to those who visit them. If you’re into performant, accessible and SEO friendly sites, you might like it too! You can check it out at Optimize Toolset.

About

Hi, 👋, I’m Ryan Hefner  and I built this site for me, and you! The goal of this site was to provide an easy way for me to check the stats on my npm packages, both for prioritizing issues and updates, and to give me a little kick in the pants to keep up on stuff.

As I was building it, I realized that I was actually using the tool to build the tool, and figured I might as well put this out there and hopefully others will find it to be a fast and useful way to search and browse npm packages as I have.

If you’re interested in other things I’m working on, follow me on Twitter or check out the open source projects I’ve been publishing on GitHub.

I am also working on a Twitter bot for this site to tweet the most popular, newest, random packages from npm. Please follow that account now and it will start sending out packages soon–ish.

Open Software & Tools

This site wouldn’t be possible without the immense generosity and tireless efforts from the people who make contributions to the world and share their work via open source initiatives. Thank you 🙏

© 2025 – Pkg Stats / Ryan Hefner

vaultmail

v2.0.0

Published

Email archive extraction library and CLI tool. Extracts emails, attachments, and metadata from PST, OST, MBOX, and OLM files with support for large archives and multiple output formats.

Readme

VaultMail

A Node.js library and CLI tool for extracting emails and attachments from multiple email archive formats including PST, OST, MBOX, and OLM files. Supports large files (40GB+) and exports emails in standardized EML format.

Features

  • Multi-format Support: PST, OST, MBOX, and OLM email archives
  • Library & CLI: Use as a Node.js library or command-line tool
  • Large File Handling: Processes files over 40GB with hybrid approach
  • EML Export: Standardized email format output with original formatting preserved
  • Attachment Extraction: Saves all attachments with organized naming (PST/OST)
  • Comprehensive Data Types: Emails, contacts, appointments, tasks, and notes (OLM)
  • Recursive Processing: Maintains original folder structure from archives
  • Robust Error Handling: Continues processing even with corrupted emails
  • NPM Package: Easy installation and integration into your projects
  • Extensive Testing: Comprehensive test suite for reliability

Installation

NPM Installation (Recommended)

# Global installation for CLI usage
npm install -g vaultmail

# Local installation for library usage
npm install vaultmail

Development Installation

git clone https://github.com/Mikej81/vaultmail.git
cd vaultmail
npm install

Optional: Install External Tools for Large Files (>2GB)

The tool works without external tools, but for PST/OST files larger than 2GB, installing pst-utils provides enhanced processing and better performance:

Linux (Ubuntu/Debian):

sudo apt-get update
sudo apt-get install pst-utils

macOS:

brew install libpst

Windows: Download libpst from https://www.five-ten-sg.com/libpst/

Usage

CLI Usage

# If installed globally
vaultmail -i <input-file> -o <output-directory> -a <attachments-directory>

# If installed locally
npx vaultmail -i <input-file> -o <output-directory> -a <attachments-directory>

Library Usage

const { EmailExtractor } = require('vaultmail');

const extractor = new EmailExtractor({
  verbose: true,
  format: 'eml'
});

// Extract PST/OST files
await extractor.extract('./archive.pst', './output', './attachments');

// Extract MBOX files
await extractor.extract('./mailbox.mbox', './output');

// Extract OLM files (Outlook for Mac)
await extractor.extract('./archive.olm', './output');

Examples

Extract PST file:

vaultmail -i archive.pst -o ./emails -a ./attachments

Extract large OST file with verbose output:

vaultmail -i large_archive.ost -o ./emails -a ./attachments --verbose

Extract MBOX file in text format:

vaultmail -i mailbox.mbox -f txt --verbose

Extract OLM file (Outlook for Mac):

vaultmail -i archive.olm -o ./emails

Process with depth limit:

vaultmail -i archive.pst -o ./emails -a ./attachments --max-depth 3

Include empty folders (disabled by default):

vaultmail -i archive.ost -o ./emails -a ./attachments --skip-empty false

Quiet mode for minimal output:

vaultmail -i archive.ost -o ./emails -a ./attachments --quiet

Command Line Options

| Option | Alias | Description | Default | |--------|-------|-------------|---------| | --input | -i | Input email archive file (PST, OST, MBOX, OLM) | Required | | --output | -o | Output directory for extracted emails | ./output | | --attachments | -a | Directory for extracted attachments | ./attachments | | --recursive | -r | Process folders recursively | true | | --format | -f | Output format: eml or txt | eml | | --verbose | -v | Enable verbose logging | false | | --max-depth | -d | Maximum folder depth (-1 for unlimited) | -1 | | --skip-empty | -s | Skip creating folders that contain no emails | true | | --quiet | -q | Suppress progress updates (less output) | false | | --help | -h | Show help information | |

File Format Support

PST Files

  • Microsoft Outlook Personal Storage Table files
  • Maintains folder hierarchy and metadata
  • Extracts embedded attachments
  • Supports files up to 40GB+ with external tools

OST Files

  • Microsoft Outlook Offline Storage Table files
  • Same capabilities as PST files
  • Handles Exchange server synchronized data

MBOX Files

  • Unix mailbox format
  • Processes files of any size efficiently with line-by-line streaming
  • Extracts individual emails while preserving original MIME structure
  • Attachments and images remain embedded in EML files for perfect mail client compatibility
  • Supports Gmail exports and other MBOX sources
  • Handles problematic headers gracefully

OLM Files

  • Microsoft Outlook for Mac archive format
  • Comprehensive extraction of emails, contacts, appointments, tasks, and notes
  • Automatic folder organization of extracted data
  • Support for multi-disk OLM archives
  • Conversion to standard formats (EML, VCF, ICS, TXT)

Output Structure

The tool creates organized output with preserved folder structures and specialized formats for different content types:

PST/OST Output Structure

emails/
├── archive_name/
│   ├── Inbox/
│   │   ├── 1-email-subject.eml
│   │   ├── 2-another-email.eml
│   │   ├── contacts/
│   │   │   ├── contact1.vcf
│   │   │   └── contact2.vcf
│   │   └── calendar/
│   │       ├── meeting1.ics
│   │       └── appointment1.ics
│   ├── Sent Items/
│   └── Deleted Items/
└── ...

attachments/
├── 1-document.pdf
├── 2-image.jpg
└── ...

MBOX Output Structure

emails/
├── 0001-email.eml
├── 0002-email.eml  
├── 0003-email.eml
└── ...

Note: MBOX emails preserve all attachments and images embedded within each EML file

OLM Output Structure

emails/
├── emails/
│   ├── email-1.eml
│   ├── email-2.eml
│   └── ...
├── contacts/
│   ├── contact-1.vcf
│   ├── contact-2.vcf
│   └── ...
├── appointments/
│   ├── meeting-1.ics
│   ├── appointment-1.ics
│   └── ...
├── tasks/
│   ├── task-1.txt
│   └── ...
└── notes/
    ├── note-1.txt
    └── ...

File Formats:

  • Emails: .eml format (RFC822 standard)
  • Contacts: .vcf format (VCard 3.0 standard) - PST/OST/OLM
  • Calendar/Tasks: .ics format (iCalendar standard) - PST/OST/OLM
  • Tasks: .txt format - OLM
  • Notes: .txt format - OLM
  • Attachments:
    • PST/OST: Extracted as separate files in attachments directory
    • MBOX: Preserved embedded within each EML file
    • OLM: Integrated within email content

Large File Handling

The tool automatically detects file sizes and uses appropriate processing methods:

  • Files ≤ 2GB: Uses built-in JavaScript PST extractor for fast processing
  • Files > 2GB: Automatically switches to external readpst tool if available
  • Fallback: Uses built-in extractor if external tools are missing (works but may be slower for very large files)

Enhanced readpst Integration: When using the external readpst tool for large files, the tool is configured with optimized settings:

  • EML format: -e flag ensures proper email format with extensions
  • VCard contacts: -cv flag exports contacts in VCard format
  • All content types: -t eajc extracts emails, attachments, journals, and contacts
  • UTF-8 encoding: -8 flag ensures proper character encoding
  • Includes deleted items: -D flag for comprehensive extraction

Progress Monitoring

The tool provides real-time feedback during extraction:

  • Live progress updates: Shows folder count, emails extracted, and elapsed time every 5 seconds
  • Folder-by-folder status: Displays current folder being processed
  • Heartbeat monitoring: Shows "Still working..." message if no updates for 15+ seconds
  • Final summary: Complete statistics when extraction finishes
  • Quiet mode: Use --quiet to suppress progress updates for minimal output

Example progress output:

Processing: Inbox
Progress: 25 folders, 150 emails, 45 attachments | 2m 30s elapsed
Processing: Sent Items
Still working... 5m 15s elapsed

Error Handling

The tool is designed to be robust:

  • Continues processing if individual emails are corrupted
  • Saves problematic emails in raw format when parsing fails
  • Handles Node.js memory limits for very large messages
  • Provides detailed error messages and recovery suggestions

Troubleshooting

Large File Errors

If you get errors with files over 2GB:

  1. Install pst-utils if not available (see installation instructions above)
  2. For very large files, ensure sufficient disk space for temporary files

Memory Issues

For extremely large archives:

  • Use the --max-depth option to limit processing depth
  • Process in smaller chunks if needed
  • Monitor system memory usage during processing

MBOX Processing

  • Individual email extraction: Each email is saved as a separate numbered EML file (0001-email.eml, 0002-email.eml, etc.)
  • MIME structure preservation: Original email structure is maintained for perfect mail client compatibility
  • Embedded attachments: All attachments and images remain within each EML file (no separate extraction)
  • Large file support: Uses line-by-line streaming to handle MBOX files of any size
  • Memory efficient: No file size limits due to streaming approach
  • Malformed header handling: Automatically handles problematic headers gracefully
  • Check verbose output for detailed processing information

OLM Processing

  • Comprehensive extraction: Extracts emails, contacts, appointments, tasks, and notes
  • Automatic organization: Content is automatically organized into appropriate subdirectories
  • Standard formats: Outputs in widely supported formats (EML, VCF, ICS, TXT)
  • Multi-disk support: Handles OLM archives that span multiple disks
  • Error handling: Continues processing even if individual items are corrupted

API Reference

EmailExtractor Class

const { EmailExtractor } = require('vaultmail');

const extractor = new EmailExtractor(options);

Options

  • verbose (boolean): Enable verbose logging (default: false)
  • format (string): Output format - 'eml' or 'txt' (default: 'eml')
  • maxDepth (number): Maximum folder depth to process (default: -1, unlimited)
  • skipEmpty (boolean): Skip empty folders (default: true)

Methods

extract(filePath, outputDir, attachmentDir, options)

Extract emails from any supported format.

  • filePath (string): Path to the email archive file
  • outputDir (string): Directory for extracted emails
  • attachmentDir (string): Directory for attachments (required for PST/OST)
  • options (object): Override default options for this extraction

Returns: Promise resolving to extraction statistics

detectFileType(filePath)

Detect the file type of an email archive.

  • filePath (string): Path to the file

Returns: String ('pst', 'ost', 'mbox', or 'olm')

Individual Extractors

const { PSTExtractor, MboxExtractor, OLMExtractor } = require('vaultmail');

// Use specific extractors directly
const pstExtractor = new PSTExtractor(options);
const mboxExtractor = new MboxExtractor(options);
const olmExtractor = new OLMExtractor(options);

Testing

# Run tests
npm test

# Run tests with coverage
npm run test:coverage

# Run tests in watch mode
npm run test:watch

Contributing

I welcome contributions to VaultMail! Whether you're fixing bugs, adding features, or improving documentation, your help makes the project better for everyone.

Development Setup

  1. Fork and Clone

    git clone https://github.com/your-username/vaultmail.git
    cd vaultmail
    npm install
  2. Run Tests

    npm test              # Run all tests
    npm run test:watch    # Run tests in watch mode
    npm run test:coverage # Generate coverage report
  3. Code Quality

    npm run lint          # Check code style
    npm run lint:fix      # Auto-fix style issues
    npm run validate      # Run both linting and tests

Contribution Guidelines

  • Code Style: Follow the existing ESLint configuration
  • Testing: Add tests for new features and bug fixes
  • Documentation: Update README and JSDoc comments for public APIs
  • Commits: Use clear, descriptive commit messages
  • Pull Requests: Include a description of changes and link any related issues

Areas for Contribution

  • Performance: Optimize extraction for very large files
  • Formats: Add support for additional email archive formats
  • Features: Enhance filtering, search, and export capabilities
  • Documentation: Improve examples and troubleshooting guides
  • Testing: Increase test coverage and add integration tests

Reporting Issues

When reporting bugs, please include:

  • Operating system and Node.js version
  • VaultMail version
  • Input file format and approximate size
  • Complete error message and stack trace
  • Steps to reproduce the issue

License

Apache-2.0 License - see LICENSE file for details

Support

For issues and feature requests, please use the GitHub Issues page.