colmerge
v0.1.16
Import mail from your device into a Colioe mailbox — paired CLI for the colioe-website wizard.
colioemigrate
End-user CLI that imports mail from your device into a Colioe mailbox.
$ npx colioemigrate --email [email protected] --migration-code CLO-XXXX-XXXX-XXXX
Paired with a Colioe wizard session, it detects your local mail client (Thunderbird, Apple Mail, Outlook, …), normalizes messages to RFC 822, preserves folder structure and flags, deduplicates, and APPENDs into the destination mailbox over IMAP. The import is resumable across power or network loss.
See mailMigrationD.md (in the parent repo) for the full design.
Status
Phase 0 (control plane + stub APPEND, end-to-end provable). Phase 1
(Thunderbird mbox + Apple Mail emlx as real source readers) is in this drop
but the backend session-management routes are still being built; until they
land, run with --api mock to drive the CLI against an in-memory mock.
What works
| Component | Phase 0 | Phase 1 | Phase 2+ |
| ------------------------------- | :-----: | :-----: | :------: |
| TUI flow (8 steps, §8.2) | ✅ | ✅ | — |
| Mock backend (--api mock) | ✅ | ✅ | — |
| Thunderbird (mbox) detector | — | ✅ | — |
| Apple Mail (emlx) detector | — | ✅ | — |
| Outlook for Mac (legacy) | — | — | ✅† |
| Outlook PST (in-place read) | — | — | ✅‡ |
| Outlook OST (in-place read) | — | — | ✅§ |
| New Outlook (account-pull) | — | — | ✅¶ |
† Outlook for Mac legacy detector (v0.1.10, Phases 2.5 + 2.6 + 2.7): exports
every message whose .olk15Message exists on disk (≈100% of Inbox on a real
profile) with full header AND body fidelity, plus attachments where the link
is recoverable. Bodies come from UTF-16LE-stored HTML/text (modern Outlook
default) or LZFu-decompressed RTF (older Office). Attachments are read from
Message Attachments/<bucket>/<uuid>.olk15MsgAttachment and grafted onto the
message as a multipart/mixed envelope; the message↔attachment link uses
Outlook.sqlite's Mail_OwnedBlocks table (BlockTag = 0x41747463 for
attachment chunks). Mail_OwnedBlocks covers most Sent/Drafts/Archive
attachments but a subset of attachments — typically inline IMAP-synced
ones in the Inbox — are referenced only inside the bag's binary MAPI prop
data, which this detector doesn't parse; those attachments are not
auto-linked yet. Closing the remaining gap requires either a full MAPI
prop-bag parser or an IMAP re-fetch.
‡ Outlook PST detector (v0.1.7, Phase 2): in-place read via pst-extractor
(pure JS, no native binaries). Auto-discovers .pst files in
~/Documents/Outlook Files/ and platform-specific Outlook profile dirs;
--pst-file=<path> adds extras. Folder mapping mirrors the existing
backend PST-upload path (Inbox/Sent/Trash/Junk/Drafts/Archive canonicalised,
calendar/contacts/tasks/RSS skipped). RFC 822 synthesis preserves From,
To/Cc, Subject, Date, In-Reply-To, Message-ID, attachments, read state →
\Seen, and original delivery time → APPEND INTERNALDATE.
§ Outlook OST detector (v0.1.9, Phase 2 OST): same in-place machinery as
the PST detector — OST and PST share the on-disk format. Auto-discovers
.ost files in %LOCALAPPDATA%\Microsoft\Outlook\ on Windows and the
equivalent macOS path; --ost-file=<path> adds extras. Outlook holds
an exclusive lock on the active OST; the detector probes the file,
attempts a best-effort copy fallback (succeeds when Outlook uses a shared
lock), and otherwise surfaces a precise remediation message ("Close
Outlook then re-run" or "Export to PST in Outlook, then rerun with
--pst-file"). Compression-encrypted OSTs (type 0x02) are detected and
routed to the same PST-export workaround.
¶ New Outlook account-pull (v0.1.12, Phases 3 + 3.1): per
mailMigrationD.md §10, New Outlook's Hx-format local cache is
undocumented and shifting between Office builds, so this detector skips
the cache entirely and pulls from the source mail server New Outlook
syncs with — Exchange Online / Outlook.com via IMAP, Gmail, Yahoo,
iCloud, or arbitrary IMAP host. Opt-in via
--imap-pull-host=<preset|hostname> + --imap-pull-user=<email>;
preset keys (m365, outlook, gmail, yahoo, icloud) expand to
the canonical endpoint. For Microsoft 365 + Gmail add
--imap-pull-oauth to skip the App Password requirement (basic IMAP
auth was retired mid-2024 for both): a browser opens to the provider's
sign-in page, the CLI catches the redirect at a random 127.0.0.1 port
via PKCE, exchanges the code for an access + refresh token, and uses
XOAUTH2 against the IMAP server. Tokens are cached under
~/.colioemigrate/oauth/ (mode 0600, filename hashed) and silently
refreshed on subsequent runs. For non-OAuth providers (iCloud / Yahoo /
self-hosted), the password / app-password is read from
COLIOE_IMAP_PULL_PASS or prompted interactively. TLS is mandatory
(TLSv1.2+, hostname validation, reject-untrusted) outside dev mode.
RFC 6154 SPECIAL-USE attributes drive the folder mapping, with name
heuristics for legacy servers. Microsoft 365 ships with a working
default Azure client ID (Thunderbird's public registration); Gmail
requires a project-specific client ID via
COLIOE_OAUTH_GOOGLE_CLIENT_ID.
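The OAuth path in footnote ¶ relies on two standard building blocks: a PKCE verifier/challenge pair (RFC 7636, S256 method) and the XOAUTH2 SASL initial response sent to the IMAP server. The sketch below illustrates both using only Node's built-in crypto module. The function names are hypothetical and are not taken from the colioemigrate source; the actual implementation may differ.

```typescript
import { createHash, randomBytes } from "node:crypto";

// S256 challenge per RFC 7636: base64url(sha256(verifier)), no padding.
export function challengeFor(verifier: string): string {
  return createHash("sha256").update(verifier).digest("base64url");
}

// Hypothetical helper: a fresh verifier (32 random bytes -> 43 base64url chars)
// plus the matching challenge to send in the authorization request.
export function makePkcePair(): { verifier: string; challenge: string } {
  const verifier = randomBytes(32).toString("base64url");
  return { verifier, challenge: challengeFor(verifier) };
}

// XOAUTH2 initial client response:
//   "user=" <user> ^A "auth=Bearer " <token> ^A ^A   (then base64-encoded)
export function xoauth2Response(user: string, accessToken: string): string {
  const raw = `user=${user}\x01auth=Bearer ${accessToken}\x01\x01`;
  return Buffer.from(raw, "utf8").toString("base64");
}
```

The verifier stays on the device and is sent only when exchanging the authorization code for tokens; the challenge is the only PKCE value that appears in the browser redirect.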
| Component | Phase 0 | Phase 1 | Phase 2+ |
| ---------------------------------- | :-----: | :-----: | :------: |
| Streaming parse + per-folder dedup | ✅ | ✅ | — |
| Folder mapping (SPECIAL-USE) | ✅ | ✅ | — |
| Append-only JSONL manifest | ✅ | ✅ | — |
| Resume across runs | ✅ | ✅ | — |
| --dry-run stub APPEND | ✅ | ✅ | — |
| IMAP APPEND (real) | ✅ | ✅ | — |
| OVERQUOTA hard-stop | ✅ | ✅ | — |
| Backend session routes | 🚧 | 🚧 | 🚧 |
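Per-folder dedup and resume can both be driven from the append-only JSONL manifest: on restart, the records already marked "appended" tell the importer which messages to skip. The sketch below is one way this could work, assuming the record fields shown under "What the manifest contains" (dedupKey, status, destFolder). The key derivation (sha256 of the Message-ID, truncated to 16 hex chars) and all function names are assumptions for illustration, not the tool's actual scheme.

```typescript
import { createHash } from "node:crypto";

interface ManifestRecord {
  dedupKey: string;
  status: string;
  destFolder: string;
}

// Assumed derivation: "mid:" + truncated sha256 of the Message-ID without angle brackets.
export function dedupKeyFor(messageId: string): string {
  const normalized = messageId.trim().replace(/^<|>$/g, "");
  return "mid:" + createHash("sha256").update(normalized).digest("hex").slice(0, 16);
}

// Rebuild the per-folder set of already-appended keys from manifest text.
export function appendedKeys(manifestText: string): Map<string, Set<string>> {
  const done = new Map<string, Set<string>>();
  for (const line of manifestText.split("\n")) {
    if (line.trim() === "") continue;
    let rec: ManifestRecord | null;
    try {
      rec = JSON.parse(line) as ManifestRecord | null;
    } catch {
      continue; // a torn final line (e.g. after power loss) is skipped, not fatal
    }
    if (!rec || rec.status !== "appended") continue;
    let keys = done.get(rec.destFolder);
    if (!keys) {
      keys = new Set<string>();
      done.set(rec.destFolder, keys);
    }
    keys.add(rec.dedupKey);
  }
  return done;
}

// True when the message has not yet been appended to this destination folder.
export function shouldAppend(
  done: Map<string, Set<string>>,
  destFolder: string,
  messageId: string,
): boolean {
  return !(done.get(destFolder)?.has(dedupKeyFor(messageId)) ?? false);
}
```

Because the manifest is append-only, a crash can at worst leave one incomplete trailing line, which is why the parser above tolerates a JSON error instead of aborting the resume.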
Architecture (single-page tour)
bin/colioemigrate.ts               # arg parsing, version check
└─ cli/run.ts                      # banner + step dispatch + abort handling
   ├─ cli/steps/01_pair.ts         # exchange code → scoped JWT
   ├─ cli/steps/02_device.ts       # report device + poll for admin approval
   ├─ cli/steps/03_source.ts       # pick client + account
   ├─ cli/steps/04_inventory.ts    # count + bytes + quota pre-flight
   ├─ cli/steps/05_confirm.ts      # user confirm, fetch scoped credential
   ├─ cli/steps/06_import.ts       # run the import with live progress
   └─ cli/steps/07_summary.ts      # final report + heuristic placements
api/ # typed fetch client, errors, mock backend, poller
detect/ # client detectors (Thunderbird, Apple Mail, fixture)
parse/ # streaming mbox / emlx → RawMessage
normalize/ # per-folder dedup + SPECIAL-USE folder mapping
import/ # IMAP APPEND (real), stub (dry-run), manifest, checkpoint
session/ # ~/.colioemigrate/ session + manifest state
device/ # fingerprint + OS label
cli/ui/ # colors, box, prompts, spinner, table, progress
Local development
# install
npm install --cache /tmp/npm-cache-colioemigrate
# typecheck
npm run typecheck
# tests (Node's built-in runner, no jest)
npm test
# live demo against fixture mbox + in-memory mock backend
TMP=$(mktemp -d)
npx tsx scripts/build-demo-fixture.ts "$TMP"
COLIOE_FIXTURE_MBOX_DIR="$TMP" \
COLIOE_AUTO_PICK_FIRST=1 \
COLIOE_AUTO_CONFIRM=1 \
COLIOE_STATE_DIR="$TMP/.state" \
npx tsx src/bin/colioemigrate.ts \
--email fixture@local \
--migration-code CLO-DEV-DEMO-RUN \
--api mock \
--dry-run
# build for publish
npm run build
Flags
| Flag | What it does |
| ----------------------------- | --------------------------------------------------------------------- |
| --email <addr> | destination mailbox address (required) |
| --migration-code <code> | pairing code from the wizard (required) |
| --api <url> | API base (default https://www.colioe.io; mock for offline testing) |
| --dry-run | simulate APPEND without contacting IMAP |
| --no-resume | don't auto-resume an in-flight session for this email |
| --folder <regex> | only import folders matching this regex |
| --verbose | print debug logs |
Test/dev env vars (NOT for end users):
| Var | Effect |
| -------------------------------- | ------------------------------------------------------------------- |
| COLIOE_FIXTURE_MBOX_DIR | enables the fixture detector pointing at this mbox dir |
| COLIOE_AUTO_PICK_FIRST=1 | auto-pick the first option in any source/account selector |
| COLIOE_AUTO_CONFIRM=1 | auto-yes the "Start import?" confirm |
| COLIOE_STATE_DIR | override ~/.colioemigrate/ location |
| COLIOE_NO_COLOR / NO_COLOR | disable ANSI color output |
| COLIOE_DEBUG | print stack traces on fatal exit |
Security notes
- The migration code is a pairing token, not an access token. It can be exchanged exactly once for a scoped session JWT, bound to a device fingerprint. Subsequent runs from the same device reuse the persisted JWT.
- The IMAP credential the CLI uses for APPEND is scoped, expiring, and revocable: the backend provisions it as a per-mailbox app password at confirm time and revokes it on complete/cancel/quota-exceeded.
- The CLI never persists the IMAP password outside ~/.colioemigrate/sessions/ (chmod 0600 on POSIX, written atomically via temp-file + rename). It is discarded when the run completes.
- The local manifest contains message metadata (not message bodies, which go straight from the source store to IMAP APPEND without being staged):
What the manifest contains
Each line of ~/.colioemigrate/sessions/<sessionId>.manifest.jsonl is one
JSON record per message:
{ "ts": "...",
  "folder": "INBOX/2024",
  "destFolder": "INBOX/2024",
  "dedupKey": "mid:<hash-of-message-id>",
  "status": "appended",
  "sourceLocator": "/Users/you/Library/Mail/V10/.../Messages/12345.emlx",
  "bytes": 4837,
  "messageId": "<[email protected]>" }
sourceLocator is a filesystem path on your device; it can leak your
profile name, mail-client install location, and OS account. messageId is
the literal email Message-ID header. Both are personally identifiable.
The manifest stays on the device (chmod 0600 on POSIX). Do NOT email the raw file to support without redacting it. Use the built-in helper:
npx colioemigrate redact-manifest <session-id> [output-path]
This hashes sourceLocator and messageId (sha256 truncated to 12 chars)
and truncates error to 80 chars, while leaving folder names, counts,
status, and timestamps intact. That is enough for support to debug without
seeing your filesystem or your email IDs.
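To make the redaction rule concrete, here is a minimal sketch of the transformation described above, applied to manifest text. It assumes the hash is hex-encoded before truncation (the README only says "sha256 truncated to 12 chars"), and the function names are hypothetical; the built-in redact-manifest helper remains the supported way to do this.

```typescript
import { createHash } from "node:crypto";

type ManifestLine = Record<string, unknown>;

// Assumption: hex encoding, then the first 12 characters.
const shortHash = (value: string): string =>
  createHash("sha256").update(value).digest("hex").slice(0, 12);

// Returns a redacted copy; the input record is not mutated.
export function redactRecord(rec: ManifestLine): ManifestLine {
  const out: ManifestLine = { ...rec };
  for (const field of ["sourceLocator", "messageId"]) {
    const value = out[field];
    if (typeof value === "string") out[field] = shortHash(value);
  }
  const err = out["error"];
  if (typeof err === "string") out["error"] = err.slice(0, 80);
  return out;
}

// Redacts every parseable JSONL line; unparseable lines are dropped rather
// than copied through raw, since they could still contain identifiers.
export function redactManifest(jsonl: string): string {
  const lines: string[] = [];
  for (const line of jsonl.split("\n")) {
    if (line.trim() === "") continue;
    try {
      lines.push(JSON.stringify(redactRecord(JSON.parse(line) as ManifestLine)));
    } catch {
      // skip torn or malformed line
    }
  }
  return lines.join("\n");
}
```

Folder names, byte counts, status, and timestamps pass through unchanged, so the redacted file still shows where and when an import stalled.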
Defensive parsing
- --folder <regex> is run with a 512-char input cap and a 100ms soft budget so a pathological pattern against a maliciously-named local folder can't ReDoS the inventory step.
- The mbox parser caps individual messages at 256 MiB to defeat an mbox that's one giant fake "message" with no separators.
- The emlx parser rejects implausible length prefixes (>64 MiB or MAX_SAFE_INTEGER).
- All source-supplied strings (folder names, account labels, error messages) are sanitized of ANSI / OSC escapes before terminal output or manifest write, so a folder named \e[2JHACKED cannot repaint your terminal.
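The escape-sanitization rule can be illustrated with a small filter. This is a sketch under assumptions, not the tool's actual sanitizer: it removes OSC sequences (ESC ] … terminated by BEL or ESC \), CSI sequences (ESC [ … final byte), and then any remaining C0/C1 control characters, so a stray ESC left behind by the first two passes cannot survive. The function name is hypothetical.

```typescript
// OSC: ESC ] payload, ended by BEL or ST (ESC \); an unterminated payload is also removed.
const OSC = /\x1b\][^\x07\x1b]*(?:\x07|\x1b\\)?/g;
// CSI: ESC [ parameter bytes (0x30-0x3F), intermediate bytes (0x20-0x2F), final byte (0x40-0x7E).
const CSI = /\x1b\[[0-?]*[ -\/]*[@-~]/g;
// Any remaining C0 controls, DEL, and C1 controls (this includes tabs and newlines).
const CONTROL = /[\x00-\x1f\x7f-\x9f]/g;

export function sanitizeForTerminal(input: string): string {
  return input.replace(OSC, "").replace(CSI, "").replace(CONTROL, "");
}
```

Stripping control characters last matters: removing one sequence can splice the surrounding bytes into a new one, and the final pass guarantees no ESC byte is ever emitted.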
Exit codes
| Code | Meaning |
| ---: | ---------------------------------------------------------------- |
| 0 | All messages imported (or user declined at confirm prompt) |
| 1 | API error or transport failure (see error code printed) |
| 2 | Quota pre-flight failed (import didn't start) |
| 3 | Import started but stopped early (overquota, canceled, aborted) |
| 64 | Bad CLI arguments / Node too old |
| 130 | User canceled via prompt or Ctrl-C |
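For scripts that wrap the CLI, the exit codes above can be turned into a lookup. The sketch below is a hypothetical wrapper-side helper (not part of colioemigrate) that maps a child-process status to the meaning listed in the table; a null status is what Node's spawnSync reports when the child was killed by a signal.

```typescript
// Meanings copied from the exit-code table; the constant name is illustrative.
export const EXIT_MEANINGS: Readonly<Record<number, string>> = {
  0: "All messages imported (or user declined at confirm prompt)",
  1: "API error or transport failure",
  2: "Quota pre-flight failed (import didn't start)",
  3: "Import started but stopped early (overquota, canceled, aborted)",
  64: "Bad CLI arguments / Node too old",
  130: "User canceled via prompt or Ctrl-C",
};

export function describeExit(code: number | null): string {
  if (code === null) return "terminated by signal";
  return EXIT_MEANINGS[code] ?? `unknown exit code ${code}`;
}

// Example use from a wrapper script:
//   import { spawnSync } from "node:child_process";
//   const result = spawnSync("npx", ["colioemigrate", "--dry-run"], { stdio: "inherit" });
//   console.error(describeExit(result.status));
```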
Open items before GA (tracks §17 of the design)
- ~~Mailcow app-password scoping~~ ✅ Resolved 2026-06-15: mailcow supports per-mailbox + IMAP-only; we manage TTL ourselves (no native support) and probe with synthetic IMAP LOGIN to mitigate upstream bug mailcow-dockerized#6147. See PHASE0-BACKEND.md for the runtime contract.
- ~~Backend routes~~ ✅ Resolved 2026-06-15: /api/v1/migration/* shipped on Fluid main behind the import-from-device-enabled flag.
- Wizard blades: Destination → Generate Code → Confirm Device → Live Progress → Summary in colioe-website. Backend is ready; flag is in the UI_FLAGS allowlist.
- Outlook detectors: PST (in-place read), Outlook-Mac legacy header-strip + DB folder lookup, New Outlook account-pull fallback.
- PGP/encrypted passthrough: currently works (opaque bytes survive), but needs an explicit test case.
- Signed releases: code-sign macOS/Windows binaries so antivirus doesn't flag the tool when it reads mail stores.
