messaging-markdown-exporter

v0.3.3

Published

12 days ago

Export iMessage, Telegram, WhatsApp, and Signal conversations into a shared markdown format

Downloads

227

mjaverto

imessage telegram whatsapp signal messages markdown export

Messaging Markdown Exporter

Export conversations from multiple messaging apps into a shared markdown format.

Supported sources

All four adapters are native, passive readers — no manual export step.

| Source | Input | One-time setup |
|---|---|---|
| imessage | macOS chat.db (direct read) | Grant Full Disk Access to the binary running the exporter |
| telegram | MTProto via gramjs (persistent session) | node dist/cli.js telegram-login |
| whatsapp | WhatsApp Desktop ChatStorage.sqlite (Group Container, plaintext) | Grant Full Disk Access; quit WhatsApp Desktop briefly on first run |
| signal | Signal Desktop db.sqlite (SQLCipher, key from macOS Keychain) | Quit Signal Desktop so the DB is unlocked; approve the keychain prompt on first run |

Architecture

The repo is structured around three layers:

Adapters
- one per source system
- convert source-specific exports or databases into a normalized model
Normalized model
- shared conversation/message representation
- keeps rendering independent from source-specific parsing
Renderer
- one shared markdown renderer
- creates daily markdown files in a consistent layout

This keeps source complexity from leaking across the whole codebase.
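
As a rough illustration of the three layers, the sketch below uses hypothetical type and function names (NormalizedMessage, NormalizedConversation, SourceAdapter, renderDaily); none of them are taken from the project's source, and the markdown layout is invented for the example. It only shows how a renderer can work from the normalized model without knowing which adapter produced it.

```typescript
// Hypothetical shapes: these names are illustrative, not the project's real API.
interface NormalizedMessage {
  sender: string;
  sentAt: Date;
  text: string;
}

interface NormalizedConversation {
  source: string; // e.g. "imessage"
  chatId: string;
  title: string;
  messages: NormalizedMessage[];
}

// Adapter layer: one implementation per source system.
interface SourceAdapter {
  readonly source: string;
  load(): NormalizedConversation[];
}

// Renderer layer: sees only the normalized model.
// Returns a map of "YYYY-MM-DD" (UTC) to the markdown body for that day.
function renderDaily(conversation: NormalizedConversation): Record<string, string> {
  const linesByDay: Record<string, string[]> = {};
  for (const message of conversation.messages) {
    const iso = message.sentAt.toISOString(); // always UTC
    const day = iso.slice(0, 10);
    const time = iso.slice(11, 16);
    if (!(day in linesByDay)) linesByDay[day] = [];
    linesByDay[day].push(`**${message.sender}** (${time}): ${message.text}`);
  }
  const files: Record<string, string> = {};
  for (const day of Object.keys(linesByDay)) {
    files[day] = `# ${conversation.title} (${day})\n\n${linesByDay[day].join("\n")}\n`;
  }
  return files;
}

// A stand-in adapter with fixed data, so the sketch runs without any database.
const fakeAdapter: SourceAdapter = {
  source: "imessage",
  load: () => [
    {
      source: "imessage",
      chatId: "42",
      title: "Alice",
      messages: [
        { sender: "Alice", sentAt: new Date("2026-04-19T12:30:00.000Z"), text: "hi" },
        { sender: "Me", sentAt: new Date("2026-04-20T08:00:00.000Z"), text: "hello" },
      ],
    },
  ],
};

const dailyFiles = renderDaily(fakeAdapter.load()[0]);
```

Swapping fakeAdapter for another SourceAdapter would leave renderDaily unchanged, which is the separation the layering aims for.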

Install

git clone https://github.com/mjaverto/imessage-to-markdown.git
cd imessage-to-markdown
npm install
npm run build

Package name:

messaging-markdown-exporter

CLI binaries:

messaging-markdown-exporter
imessage-to-markdown (legacy alias)

CLI usage

iMessage

node dist/cli.js \
  --source imessage \
  --db-path ~/Library/Messages/chat.db \
  --output-dir ~/brain/iMessage

Telegram

First-time auth (run once, interactively):

node dist/cli.js telegram-login

You'll be prompted for your apiId/apiHash (from https://my.telegram.org/apps), phone number, login code, and optional 2FA password. The resulting session string is saved under ~/.config/imessage-to-markdown/telegram/ with chmod 600.

Subsequent unattended runs:

node dist/cli.js \
  --source telegram \
  --output-dir ~/brain/Telegram

WhatsApp

node dist/cli.js \
  --source whatsapp \
  --output-dir ~/brain/WhatsApp

Reads ~/Library/Group Containers/group.net.whatsapp.WhatsApp.shared/ChatStorage.sqlite directly. No manual export. Override with --whatsapp-db-path if needed.

Signal

node dist/cli.js \
  --source signal \
  --output-dir ~/brain/Signal \
  --my-name "Mike"

Reads the Signal Desktop SQLCipher database in place. The encryption key is auto-retrieved from the macOS Keychain entry "Signal Safe Storage" and unwrapped via Chromium's OSCrypt scheme. Override with --signal-db-path and --signal-config-path if Signal is installed outside the default location.

Contacts integration (iMessage)

For the imessage source, the exporter resolves chat handles (phone numbers, emails) to display names before writing markdown. The resolved name is used in the markdown header, message senders, and the YAML frontmatter.

Resolution strategy (in order):

1. Direct AddressBook SQLite read (preferred). The exporter reads every AddressBook-v22.abcddb under ~/Library/Application Support/AddressBook/Sources/<UUID>/ via better-sqlite3-multiple-ciphers. No Apple Events / Automation grant required -- only Full Disk Access, which the launchd runner already needs to read chat.db. This path is fast and works under launchd where JXA/osascript reliably fails with Apple Events error -1743 (errAEEventNotPermitted).
2. JXA via osascript (fallback). If the SQLite path finds zero contacts (sources dir missing, schema change, custom Contacts setup on a network mount), the exporter falls back to the legacy JXA dump. This triggers a Contacts permission prompt the first time and only works from a context that has the Automation -> Contacts grant.
3. Raw handles. If both paths fail, the exporter logs a one-line warning and uses raw handles in the output -- exports still succeed.

Phone numbers are normalized to the last 10 digits for matching (US-centric; documented tradeoff). Emails are lowercased and trimmed. Handles present in multiple AddressBook sources are resolved with first-writer-wins semantics using alphabetical source-directory order, which is deterministic across runs.
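
The matching rules above can be sketched as follows. The helper names (normalizeHandle, buildContactMap) and the AddressBookSource shape are hypothetical; treating any handle that contains "@" as an email is also an assumption made for the example.

```typescript
// Hypothetical helpers illustrating the normalization and merge rules described above.
function normalizeHandle(handle: string): string {
  const trimmed = handle.trim();
  if (trimmed.indexOf("@") !== -1) {
    return trimmed.toLowerCase(); // emails: trimmed and lowercased
  }
  const digits = trimmed.replace(/\D/g, "");
  return digits.slice(-10); // phones: last 10 digits only
}

interface AddressBookSource {
  dir: string; // the source directory name (a UUID in practice)
  contacts: { handle: string; name: string }[];
}

function buildContactMap(sources: AddressBookSource[]): Record<string, string> {
  const resolved: Record<string, string> = {};
  // Alphabetical directory order keeps the result deterministic across runs.
  const ordered = sources
    .slice()
    .sort((a, b) => (a.dir < b.dir ? -1 : a.dir > b.dir ? 1 : 0));
  for (const source of ordered) {
    for (const contact of source.contacts) {
      const key = normalizeHandle(contact.handle);
      if (key !== "" && !(key in resolved)) {
        resolved[key] = contact.name; // first writer wins
      }
    }
  }
  return resolved;
}

// Two sources disagree about the same number; "A-source" sorts first and wins.
const contactMap = buildContactMap([
  { dir: "B-source", contacts: [{ handle: "+1 (570) 555-1234", name: "K. Smith" }] },
  { dir: "A-source", contacts: [{ handle: "5705551234", name: "Karissa Smith" }] },
]);
```

The US-centric tradeoff is visible here: a number such as +44 20 7946 0958 is reduced to 2079460958, so the country code no longer takes part in matching.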

Flags

- --no-contacts -- skip Contacts.app entirely (no permission prompt).
- --use-contact-names -- when set, 1:1 chat output files are named after the resolved contact (e.g. Karissa Smith.md) instead of the slugified handle. Group chats keep slug-based filenames. Default off for backward compatibility with installed runners.

YAML frontmatter

Every generated markdown file starts with a YAML frontmatter block:

---
contact: "Karissa Smith"          # 1:1 chats only
participants: ["Alice", "Bob"]    # group chats only
handles: ["+15705551234"]
chat_id: 42                       # source-specific stable id (iMessage ROWID)
service: "iMessage"
source: "imessage"
message_count: 12
first_message: 2026-04-19T12:30:00.000Z
last_message: 2026-04-19T18:45:00.000Z
exported_at: 2026-04-19T19:30:00.000Z
contacts_resolved: false          # only when contacts lookup was attempted and empty
---

Downstream tooling (Obsidian, Dataview, custom indexers) can rely on the shape above being stable across sources.

contacts_resolved: false is emitted only when contacts resolution was requested for the source (i.e. --no-contacts was not passed and the source is one that uses Contacts.app, currently imessage and whatsapp) and the resolved map came back empty (both AddressBook SQLite and JXA fallback failed). Use it to flag exports where raw phone numbers / emails appear in place of names so downstream indexers do not treat handles as canonical contact identities. The field is omitted on successful resolution and on --no-contacts runs.
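
A minimal sketch of that emission rule, assuming hypothetical names (FrontmatterInput, ContactsOutcome, shouldFlagUnresolved, renderFrontmatter); only the frontmatter field names and the list of Contacts-backed sources come from the text above. String values are quoted with JSON.stringify, which yields valid YAML for simple strings and flow sequences.

```typescript
// Hypothetical input shapes; the frontmatter keys follow the example block above.
interface FrontmatterInput {
  contact?: string; // 1:1 chats only
  participants?: string[]; // group chats only
  handles: string[];
  chatId: number;
  service: string;
  source: string;
  messageCount: number;
  firstMessage: Date;
  lastMessage: Date;
  exportedAt: Date;
}

interface ContactsOutcome {
  noContactsFlag: boolean; // true when --no-contacts was passed
  resolvedCount: number; // size of the resolved handle-to-name map
}

// Sources described above as using Contacts.app.
const CONTACT_SOURCES = ["imessage", "whatsapp"];

function shouldFlagUnresolved(source: string, outcome: ContactsOutcome): boolean {
  const requested = !outcome.noContactsFlag && CONTACT_SOURCES.indexOf(source) !== -1;
  return requested && outcome.resolvedCount === 0;
}

function renderFrontmatter(input: FrontmatterInput, outcome: ContactsOutcome): string {
  const lines: string[] = ["---"];
  if (input.contact !== undefined) {
    lines.push(`contact: ${JSON.stringify(input.contact)}`);
  }
  if (input.participants !== undefined) {
    lines.push(`participants: ${JSON.stringify(input.participants)}`);
  }
  lines.push(`handles: ${JSON.stringify(input.handles)}`);
  lines.push(`chat_id: ${input.chatId}`);
  lines.push(`service: ${JSON.stringify(input.service)}`);
  lines.push(`source: ${JSON.stringify(input.source)}`);
  lines.push(`message_count: ${input.messageCount}`);
  lines.push(`first_message: ${input.firstMessage.toISOString()}`);
  lines.push(`last_message: ${input.lastMessage.toISOString()}`);
  lines.push(`exported_at: ${input.exportedAt.toISOString()}`);
  // Emitted only when a lookup was attempted and came back empty.
  if (shouldFlagUnresolved(input.source, outcome)) {
    lines.push("contacts_resolved: false");
  }
  lines.push("---");
  return lines.join("\n") + "\n";
}

const sampleInput: FrontmatterInput = {
  contact: "+15705551234",
  handles: ["+15705551234"],
  chatId: 42,
  service: "iMessage",
  source: "imessage",
  messageCount: 12,
  firstMessage: new Date("2026-04-19T12:30:00.000Z"),
  lastMessage: new Date("2026-04-19T18:45:00.000Z"),
  exportedAt: new Date("2026-04-19T19:30:00.000Z"),
};

const flagged = renderFrontmatter(sampleInput, { noContactsFlag: false, resolvedCount: 0 });
const skipped = renderFrontmatter(sampleInput, { noContactsFlag: true, resolvedCount: 0 });
```

A downstream indexer can then treat the presence of contacts_resolved: false as a signal that contact holds a raw handle instead of a display name.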

Installer

The installer writes a launchd agent and a generated runner script that invokes the CLI once per enabled source.

The runner reads config.json and loops over enabledSources (e.g. ["imessage", "telegram", "whatsapp", "signal"]). When enabledSources is absent, it falls back to [config.source] for backward compatibility with existing installs. Each source writes to either outputDir (single source) or outputDir/<source> (multiple).

Fresh installs start with the selected source in config.source; to enable more sources after install, add "enabledSources": [...] to config.json.
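
The fallback and output-directory rules can be sketched like this. RunnerConfig, PlannedRun, and planRuns are hypothetical names; the sketch also assumes that "single source" means exactly one entry in the effective source list, which the text above does not spell out.

```typescript
// Hypothetical config shape; only source, enabledSources, and outputDir come from the text.
interface RunnerConfig {
  source: string;
  enabledSources?: string[];
  outputDir: string;
}

interface PlannedRun {
  source: string;
  outputDir: string;
}

function planRuns(config: RunnerConfig): PlannedRun[] {
  // Older installs have no enabledSources, so fall back to the single configured source.
  const sources =
    config.enabledSources !== undefined ? config.enabledSources : [config.source];
  const single = sources.length === 1; // assumption: one entry counts as "single source"
  return sources.map((source) => ({
    source,
    outputDir: single ? config.outputDir : `${config.outputDir}/${source}`,
  }));
}

const legacyPlan = planRuns({ source: "imessage", outputDir: "/Users/me/brain/iMessage" });
const multiPlan = planRuns({
  source: "imessage",
  enabledSources: ["imessage", "telegram"],
  outputDir: "/Users/me/brain",
});
```

Here legacyPlan writes straight into the configured directory, while multiPlan gives each source its own subdirectory.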

Interactive:

npm run install:local

Non-interactive example:

node dist/install.js \
  --source imessage \
  --yes \
  --output-dir "$HOME/brain/iMessage" \
  --schedule 05:30 \
  --ac-power-only

Doctor mode:

node dist/install.js --doctor --source imessage

Uninstall:

node dist/install.js --uninstall

Source-specific notes

iMessage

- direct chat.db reads via the sqlite3 CLI + tmpdir copy
- attributed-body cleanup is heuristic, not perfect
- Contacts resolution reads the AddressBook .abcddb SQLite files directly (no Automation / Apple Events grant required); JXA remains as a fallback for non-standard setups

Telegram

- uses MTProto (gramjs TelegramClient) with a persistent StringSession
- per-dialog cursors under ~/.config/imessage-to-markdown/telegram/cursors.json
- FLOOD_WAIT_N errors sleep N seconds and retry once
- AUTH_KEY_UNREGISTERED (session invalidated) emits a warning and exits 0 so scheduled jobs don't spam errors; re-run telegram-login to restore the session
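
The "sleep N seconds and retry once" behavior for FLOOD_WAIT_N could look roughly like the sketch below. It is library-agnostic: floodWaitSeconds and withFloodWaitRetry are hypothetical helpers, and reading N out of the error message with a regular expression is an assumption, not a description of how gramjs reports the error.

```typescript
// Hypothetical helper: extracts N from an error whose message contains FLOOD_WAIT_N.
function floodWaitSeconds(error: unknown): number | null {
  const message = error instanceof Error ? error.message : String(error);
  const match = /FLOOD_WAIT_(\d+)/.exec(message);
  return match ? parseInt(match[1], 10) : null;
}

type Sleep = (ms: number) => Promise<void>;

const realSleep: Sleep = (ms) =>
  new Promise<void>((resolve) => {
    setTimeout(resolve, ms);
  });

// Runs `call`; on FLOOD_WAIT_N it sleeps N seconds and retries exactly once.
// Any other error, or a failure on the retry, propagates to the caller.
async function withFloodWaitRetry<T>(
  call: () => Promise<T>,
  sleep: Sleep = realSleep,
): Promise<T> {
  try {
    return await call();
  } catch (error) {
    const seconds = floodWaitSeconds(error);
    if (seconds === null) throw error;
    await sleep(seconds * 1000);
    return call();
  }
}
```

Passing sleep as a parameter keeps the helper testable without real delays; a caller would wrap each rate-limited request in withFloodWaitRetry.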

WhatsApp

- reads ChatStorage.sqlite via the same sqlite3 CLI + tmpdir copy pattern as iMessage
- joins ZWAMESSAGE with ZWACHATSESSION, ZWAGROUPMEMBER, ZWAPROFILEPUSHNAME, and ZWAMEDIAITEM
- sender resolution order: ZCONTACTNAME → ZPUSHNAME → ZWAPROFILEPUSHNAME → Contacts.app → ZFIRSTNAME → parsed JID user
- if WhatsApp Desktop holds the DB lock at the moment of copy, the adapter warns and returns an empty conversation list (the next run will retry)
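
The sender resolution order amounts to "first non-empty candidate wins". In the sketch below, the SenderCandidates shape and resolveSender are hypothetical; the field comments map each property to the column or lookup named in the list above, and skipping blank strings is an assumption.

```typescript
// Hypothetical row shape; how the adapter actually carries these values is an assumption.
interface SenderCandidates {
  contactName?: string | null; // ZCONTACTNAME
  pushName?: string | null; // ZPUSHNAME
  profilePushName?: string | null; // ZWAPROFILEPUSHNAME
  contactsAppName?: string | null; // Contacts.app lookup
  firstName?: string | null; // ZFIRSTNAME
  jid: string; // e.g. "15705551234@s.whatsapp.net"
}

function resolveSender(c: SenderCandidates): string {
  const ordered = [
    c.contactName,
    c.pushName,
    c.profilePushName,
    c.contactsAppName,
    c.firstName,
  ];
  for (const candidate of ordered) {
    if (candidate !== undefined && candidate !== null && candidate.trim() !== "") {
      return candidate.trim();
    }
  }
  // Last resort: the user part of the JID (everything before "@").
  const at = c.jid.indexOf("@");
  return at === -1 ? c.jid : c.jid.slice(0, at);
}

const fromJid = resolveSender({ jid: "15705551234@s.whatsapp.net" });
const fromPushName = resolveSender({
  contactName: "",
  pushName: " Bob ",
  jid: "15705551234@s.whatsapp.net",
});
```

An opaque ID without an "@" falls through unchanged, so such senders appear as the raw ID.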

Signal

- unlocks Signal's SQLCipher v4 database using the Chromium OSCrypt scheme: the password stored in the macOS Keychain entry "Signal Safe Storage" is stretched with PBKDF2-HMAC-SHA1 (salt "saltysalt", 1003 iterations) into an AES-128-CBC key, which, together with an IV of 16 spaces, decrypts the encryptedKey field in config.json
- falls back to the legacy plaintext key field when present
- SQLCipher v4 pragmas (cipher='sqlcipher', legacy=4) are set before PRAGMA key; without them better-sqlite3-multiple-ciphers defaults to sqleet and rejects the key
- a SIGNAL_DB_BUSY (SQLITE_BUSY) is treated as a soft failure: the adapter exits 0 with a warning so cron runs while Signal is open don't fail the job
- Sonoma 14.5+ caveat: recent macOS versions changed how Electron's safeStorage negotiates with the Keychain. If the first run fails to retrieve the key, see carderne/signal-export#133; the workaround is usually to approve one targeted Keychain permission dialog
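
The key-unwrapping parameters listed above can be exercised with Node's built-in crypto module. This is a sketch under stated assumptions, not the adapter's code: unwrapSignalKey and wrapForTest are hypothetical names, and the sketch assumes encryptedKey is hex-encoded and begins with the 3-byte "v10" version prefix that Chromium puts in front of the ciphertext.

```typescript
import { createCipheriv, createDecipheriv, pbkdf2Sync } from "node:crypto";

// Parameters as listed above (Chromium OSCrypt on macOS).
const SALT = "saltysalt";
const ITERATIONS = 1003;
const KEY_LENGTH = 16; // AES-128
const IV = Buffer.alloc(16, " "); // 16 ASCII spaces

function deriveAesKey(keychainPassword: string): Buffer {
  return pbkdf2Sync(keychainPassword, SALT, ITERATIONS, KEY_LENGTH, "sha1");
}

// Assumption: the input is hex and starts with a "v10" prefix before the ciphertext.
function unwrapSignalKey(encryptedKeyHex: string, keychainPassword: string): string {
  const raw = Buffer.from(encryptedKeyHex, "hex");
  const prefix = raw.subarray(0, 3).toString("latin1");
  if (prefix !== "v10") {
    throw new Error(`unexpected version prefix: ${prefix}`);
  }
  const decipher = createDecipheriv("aes-128-cbc", deriveAesKey(keychainPassword), IV);
  const plain = Buffer.concat([decipher.update(raw.subarray(3)), decipher.final()]);
  return plain.toString("utf8");
}

// Inverse of unwrapSignalKey, present only so the sketch can run without a real Keychain.
function wrapForTest(plainKey: string, keychainPassword: string): string {
  const cipher = createCipheriv("aes-128-cbc", deriveAesKey(keychainPassword), IV);
  const body = Buffer.concat([cipher.update(plainKey, "utf8"), cipher.final()]);
  return Buffer.concat([Buffer.from("v10", "latin1"), body]).toString("hex");
}

const wrapped = wrapForTest("0123456789abcdef", "example-password");
const unwrapped = unwrapSignalKey(wrapped, "example-password");
```

The unwrapped string is what would then be handed to PRAGMA key, after the SQLCipher v4 pragmas have been set.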

Development

npm install
npm run build
npm test
npm run lint

Coverage thresholds

npm test (via vitest.config.ts) enforces minimum coverage:

| Metric | Floor |
|------------|------:|
| Lines | 75% |
| Statements | 75% |
| Functions | 75% |
| Branches | 70% |

The floors are set below current coverage so normal churn doesn't turn CI red, but a sudden drop (e.g. an entire adapter losing its tests) will. Raise these as coverage climbs — check current numbers with npx vitest run --coverage.

Current limitations

- Attachment handling is still simplified across all sources: the markdown renders an attachment marker but does not copy the attachment bytes.
- WhatsApp: newer Desktop builds emit opaque privacy-IDs (base64-like) in ZFROMJID for group messages instead of the <phone>@s.whatsapp.net form, so those senders are shown as the raw ID. 1:1 chats and older group data resolve correctly.
- Telegram: the adapter reads dialogs and messages, but media download is out of scope for this pass.
- Signal: requires Signal Desktop to be quit at the moment of read (SQLite write-locks the DB). Scheduled runs during an active Signal session will soft-fail with a warning.

Releasing

Releases are fully automated. Every merge to main triggers .github/workflows/release.yml, which:

1. Runs lint, build, and tests as a gate.
2. Bumps the patch version (npm version patch) and commits as chore(release): vX.Y.Z [skip ci].
3. Tags vX.Y.Z and pushes the commit + tag back to main.
4. Publishes to npm as messaging-markdown-exporter with provenance.
5. Creates a GitHub Release with auto-generated notes.

The [skip ci] token on the release commit prevents the workflow from re-triggering itself.

Manual minor / major bumps

The workflow only does patch bumps. For a minor or major release, bump the version locally on a normal commit (npm version minor --no-git-tag-version or npm version major --no-git-tag-version, then commit and push). The release workflow will then apply its patch bump on top of that version on the next merge, so prefer to land the version bump in a release-only PR if you want the resulting tag to match exactly.
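
To make the "patch on top" arithmetic concrete, here is a small illustration. The bump function is a hypothetical stand-in for what npm version does to a plain X.Y.Z version; it is not part of the project or of npm.

```typescript
type Bump = "major" | "minor" | "patch";

// Minimal semver bump for plain X.Y.Z versions (no prerelease or build tags).
function bump(version: string, kind: Bump): string {
  const parts = version.split(".").map((part) => parseInt(part, 10));
  if (parts.length !== 3 || parts.some((n) => isNaN(n))) {
    throw new Error(`not a plain X.Y.Z version: ${version}`);
  }
  const [major, minor, patch] = parts;
  if (kind === "major") return `${major + 1}.0.0`;
  if (kind === "minor") return `${major}.${minor + 1}.0`;
  return `${major}.${minor}.${patch + 1}`;
}

// Local step: a minor bump on top of the current release.
const afterLocalMinor = bump("0.3.3", "minor");
// Next merge to main: the workflow's patch bump is applied on top of it.
const afterWorkflowPatch = bump(afterLocalMinor, "patch");
```

With these inputs the local bump yields 0.4.0 and the workflow's patch bump then yields 0.4.1.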

npm authentication

Two paths are supported; pick one and configure it in the GitHub repo:

- OIDC trusted publishing (recommended). No secret needed. Configure messaging-markdown-exporter on npmjs.com with this repo + workflow as a trusted publisher. The workflow already requests id-token: write and passes --provenance. See: https://docs.npmjs.com/trusted-publishers
- Long-lived NPM_TOKEN (fallback). Add an automation token as repo secret NPM_TOKEN. The workflow reads it via NODE_AUTH_TOKEN.

If neither is configured, the publish step will fail (the rest of the workflow up to that point still runs).

Branch protection

After CI lands, enable branch protection on main and require the Lint, Build, Test check to pass before merge.

License

MIT
