preverification
v0.1.0
Fast, SMTP-free email preverification CLI
verification
Fast, SMTP-free email pre-verification CLI. Takes a file full of emails (in any sensible separator format) and produces a verdict, a score, and the reasons behind each decision.
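A minimal sketch of how such a mixed-separator file could be tokenized. The helper name `splitEmails` is hypothetical, and the separator set (commas, semicolons, and whitespace including newlines) is an assumption, not the CLI's documented behavior.

```typescript
// Hypothetical helper: splits raw file contents into candidate addresses.
// Assumption: commas, semicolons, and any whitespace act as separators.
function splitEmails(input: string): string[] {
  return input
    .split(/[\s,;]+/) // one or more separators in a row count as a single break
    .filter((token) => token.length > 0); // drop empties from leading/trailing separators
}
```

Calling `splitEmails` on the full file contents returns the tokens in file order; no validation happens at this stage.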
Why no SMTP?
SMTP probing works but drags along a bunch of real-world problems:
- Greylisting and tarpitting make it slow.
- Many providers (Google, Microsoft) lie: they return `250 OK` on RCPT TO for any local part, defeating the point.
- It hurts sender reputation and gets the probing IP blocklisted.
- Most cloud and residential networks block outbound port 25.
Instead we combine several cheap signals:
- Syntax validation: pragmatic RFC-ish check that every real mailer enforces.
- Typo correction: e.g. `gamil.com` → `gmail.com`, `gmail.con` → `gmail.com`. Uses Damerau-Levenshtein over a list of popular providers, plus TLD-only correction for custom domains.
- Domain validation: DNS-over-HTTPS against Google (`dns.google`) and Cloudflare (`1.1.1.1`), randomly picking a resolver per query.
- Catch-all detection: merged list of disposable providers plus heuristics on the domain's MX records (ImprovMX, forwardemail, etc. are strong catch-all signals).
- Light server verification: resolve the primary MX host to an IP and optionally TCP-connect to `:25` without speaking SMTP. Confirms the host actually exists and a mail server is listening.
- Additional scoring: role-based locals (`info@`, `admin@`), free providers, plus-addressing, long locals, gibberish locals, digit runs, etc.
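The typo-correction signal can be illustrated with a short sketch. It uses the restricted Damerau-Levenshtein distance (optimal string alignment), which may differ from the variant the tool implements. The provider list, the function names, and the maximum distance of 2 are assumptions for illustration, and the TLD-only correction is left out.

```typescript
// Restricted Damerau-Levenshtein (optimal string alignment) distance:
// counts insertions, deletions, substitutions, and adjacent transpositions.
function osaDistance(a: string, b: string): number {
  const rows = a.length + 1;
  const cols = b.length + 1;
  const d: number[][] = Array.from({ length: rows }, () => new Array<number>(cols).fill(0));
  for (let i = 0; i < rows; i++) d[i][0] = i;
  for (let j = 0; j < cols; j++) d[0][j] = j;
  for (let i = 1; i < rows; i++) {
    for (let j = 1; j < cols; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      d[i][j] = Math.min(
        d[i - 1][j] + 1, // deletion
        d[i][j - 1] + 1, // insertion
        d[i - 1][j - 1] + cost, // substitution
      );
      if (i > 1 && j > 1 && a[i - 1] === b[j - 2] && a[i - 2] === b[j - 1]) {
        d[i][j] = Math.min(d[i][j], d[i - 2][j - 2] + 1); // adjacent transposition
      }
    }
  }
  return d[a.length][b.length];
}

// Illustrative list only; the tool's actual provider list is not shown in this README.
const POPULAR_DOMAINS = ["gmail.com", "yahoo.com", "outlook.com", "hotmail.com", "icloud.com"];

// Returns the closest popular domain within maxDistance edits, or null when the
// domain is already on the list or nothing is close enough (threshold is an assumption).
function suggestDomain(domain: string, maxDistance = 2): string | null {
  const lower = domain.toLowerCase();
  if (POPULAR_DOMAINS.includes(lower)) return null;
  let best: string | null = null;
  let bestDistance = maxDistance + 1;
  for (const candidate of POPULAR_DOMAINS) {
    const dist = osaDistance(lower, candidate);
    if (dist < bestDistance) {
      best = candidate;
      bestDistance = dist;
    }
  }
  return best;
}
```

Both README examples are a single edit away from `gmail.com`: `gamil.com` is one adjacent transposition, and `gmail.con` is one substitution.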
Each check contributes to a 0–100 score mapped to a valid / risky /
invalid verdict.
Usage

    npx verification@latest emails.txt

The input file can mix separators freely:

    [email protected], [email protected]
    [email protected]
    [email protected];[email protected]

Options
| flag | description |
| --- | --- |
| -c, --concurrency <n> | parallel verifications (default 20) |
| -f, --format <fmt> | output format: pretty (default), json, jsonl, or csv |
| -o, --output <path> | write to a file instead of stdout |
| -e, --export <path> | export a list of verified emails (normalized, one per line) |
| --no-tcp | skip the TCP :25 reachability probe (DNS only) |
| --tcp-timeout <ms> | TCP connect timeout (default 2500) |
| --doh-timeout <ms> | DoH query timeout (default 3000) |
| --min-score <n> | only print results at or above this score |
| --only <verdict> | only print results with this verdict |
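To make `--doh-timeout` concrete, here is a hedged sketch of a single MX lookup over DNS-over-HTTPS with a random resolver and a hard timeout. The endpoint URLs are the resolvers' public JSON APIs and are an assumption about what the tool calls; `buildDohUrl`, `parseMxHosts`, and `lookupMx` are hypothetical names. It relies on the global `fetch` and `AbortSignal.timeout` available in recent Node.js releases.

```typescript
// Assumed public JSON endpoints for the two resolvers; the tool may use others.
const RESOLVERS = ["https://dns.google/resolve", "https://cloudflare-dns.com/dns-query"];

interface DohAnswer { name: string; type: number; TTL: number; data: string; }
interface DohResponse { Status: number; Answer?: DohAnswer[]; }

const MX_TYPE = 15; // DNS record type number for MX

function buildDohUrl(resolver: string, domain: string): string {
  const url = new URL(resolver);
  url.searchParams.set("name", domain);
  url.searchParams.set("type", "MX");
  return url.toString();
}

// Extracts MX host names from a DoH JSON response, lowest preference value first.
// MX data looks like "10 mail.example.com."; a null MX ("0 .") is dropped.
function parseMxHosts(response: DohResponse): string[] {
  return (response.Answer ?? [])
    .filter((a) => a.type === MX_TYPE)
    .map((a) => {
      const parts = a.data.trim().split(/\s+/);
      return {
        pref: Number(parts[0]),
        host: (parts[1] ?? "").replace(/\.$/, "").toLowerCase(),
      };
    })
    .filter((r) => r.host !== "" && !Number.isNaN(r.pref))
    .sort((x, y) => x.pref - y.pref)
    .map((r) => r.host);
}

// Queries one randomly chosen resolver and aborts after timeoutMs milliseconds.
async function lookupMx(domain: string, timeoutMs = 3000): Promise<string[]> {
  const resolver = RESOLVERS[Math.floor(Math.random() * RESOLVERS.length)];
  const res = await fetch(buildDohUrl(resolver, domain), {
    headers: { accept: "application/dns-json" },
    signal: AbortSignal.timeout(timeoutMs),
  });
  if (!res.ok) throw new Error(`DoH query failed with HTTP ${res.status}`);
  return parseMxHosts((await res.json()) as DohResponse);
}
```

A timeout surfaces as a rejected promise, so a caller can treat it as "unknown" rather than "no MX records" if it wants to avoid penalizing slow resolvers.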
Examples

    # pretty output, default
    verification list.txt

    # CSV export for a spreadsheet
    verification list.txt -f csv -o report.csv

    # JSON-lines for piping into jq
    verification list.txt -f jsonl | jq 'select(.verdict == "risky")'

    # only keep valid addresses
    verification list.txt --only valid -f jsonl > clean.jsonl

Scoring reference
Every email starts at 100, and each signal that fires adjusts the score:

| signal | adjustment |
| --- | --- |
| typo suggested | −15 |
| no MX records | −25 |
| disposable domain | −60 |
| catch-all (high confidence) | −20 |
| catch-all (medium confidence) | −10 |
| role-based local | −12 |
| plus-addressing | −3 |
| long local (>30) | −5 |
| gibberish local | −15 |
| 6+ consecutive digits in local | −5 |
| MX host does not resolve | −20 |
| MX reachable on TCP :25 | +3 |

Thresholds: >=75 valid, >=40 risky, otherwise invalid. Hard fails
(invalid syntax, NXDOMAIN) immediately score 0.
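The arithmetic above can be sketched as a small function. The signal names are hypothetical labels for the table rows, and clamping the result to the 0–100 range is an assumption (the README only says the score is 0–100); the adjustments and thresholds are taken from the table.

```typescript
// Hypothetical labels for the rows of the scoring table.
type Signal =
  | "typoSuggested" | "noMx" | "disposable" | "catchAllHigh" | "catchAllMedium"
  | "roleLocal" | "plusAddressing" | "longLocal" | "gibberishLocal"
  | "digitRun" | "mxUnresolved" | "tcpReachable";

const ADJUSTMENTS: Record<Signal, number> = {
  typoSuggested: -15,
  noMx: -25,
  disposable: -60,
  catchAllHigh: -20,
  catchAllMedium: -10,
  roleLocal: -12,
  plusAddressing: -3,
  longLocal: -5,
  gibberishLocal: -15,
  digitRun: -5,
  mxUnresolved: -20,
  tcpReachable: 3,
};

type Verdict = "valid" | "risky" | "invalid";

// hardFail covers invalid syntax and NXDOMAIN, which score 0 immediately.
function scoreEmail(signals: Signal[], hardFail = false): { score: number; verdict: Verdict } {
  if (hardFail) return { score: 0, verdict: "invalid" };
  const raw = signals.reduce((total, s) => total + ADJUSTMENTS[s], 100);
  const score = Math.max(0, Math.min(100, raw)); // assumption: clamp to 0–100
  const verdict: Verdict = score >= 75 ? "valid" : score >= 40 ? "risky" : "invalid";
  return { score, verdict };
}
```

For example, a role-based address with a suggested typo fix scores 100 − 12 − 15 = 73, which falls just under the 75 threshold and is reported as risky.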
