phishlens

v0.1.1

Published

19 days ago

CLI tool that detects phishing and scam URLs, emails, and SMS messages using deterministic signals plus pluggable LLM analysis (Anthropic, OpenAI, Ollama).


mistankh

phishing scam cybersecurity security url-checker email-checker anti-phishing cli llm anthropic openai ollama

PhishLens 🔍

Phishing & scam detection for URLs, emails, and SMS — from the terminal.

PhishLens combines deterministic heuristics, real-time reputation lookups, and pluggable LLM analysis to deliver fast, calibrated risk scores for any suspicious input. Use it as a CLI, an interactive TUI, or a local REST API.

Features

| Category | What PhishLens checks |
|---|---|
| URL structure | Domain age (WHOIS), brand lookalike (Levenshtein + homograph deconfusion), punycode/IDN, TLD reputation, subdomain depth, URL shorteners, IP-as-host |
| Page content | Live page fetch, cross-domain login forms, hidden iframes, meta-refresh redirects, obfuscated JavaScript, brand/title mismatch |
| Email headers | SPF / DKIM / DMARC authentication results, Received-SPF parsing, DMARC DNS policy lookup |
| Text & SMS | Urgency phrases, social-engineering patterns, sender-brand spoofing, display-text/href mismatches |
| Reputation feeds | Google Safe Browsing, VirusTotal (70+ AV scanners), URLScan.io |
| LLM analysis | Anthropic Claude, OpenAI GPT, Ollama, LM Studio; calibrated risk score with natural-language reasoning |
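
To make the brand-lookalike check more concrete, here is a minimal sketch of how digit-for-letter substitution and Levenshtein distance could be combined. It is an illustration only: the confusables map, the brand list, the distance threshold, and the function names (deconfuse, levenshtein, findLookalike) are assumptions made for this example, not PhishLens internals.

```typescript
// Illustrative only: the confusables map, threshold, and function names are
// assumptions for this sketch, not PhishLens internals.
const CONFUSABLES: Record<string, string> = {
  "0": "o", "1": "l", "3": "e", "5": "s", "7": "t",
};

// Map look-alike digits back to the letters they imitate.
function deconfuse(token: string): string {
  return Array.from(token.toLowerCase())
    .map((ch) => CONFUSABLES[ch] ?? ch)
    .join("");
}

// Classic two-row Levenshtein edit distance.
function levenshtein(a: string, b: string): number {
  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Return the brand a domain label imitates, or null if none is close enough.
// Each hyphen-separated token is checked on its own.
function findLookalike(
  label: string,
  brands: readonly string[],
  maxDistance = 1,
): string | null {
  for (const token of label.toLowerCase().split("-")) {
    if (brands.includes(token)) continue; // exact brand name, not a lookalike
    const normalized = deconfuse(token);
    for (const brand of brands) {
      if (levenshtein(normalized, brand) <= maxDistance) return brand;
    }
  }
  return null;
}
```

With these assumptions, findLookalike("paypa1-verify", ["paypal", "google"]) returns "paypal", while an unrelated label such as "example" returns null. A production check would also need to handle Unicode homographs and punycode, which this sketch leaves out.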

Installation

npm install -g phishlens

Or run without installing:

npx phishlens url https://suspicious-site.example.com

Quick start

# Analyze a URL
phishlens url https://paypa1-verify.com

# Analyze an email (.eml file)
phishlens email suspicious.eml

# Analyze a copy-pasted email via stdin
cat suspicious.eml | phishlens email

# Analyze an SMS / text message
phishlens text "Your parcel is on hold. Pay £1.99 to release: bit.ly/..."

# Bulk-scan a list of URLs
phishlens bulk urls.txt

# Launch the interactive TUI
phishlens

Sample output

phishlens — https://paypa1-verify.com

 DANGEROUS  score 91/100
confidence 94%  •  7 signals  •  2847 ms

Domain impersonates PayPal using a character substitution ('1' for 'l') and
was registered only 2 days ago. Cross-domain password form and brand mismatch
in page title confirm credential-harvesting intent.

URL
  ✖ Domain "paypa1" uses character substitution to mimic "paypal"  [+30]
  ✖ Domain registered 2 days ago                                   [+25]
  ▲ Top-level domain ".xyz" has a high abuse rate                  [+12]

CONTENT
  ✖ Login form submits credentials to evil-server.ru               [+35]
  ✖ Page title mentions "paypal" but the domain doesn't            [+22]

LLM REASONING  (anthropic • claude-sonnet-4-6)
  This URL exhibits textbook phishing characteristics: homograph substitution,
  fresh domain registration, and a credential-harvesting form. High confidence
  this is a phishing page targeting PayPal users.

Configuration

All settings live in ~/.phishlens/config.json. Use phishlens config set to update them. Because the file holds API keys, it is written with 0600 permissions (readable and writable only by your user).

# LLM providers
phishlens config set defaultLLM anthropic
phishlens config set providers.anthropic.apiKey  sk-ant-...
phishlens config set providers.openai.apiKey     sk-...
phishlens config set providers.ollama.baseUrl    http://localhost:11434
phishlens config set providers.lmstudio.baseUrl  http://localhost:1234

# Reputation feeds
phishlens config set virusTotalApiKey   YOUR_VT_KEY   # free: 500 lookups/day
phishlens config set safeBrowsingApiKey AIza...        # Google Safe Browsing
phishlens config set urlScanApiKey      YOUR_KEY       # optional; public mode works without one

# Tune timeouts
phishlens config set networkTimeoutMs  5000    # HTTP checks (default 5 s)
phishlens config set llmTimeoutMs      120000  # LLM calls  (default 120 s)

phishlens config show   # view current settings (keys are masked)
phishlens config path   # print config file path
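
The dotted keys used with config set (for example providers.anthropic.apiKey) suggest that config.json is a nested object. The sketch below shows one way a dotted key could be applied to such an object. The helper name setByPath and the ConfigObject type are hypothetical; how PhishLens actually parses keys is not documented here.

```typescript
// Hypothetical helper: shows how a dotted key could map onto nested JSON.
// Not taken from the PhishLens source.
type ConfigValue = string | number | boolean | ConfigObject;
interface ConfigObject {
  [key: string]: ConfigValue;
}

const FORBIDDEN_KEYS = new Set(["__proto__", "constructor", "prototype"]);

function setByPath(
  config: ConfigObject,
  dottedKey: string,
  value: string | number | boolean,
): ConfigObject {
  const parts = dottedKey.split(".");
  if (parts.some((part) => part === "" || FORBIDDEN_KEYS.has(part))) {
    throw new Error(`invalid config key: ${dottedKey}`);
  }
  const leaf = parts[parts.length - 1];
  let node = config;
  // Walk (and create) intermediate objects for every segment but the last.
  for (const part of parts.slice(0, -1)) {
    const child = node[part];
    if (typeof child === "object") {
      node = child;
    } else {
      const created: ConfigObject = {};
      node[part] = created;
      node = created;
    }
  }
  node[leaf] = value;
  return config;
}
```

Under this assumption, setting providers.anthropic.apiKey on an empty object yields {"providers":{"anthropic":{"apiKey":"sk-ant-..."}}}. Rejecting keys such as __proto__ avoids prototype pollution when keys come from the command line.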

LLM providers

| Provider | How to configure | Default model |
|---|---|---|
| Anthropic | config set providers.anthropic.apiKey or ANTHROPIC_API_KEY env var | claude-sonnet-4-6 |
| OpenAI | config set providers.openai.apiKey or OPENAI_API_KEY env var | gpt-4o |
| Ollama | Run ollama serve; models are discovered automatically | (first available) |
| LM Studio | Run the LM Studio local server; models are discovered automatically | (first loaded) |

CLI reference

phishlens [command] [options]

Commands:
  url <url>            Analyze a URL
  email [file]         Analyze an email (.eml file or stdin)
  text [message…]     Analyze a text/SMS message (args or stdin)
  bulk <file>          Analyze multiple inputs from a file (one per line)
  serve                Start a local REST API server
  tui                  Launch the interactive terminal UI  (default with no args)
  config show          Show current configuration
  config set <k> <v>   Set a config value
  config path          Print the config file path
  cache clear          Delete all cached results

Common flags

| Flag | Description |
|---|---|
| --no-llm | Skip LLM analysis (deterministic signals only, faster) |
| --llm-provider <name> | Override provider: anthropic \| openai \| ollama \| lmstudio |
| --llm-model <name> | Override model (e.g. llama3.2, gpt-4o, claude-opus-4-7) |
| --llm-timeout <ms> | Timeout for LLM calls (default 120000) |
| --offline | Disable all network calls |
| --timeout <ms> | Network timeout for HTTP checks (default 5000) |
| --no-cache | Skip the result cache |
| -j, --json | Output result as JSON |
| -v, --verbose | Verbose debug logging |

Bulk mode

One input per line — URLs, email snippets starting with From:, or free text. Empty lines and # comments are skipped.

# urls.txt
https://suspicious-login.example.com
https://paypa1.com/verify
# This is a comment — skipped
https://www.google.com

phishlens bulk urls.txt
phishlens bulk --json urls.txt > results.json
phishlens bulk --no-llm --offline urls.txt   # fast, deterministic-only
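
The line rules for bulk files (skip blanks and # comments, treat lines starting with From: as email snippets, everything else as a URL or free text) can be sketched as follows. The BulkInput type and parseBulkFile function are hypothetical names for this example. The { kind: 'url', url } shape follows the programmatic API shown in this README, while the email and text shapes, and the http(s) prefix test for URLs, are assumptions.

```typescript
// Hypothetical sketch of bulk-file parsing; shapes other than
// { kind: "url", url } are assumed for illustration.
type BulkInput =
  | { kind: "url"; url: string }
  | { kind: "email"; raw: string }
  | { kind: "text"; text: string };

function parseBulkFile(contents: string): BulkInput[] {
  const inputs: BulkInput[] = [];
  for (const rawLine of contents.split(/\r?\n/)) {
    const line = rawLine.trim();
    // Empty lines and # comments are skipped.
    if (line === "" || line.startsWith("#")) continue;
    if (/^https?:\/\//i.test(line)) {
      inputs.push({ kind: "url", url: line });
    } else if (line.startsWith("From:")) {
      inputs.push({ kind: "email", raw: line });
    } else {
      inputs.push({ kind: "text", text: line });
    }
  }
  return inputs;
}
```

Applied to the urls.txt sample above, this would produce three url entries and drop the two comment lines.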

REST API server

phishlens serve --port 3000 --host 127.0.0.1

# Analyze a URL
curl -s -X POST http://localhost:3000/analyze \
  -H "Content-Type: application/json" \
  -d '{"url":"https://suspicious-site.example.com"}' | jq .verdict

# Analyze email text
curl -s -X POST http://localhost:3000/analyze \
  -H "Content-Type: application/json" \
  -d '{"email":{"raw":"From: [email protected]\nSubject: Verify\n\nDear customer..."}}'

# Health check
curl http://localhost:3000/health
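
The same requests can be made from code. The sketch below mirrors the curl calls above using the global fetch available in Node 18+ and browsers. The helper names (buildAnalyzeRequest, analyzeRemote) are hypothetical, and the response is typed loosely because only the verdict field is confirmed by the examples in this README.

```typescript
// Hypothetical client helpers mirroring the curl examples; only the
// /analyze endpoint, the payload shapes, and the verdict field come
// from this README.
type AnalyzePayload = { url: string } | { email: { raw: string } };

interface AnalyzeRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Build the request separately so it can be inspected or logged.
function buildAnalyzeRequest(baseUrl: string, payload: AnalyzePayload): AnalyzeRequest {
  return {
    url: `${baseUrl.replace(/\/+$/, "")}/analyze`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(payload),
    },
  };
}

// Send the request to a running `phishlens serve` instance.
async function analyzeRemote(
  baseUrl: string,
  payload: AnalyzePayload,
): Promise<{ verdict?: string }> {
  const { url, init } = buildAnalyzeRequest(baseUrl, payload);
  const response = await fetch(url, init);
  if (!response.ok) {
    throw new Error(`PhishLens server returned HTTP ${response.status}`);
  }
  return (await response.json()) as { verdict?: string };
}
```

For example, analyzeRemote("http://localhost:3000", { url: "https://suspicious-site.example.com" }) resolves to the parsed JSON body once the server is running.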

Programmatic API

import { analyze } from 'phishlens';

const result = await analyze(
  { kind: 'url', url: 'https://suspicious-site.example.com' },
  {
    config: {
      virusTotalApiKey: process.env.VT_KEY,
      providers: { anthropic: { apiKey: process.env.ANTHROPIC_API_KEY } },
    },
  }
);

console.log(result.verdict);  // 'safe' | 'suspicious' | 'dangerous' | 'unknown'
console.log(result.score);    // 0–100
console.log(result.signals);  // detailed signal breakdown
console.log(result.llm);      // LLM reasoning (if configured)

Exit codes

| Code | Meaning |
|---|---|
| 0 | Safe, unknown, or analysis complete without danger |
| 1 | Unhandled error |
| 2 | Verdict is dangerous |
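
When calling the CLI from a script, the exit code is usually the simplest thing to branch on. A minimal sketch of that mapping follows. The interpretExitCode function and its outcome labels are hypothetical, and treating any code other than 0 or 2 as an error is an assumption of this example.

```typescript
// Hypothetical mapping from the documented exit codes to outcome labels.
type ExitOutcome = "not-dangerous" | "dangerous" | "error";

function interpretExitCode(code: number | null): ExitOutcome {
  switch (code) {
    case 0:
      return "not-dangerous"; // safe, unknown, or no danger found
    case 2:
      return "dangerous";
    default:
      // 1 is the documented error code; any other value, or null when the
      // process was killed by a signal, is treated as an error here.
      return "error";
  }
}
```

In a Node script, the status field returned by child_process.spawnSync is number | null and can be passed straight to this function.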

Architecture

Input (URL / Email / SMS)
         │
         ▼
  ┌─────────────────────────────────────────────────┐
  │           Deterministic Analyzers               │
  │  (run in parallel)                              │
  │                                                 │
  │  UrlAnalyzer        ContentAnalyzer             │
  │  TextAnalyzer       EmailAuthAnalyzer           │
  │  BlocklistAnalyzer  VirusTotalAnalyzer          │
  │  UrlScanAnalyzer                                │
  └────────────────────────┬────────────────────────┘
                           │ Signal[]
                           ▼
              ┌────────────────────────┐
              │      LLM Analyzer      │
              │  (informed by prior    │
              │   deterministic        │
              │   signals)             │
              └───────────┬────────────┘
                          │ Signal[]
                          ▼
              ┌────────────────────────┐
              │     Scoring Engine     │
              │  weighted sum →        │
              │  verdict + score +     │
              │  confidence + summary  │
              └────────────────────────┘

Each analyzer emits Signal objects with a weight (positive = more risk, negative = reduces risk). The scoring engine clamps the weighted aggregate into [0, 100] and derives a verdict at fixed thresholds.
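
A rough sketch of that aggregation step is shown below. The Signal shape, the threshold values (40 and 70), and the "no signals means unknown" rule are all assumptions, and the plain sum is a simplification: in the sample output above the listed weights add up to 124 while the reported score is 91, so the real engine evidently does more than add and clamp.

```typescript
// Illustrative scoring sketch. Thresholds and the Signal shape are assumed,
// and the real PhishLens aggregate is not a plain sum.
interface Signal {
  id: string;
  weight: number; // positive = more risk, negative = reduces risk
}

type Verdict = "safe" | "suspicious" | "dangerous" | "unknown";

const SUSPICIOUS_AT = 40; // hypothetical threshold
const DANGEROUS_AT = 70; // hypothetical threshold

function scoreSignals(signals: readonly Signal[]): { score: number; verdict: Verdict } {
  // With no evidence either way, report "unknown" rather than "safe".
  if (signals.length === 0) return { score: 0, verdict: "unknown" };

  const total = signals.reduce((sum, signal) => sum + signal.weight, 0);
  // Clamp the aggregate into [0, 100].
  const score = Math.min(100, Math.max(0, Math.round(total)));

  const verdict: Verdict =
    score >= DANGEROUS_AT ? "dangerous" : score >= SUSPICIOUS_AT ? "suspicious" : "safe";
  return { score, verdict };
}
```

Fixed thresholds keep the verdict reproducible for the same set of signals, which matters when results are cached or compared across bulk runs.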

Development

git clone https://github.com/MistanKh/phishlens
cd phishlens
npm install
npm test          # vitest — 220+ tests
npm run typecheck # tsc --noEmit
npm run build     # tsup → dist/
npm start         # run built CLI
npm run dev       # watch mode

Adding an analyzer

1. Create src/analyzers/my-analyzer.ts implementing the Analyzer<T> interface.
2. Register it in src/engine.ts (add to the analyzers array).
3. Add tests in tests/my-analyzer.test.ts.
4. If it needs a new config key, add it to PhishlensConfig in src/types.ts.
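
As a starting point, an analyzer might look roughly like the sketch below, which flags URLs that carry an explicit non-default port. The Analyzer<T> and Signal definitions here are stand-ins written for this example; the real interfaces live in the repository and may differ in field names and method signature, so check src/types.ts before copying anything.

```typescript
// Stand-in types for illustration only; the real Analyzer<T> and Signal
// interfaces in the PhishLens source may differ.
interface Signal {
  id: string;
  weight: number;
  message: string;
}

interface Analyzer<T> {
  readonly name: string;
  analyze(input: T): Promise<Signal[]>;
}

// Pure helper so the detection logic is easy to unit-test.
function detectNonStandardPort(rawUrl: string): Signal[] {
  let parsed: URL;
  try {
    parsed = new URL(rawUrl);
  } catch {
    return []; // not a parseable URL, so nothing to report
  }
  // URL.port is "" when the port is absent or is the scheme's default.
  if (parsed.port === "") return [];
  return [
    {
      id: "nonstandard-port",
      weight: 10, // hypothetical weight
      message: `URL uses explicit port ${parsed.port}`,
    },
  ];
}

class PortAnalyzer implements Analyzer<{ url: string }> {
  readonly name = "port";

  async analyze(input: { url: string }): Promise<Signal[]> {
    return detectNonStandardPort(input.url);
  }
}
```

Keeping the detection in a pure function means the test file from step 3 can cover it without network access or a running engine.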

License

MIT