@agentmarketing/payload-spam-filtering

v2.0.3

Published

a month ago

Payload CMS plugin for spam detection using Google Gemini AI. Includes init CLI, local pre-filter, result caching, and a Captured Spam admin collection.


agent-marketing

jake-holcroft

shane-farmer

payload payloadcms plugin spam detection gemini ai form security

Payload Spam Filtering

A spam detection plugin for Payload 3 that hooks into both site-form-submissions (Site Forms) and form-submissions (Payload form-builder). Catches obvious spam locally before any API call, caches Gemini results for repeated content, saves blocked submissions to a Captured Spam admin collection, and returns a friendly message to the end user.

By Shane Farmer / Agent Marketing
Questions or issues? Email [email protected].

Install

pnpm add @agentmarketing/payload-spam-filtering
# or: npm install @agentmarketing/payload-spam-filtering

Quick setup

1. Set environment variables

Add to your .env:

GEMINI_API_KEY=your_gemini_api_key_here
GEMINI_MODEL=gemini-2.5-flash
SPAM_DETECTION_BUSINESS_CONTEXT=Describe your business here
SPAM_DETECTION_STRICTNESS=2

2. Run init

From your Payload project root:

pnpm exec agent-spam-filter init
# or: npx agent-spam-filter init

This copies spamFilterPlugin.ts into src/plugins/ and patches src/plugins/index.ts to import and register it. Restart your dev server and you're done.

Re-running is safe and idempotent (skips files that already exist). Use --force to overwrite — a .bak copy is kept.

What `init` installs

`src/plugins/spamFilterPlugin.ts`

A Payload Plugin function that:

- Adds a capture-spam collection to your Payload config (visible in admin under Spam Detection)
- Registers a beforeChange hook on both site-form-submissions and form-submissions via onInit, so plugin order in your config does not matter

The hook runs checks in this order on every complete submission:

| Step | Cost | What happens |
| --- | --- | --- |
| 1. Disabled / no API key | n/a | Fail open, submission passes |
| 2. Content hash cache hit | Zero | Return stored result, no Gemini call |
| 3. Local pre-filter match | Zero | Block immediately, save to Captured Spam |
| 4. Gemini AI | API call | Classify, cache result, save to Captured Spam if spam |

When spam is detected, the submission is rejected with a 400 response that carries a friendly message for the end user, rather than failing silently.
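The check order above can be sketched as a single function. This is only an illustration of the sequencing: the type names, the `Deps` interface, and `runChecks` are hypothetical and are not taken from the plugin's source, and the confidence of 1 assigned to pre-filter matches is an assumption.

```typescript
// Hypothetical names throughout; only the order of the four steps comes from the README.
type Source = "disabled" | "cache" | "pre-filter" | "gemini";
type Verdict = { isSpam: boolean; confidence: number; reasoning: string; source: Source };

interface Deps {
  enabled: boolean;
  apiKey: string | undefined;
  cacheGet: (content: string) => Verdict | undefined;
  cacheSet: (content: string, verdict: Verdict) => void;
  preFilter: (content: string) => string | null; // matched rule name, or null
  classify: (content: string) => Promise<{ isSpam: boolean; confidence: number; reasoning: string }>;
  capture: (content: string, verdict: Verdict) => Promise<void>; // e.g. write to Captured Spam
}

async function runChecks(content: string, deps: Deps): Promise<Verdict> {
  // Step 1: disabled or missing API key, so fail open and let the submission pass.
  if (!deps.enabled || !deps.apiKey) {
    return { isSpam: false, confidence: 0, reasoning: "detection disabled", source: "disabled" };
  }

  // Step 2: a cache hit returns the stored decision without any API call.
  const cached = deps.cacheGet(content);
  if (cached) {
    return { ...cached, source: "cache" };
  }

  // Step 3: a local pre-filter match blocks immediately and is captured.
  const rule = deps.preFilter(content);
  if (rule !== null) {
    const blocked: Verdict = { isSpam: true, confidence: 1, reasoning: rule, source: "pre-filter" };
    await deps.capture(content, blocked);
    return blocked;
  }

  // Step 4: ask the model, cache the result, and capture it only if it is spam.
  const result = await deps.classify(content);
  const verdict: Verdict = { ...result, source: "gemini" };
  deps.cacheSet(content, verdict);
  if (verdict.isSpam) {
    await deps.capture(content, verdict);
  }
  return verdict;
}
```

Passing the cache, pre-filter, and classifier in as functions keeps the ordering testable without a Gemini key; the real plugin may wire these together differently.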

`capture-spam` collection

Saved to the Payload admin under the Spam Detection group. Each entry records:

| Field | Description |
| --- | --- |
| Form type | site-form-submissions or form-submissions |
| Form ID | The ID of the form that was submitted |
| Detected by | pre-filter or gemini |
| Confidence | 0.0 to 1.0 score |
| Reasoning | Gemini explanation or pre-filter rule name |
| Detected at | Timestamp |
| Submission data | Full field/value array as JSON |

How it hooks into Payload

The plugin uses onInit to attach hooks after all plugins have loaded, so it works regardless of plugin registration order.

- site-form-submissions: gated on data.status === 'complete' to skip drafts and partial multi-step sequences
- form-submissions: gated on operation === 'create'
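The two gates can be expressed as one predicate. This is a minimal sketch: `HookArgs` is a local stand-in for the `data` and `operation` arguments that a Payload beforeChange hook receives, and `shouldCheck` is a hypothetical helper name, not something the plugin is known to export.

```typescript
// Local stand-in for the subset of beforeChange hook arguments used here.
type HookArgs = {
  data: Record<string, unknown>;
  operation: "create" | "update";
};

// Hypothetical helper: decides whether a submission should be spam-checked.
function shouldCheck(collectionSlug: string, args: HookArgs): boolean {
  if (collectionSlug === "site-form-submissions") {
    // Skip drafts and partial multi-step sequences.
    return args.data.status === "complete";
  }
  if (collectionSlug === "form-submissions") {
    // Only check new submissions, not later edits.
    return args.operation === "create";
  }
  // Any other collection is left alone.
  return false;
}
```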

Local pre-filter

Catches unambiguous spam without any Gemini API call:

| Pattern | Examples |
| --- | --- |
| Nigerian prince / inheritance scams | "nigerian bank", "inheritance fund" |
| Lottery / prize scams | "you have won", "claim your prize" |
| Medical spam | "viagra", "cialis" |
| Generic scam greetings | "dear sir", "dear friend", "dear beneficiary" |
| Work-from-home scams | "earn" + "work from home" |
| Guaranteed profit / crypto | "guaranteed returns", "crypto guaranteed profit" |
| Excessive URLs | 4+ links in a single message |
| Excessive all-caps | Over 50% uppercase alpha chars, message over 40 chars |

Pre-filter results are saved to Captured Spam with source: pre-filter.
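To make the rules above concrete, here is a rough sketch of a pre-filter built only from the example phrases and thresholds in the table. The rule names, the URL regex, and the `preFilter` function itself are assumptions for illustration; the plugin's actual patterns are likely broader.

```typescript
// Illustrative rule names; the phrases are the examples listed in the table.
const PHRASE_RULES: Array<{ name: string; phrases: string[] }> = [
  { name: "inheritance-scam", phrases: ["nigerian bank", "inheritance fund"] },
  { name: "lottery-scam", phrases: ["you have won", "claim your prize"] },
  { name: "medical-spam", phrases: ["viagra", "cialis"] },
  { name: "scam-greeting", phrases: ["dear sir", "dear friend", "dear beneficiary"] },
  { name: "guaranteed-profit", phrases: ["guaranteed returns", "crypto guaranteed profit"] },
];

// Returns the name of the first matching rule, or null if nothing matches.
function preFilter(message: string): string | null {
  const lower = message.toLowerCase();

  for (const rule of PHRASE_RULES) {
    if (rule.phrases.some((phrase) => lower.includes(phrase))) {
      return rule.name;
    }
  }

  // Work-from-home scams need both terms. Plain substring matching is naive:
  // "earn" also matches inside "learn".
  if (lower.includes("earn") && lower.includes("work from home")) {
    return "work-from-home";
  }

  // Excessive URLs: four or more http(s) links in one message.
  const urls = message.match(/https?:\/\/\S+/gi) ?? [];
  if (urls.length >= 4) {
    return "excessive-urls";
  }

  // Excessive all-caps: over 50% of ASCII letters uppercase, message over 40 chars.
  const letters = message.match(/[A-Za-z]/g) ?? [];
  const upper = message.match(/[A-Z]/g) ?? [];
  if (message.length > 40 && letters.length > 0 && upper.length / letters.length > 0.5) {
    return "excessive-caps";
  }

  return null;
}
```

Returning the rule name fits the Reasoning field described for Captured Spam entries, which records the pre-filter rule name.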

Content hash cache

Results are cached by SHA-256 hash of the submission content. Identical submissions return the cached decision instantly with no API call.

- TTL: SPAM_DETECTION_CACHE_TTL_HOURS (default 24 hours)
- Max entries: 500 (oldest evicted)
- Saved to: {SPAM_DETECTION_STORAGE_PATH}/spam-detection-cache.json
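The caching behaviour described here (SHA-256 key, TTL, oldest-first eviction at a size cap) might look roughly like the sketch below. It is in-memory only and leaves out the JSON file persistence; `VerdictCache` and its method names are hypothetical, and how the plugin serialises submission content before hashing is not stated in this README.

```typescript
import { createHash } from "node:crypto";

type CachedVerdict = { isSpam: boolean; confidence: number; reasoning: string };
type Entry = { verdict: CachedVerdict; storedAt: number };

// Hypothetical in-memory cache keyed by the SHA-256 hash of the content.
class VerdictCache {
  private entries = new Map<string, Entry>();
  private ttlMs: number;
  private maxEntries: number;

  constructor(ttlMs: number = 24 * 60 * 60 * 1000, maxEntries: number = 500) {
    this.ttlMs = ttlMs;
    this.maxEntries = maxEntries;
  }

  static key(content: string): string {
    return createHash("sha256").update(content, "utf8").digest("hex");
  }

  get(content: string, now: number = Date.now()): CachedVerdict | undefined {
    const k = VerdictCache.key(content);
    const entry = this.entries.get(k);
    if (!entry) return undefined;
    if (now - entry.storedAt > this.ttlMs) {
      // Expired: drop it so the caller falls through to a fresh check.
      this.entries.delete(k);
      return undefined;
    }
    return entry.verdict;
  }

  set(content: string, verdict: CachedVerdict, now: number = Date.now()): void {
    const k = VerdictCache.key(content);
    // Delete first so a refreshed key moves to the newest position.
    this.entries.delete(k);
    this.entries.set(k, { verdict, storedAt: now });
    // A Map iterates in insertion order, so the first key is the oldest.
    while (this.entries.size > this.maxEntries) {
      const oldest = this.entries.keys().next().value;
      if (oldest === undefined) break;
      this.entries.delete(oldest);
    }
  }
}
```

Taking `now` as a parameter keeps expiry deterministic in tests; in normal use both methods fall back to `Date.now()`.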

Strictness levels

Set via SPAM_DETECTION_STRICTNESS env var:

| Level | Value | Behaviour |
| --- | --- | --- |
| Low | 1 | Only obvious spam: scams, lottery, medical. Very conservative. |
| Medium | 2 | Balanced: also blocks promotional language and generic greetings. Default. |
| High | 3 | Strict: only passes content that is clearly a specific business inquiry. Requires SPAM_DETECTION_BUSINESS_CONTEXT. |

Configurable Gemini model

The model is read from GEMINI_MODEL (default gemini-2.5-flash). Change it in .env without a code deploy.

Upgrading from 1.x

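1. Update the package to the 2.x release: npm install @agentmarketing/payload-spam-filtering@latest (npm update stays within the semver range in package.json, so it typically will not move a ^1.x dependency to 2.x)
2. Add to .env: GEMINI_MODEL=gemini-2.5-flash
3. Re-run init to get the capture-spam collection and updated hook behaviour: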

pnpm exec agent-spam-filter init --force

A .bak copy of your existing spamFilterPlugin.ts is kept before overwriting.

CLI reference

agent-spam-filter init [--force] [--skip-plugins] [--package name]

  --force          Overwrite existing scaffold file (keeps a .bak copy).
  --skip-plugins   Do not patch src/plugins/index.ts.
  --package name   Override the import path used in the patch.

Using `checkForSpam` directly

The core function is still available for custom integrations:

import { checkForSpam } from '@agentmarketing/payload-spam-filtering'

const result = await checkForSpam(
  [{ field: 'message', value: 'Hello, I need a quote.' }],
  { strictness: 2 }
)

if (result.isSpam) {
  // result.confidence  -- 0.0 to 1.0
  // result.reasoning   -- explanation
}

All env vars are read automatically. Pass options to override per-call.

Environment variable reference

| Variable | Required | Default | Description |
| --- | --- | --- | --- |
| GEMINI_API_KEY | Yes | none | Google AI Studio API key |
| GEMINI_MODEL | No | gemini-2.5-flash | Gemini model slug |
| SPAM_DETECTION_BUSINESS_CONTEXT | No | General business inquiries | Used at strictness 3 |
| SPAM_DETECTION_STRICTNESS | No | 2 | 1, 2, or 3 |
| SPAM_DETECTION_ENABLED | No | true | Set to false to disable |
| SPAM_DETECTION_CACHE_TTL_HOURS | No | 24 | Hours to keep cached results |
| SPAM_DETECTION_STORAGE_PATH | No | ./public/spam-detection | Folder for JSON result files |
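As a reference for how these variables and defaults fit together, the sketch below resolves them into a typed object. `SpamConfig` and `readConfig` are hypothetical names, and falling back to the default when a value is invalid is an assumption; the package may validate differently.

```typescript
type SpamConfig = {
  apiKey: string | undefined;
  model: string;
  businessContext: string;
  strictness: 1 | 2 | 3;
  enabled: boolean;
  cacheTtlHours: number;
  storagePath: string;
};

// Hypothetical resolver: applies the defaults listed in the table above.
function readConfig(env: Record<string, string | undefined>): SpamConfig {
  const rawStrictness = env.SPAM_DETECTION_STRICTNESS;
  // Assumption: anything other than "1" or "3" falls back to the default of 2.
  const strictness: 1 | 2 | 3 = rawStrictness === "1" ? 1 : rawStrictness === "3" ? 3 : 2;

  const ttl = Number.parseFloat(env.SPAM_DETECTION_CACHE_TTL_HOURS ?? "24");

  return {
    apiKey: env.GEMINI_API_KEY,
    model: env.GEMINI_MODEL ?? "gemini-2.5-flash",
    businessContext: env.SPAM_DETECTION_BUSINESS_CONTEXT ?? "General business inquiries",
    strictness,
    // Only the literal string "false" disables detection in this sketch.
    enabled: env.SPAM_DETECTION_ENABLED !== "false",
    // Assumption: non-numeric or non-positive values fall back to 24 hours.
    cacheTtlHours: Number.isFinite(ttl) && ttl > 0 ? ttl : 24,
    storagePath: env.SPAM_DETECTION_STORAGE_PATH ?? "./public/spam-detection",
  };
}
```

In a Node process this would be called as `readConfig(process.env)`; taking the environment as an argument keeps the function easy to test.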

License

MIT.

Contact

Agent Marketing · [email protected]
