image-prompt-guard

v0.1.1

Published

3 months ago

Lightweight NSFW prompt filter for text-to-image AI services. Rule-based detection with keyword, phrase combination, and context scoring.

image-prompt-guard

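A lightweight, zero-dependency NSFW prompt filter designed for text-to-image AI services.

Rule-based detection through keyword matching, phrase-combination analysis, and context scoring, with no external API calls required.

Installation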

npm install image-prompt-guard

Quick start

import { createGuard } from 'image-prompt-guard'

const guard = createGuard({
  level: 'moderate',
  categories: ['sexual', 'violence', 'hate', 'drugs', 'minors'],
})

// Safe prompt
const result = guard.check('a beautiful sunset over the ocean')
// { safe: true, score: 0, flags: [] }

// NSFW prompt
const result2 = guard.check('naked woman in bedroom')
// {
//   safe: false,
//   score: 1,
//   flags: [
//     { category: 'sexual', trigger: 'naked', layer: 'keyword', score: 0.6 },
//     { category: 'sexual', trigger: 'naked + woman', layer: 'phrase', score: 0.4 },
//   ]
// }

Options

const guard = createGuard(options)

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| level | 'strict' \| 'moderate' \| 'loose' | 'moderate' | Detection strictness |
| categories | Category[] | all categories | Detection categories to enable |
| customKeywords | Partial<Record<Category, string[]>> | {} | Additional custom keywords |
| customPhrases | Partial<Record<Category, string[][]>> | {} | Additional custom phrase combinations |
| customPatterns | Partial<Record<Category, RegExp[]>> | {} | Additional custom regular expressions |
| allowlist | string[] | [] | Allowlisted phrases that always pass |

Strictness levels

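| Level | Threshold | Use case |
|-------|-----------|----------|
| strict | 0.3 | Child-friendly platforms, enterprise applications |
| moderate | 0.5 | General use (recommended) |
| loose | 0.7 | Permissive settings; blocks only clear violations |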
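
The threshold table can be read as a plain comparison. The following is a minimal sketch, assuming the verdict is `score < threshold` as the `GuardResult` comment states. `THRESHOLDS` and `isSafe` are illustrative names and not part of the package's API.

```typescript
type Level = 'strict' | 'moderate' | 'loose'

// Threshold values copied from the table above.
const THRESHOLDS: Record<Level, number> = {
  strict: 0.3,
  moderate: 0.5,
  loose: 0.7,
}

// Hypothetical helper: a prompt passes when its score is below the threshold.
function isSafe(score: number, level: Level = 'moderate'): boolean {
  return score < THRESHOLDS[level]
}

// A score of 0.4 (one phrase hit) is blocked under 'strict' but passes the others.
const verdicts = (['strict', 'moderate', 'loose'] as Level[]).map(
  (level) => `${level}: ${isSafe(0.4, level)}`,
)
// verdicts → ['strict: false', 'moderate: true', 'loose: true']
```

Because the comparison is strict, a score exactly equal to the threshold counts as unsafe under this reading.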

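Detection categories

| Category | Description | Severity |
|----------|-------------|----------|
| sexual | Nudity, pornographic content, suggestive descriptions | Standard |
| violence | Gore, violence, dismemberment | Standard |
| hate | Hate speech, racism, discriminatory content | Standard |
| drugs | Drug use, drug paraphernalia, drug dealing | Standard |
| minors | Any content that sexualizes minors | Zero tolerance (score doubled) |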
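
How the doubling for `minors` is applied internally is not documented here. One plausible reading, shown purely as a sketch, multiplies each flag's base score by a per-category weight. `CATEGORY_WEIGHT` and `weightedScore` are hypothetical names.

```typescript
type Category = 'sexual' | 'violence' | 'hate' | 'drugs' | 'minors'

// Assumed weights: 'minors' counts double, every other category is standard.
const CATEGORY_WEIGHT: Record<Category, number> = {
  sexual: 1,
  violence: 1,
  hate: 1,
  drugs: 1,
  minors: 2,
}

// Hypothetical helper: scale a layer's base score by the category weight.
function weightedScore(category: Category, baseScore: number): number {
  return baseScore * CATEGORY_WEIGHT[category]
}

const standard = weightedScore('violence', 0.4) // 0.4
const doubled = weightedScore('minors', 0.4)    // 0.8
```

Under this assumption, a single +0.4 hit in `minors` reaches 0.8, which is above even the `loose` threshold of 0.7.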

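Detection layers

Detection runs through a three-layer pipeline in order. Each layer scores independently, and the scores are added together:

1. Keyword matching (Keyword)

Matches blocklisted terms directly, with leet-speak normalization.

"nsfw photo" → matches "nsfw" → score +0.6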

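2. Phrase combination (Phrase)

Triggers when several words appear together, to catch cases where each word is harmless alone but risky in combination.

"nude woman" → "nude" and "woman" both present → score +0.4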

3. Regular expression (Pattern)

Uses regexes to match specific sentence structures.

"woman without clothes" → matches /(?:no|without)\s+clothes/ → score +0.5

Return value

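interface GuardResult {
  safe: boolean   // whether the prompt is safe (score < threshold)
  score: number   // risk score, 0 to 1
  flags: Flag[]   // details of each rule that fired
}

interface Flag {
  category: Category          // category that was triggered
  trigger: string             // keyword or phrase that triggered it
  layer: 'keyword' | 'phrase' | 'pattern'  // layer that triggered it
  score: number               // score contributed by this flag
}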
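
To make the three layers and the result shape concrete, here is a toy pipeline that reproduces the `naked woman in bedroom` output shown earlier. It is a sketch only: the rule lists, the per-layer scores (0.6, 0.4, 0.5, taken from the layer examples), the cap at 1, and the names `toyCheck`, `KEYWORDS`, `PHRASES`, and `PATTERNS` are assumptions, not the package's actual implementation.

```typescript
type Layer = 'keyword' | 'phrase' | 'pattern'
interface Flag { category: string; trigger: string; layer: Layer; score: number }
interface GuardResult { safe: boolean; score: number; flags: Flag[] }

// Illustrative rules only; the real word lists are not shown in this README.
const KEYWORDS = ['naked', 'nsfw']
const PHRASES = [['naked', 'woman'], ['nude', 'woman']]
const PATTERNS = [/\b(?:no|without)\s+clothes\b/]

function toyCheck(prompt: string, threshold = 0.5): GuardResult {
  const text = prompt.toLowerCase()
  const words = text.split(/\s+/)
  const flags: Flag[] = []

  // Layer 1: single blocklisted words.
  for (const kw of KEYWORDS) {
    if (words.includes(kw)) {
      flags.push({ category: 'sexual', trigger: kw, layer: 'keyword', score: 0.6 })
    }
  }
  // Layer 2: every word of a combination must be present.
  for (const combo of PHRASES) {
    if (combo.every((w) => words.includes(w))) {
      flags.push({ category: 'sexual', trigger: combo.join(' + '), layer: 'phrase', score: 0.4 })
    }
  }
  // Layer 3: regexes for sentence structures.
  for (const re of PATTERNS) {
    const match = re.exec(text)
    if (match) {
      flags.push({ category: 'sexual', trigger: match[0], layer: 'pattern', score: 0.5 })
    }
  }

  // Sum the per-flag scores and cap at 1 (assumed, since score is documented as 0 to 1).
  const score = Math.min(1, flags.reduce((sum, f) => sum + f.score, 0))
  return { safe: score < threshold, score, flags }
}

const blocked = toyCheck('naked woman in bedroom')
// blocked.safe === false, blocked.score === 1, two flags (keyword, then phrase)
const clean = toyCheck('a beautiful sunset over the ocean')
// clean → { safe: true, score: 0, flags: [] }
```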

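Text normalization

The built-in normalization engine handles common evasion techniques automatically, so callers do not need to preprocess input:

| Evasion technique | Example | Normalized result |
|-------------------|---------|-------------------|
| Case | NSFW | nsfw |
| Leet-speak | nud3 | nude |
| Repeated characters | fuuuck | fuck |
| Inserted separators | n.u.d.e | nude |
| Unicode variants | Cyrillic lookalike letters | Restored to Latin |
| Diacritics | é | e |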
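
The package exports its own `normalize`, whose internals are not shown in this README. The sketch below only illustrates how the six transformations in the table could be chained. `toyNormalize` and both lookup maps are hypothetical and deliberately tiny.

```typescript
// Small illustrative maps; a real filter would need far larger tables.
const LEET: Record<string, string> = {
  '0': 'o', '1': 'i', '3': 'e', '4': 'a', '5': 's', '7': 't', '@': 'a', '$': 's',
}
const CYRILLIC: Record<string, string> = {
  '\u0430': 'a', '\u0435': 'e', '\u043e': 'o', '\u0440': 'p', '\u0441': 'c', '\u0445': 'x',
}

function toyNormalize(input: string): string {
  return input
    .toLowerCase()
    // Decompose accented letters, then drop the combining marks (é → e).
    .normalize('NFD')
    .replace(/[\u0300-\u036f]/g, '')
    // Map Cyrillic lookalikes back to Latin letters.
    .replace(/[\u0430\u0435\u043e\u0440\u0441\u0445]/g, (c) => CYRILLIC[c] ?? c)
    // Remove separators between single characters (n.u.d.e → nude).
    .replace(/\b(?:\w[.\-_*])+\w\b/g, (m) => m.replace(/[.\-_*]/g, ''))
    // Undo leet-speak, but only in tokens that also contain a letter.
    .replace(/\S+/g, (tok) =>
      /[a-z]/.test(tok) ? tok.replace(/[013457@$]/g, (c) => LEET[c] ?? c) : tok,
    )
    // Collapse runs of three or more identical characters (fuuuck → fuck).
    .replace(/(.)\1{2,}/g, '$1')
}

const samples = ['NSFW', 'nud3', 'fuuuck', 'n.u.d.e'].map(toyNormalize)
// samples → ['nsfw', 'nude', 'fuck', 'nude']
```

The order matters in this sketch: separators are removed before the leet-speak step, and leet-speak is only undone inside tokens that contain a letter so that plain numbers such as "3 cats" are left alone.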

Custom extensions

Custom keywords

const guard = createGuard({
  customKeywords: {
    sexual: ['my-custom-word', 'another-blocked-term'],
    violence: ['custom-violence-word'],
  },
})

Custom phrase combinations

Triggers when every term in the array appears in the prompt.

const guard = createGuard({
  customPhrases: {
    sexual: [
      ['word-a', 'word-b'],           // fires only when word-a and word-b both appear
      ['word-x', 'word-y', 'word-z'], // fires only when all three words appear
    ],
  },
})

Custom regular expressions

const guard = createGuard({
  customPatterns: {
    violence: [/\bcustom\s+pattern\b/i],
  },
})

Allowlist

Prevents false positives on legitimate phrases.

const guard = createGuard({
  allowlist: ['nude palette', 'killer app', 'drop dead gorgeous'],
})

guard.check('a nude palette for makeup')  // { safe: true }
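
The README does not say how allowlisted phrases are honored internally. One possible approach, sketched here under that caveat, is to blank out allowlisted phrases before the detection layers run, so the rest of the prompt is still checked. `maskAllowlisted` and `containsNude` are hypothetical helpers.

```typescript
// Hypothetical helper: blank out allowlisted phrases so later checks ignore them.
function maskAllowlisted(prompt: string, allowlist: string[]): string {
  let text = prompt.toLowerCase()
  for (const phrase of allowlist) {
    text = text.split(phrase.toLowerCase()).join(' ')
  }
  return text
}

// Stand-in for the keyword layer, using a single illustrative keyword.
const containsNude = (text: string): boolean => /\bnude\b/.test(text)

const allowlist = ['nude palette', 'killer app']
const prompt = 'a nude palette for makeup'

const hitWithoutMask = containsNude(prompt)                           // true
const hitAfterMask = containsNude(maskAllowlisted(prompt, allowlist)) // false
const stillCaught = containsNude(
  maskAllowlisted('nude woman with a nude palette', allowlist),
) // true
```

Masking rather than returning early means a prompt that mixes an allowlisted phrase with other risky wording is still flagged in this sketch. Whether the package behaves the same way should be verified against its source.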

Integration examples

Express middleware

import express from 'express'
import { createGuard } from 'image-prompt-guard'

const app = express()
const guard = createGuard({ level: 'strict' })

app.post('/api/generate', express.json(), (req, res) => {
  const { prompt } = req.body
  const result = guard.check(prompt)

  if (!result.safe) {
    return res.status(403).json({
      error: 'Prompt contains inappropriate content',
      flags: result.flags,
    })
  }

  // Prompt is safe; continue with image generation...
})

Pairing with an AI second pass

This package works well as a fast first-pass filter, paired with an AI model for deeper semantic judgment:

import { createGuard } from 'image-prompt-guard'

const guard = createGuard({ level: 'moderate' })

async function validatePrompt(prompt: string) {
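  // Pass 1: fast rule-based filter (< 1ms)
  const result = guard.check(prompt)
  if (!result.safe) return { allowed: false, reason: result.flags }

  // Pass 2: send to an AI model for semantic review (only prompts that passed pass 1)
  // aiModerationCheck is a placeholder for your own moderation call
  const aiResult = await aiModerationCheck(prompt)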
  return aiResult
}
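
The two-pass flow above can be tried without any external service by injecting stand-ins for both checks. In this sketch `ruleCheck`, `aiCheck`, `Verdict`, and the `aiCalls` counter are all hypothetical. In real use `ruleCheck` would wrap `guard.check` and `aiCheck` would call your moderation provider.

```typescript
interface Verdict { allowed: boolean; reason?: string }

// Hypothetical stand-ins: swap in guard.check and a real moderation call.
type RuleCheck = (prompt: string) => { safe: boolean }
type AiCheck = (prompt: string) => Promise<Verdict>

let aiCalls = 0
const ruleCheck: RuleCheck = (prompt) => ({ safe: !/\bnsfw\b/i.test(prompt) })
const aiCheck: AiCheck = async (prompt) => {
  aiCalls += 1 // count calls to show that blocked prompts never get here
  return { allowed: prompt.length < 500 }
}

async function validatePrompt(prompt: string): Promise<Verdict> {
  // Pass 1: cheap, synchronous rules. Rejected prompts never reach the AI.
  if (!ruleCheck(prompt).safe) {
    return { allowed: false, reason: 'rule-based filter' }
  }
  // Pass 2: the slower semantic check, only for prompts that passed pass 1.
  return aiCheck(prompt)
}
```

Calling `validatePrompt('nsfw photo')` resolves to a rejection without incrementing `aiCalls`, while `validatePrompt('a sunset over the ocean')` goes on to the second pass. Keeping the expensive call behind the rule-based filter is the main point of the design.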

Exports

import {
  createGuard,   // creates a guard instance
  normalize,     // text normalization utility (usable on its own)
} from 'image-prompt-guard'

// Type exports
import type {
  GuardOptions,  // options accepted by createGuard
  GuardResult,   // return value of check()
  Flag,          // a single triggered flag
  Category,      // 'sexual' | 'violence' | 'hate' | 'drugs' | 'minors'
  Level,         // 'strict' | 'moderate' | 'loose'
} from 'image-prompt-guard'

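Notes

This package is a rule-based filter suited to a fast first line of defense. Pair it with AI semantic detection or image-level NSFW detection.
The word lists are primarily English. For other languages, extend them via customKeywords / customPhrases / customPatterns.
The minors category follows a zero-tolerance policy: scores are doubled automatically to give it the highest blocking priority.

License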

MIT
