sensitive-word-tool

v1.1.9

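A lightweight yet full-featured JavaScript library for handling sensitive words, built on the DFA algorithm. It filters out noise words before processing sensitive words. 🚀🚀🚀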

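Notes

This library is a tool for handling sensitive words, and it also ships with a set of default sensitive words. If you need them, refer to the project's default word list.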

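Performance

All tests below were run locally; a production environment should be somewhat faster.

The test strings are randomly generated Chinese characters, letters, and digits. Tests were run against a tree built from 20,000 random sensitive words, and each group was run 5 times and averaged.

| String length (characters) | Instantiation time | verify | match | filter |
| ----------- | ----------- | ----------- | ----------- | ----------- |
| 1000 | < 65ms | < 0.7ms | < 1.25ms | < 1.35ms |
| 5000 | < 65ms | < 0.75ms | < 10ms | < 9.5ms |
| 10000 | < 65ms | < 1.5ms | < 13.5ms | < 15ms |
| 20000 | < 65ms | < 1.5ms | < 22ms | < 24ms |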
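
The methodology described above (random test strings, five runs per group, averaged) can be reproduced with a small harness. The following is only a sketch, not the author's benchmark script: `randomString`, `averageMs`, and the character pool are hypothetical names chosen for illustration, and it assumes a runtime where `performance.now()` is available globally (a recent Node.js or a browser).

```javascript
// Hypothetical benchmark helpers; not part of sensitive-word-tool.

// Build a random string of `length` characters drawn from `pool`.
function randomString(length, pool = '汉字敏感词测试abcxyz0123456789') {
  const chars = Array.from(pool)
  let out = ''
  for (let i = 0; i < length; i++) {
    out += chars[Math.floor(Math.random() * chars.length)]
  }
  return out
}

// Run `fn` `runs` times and return the mean duration in milliseconds.
function averageMs(fn, runs = 5) {
  let total = 0
  for (let i = 0; i < runs; i++) {
    const start = performance.now()
    fn()
    total += performance.now() - start
  }
  return total / runs
}

// Example: time a stand-in workload on a 1000-character string.
const sample = randomString(1000)
const mean = averageMs(() => sample.includes('敏感词'))
console.log(`mean over 5 runs: ${mean.toFixed(3)}ms`)
```

To measure the library itself, the stand-in workload would be swapped for a call such as `sensitiveWordTool.match(sample)`.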

Installation

  • With npm
npm install sensitive-word-tool
  • With yarn
yarn add sensitive-word-tool
  • With pnpm
pnpm add sensitive-word-tool

Usage

Importing the package

  • CommonJS import
const { SensitiveWordTool } = require('sensitive-word-tool')
  • ES module import
import SensitiveWordTool from 'sensitive-word-tool'

Detecting sensitive words

  • Basic usage
import SensitiveWordTool from 'sensitive-word-tool'

// Use the default sensitive words at initialization
const sensitiveWordTool = new SensitiveWordTool({
  useDefaultWords: true
})
sensitiveWordTool.match("资金周转救市,股市圈钱崩盘?联系我们!") // ['资金周转', '救市', '股市圈钱', '崩盘']

// Add more sensitive words
sensitiveWordTool.addWords(['王八蛋', '王八羔子', '测试', '江南皮革厂'])

// 《》() are noise words and are ignored automatically
sensitiveWordTool.match('浙江温州,江南《皮革厂》老板王(八)蛋,带着小姨子跑了') // ['江南皮革厂', '王八蛋']
sensitiveWordTool.verify('浙江温州,江南《皮革厂》老板王(八)蛋,带着小姨子跑了') // true
sensitiveWordTool.filter('浙江温州,江南《皮革厂》老板王(八)蛋,带着小姨子跑了') // 浙江温州,**《***》老板*(*)*,带着小姨子跑了

sensitiveWordTool.match('皮革厂老板带着小姨子跑了') // []
sensitiveWordTool.verify('皮革厂老板带着小姨子跑了') // false
sensitiveWordTool.filter('皮革厂老板带着小姨子跑了') // 皮革厂老板带着小姨子跑了
  • Advanced usage
// Set the sensitive words at initialization
const sensitiveWordTool = new SensitiveWordTool({ wordList: ['王八蛋', '王八羔子', '测试', '江南皮革厂'] })

// More sensitive words can be added later
sensitiveWordTool.addWords(['小姨子'])
sensitiveWordTool.match('江南皮革厂老板带着小姨子跑了')  // ['江南皮革厂', '小姨子']

// The current sensitive words can be cleared
sensitiveWordTool.clearWords()
sensitiveWordTool.addWords(['江南皮革厂'])
sensitiveWordTool.match('江南皮革厂老板带着小姨子跑了')  // ['江南皮革厂']

// Noise words can be set explicitly (the defaults are used otherwise); during detection, noise words are removed from the text before matching
sensitiveWordTool.setNoiseWords(' $')
sensitiveWordTool.match('浙江温州,江南 皮革$厂老板王$八&蛋,带着小姨子跑了')  // ['江南皮革厂']
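
To make the matching behavior above more concrete, here is a minimal sketch of trie-based matching that ignores noise characters. It is not the library's source code: `TinyMatcher` and its methods are hypothetical, the noise set is a short stand-in for the library's much longer default, and the sketch skips noise characters during the walk rather than deleting them first, which is assumed to give the same matches. A trie walked one character at a time is the usual reading of "DFA" in this context, but the library's internals may differ.

```javascript
// Hypothetical sketch; not the source code of sensitive-word-tool.
class TinyMatcher {
  constructor(words, noiseWords = ' $《》()') {
    this.noise = new Set(noiseWords) // each character is one noise symbol
    this.root = new Map()            // trie root: char -> child node
    for (const word of words) this.add(word)
  }

  // Insert a word into the trie, marking its last node as terminal.
  add(word) {
    let node = this.root
    for (const ch of word) {
      if (!node.has(ch)) node.set(ch, new Map())
      node = node.get(ch)
    }
    node.isEnd = true
  }

  // Starting at `start`, walk the trie and return the end index (exclusive)
  // of the longest word found, or -1. Noise characters are skipped.
  scan(chars, start) {
    let node = this.root
    let end = -1
    for (let i = start; i < chars.length; i++) {
      const ch = chars[i]
      if (this.noise.has(ch)) continue
      node = node.get(ch)
      if (!node) break
      if (node.isEnd) end = i + 1
    }
    return end
  }

  // Collect every distinct sensitive word found in `text`.
  match(text) {
    const chars = Array.from(text)
    const found = new Set()
    let i = 0
    while (i < chars.length) {
      const end = this.noise.has(chars[i]) ? -1 : this.scan(chars, i)
      if (end === -1) { i++; continue }
      // Rebuild the word without the noise characters inside the span.
      found.add(chars.slice(i, end).filter(c => !this.noise.has(c)).join(''))
      i = end
    }
    return [...found]
  }
}

const matcher = new TinyMatcher(['王八蛋', '王八羔子', '测试', '江南皮革厂'])
console.log(matcher.match('浙江温州,江南《皮革厂》老板王(八)蛋,带着小姨子跑了'))
// [ '江南皮革厂', '王八蛋' ]
```

Because every step either follows one trie edge or stops, the cost of a scan is bounded by the length of the longest word plus any noise in between, regardless of how many words the trie holds.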

API

Constructor

Example
const sensitiveWordTool = new SensitiveWordTool({
  wordList: ['王八蛋', '王八羔子', '测试', '江南皮革厂'],
  noiseWords: ' $',
  useDefaultWords: true
})
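Parameters
  • wordList: optional. Sets the initial sensitive words. Default: []
  • noiseWords: optional. Sets the noise words; during detection, noise words are removed from the text being checked before matching. Default: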
 \t\r\n~!@#$%^&*()_+-=【】、{}|;\':",。、《》?αβγδεζηθικλμνξοπρστυφχψωΑΒΓΔΕΖΗΘΙΚΛΜΝΞΟΠΡΣΤΥΦΧΨΩ。,、;:?!…—·ˉ¨‘’“”々~‖∶"'`|〃〔〕〈〉《》「」『』.〖〗【】()[]{}ⅠⅡⅢⅣⅤⅥⅦⅧⅨⅩⅪⅫ⒈⒉⒊⒋⒌⒍⒎⒏⒐⒑⒒⒓⒔⒕⒖⒗⒘⒙⒚⒛㈠㈡㈢㈣㈤㈥㈦㈧㈨㈩①②③④⑤⑥⑦⑧⑨⑩⑴⑵⑶⑷⑸⑹⑺⑻⑼⑽⑾⑿⒀⒁⒂⒃⒄⒅⒆⒇≈≡≠=≤≥<>≮≯∷±+-×÷/∫∮∝∞∧∨∑∏∪∩∈∵∴⊥∥∠⌒⊙≌∽√§№☆★○●◎◇◆□℃‰€■△▲※→←↑↓〓¤°#&@\︿_ ̄―♂♀┌┍┎┐┑┒┓─┄┈├┝┞┟┠┡┢┣│┆┊┬┭┮┯┰┱┲┳┼┽┾┿╀╁╂╃└┕┖┗┘┙┚┛━┅┉┤┥┦┧┨┩┪┫┃┇┋┴┵┶┷┸┹┺┻╋╊╉╈╇╆╅╄
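  • useDefaultWords: whether to use the default sensitive words. Default: false. For the default sensitive words, refer to the project's default word list.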

.setNoiseWords

Sets the noise words. During detection, noise words are filtered out of the text being checked before matching.

Example
const sensitiveWordTool = new SensitiveWordTool()
sensitiveWordTool.setNoiseWords(' $')

.clearWords

Clears all sensitive words currently set.

Example
const sensitiveWordTool = new SensitiveWordTool()
sensitiveWordTool.clearWords()

.addWords

Adds more sensitive words.

Example
const sensitiveWordTool = new SensitiveWordTool()
sensitiveWordTool.addWords(['王八蛋', '王八羔子', '测试', '江南皮革厂'])

.match

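Finds all sensitive words that appear in the text. Returns an array of the matched sensitive words, or an empty array if there are none.

Example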
const sensitiveWordTool = new SensitiveWordTool()
sensitiveWordTool.addWords(['王八蛋', '王八羔子', '测试', '江南皮革厂'])
sensitiveWordTool.match('浙江温州,江南《皮革厂》老板王(八)蛋,带着小姨子跑了') // ['江南皮革厂', '王八蛋']

.verify

Checks whether the text contains any sensitive word. Returns true or false.

Example
const sensitiveWordTool = new SensitiveWordTool()
sensitiveWordTool.addWords(['王八蛋', '王八羔子', '测试', '江南皮革厂'])
sensitiveWordTool.verify('浙江温州,江南《皮革厂》老板王(八)蛋,带着小姨子跑了') // true

.filter

Replaces the sensitive words that appear in the text.

Example
const sensitiveWordTool = new SensitiveWordTool()
sensitiveWordTool.addWords(['王八蛋', '王八羔子', '测试', '江南皮革厂'])
sensitiveWordTool.filter('浙江温州,江南《皮革厂》老板王(八)蛋,带着小姨子跑了', '*') // 浙江温州,**《***》老板*(*)*,带着小姨子跑了
Parameters
sensitiveWordTool.filter(content)
sensitiveWordTool.filter(content, filterChar)
  • content: the text to check
  • filterChar: the character that replaces sensitive words. Default: *
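
One way to see why noise characters can survive in the filtered output is to track where each non-noise character came from. The sketch below is an illustration under assumptions, not the library's implementation: `filterSketch` is a hypothetical function, its noise set is a short stand-in for the real default, and it uses a naive scan instead of a DFA to keep the example short.

```javascript
// Hypothetical sketch; not the source code of sensitive-word-tool.
// Uses a naive scan instead of a DFA to keep the example short.
function filterSketch(content, words, filterChar = '*', noiseWords = '《》()') {
  const noise = new Set(noiseWords)
  const chars = Array.from(content)

  // Strip noise characters, remembering where each kept character came from.
  const kept = []   // characters that survive noise removal
  const origin = [] // origin[k] = index in `chars` of kept[k]
  chars.forEach((ch, i) => {
    if (!noise.has(ch)) {
      kept.push(ch)
      origin.push(i)
    }
  })

  // Find every occurrence of every word in the stripped text and mask
  // the corresponding positions in the original text.
  for (const word of words) {
    const w = Array.from(word)
    for (let k = 0; k + w.length <= kept.length; k++) {
      if (w.every((c, j) => kept[k + j] === c)) {
        for (let j = 0; j < w.length; j++) chars[origin[k + j]] = filterChar
      }
    }
  }
  return chars.join('')
}

console.log(filterSketch('江南《皮革厂》老板', ['江南皮革厂']))
// **《***》老板
console.log(filterSketch('做测试', ['测试'], '#'))
// 做##
```

Only the positions recorded in `origin` are ever overwritten, so the noise characters between them are left exactly as written.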