@dsojevic/profanity-list

v1.0.0

A highly consumable list of profanities / bad words with severity ratings, exceptions, and tags.

Profanity List

This repository contains highly consumable lists of words and/or phrases that may be considered profane or inappropriate. The profanity lists come in two forms: JSON format that contains the words and phrases with associated meta information and a plain text format that contains one word or phrase per line.

Please note that most entries in the JSON lists started out with a Severity Level of 3 (Strong), before any solid classification was done. This is not necessarily an accurate reflection of how severe these words or phrases are commonly considered to be. Contributions that adjust these levels to bring them more in line with what you may expect are more than welcome.

Languages

| Name    | Code  | JSON       | Plain Text | Meta                                 |
| ------- | ----- | ---------- | ---------- | ------------------------------------ |
| English | en    | en.json    | en.txt     | 434 profanities, 809 matches, 6 tags |
| Emoji   | emoji | emoji.json | emoji.txt  | 7 profanities, 18 matches, 2 tags    |

Available Tags

| Name    | Code  | Tags      |
| ------- | ----- | --------- |
| English | en    | general   |
|         |       | lgbtq     |
|         |       | racial    |
|         |       | religious |
|         |       | sexual    |
|         |       | shock     |
| Emoji   | emoji | general   |
|         |       | sexual    |

JSON Format

The JSON format has a top level array containing objects with the following structure:

| Property      | Required? | Description |
| ------------- | --------- | ----------- |
| id            | Y         | String in lowercase characters used as a unique ID for the profanity. |
| match         | Y         | String in lowercase characters. Asterisks (*) can be used to indicate the previous character can have one or more appearances. Pipes (\|) can be used as a separator if matching multiple terms under this profanity. |
| severity      | Y         | Integer from 1 to 4 corresponding to a supported Severity Level. |
| tags          | N         | An array of lowercase strings to indicate how this profanity is tagged. These should be in English. |
| allow_partial | N         | Boolean value. Whether or not this profanity should be used with partial matching. Explicitly set to false when the match may otherwise have hundreds or thousands of exceptions. |
|               |           | Implementations should default to true if no attribute is present (i.e. the profanity can be used in partial matching). |
| exceptions    | N         | An array of lowercase strings indicating exceptions to this profanity if using partial matching. An asterisk (*) is used as a placeholder for the matched word. |
|               |           | _Example: sp* would be a valid exception ('sparse') for the profanity arse if relying on partial matching._ |

Severity Levels

| Value | Severity |
| ----- | -------- |
| 1     | Mild     |
| 2     | Medium   |
| 3     | Strong   |
| 4     | Severe   |
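
As a rough illustration of how a consumer might use the severity values and tags, the sketch below filters a list of entries. The helper name `filterEntries`, its options, and the sample entries are hypothetical and are not part of this package.

```javascript
// Severity values and labels as listed in the Severity Levels table.
const SEVERITY_LABELS = { 1: "Mild", 2: "Medium", 3: "Strong", 4: "Severe" };

// Hypothetical helper (not provided by the package): keep entries at or above
// `minSeverity`, optionally restricted to entries with at least one of `tags`.
function filterEntries(entries, { minSeverity = 1, tags = null } = {}) {
  return entries.filter((entry) => {
    if (entry.severity < minSeverity) return false;
    if (tags === null) return true;
    const entryTags = entry.tags ?? []; // `tags` is an optional property
    return entryTags.some((tag) => tags.includes(tag));
  });
}

// Made-up sample entries in the documented shape.
const sample = [
  { id: "plain-text", match: "plain text", severity: 1, tags: ["general"] },
  { id: "multiple-matches", match: "multiple|multipal", severity: 2 },
  { id: "elongated-words", match: "lo*ng", severity: 3, tags: ["shock"] },
];

const strongOrWorse = filterEntries(sample, { minSeverity: 3 });
console.log(strongOrWorse.map((e) => `${e.id} (${SEVERITY_LABELS[e.severity]})`));
// prints [ 'elongated-words (Strong)' ]
```

Filtering on the consumer side like this is one reason to adjust severities and tags rather than delete entries outright.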

Example Data Structure

[
  {
    "id": "plain-text",
    "match": "plain text",
    "severity": 1,
    "tags": ["insults", "anti-computer"],
    "exceptions": ["unusually *", "very *"]
  },
  {
    "id": "multiple-matches",
    "match": "multiple|multipal",
    "severity": 2,
    "tags": ["functionality"]
  },
  {
    "id": "elongated-words",
    "match": "lo*ng",
    "severity": 3,
    "tags": ["long-words"],
    "exceptions": ["*ing"]
  },
  {
    "id": "exact-match-only",
    "match": "en",
    "severity": 1,
    "tags": ["exact-words"],
    "allow_partial": false
  }
]
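
The property table in the JSON Format section can be turned into a simple structural check. The following is a minimal sketch only: `validateEntry` and `KNOWN_PROPERTIES` are hypothetical names, and flagging unknown properties is an assumption that is stricter than anything stated above.

```javascript
// Property names taken from the property table in the JSON Format section.
const KNOWN_PROPERTIES = ["id", "match", "severity", "tags", "allow_partial", "exceptions"];

// Hypothetical validator (not provided by the package): returns a list of
// human-readable problems for one entry; an empty list means it looks valid.
function validateEntry(entry) {
  const problems = [];
  const isLowerString = (v) => typeof v === "string" && v === v.toLowerCase();
  const isLowerStringArray = (v) => Array.isArray(v) && v.every(isLowerString);

  if (!isLowerString(entry.id)) problems.push("id must be a lowercase string");
  if (!isLowerString(entry.match)) problems.push("match must be a lowercase string");
  if (!Number.isInteger(entry.severity) || entry.severity < 1 || entry.severity > 4) {
    problems.push("severity must be an integer from 1 to 4");
  }
  if ("tags" in entry && !isLowerStringArray(entry.tags)) {
    problems.push("tags must be an array of lowercase strings");
  }
  if ("allow_partial" in entry && typeof entry.allow_partial !== "boolean") {
    problems.push("allow_partial must be a boolean");
  }
  if ("exceptions" in entry && !isLowerStringArray(entry.exceptions)) {
    problems.push("exceptions must be an array of lowercase strings");
  }
  // Assumption: unknown properties are treated as mistakes (likely typos).
  for (const key of Object.keys(entry)) {
    if (!KNOWN_PROPERTIES.includes(key)) problems.push(`unknown property: ${key}`);
  }
  return problems;
}

console.log(validateEntry({ id: "elongated-words", match: "lo*ng", severity: 3 }));
// prints []
console.log(validateEntry({ id: "x", match: "x", severity: 5, allow_partial: "false" }));
// reports the out-of-range severity and the non-boolean allow_partial
```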

Example Matching

| Profanity ID     | Sentence                              | Should Match? | Comment                                 |
| ---------------- | ------------------------------------- | ------------- | --------------------------------------- |
| plain-text       | I like plain text!                    | Y             | Exact match                             |
| plain-text       | I generally do plain texting.         | Y             | Partial match                           |
| plain-text       | Unusually plain text is weird...      | N             | Part of an exception                    |
| plain-text       | You have very plain text.             | N             | Part of an exception                    |
| plain-text       | Plain old sentence with text          | N             | Not found at all                        |
| multiple-matches | There are multiple ways to match.     | Y             | Exact match on first match option       |
| multiple-matches | I can spell multipal just fine, thx.  | Y             | Exact match on second match option      |
| multiple-matches | I'm using the word many instead...    | N             | Not found at all                        |
| elongated-words  | This is a long word.                  | Y             | Exact match                             |
| elongated-words  | Such a looooong wait!                 | Y             | Match on repeating 'o'                  |
| elongated-words  | I am longing for some food            | N             | Part of an exception                    |
| elongated-words  | Short words are the best!             | N             | Not found at all                        |
| exact-match-only | The language of this is en            | Y             | Exact match                             |
| exact-match-only | Ensure I send a pencil to the agency. | N             | Partial matching disallowed/discouraged |
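
One possible way to implement the matching rules illustrated in the table is sketched below. This is not the package's own matcher (the package ships data only, as far as this README describes); the function names are hypothetical. It assumes that an asterisk always follows a character, that a match is suppressed only when an exception fully covers it, and that `\b` word boundaries are good enough for "exact" matching (which suits English but not the emoji list).

```javascript
// Escape characters that are special in a JavaScript regular expression.
const escapeRegex = (s) => s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");

// Turn a `match` string into a regex source: "|" separates alternatives and
// "*" means "one or more of the previous character" (regex "+").
function matchToSource(match) {
  const alternatives = match
    .split("|")
    .map((term) => term.split("*").map(escapeRegex).join("+"));
  return `(?:${alternatives.join("|")})`;
}

// Collect [start, end) spans for every match of a global regex.
function spans(regex, text) {
  return [...text.matchAll(regex)].map((m) => [m.index, m.index + m[0].length]);
}

// Hypothetical matcher: true if `sentence` contains the profanity `entry`.
function findsProfanity(entry, sentence) {
  const text = sentence.toLowerCase();
  const term = matchToSource(entry.match);
  const allowPartial = entry.allow_partial ?? true; // documented default
  const termRegex = new RegExp(allowPartial ? term : `\\b${term}\\b`, "g");

  // In each exception, "*" stands in for the matched term.
  const excluded = (entry.exceptions ?? []).flatMap((exception) => {
    const source = exception.split("*").map(escapeRegex).join(term);
    return spans(new RegExp(source, "g"), text);
  });

  // A hit counts unless some exception span fully covers it.
  return spans(termRegex, text).some(
    ([start, end]) => !excluded.some(([s, e]) => s <= start && end <= e)
  );
}

const elongated = { id: "elongated-words", match: "lo*ng", severity: 3, exceptions: ["*ing"] };
console.log(findsProfanity(elongated, "Such a looooong wait!")); // true
console.log(findsProfanity(elongated, "I am longing for some food")); // false
```

Checking exceptions by span, rather than rejecting the whole sentence, means a sentence containing both an excepted use and a genuine use still matches.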

Plain Text Format

The plain text format contains one word or phrase per line and does not include any meta information on the matching such as asterisks to indicate one or more characters matched.

Example Data Structure

plain text
long
multiple
multipal
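
A plain text list can be consumed with very little code. The sketch below parses the file contents and does a whole-word check; the function names are hypothetical, and the whole-word strategy is an assumption, since the plain text format does not prescribe how matching should be done.

```javascript
// Parse the plain text format: one word or phrase per line, blank lines ignored.
function parsePlainList(contents) {
  return contents
    .split(/\r?\n/)
    .map((line) => line.trim().toLowerCase())
    .filter((line) => line.length > 0);
}

// Hypothetical whole-word check against the parsed phrases.
function containsListedPhrase(phrases, sentence) {
  const escapeRe = (s) => s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const text = sentence.toLowerCase();
  return phrases.some((p) => new RegExp(`\\b${escapeRe(p)}\\b`).test(text));
}

// The example list from above, inlined as a string instead of read from disk.
const phrases = parsePlainList("plain text\nlong\nmultiple\nmultipal\n");
console.log(phrases.length); // 4
console.log(containsListedPhrase(phrases, "I like plain text!")); // true
console.log(containsListedPhrase(phrases, "I am longing for some food")); // false
```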

Node Modules

List / Data Package

The lists are available as an npm package for use in JavaScript projects.

NPM installation:

npm install @dsojevic/profanity-list

Yarn installation:

yarn add @dsojevic/profanity-list

Contributing

Contributions to the profanity lists are welcome. As profanities are subjective by nature, please use your best judgement when it comes to adding to or updating these lists.

The source of truth for these lists is located in the JSON files in the src directory. The top level language files should be built from these source files using the ./bin/build.js command.

Adding Profanities

Adding new profanities to the lists and adding new lists for other languages are both highly encouraged. Please

Removing Profanities

Removal of items is discouraged. If you feel strongly that an item isn't profane or doesn't belong, it is preferred that you instead adjust the severity level of the item and/or update the tags associated with it. This allows consumers of the JSON formats to make those decisions for themselves.

Managing Tags

It is preferred that the overall number of tags in this data set is kept relatively small, so that consumers are not overwhelmed by the choices they have to make. Please open an issue to propose any new tags so that they can be discussed.

Tag usage should be kept consistent from language to language. Some cases may warrant a language (or set of languages) having a small number of tags that are only applicable to them, though this should only be for exceptional circumstances.

Severity Levels

As noted in the introduction, the default severity level for items in this list is 3 (Strong) as a starting point. Contributions that adjust these to levels that better reflect their severity are more than welcome. For example, if you believe an item is only mildly offensive to the common person, it can be downgraded to 1 (Mild). Conversely, if you think it is even more offensive, it can be upgraded to 4 (Severe).


Copyright (c) 2021 David Sojevic