gentext

v1.2.0

Published

2 months ago

Generates random English words ranked by real-world frequency from a one-billion-word corpus

iamursky

random english words text generation

gentext

Generates random English words ranked by real-world frequency from a one-billion-word corpus. Useful for typing practice, placeholder content, and testing.

Install

npm install gentext

Usage

import gentext from "gentext";

// 25 random words with default settings
gentext();

// 10 nouns only, drawn from the full dictionary
gentext({
  numberOfWords: 10,
  type: "nouns",
  frequencyThreshold: 0,
});

// 40 words, 80% nouns
gentext({
  numberOfWords: 40,
  nounToVerbRatio: 0.8,
});

// 25 words weighted toward words containing "an" or "in"
gentext({
  ngrams: ["an", "in"],
});
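To make the `nounToVerbRatio` example concrete, the arithmetic behind "40 words, 80% nouns" can be sketched as below. `splitByRatio` is a hypothetical helper, not part of gentext, and the rounding rule is an assumption; the package may round differently.

```javascript
// Hypothetical helper (not exported by gentext): splits a word count into
// noun and verb counts for a given noun-to-verb ratio.
// Assumption: the noun count is rounded to the nearest integer.
function splitByRatio(numberOfWords, nounToVerbRatio = 0.5) {
  if (nounToVerbRatio < 0 || nounToVerbRatio > 1) {
    throw new RangeError("nounToVerbRatio must be between 0 and 1");
  }
  const nouns = Math.round(numberOfWords * nounToVerbRatio);
  const verbs = numberOfWords - nouns; // the remainder goes to verbs
  return { nouns, verbs };
}

console.log(splitByRatio(40, 0.8)); // { nouns: 32, verbs: 8 }
```

Under this reading, a ratio of 1 yields only nouns and a ratio of 0 yields only verbs.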

Options

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| `type` | `"nouns" \| "verbs" \| "nouns-and-verbs"` | `"nouns-and-verbs"` | Which word types to include. |
| `numberOfWords` | `number` | `25` | Number of words to return (capped at the pool size; see below). |
| `frequencyThreshold` | `number` (0–1) | `1` | Controls how much the word pool is restricted to high-frequency words. `1` = only the most frequent words; `0` = the entire dictionary. |
| `nounToVerbRatio` | `number` (0–1) | `0.5` | Noun-to-verb ratio when `type` is `"nouns-and-verbs"`. Higher values produce more nouns. |
| `excludeWords` | `string[]` | `[]` | Words to exclude from the output. Replacements are drawn from the remaining dictionary so `numberOfWords` is still met. |
| `ngrams` | `string[]` | `[]` | Substrings to prioritise. Words containing each n-gram are weighted N, N-1, …, 1 so the first n-gram gets the most representation. Empty slots are backfilled with unrestricted words. |
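The N, N-1, …, 1 weighting described for `ngrams` can be illustrated with a small sketch. `ngramWeights` is a hypothetical helper, not part of gentext, and the `share` value assumes each n-gram's representation is proportional to its weight; the package's actual selection logic may differ.

```javascript
// Hypothetical helper (not exported by gentext): assigns each n-gram a weight
// of N, N-1, ..., 1 by position and derives a proportional share.
function ngramWeights(ngrams) {
  const n = ngrams.length;
  const total = (n * (n + 1)) / 2; // 1 + 2 + ... + N
  return ngrams.map((ngram, index) => {
    const weight = n - index; // first n-gram gets N, last gets 1
    return { ngram, weight, share: weight / total };
  });
}

// With three n-grams the weights are 3, 2, 1 out of a total of 6.
console.log(ngramWeights(["an", "in", "th"]));
```

For the two n-grams in the usage example, `["an", "in"]`, this gives weights 2 and 1, so `"an"` would account for roughly two thirds of the weighted words.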

Word limits

The built-in dictionaries contain a finite number of words:

| Constant | Value | Available when `type` is |
| --- | --- | --- |
| `MAX_NOUNS` | 2 886 | `"nouns"` |
| `MAX_VERBS` | 609 | `"verbs"` |
| `MAX_WORDS` | 3 495 | `"nouns-and-verbs"` |

If numberOfWords exceeds the available pool, the output is silently capped at the pool size. You can import the constants to check or enforce limits in your own code:

import gentext, { MAX_NOUNS, MAX_VERBS, MAX_WORDS } from "gentext";
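One way to enforce the limits rather than rely on silent capping is sketched below. `assertWithinLimit` is a hypothetical helper, not part of gentext. The constants are copied from the table above only to keep the snippet self-contained; real code would import them from the package. The actual pool may be smaller than these values when `frequencyThreshold` is above 0.

```javascript
// Values copied from the Word limits table for illustration; in real code use:
//   import { MAX_NOUNS, MAX_VERBS, MAX_WORDS } from "gentext";
const MAX_NOUNS = 2886;
const MAX_VERBS = 609;
const MAX_WORDS = 3495;

const LIMITS = {
  "nouns": MAX_NOUNS,
  "verbs": MAX_VERBS,
  "nouns-and-verbs": MAX_WORDS,
};

// Hypothetical helper: throws instead of letting the output be capped silently.
function assertWithinLimit(numberOfWords, type = "nouns-and-verbs") {
  const limit = LIMITS[type];
  if (limit === undefined) {
    throw new TypeError(`Unknown type: ${type}`);
  }
  if (numberOfWords > limit) {
    throw new RangeError(`Requested ${numberOfWords} words, but "${type}" has only ${limit}`);
  }
  return numberOfWords;
}

console.log(assertWithinLimit(600, "verbs")); // 600
```

The validated count can then be passed on as `numberOfWords`, so a request that is too large fails loudly instead of returning a shorter list.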

Data source

Word lists are derived from wordfrequency.info, which is based on the one-billion-word Corpus of Contemporary American English (COCA). The source describes COCA as the only corpus of English that is large, up-to-date, and balanced between many genres.

License

MIT
