elemental-tokens

v0.1.0

Published

14 days ago

Human-transcribable, LLM-stable tokens built from chemical element symbols

nibsbin

token tokens csprng crypto entropy elements periodic-table bip39 identifier slug

Fe-Au-Rn-Cu-Xe

Five symbols off the periodic table. Easy to read aloud, easy to type, easy for a language model to echo back without mangling — and backed by a cryptographically secure RNG. Each symbol is worth a clean 6.7 bits of entropy, so the whole token is just length × 6.7 bits of unpredictability with nothing to memorize.

Why element symbols?

A good short identifier has to survive being spoken, typed, dictated to a phone, or round-tripped through an LLM. Random alphanumeric strings (q7f2k9) and UUIDs fail that test: look-alike characters such as l/1/I and 0/O get confused, and a model has no prior for an arbitrary string.

Element symbols are different:

A closed, formally specified set. No synonyms, no spelling variants, no "correct" value a model would prefer. Rn means Rn.
Strong LLM priors. The periodic table is learned as a single complete artifact — including the superheavy elements — so models reproduce symbols reliably and detect typos against a known vocabulary.
Invalid symbols are detectable before any database lookup. Xx is simply not an element.
Uniform two-character width. Every symbol is exactly two characters, giving a transcriber positional anchors and removing drop/merge errors.

The vocabulary is the 104 two-letter symbols only. All 14 single-letter symbols (H, B, C, N, O, F, P, S, K, V, W, Y, I, U) are excluded — a lone C floating next to Fe is exactly the kind of thing that gets dropped or merged. 104 symbols is log2(104) ≈ 6.7 bits apiece.
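The count is easy to check by hand. The sketch below types out all 118 element symbols (an illustrative list, not copied from the package source), keeps the two-letter ones, and computes the bits per symbol. The names `ALL_SYMBOLS`, `twoLetter`, and `oneLetter` are made up for this example.

```javascript
// All 118 element symbols in atomic-number order.
// Illustrative list typed out for this example, not the package's own data.
const ALL_SYMBOLS = [
  "H He Li Be B C N O F Ne Na Mg Al Si P S Cl Ar",
  "K Ca Sc Ti V Cr Mn Fe Co Ni Cu Zn Ga Ge As Se Br Kr",
  "Rb Sr Y Zr Nb Mo Tc Ru Rh Pd Ag Cd In Sn Sb Te I Xe",
  "Cs Ba La Ce Pr Nd Pm Sm Eu Gd Tb Dy Ho Er Tm Yb Lu",
  "Hf Ta W Re Os Ir Pt Au Hg Tl Pb Bi Po At Rn",
  "Fr Ra Ac Th Pa U Np Pu Am Cm Bk Cf Es Fm Md No Lr",
  "Rf Db Sg Bh Hs Mt Ds Rg Cn Nh Fl Mc Lv Ts Og",
].join(" ").split(" ");

// Split the table by symbol width.
const twoLetter = ALL_SYMBOLS.filter((s) => s.length === 2);
const oneLetter = ALL_SYMBOLS.filter((s) => s.length === 1);

console.log(ALL_SYMBOLS.length);                      // 118
console.log(oneLetter.join(" "));                     // H B C N O F P S K V Y I W U
console.log(twoLetter.length);                        // 104
console.log(Math.log2(twoLetter.length).toFixed(2));  // 6.70
```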

Install

npm install elemental-tokens

Ships ESM + CommonJS + TypeScript types. Runs on Node.js ≥ 18 and in the browser (uses the Web Crypto API via globalThis.crypto).

Quick start

import { generate, validate } from "elemental-tokens";

const token = generate();        // "Fe-Au-Rn-Cu-Xe"
validate(token);                 // true
validate("fe-au-rn-cu-xe");      // false  (case-sensitive)
validate("Xx-Au-Rn-Cu-Xe");      // false  (Xx is not an element)

CommonJS:

const { generate, validate } = require("elemental-tokens");

API

`generate(options?): string`

Generates a token. Each symbol is drawn uniformly from the vocabulary with a CSPRNG and rejection sampling — that's the full 6.7 bits per symbol, no modulo bias shaving anything off.

| Option    | Type     | Default          | Description                          |
| --------- | -------- | ---------------- | ------------------------------------ |
| length    | number   | 5                | Number of symbols. Positive integer. |
| delimiter | string   | "-"              | String placed between symbols.       |
| symbols   | string[] | the 104 elements | Override the vocabulary entirely.    |

generate();                                   // "Mg-Sc-Pb-Re-Nd"
generate({ length: 8 });                      // 8 symbols ≈ 53.6 bits
generate({ delimiter: "" });                  // "MgScPbReNd"
generate({ symbols: ["aa", "bb", "cc"] });    // custom vocabulary

Throws RangeError if length is not a positive integer, or if symbols is provided but not a non-empty array.

`validate(token, options?): boolean`

Returns true only if every delimiter-separated segment is a known symbol. Strict and case-sensitive — "Fe-Au" is valid, "fe-au" is not. No trimming, no checksum. Returns false for an empty token or any empty segment (leading, trailing, or doubled delimiters). Pass the same delimiter / symbols you generated with.

validate("Fe-Au-Rn-Cu-Xe");                       // true
validate("Fe.Au", { delimiter: "." });            // true
validate("aa-bb", { symbols: ["aa", "bb"] });     // true
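The rules above are simple enough to restate as code. This is a hypothetical re-implementation for illustration, not the package's source; `validateSketch` and `vocab` are made-up names, and the sketch assumes a non-empty delimiter and a vocabulary with no empty strings.

```javascript
// Hypothetical restatement of the documented validation rules.
function validateSketch(token, symbols, delimiter = "-") {
  if (typeof token !== "string" || token.length === 0) return false;
  const known = new Set(symbols);
  // split() yields "" for leading, trailing, or doubled delimiters.
  // "" is not in the vocabulary, so those tokens fail here.
  return token.split(delimiter).every((segment) => known.has(segment));
}

const vocab = ["Fe", "Au", "Rn", "Cu", "Xe"]; // small stand-in vocabulary

console.log(validateSketch("Fe-Au-Rn-Cu-Xe", vocab)); // true
console.log(validateSketch("fe-au-rn-cu-xe", vocab)); // false (case-sensitive)
console.log(validateSketch("Fe--Au", vocab));         // false (empty segment)
console.log(validateSketch("Fe-Au-", vocab));         // false (trailing delimiter)
console.log(validateSketch(" Fe-Au", vocab));         // false (no trimming)
console.log(validateSketch("Fe.Au", vocab, "."));     // true
```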

Exported constants

import { ELEMENT_SYMBOLS, SYMBOL_COUNT, BITS_PER_SYMBOL } from "elemental-tokens";

SYMBOL_COUNT;     // 104
BITS_PER_SYMBOL;  // 6.700439718141092

Entropy math

The unit of account is one symbol = log2(104) ≈ **6.7 bits**. A token is just a stack of these 6.7-bit quanta, so picking a length is picking a strength:

| Length | Entropy     | Notes                 |
| ------ | ----------- | --------------------- |
| 4      | ≈ 26.8 bits |                       |
| 5      | ≈ 33.5 bits | default (5 × 6.7)     |
| 6      | ≈ 40.2 bits |                       |
| 8      | ≈ 53.6 bits |                       |
| 12     | ≈ 80.4 bits |                       |
| 20     | ≈ 134 bits  | ~128-bit-class secret |

To hit a target strength, divide by 6.7 and round up: 128 bits ÷ 6.7 ≈ 19.1, so 20 symbols.
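The same arithmetic as a small helper, in case you want to size tokens programmatically. `symbolsForBits` and `entropyBits` are hypothetical names, not part of the package's API; the vocabulary size of 104 comes from the text above.

```javascript
const VOCAB_SIZE = 104; // two-letter element symbols

// Smallest number of symbols whose entropy meets or exceeds targetBits.
function symbolsForBits(targetBits, vocabSize = VOCAB_SIZE) {
  return Math.ceil(targetBits / Math.log2(vocabSize));
}

// Entropy in bits of a token with the given number of symbols.
function entropyBits(length, vocabSize = VOCAB_SIZE) {
  return length * Math.log2(vocabSize);
}

console.log(symbolsForBits(128));        // 20
console.log(symbolsForBits(80));         // 12
console.log(entropyBits(5).toFixed(1));  // 33.5
console.log(entropyBits(20).toFixed(1)); // 134.0
```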

Threat model

elemental-tokens is designed as a security-load-bearing identifier for short-lived tokens under aggressive rate limiting — think a token that's valid for five minutes and gets at most a few dozen guesses before lockout.

CSPRNG only. Randomness comes from crypto.getRandomValues (Web Crypto, present in Node ≥ 18 and browsers). Math.random is never used.
No modulo bias. Sampling uses rejection sampling, so each of the 104 symbols is equally likely and the per-symbol entropy really is the full 6.7 bits — not "6.7 bits minus a sliver." (The 256-byte enumeration test proves each symbol is reachable from exactly two byte values.)
The whole vocabulary is public. Observing tokens teaches an attacker nothing; the search space is fixed at 104^length.
No checksum, by design. Error detection is the database lookup — an unknown or malformed token fails immediately. If you need offline typo detection, use a longer token or add your own check digit.
Pick length for your threat model. The default 33.5 bits is comfortable for rate-limited, short-lived tokens: 50 guesses in 5 minutes succeed with probability about 50 / 2^33.5, roughly 1 in 240 million per window. It is not sized for offline brute force; for secrets that must resist that, use length ≥ 12 (≈ 80 bits) or ≥ 20 (≈ 128 bits).
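To size a token against your own rate limit, the guessing odds can be worked out directly from the search space of 104^length. The helper below is a sketch with a hypothetical name (`guessOdds`), and it assumes the attacker is targeting one specific valid token with distinct guesses.

```javascript
// Odds that `guesses` distinct guesses hit one specific token drawn
// uniformly from vocabSize ** length possibilities.
function guessOdds(length, guesses, vocabSize = 104) {
  const space = vocabSize ** length;
  return {
    space,                                     // total number of possible tokens
    probability: Math.min(1, guesses / space), // chance of a hit in this window
    oneIn: space / guesses,                    // same chance expressed as "1 in N"
  };
}

const { space, oneIn } = guessOdds(5, 50);
console.log(space);             // 12166529024
console.log(Math.round(oneIn)); // 243330580
```

If many tokens are valid at the same time, a guess succeeds when it matches any of them, so the odds get worse for the defender roughly in proportion to the number of live tokens. Factor that in when choosing a length.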

Comparison to a BIP39 wordlist

BIP39 is the closest well-known relative: a fixed vocabulary mapped to entropy.

|                          | BIP39                        | elemental-tokens                 |
| ------------------------ | ---------------------------- | -------------------------------- |
| Vocabulary size          | 2048 words                   | 104 symbols                      |
| Entropy per unit         | 11 bits/word                 | 6.7 bits/symbol                  |
| Unit length              | 3–8 letters                  | exactly 2 characters             |
| Checksum                 | yes (built into seed phrase) | no (lookup is the check)         |
| Designed for             | 128–256-bit seed phrases     | short, dictation-friendly tokens |
| Units to reach ~128 bits | ~12 words                    | ~20 symbols                      |
| Autocorrect risk         | high (real English words)    | low (not dictionary words)       |

BIP39 gets shorter sequences by using a much larger, English-word vocabulary, and it adds a checksum. elemental-tokens gives those up for fixed-width, two-character atoms that are faster to read aloud, internationally recognizable, and stable through an LLM, at a tidy 6.7 bits each.

TypeScript

Types ship with the package:

import { generate, type GenerateOptions } from "elemental-tokens";

const opts: GenerateOptions = { length: 8, delimiter: "-" };
const token = generate(opts);
