text-wrap-minor-words
v0.3.1
Polyfill/extension for CSS text-wrap: pretty to bias against breaks after minor words and apply safe typographic joins.
text-wrap-minor-words
Experimental, CSS-first polyfill that augments text-wrap: pretty by biasing against line breaks immediately after minor words (articles, prepositions, short conjunctions) in languages where this is a widely accepted typesetting convention. It also applies a couple of safe, language-agnostic joins (e.g., Fig. 2, 20 °C).
Status: experimental. See explainer.md. Live demo: https://jlorenzetti.github.io/text-wrap-minor-words/
Motivation (lean)
text-wrap: pretty improves paragraph breaking but does not let authors express locale-aware preferences about breaking immediately after minor words. Many European languages treat this as a common editorial convention even in body text. This library offers a CSS-first polyfill so authors can experiment today and help inform standardization.
Install
npm i text-wrap-minor-words
Usage (ESM)
import { init } from 'text-wrap-minor-words';
// Process elements that compute to text-wrap: pretty
const ctrl = init({ observe: true });
// Optionally process a specific subtree later:
// ctrl.process(element);
HTML markup should declare the language (lang) on blocks:
<main class="typo">
<p lang="it">Vado a casa con la bici.</p>
<p lang="fr">Je vais à Paris.</p>
<p lang="pl">Jestem w domu i czekam.</p>
<p lang="en">See Fig. 2 for details.</p>
<p lang="en">It was 20 °C at 9:30 am.</p>
<h2 lang="en">A display heading if you want to opt-in later</h2>
<!-- The library acts only where text-wrap: pretty is in effect -->
<!-- NBSP is inserted where appropriate; content remains otherwise intact. -->
<!-- Apostrophes/elision are intentionally out of scope for now. -->
</main>
Usage (Browser global)
<script src="https://cdn.jsdelivr.net/npm/[email protected]/dist/index.global.js"></script>
<script>
const ctrl = TextWrapMinorWords.init({ observe: true });
// ctrl.process(document.querySelector('.typo'));
// ctrl.disconnect();
</script>
What it does
- Extends text-wrap: pretty behavior by inserting NBSP after minor words in languages where this is customary (Romance, Slavic, Greek by default).
- Applies safe joins regardless of language:
  - label + number: Fig. 2, p. 12, § 5
  - number + unit: 20 °C, 9:30 am
  - honorific + Name: Mr. Smith, Dr. Müller
  - initials sequence: J. K. Rowling
  - numeric ranges: adds WORD JOINER around the dash
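The rules above can be illustrated at the string level. The following is a minimal sketch, not the library's implementation: the function names, the word list passed in, and the unit list are all hypothetical and exist only to show what "glue" means in practice (NBSP is U+00A0, WORD JOINER is U+2060).

```typescript
const NBSP = "\u00A0";
const WORD_JOINER = "\u2060";

// Hypothetical helper: replace the ordinary space after a listed minor word
// with NBSP so a line cannot break right after it.
function glueMinorWords(text: string, minorWords: readonly string[]): string {
  const wanted = new Set(minorWords.map((w) => w.toLowerCase()));
  // A whole word (not preceded by a letter or digit), one plain space,
  // and then some non-space character.
  return text.replace(
    /(?<![\p{L}\p{N}])(\p{L}+) (?=\S)/gu,
    (match: string, word: string) =>
      wanted.has(word.toLowerCase()) ? word + NBSP : match
  );
}

// Hypothetical safe join: number + unit. The unit list is an assumption.
function joinNumberUnit(text: string): string {
  return text.replace(
    /(\d) (?=(?:°C|°F|km|kg|am|pm)(?!\p{L}))/gu,
    `$1${NBSP}`
  );
}

// Hypothetical safe join: WORD JOINER on both sides of an en dash
// that sits between two digits.
function joinNumericRange(text: string): string {
  return text.replace(
    /(\d)\u2013(\d)/g,
    `$1${WORD_JOINER}\u2013${WORD_JOINER}$2`
  );
}
```

Because each replacement consumes the plain space (or the bare digit-dash-digit sequence) it matches, running any of these functions a second time leaves the text unchanged, which is the sense in which the insertions are idempotent.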
Lite usage (single‑locale, recommended for production)
Load only the core engine and register the locales you actually use. This keeps bundles small and avoids shipping unnecessary language data.
// Load the lite entry (no locales included)
import { init, registerLanguage } from 'text-wrap-minor-words/lite';
// Register only the locales you need (example: Italian)
import it from 'text-wrap-minor-words/locales/it.json';
registerLanguage('it', it);
// Optionally preload the same tags here (helps the engine avoid a first lookup)
init({ languages: ['it'] });
Browser global (lite):
<script type="module">
import { init, registerLanguage } from 'https://cdn.jsdelivr.net/npm/[email protected]/dist/lite.mjs';
import it from 'https://cdn.jsdelivr.net/npm/[email protected]/locales/it.json' assert { type: 'json' };
registerLanguage('it', it);
init({ languages: ['it'] });
</script>
Notes:
- The default (non‑lite) entry includes built‑in locale data for quick trials. Prefer the lite entry in production apps.
- You can register multiple locales by calling registerLanguage(tag, data) more than once.
Advanced (CSS opt‑in per container):
You can enable the minor‑words preference declaratively on specific elements via CSS. This is useful for languages that receive only safe joins by default, where you want the minor‑words behavior only in display contexts such as headings.
h1[lang="en"], h2[lang="en"] {
--text-wrap-preferences: minor-words;
--text-wrap-minor-threshold: 1; /* glue after 1‑letter tokens */
--text-wrap-minor-stoplist: "of to in on at for by a I"; /* optional additions */
}
These custom properties are read when the preference is opted‑in on the element (or an ancestor) and the current language doesn’t have a built‑in minor‑words configuration.
Language defaults
- Active by default: be, bg, ca, cs, el, es, fr, gl, hr, it, mk, pl, pt, ro, ru, sk, sl, sr, uk.
- Neutral by default: da, de, en, lt, lv, nb, nl, nn, sv (only safe joins; no minor-words glue in body text).
- The effective language is taken from lang (with fallback to the document root).
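As an illustration of how these defaults could be resolved, the sketch below reduces a language tag to its primary subtag and classifies it using the two lists above. The function names and the "unknown" category are assumptions made for this example; the README does not say how the library treats languages outside both lists.

```typescript
// Lists copied from the "Language defaults" section above.
const ACTIVE = new Set([
  "be", "bg", "ca", "cs", "el", "es", "fr", "gl", "hr", "it",
  "mk", "pl", "pt", "ro", "ru", "sk", "sl", "sr", "uk",
]);
const NEUTRAL = new Set(["da", "de", "en", "lt", "lv", "nb", "nl", "nn", "sv"]);

type LanguageMode = "active" | "neutral" | "unknown";

// Reduce a tag such as "it-IT" to its primary subtag "it".
function primarySubtag(tag: string): string {
  const [primary = ""] = tag.trim().split(/[-_]/);
  return primary.toLowerCase();
}

// Hypothetical resolver: prefer the element's own lang, then fall back
// to the document root's lang.
function classifyLanguage(
  elementLang: string | null | undefined,
  rootLang: string | null | undefined
): LanguageMode {
  const tag = elementLang?.trim() || rootLang?.trim() || "";
  const primary = primarySubtag(tag);
  if (ACTIVE.has(primary)) return "active";
  if (NEUTRAL.has(primary)) return "neutral";
  return "unknown";
}
```

For example, `classifyLanguage("it-IT", "en")` yields "active", while an element without its own lang under an `en-GB` root yields "neutral".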
API
type InitOptions = {
selector?: string; // default: 'html' (scans under elements that compute to text-wrap: pretty)
languages?: string[]; // pre-load specific BCP-47 primary subtags (e.g., ['it','en'])
observe?: boolean; // MutationObserver to process added/edited content
context?: 'all'|'display'; // if 'display', only process headings/DT
};
Returns a controller { process(root?), disconnect() }.
Configuration
- The library reads lang to select language defaults.
- Neutral languages (e.g., en, de, nl) do not enable minor-words glue by default; only safe joins apply.
- For display-only processing, pass { context: 'display' }.
- You can pre-load language data via languages: ['it','fr'] to avoid first-use compile cost.
CSS preference gate and overrides:
- You can opt in/out declaratively per container with --text-wrap-preferences: minor-words | none. On browsers without text-wrap: pretty, authors can set the preference under @supports not (text-wrap: pretty).
- When the preference is active and the current language has no built‑in minorWords, the engine reads optional overrides:
  - --text-wrap-minor-threshold: <number> (glue after tokens up to N chars; typical value: 1)
  - --text-wrap-minor-stoplist: "space-separated tokens" (per‑container additions)
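To make the two overrides concrete, here is a minimal sketch of how their raw values might be interpreted. It is an assumption-laden illustration, not the engine's code: `parseOverrides`, `isMinorToken`, and the case-insensitive stoplist match are hypothetical choices made for this example.

```typescript
interface MinorWordOverrides {
  threshold: number; // 0 means "no length-based glue"
  stoplist: string[];
}

// Hypothetical parser for the raw custom property values.
function parseOverrides(rawThreshold: string, rawStoplist: string): MinorWordOverrides {
  const n = Number.parseInt(rawThreshold.trim(), 10);
  const threshold = Number.isFinite(n) && n > 0 ? n : 0;
  // Strip one pair of matching surrounding quotes, if present.
  const unquoted = rawStoplist.trim().replace(/^(["'])(.*)\1$/s, "$2");
  const stoplist = unquoted.split(/\s+/).filter((t) => t.length > 0);
  return { threshold, stoplist };
}

// A token counts as minor if it is short enough or appears in the stoplist.
// Case-insensitive matching is an assumption of this sketch.
function isMinorToken(token: string, overrides: MinorWordOverrides): boolean {
  const length = Array.from(token).length; // count code points, not UTF-16 units
  const lower = token.toLowerCase();
  return (
    length <= overrides.threshold ||
    overrides.stoplist.some((w) => w.toLowerCase() === lower)
  );
}
```

In a browser, the raw strings would typically come from `getComputedStyle(element).getPropertyValue("--text-wrap-minor-threshold")` and the equivalent call for the stoplist property. Custom property values are returned as written, so the stoplist arrives with its quotes, which is why the sketch strips them.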
Example (headings in English):
h1[lang="en"], h2[lang="en"] {
--text-wrap-preferences: minor-words;
--text-wrap-minor-threshold: 1;
--text-wrap-minor-stoplist: "of to in on at for by a I";
}
Performance & constraints
- One TreeWalker pass over text nodes under elements that compute to text-wrap: pretty.
- No layout measurements; O(n) string replacements; NBSP insertions are idempotent.
- Skips pre, code, kbd, samp, script, style, textarea, math, svg, [contenteditable] and basic URL/email-like text.
- Does not cross inline elements by default.
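The skip rules above can be sketched without a real DOM. The example below uses a small structural type in place of DOM elements so it runs anywhere; `NodeLike`, `isInsideSkipped`, and `looksLikeUrlOrEmail` are hypothetical names, and the URL/email patterns are deliberately rough stand-ins for whatever "basic" detection the library performs.

```typescript
const SKIPPED_TAGS = new Set([
  "pre", "code", "kbd", "samp", "script", "style", "textarea", "math", "svg",
]);

// Minimal stand-in for a DOM element: only what the check needs.
interface NodeLike {
  tagName: string;
  isContentEditable?: boolean;
  parent: NodeLike | null;
}

// True if the node or any ancestor is in the skip list or is editable.
function isInsideSkipped(node: NodeLike | null): boolean {
  for (let cur: NodeLike | null = node; cur !== null; cur = cur.parent) {
    if (cur.isContentEditable || SKIPPED_TAGS.has(cur.tagName.toLowerCase())) {
      return true;
    }
  }
  return false;
}

// Rough heuristic for tokens that should be left untouched.
function looksLikeUrlOrEmail(token: string): boolean {
  return (
    /^(?:https?:\/\/|www\.)\S+$/i.test(token) ||
    /^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(token)
  );
}
```

With real DOM nodes, the same ancestor check could be applied to each text node's parent element while walking the tree, before any replacement is attempted.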
Browser support
- The polyfill acts only where the computed style is text-wrap: pretty. On browsers without support, it effectively no-ops unless the author explicitly applies an opt-in selector targeting the same blocks.
- The library itself targets modern evergreen browsers (ES2020, Intl.Segmenter optional).
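Authors who want to know in script whether the current browser supports text-wrap: pretty can use the standard `CSS.supports(property, value)` call. The wrapper below is a hypothetical helper, not part of this library; it accepts an optional `supports` function so it can also be exercised outside a browser.

```typescript
type SupportsFn = (property: string, value: string) => boolean;

// Hypothetical helper: returns false when no CSS.supports is available
// (for example in a non-browser environment).
function supportsTextWrapPretty(supports?: SupportsFn): boolean {
  if (supports) return supports("text-wrap", "pretty");
  const css = (globalThis as { CSS?: { supports?: SupportsFn } }).CSS;
  if (css && typeof css.supports === "function") {
    // Call as a method so the browser receives the expected receiver.
    return css.supports("text-wrap", "pretty");
  }
  return false;
}
```

A page could use the result to decide whether to apply an explicit opt-in class to the blocks it wants processed.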
Contributing language data
- Language tables live in src/data/languages/<lang>.json.
- To propose additions:
  - Add or edit the JSON with minorWords (threshold + list) and lexical categories (labels, honorifics, abbrCompounds).
  - Add unit tests in tests/engine.spec.ts (or a new spec file) with input → expected output.
  - Run npm run test.
Tests & benchmarks
- Run tests: npm run test
- Run a simple throughput benchmark: npm run bench
Limitations
- No apostrophe/elision handling (out of scope for now).
- Does not measure layout; it applies static glues consistent with editorial conventions.
- Does not cross inline element boundaries; an advanced option for this may be introduced in the future.
Standards context
- This repo accompanies the explainer (explainer.md) that proposes text-wrap-preferences: minor-words as an additive, language-sensitive author preference for paragraph-aware wrapping. For a consolidated list of safe labels/honorifics used by the polyfill, see docs/LEXICON.md.
License
MIT
