ua-word-stress-wasm

v0.5.2

Ukrainian word stress engine — dictionary lookup, IPA transcription, morphology. Rust/WASM, no init() required.

Ukrainian word stress lookup + full IPA phonetic transcription, compiled to WebAssembly from Rust.

The dictionary is embedded in the WASM binary — no separate data file to host or fetch.
Works in browsers (ESM) and Node.js.

| Feature | ua-word-stress (TS trie) | ua-word-stress-wasm (this) |
|---|---|---|
| Stress lookup | ✓ | ✓ |
| Full IPA transcription | - | ✓ |
| Syllabification | - | ✓ |
| Morphology (POS, UD features, lemma) | - | ✓ |
| Data file to serve | 9.4 MB .ctrie.gz | none (embedded) |
| WASM binary size | - | ~14 MB |


Database statistics

| Metric | Value |
|---|---|
| Word forms | 3,008,723 |
| Binary format | bzip2-compressed binary (V2) |
| Phonetic pipeline passes | 6 |
| IPA standard | IPA (Steriopolo 2012, Savchenko 2014) |


Installation

# pnpm (recommended)
pnpm add ua-word-stress-wasm

# npm
npm install ua-word-stress-wasm

# yarn
yarn add ua-word-stress-wasm

Quick start

import { lookup, mark, stressIndex, stressIndexBatch, wordCount } from 'ua-word-stress-wasm';

// No init() needed — the WASM is loaded automatically by your bundler.

// Stress-marked word
mark('університет');         // → 'університе́т'
mark('замок');               // → 'за́мок'  (first reading)

// Stress index (0-based index of the stressed syllable; -1 = unknown)
stressIndex('мама');         // → 0
stressIndex('університет'); // → 4

// Full lookup with IPA, syllables and morphology
const r = lookup('замок');
r.readings[0].stressedForm;  // → 'за́мок'
r.readings[0].ipa;           // → 'zɑmɔk'
r.readings[0].ipaSyllables;  // → ['ˈzɑ', 'mɔk']
r.readings[0].morph[0].pos;  // → ['NOUN']

// Batch lookup — much faster than calling stressIndex() in a loop
const indices = stressIndexBatch(['мама', 'тато', 'xyz']);
// → Int32Array [0, 0, -1]

// Dictionary size
wordCount(); // → 3008723

Vite / webpack / Rollup

No configuration needed. Vite, webpack 5, and Rollup all handle .wasm imports natively:

import { mark } from 'ua-word-stress-wasm';

mark('університет'); // → 'університе́т'

Node.js (ESM, Node 20+)

import { mark } from 'ua-word-stress-wasm';

console.log(mark('привіт')); // → 'приві́т'

API reference

mark(word: string): string

Returns the word with a combining acute accent (U+0301) placed over the stressed vowel.
Returns the word unchanged if it is not in the dictionary.

mark('мама');         // → 'ма́ма'
mark('університет'); // → 'університе́т'
mark('xyz');          // → 'xyz'  (unknown — returned as-is)

markBatch(words: Array<string>): Array<string>

Batch variant of mark. Takes an array of words and returns an array of stress-marked strings.
Words not in the dictionary are returned unchanged.
Significantly faster than calling mark() in a loop — ideal for processing pasted text blocks.

markBatch(['мама', 'тато', 'університет', 'xyz']);
// → ['ма́ма', 'та́то', 'університе́т', 'xyz']
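One way to apply markBatch to a pasted block is to pull out the Ukrainian words, mark them in a single batch call, and put the results back in place. The sketch below illustrates that under a few assumptions: `markText` and `MarkBatch` are hypothetical names, the word pattern is a simplification that covers only Ukrainian Cyrillic letters and word-internal apostrophes, and the batch function is passed in as an argument rather than imported.

```typescript
// Shape of markBatch as documented above.
type MarkBatch = (words: string[]) => string[];

// Simplified (assumed) word pattern: Ukrainian Cyrillic letters,
// optionally joined by an apostrophe inside the word.
const UA_WORD = /[А-ЩЬЮЯҐЄІЇа-щьюяґєії]+(?:['’ʼ][А-ЩЬЮЯҐЄІЇа-щьюяґєії]+)*/g;

// Marks every Ukrainian word in `text` with one batch call and
// leaves punctuation, digits and other scripts untouched.
function markText(text: string, markBatch: MarkBatch): string {
  const words: string[] = text.match(UA_WORD) || [];
  if (words.length === 0) return text;

  const marked = markBatch(words);
  let next = 0;
  // replace() visits the same matches in the same order as match().
  return text.replace(UA_WORD, (original) => {
    const candidate = marked[next++];
    return candidate === undefined ? original : candidate;
  });
}
```

With the real package this would be called as markText(text, markBatch). Words are passed to the dictionary exactly as they appear in the text; any case normalisation is left to the caller.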

stressIndex(word: string): number

Returns the 0-based syllable index of the stressed syllable, or -1 if the word is not in the dictionary.

This is the minimal-overhead call: it allocates no result object. The value is the same as readings[0].syllableIndex from lookup() and is directly usable as a syllable position (e.g. wordSyllables[syllableIndex] gives the stressed syllable).

stressIndex('мама');         // → 0  (first syllable: ма́-ма)
stressIndex('університет'); // → 4  (fifth syllable: у-ні-вер-си-те́т)
stressIndex('xyz');          // → -1 (unknown)
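Since every Ukrainian syllable is built around exactly one vowel letter, a syllable index can be mapped back to a position in the spelling by counting vowels. The helper below is a minimal sketch of that mapping, not the package's own code; `applyStress` and `UA_VOWELS` are hypothetical names.

```typescript
// Ukrainian vowel letters, lower and upper case.
const UA_VOWELS = 'аеєиіїоуюяАЕЄИІЇОУЮЯ';

// Inserts U+0301 after the vowel of the given 0-based syllable.
// Returns the word unchanged for -1 or for an out-of-range index.
function applyStress(word: string, syllableIndex: number): string {
  if (syllableIndex < 0) return word;
  let vowelsSeen = 0;
  for (let i = 0; i < word.length; i++) {
    if (UA_VOWELS.indexOf(word.charAt(i)) === -1) continue;
    if (vowelsSeen === syllableIndex) {
      return word.slice(0, i + 1) + '\u0301' + word.slice(i + 1);
    }
    vowelsSeen++;
  }
  return word;
}

applyStress('мама', 0);        // accent after the first а
applyStress('університет', 4); // accent after the last е
applyStress('xyz', -1);        // 'xyz'
```

This can be handy when only the indices from stressIndexBatch() are stored and the accented spelling is needed later.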

stressIndexBatch(words: Array<string>): Int32Array

Batch variant of stressIndex. Takes a JS Array of strings and returns an Int32Array of 0-based syllable indices (one per word, -1 = unknown).
Significantly faster than looping over stressIndex() because the JS↔WASM encoding overhead is amortised.

stressIndexBatch(['мама', 'тато', 'дитина', 'xyz']);
// → Int32Array [0, 0, 1, -1]
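Because unknown words come back as -1, the returned Int32Array also gives a quick measure of dictionary coverage for a text. A small sketch, with `knownShare` as a hypothetical helper name:

```typescript
// Fraction of entries that are not -1, i.e. words found in the dictionary.
function knownShare(indices: Int32Array): number {
  if (indices.length === 0) return 0;
  let known = 0;
  for (let i = 0; i < indices.length; i++) {
    if (indices[i] !== -1) known++;
  }
  return known / indices.length;
}

// Using the result shown above for ['мама', 'тато', 'дитина', 'xyz']:
knownShare(new Int32Array([0, 0, 1, -1])); // → 0.75
```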

lookupBatch(words: Array<string>): Array<LookupResult>

Batch variant of lookup. Returns an array of full result objects, one per word.
Ideal for processing large texts — avoids per-call JS↔WASM overhead.

const results = lookupBatch(['мама', 'замок', 'xyz']);
results[0].readings[0].stressedForm; // → 'ма́ма'
results[1].readings[0].ipa;          // → 'zɑmɔk'
results[2].readings;                 // → [] (unknown)
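After a batch lookup it is often useful to separate words that need attention: those with no readings and those with more than one. The sketch below works on any objects shaped like the documented results; `triage` and the two `...Like` interfaces are hypothetical names that declare only the fields the helper reads.

```typescript
interface TriageReading { stressedForm: string }
interface TriageResult { form: string; readings: TriageReading[] }

// Splits batch results into unknown words (no readings) and
// ambiguous words (several stress readings, e.g. heteronyms).
function triage(results: TriageResult[]): { unknown: string[]; ambiguous: string[] } {
  const unknown: string[] = [];
  const ambiguous: string[] = [];
  results.forEach((result) => {
    if (result.readings.length === 0) unknown.push(result.form);
    else if (result.readings.length > 1) ambiguous.push(result.form);
  });
  return { unknown, ambiguous };
}
```

With the real package the input would be the array returned by lookupBatch().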

lookup(word: string): LookupResult

Returns a full result object with all stress readings, IPA, syllabification, tokens and morphology.
readings is an empty array when the word is not in the dictionary.

interface LookupResult {
  form: string;        // normalised input word
  readings: Reading[];
}

interface Reading {
  syllableIndex: number; // 0-based syllable index of the stressed syllable
  stressFromEnd: number; // syllables from the end (1 = ultima, 2 = penult, …)
  syllableCount: number;
  form: string;          // normalised form (same as input for most words)
  stressedForm: string;  // word with U+0301 accent on stressed vowel

  wordSyllables: string[];  // Cyrillic syllables, e.g. ['за', 'мок']
  ipa: string;              // full IPA string, e.g. 'zɑmɔk'
  ipaSyllables: string[];   // IPA per syllable with stress mark, e.g. ['ˈzɑ', 'mɔk']

  tokens: Token[];
  morph: MorphReading[];
  confidence: string | null;
}

interface Token {
  ipa: string;        // IPA of this token, e.g. 'm', 'ɑ', 'tʲ'
  source: string;     // source grapheme(s)
  type: string;       // 'Vowel' | 'Consonant' | 'Glide' | 'Separator'
  vowelIndex: number; // -1 for consonants; 0-based vowel position for vowels
  stressed: boolean;
  palatalized: boolean;
}

interface MorphReading {
  pos: string[];                       // e.g. ['NOUN'], ['VERB']
  feats: Record<string, string[]>;     // UD morphological features
  lemma: string | null;
  definition: string | null;
}
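Read together, the comments on syllableIndex (0-based from the start) and stressFromEnd (1 = ultima) imply a simple relation between the two fields. The helper below spells it out; it is a sketch derived from those comments, and the function name is hypothetical.

```typescript
// 1 = ultima, 2 = penult, and so on: the last syllable has
// syllableIndex = syllableCount - 1, which must map to 1.
function stressFromEndOf(syllableIndex: number, syllableCount: number): number {
  return syllableCount - syllableIndex;
}

stressFromEndOf(0, 2); // → 2 (penult, as in за́мок)
stressFromEndOf(1, 2); // → 1 (ultima, as in замо́к)
stressFromEndOf(4, 5); // → 1 (ultima, as in університе́т)
```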

Example — heteronym:

const r = lookup('замок');
// r.readings[0] → { stressedForm: 'за́мок', ipa: 'zɑmɔk', … }
// r.readings[1] → { stressedForm: 'замо́к', ipa: 'zɑmɔk', … }
// → use morph[0].feats to disambiguate by context
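A possible shape for that disambiguation step is a filter over the readings by one morphological feature. The sketch below is an assumption about how a caller might do it, not part of the package: `readingsWithFeat` is a hypothetical name, it checks every morph entry rather than only morph[0], and the feature key and value are whatever UD labels the caller expects from context.

```typescript
interface FeatMorph { feats: Record<string, string[]> }
interface FeatReading { stressedForm: string; morph: FeatMorph[] }

// Keeps the readings where at least one morph entry lists `value` under `feat`.
function readingsWithFeat(readings: FeatReading[], feat: string, value: string): FeatReading[] {
  return readings.filter((reading) =>
    reading.morph.some((m) => (m.feats[feat] || []).indexOf(value) !== -1)
  );
}
```

If the filter leaves exactly one reading, that reading's stressedForm can be used; otherwise the caller still has to fall back to readings[0] or to manual review.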

Example — IPA transcription:

const r = lookup('правда');
r.readings[0].ipa;          // → 'prɑwdɑ'
r.readings[0].ipaSyllables; // → ['ˈprɑw', 'dɑ']
r.readings[0].tokens.map(t => t.ipa);
// → ['p', 'r', 'ɑ', 'w', 'd', 'ɑ']
// (в realised as [w] — post-vocalic before consonant)

transcribe(word: string, syllableIndex: number): TranscriptionResult

Low-level IPA transcription for a word when you already know the syllable stress position.
Bypasses the dictionary lookup entirely — useful for OOV words or ML-predicted stress.

interface TranscriptionResult {
  word: string;
  stressIndex: number;   // same as input syllableIndex
  ipa: string;
  ipaSyllables: string[];
  wordSyllables: string[];
  syllableCount: number;
  tokens: Token[];
}

transcribe('слово', 0);
// → { ipa: 'slɔwɔ', ipaSyllables: ['ˈslɔ', 'wɔ'], … }
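A common way to combine the two calls is to try the dictionary first and fall back to transcribe() with a predicted stress position for out-of-vocabulary words. The sketch below shows that flow with the functions injected as parameters; `ipaWithFallback`, `FallbackDeps` and `predictStress` are hypothetical, and the predictor in particular stands in for whatever model or heuristic the caller has.

```typescript
interface FallbackDeps {
  lookup: (word: string) => { readings: Array<{ ipa: string }> };
  transcribe: (word: string, syllableIndex: number) => { ipa: string };
  // Hypothetical: not provided by the package.
  predictStress: (word: string) => number;
}

// Returns the IPA of the first dictionary reading, or a transcription
// based on the predicted stress when the word is not in the dictionary.
function ipaWithFallback(word: string, deps: FallbackDeps): string {
  const first = deps.lookup(word).readings[0];
  if (first !== undefined) return first.ipa;
  return deps.transcribe(word, deps.predictStress(word)).ipa;
}
```

Taking readings[0] ignores heteronyms; callers that care about them would need to pick a reading before reaching this step.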

wordCount(): number

Returns the total number of word forms in the embedded dictionary.

wordCount(); // → 3008723

Phonetic pipeline

The IPA transcription runs 6 numbered passes in sequence over the tokenised word, with two sub-passes (1.5 and 3b):

| Pass | Module | Description |
|------|--------|-------------|
| 1 | Tokenizer | Cyrillic graphemes → phoneme tokens |
| 1.5 | Geminates | Merge adjacent identical consonants into |
| 2 | Palatalization | Regressive dental softening (ь, і, я, є, ю propagation) |
| 3 | Voicing assimilation | Obstruent voicing/devoicing before voiced/voiceless clusters |
| 3b | Place assimilation | Sibilant + affricate place assimilation |
| 4 | Vowel allophones | Stressed vowel marking; unstressed vowel reduction |
| 5 | /в/ allophones | Positional [w] / [u̯] / [ʋ] selection |
| 6 | Syllabifier | Sonority-based boundary placement (Savchenko 2014 rules) |

/в/ allophone rules

| Context | Allophone | Example |
|---------|-----------|---------|
| Post-vocalic + pre-consonantal | [w] | правда → prɑwdɑ |
| Post-vocalic + word-final | [u̯] | кров → krɔu̯ |
| Word-initial (default) | [ʋ] | він → ʋin |
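The three rows above can be read as a small decision function over the letters next to в. The sketch below encodes only those rows and works on spelling rather than on phoneme tokens, so it is an illustration of the table and not the package's pass 5; `vAllophone` is a hypothetical name, any non-vowel letter is treated as a consonant, and contexts the table does not list return null.

```typescript
const UA_VOWEL_LETTERS = 'аеєиіїоуюя';

function isVowelLetter(ch: string): boolean {
  return ch.length === 1 && UA_VOWEL_LETTERS.indexOf(ch.toLowerCase()) !== -1;
}

// Allophone for the letter в at position i, per the three table rows,
// or null when i is not в or the context is not listed.
function vAllophone(word: string, i: number): string | null {
  if (word.charAt(i).toLowerCase() !== 'в') return null;
  if (i === 0) return '\u028b';            // [ʋ], word-initial
  const prev = word.charAt(i - 1);
  const next = word.charAt(i + 1);         // '' at the end of the word
  if (isVowelLetter(prev)) {
    if (next === '') return 'u\u032f';     // [u̯], post-vocalic and word-final
    if (!isVowelLetter(next)) return 'w';  // [w], post-vocalic before a consonant
  }
  return null;
}

vAllophone('правда', 3); // → 'w'
vAllophone('кров', 3);   // → 'u̯'
vAllophone('він', 0);    // → 'ʋ'
```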

Syllabification rules (summary)

The syllabifier follows Savchenko (2014) §20, applied after all assimilation passes:

| Rule | Condition | Result |
|------|-----------|--------|
| 1 | Single intervocalic consonant | Onset of next syllable |
| 2a | Both obstruents, both voiceless | Both to next syllable |
| 2b | Both obstruents, both voiced, same manner | Both to next syllable |
| 2c | Obstruent + sonorant | Both to next syllable |
| 2α | Both sonorants | Split: first in coda |
| 2β | Sonorant first, any second | Sonorant in coda |
| 2γ | Voiced + voiceless obstruent | Split between them |
| 2δ | Voiced fricative + voiced stop | Split between them |
| 3a | First consonant is sonorant | Sonorant in coda, rest to next |
| 3b | Obstruents + sonorant | All to next syllable |
| 3c | All voiceless obstruents | All to next syllable |
| 3d | Voiced + voiceless(es) + sonorant | Split after voiced |
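As an illustration of how the two-consonant rows (2a to 2δ) might be applied, the sketch below classifies an intervocalic pair and reports where the syllable boundary falls. It is one reading of the summary table, not the package's syllabifier: the type and function names are hypothetical, rule 2γ is assumed to mean a voiced obstruent followed by a voiceless one, and pairs the table does not mention come back as 'unspecified'.

```typescript
type Manner = 'stop' | 'fricative' | 'affricate';
interface Consonant { sonorant: boolean; voiced: boolean; manner?: Manner }
type Boundary = 'before-both' | 'between' | 'unspecified';

// Where the syllable boundary falls for an intervocalic pair c1 c2.
function splitPair(c1: Consonant, c2: Consonant): Boundary {
  if (c1.sonorant) return 'between';                  // 2α, 2β: sonorant stays in the coda
  if (c2.sonorant) return 'before-both';              // 2c: obstruent + sonorant
  if (!c1.voiced && !c2.voiced) return 'before-both'; // 2a: both voiceless obstruents
  if (c1.voiced && !c2.voiced) return 'between';      // 2γ (assumed order: voiced, then voiceless)
  if (c1.voiced && c2.voiced) {
    if (c1.manner === 'fricative' && c2.manner === 'stop') return 'between'; // 2δ
    if (c1.manner !== undefined && c1.manner === c2.manner) return 'before-both'; // 2b
  }
  return 'unspecified';
}
```

Rule 1 and the three-consonant rules (3a to 3d) are left out to keep the example short.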


Data sources

The embedded dictionary aggregates four Ukrainian lexical databases:

| Source | Records | License |
|--------|---------|---------|
| kaikki.org Wiktionary extract | ~2 M word forms | CC BY-SA 4.0 |
| lang-uk trie stress dictionary | ~2.9 M word forms | MIT |
| lang-uk plain-text stress dictionary | ~2.9 M word forms | MIT / ULIF |
| ua_variative_stressed_words (manual curation) | ~150 lemmas | - |

Duplicates are resolved by a lossless merge that preserves all unique readings.
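The idea of a merge that drops duplicates but keeps every distinct reading can be sketched as a per-form union. The example below is illustrative only: the actual build step is not described here, `mergeSources` and `StressMap` are hypothetical names, and each reading is reduced to a bare stress syllable index.

```typescript
// Word form → distinct stress positions (0-based syllable indices).
type StressMap = Record<string, number[]>;

// Unions the readings of each form across all sources, preserving
// first-seen order and skipping positions already recorded.
function mergeSources(sources: StressMap[]): StressMap {
  const merged: StressMap = Object.create(null);
  sources.forEach((source) => {
    Object.keys(source).forEach((form) => {
      const readings = merged[form] || (merged[form] = []);
      (source[form] || []).forEach((index) => {
        if (readings.indexOf(index) === -1) readings.push(index);
      });
    });
  });
  return merged;
}

// Two sources that agree on one reading of замок and disagree on another:
mergeSources([{ 'замок': [0] }, { 'замок': [1, 0], 'мама': [0] }]);
// → замок: [0, 1], мама: [0]
```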


Academic sources

The IPA transcription pipeline is implemented according to:

| Source | Scope |
|--------|-------|
| Стеріополо О. (2012). «Українська фонетична система у парадигмі МФА». Науковий вісник УжНУ. Філологія. Соціальні комунікації 27: 51–58. | IPA consonant/vowel system, transcription rules, allophony |
| Савченко І. С. (2014). Фонетика, орфоепія і графіка сучасної української мови. | Syllabification rules (§19–20), sonority scale, phonological/phonetic level |
| Касьянова (2015). [/в/ allophones in standard Ukrainian] | Positional allophony of /в/: [w], [u̯], [ʋ], [ʋʲ] rules |


Relation to ua-word-stress

ua-word-stress and ua-word-stress-wasm share the same underlying stress dictionary data but serve different use cases:

  • ua-word-stress — pure TypeScript, ~9 MB compressed trie served as a separate asset. Best for applications that only need stress lookup and want zero native code.
  • ua-word-stress-wasm — Rust/WASM, ~14 MB self-contained binary. Best for applications that need full IPA output, syllabification, morphology, or prefer a zero-fetch dependency.

Both packages are published from the same repository: ua-stress-engine.


License

AGPL-3.0-or-later — see LICENSE.

The dictionary data retains the licenses of its contributing sources (see Data sources above). Wiktionary-derived data is CC BY-SA 4.0; attribution to Wiktionary contributors is required when redistributing.