hunspell-native

v1.0.1

Published

22 days ago

Cross-platform Node.js binding for Hunspell, the spellcheck library used by LibreOffice, Firefox, Chrome and Adobe.

Downloads

352

0High
0Medium
0Low

vweevers

spell spelling spellcheck spellchecker hunspell

hunspell-native

Cross-platform Node.js binding for Hunspell, the spellcheck library used by LibreOffice, Firefox, Chrome and Adobe.

Highlights :sparkles:

Native performance
Exposes most of Hunspell's functionality
- Spell checking and suggestions
- Personal dictionaries;
- Stemming and word analysis.
Extensive unit tests;
Built with Node-API which provides forward-compatibility.

Quick Start

import { Hunspell } from 'hunspell-native'
import { loadDictionary } from 'language-packages'

const dictionary = await loadDictionary('en-US')
const hunspell = new Hunspell(dictionary)

// Check if a word is spelled correctly
console.log(await hunspell.spell('color')) // true
console.log(await hunspell.spell('colour')) // false (in US English)

// Get suggestions
console.log(await hunspell.suggest('colour')) // ['color', 'co lour', ...]

// Alternative synchronous API
const isCorrect = hunspell.spellSync('color')
const suggestions = hunspell.suggestSync('color')

The companion language-packages package provides on-demand downloads of wooorm/dictionaries.

Background

This is a fork of nodehun-native and thus nodehun. There's also nspell which is a pure JavaScript alternative. How do they compare?

The nspell package suffers from memory leaks (gigabytes) on certain dictionaries. It's great for English but not suitable for multilingual use.
The nodehun package uses an unmaintained fork of Hunspell that works with buffers instead of reading dictionaries from disk. It has a race condition on startup that can cause all words to be treated as incorrect (at least on German).
The nodehun-native package went back to using file paths and updated to the latest and official version of Hunspell (1.7.2).
This hunspell-native package adds ESM support, Windows support and prebuilt binaries for a variety of platforms.

Install

With npm do:

npm install hunspell-native

The npm package ships with prebuilt binaries for Linux, Mac and Windows, both x64 and ARM (except Linux ARM which fails to compile; help is welcome) including Alpine, and Linux flavors with an old glibc (2.28). Upon install, node-gyp-build will check if a compatible binary exists and fallback to compiling from source if it doesn't. In that case you'll need a valid node-gyp installation.

If you don't want to use the prebuilt binary for the platform you are installing on, specify the --build-from-source flag when you install:

npm install hunspell-native --build-from-source

API

Beware: Because reads and writes use locks, it's not advised to mix asynchronous and synchronous usage of the API. For example, if you were to remove a word synchronously while many different suggestion threads were working in the background, the remove call could take seconds to complete while it waits to take control of the read-write lock.

The examples below assume use of a US English dictionary.

`new Hunspell(dictionary)`

Create a new instance of Hunspell, passing in a dictionary object in the form of { aff: string, dic: string } where aff and dic are paths to the affix file and dictionary file. Both files must be UTF-8. Hunspell only reads from the files and doesn't write to them; methods like add() are not persistent.

`new Hunspell(aff: string, dic: string)`

Alternative function signature, provided for drop-in compatibility with nodehun-native.

`hunspell.spell(word: string): Promise<boolean>`

Yields true if the provided word is spelled correctly.

`hunspell.spellSync(word: string): boolean`

Synchronous version of spell().

`hunspell.suggest(word: string): Promise<string[] | null>`

Get suggestions to fix spelling. Yields an array of words that could match the provided word, ordered by most likely to be correct. Yields null if the input word is correct.

await hunspell.suggest('calor') // ['carol', 'valor', ...]
await hunspell.suggest('color') // null

`hunspell.suggestSync(word: string): string[] | null`

Synchronous version of suggest().

`hunspell.addDictionary(dictionaryFile: string): Promise<void>`

Add a dictionary file with additional words. Uses the affix file of the existing hunspell instance.

await hunspell.spell('colour') // false (incorrect)
await hunspell.addDictionary('./path/to/en-CA.dic')
await hunspell.spell('colour') // true (correct)

`hunspell.addDictionarySync(dictionaryFile: string): void`

Synchronous version of addDictionary().

`hunspell.add(word: string): Promise<void>`

Add a single word, which can also contain flags (as if it was a line in a dictionary file).

await hunspell.spell('colour') // false (incorrect)
await hunspell.add('colour')
await hunspell.spell('colour') // true (correct)

Note: colouring will still be considered incorrect. See the addWithAffix() example below.

`hunspell.addSync(word: string): void`

Synchronous version of add().

`hunspell.addWithAffix(word: string, example: string): Promise<void>`

Like add() except it also applies the example word's affix definition to the new word. Quoting upstream Hunspell documentation:

Uses a second root word as the model of the enabled affixation and compounding of the new word.

await hunspell.spell('colouring') // false (incorrect)
await hunspell.addWithAffix('colour', 'color')
await hunspell.spell('colouring') // true (correct)

`hunspell.addWithAffixSync(word: string, example: string): void`

Synchronous version of addWithAffix().

`hunspell.remove(word: string): Promise<void>`

Remove a word. This ignores flags and just strips words that match.

await hunspell.suggest('color') // null (correct)
await hunspell.remove('color')
await hunspell.suggest('color') // ['colon', 'dolor', ...]

`hunspell.removeSync(word: string): void`

Synchronous version of remove().

`hunspell.stem(word: string): Promise<string[]>`

Get the stems (root forms) of a word. This is useful for e.g. search indexing. The result will include the input word if that's also a valid stem (like running) and it may return multiple stems if the word has multiple meanings (leaves).

await hunspell.stem('mice') // ['mouse']
await hunspell.stem('running') // ['running', 'run']
await hunspell.stem('quickly') // ['quick']
await hunspell.stem('leaves') // ['leave', 'leaf']
await hunspell.stem('abc123') // []

`hunspell.stemSync(word: string): string[]`

Synchronous version of stem().

`hunspell.analyze(word: string): Promise<string[]>`

Yields a morphological analysis of a word. Consult upstream documentation for details on the notation.

// E.g. [' st:telling ts:0', ' st:tell ts:0 al:told is:Vg']
await hunspell.analyze('telling')

`hunspell.analyzeSync(word: string): string[]`

Synchronous version of analyze().

`hunspell.generate(word: string, word2: string): Promise<string[]>`

Generates a variation of a word by matching the morphological structure of the second word.

await hunspell.generate('telling', 'ran') // ['told']
await hunspell.generate('told', 'run') // ['tell']

`hunspell.generateSync(word: string): string[]`

Synchronous version of generate().

`hunspell.getWordCharacters(): string | undefined`

Get the characters that are considered valid within words. Useful for tokenization. Returns undefined if the affix file lacks this information (WORDCHARS).

hunspell.getWordCharacters() // 0123456789'.-’

Development

Getting Started

This repository uses git submodules. Clone it recursively:

git clone --recurse-submodules https://github.com/vweevers/hunspell-native.git

Alternatively, initialize submodules inside the working tree:

cd hunspell-native
git submodule update --init --recursive

Building

Build the native addon with npm run rebuild, or with npm install if you didn't already install dependencies.

Note: the build process applies a patch to the hunspell submodule to replace dynamic_cast with static_cast, removing the need for RTTI. If compiling succeeds, the patch is reverted, keeping the submodule clean.

License

MIT

Published

Vulnerabilities

Links

Maintainers

Keywords

Readme

hunspell-native

Highlights :sparkles:

Quick Start

Background

Install

API

new Hunspell(dictionary)

new Hunspell(aff: string, dic: string)

hunspell.spell(word: string): Promise<boolean>

hunspell.spellSync(word: string): boolean

hunspell.suggest(word: string): Promise<string[] | null>

hunspell.suggestSync(word: string): string[] | null

hunspell.addDictionary(dictionaryFile: string): Promise<void>

hunspell.addDictionarySync(dictionaryFile: string): void

hunspell.add(word: string): Promise<void>

hunspell.addSync(word: string): void

hunspell.addWithAffix(word: string, example: string): Promise<void>

hunspell.addWithAffixSync(word: string, example: string): void

hunspell.remove(word: string): Promise<void>

hunspell.removeSync(word: string): void

hunspell.stem(word: string): Promise<string[]>

hunspell.stemSync(word: string): string[]

hunspell.analyze(word: string): Promise<string[]>

hunspell.analyzeSync(word: string): string[]

hunspell.generate(word: string, word2: string): Promise<string[]>

hunspell.generateSync(word: string): string[]

hunspell.getWordCharacters(): string | undefined

Development

Getting Started

Building

License

`new Hunspell(dictionary)`

`new Hunspell(aff: string, dic: string)`

`hunspell.spell(word: string): Promise<boolean>`

`hunspell.spellSync(word: string): boolean`

`hunspell.suggest(word: string): Promise<string[] | null>`

`hunspell.suggestSync(word: string): string[] | null`

`hunspell.addDictionary(dictionaryFile: string): Promise<void>`

`hunspell.addDictionarySync(dictionaryFile: string): void`

`hunspell.add(word: string): Promise<void>`

`hunspell.addSync(word: string): void`

`hunspell.addWithAffix(word: string, example: string): Promise<void>`

`hunspell.addWithAffixSync(word: string, example: string): void`

`hunspell.remove(word: string): Promise<void>`

`hunspell.removeSync(word: string): void`

`hunspell.stem(word: string): Promise<string[]>`

`hunspell.stemSync(word: string): string[]`

`hunspell.analyze(word: string): Promise<string[]>`

`hunspell.analyzeSync(word: string): string[]`

`hunspell.generate(word: string, word2: string): Promise<string[]>`

`hunspell.generateSync(word: string): string[]`

`hunspell.getWordCharacters(): string | undefined`