sentencex-wasm

v0.1.13

Sentence segmentation library with wide language support, optimized for speed and utility.

SentenceX WASM Bindings

A WebAssembly binding for the SentenceX sentence segmentation library. This allows you to use the fast, multilingual sentence segmentation capabilities of SentenceX directly in web browsers and Node.js applications.

Installation

npm install sentencex-wasm

Usage

Basic Sentence Segmentation

import init, { segment } from 'sentencex-wasm';

async function run() {
    // Initialize the WASM module
    await init();

    const text = "The James Webb Space Telescope (JWST) is a space telescope specifically designed to conduct infrared astronomy. The U.S. National Aeronautics and Space Administration (NASA) led Webb's design and development.";
    const sentences = segment("en", text);

    sentences.forEach((sentence, index) => {
        console.log(`${index + 1}. ${sentence}`);
    });
}

run();
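
The example above awaits init() before calling segment(). When segmentation is triggered from several places in an application, one option is to cache the initialization promise so the setup step runs only once. The following is a minimal sketch of that pattern, not part of sentencex-wasm: createLazySegmenter is a hypothetical helper, and it takes init and segment as parameters so that it assumes nothing about the module beyond what the example above shows.

```javascript
// Hypothetical helper (not part of sentencex-wasm): wraps an async init
// function and a synchronous segment function so that initialization is
// started on first use and shared by all later calls.
function createLazySegmenter(init, segment) {
    let ready = null; // cached initialization promise

    return async function segmentLazily(language, text) {
        if (ready === null) {
            // Start initialization once; clear the cache on failure so a
            // later call can retry.
            ready = Promise.resolve(init()).catch((err) => {
                ready = null;
                throw err;
            });
        }
        await ready;
        return segment(language, text);
    };
}

// Usage with the real module (names as in the example above):
//   import init, { segment } from 'sentencex-wasm';
//   const segmentLazily = createLazySegmenter(init, segment);
//   const sentences = await segmentLazily("en", "Hello world. This is a test.");
```

Passing the two functions in keeps the wrapper easy to test with stand-ins and independent of how the module is loaded.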

Detailed Sentence Boundaries

For more advanced use cases, you can get detailed information about sentence boundaries:

import init, { get_sentence_boundaries } from 'sentencex-wasm';

async function run() {
    await init();

    const text = "Hello world. This is a test.\n\nNew paragraph.";
    const boundaries = get_sentence_boundaries("en", text);

    boundaries.forEach(boundary => {
        console.log({
            text: boundary.text,
            start: boundary.start_index,
            end: boundary.end_index,
            symbol: boundary.boundary_symbol,
            isParaBreak: boundary.is_paragraph_break
        });
    });
}

run();
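
The is_paragraph_break flag can be used to group sentences by paragraph. The sketch below assumes that a flagged boundary marks the end of a paragraph; whether the flag sits on the last sentence of the paragraph or on a separate whitespace-only entry is not stated here, so the loop is written to work either way. groupIntoParagraphs is a hypothetical helper, and the sample array is hand-written illustrative data, not actual library output.

```javascript
// Hypothetical helper (not part of sentencex-wasm): groups boundary texts
// into paragraphs, closing a paragraph after each flagged boundary.
function groupIntoParagraphs(boundaries) {
    const paragraphs = [];
    let current = [];

    for (const boundary of boundaries) {
        current.push(boundary.text);
        if (boundary.is_paragraph_break) {
            paragraphs.push(current);
            current = [];
        }
    }
    // Keep any trailing sentences that were not followed by a break.
    if (current.length > 0) {
        paragraphs.push(current);
    }
    return paragraphs;
}

// Hand-written stand-in for get_sentence_boundaries("en", text) output;
// only the two fields used by the helper are included.
const sampleBoundaries = [
    { text: "Hello world. ", is_paragraph_break: false },
    { text: "This is a test.\n\n", is_paragraph_break: true },
    { text: "New paragraph.", is_paragraph_break: false },
];

console.log(groupIntoParagraphs(sampleBoundaries).length); // 2
```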

API Reference

segment(language: string, text: string): string[]

Segments a given text into sentences based on the specified language.

Parameters:

  • language - Language code (e.g., "en" for English, "fr" for French)
  • text - Text to be segmented

Returns:

  • Array of sentence strings

get_sentence_boundaries(language: string, text: string): SentenceBoundary[]

Returns detailed sentence boundaries for analysis and advanced processing.

Parameters:

  • language - Language code (e.g., "en" for English, "fr" for French)
  • text - Text to be analyzed

Returns:

  • Array of SentenceBoundary objects, each containing:
    • start_index: Byte index where the sentence starts
    • end_index: Byte index where the sentence ends
    • text: The sentence text
    • boundary_symbol: Punctuation mark that ended the sentence (if any)
    • is_paragraph_break: Whether this boundary represents a paragraph break
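
Since start_index and end_index are described as byte indices, they may not match JavaScript string positions, which count UTF-16 code units. The two agree for ASCII-only text and drift apart as soon as the text contains multi-byte characters. The sketch below assumes the indices are UTF-8 byte offsets into the input and that end_index is exclusive; both are assumptions to verify against the library. sliceByUtf8Bytes is a hypothetical helper built on the standard TextEncoder and TextDecoder APIs.

```javascript
// Hypothetical helper (not part of sentencex-wasm): extracts the substring
// covered by a UTF-8 byte range [startByte, endByte).
function sliceByUtf8Bytes(text, startByte, endByte) {
    const bytes = new TextEncoder().encode(text);
    return new TextDecoder().decode(bytes.subarray(startByte, endByte));
}

const sample = "Café au lait. Très bien.";

// "é" occupies 2 bytes in UTF-8, so byte offset 15 is string index 14.
console.log(JSON.stringify(sliceByUtf8Bytes(sample, 0, 15))); // "Café au lait. "
console.log(JSON.stringify(sample.slice(0, 15)));             // "Café au lait. T"
```

In most cases the text field already holds the sentence, so a conversion like this only matters when the offsets are used to index back into the original string, for example to highlight sentences in place.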

Language Support

SentenceX supports sentence segmentation for over 240 languages. Languages without their own rules are handled through fallback chains that resolve to a related language that has them. Common language codes include:

  • en - English
  • es - Spanish
  • fr - French
  • de - German
  • it - Italian
  • pt - Portuguese
  • ja - Japanese
  • zh - Chinese
  • ar - Arabic
  • hi - Hindi
  • And many more...

Performance

The WASM bindings provide near-native performance for sentence segmentation while running in JavaScript environments. The segmentation is non-destructive, meaning the original text can be reconstructed by joining the segments.
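
The non-destructive property can be checked directly. The sketch below assumes that "joining" means concatenating the segments with no separator, so each segment keeps its own trailing whitespace. reconstructsOriginal is a hypothetical helper, and the segments array is hand-written to illustrate the property; the library's actual split points may differ.

```javascript
// Hypothetical helper (not part of sentencex-wasm): returns true when the
// segments concatenate back to exactly the original text.
function reconstructsOriginal(text, segments) {
    return segments.join("") === text;
}

const original = "Hello world. This is a test.\n\nNew paragraph.";

// Hand-written stand-in for segment("en", original) output.
const illustrativeSegments = [
    "Hello world. ",
    "This is a test.\n\n",
    "New paragraph.",
];

console.log(reconstructsOriginal(original, illustrativeSegments)); // true

// Trimming the segments loses whitespace, so the check fails.
console.log(reconstructsOriginal(original, illustrativeSegments.map((s) => s.trim()))); // false
```

A check like this makes a cheap regression test when segments are post-processed, since trimming or normalizing them breaks the round trip.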

License

MIT license. See the main project repository for details.