blingfire

v0.1.1


Node.js bindings for BlingFire, a tokenization library from Microsoft.

BlingFire provides text tokenization and sentence splitting with high performance and no external dependencies.

Features

  • Native C++ implementation with minimal overhead
  • Production-tested tokenization and sentence splitting
  • Full TypeScript support with type definitions
  • No runtime dependencies beyond Node.js
  • Includes precompiled binaries for Linux (arm64, amd64) and macOS (arm64, amd64)

Installation

npm install blingfire

Platform Support

This package includes precompiled binaries for:

  • Linux: arm64, amd64
  • macOS (darwin): arm64, amd64

For these platforms, no build tools are required. For other platforms, you will need to build from source.
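As a rough illustration of the platform list above, the following sketch checks whether a platform/architecture pair is covered by a precompiled binary. The helper `hasPrebuiltBinary` and the table `PREBUILT_TARGETS` are hypothetical and not part of this package; the sketch also assumes that "amd64" corresponds to the `x64` value that Node.js reports in `process.arch`.

```typescript
// Platform/architecture pairs with precompiled binaries, per the list above.
// Assumption: "amd64" corresponds to the value "x64" reported by process.arch.
const PREBUILT_TARGETS: { [platform: string]: string[] } = {
  linux: ['arm64', 'x64'],
  darwin: ['arm64', 'x64'],
};

// Hypothetical helper (not part of blingfire): returns true when the given
// platform/arch pair is listed as having a precompiled binary.
function hasPrebuiltBinary(platform: string, arch: string): boolean {
  if (!Object.prototype.hasOwnProperty.call(PREBUILT_TARGETS, platform)) {
    return false;
  }
  const archs = PREBUILT_TARGETS[platform];
  return archs !== undefined && archs.indexOf(arch) !== -1;
}

// Typical call in Node.js: hasPrebuiltBinary(process.platform, process.arch)
console.log(hasPrebuiltBinary('linux', 'arm64'));
console.log(hasPrebuiltBinary('win32', 'x64'));
```

If the check returns false, the prerequisites under "Building from Source (Other Platforms)" apply.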

Building from Source (Other Platforms)

If precompiled binaries are not available for your platform, you will need:

  • Node.js 16.x or higher
  • Python (for node-gyp)
  • C++ compiler:
    • macOS: Xcode Command Line Tools (xcode-select --install)
    • Linux: GCC or Clang (apt-get install build-essential)
    • Windows: Visual Studio Build Tools
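The Node.js 16.x minimum listed above can be checked before attempting a source build. This is only a sketch: `meetsNodeMinimum` is a hypothetical helper, not something the package provides, and it assumes a version string in the format of `process.version` (for example `v18.17.0`).

```typescript
// Hypothetical helper: checks a Node.js version string such as "v18.17.0"
// (the format of process.version) against the 16.x minimum stated above.
function meetsNodeMinimum(version: string, minMajor: number = 16): boolean {
  const match = /^v?(\d+)\./.exec(version);
  if (match === null) {
    return false; // unrecognized format, treat as unsupported
  }
  // Number(undefined) is NaN, and NaN >= minMajor is false, so this stays safe.
  return Number(match[1]) >= minMajor;
}

// Typical call in Node.js: meetsNodeMinimum(process.version)
console.log(meetsNodeMinimum('v18.17.0'));
console.log(meetsNodeMinimum('v14.21.3'));
```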

During source builds, the package will:

  1. Copy BlingFire source files to the vendor directory
  2. Compile the native C++ addon
  3. Build TypeScript files

Usage

Basic Example

import { textToSentences, textToWords } from 'blingfire';

// Sentence splitting
const text = 'Hello world. This is a test. How are you?';
const sentences = textToSentences(text);
console.log(sentences);
// Output: "Hello world.\nThis is a test.\nHow are you?"

// Get sentences as array
const sentenceArray = sentences.split('\n');
console.log(sentenceArray);
// Output: ["Hello world.", "This is a test.", "How are you?"]

// Word tokenization
const words = textToWords(text);
console.log(words);
// Output: "Hello world . This is a test . How are you ?"

// Get words as array
const wordArray = words.split(' ');
console.log(wordArray);
// Output: ["Hello", "world", ".", "This", "is", "a", "test", ".", "How", "are", "you", "?"]

CommonJS

const { textToSentences, textToWords } = require('blingfire');

const text = 'Dr. Smith went to the U.S. He was happy.';
const sentences = textToSentences(text);
console.log(sentences.split('\n'));

Advanced Examples

Processing Documents

import { textToSentences, textToWords } from 'blingfire';

// Process a document
const document = `
  Natural language processing (NLP) is a subfield of AI.
  It focuses on the interaction between computers and humans.
  Dr. Jones believes it's revolutionary!
`;

// Split into sentences, dropping any empty entries
const sentences = textToSentences(document.trim()).split('\n').filter(Boolean);

// Tokenize each sentence
sentences.forEach((sentence, i) => {
  const words = textToWords(sentence).split(' ');
  console.log(`Sentence ${i + 1} has ${words.length} tokens`);
});
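The same sentence-then-token pattern can be wrapped in a reusable helper. The sketch below is one possible shape, not part of the package: `analyzeDocument`, `TextFn`, and `SentenceStats` are hypothetical names, and the two functions are passed in as parameters so the helper can be exercised without the native addon. The stand-in functions at the bottom are for demonstration only and are not BlingFire's algorithms; in real use you would pass `textToSentences` and `textToWords`.

```typescript
// Shape shared by textToSentences and textToWords: string in, delimited string out.
type TextFn = (text: string) => string;

interface SentenceStats {
  sentence: string;
  tokens: string[];
}

// Hypothetical helper: splits a document into sentences, tokenizes each one,
// and skips empty sentences and empty tokens.
function analyzeDocument(doc: string, toSentences: TextFn, toWords: TextFn): SentenceStats[] {
  return toSentences(doc.trim())
    .split('\n')
    .map((s) => s.trim())
    .filter((s) => s.length > 0)
    .map((sentence) => ({
      sentence,
      tokens: toWords(sentence).split(' ').filter((t) => t.length > 0),
    }));
}

// Stand-in functions for demonstration only; they are NOT BlingFire's algorithms.
const fakeSentences: TextFn = (t) => t.replace(/([.!?])\s+/g, '$1\n');
const fakeWords: TextFn = (t) => t.replace(/([.!?,])/g, ' $1').replace(/\s+/g, ' ').trim();

const demoStats = analyzeDocument('Hello world. How are you?', fakeSentences, fakeWords);
demoStats.forEach((s, i) => console.log(`Sentence ${i + 1} has ${s.tokens.length} tokens`));
```

Injecting the two functions keeps the helper independent of how the addon is loaded, which can make unit testing simpler.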

Unicode and Multilingual Text

import { textToSentences, textToWords } from 'blingfire';

// Works with Unicode
const multiLang = 'Hello world. 你好世界。Привет мир.';
const sentences = textToSentences(multiLang);
console.log(sentences.split('\n'));

// Handles emojis and special characters
const withEmoji = 'I love coding! 🚀 It makes me happy. 😊';
const words = textToWords(withEmoji);
console.log(words);

API Reference

textToSentences(text: string): string

Splits text into sentences using BlingFire's sentence boundary detection.

Parameters:

  • text (string): The input text to split into sentences

Returns:

  • (string): Sentences separated by newline characters (\n)

Throws:

  • TypeError: If input is not a string
  • Error: If BlingFire processing fails

Example:

const result = textToSentences('First. Second. Third.');
// Returns: "First.\nSecond.\nThird."
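Because the return value is a single newline-separated string, callers usually convert it to an array. A minimal sketch of that conversion follows; `sentencesToArray` is a hypothetical helper (not part of the package API) that works only on the returned string, trimming whitespace and dropping empty lines.

```typescript
// Hypothetical helper: converts the newline-separated string returned by
// textToSentences into an array, trimming whitespace and dropping empty lines.
function sentencesToArray(output: string): string[] {
  return output
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
}

console.log(sentencesToArray('First.\nSecond.\nThird.'));
// [ 'First.', 'Second.', 'Third.' ]
```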

textToWords(text: string): string

Tokenizes text into words using BlingFire's word boundary detection.

Parameters:

  • text (string): The input text to tokenize

Returns:

  • (string): Words separated by space characters

Throws:

  • TypeError: If input is not a string
  • Error: If BlingFire processing fails

Example:

const result = textToWords('Hello, world!');
// Returns: "Hello , world !"
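Both functions are documented as throwing `TypeError` for non-string input and `Error` when processing fails. One way to handle that at a call site is a small wrapper that returns a result object instead of throwing. This is a sketch under assumptions: `safeTokenize`, `Tokenize`, and `TokenizeResult` are hypothetical names, and the stand-in tokenizer is for demonstration only; in real use you would pass `textToWords` or `textToSentences`.

```typescript
type Tokenize = (text: string) => string;

type TokenizeResult =
  | { ok: true; value: string }
  | { ok: false; error: string };

// Hypothetical wrapper: rejects non-string input up front and converts any
// error thrown by the tokenizer into a result object instead of propagating it.
function safeTokenize(fn: Tokenize, input: unknown): TokenizeResult {
  if (typeof input !== 'string') {
    return { ok: false, error: 'input must be a string' };
  }
  try {
    return { ok: true, value: fn(input) };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { ok: false, error: message };
  }
}

// Stand-in tokenizer for demonstration only; it is NOT BlingFire's algorithm.
const demoTokenizer: Tokenize = (text) => text.split(/\s+/).join(' ');

console.log(safeTokenize(demoTokenizer, 'Hello   world'));
console.log(safeTokenize(demoTokenizer, 42));
```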

How It Works

This package wraps the BlingFire C++ library using Node.js N-API. When no precompiled binary is available for your platform, installation builds from source in three steps:

  1. Vendor Script: Copies BlingFire source files from a git submodule to a vendor/ directory
  2. Native Compilation: Compiles both BlingFire and the N-API binding using node-gyp
  3. TypeScript Build: Compiles TypeScript wrapper to JavaScript

The package includes all BlingFire source code, so no external dependencies are required at runtime.

Development

Setup

# Clone with submodules
git clone --recursive <repository-url>
cd blingfire

# Install dependencies
npm install

# Run tests
npm test

# Build (TypeScript + Native)
npm run build:typescript
npm run build:native

Testing

# Run tests once
npm test

# Watch mode
npm run test:watch

Building from Source

# Clean build artifacts
npm run clean

# Rebuild everything
npm install
npm run build:typescript
npm run build:native

License

The contents of this repository are licensed under the MIT License. See the LICENSE file for more information.

Credits

  • BlingFire by Microsoft
  • This package wraps the C++ library with Node.js N-API bindings

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.