@node-rs/jieba

v1.10.3

Fastest Chinese word segmentation in Node.js

Downloads

202,927

jieba-rs binding to Node.js

Without node-gyp

@node-rs/jieba ships as prebuilt binaries, so you don't need to fight with node-gyp or a C++ toolchain.

Performance

Because jieba-rs is 33% faster than cppjieba, and N-API is faster than the V8 C++ API, @node-rs/jieba is faster than nodejieba.

@node-rs/jieba x 3,763 ops/sec ±1.18% (92 runs sampled)
nodejieba x 2,783 ops/sec ±0.67% (91 runs sampled)
Cut 1184 words bench suite: Fastest is @node-rs/jieba

@node-rs/jieba x 16.10 ops/sec ±1.58% (44 runs sampled)
nodejieba x 9.81 ops/sec ±2.39% (29 runs sampled)
Cut 246568 words bench suite: Fastest is @node-rs/jieba

@node-rs/jieba x 1,739 ops/sec ±0.87% (92 runs sampled)
nodejieba x 931 ops/sec ±1.31% (89 runs sampled)
Tag 1184 words bench suite: Fastest is @node-rs/jieba

@node-rs/jieba x 6.19 ops/sec ±2.01% (20 runs sampled)
nodejieba x 3.06 ops/sec ±5.39% (12 runs sampled)
Tag 246568 words bench suite: Fastest is @node-rs/jieba
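To put the numbers above side by side, here is a small sketch that computes the throughput ratio for each suite. The ops/sec figures are copied from the benchmark output; the `speedup` helper and the `results` layout are only illustrative and are not part of @node-rs/jieba.

```javascript
// Ops/sec figures copied from the benchmark output above.
const results = [
  { suite: 'Cut 1184 words', nodeRs: 3763, nodejieba: 2783 },
  { suite: 'Cut 246568 words', nodeRs: 16.1, nodejieba: 9.81 },
  { suite: 'Tag 1184 words', nodeRs: 1739, nodejieba: 931 },
  { suite: 'Tag 246568 words', nodeRs: 6.19, nodejieba: 3.06 },
]

// Hypothetical helper: ratio of the two throughputs, rounded to two decimals.
function speedup(fast, slow) {
  return Math.round((fast / slow) * 100) / 100
}

const speedups = results.map(({ suite, nodeRs, nodejieba }) => ({
  suite,
  ratio: speedup(nodeRs, nodejieba),
}))

for (const { suite, ratio } of speedups) {
  console.log(`${suite}: ${ratio.toFixed(2)}x`)
}
// Cut 1184 words: 1.35x
// Cut 246568 words: 1.64x
// Tag 1184 words: 1.87x
// Tag 246568 words: 2.02x
```

On these runs the gap is larger for tagging than for cutting, and larger on the long input than on the short one.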

Support matrix

|                  | node12 | node14 | node16 | node18 |
| ---------------- | ------ | ------ | ------ | ------ |
| Windows x64      | ✓      | ✓      | ✓      | ✓      |
| Windows x32      | ✓      | ✓      | ✓      | ✓      |
| Windows arm64    | ✓      | ✓      | ✓      | ✓      |
| macOS x64        | ✓      | ✓      | ✓      | ✓      |
| macOS arm64      | ✓      | ✓      | ✓      | ✓      |
| Linux x64 gnu    | ✓      | ✓      | ✓      | ✓      |
| Linux x64 musl   | ✓      | ✓      | ✓      | ✓      |
| Linux arm gnu    | ✓      | ✓      | ✓      | ✓      |
| Linux arm64 gnu  | ✓      | ✓      | ✓      | ✓      |
| Linux arm64 musl | ✓      | ✓      | ✓      | ✓      |
| Android arm64    | ✓      | ✓      | ✓      | ✓      |
| Android armv7    | ✓      | ✓      | ✓      | ✓      |
| FreeBSD x64      | ✓      | ✓      | ✓      | ✓      |
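If you want to check a target against the matrix from a script, something like the sketch below may help. It is not part of @node-rs/jieba: `isListedTarget` and the key format are made up for illustration, and the mapping from matrix rows to `process.platform` / `process.arch` values (for example "x32" to `ia32`, "armv7" to `arm`) is an assumption.

```javascript
// Rows of the support matrix, keyed by the values Node.js reports at runtime.
// Assumption: "x32" means process.arch === 'ia32', "arm"/"armv7" mean 'arm'.
const LISTED_TARGETS = new Set([
  'win32-x64',
  'win32-ia32',
  'win32-arm64',
  'darwin-x64',
  'darwin-arm64',
  'linux-x64-gnu',
  'linux-x64-musl',
  'linux-arm-gnu',
  'linux-arm64-gnu',
  'linux-arm64-musl',
  'android-arm64',
  'android-arm',
  'freebsd-x64',
])

// Hypothetical helper. `libc` is only used on Linux ('gnu' or 'musl');
// Node.js does not expose it directly, so the caller has to supply it.
function isListedTarget(platform, arch, libc = 'gnu') {
  const key = platform === 'linux' ? `${platform}-${arch}-${libc}` : `${platform}-${arch}`
  return LISTED_TARGETS.has(key)
}

// Checks the current machine, assuming glibc when running on Linux.
console.log(isListedTarget(process.platform, process.arch))
```

Note that the matrix lists no `Linux arm musl` row, so `isListedTarget('linux', 'arm', 'musl')` returns `false` while the `gnu` variant returns `true`.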

Usage

const { load, cut } = require('@node-rs/jieba')

load()
// loadDict(fs.readFileSync(...))
// loadTFIDFDict(fs.readFileSync(...))

cut('我们中出了一个叛徒', false)

// ["我们", "中", "出", "了", "一个", "叛徒"]

// Separate example: extract the top 3 keywords from a sentence
const { load, extract } = require('@node-rs/jieba')

load()

extract(
  '今天纽约的天气真好啊,京华大酒店的张尧经理吃了一只北京烤鸭。后天纽约的天气不好,昨天纽约的天气也不好,北京烤鸭真好吃',
  3,
)

// [
//   { keyword: '北京烤鸭', weight: 1.3904870323222223 },
//   { keyword: '纽约', weight: 1.121759684755 },
//   { keyword: '天气', weight: 1.0766573240983333 }
// ]

Load custom dictionaries

const { loadDict, cut } = require('@node-rs/jieba')
const customDict = ['哪行 50', '干一行 51', '行一行 52', '行行 53']

const dictBuffer = Buffer.from(customDict.join('\n'), 'utf-8')
// loadDict doc: https://github.com/fxsjy/jieba?tab=readme-ov-file#%E8%BD%BD%E5%85%A5%E8%AF%8D%E5%85%B8
loadDict(dictBuffer)

const text = '人要是行干一行行一行,一行行行行行,行行行干哪行都行'
const output = cut(text, false)
console.log('Segmentation result ⤵️\n', output.join('/'))
// Before: 人/要是/行/干/一行行/一行/,/一行行/行/行/行/,/行/行/行/干/哪/行/都行
// After:  人/要是/行/干一行/行一行/,/一行行/行行/行/,/行行/行/干/哪行/都行
// Pinyin: rén yào shi xíng gàn yì háng xíng yì háng , yì háng xíng háng háng xíng , háng háng xíng gàn nǎ háng dōu xíng
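If your custom words live in structured data rather than hand-written strings, a small builder can produce the same buffer. This is only a sketch: `buildDictBuffer` is a hypothetical helper, not something @node-rs/jieba exports, and it assumes the one-entry-per-line `word frequency` format used in the example above.

```javascript
// Hypothetical helper, not part of @node-rs/jieba: turns structured entries
// into newline-separated "word freq" text, the same shape as customDict above.
function buildDictBuffer(entries) {
  const lines = entries.map(({ word, freq }) => {
    // A word containing whitespace would break the space-separated format.
    if (typeof word !== 'string' || word.length === 0 || /\s/.test(word)) {
      throw new TypeError(`invalid dictionary word: ${JSON.stringify(word)}`)
    }
    if (!Number.isInteger(freq) || freq < 0) {
      throw new RangeError(`invalid frequency for ${word}: ${freq}`)
    }
    return `${word} ${freq}`
  })
  return Buffer.from(lines.join('\n'), 'utf-8')
}

const builtDict = buildDictBuffer([
  { word: '哪行', freq: 50 },
  { word: '干一行', freq: 51 },
  { word: '行一行', freq: 52 },
  { word: '行行', freq: 53 },
])

console.log(builtDict.toString('utf-8'))
// To use it: require('@node-rs/jieba').loadDict(builtDict)
```

Validating entries up front keeps a stray space or a non-numeric frequency from silently producing a malformed dictionary line.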