chunk-text

v2.0.1

🔪 chunk/split a string by length without cutting/truncating words.

Downloads

7,946

Readme

Chunk Text

chunk/split a string by length without cutting/truncating words.

const out = chunk('hello world how are you?', 7);
/* ['hello', 'world', 'how are', 'you?'] */

Installation

$ npm install chunk-text
# yarn add chunk-text

Usage

All number values are parsed according to Number.parseInt.
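
As a reminder of what Number.parseInt does with loosely typed input, here is a small sketch using only the standard built-in. The radix of 10 is an assumption made for this illustration, and how chunk-text reacts to a NaN result is not stated in this README.

```javascript
// Standard Number.parseInt semantics (radix 10 is an assumption for this sketch).
const parsedSizes = ['7', '7.9', '12px', 'abc'].map((raw) => Number.parseInt(raw, 10));

// '7.9' loses its fraction, '12px' stops at the first non-digit, 'abc' gives NaN.
console.log(parsedSizes);
```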

const chunk = require('chunk-text');

chunk(text, chunkSize);

Chunks the text string into an array of strings that each have a maximum length of chunkSize.

const out = chunk('hello world how are you?', 7);
/* ['hello', 'world', 'how are', 'you?'] */

If no space is found before chunkSize is reached, the word itself is split so that every resulting chunk is at most chunkSize long.

const out = chunk('hello world', 4);
/* ['hell', 'o', 'worl', 'd'] */
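
The behaviour described above can be approximated with a short greedy loop. This is only a sketch under stated assumptions, not the library's actual implementation: chunkSketch is a hypothetical name, words are assumed to be separated by whitespace, and length is measured with plain String.length.

```javascript
// Hypothetical re-implementation for illustration only; not chunk-text's source.
function chunkSketch(text, chunkSize) {
  const size = Number.parseInt(chunkSize, 10);
  if (!(size > 0)) {
    throw new TypeError('chunkSize must parse to a positive integer');
  }

  const chunks = [];
  let current = '';

  for (const word of text.split(/\s+/).filter(Boolean)) {
    // Keep filling the current chunk while the next word (plus a space) still fits.
    if (current && current.length + 1 + word.length <= size) {
      current += ' ' + word;
      continue;
    }
    if (current) {
      chunks.push(current);
      current = '';
    }
    // A word longer than the limit is cut into size-length pieces.
    let rest = word;
    while (rest.length > size) {
      chunks.push(rest.slice(0, size));
      rest = rest.slice(size);
    }
    current = rest;
  }

  if (current) {
    chunks.push(current);
  }
  return chunks;
}
```

For the two inputs shown above, this sketch returns the same arrays as the README examples. It may differ from the real module on other inputs, for example in how runs of whitespace are handled.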

chunk(text, chunkSize, chunkOptions);

Chunks the text string into an array of strings that each have a maximum length of chunkSize, as determined by chunkOptions.charLengthMask.

If chunkOptions.charLengthMask is omitted, the behavior is the same as chunkOptions.charLengthMask=-1.

For single-byte characters, chunkOptions.charLengthMask never changes the results.

For multi-byte characters, chunkOptions.charLengthMask allows awareness of multi-byte glyphs according to the following table:

| chunkOptions.charLengthMask | result |
|-----------------------------|--------|
| -1 | same as default, same as chunkOptions.charLengthMask=1; each character counts as 1 towards length |
| 0 | each character counts as the number of bytes it contains |
| >0 | each character counts as the number of bytes it contains, up to a limit of chunkOptions.charLengthMask=N; a 7-byte ZWJ emoji such as runningPerson+ZWJ+femaleSymbol (🏃🏽‍♀️) counts as 2 when chunkOptions.charLengthMask=2 |
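
One way to read the table is as a per-glyph weighting rule. The sketch below is an interpretation, not the module's code: weighGlyph is a hypothetical helper, it expects a single already-isolated glyph, and it assumes that "bytes" in the table means the glyph's String.length (UTF-16 code units).

```javascript
// Hypothetical helper illustrating the table; not part of chunk-text's API.
function weighGlyph(glyph, charLengthMask = -1) {
  const units = glyph.length; // UTF-16 code units, assumed to be the table's "bytes"
  if (charLengthMask === -1) {
    return 1; // default: every character counts as 1
  }
  if (charLengthMask === 0) {
    return units; // full size, no cap
  }
  return Math.min(units, charLengthMask); // size capped at the mask value
}

// Runner + skin tone + ZWJ + female sign + variation selector, written with escapes.
const runnerGlyph = '\u{1F3C3}\u{1F3FD}\u200D\u2640\uFE0F';

console.log(weighGlyph(runnerGlyph));     // 1
console.log(weighGlyph(runnerGlyph, 0));  // 7
console.log(weighGlyph(runnerGlyph, 2));  // 2
```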

You can also switch chunkOptions.charLengthType from its default, length, to TextEncoder.

This lets you pass any object as chunkOptions.textEncoder, as long as it supports the call chunkOptions.textEncoder.encode(text).length.

If your environment natively provides the TextEncoder prototype and chunkOptions.textEncoder isn't provided, the module attempts new TextEncoder() in order to use this chunkOptions.charLengthType.

If

  • chunkOptions.charLengthType is set to TextEncoder.
  • chunkOptions.textEncoder isn't provided.
  • TextEncoder prototype isn't provided by the environment.

Then

  • ReferenceError will occur.

End If
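
The fallback rules above can be sketched as follows. This is an assumption-laden illustration rather than the module's source: measureLength and resolveEncoder are hypothetical names, and only the option names come from this README.

```javascript
// Hypothetical helpers illustrating the encoder fallback; not chunk-text's code.
function resolveEncoder(options) {
  if (options.textEncoder) {
    return options.textEncoder; // any object exposing encode(text).length
  }
  // Throws ReferenceError when the environment has no global TextEncoder.
  return new TextEncoder();
}

function measureLength(text, options = {}) {
  if (options.charLengthType === 'TextEncoder') {
    return resolveEncoder(options).encode(text).length;
  }
  return text.length; // default charLengthType: length
}

// A stub encoder only needs to match the encode(text).length shape.
const stubEncoder = { encode: () => ({ length: 999 }) };

console.log(measureLength('h\u00e9llo'));
console.log(measureLength('h\u00e9llo', { charLengthType: 'TextEncoder', textEncoder: stubEncoder }));
```

Passing a stub like this is also a convenient way to test length-dependent code without relying on the environment's encoder.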

// one woman runner emoji with a colour is seven bytes (UTF-16 code units, i.e. its .length),
// or five characters (code points)
// RUNNER(2) + COLOUR(2) + ZWJ + GENDER + VS16
// (TextEncoder actually encodes it to 17 bytes)
const runner = '🏃🏽‍♀️';

const outDefault = chunk(runner+runner+runner, 4);
/* [ '🏃🏽‍♀️🏃🏽‍♀️🏃🏽‍♀️' ] */

const outZero = chunk(runner+runner+runner, 4, { charLengthMask: 0 });
/* [ '🏃🏽‍♀️', '🏃🏽‍♀️', '🏃🏽‍♀️' ] */

const outTwo = chunk(runner+runner+runner, 4, { charLengthMask: 2 });
/* [ '🏃🏽‍♀️🏃🏽‍♀️', '🏃🏽‍♀️' ] */

// FLAG + RAINBOW
// 2 each as length, 4 each as TextEncoder
// 4 as length, 8 as TextEncoder
// Node v14.5.0 does not provide TextEncoder natively.
const flags = '🏳️‍🌈🏳️‍🌈';

// \/ will fail if your environment doesn't already have TextEncoder prototype \/
chunk(flags, 8, { charLengthMask: 0, charLengthType: 'TextEncoder' });
// [ '🏳️‍🌈', '🏳️‍🌈' ]
// /\ will fail if your environment doesn't already have TextEncoder prototype /\

chunk(flags, 4, {
  charLengthMask: 0,
  charLengthType: 'TextEncoder',
  textEncoder: new TextEncoder(),
});
// [ '🏳️‍🌈', '🏳️‍🌈' ]

chunk(flags, 999, {
  charLengthMask: 0,
  charLengthType: 'TextEncoder',
  textEncoder: {
    encode: () => ({ length: 999 }),
  },
});
// [ '🏳️‍🌈', '🏳️‍🌈' ]

Usage in Algolia context

This library was created by Algolia to make it easier to optimize record payload sizes, which results in faster search responses from the API.

In general, each record has one large "content attribute", and this package lets you split that content into small chunks of text.

The text chunks can then be distributed over multiple records.

Here is an example of how to split an existing record into several ones:

var chunk = require('chunk-text');
var record = {
  post_id: 100,
  content: 'A large chunk of text here'
};

var chunks = chunk(record.content, 600); // Limit the chunk size to a length of 600.
var records = [];
chunks.forEach(function(content) {
  records.push(Object.assign({}, record, {content: content}));
});