split-anything

v0.3.0

split for files, streams, or any text really

split-anything lets you read files line by line synchronously. As a bonus, you can use it

  • on strings like String.split,
  • as a stream transform like split2,
  • to read files line by line asynchronously, or
  • with your own interface around the underlying SplitAnything class.

What sets it apart from other text-splitting utilities is that it preserves line endings by default, unlike String.split-based solutions.
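To make the difference concrete, here is a small sketch that contrasts String.split with a splitter that keeps line endings. It does not use split-anything at all: splitKeepingEndings is a hypothetical helper written only for this comparison, and it handles only "\n" boundaries.

```javascript
// Hypothetical helper, NOT part of split-anything: keeps each line's trailing "\n".
function splitKeepingEndings (text) {
  // Either any run of non-newlines followed by "\n", or a final unterminated run.
  return text.match(/[^\n]*\n|[^\n]+$/g) || []
}

const text = 'so much depends\nupon\n\na red wheel'

// String.split drops the separators, so the original text can't be rebuilt
// without knowing which separator was used.
const dropped = text.split('\n')

// Keeping the endings means the pieces concatenate back to the input.
const kept = splitKeepingEndings(text)

console.log(dropped)
console.log(kept)
console.log(kept.join('') === text)
```

Keeping the endings is what lets you tell a final line that ends in a newline from one that doesn't.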

String interface - splitStr

splitStr takes a string and returns an array of strings where each element is a line:

const tap = require('tap')
const { splitStr } = require('split-anything')

const input = `so much depends
upon

a red wheel
barrow ♥♥♥`

const output = [
  'so much depends\n',
  'upon\n',
  '\n',
  'a red wheel\n',
  'barrow ♥♥♥'
]

tap.same(splitStr(input), output)

Usage: splitStr(text[, separator[, chomp]])

  • text String the text to split into lines.
  • separator RegExp the line boundary. Default: /\n/
  • chomp boolean whether to remove line boundaries from the end of lines. Default: false
  • Returns: Array text split into lines.

Set chomp to true if you don't want to keep line endings, e.g.

tap.same(splitStr(input, /\n/, true), [
  'so much depends',
  'upon',
  '',
  'a red wheel',
  'barrow ♥♥♥'
])

You can use arbitrary regular expressions as your "line boundary", e.g. if you just want to get rid of empty lines:

tap.same(splitStr(input, /(?<=\n)\n+/, true), [
  'so much depends\nupon\n',
  'a red wheel\nbarrow ♥♥♥'
])

File interface - SplitReader

const fs = require('fs')
const tmp = require('tmp')
const { SplitReader } = require('split-anything')

const fName = tmp.tmpNameSync()
fs.writeFileSync(fName, input)

const reader = new SplitReader(fName)
const readOut = []
let line
while ((line = reader.readSync()) !== null) {
  readOut.push(line)
}
tap.same(readOut, output)

Usage: new SplitReader(file[, encoding[, separator[, chomp]]])

  • file String or integer name of the file to read from, or a file descriptor.
  • encoding String the file encoding. Default: 'utf8'
  • separator RegExp the line boundary. Default: /\n/
  • chomp boolean whether to remove line boundaries from the end of lines. Default: false

SplitReader.readSync([bufLength])

  • bufLength integer the number of bytes to read. Default: 250.
  • Returns: String the next line of text, or null if EOF is reached.

If it's possible to avoid reading from the underlying file (i.e. if some lines have already been read and buffered), then this function doesn't read, and just returns the next line from the buffer. Conversely, if after reading bufLength bytes from the file it still hasn't found a complete line, then it reads again until it has a line to return.

const fd = fs.openSync(fName, 'r')
const fdReader = new SplitReader(fd, 'utf8', /\s/, true)
const words = []
while ((line = fdReader.readSync(1)) !== null) {
  words.push(line)
}
tap.same(words, input.split(/\s/))
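The read-only-when-needed behaviour of readSync can be pictured with the rough sketch below. It is an assumption about how such a reader could work, not split-anything's actual code: SketchLineReader is a hypothetical class, it only splits on "\n", and it always keeps line endings.

```javascript
const fs = require('fs')
const os = require('os')
const path = require('path')
const { StringDecoder } = require('string_decoder')

// Hypothetical sketch of the buffering idea; NOT split-anything's implementation.
class SketchLineReader {
  constructor (fd) {
    this.fd = fd
    this.pending = '' // decoded text that has not been returned as a line yet
    this.eof = false
    // Keeps multi-byte characters intact when a read ends in the middle of one.
    this.decoder = new StringDecoder('utf8')
  }

  readSync (bufLength = 250) {
    let idx
    // Touch the file only while the buffer holds no complete line.
    while ((idx = this.pending.indexOf('\n')) === -1 && !this.eof) {
      const chunk = Buffer.alloc(bufLength)
      const bytesRead = fs.readSync(this.fd, chunk, 0, bufLength, null)
      if (bytesRead === 0) {
        this.eof = true
        this.pending += this.decoder.end()
      } else {
        this.pending += this.decoder.write(chunk.subarray(0, bytesRead))
      }
    }
    if (idx === -1) {
      // End of file: hand back whatever is left once, then null.
      if (this.pending === '') return null
      idx = this.pending.length - 1
    }
    const line = this.pending.slice(0, idx + 1)
    this.pending = this.pending.slice(idx + 1)
    return line
  }
}

// Try it on a throwaway file with a deliberately tiny read size.
const sketchFile = path.join(os.tmpdir(), `sketch-reader-${process.pid}.txt`)
fs.writeFileSync(sketchFile, 'so much depends\nupon\n\nbarrow ♥♥♥')
const sketchFd = fs.openSync(sketchFile, 'r')
const sketchReader = new SketchLineReader(sketchFd)
const sketchLines = []
let next
while ((next = sketchReader.readSync(4)) !== null) sketchLines.push(next)
fs.closeSync(sketchFd)
fs.unlinkSync(sketchFile)
console.log(sketchLines)
```

With a read size of 4 bytes, most lines need several reads, while the empty line comes straight from the buffer.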

You can also read lines asynchronously:

tap.test('async read', t => {
  t.plan(1)
  const fd = fs.openSync(fName, 'r')
  const reader = new SplitReader(fd)
  const lines = []
  const readTilEnd = () => reader.read().then(line => {
    if (line === null) {
      t.same(lines, output)
    } else {
      lines.push(line)
      readTilEnd()
    }
  })
  readTilEnd()
})

SplitReader.read([bufLength])

  • bufLength integer the number of bytes to read. Default: 250.
  • Returns: Promise resolves to the next line of text, or null if the end of file has been reached.

This is the async counterpart of readSync.
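If you prefer async/await to promise chains, a loop like the one sketched here should work with any object whose read() resolves to the next line, or to null at the end of the file. readAll and makeStubReader are hypothetical helpers, not part of split-anything; the stub stands in for a SplitReader so that the snippet runs on its own.

```javascript
// Hypothetical helper: drains a reader whose read() resolves to the next line,
// or to null once the end of the file has been reached.
async function readAll (reader, bufLength) {
  const lines = []
  let line
  while ((line = await reader.read(bufLength)) !== null) {
    lines.push(line)
  }
  return lines
}

// Stand-in for a SplitReader (an assumption, for illustration only).
function makeStubReader (lines) {
  const queue = lines.slice()
  return { read: async () => (queue.length ? queue.shift() : null) }
}

readAll(makeStubReader(['so much depends\n', 'upon\n'])).then(lines => {
  console.log(lines)
})
```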

It's safe to use both read and readSync on the same file:

tap.test('mixed read', t => {
  const fd = fs.openSync(fName, 'r')
  const reader = new SplitReader(fd)
  reader.read(20).then(line => {
    t.equals(line, 'so much depends\n')
    t.equals(reader.readSync(10), 'upon\n')
    reader.read(5).then(line => {
      t.equals(line, '\n')
      t.equals(reader.readSync(1), 'a red wheel\n')
      reader.read(1).then(line => {
        t.equals(line, 'barrow ♥♥♥')
        t.equals(reader.readSync(1), null)
        t.end()
      })
    })
  })
})

Stream interface - SplitTransform

const { SplitTransform } = require('split-anything')

tap.test('You can do the same thing with a stream transform', t => {
  const actualOutput = []
  const tx = new SplitTransform()
  tx.on('data', line => actualOutput.push(line))
  tx.on('end', () => {
    t.plan(1)
    t.same(actualOutput, output)
  })
  tx.end(input)
})

Usage: new SplitTransform([separator[, chomp[, streamOptions]]])

  • separator RegExp the line boundary. Default: /\n/
  • chomp boolean whether to remove line boundaries from the end of lines. Default: false
  • streamOptions Object options to pass to the stream.Transform constructor.

tap.test('separator & chomp just like splitStr and SplitReader', t => {
  const actualOutput = []
  const tx = new SplitTransform(/\b\w{1,3}\s/, true, { highWaterMark: 2 })
  tx.on('data', line => actualOutput.push(line))
  tx.on('end', () => {
    t.plan(1)
    t.same(actualOutput, [
      'much depends\nupon\n\n',
      'wheel\nbarrow ♥♥♥'
    ])
  })
  tx.end(input)
})
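For a feel of what a line-splitting transform involves, here is a minimal sketch built directly on Node's stream.Transform. It is not how SplitTransform is implemented: SketchLineTransform is a hypothetical class that only splits on "\n", always keeps line endings, and ignores backpressure.

```javascript
const { Transform } = require('stream')
const { StringDecoder } = require('string_decoder')

// Hypothetical sketch, NOT split-anything's implementation.
class SketchLineTransform extends Transform {
  constructor (streamOptions = {}) {
    // Emit lines as strings on the readable side instead of Buffers.
    super({ ...streamOptions, readableObjectMode: true })
    this.decoder = new StringDecoder('utf8')
    this.pending = '' // text received so far that doesn't end in "\n"
  }

  _transform (chunk, encoding, callback) {
    this.pending += typeof chunk === 'string' ? chunk : this.decoder.write(chunk)
    let idx
    while ((idx = this.pending.indexOf('\n')) !== -1) {
      this.push(this.pending.slice(0, idx + 1))
      this.pending = this.pending.slice(idx + 1)
    }
    callback()
  }

  _flush (callback) {
    // The input has ended, so the unterminated last line is now complete.
    this.pending += this.decoder.end()
    if (this.pending !== '') this.push(this.pending)
    callback()
  }
}

const collected = []
const sketchTx = new SketchLineTransform()
sketchTx.on('data', line => collected.push(line))
sketchTx.on('end', () => console.log(collected))
sketchTx.write('so much dep')
sketchTx.end('ends\nupon\n\nbarrow ♥♥♥')
```

The first line arrives split across two writes, which is why the transform has to hold on to incomplete text until the next chunk or the end of the stream.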

Generic interface - SplitAnything

splitStr, SplitTransform, and SplitReader are all wrappers around SplitAnything. If you have some text to split but these interfaces don't work for you, you can build your own by interacting with SplitAnything directly.

const { SplitAnything } = require('split-anything')

const sa = new SplitAnything()
sa.cat(input)
tap.equals(sa.getLine(true), 'so much depends\n')
tap.equals(sa.getLine(true), 'upon\n')
tap.equals(sa.getLine(true), '\n')
tap.equals(sa.getLine(true), 'a red wheel\n')
tap.equals(sa.getLine(true), 'barrow ♥♥♥')
tap.equals(sa.getLine(true), undefined)

new SplitAnything([separator[, chomp]])

  • separator RegExp the line boundary. Default: /\n/
  • chomp boolean whether to remove line boundaries from the end of lines. Default: false

SplitAnything.cat(str)

  • str String the chunk of text to concatenate
  • Returns: this so you can chain calls

Appends str to the internal text buffer.

const sa1 = (new SplitAnything(/ /, true)).cat('1 2').cat('3 4')
tap.equals(sa1.getLine(true), '1')
tap.equals(sa1.getLine(true), '23')
tap.equals(sa1.getLine(true), '4')

SplitAnything.getLine([last])

Returns the next complete line from the text passed to cat so far, or undefined if there isn't one. The last line of the text always counts as incomplete and isn't returned, because SplitAnything expects you to cat more text. Once you've reached the end of the text you want to split, set last to true (default: false), and the last line will be counted as complete and returned when its turn comes.

sa.cat('1\n2\n3')
tap.equals(sa.getLine(), '1\n')
tap.equals(sa.getLine(), '2\n')
tap.equals(sa.getLine(), undefined)
tap.equals(sa.getLine(true), '3')
tap.equals(sa.getLine(true), undefined)
sa.cat('4\n5\n')
tap.equals(sa.getLine(), '4\n')
tap.equals(sa.getLine(), '5\n')
tap.equals(sa.getLine(true), undefined)
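If you're wondering how the cat / getLine contract could be met, the sketch below is one way to do it. It is a guess for illustration only, and the real SplitAnything internals may well differ: SketchSplitter is a hypothetical class, and it assumes the separator has no g or y flag, never matches the empty string, and doesn't rely on lookbehind into text that has already been returned.

```javascript
// Hypothetical re-implementation for illustration; NOT split-anything's code.
class SketchSplitter {
  constructor (separator = /\n/, chomp = false) {
    this.separator = separator
    this.chomp = chomp
    this.buffer = '' // text added with cat() and not yet returned
  }

  cat (str) {
    this.buffer += str
    return this // so calls can be chained
  }

  getLine (last = false) {
    const match = this.separator.exec(this.buffer)
    if (match) {
      const end = match.index + match[0].length
      // With chomp, stop before the separator; otherwise include it.
      const line = this.buffer.slice(0, this.chomp ? match.index : end)
      this.buffer = this.buffer.slice(end)
      return line
    }
    // No separator left: the remainder only counts as a line when last is true.
    if (last && this.buffer !== '') {
      const line = this.buffer
      this.buffer = ''
      return line
    }
    return undefined
  }
}

const sketch = new SketchSplitter().cat('1\n2\n3')
console.log(sketch.getLine()) // '1\n'
console.log(sketch.getLine()) // '2\n'
console.log(sketch.getLine()) // undefined, '3' might still grow
console.log(sketch.getLine(true)) // '3'
```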

Contributing

This project is left deliberately imperfect to encourage you to participate in its development. If you make a Pull Request that

  • explains and solves a problem,
  • follows standard style, and
  • maintains 100% test coverage

it will be merged: this project follows the C4 process.

To make sure your commits follow the style guide and pass all tests, you can add

./.pre-commit

to your git pre-commit hook.