rp-paragraph-splitter

v1.0.0

Splits walls of text into paragraphs, focusing on dialogue.

Downloads

15

Readme

RP Paragraph Splitter

This is a component of the website for my IRC RP's logs that takes a huge blob of text and attempts to cut it into paragraphs that make sense. It does not handle the rejoining of cut IRC messages, as that would couple it with my logs website's code.

It has no dependencies, because the sentence tokenizers available on npm did not handle all the odd formatting you'd find in the clash of writing styles that we call RP. The biggest issues are ellipses, and periods that do not always end a sentence when they're inside quotation marks.
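To illustrate the problem, here is a minimal sketch of a tokenizer that treats runs of periods as ellipses and refuses to split inside an open quotation. It is not this package's tokenizer: `splitSentences` and `isSentenceEnd` are hypothetical names, and never splitting inside quotes is a simplifying assumption.

```javascript
// Hypothetical sketch, not the package's implementation.
// True when the character at index i is terminal punctuation
// that is not part of an ellipsis.
function isSentenceEnd(text, i) {
  const ch = text[i];
  if (ch !== '.' && ch !== '!' && ch !== '?') return false;
  // A period that touches another period is part of an ellipsis.
  if (ch === '.' && (text[i - 1] === '.' || text[i + 1] === '.')) return false;
  return true;
}

function splitSentences(text) {
  const sentences = [];
  let current = '';
  let insideQuote = false;

  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    current += ch;

    let ends = false;
    if (ch === '"') {
      insideQuote = !insideQuote;
      // A closing quote right after terminal punctuation ends the sentence.
      ends = !insideQuote && i > 0 && isSentenceEnd(text, i - 1);
    } else {
      // Assumption: punctuation inside an open quote never splits.
      ends = !insideQuote && isSentenceEnd(text, i);
    }

    // Only split when followed by whitespace or the end of the text.
    const next = text[i + 1];
    const atBoundary = next === undefined || /\s/.test(next);

    if (ends && atBoundary) {
      sentences.push(current.trim());
      current = '';
    }
  }

  if (current.trim() !== '') sentences.push(current.trim());
  return sentences;
}

const sample = 'He paused... then spoke. "Wait. I am not done." She nodded.';
console.log(splitSentences(sample));
```

With this sample, the ellipsis and the period after "Wait" do not cause a split, so the quotation stays together as one sentence.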

The input is expected to all be from one perspective dialogue-wise, though it will split upon meeting a character name as the first word. This requires it be hooked into a character getter (see settings under reference).

The tag parameter is for anything you want to associate all posts with for the next step of your code. For example, I use the character name on my logs website to create separators when the character changes.
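As a rough sketch of that separator idea, the function below walks a list of paragraph-like objects and emits a separator line whenever the tag changes. `renderWithSeparators` and the separator format are hypothetical, and the sample objects are plain stand-ins that only mimic the `tag` property and `toString()` function described under Reference.

```javascript
// Hypothetical helper: adds a separator line each time the tag changes.
function renderWithSeparators(paragraphs) {
  const lines = [];
  let lastTag = null;

  for (const paragraph of paragraphs) {
    if (paragraph.tag !== lastTag) {
      lines.push(`--- ${paragraph.tag} ---`);
      lastTag = paragraph.tag;
    }
    lines.push(paragraph.toString());
  }

  return lines;
}

// Stand-in objects, not real Paragraph instances.
const sampleParagraphs = [
  { tag: 'Alice', toString: () => 'Alice looked around the room.' },
  { tag: 'Alice', toString: () => '"Is anyone here?"' },
  { tag: 'Bob', toString: () => 'Bob stepped out of the shadows.' },
];

console.log(renderWithSeparators(sampleParagraphs).join('\n'));
```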

I put it here in the hopes that someone else might find it interesting or useful, and it's released under the permissive ISC license.

Goal

The goal is to improve the reading experience for people reading up on RP logs on my site. It doesn't have to be perfect, just good enough.

Rules

The paragraph can be split if any of the rules below is true. The numbers can be tuned with the settings object. The current sentence then becomes the "topic" sentence of the next paragraph.

  1. The paragraph length is past 45 words, the current sentence is 7 words long, the first dialogue has been done, and it's not in the middle of dialogue.
  2. The last sentence is one complete quotation, and the current sentence is 7 words long.
  3. The first word is a character name.
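The three rules can be sketched as a single predicate. This is only an illustration under stated assumptions: `shouldSplit` and the `state` shape are hypothetical, 45 and 7 are assumed to be the default settings, and "7 words long" is read as "at least 7 words".

```javascript
// Stand-in for the settings object described under Reference.
// The callback here is a toy that only knows one character.
const exampleSettings = {
  paragraphLength: 45,
  topicLength: 7,
  characterCallback: (name) => (name === 'alice' ? true : null),
};

// Hypothetical predicate, not the package's internals.
// state:    { wordCount, firstDialogueDone, insideDialogue } for the paragraph so far
// sentence: the current sentence ({ length, firstWord })
// previous: the last sentence ({ fullDialogue }) or null
function shouldSplit(state, sentence, previous, settings) {
  // Assumption: "7 words long" means "at least topicLength words".
  const topicOk = sentence.length >= settings.topicLength;

  // Rule 1: long paragraph, usable topic sentence, dialogue started and closed.
  const rule1 =
    state.wordCount > settings.paragraphLength &&
    topicOk &&
    state.firstDialogueDone &&
    !state.insideDialogue;

  // Rule 2: the last sentence was one complete quotation.
  const rule2 = previous !== null && previous.fullDialogue && topicOk;

  // Rule 3: the first word is a character name (true or a non-null object).
  const found = settings.characterCallback(sentence.firstWord.toLowerCase());
  const rule3 = found === true || (typeof found === 'object' && found !== null);

  return rule1 || rule2 || rule3;
}

const paragraphState = { wordCount: 50, firstDialogueDone: true, insideDialogue: false };
const currentSentence = { length: 8, firstWord: 'The' };
console.log(shouldSplit(paragraphState, currentSentence, { fullDialogue: false }, exampleSettings));
```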

Example

const rpps = require('rp-paragraph-splitter');

const text = `Paste a huge block of text here.`;
const tag = 'Character Name';

// Split the text into paragraphs, all associated with the same tag.
const paragraphs = rpps.Paragraph.split(text, tag);

// Print each paragraph followed by a blank line.
for (const paragraph of paragraphs) {
  console.log(`${paragraph.toString()}\n`);
}

Reference

All the objects below are children of the main module object.

Sentence

The sentence tokenizer. You don't have to touch this to use it, but here it is anyway.

Properties

  • string text: The entire sentence text.
  • char first: The first character.
  • char last: The last character.
  • string firstWord: The first word.
  • string lastWord: The last word.
  • int quoteCount: The number of quotation marks.
  • int length: The number of words.
  • bool dialogue: The sentence opened or closed dialogue; i.e. had an odd number of quotation marks.
  • bool fullDialogue: The sentence started and ended on a '"'.
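To make the property definitions concrete, here is a minimal sketch that derives most of them from a plain string. `describeSentence` is a hypothetical helper, not the package's Sentence constructor, and splitting words on whitespace is an assumption.

```javascript
// Hypothetical helper that mirrors the property descriptions above.
function describeSentence(text) {
  const trimmed = text.trim();
  // Assumption: words are separated by whitespace.
  const words = trimmed.split(/\s+/).filter((word) => word !== '');
  // Number of '"' characters in the sentence.
  const quotes = (trimmed.match(/"/g) || []).length;

  return {
    text: trimmed,
    first: trimmed.charAt(0),
    last: trimmed.charAt(trimmed.length - 1),
    firstWord: words[0] || '',
    lastWord: words[words.length - 1] || '',
    length: words.length,
    // An odd number of quotation marks means dialogue was opened or closed.
    dialogue: quotes % 2 === 1,
    // The sentence starts and ends on a quotation mark.
    fullDialogue: trimmed.length > 1 && trimmed.startsWith('"') && trimmed.endsWith('"'),
  };
}

console.log(describeSentence('"Where are you going?"'));
console.log(describeSentence('She said, "Wait.'));
```

In the first call the sentence is one complete quotation, so `fullDialogue` is true and `dialogue` is false. In the second, the single quotation mark opens dialogue, so `dialogue` is true.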

Functions

  • Sentence(text): Creates a sentence with the text.
  • string .toString(): Returns the text property.
  • Sentence[] Sentence.split(text): Splits the text into multiple Sentences.

Paragraph

Properties

  • Sentence[] sentences: The sentences in this paragraph.
  • string tag: Arbitrary tag.

Functions

  • Paragraph(sentences, tag): Creates a paragraph from the sentences.
  • string .toString(): Displays the paragraph content as text.
  • Paragraph[] Paragraph.split(sentences, tag): Groups the Sentences together into paragraphs.
  • Paragraph[] Paragraph.split(text, tag): Splits the text into Sentences and turns them into paragraphs.

settings

  • int paragraphLength: The minimum length for rule 1.
  • int topicLength: The topic sentence length for rules 1 and 2.
  • function characterCallback(string name): Called to look up a character for rule 3. The rule treats true or any non-null object as success. The name argument is the first word in lowercase.
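A character getter could look something like the sketch below. The `characters` map and its contents are made up for illustration, and the commented-out assignment assumes the callback is set directly on the module's settings object, which the reference above suggests but does not spell out.

```javascript
// Hypothetical character list, keyed by lowercase name.
const characters = new Map([
  ['alice', { name: 'Alice' }],
  ['bob', { name: 'Bob' }],
]);

// Returns a non-null object for known characters and null otherwise.
// The name argument arrives as the first word in lowercase.
function characterCallback(name) {
  return characters.get(name) || null;
}

// Assumption: the callback is hooked in by assigning it on the settings object.
// const rpps = require('rp-paragraph-splitter');
// rpps.settings.characterCallback = characterCallback;

console.log(characterCallback('alice'));
console.log(characterCallback('the'));
```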