retokenizer

v1.5.3

Converts text into tokens, as defined by a provided syntax. Supports recursive sub-syntaxes, regular expressions, and more.

retokenizer

Retokenizer Version 1.0 Copyright (C) 2018, 2019 By Matthew C. Tedder Licensed under the MIT license (See LICENSE file)

DESCRIPTION

Retokenizer is an ES6 class that accepts some code and a syntax definition, and breaks the code up into the terms you want to work with. Retokenizer is essentially a lexical analyzer but comes close to also being a grammar parser. In fact, it could quickly be turned into one by using sub-syntaxes that loop back on each other for infinitely recursive parsing.

How Retokenizer Differs from Other Tokenizers

  • Retains the characters that split up tokens
  • Facilitates recursive sub-syntaxes
  • May also retain unrecognized portions
  • May provide line and character position numbers

Retokenizer is an ES6 class for churning text into an array of tokens, as per your specifications. Most typically, this may be used for developing a programming language.

Most other tokenizers merely split text into tokens, discarding the characters used to split it and offering no support for recursive or recursively different syntaxes. Granted, recursion is usually the job of the other major part of any programming language, the grammar parser. By handling it in the tokenizer, however, Retokenizer eases the hand-coding of a parser and also facilitates writing context-sensitive parsers. For example, context-insensitive languages typically use "=" for assignment and "==" for comparison. A context-sensitive language would know that "=" in an open statement is an assignment, but that "=" inside an if( .. ) condition is a comparison. Furthermore, you may embed different sub-languages within each other.
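To make the difference concrete, here is a minimal sketch of a splitter-retaining split compared with the built-in String.prototype.split. The splitKeeping helper is hypothetical and written only for illustration; it is not Retokenizer's API or implementation, and it assumes every splitter is a non-empty string.

```javascript
// Hypothetical helper, NOT part of Retokenizer: splits text on the given
// splitters and keeps each splitter as its own token.
function splitKeeping(text, splitters) {
  const tokens = [];
  let between = '';   // characters collected since the last splitter
  let i = 0;
  while (i < text.length) {
    // Earlier entries win, so list '==' before '=' to match the longer one.
    const hit = splitters.find(s => s && text.startsWith(s, i));
    if (hit !== undefined) {
      if (between) tokens.push(between);
      tokens.push(hit);
      between = '';
      i += hit.length;
    } else {
      between += text[i];
      i += 1;
    }
  }
  if (between) tokens.push(between);
  return tokens;
}

const sample = 'a==b';
console.log(sample.split('=='));                 // the '==' is discarded
console.log(splitKeeping(sample, ['==', '=']));  // the '==' is kept as a token
```

Because the splitter survives as a token, later stages can still tell an "=" from an "==" instead of only seeing the text on either side.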

Note: Retokenizer was developed for Node.js and is available as an npm package. Node.js is not only fast to develop in but also JIT-compiles to exceptionally fast machine code. To produce a stand-alone executable for a programming language, you might use the "nexe" npm package, which produces cross-platform executables.

PRE-REQUISITE

Retokenizer is written in and primarily for Node.js. However, it has no outside dependencies and can easily be adapted to work in any browser that supports ES6 (any modern browser, e.g., Edge or Chrome, but not IE).

USE

The included "example.js" file demonstrates how to use the retokenizer in your own code.

Developing a Syntax

The following illustrates a syntax definition:

let syntax = {
    splitters:[' ','\n','if','then','else'],  // tokens will be split up by these (in this order of precedence)
    removes:[' ','\t'],                       // any of these will be excluded from the output tokens
    enclosures:[
        { opener:'"', closer:'"', escaper:'\\' },  // "quoted text" with backslash for escaping
        { opener:'/*', closer:'*/' },              // /* multi-line capable */ comments
        { opener:'//', closer:'\n' },              // single-line (to end of line) comments
    ]
};
syntax.enclosures.push({ opener:'{', closer:'}', syntax:syntax });  // code block (recursive syntax definition)
let evalPerens = {
    splitters:[ ' ', '\n', '!=', '=', '<>', '>', '<', 'and', 'or', 'not' ],
    removes:[ ' ', '\t', '\n' ],
    enclosures:[]
};
evalPerens.enclosures.push({ opener:'(', closer:')', syntax:evalPerens });  // make parentheses recursive
syntax.enclosures.push({ opener:'(', closer:')', syntax:evalPerens });      // add parentheses to main syntax
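To show how an enclosure with its own sub-syntax can lead to nested results, here is a rough, self-contained sketch. The sketchTokenize function and the nested { opener, closer, tokens } shape are assumptions made for illustration, not Retokenizer's implementation or output format. Escapers are ignored for brevity.

```javascript
// Hypothetical sketch, NOT Retokenizer's implementation: walks the text and
// recurses into enclosures, switching to the enclosure's own sub-syntax.
function sketchTokenize(text, syntax, start = 0, closer = null) {
  const tokens = [];
  let between = '';
  let i = start;
  const flush = () => { if (between) { tokens.push(between); between = ''; } };
  while (i < text.length) {
    // Inside an enclosure, its closer ends this level of recursion.
    if (closer !== null && text.startsWith(closer, i)) {
      flush();
      return { tokens, end: i + closer.length };
    }
    const enc = (syntax.enclosures || []).find(e => text.startsWith(e.opener, i));
    if (enc) {
      flush();
      // Enclosures without a sub-syntax are collected as plain text.
      const sub = enc.syntax || { splitters: [], removes: [], enclosures: [] };
      const inner = sketchTokenize(text, sub, i + enc.opener.length, enc.closer);
      tokens.push({ opener: enc.opener, closer: enc.closer, tokens: inner.tokens });
      i = inner.end;
      continue;
    }
    const hit = (syntax.splitters || []).find(s => s && text.startsWith(s, i));
    if (hit !== undefined) {
      flush();
      if (!(syntax.removes || []).includes(hit)) tokens.push(hit);
      i += hit.length;
    } else {
      between += text[i];
      i += 1;
    }
  }
  flush();
  return { tokens, end: i };
}

// A parenthesis syntax that refers to itself, as in the definition above.
const parens = { splitters: [' ', '='], removes: [' '], enclosures: [] };
parens.enclosures.push({ opener: '(', closer: ')', syntax: parens });
const main = {
  splitters: [' ', 'if'],
  removes: [' '],
  enclosures: [{ opener: '(', closer: ')', syntax: parens }]
};

const nested = sketchTokenize('if (a = (b = c))', main).tokens;
console.log(JSON.stringify(nested, null, 2));
```

Because the sub-syntax object refers to itself, the same rules apply at any nesting depth without the definition having to be repeated.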

Options

Retokenizer may be instantiated as follows, for example:

let tokenizer = new Retokenizer( syntax, { rich:true, betweens:'keep', condense:true, caseful:false } )

The options shown there are described below.

rich

  • = false (default), return tokens as simple strings
  • = true, return tokens as richly detailed objects

betweens

  • = 'keep', return tokens of strings found between splitters.
  • = 'remove', do not return tokens of strings found between splitters.
  • = 'throw', throw an error for any strings found between splitters.
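The three policies can be pictured with a small sketch. The applyBetweens helper and the { text, isSplitter } part shape are hypothetical, introduced only to illustrate the described behavior; they are not taken from Retokenizer's source.

```javascript
// Hypothetical sketch of the three `betweens` policies, NOT Retokenizer's code.
// `parts` is an assumed intermediate form: each piece of text is tagged as
// either a splitter or a "between" (text found between splitters).
function applyBetweens(parts, mode) {
  const out = [];
  for (const part of parts) {
    if (part.isSplitter) {
      out.push(part.text);                 // splitters are always returned
    } else if (mode === 'keep') {
      out.push(part.text);                 // betweens become tokens too
    } else if (mode === 'throw') {
      throw new Error(`Unrecognized text: "${part.text}"`);
    } else if (mode !== 'remove') {
      throw new Error(`Unknown betweens mode: ${mode}`);
    }                                      // 'remove' silently drops it
  }
  return out;
}

const parts = [
  { text: 'if', isSplitter: true },
  { text: 'x', isSplitter: false },
  { text: 'then', isSplitter: true }
];
console.log(applyBetweens(parts, 'keep'));
console.log(applyBetweens(parts, 'remove'));
```

The 'throw' policy suits a strict language in which every piece of text must match the syntax, while 'keep' suits languages with free-form identifiers.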

condense

  • = false (default), return three tokens for each enclosure (opener, enclosed, and closer)
  • = true, return one token for each enclosure (adding opener and closer attributes to enclosed token, if rich)
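As a rough picture of the two shapes, consider the sketch below. The shapeEnclosure helper and the value property name are assumptions for illustration only; the actual layout of Retokenizer's rich tokens is not documented here, and rich formatting of the three separate tokens is left out.

```javascript
// Hypothetical sketch of how `condense` and `rich` might shape the tokens
// produced for one enclosure. NOT Retokenizer's code; the `value` property
// name is an assumption made for this example.
function shapeEnclosure(opener, enclosed, closer, { condense = false, rich = false } = {}) {
  if (!condense) {
    // Three tokens: opener, enclosed text, closer.
    return [opener, enclosed, closer];
  }
  // One token; the opener and closer survive only as attributes of a rich token.
  return rich ? [{ value: enclosed, opener, closer }] : [enclosed];
}

console.log(shapeEnclosure('"', 'hi', '"'));
console.log(shapeEnclosure('"', 'hi', '"', { condense: true }));
console.log(shapeEnclosure('"', 'hi', '"', { condense: true, rich: true }));
```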

caseful

  • = false (default), splitters are caseless
  • = true, splitters are caseful
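The effect of this option on matching can be sketched as follows. The splitterAt helper is hypothetical and only illustrates caseless versus caseful comparison; it is not how Retokenizer is known to implement the option.

```javascript
// Hypothetical helper, NOT Retokenizer's code: reports whether `splitter`
// occurs in `text` at position `i`, honoring a `caseful` flag.
function splitterAt(text, i, splitter, caseful = false) {
  const candidate = text.slice(i, i + splitter.length);
  return caseful
    ? candidate === splitter                              // exact match only
    : candidate.toLowerCase() === splitter.toLowerCase(); // ignore letter case
}

console.log(splitterAt('IF x THEN y', 0, 'if'));        // caseless (default)
console.log(splitterAt('IF x THEN y', 0, 'if', true));  // caseful
```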