y-scanner

v0.7.9

Simple, but powerful lexical scanner that is a more minimal implementation of X-Scanner

Y-Scanner


Simple, but powerful lexical scanner. Inspired by, but distinct from, Ruby's StringScanner.

This library is a simplified version of my other scanner library, X-Scanner.

It is used in much the same way as X-Scanner, but with most of the complex features stripped out. This leaves a scanner that is still very powerful, just better suited to cases where you need a good scanner but not an extensively customizable one.

Features

Y-Scanner supports the same general functionality that you may be used to from other string scanning libraries.

However, some interesting features that Y-Scanner provides are:

  • Pointers
    • Save your scanner state, and jump back to it if needed.
    • Y-Scanner keeps track of the last state the scanner was in, and has a method unscan() to revert back to the previous state, so you do not have to create a pointer for simple cases.
  • Scanning for an arbitrary number of options
    • The methods check() and scan() accept any number of options, each a string or RegExp, so you don't have to scan repeatedly and check whether an option matched yourself.
  • Powerful convenience methods
    • Y-Scanner comes built-in with functionality that is incredibly useful when scanning more complicated text, such as:

      • scanDelimited() which will scan for text between two delimiters and supports recursively scanning instances of delimited text within the overall delimited text.
        • If that description sounds confusing, an example of what it can scan would be:
          [ This is stuff in a bracket and "a string with the \"[]\" inside it" ]
          which is text between [] with a string between two " and the [] inside it and also escaped quotes inside the string.
      • scanUntil() which will scan text until it reaches some option.
      • Other utility methods such as scanInteger(), scanDecimal()
  • Built-In Case Insensitivity
    • Y-Scanner allows ensuring that any scans it tries are done in a case-insensitive manner by simply setting the insensitive option to true (and false to turn it off).

      You can do this at any time, so you have full control of how the scanner acts.

How To Install

First, make sure you have Node.js and NPM ready to go on your machine.

Then, in the folder with your code, enter on a command line:

npm install y-scanner

To install it for use globally on your computer:

npm install -g y-scanner

Then, import it into your code like any other NPM library:

const { YScanner } = require('y-scanner')

If you are using ES Module syntax in your code, then it's:

import { YScanner } from 'y-scanner'

Properties

text - string

The text to be scanned.

pos - number

The position of the scanner along the scanned text.

lastPos - number

The previous position of the scanner.

lastMatch - string or null

The previous match found from a scanner method that updates the scanner on a match.

insensitive - boolean

Whether the scanner scans in a case-insensitive manner.

Defaults to false.

unscannedText - string

The portion of the text that has yet to be scanned.

scannedText - string

The portion of the text that has already been scanned.

pointer - { pos, lastPos, lastMatch }

A pointer to the current state of the scanner.

lastState - { pos, lastPos, lastMatch }

A pointer to the previous state of the scanner.

eos - boolean

A boolean representing whether the scanner is at the end of the string being scanned.

An object representing the current state of the scanner.
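
As a rough illustration of how the derived properties relate to pos, here is a standalone sketch. It is not Y-Scanner's actual code: the class name TinySketch is hypothetical, and it assumes the derived properties are plain slices of text around pos.

```javascript
// Hypothetical stand-in class, NOT the library's implementation.
// Assumption: derived properties are simple slices of `text` around `pos`.
class TinySketch {
  constructor(text) {
    this.text = text
    this.pos = 0
  }

  // Everything before the current position.
  get scannedText() { return this.text.slice(0, this.pos) }

  // Everything from the current position onward.
  get unscannedText() { return this.text.slice(this.pos) }

  // True once the position has reached the end of the text.
  get eos() { return this.pos >= this.text.length }
}

const tiny = new TinySketch('Hello, World!')
tiny.pos = 5

console.log(tiny.scannedText)    // "Hello"
console.log(tiny.unscannedText)  // ", World!"
console.log(tiny.eos)            // false
```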

Methods

For all the code examples for these methods, assume the following code has already run (unless the example creates its own scanner):

const scanner = new YScanner("Hello, World!")

 

checkString(pattern)

Checks to see if the string pattern is next in the scanned text.

Returns the matched string, or null if not found.

Does not update the scanner on a match.

const greeting = scanner.checkString('Hello')

console.log(greeting)           // "Hello"
console.log(scanner.lastMatch)  // null

scanString(pattern)

Checks to see if the string pattern is next in the scanned text.

Returns the matched string, or null if not found.

Updates the scanner on a match.

const greeting = scanner.scanString('Hello')

console.log(greeting)           // "Hello"
console.log(scanner.lastMatch)  // "Hello"

checkRegex(pattern)

Checks to see if the regex pattern is next in the scanned text.

Returns the matched string, or null if not found.

Does not update the scanner on a match.

const greeting = scanner.checkRegex(/Hello|Salutations/)

console.log(greeting)           // "Hello"
console.log(scanner.lastMatch)  // null

scanRegex(pattern)

Checks to see if the regex pattern is next in the scanned text.

Returns the matched string, or null if not found.

Updates the scanner on a match.

const greeting = scanner.scanRegex(/Hello|Salutations/)

console.log(greeting)           // "Hello"
console.log(scanner.lastMatch)  // "Hello"

check(...patterns)

Checks to see if any of the patterns are next in the scanned text. Each option can be a string or regex.

Returns the matched string, or null if not found.

Does not update the scanner on a match.

const greeting = scanner.check('Hello', /Salutations/)

console.log(greeting)           // "Hello"
console.log(scanner.lastMatch)  // null

scan(...patterns)

Checks to see if any of the patterns are next in the scanned text. Each option can be a string or regex.

Returns the matched string, or null if not found.

Updates the scanner on a match.

const greeting = scanner.scan('Hello', /Salutations/)

console.log(greeting)           // "Hello"
console.log(scanner.lastMatch)  // "Hello"

skip(...patterns)

Checks to see if any of the patterns are next in the scanned text. Each option can be a string or regex.

Returns the length of the matched string, or null if not found.

Updates the scanner position on match, but does not update what the last matched text was. Useful if you need to do something like skip whitespace.

const skipped = scanner.skip('Hello', /Salutations/)

console.log(skipped)            // 5
console.log(scanner.lastMatch)  // null
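
To make the difference between check(), scan(), and skip() concrete, the following standalone sketch imitates the behavior described above. It is only an approximation under stated assumptions: the class MiniSketch and its match() helper are hypothetical, and regex patterns are anchored at the current position with the sticky flag, which may not be how Y-Scanner does it.

```javascript
// Hypothetical stand-in class, NOT the library's implementation.
class MiniSketch {
  constructor(text) {
    this.text = text
    this.pos = 0
    this.lastMatch = null
  }

  // Try each pattern at the current position; return the first match or null.
  match(patterns) {
    for (const pattern of patterns) {
      if (typeof pattern === 'string') {
        if (this.text.startsWith(pattern, this.pos)) return pattern
      } else {
        // The sticky flag forces the regex to match exactly at lastIndex.
        const flags = pattern.flags.replace(/[gy]/g, '') + 'y'
        const sticky = new RegExp(pattern.source, flags)
        sticky.lastIndex = this.pos
        const found = sticky.exec(this.text)
        if (found !== null) return found[0]
      }
    }
    return null
  }

  // Look ahead only: nothing about the scanner changes.
  check(...patterns) { return this.match(patterns) }

  // Advance past the match and remember it.
  scan(...patterns) {
    const matched = this.match(patterns)
    if (matched !== null) {
      this.pos += matched.length
      this.lastMatch = matched
    }
    return matched
  }

  // Advance past the match, but leave lastMatch untouched.
  skip(...patterns) {
    const matched = this.match(patterns)
    if (matched === null) return null
    this.pos += matched.length
    return matched.length
  }
}

const mini = new MiniSketch('Hello   World')

console.log(mini.scan('Hello', /Salutations/))  // "Hello"
console.log(mini.skip(/\s+/))                   // 3
console.log(mini.lastMatch)                     // "Hello"
console.log(mini.check('World'))                // "World"
console.log(mini.pos)                           // 8
```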

scanDelimited(options)

Scans any text coming up in the scanned text that is between two delimiters.

The options argument is an object with the following attributes:

| name           | description                                     | default     |
|:---------------|:------------------------------------------------|:-----------:|
| start          | The starting delimiter for the text             | "           |
| end            | The ending delimiter for the text               | "           |
| escape         | The escape character for the text               | \           |
| keepDelimiters | Whether to keep the delimiters around the match | false       |
| inner          | A list of info for any inner delimited text     | []          |
| autoNest       | An object to automatically generate inners      | undefined   |
| noEndFail      | Whether not finding end is a scan fail or not   | true        |

As you may notice, every option has a default, meaning you can just use scanDelimited() without any arguments to scan a normal double-quoted string, if that's what you want.

const scanner = new YScanner('"This is some quoted text"')

const stringText = scanner.scanDelimited()

console.log(stringText)  // "This is some quoted text"

Of course, you can change a couple things to easily parse something else:

const scanner = new YScanner("`Look it's a backtick string`")

const backtickString = scanner.scanDelimited({ start: '`', end: '`' })

console.log(backtickString)  // "Look it's a backtick string"
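
For intuition about what the start, end, and escape options do, here is a standalone sketch of delimiter scanning that does not use Y-Scanner. The function name scanQuotedSketch is hypothetical, and two behaviors are assumptions rather than documented facts: escape sequences are left in the result as-is, and a missing end delimiter returns null.

```javascript
// Hypothetical helper, NOT the library's implementation.
// Assumptions: escapes are kept verbatim in the result, and a missing
// end delimiter is treated as a failed scan (null).
function scanQuotedSketch(text, pos = 0, { start = '"', end = '"', escape = '\\' } = {}) {
  if (!text.startsWith(start, pos)) return null

  let i = pos + start.length
  while (i < text.length) {
    if (text.startsWith(escape, i)) {
      // Skip the escape and the character it protects.
      i += escape.length + 1
      continue
    }
    if (text.startsWith(end, i)) {
      return text.slice(pos + start.length, i)
    }
    i++
  }

  return null
}

console.log(scanQuotedSketch('"say \\"hi\\"" rest'))
// say \"hi\"

console.log(scanQuotedSketch("`a backtick string`", 0, { start: '`', end: '`' }))
// a backtick string
```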

If you have any delimited text inside your delimited text, you can use the inner option to specify this. Any inner delimited text is specified with an object with the following attributes:

| name           | description                                     | default                |
|:---------------|:------------------------------------------------|:----------------------:|
| start          | The starting delimiter for the text             | NONE                   |
| end            | The ending delimiter for the text               | NONE                   |
| escape         | The escape character for the text               | whatever options has   |
| keepDelimiters | Whether to keep the delimiters around the match | true                   |
| inner          | A list of info for any inner delimited text     | whatever options has   |
| autoNest       | An object to automatically generate inners      | undefined              |
| noEndFail      | Whether not finding end is a scan fail or not   | whatever options has   |

It's effectively the same as what you set for options, but start and end are required.

As an example, we'll cover how to scan that multi-level delimited text used earlier in the features section.

const scanner = new YScanner(
  '[This is stuff in a bracket and "a string with the \\"[]\\" inside it"]'
)

const result = scanner.scanDelimited({
  start: '[', end: ']',
  inner: [{
    start: '"', end: '"'
  }]
})

console.log(result)
// 'This is stuff in a bracket and "a string with the "[]" inside it"'

Inner delimited items can have inner delimited items inside them, so you can scan some crazy things with this method, if you needed to.

Note: Nested Delimited Text Result Behavior

Note that if inner delimited text does not find its own end delimiter, but has noEndFail set to false, then its success propagates to the whole scan, even if higher levels of delimited text have noEndFail set to true.

This leads to more intuitive behavior, such as

const scanner = new YScanner('"Hello, World [Inner Text"')

const text = scanner.scanDelimited({
  noEndFail: true,  // True by default, but here for demonstration.
  inner: [{ start: '[', end: ']', noEndFail: false }]
})

This returns a successful match of Hello, World [Inner Text instead of null, which is what would happen without this detail.

Note: Automatic Nested Delimited Text

Lastly, sometimes you may need automatically generated inner delimited text. Especially if you have something that needs to have arbitrarily nested instances of itself within itself, like braces in code like this example:

if (x) {
  if (y) {
    console.log('cool')
  }
}

You could not use inner alone to accomplish an indefinite amount of nested braced blocks of code here, as inner is something that is explicitly defined (and so literally not indefinite). But not doing anything would mean that the scan would end at the first }, which isn't very useful.

You can use the autoNest option to achieve this indefinite nesting. It will automatically generate copies of the main delimited text info and put them in inner (and do this for each layer of nested text), so that you can have automatic nested delimited text. You just need to give autoNest an object to say that you want it to automatically nest.

For example, to scan the code for a nested if-block given above:

const scanner = new YScanner(`
if (x) {
  if (y) {
    console.log('cool')
  }
}
`.trim())

const opening = scanner.scan('if (x) ')

const ifBlock = scanner.scanDelimited({
  start: '{', end: '}',
  autoNest: {},
  keepDelimiters: true
})

console.log(opening + ifBlock)
/*
"if (x) {
  if (y) {
    console.log('cool')
  }
}"
*/
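
For intuition about why indefinite nesting needs more than a fixed inner list, here is a standalone sketch that does not use Y-Scanner. It tracks nesting depth with a counter, which is one common way to match balanced delimiters and not necessarily how autoNest works internally. The function name scanBalancedSketch is hypothetical, and the sketch ignores escapes and braces inside strings.

```javascript
// Hypothetical helper, NOT the library's implementation.
// Returns the balanced block starting at `start`, delimiters included,
// or null if the text does not open there or never closes.
function scanBalancedSketch(text, start, open = '{', close = '}') {
  if (text[start] !== open) return null

  let depth = 0
  for (let i = start; i < text.length; i++) {
    if (text[i] === open) depth++
    else if (text[i] === close) depth--

    // Depth returns to zero exactly when the outermost block closes.
    if (depth === 0) return text.slice(start, i + 1)
  }

  return null
}

const code = "if (x) { if (y) { console.log('cool') } } done()"

console.log(scanBalancedSketch(code, code.indexOf('{')))
// "{ if (y) { console.log('cool') } }"
```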

 

That's cool, but you might ask: "How do I automatically nest my delimited text but have the automatically generated nested text be different than the base nested text?"

To make anything generated via autoNest different from what it's nested from, give the autoNest object any of the properties that you can give the scanDelimited method itself.

For example, let's say you want to scan a block of code and want the braces on the outside removed, but want any inner braces kept:

const scanner = new YScanner(`
{
  if (stuff) {

  }
}
`.trim())

const codeBlock = scanner.scanDelimited({
  start: '{', end: '}',
  autoNest: { keepDelimiters: true },
  keepDelimiters: false
})

console.log(codeBlock)
/*
"
  if (stuff) {

  }
"
*/

Anything defined in autoNest will be set for any generated nested text, instead of defaulting to whatever the parent delimited text has.

The most fun part of this is that instances of delimited text in inner can also have an autoNest in them, and even an autoNest (inner or not) can have autoNest inside it.

So you can just go absolutely nuts and have crazy automatically generated infinitely nested delimited text that has different nested text that has automatically generated infinitely nested delimited text inside of it.

I don't believe the potential is infinite, but it's much closer to infinity than a single method maybe should have.

scanUntil(patterns, options)

Scans text until any of the given patterns is encountered in the scanner text. Updates the scanner on a successful match.

The patterns argument is an array of patterns to try to scan until, and each pattern can be a string or regular expression.

The options argument takes an object with the following optional properties:

| name           | description                                            | default |
|:---------------|:-------------------------------------------------------|:-------:|
| failIfNone     | Whether not finding any pattern is a scan failure      | false   |
| includePattern | Whether to include the matched pattern with the result | false   |

As an example, let's say you have a small language and you have some text that can be anything up to the start of some square brackets. You can easily scan it by doing:

const scanner = new YScanner('Hello, World! [other stuff]')
const result = scanner.scanUntil(['['])

console.log(result)  // "Hello, World! "

If you wanted to include additional kinds of brackets to stop at, you could do:

const scanner = new YScanner('Hello, World! <other stuff again>')
const result = scanner.scanUntil(['[', '(', '{', '<'], { includePattern: true })

console.log(result)  // "Hello, World! <"

Compared to the methods next to this one in the list, this is pretty simple and straightforward.
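
As a sketch of the idea behind scanUntil(), the standalone function below finds the earliest occurrence of any pattern and returns the text before it. It is not Y-Scanner's code: the name scanUntilSketch is hypothetical, and returning the rest of the text when nothing matches and failIfNone is false is an assumption based on the option description.

```javascript
// Hypothetical helper, NOT the library's implementation.
// Assumption: with failIfNone false, no match returns the remaining text.
function scanUntilSketch(text, pos, patterns, { failIfNone = false, includePattern = false } = {}) {
  let bestIndex = -1
  let bestLength = 0

  for (const pattern of patterns) {
    let index = -1
    let length = 0

    if (typeof pattern === 'string') {
      index = text.indexOf(pattern, pos)
      length = pattern.length
    } else {
      // A global copy of the regex lets us start searching from `pos`.
      const flags = pattern.flags.replace(/[gy]/g, '') + 'g'
      const search = new RegExp(pattern.source, flags)
      search.lastIndex = pos
      const found = search.exec(text)
      if (found !== null) {
        index = found.index
        length = found[0].length
      }
    }

    // Keep whichever pattern appears first in the text.
    if (index !== -1 && (bestIndex === -1 || index < bestIndex)) {
      bestIndex = index
      bestLength = length
    }
  }

  if (bestIndex === -1) return failIfNone ? null : text.slice(pos)
  return text.slice(pos, bestIndex + (includePattern ? bestLength : 0))
}

console.log(scanUntilSketch('Hello, World! [other stuff]', 0, ['[']))
// "Hello, World! "

console.log(scanUntilSketch('Hello, World! <other stuff again>', 0, ['[', '(', '{', '<'], { includePattern: true }))
// "Hello, World! <"
```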

checkInteger(options)

Scans to see if there is any text next in the scanner that fits the format of an integer value. However, this method provides the ability for you to extensively define just what "integer value" means for your use case.

As long as it generally fits the form of "text joined together without spaces, with maybe some decoration on it", this will scan it.

The options argument for this method takes an object with any of the following properties:

| name             | description                                       | default      |
|:-----------------|:--------------------------------------------------|:------------:|
| sign             | Pattern for a sign at the start                   | optional +/- |
| prefix           | Pattern for any number prefix                     | null         |
| leading          | Pattern for any leading parts of the number       | optional 0s  |
| digits           | Pattern for the digits for the number             | 0 - 9        |
| separator        | Pattern for any separators between digits         | ,            |
| postfix          | Pattern for any part at the end of the number     | null         |
| removeSeparators | Whether to remove separators from the number      | true         |
| split            | Whether to return an object instead of the number | false        |

NOTE: Patterns can be a string, regular expression, or null. Everything except digits is optional. If you want something to be required, set split to true and check whether that portion is null.

If split is set to true, then instead of the matched string of text for the integer, this method will return an object with the following properties:

| name    | description                                 |
|:--------|:--------------------------------------------|
| sign    | The matched sign, or null if none           |
| prefix  | Any part before the number, or null if none |
| leading | Any leading part of the number, or null     |
| number  | The main part of the number                 |
| postfix | Any part after the number, or null if none  |

NOTE: A bad match will still return just null, not this object.

 

To show off how this method is used, let's see how it works with no options set (which will scan a normal decimal integer value):

const scanner = new YScanner('1,000,000')

const num = scanner.checkInteger()

console.log(num)  // "1000000"

If we set removeSeparators to false, it will keep them in the result:

const scanner = new YScanner('1,000,000')

const num = scanner.checkInteger({ removeSeparators: false })

console.log(num)  // "1,000,000"

Let's try a more fun example, like changing the number format to hexadecimal:

const scanner = new YScanner('0xDEAD_BEEF')

const num = scanner.checkInteger({
  prefix: /0x/i,
  digits: /[0-9a-f]/i,  // Note that "1234567890ABCDEF" would work too.
  separator: '_'        // We can have some fun and allow underscore separators.
})

console.log(num)  // "0xDEADBEEF"

Lastly, let's try a more complex example. Imagine you want to parse a signed hex value for a programming language you're making, and you also wanna allow it to have an i at the end to make it an imaginary integer:

const scanner = new YScanner('-0xDEAD_BEEFi')

const num = scanner.checkInteger({
  prefix: /0x/i,
  digits: /[0-9a-f]/i,  // Note that "1234567890ABCDEF" would work too.
  separator: '_',       // We can have some fun and allow underscore separators.
  postfix: 'i',
  split: true
})

console.log(num)
// { sign: '-', prefix: '0x', leading: null, number: 'DEADBEEF', postfix: 'i' }
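
To show roughly how such options could combine into one pattern, the standalone sketch below hard-codes the signed hex format from the last example as a single regular expression with named groups. It does not use Y-Scanner and is not how the library builds its patterns; the names hexIntegerSketch and splitHexSketch are hypothetical, and the leading part is omitted for brevity.

```javascript
// Hypothetical pattern, NOT the library's implementation.
// sign, prefix, and postfix are optional; digits may be grouped with "_".
const hexIntegerSketch =
  /^(?<sign>[+-])?(?<prefix>0x)?(?<number>[0-9a-f]+(?:_[0-9a-f]+)*)(?<postfix>i)?/i

function splitHexSketch(text) {
  const found = hexIntegerSketch.exec(text)
  if (found === null) return null

  const { sign, prefix, number, postfix } = found.groups
  return {
    sign: sign ?? null,
    prefix: prefix ?? null,
    number: number.replace(/_/g, ''),  // Drop separators, like removeSeparators.
    postfix: postfix ?? null
  }
}

console.log(splitHexSketch('-0xDEAD_BEEFi'))
// { sign: '-', prefix: '0x', number: 'DEADBEEF', postfix: 'i' }

console.log(splitHexSketch('xyz'))
// null
```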

scanInteger(options)

Same as #checkInteger, but it will also update the scanner position on a good match.

checkDecimal(options)

Scans to see if there is any text next in the scanner that fits the format of a decimal value. Like #checkInteger, there is a multitude of options that will allow you to define what you want a decimal to be.

The options argument for this method takes an object with any of the following properties:

| name             | description                                       | default      |
|:-----------------|:--------------------------------------------------|:------------:|
| sign             | Pattern for a sign at the start                   | optional +/- |
| prefix           | Pattern for any number prefix                     | null         |
| leading          | Pattern for any leading parts of the number       | optional 0s  |
| digits           | Pattern for the digits for the number             | 0 - 9        |
| separator        | Pattern for any separators between digits         | ,            |
| radix            | Pattern for the radix point in the decimal value  | .            |
| trailing         | Pattern for any trailing parts of the number      | optional 0s  |
| postfix          | Pattern for any part at the end of the number     | null         |
| removeSeparators | Whether to remove separators from the number      | true         |
| split            | Whether to return an object instead of the number | false        |

NOTE: Patterns can be a string, regular expression, or null. Everything except digits is optional. If you want something to be required, set split to true and check whether that portion is null.

If split is set to true, then instead of the matched string of text for the decimal, this method will return an object with the following properties:

| name       | description                                     |
|:-----------|:------------------------------------------------|
| sign       | The matched sign, or null if none               |
| prefix     | Any part before the number, or null if none     |
| leading    | Any leading part of the number, or null         |
| whole      | The whole number portion of the decimal or null |
| radix      | The radix point for the decimal, or null        |
| fractional | The fractional portion of the decimal, or null  |
| number     | The entire numeric portion of the number        |
| trailing   | Any trailing part of the number, or null        |
| postfix    | Any part after the number, or null if none      |

NOTE: A bad match will still return just null, not this object.

NOTE 2: If the fractional part of a decimal value has text at the end that matches trailing text, the trailing part will take it, but will leave one digit in the fractional part (so .0 will not be . with trailing 0).

 

Like with checkInteger, we'll go over some examples; the most basic is just calling checkDecimal() with no options, which will scan a standard signed decimal number:

const scanner = new YScanner('5.23')
const num = scanner.checkDecimal()

console.log(num)  // "5.23"

Additionally, split will return an object of all possible parts of the number split up:

const scanner = new YScanner('-5.23')
const num = scanner.checkDecimal({ split: true })

console.log(num)
/* { sign: '-', prefix: null, leading: null, whole: '5', radix: '.',
     fractional: '23', number: '5.23', trailing: null, postfix: null } */

To up the complexity, let's say you want to scan what you call a "signed, decimal hex value" with optional underscore separators (that are simply ignored by your parser) and the ability to put an i at the end to signify an imaginary number:

const scanner = new YScanner('-0xDEAD_BEEF.DEAF_CAFEi')
const num = scanner.checkDecimal({
  prefix: /0x/i,
  digits: /[0-9a-f]/i,
  separator: '_',
  postfix: 'i',
  split: true
})

console.log(num)
/* { sign: '-', prefix: '0x', leading: null, whole: 'DEADBEEF', radix: '.',
     fractional: 'DEAFCAFE', number: 'DEADBEEF.DEAFCAFE', trailing: null,
     postfix: 'i' } */

Note on how numbers get split

Just for a small reference on how your numbers should come out for questionable scenarios, here's a table of how some values get split up:

Note: This is assuming leading/trailing zeroes are allowed.

| value   | leading | whole | fractional | trailing | note     |
|:--------|:-------:|:-----:|:----------:|:--------:|:--------:|
| 0.0     | null    | 0     | 0          | null     |          |
| .1      | null    | null  | 1          | null     |          |
| 3.      | null    | null  | null       | null     | no match |
| 00.0    | 0       | 0     | 0          | null     |          |
| 020.000 | 0       | 20    | 0          | 00       |          |
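
The splitting rules in that table can be reproduced with a short standalone sketch. This is only a model of the table rows, not Y-Scanner's code: the function name splitDecimalSketch is hypothetical, it handles plain base-10 digits with a "." radix only, and it assumes a fractional digit is required (which is why 3. does not match).

```javascript
// Hypothetical helper that models the table above, NOT the library's code.
function splitDecimalSketch(value) {
  // Optional whole digits, a radix point, and at least one fractional digit.
  const match = /^(\d*)\.(\d+)$/.exec(value)
  if (match === null) return null

  // Leading zeros are peeled off, but one digit always stays in `whole`.
  const wholePart = /^(0*)(\d+)$/.exec(match[1])

  // Trailing zeros are peeled off, but one digit always stays in `fractional`.
  const fractionPart = /^(\d+?)(0*)$/.exec(match[2])

  return {
    leading: wholePart !== null && wholePart[1] !== '' ? wholePart[1] : null,
    whole: wholePart !== null ? wholePart[2] : null,
    fractional: fractionPart[1],
    trailing: fractionPart[2] !== '' ? fractionPart[2] : null
  }
}

console.log(splitDecimalSketch('020.000'))
// { leading: '0', whole: '20', fractional: '0', trailing: '00' }

console.log(splitDecimalSketch('.1'))
// { leading: null, whole: null, fractional: '1', trailing: null }

console.log(splitDecimalSketch('3.'))
// null
```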

scanDecimal(options)

Same as #checkDecimal, but it will also update the scanner position on a good match.

loadPointer(pointer)

Loads up a pointer for some state for the scanner. This can either be a pointer saved from calling pointer, or an object that has the structure { pos, lastPos, lastMatch }.

const start = scanner.pointer

scanner.scan('Hello')
scanner.loadPointer(start)

console.log(scanner.lastMatch)  // null

movePosition(n)

Moves the scanner position n characters away from the current position. Negative numbers will move the scanner backwards.

If the resulting position is past the end of the text being scanned, it will be set to the end of the text.

If the resulting position is before the start of the text being scanned, it will be set to 0.

scanner.movePosition(1)

console.log(scanner.pos)  // 1

scanner.movePosition(-100)

console.log(scanner.pos)  // 0
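
The bounds behavior described above amounts to clamping the target position between 0 and the text length. A minimal standalone sketch of that rule, with the hypothetical helper name clampPositionSketch (not part of Y-Scanner):

```javascript
// Hypothetical helper, NOT the library's implementation.
// Keeps a target position inside the range [0, text.length].
function clampPositionSketch(text, position) {
  return Math.min(Math.max(position, 0), text.length)
}

const sample = 'Hello, World!'

console.log(clampPositionSketch(sample, 0 + 1))    // 1
console.log(clampPositionSketch(sample, 1 - 100))  // 0
console.log(clampPositionSketch(sample, 500))      // 13
```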

setPosition(n)

Sets the scanner position to n.

Like movePosition(), if the resulting position is outside of the bounds of the text being scanned, it will be set to the start/end of the text.

scanner.setPosition(5)

console.log(scanner.pos)  // 5

scanner.setPosition(-100)

console.log(scanner.pos)  // 0

append(text)

Adds the given text to the end of the scanner text. Does not change anything except for the text, effectively just making it longer.

The text argument is simply a string of text.

const scanner = new YScanner('Hello')
console.log(scanner.text)             // "Hello"

scanner.append(', World!')
console.log(scanner.text)             // "Hello, World!"

prepend(text, options)

Adds the given text to the beginning of the scanner text. Will either adjust the scan pointer to adjust for the new text at the start, or reset the scanner as if it started with the new text.

The text argument is simply a string of text.

The options argument is an object with the following properties:

| name  | description                         | default |
|:------|:------------------------------------|:-------:|
| reset | Whether to reset the scanner or not | false   |

If reset is set to false, then the scanner will shift the current position forward the length of the prepended text.

const scanner = new YScanner(', World!')
console.log(scanner.text)             // ", World!"
console.log(scanner.pos)              // 0

scanner.prepend('Hello')

console.log(scanner.text)             // "Hello, World!"
console.log(scanner.pos)              // 5

If reset is true, then the scanner will simply reset itself.

const scanner = new YScanner(', World!')
console.log(scanner.text)             // ", World!"
console.log(scanner.pos)              // 0

scanner.prepend('Hello', { reset: true })

console.log(scanner.text)             // "Hello, World!"
console.log(scanner.pos)              // 0

reset()

Resets the scanner back to its initial state, as if it was brand new.

scanner.scan('Hello')

scanner.reset()

console.log(scanner.lastMatch)  // null

terminate(options)

Moves the scanner to the end of the scanner text.

The options argument is an object with the optional properties:

| name  | description                             | default |
|:------|:----------------------------------------|:-------:|
| clear | Whether to clear the match data as well | false   |

const scanner = new YScanner('Hello')
scanner.terminate()

console.log(scanner.pos)  // 5

If you set the clear option to true, then the last matched text will be cleared and set to null as well.

const scanner = new YScanner('Hello, World!')

scanner.scan('Hello')
console.log(scanner.lastMatch)  // "Hello"

scanner.terminate({ clear: true })

console.log(scanner.pos)        // 13
console.log(scanner.lastMatch)  // null

updateMatch(match)

Takes a match string and updates the scanner as if it had just scanned that string.

This method should generally not be used, as the scanning methods already use it, but the option is provided for any weird edge cases where you need it.

scanner.updateMatch('Hello')

console.log(scanner.lastMatch)  // "Hello"

unscan()

Reverts the scanner state to that of lastState, which is the previous scanner position.

scanner.scan('Hello')

scanner.unscan()

console.log(scanner.pos)  // 0
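
To illustrate how pointer, lastState, loadPointer(), and unscan() could fit together, here is a standalone sketch. It is an assumption about the general approach, not Y-Scanner's actual code: the class name UndoSketch is hypothetical, and it saves a pointer just before every successful scan.

```javascript
// Hypothetical stand-in class, NOT the library's implementation.
class UndoSketch {
  constructor(text) {
    this.text = text
    this.pos = 0
    this.lastPos = 0
    this.lastMatch = null
    this.lastState = this.pointer
  }

  // A plain snapshot of the current state.
  get pointer() {
    return { pos: this.pos, lastPos: this.lastPos, lastMatch: this.lastMatch }
  }

  // Restore a previously saved snapshot.
  loadPointer({ pos, lastPos, lastMatch }) {
    this.pos = pos
    this.lastPos = lastPos
    this.lastMatch = lastMatch
  }

  scanString(pattern) {
    if (!this.text.startsWith(pattern, this.pos)) return null
    this.lastState = this.pointer  // Remember where we were before moving.
    this.lastPos = this.pos
    this.pos += pattern.length
    this.lastMatch = pattern
    return pattern
  }

  // Step back to the state saved by the most recent successful scan.
  unscan() {
    this.loadPointer(this.lastState)
  }
}

const undo = new UndoSketch('Hello, World!')
undo.scanString('Hello')
undo.scanString(', ')
console.log(undo.pos)        // 7

undo.unscan()
console.log(undo.pos)        // 5
console.log(undo.lastMatch)  // "Hello"
```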

duplicate()

Creates a copy of the Y-Scanner instance. This will actually deep-copy the scanner, and not simply make a reference copy of it.

scanner.scan('Hello')

const newScanner = scanner.duplicate()

console.log(newScanner.lastMatch)  // "Hello"

Static Methods

backscan(text, pattern)

Scans the given input text from the end to see if the given option matches. (e.g. "I like eggs" would backwards match "eggs")

The text argument is the string of text to attempt to scan.

The pattern argument is a string or regular expression to try to match at the end of the text string.

This method will return an object with the following properties:

| name    | description                                                 |
|:--------|:------------------------------------------------------------|
| result  | The text matched at the end of the input (null if no match) |
| newText | The input text with the match removed from the end          |

As an example, let's say you are making a small language for programming the layout of something, and you want to see if some text you have ends with a certain kind of bracket:

const layout = 'Here is some text [This is a button]'
const brackets = /]|\)|}/

const endBracket = YScanner.backscan(layout, brackets)

console.log(endBracket)
/* { result: ']', newText: 'Here is some text [This is a button' } */

You may wonder: "Doesn't JavaScript have a method to check if a string ends with something?"

The thing that sets this method apart is that endsWith explicitly cannot use a regular expression, while backscan can. This makes it more convenient and more flexible, and you can give backscan a string or regex interchangeably without any hassle.
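
One way to get this behavior in plain JavaScript is to anchor the regular expression to the end of the text with $. The standalone sketch below shows that technique. It is not Y-Scanner's code: the function name backscanSketch is hypothetical, and returning the unchanged text as newText when nothing matches is an assumption.

```javascript
// Hypothetical helper, NOT the library's implementation.
// Assumption: when nothing matches, newText is the original text.
function backscanSketch(text, pattern) {
  let result = null

  if (typeof pattern === 'string') {
    if (text.endsWith(pattern)) result = pattern
  } else {
    // Wrap the source in a group and anchor it to the end of the input.
    const flags = pattern.flags.replace(/[gy]/g, '')
    const anchored = new RegExp(`(?:${pattern.source})$`, flags)
    const found = anchored.exec(text)
    if (found !== null) result = found[0]
  }

  const newText = result === null
    ? text
    : text.slice(0, text.length - result.length)

  return { result, newText }
}

console.log(backscanSketch('Here is some text [This is a button]', /]|\)|}/))
// { result: ']', newText: 'Here is some text [This is a button' }

console.log(backscanSketch('I like eggs', 'eggs'))
// { result: 'eggs', newText: 'I like ' }
```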