sequence-matching

v0.0.2

A library for matching sequences of consecutive elements in an array by using a pattern.

sequence-matching

A library that finds sequences of consecutive elements in an array that match a pattern. Optionally, you can keep or remove embedded matches (matches contained in a longer match).

When is this useful?

Let's say you have an array of tokens representing a sentence in a natural language document. Each token is a word or a punctuation mark, and it comes with its part-of-speech tag (e.g. whether it is a noun, an adjective or something else).

const items = [
	{ i: 0, text: `I`, tag: `N/A` },
	{ i: 1, text: `prefer`, tag: `VERB` },
	{ i: 2, text: `small`, tag: `ADJ` },
	{ i: 3, text: `fluffy`, tag: `ADJ` },
	{ i: 4, text: `dogs`, tag: `NOUN` },
	{ i: 5, text: `,`, tag: `PUNCT` },
	{ i: 6, text: `and`, tag: `N/A` },
	{ i: 7, text: `cats`, tag: `NOUN` },
	{ i: 8, text: `too`, tag: `N/A` },
]

Now, suppose we want a pattern that matches:

  • zero to three adjectives
  • followed by one to five nouns

const pattern = [
	{ predicate: (token) => token.tag === `ADJ`, min: 0, max: 3 },
	{ predicate: (token) => token.tag === `NOUN`, min: 1, max: 5 },
]

This should give us two matches, each listed as the indexes of the matched items:

const matcher = new SequenceMatcher(items, pattern)
await matcher.evaluate()
expect(matcher.matches).toStrictEqual([
	[2, 3, 4],
	[7],
])

What for?

In simple terms:

  • you have an array of elements (any type, even scalars)
  • you have a pattern as an array of conditions in which each condition is:
    • predicate – a boolean function to check each element against
    • min – the minimum number of consecutive elements that must match this condition (inclusive)
    • max – the maximum number of consecutive elements that may match this condition (inclusive)

In that array, you want to find every sequence of consecutive elements that matches the pattern.
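To make the shape of a pattern concrete, here is a minimal TypeScript sketch. The names `Condition`, `Pattern` and `matchLengthBounds` are hypothetical and chosen for illustration; the library's real exports may differ, and predicates are assumed to be synchronous here.

```typescript
// Hypothetical type names for illustration; not the library's confirmed API.
interface Condition<T> {
	predicate: (element: T) => boolean
	min: number
	max: number
}

type Pattern<T> = Condition<T>[]

// The shortest and longest possible match are the sums of the mins and of the maxes.
function matchLengthBounds<T>(pattern: Pattern<T>): { shortest: number; longest: number } {
	let shortest = 0
	let longest = 0
	for (const condition of pattern) {
		shortest += condition.min
		longest += condition.max
	}
	return { shortest, longest }
}

type Token = { text: string; tag: string }

const pattern: Pattern<Token> = [
	{ predicate: (token) => token.tag === `ADJ`, min: 0, max: 3 },
	{ predicate: (token) => token.tag === `NOUN`, min: 1, max: 5 },
]

const bounds = matchLengthBounds(pattern) // { shortest: 1, longest: 8 }
```

With this pattern, a match can be as short as a single noun and as long as three adjectives followed by five nouns.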

Workflow

A library like this is full of edge cases, so a regular loop did not work.

Instead, we follow these steps:

  1. Build a predicate matrix
    • test each element against each predicate and save that result in a matrix for lookups later
  2. Compute all possible combinations (see combination-builder.ts)
    • each condition has a varying length based on its min and max
    • each match has a length between the pattern's minimum and maximum
    • based on that we compute every single combination possible
  3. Walk each combination and find matches
  4. Discard embedded matches (e.g. [5, 6] is embedded in [5, 6, 7])
  5. Sort the matches
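The steps above can be sketched roughly as follows. This is a simplified illustration written from the description only, not the library's actual code: every function name is hypothetical, predicates are assumed to be synchronous, and embedded matches are always discarded.

```typescript
interface Condition<T> {
	predicate: (element: T) => boolean
	min: number
	max: number
}

// Step 1: matrix[c][i] is true when element i passes condition c.
function buildPredicateMatrix<T>(items: T[], pattern: Condition<T>[]): boolean[][] {
	const matrix: boolean[][] = []
	for (const condition of pattern) {
		const row: boolean[] = []
		for (const item of items) row.push(condition.predicate(item))
		matrix.push(row)
	}
	return matrix
}

// Step 2: pick one length per condition, in every possible arrangement.
function buildCombinations<T>(pattern: Condition<T>[]): number[][] {
	let combinations: number[][] = [[]]
	for (const { min, max } of pattern) {
		const next: number[][] = []
		for (const partial of combinations) {
			for (let length = min; length <= max; length++) next.push(partial.concat([length]))
		}
		combinations = next
	}
	return combinations
}

// Step 3: try every combination at every start index and record the index runs that fit.
function walkCombinations(matrix: boolean[][], combinations: number[][], itemCount: number): number[][] {
	const matches: number[][] = []
	for (const lengths of combinations) {
		for (let start = 0; start < itemCount; start++) {
			const indexes: number[] = []
			let cursor = start
			let ok = true
			for (let c = 0; ok && c < lengths.length; c++) {
				for (let k = 0; ok && k < lengths[c]; k++) {
					if (cursor < itemCount && matrix[c][cursor]) indexes.push(cursor++)
					else ok = false
				}
			}
			if (ok && indexes.length > 0) matches.push(indexes)
		}
	}
	return matches
}

// Steps 4 and 5: drop duplicates and runs contained in a longer run, then sort by first index.
function discardEmbedded(matches: number[][]): number[][] {
	const kept: number[][] = []
	for (const candidate of matches) {
		const first = candidate[0]
		const last = candidate[candidate.length - 1]
		let embedded = false
		for (const other of matches) {
			const contains = other[0] <= first && last <= other[other.length - 1]
			if (contains && other.length > candidate.length) embedded = true
		}
		// Runs are consecutive, so the first index and the length identify a run.
		const duplicate = kept.some((run) => run[0] === first && run.length === candidate.length)
		if (!embedded && !duplicate) kept.push(candidate)
	}
	return kept.sort((a, b) => a[0] - b[0])
}

function findSequences<T>(items: T[], pattern: Condition<T>[]): number[][] {
	const matrix = buildPredicateMatrix(items, pattern)
	return discardEmbedded(walkCombinations(matrix, buildCombinations(pattern), items.length))
}

// Scalars work too: here each element is just a part-of-speech tag.
const tags = [`N/A`, `VERB`, `ADJ`, `ADJ`, `NOUN`, `PUNCT`, `N/A`, `NOUN`, `N/A`]
const nounPhrase: Condition<string>[] = [
	{ predicate: (tag) => tag === `ADJ`, min: 0, max: 3 },
	{ predicate: (tag) => tag === `NOUN`, min: 1, max: 5 },
]
const found = findSequences(tags, nounPhrase) // [[2, 3, 4], [7]]
```

Before step 4 the walk also finds [4] and [3, 4]; both are embedded in [2, 3, 4], so they are discarded.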

Performance

The slowest step in the workflow above is the walk (#3). Performance varies with the number of items in the array and the number of possible combinations. To improve speed, avoid "wide" conditions whose min and max are far apart.
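As a rough illustration of why wide conditions are costly: if each condition independently contributes every length from min to max, which is an assumption based on the workflow description, the number of combinations is the product of (max - min + 1) over all conditions. The helper name `countCombinations` is hypothetical.

```typescript
interface Condition<T> {
	predicate: (element: T) => boolean
	min: number
	max: number
}

// Number of length combinations: the product of (max - min + 1) over all conditions.
function countCombinations<T>(pattern: Condition<T>[]): number {
	let total = 1
	for (const { min, max } of pattern) total *= max - min + 1
	return total
}

const always = () => true // the predicate is irrelevant for counting

const narrow: Condition<string>[] = [
	{ predicate: always, min: 0, max: 3 },
	{ predicate: always, min: 1, max: 5 },
]
const wide: Condition<string>[] = [
	{ predicate: always, min: 0, max: 30 },
	{ predicate: always, min: 1, max: 50 },
]

const narrowCount = countCombinations(narrow) // 4 * 5 = 20
const wideCount = countCombinations(wide) // 31 * 50 = 1550
```

Since each combination is tried against the array during the walk, the wide pattern here means roughly 77 times more combinations to check than the narrow one.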

Initially, I built this using reduce, map, filter, for...of, for...in and Array(n).fill(x).map() in various places, but the execution time was too high. That is why the code now contains many plain for loops: they shave off meaningful milliseconds, at the cost of some readability.

Help

Please open a PR if you find an edge case and manage to fix it.