@gruhn/regex-utils

v2.9.1

TypeScript library for regex equivalence, intersection, complement and other utilities that go beyond string matching.

Regex Utils

Zero-dependency TypeScript library for regex utilities that go beyond string matching. These are surprisingly hard to come by for any programming language. ✨

API Overview 🚀

Installation 📦

npm install @gruhn/regex-utils

import { RB } from '@gruhn/regex-utils'

Syntax Support

| Feature | Support | Examples |
|---------|---------|----------|
| Quantifiers | ✅ | a*, a+, a{3,10}, a? |
| Lazy Quantifiers | ✅ | a*?, a+?, a{3,10}?, a?? |
| Alternation | ✅ | a\|b |
| Character classes | ✅ | ., \w, [a-zA-Z], ... |
| Escaping | ✅ | \$, \., ... |
| (Non-)capturing groups | ✅ | (?:...), (...) |
| Start/end anchors | ⚠️ (1) | ^, $ |
| Lookahead | ⚠️ (2) | (?=...), (?!...) |
| Lookbehind | ⚠️ (2) | (?<=...), (?<!...) |
| Word boundary | ❌ | \b, \B |
| Unicode property escapes | ❌ | \p{...}, \P{...} |
| Backreferences | ❌ | \1 \2 ... |
| dotAll flag | ✅ | /.../s, (?s:...) |
| global flag | ✅ | /.../g |
| hasIndices flag | ✅ | /.../d |
| ignoreCase flag | ❌ | /.../i (?i:...) |
| multiline flag | ❌ | /.../m (?m:...) |
| unicode flag | ❌ | /.../u |
| unicodeSets flag | ❌ | /.../v |
| sticky flag | ❌ | /.../y |

  1. Some complex patterns are not supported, such as anchors inside quantifiers like (^a)+ or anchors inside lookaheads like (?=^a).
  2. Nested lookaheads/lookbehinds like (?=a(?=b)) are not supported, nor are combinations of lookaheads and lookbehinds like (?=a)b(?<=c).

An UnsupportedSyntaxError is thrown when unsupported patterns are detected. The library SHOULD ALWAYS either throw an error or respect the regex specification exactly. Please report a bug if the library silently uses a faulty interpretation.
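Going by the flag rows of the table above, a regex can be pre-checked natively before it is handed to the library. The following is only a sketch: `unsupportedFlags` and `UNSUPPORTED_FLAGS` are hypothetical names that are not part of `@gruhn/regex-utils`, and the flag list is simply copied from the ❌ rows of the table.

```typescript
// Flags marked ❌ in the syntax table:
// ignoreCase (i), multiline (m), unicode (u), unicodeSets (v), sticky (y).
// Hypothetical helper for illustration only; not part of the library.
const UNSUPPORTED_FLAGS: string[] = ['i', 'm', 'u', 'v', 'y']

function unsupportedFlags(regex: RegExp): string[] {
  // RegExp.prototype.flags is a string such as "gi".
  return UNSUPPORTED_FLAGS.filter(flag => regex.flags.indexOf(flag) !== -1)
}

console.log(unsupportedFlags(new RegExp('^[a-z]*$', 'gs'))) // []
console.log(unsupportedFlags(new RegExp('^[a-z]*$', 'im'))) // [ 'i', 'm' ]
```

A check like this only covers flags. Unsupported pattern syntax (word boundaries, backreferences and so on) is still reported by the library itself via UnsupportedSyntaxError.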

Handling syntax-related errors:

import { RB, ParseError, UnsupportedSyntaxError } from '@gruhn/regex-utils'

try {
  RB(/^[a-z]*$/)
} catch (error) {
  if (error instanceof SyntaxError) {
    // Invalid regex syntax! Native error, not emitted by this library.
    // E.g. this will also throw a `SyntaxError`: new RegExp(')')
  } else if (error instanceof ParseError) {
    // The regex syntax is valid but the internal parser could not handle it.
    // If this happens it's a bug in this library.
  } else if (error instanceof UnsupportedSyntaxError) {
    // Regex syntax is valid but not supported by this library.
  }
}
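The first branch above concerns the native `SyntaxError`, which the JavaScript engine raises on its own when a pattern string is malformed. A minimal sketch of that behaviour without the library, where `tryCompile` is a hypothetical helper name used only here:

```typescript
// Returns a RegExp if `source` is valid regex syntax,
// or null if the JavaScript engine rejects it.
// `tryCompile` is a hypothetical helper, not part of the library.
function tryCompile(source: string): RegExp | null {
  try {
    return new RegExp(source)
  } catch (error) {
    if (error instanceof SyntaxError) {
      return null // e.g. an unmatched ")"
    }
    throw error // anything else is unexpected, so re-throw
  }
}

console.log(tryCompile(')'))        // null
console.log(tryCompile('^[a-z]*$')) // /^[a-z]*$/
```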

Example use cases 💡

Generate test data from regex 📜

Generate 5 random email addresses:

const email = RB(/^[a-z]+@[a-z]+\.[a-z]{2,3}$/)
for (const str of email.sample().take(5)) {
  console.log(str)
}

Generate 5 random email addresses, which have exactly 20 characters:

const emailLength20 = email.and(/^.{20}$/)
for (const str of emailLength20.sample().take(5)) {
  console.log(str)
}
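For comparison, here is roughly what producing such test data looks like without the library. This is a hand-rolled sketch that works only for this one pattern; `randomEmailOfLength` and its helpers are hypothetical names, the distribution is not uniform, and it says nothing about how `.sample()` is implemented.

```typescript
const emailPattern = /^[a-z]+@[a-z]+\.[a-z]{2,3}$/

// Random lowercase string with `count` letters.
function randomLetters(count: number): string {
  let out = ''
  for (let i = 0; i < count; i++) {
    out += String.fromCharCode(97 + Math.floor(Math.random() * 26))
  }
  return out
}

// Random integer in the inclusive range [min, max].
function randomInt(min: number, max: number): number {
  return min + Math.floor(Math.random() * (max - min + 1))
}

// Builds "local@domain.tld" with exactly `total` characters (requires total >= 7).
function randomEmailOfLength(total: number): string {
  const tld = randomInt(2, 3)
  const lettersLeft = total - tld - 2 // subtract "@" and "."
  const local = randomInt(1, lettersLeft - 1) // keep at least one letter for the domain
  const domain = lettersLeft - local
  return randomLetters(local) + '@' + randomLetters(domain) + '.' + randomLetters(tld)
}

for (let i = 0; i < 5; i++) {
  const str = randomEmailOfLength(20)
  // Each string satisfies both constraints, i.e. lies in their intersection.
  console.log(str, emailPattern.test(str) && /^.{20}$/.test(str))
}
```

Every new constraint would need its own hand-written generator like this one, which is the work that `.and(...)` followed by `.sample()` takes over.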

Refactor regex then check equivalence 🔄

Say we found this incredibly complicated regex somewhere in the codebase:

const oldRegex = /^a|b$/

This can be simplified, right?

const newRegex = /^[ab]$/

But to double-check we can use .isEquivalent to verify that the new version matches exactly the same strings as the old version. That is, whether oldRegex.test(str) === newRegex.test(str) for every possible input string:

RB(oldRegex).isEquivalent(newRegex) // false

Looks like we made a mistake. We can generate counterexamples using .without(...) and .sample(...). First, we derive two new regexes: one that matches exactly the strings newRegex matches but oldRegex does not, and one for the reverse:

const onlyNew = RB(newRegex).without(oldRegex)
const onlyOld = RB(oldRegex).without(newRegex)

onlyNew turns out to be empty (onlyNew.isEmpty() === true) but onlyOld has some matches:

for (const str of onlyOld.sample().take(5)) {
  console.log(str)
}
aaba
aa
aba
bab
aababa

Why does oldRegex match all these strings with multiple characters? Shouldn't it only match "a" or "b" like newRegex? Turns out we thought that oldRegex is the same as ^(a|b)$ but in reality it's the same as (^a)|(b$).
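The precedence issue can be double-checked with native regexes alone by brute force over short inputs. The sketch below is a bounded check over strings of "a" and "b" up to length 4, so unlike .isEquivalent it proves nothing about longer strings; `allStrings` and `disagreements` are hypothetical helper names.

```typescript
// All strings over `alphabet` with length 0..maxLength.
function allStrings(alphabet: string[], maxLength: number): string[] {
  let current: string[] = ['']
  const result: string[] = ['']
  for (let len = 1; len <= maxLength; len++) {
    const next: string[] = []
    for (const prefix of current) {
      for (const ch of alphabet) next.push(prefix + ch)
    }
    result.push(...next)
    current = next
  }
  return result
}

// Strings on which the two regexes give different answers.
function disagreements(a: RegExp, b: RegExp, inputs: string[]): string[] {
  return inputs.filter(str => a.test(str) !== b.test(str))
}

const inputs = allStrings(['a', 'b'], 4)

// oldRegex vs. newRegex: they disagree on many strings, e.g. "aa".
console.log(disagreements(/^a|b$/, /^[ab]$/, inputs))

// oldRegex vs. the explicitly parenthesized reading: no disagreement.
console.log(disagreements(/^a|b$/, /(^a)|(b$)/, inputs)) // []
```

Every string in the first result is matched by oldRegex and rejected by newRegex, which agrees with onlyNew being empty.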

Comment regex using complement 💬

How do you write a regex that matches HTML comments like:

<!-- This is a comment -->

A straightforward attempt would be:

<!--.*-->

The problem is that .* also matches the end marker -->, so this is also a match:

<!-- This is a comment --> and this shouldn't be part of it -->

We need to specify that the inner part can be any string that does not contain -->. With .not() (a.k.a. regex complement) this is easy:

import { RB } from '@gruhn/regex-utils'

const commentStart = RB('<!--')
const commentInner = RB(/^.*-->.*$/).not()
const commentEnd = RB('-->')

const comment = commentStart.concat(commentInner).concat(commentEnd)

With .toRegExp() we can convert back to a native JavaScript regex:

comment.toRegExp()
/^<!--(---*[^->]|-?[^-])*---*>$/
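The generated regex is ordinary JavaScript, so it can be sanity-checked without the library. A small sketch that runs it against the two strings from above plus two extra edge cases chosen for this example:

```typescript
// The regex printed above by comment.toRegExp().
const commentRegex = /^<!--(---*[^->]|-?[^-])*---*>$/

const cases: string[] = [
  '<!-- This is a comment -->',
  "<!-- This is a comment --> and this shouldn't be part of it -->",
  '<!-- single - and double -- dashes -->', // dashes inside, but no "-->"
  '<!---->',                                // empty inner part
]

for (const text of cases) {
  console.log(commentRegex.test(text), JSON.stringify(text))
}
// Prints true, false, true, true (each followed by the tested string).
```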

Password regex using intersections 🔐

It's difficult to write a single regex for multiple independent constraints, for example when specifying a valid password. But with regex intersections it's very natural:

import { RB } from '@gruhn/regex-utils'

const passwordRegex = RB(/^[a-zA-Z0-9]{12,32}$/) // 12-32 alphanumeric characters
  .and(/[0-9]/) // contains a number
  .and(/[A-Z]/) // contains an upper case letter
  .and(/[a-z]/) // contains a lower case letter

We can convert this back to a native JavaScript RegExp with:

passwordRegex.toRegExp()

Note: The output RegExp can be very large.

We can also use other utilities like .size() to determine how many potential passwords match this regex:

console.log(passwordRegex.size())
2301586451429392354821768871006991487961066695735482449920n
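This count can be cross-checked by hand with inclusion-exclusion over the three "contains a ..." constraints. The sketch below is an independent calculation, not the library's algorithm; `power`, `countOfLength` and `countPasswords` are hypothetical names.

```typescript
// Integer power on BigInt via repeated multiplication (exponents here are small).
function power(base: number, exponent: number): bigint {
  let result = BigInt(1)
  for (let i = 0; i < exponent; i++) result *= BigInt(base)
  return result
}

// Number of strings of length n over [a-zA-Z0-9] that contain at least one
// digit, one upper case letter and one lower case letter:
//   all strings                        62^n
//   minus strings missing one class    52^n (no digit) + 36^n (no upper) + 36^n (no lower)
//   plus strings missing two classes   26^n (only lower) + 26^n (only upper) + 10^n (only digits)
function countOfLength(n: number): bigint {
  return power(62, n)
    - (power(52, n) + power(36, n) + power(36, n))
    + (power(26, n) + power(26, n) + power(10, n))
}

// Sum over all allowed lengths.
function countPasswords(minLength: number, maxLength: number): bigint {
  let total = BigInt(0)
  for (let n = minLength; n <= maxLength; n++) total += countOfLength(n)
  return total
}

console.log(countPasswords(12, 32).toString())
```

As a small plausibility check, `countPasswords(3, 3)` gives 40560, which is 3! · 10 · 26 · 26: one character from each class in any order. The result for lengths 12 to 32 can be compared against the figure printed by `.size()` above.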

With .sample() we can generate some of these matches:

for (const str of passwordRegex.sample().take(10)) {
  console.log(str)
}
NEWJIAXQISWT0Wwm
lxoegadrzeynezkmtfcIBzzQ9e
ypzvhvtwpWk4u6
MSZXXKIKEKWKXLQ8HQ7Ds
BCBSFBSMNOLKlgQN5L
8950244600709IW1pg
UOTQBLVOTZQWFSAJYBXZNQBEeom0l
520302447164378435bv4dp4ysC
71073970686490eY2Jt4
afgpnxqwUK5B
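Membership in an intersection has a simple native reading: a string matches the intersection exactly when it matches every constraint. The sketch below checks a few candidates that way and against a hand-written lookahead regex. `matchesAll` and `lookaheadVersion` are assumptions of this example, and the lookahead regex is not the output of passwordRegex.toRegExp().

```typescript
const constraints: RegExp[] = [
  /^[a-zA-Z0-9]{12,32}$/, // 12-32 alphanumeric characters
  /[0-9]/,                // contains a number
  /[A-Z]/,                // contains an upper case letter
  /[a-z]/,                // contains a lower case letter
]

// A string is in the intersection if and only if every constraint matches it.
function matchesAll(candidate: string): boolean {
  return constraints.every(regex => regex.test(candidate))
}

// Hand-written single regex for the same constraints using lookaheads.
const lookaheadVersion = /^(?=.*[0-9])(?=.*[A-Z])(?=.*[a-z])[a-zA-Z0-9]{12,32}$/

const candidates: string[] = [
  'NEWJIAXQISWT0Wwm', // sample from above
  'afgpnxqwUK5B',     // sample from above
  'alllowercase123',  // no upper case letter
  'Sh0rt',            // too short
]

for (const candidate of candidates) {
  console.log(candidate, matchesAll(candidate), lookaheadVersion.test(candidate))
}
```

Lookaheads express membership only. Questions like "how many strings match" or "give me ten matching strings" still need an explicit intersection, which is what .and(...) provides.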

Solve Advent Of Code 2023 - Day 12 🎄

In the coding puzzle Advent Of Code 2023 - Day 12 you are given pairs of string patterns. An example pair is .??..??...?##. and 1,1,3. Both patterns describe a class of strings and the task is to count the number of strings that match both patterns.

In the first pattern, . and # stand for the literal characters "dot" and "hash". The ? stands for either . or #. This can be written as a regular expression:

  • for # we simply write #
  • for . we write o (since . is a reserved symbol in regular expressions)
  • for ? we write (o|#)

So the pattern .??..??...?##. would be written as:

const firstRegex = /^o(o|#)(o|#)oo(o|#)(o|#)ooo(o|#)##o$/

In the second pattern, each digit stands for a sequence of # separated by at least one o. This can also be written as a regular expression:

  • For a digit like 3 we write #{3}.
  • Between digits we write o+.
  • Additionally, arbitrarily many o are allowed at the start and end, so we add o* at the start and end.

Thus, 1,1,3 would be written as:

const secondRegex = /^o*#{1}o+#{1}o+#{3}o*$/
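Both translations are mechanical, so they can be generated from the puzzle input instead of being written by hand. A minimal sketch of the two rule sets above, with hypothetical function names `springPatternToRegex` and `groupSizesToRegex`:

```typescript
// First pattern: "." -> "o", "#" -> "#", "?" -> "(o|#)".
function springPatternToRegex(pattern: string): RegExp {
  let body = ''
  for (let i = 0; i < pattern.length; i++) {
    const ch = pattern.charAt(i)
    if (ch === '.') body += 'o'
    else if (ch === '#') body += '#'
    else if (ch === '?') body += '(o|#)'
    else throw new Error('unexpected character: ' + ch)
  }
  return new RegExp('^' + body + '$')
}

// Second pattern: each number n becomes #{n}, joined by o+, padded with o*.
function groupSizesToRegex(sizes: string): RegExp {
  const groups = sizes.split(',').map(size => '#{' + size + '}')
  return new RegExp('^o*' + groups.join('o+') + 'o*$')
}

console.log(springPatternToRegex('.??..??...?##.').source)
// ^o(o|#)(o|#)oo(o|#)(o|#)ooo(o|#)##o$
console.log(groupSizesToRegex('1,1,3').source)
// ^o*#{1}o+#{1}o+#{3}o*$
```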

To solve the task and find the number of strings that match both regexes, we can use .and(...) and .size() from regex-utils. .and(...) computes the intersection of two regular expressions. That is, it creates a new regex that matches exactly the strings matched by both input regexes.

const intersection = RB(firstRegex).and(secondRegex)

With .size() we can then determine the number of matched strings:

console.log(intersection.size())
4n

While at it, we can also try .enumerate() to list all these matches:

for (const str of intersection.enumerate()) {
  console.log(str)
}
oo#ooo#ooo###o
o#oooo#ooo###o
oo#oo#oooo###o
o#ooo#oooo###o
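For an input this small, the result can be confirmed by brute force with native regexes: expand every ? both ways and keep the candidates that match the second regex. This sketch doubles the number of candidates with each ?, so it is only meant as a cross-check; `expand` is a hypothetical helper name.

```typescript
// Expand every "?" into both "o" and "#" ("." becomes "o"),
// producing all candidate strings for the first pattern.
function expand(pattern: string): string[] {
  let candidates: string[] = ['']
  for (let i = 0; i < pattern.length; i++) {
    const ch = pattern.charAt(i)
    const options: string[] = ch === '?' ? ['o', '#'] : [ch === '.' ? 'o' : ch]
    const next: string[] = []
    for (const prefix of candidates) {
      for (const option of options) next.push(prefix + option)
    }
    candidates = next
  }
  return candidates
}

const groupRegex = /^o*#{1}o+#{1}o+#{3}o*$/
const matches = expand('.??..??...?##.').filter(str => groupRegex.test(str))

console.log(matches.length) // 4
console.log(matches)        // the same four strings as listed above
```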

For a full solution check out: ./benchmark/aoc2023-day12.ts.

References 📖

Heavily informed by these papers:

  • https://www.khoury.northeastern.edu/home/turon/re-deriv.pdf
  • https://courses.grainger.illinois.edu/cs374/fa2017/extra_notes/01_nfa_to_reg.pdf