js-xre

v0.1.2

Extended (and genuinely-multi-line) Regular Expressions in JavaScript using ES2015+ tagged template strings.

Downloads

10

xRE: extended RegExps for JavaScript ES2015+

Extended Regular Expressions in JavaScript, using ES2015+ tagged template literals.

Small: < 1 KB gzipped. Focused: it doesn't do a lot else (disclosure: it does do properly multiline expressions too). And forward-looking (which is a nice way of saying you'll need a recent node version or modern browser to enjoy it).

Installation and use

Browser

<script src="js-xre.js"></script>
<script>
  const myRegExp = xRE `^\d$ # just one digit` `x`;
</script>

Node

npm install js-xre

then

  const xRE = require('js-xre');
  const myRegExp = xRE `^\d$ # just one digit` `x`;

What's an extended RegExp?

Perl, Ruby, and some other languages support a readable extended regular expression syntax, in which literal whitespace is ignored and comments (starting with #) are available. This is triggered with the x flag.

(Don't confuse this with the 'extended' expressions of egrep, which are just modern regular expressions. The sort of extended expressions I am talking about might perhaps be better described as commented or even literate.)

For example, as far as Ruby is concerned,

/\d(?=(\d{3})+\b)/

and

/ \d          # a digit
  (?=         # followed by (look-ahead match)
    (\d{3})+  # one or more sets of three digits
    \b        # and then a word boundary
  )
/x

are equivalent. For humans, however, the extended second version is obviously much easier to get to grips with.

These languages also support a properly multi-line match mode, where the . character really does match anything, including \n.

JS: no dice —

JavaScript traditionally offers neither of these options.

It doesn’t recognise the extended syntax, and its multi-line support consists only in permitting the ^ and $ characters to match the beginnings and ends of lines within a string. It will never allow the . to match \n.
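To make the limitation concrete, here is a small sketch in plain JavaScript (no library involved). The variable names are only for illustration.

```javascript
// Plain JavaScript, no library needed.
const text = 'first\nsecond';

// The m flag only changes what ^ and $ match.
const lineStarts = text.match(/^\w+/gm);
console.log(lineStarts); // [ 'first', 'second' ]

// Even with m, the . stops at the newline.
const dotMatch = /first.second/m.test(text);
console.log(dotMatch); // false

// The usual workaround: a character class that matches everything.
const classMatch = /first[\s\S]second/.test(text);
console.log(classMatch); // true
```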

I first wrote a function to convert extended and fully-multi-line RegExp source strings to standard syntax in 2010. But it was tricky and error-prone to use it, because a standard JS string can't span multiple lines and you would have to backslash-escape all the backslashes.
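The following sketch shows the kind of friction described above when a pattern has to be written as an ordinary string. It uses only the native RegExp constructor, and the variable names are illustrative.

```javascript
// Building a RegExp from ordinary strings: every backslash is doubled,
// and spreading the pattern over lines needs explicit concatenation.
const fromString = new RegExp(
  '\\d' +
  '(?=(\\d{3})+\\b)'
);

// The same pattern as a regex literal, for comparison.
const fromLiteral = /\d(?=(\d{3})+\b)/;
console.log(fromString.source === fromLiteral.source); // true

// Forgetting to double a backslash silently changes the pattern:
// in an ordinary string, '\d' is just 'd'.
const mistaken = new RegExp('\d');
console.log(mistaken.source); // d
```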

— until now

ES2015's pleasingly flexible tagged template literals now make this a genuinely usable and useful capability.

As implemented here, the syntax is:

xRE `myregexp` `flags`

(Note: the flags argument is required — to specify no flags, use an empty literal, ``).
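As a rough illustration of why this call shape is possible, the sketch below uses two hypothetical tag functions, rawSource and curried. Neither is part of js-xre, and the library's actual internals may differ. A tag function can read the template text exactly as typed, and it can return another function that then acts as the tag for a second template.

```javascript
// Hypothetical tag functions for illustration; not part of js-xre.

// strings.raw holds the template text exactly as typed, backslashes included.
const rawSource = (strings, ...values) => String.raw(strings, ...values);

const source = rawSource`\d+\b`;
console.log(source);        // \d+\b
console.log(source.length); // 5

// The raw text converts directly to a RegExp, with no doubled backslashes.
const re = new RegExp(source);
console.log(re.test('abc 42')); // true

// A tag can return a function, which lets a second template follow the first.
const curried = (strings) => (flagStrings) =>
  new RegExp(strings.raw[0], flagStrings.raw[0]);

const reWithFlags = curried`\d+` `g`;
console.log(reWithFlags.flags); // g
```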

In addition to the standard flags (i, g, m, y, u), which are passed straight through to the native RegExp, three additional flags are provided:

  • x activates extended mode, stripping out whitespace and comments
  • mm activates genuinely-multi-line mode, where . matches anything, including newlines (achieved by replacing . with [\s\S])
  • b is for backslashes, and automatically escapes all template expressions so they are treated as literal text (alternatively, an xRE.escape method is provided so this can be done case-by-case).
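The three flags above can be sketched roughly as follows. This is a simplified stand-in written for illustration, not js-xre's source: the names sketchXRE and escapeForRegExp are hypothetical, and the sketch deliberately ignores edge cases such as #, . or whitespace appearing inside a character class.

```javascript
// Illustrative sketch only; js-xre's real implementation may differ.

// Escape characters that have a special meaning in a RegExp.
const escapeForRegExp = (text) =>
  String(text).replace(/[.*+?^${}()|[\]\\]/g, '\\$&');

const sketchXRE = (strings, ...values) => (flagStrings) => {
  const flags = flagStrings.raw[0];
  const extended = flags.includes('x');
  const escapeAll = flags.includes('b');
  const dotMatchesAll = flags.includes('mm');
  // Only native flags are passed on; mm collapses to a single m.
  const nativeFlags = flags.replace('mm', 'm').replace(/[xb]/g, '');

  const transform = (chunk) => {
    let out = chunk;
    // x: drop comments (from # to end of line), then all whitespace.
    if (extended) out = out.replace(/#.*$/gm, '').replace(/\s+/g, '');
    // mm: replace each unescaped . with [\s\S]; escaped pairs are kept.
    if (dotMatchesAll) {
      out = out.replace(/(\\.)|\./g, (match, escapedPair) => escapedPair || '[\\s\\S]');
    }
    return out;
  };

  // Interleave the transformed literal chunks with the interpolated values.
  const source = strings.raw.map(transform).reduce((acc, chunk, i) => {
    const value = String(values[i - 1]);
    return acc + (escapeAll ? escapeForRegExp(value) : value) + chunk;
  });

  return new RegExp(source, nativeFlags);
};

const separators = sketchXRE`
  \d          # a digit
  (?=         # followed by
    (\d{3})+  # one or more sets of three digits
    \b        # and then a word boundary
  )
` `xg`;
console.log(separators.source);                    // \d(?=(\d{3})+\b)
console.log('1234567'.replace(separators, '$&,')); // 1,234,567

const para = sketchXRE`<p\b.+?</p>` `mmg`;
console.log('<p>one\ntwo</p>'.match(para)); // [ '<p>one\ntwo</p>' ]

const literal = sketchXRE`^${'12.6'}$` `b`;
console.log(literal.test('12x6'), literal.test('12.6')); // false true
```

Transforming the literal chunks before the values are interpolated is a design choice of this sketch: it keeps whitespace or dots inside interpolated values from being rewritten.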

Alternatives

You should also check out XRegExp, an impressive library that takes a rather more and-the-kitchen-sink approach. The complete version of XRegExp is 62 KB gzipped, against this library's few hundred bytes.

Examples

x for extended

A simple example with the extended flag x:

const xRE = require('js-xre');

const digitsThatNeedSeparators = xRE `
  \d          # a digit
  (?=         # followed by (look-ahead match)
    (\d{3})+  # one or more sets of three digits
    \b        # and then a word boundary
  )
` `xg`;

console.log(digitsThatNeedSeparators);  
// /\d(?=(\d{3})+\b)/g

const separate000s = (n, sep = '\u202f') =>
  String(n).replace(digitsThatNeedSeparators, '$&' + sep);

console.log(separate000s(1234567));
// 1 234 567

And a monstrously complex example: Daring Fireball's URL RegExp:

const xRE = require('js-xre');

const url = xRE `
  \b
  (?:
    [a-z][\w-]+:                        # URL protocol and colon
    (?:
      /{1,3}                              # 1-3 slashes
      |                                   # or
      [a-z0-9%]                           # single letter or digit or '%'
                                          # (trying not to match e.g. "URI::Escape")
    )
    |                                   # or
    www\d{0,3}[.]                       # "www.", "www1.", "www2." … "www999."
    |                                   # or
    [a-z0-9.\-]+[.][a-z]{2,4}/          # looks like domain name followed by a slash
  )
  (?:                                   # one or more:
    [^\s()<>]+                            # run of non-space, non-()<>
    |                                     # or
    \(([^\s()<>]+|(\([^\s()<>]+\)))*\)    # balanced parens, up to 2 levels
  )+
  (?:                                   # end with:
    \(([^\s()<>]+|(\([^\s()<>]+\)))*\)    # balanced parens, up to 2 levels
    |                                     # or
    [^\s\`!()\[\]{};:'".,<>?«»“”‘’]       # not a space or one of these punct chars
  )
` `xig`;

console.log(url); 
// /\b(?:[a-z][\w-]+:(?:\/{1,3}|[a-z0-9%])|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}\/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s\`!()\[\]{};:'".,<>?«»“”‘’])/gi

console.log('Please visit http://mackerron.com.'.replace(url, '<a href="$&">$&</a>'));  
// Please visit <a href="http://mackerron.com">http://mackerron.com</a>.

mm for massively multiline

Serious HTML wrangling should be done with XPath or similar, of course. But:

const xRE = require('js-xre');

const html = `
  <p>A paragraph on one line.</p>
  <p>A paragraph which, by contrast,
  spans multiple lines.</p>
`;

const mPara  = xRE `<p\b.+?</p>` `mg`;
console.log(mPara);
// /<p\b.+?<\/p>/gm

console.log(html.match(mPara));
// [ '<p>A paragraph on one line.</p>' ]

const mmPara = xRE `<p\b.+?</p>` `mmg`;  // note: mm
console.log(mmPara);
// /<p\b[\s\S]+?<\/p>/gm

console.log(html.match(mmPara));
// [ '<p>A paragraph on one line.</p>',
//   '<p>A paragraph which, by contrast,\n  spans multiple lines.</p>' ]
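For comparison, engines that implement ES2018 also offer a native s (dotAll) flag, which makes . match line terminators. The sketch below uses only native syntax and assumes such an engine; the variable names are illustrative.

```javascript
const html = `
  <p>A paragraph on one line.</p>
  <p>A paragraph which, by contrast,
  spans multiple lines.</p>
`;

// ES2018 dotAll flag: . also matches line terminators.
const nativePara = /<p\b.+?<\/p>/gs;
const paragraphs = html.match(nativePara);

console.log(paragraphs.length);             // 2
console.log(paragraphs[1].includes('\n'));  // true
```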

b for backslashes

Since our syntax for extended regular expressions uses template strings, you can interpolate any ${value} in there. The b flag causes all values to be automatically escaped, so that they're treated as literal text rather than metacharacters.

For example, say you're allowing users to type in something to find all matches:

const xRE = require('js-xre');

const searchText = '12.6';  // this might come from an <input> field
const search = xRE `^${searchText}$` `bg`;

console.log(search);
// /^12\.6$/g
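To see why the escaping matters, here is a sketch using only the native RegExp constructor. The escapeForRegExp helper is a hypothetical stand-in written for this example, not the library's own escape method.

```javascript
const searchText = '12.6';

// Interpolated without escaping, the dot is a metacharacter.
const unescaped = new RegExp(`^${searchText}$`);
console.log(unescaped.test('12x6')); // true: the dot matched 'x'

// Hypothetical helper: backslash-escape RegExp metacharacters.
const escapeForRegExp = (text) =>
  String(text).replace(/[.*+?^${}()|[\]\\]/g, '\\$&');

const escaped = new RegExp(`^${escapeForRegExp(searchText)}$`);
console.log(escaped.test('12x6')); // false
console.log(escaped.test('12.6')); // true
```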

The alternative (useful if you want to mix-and-match your escaping for any reason) is to use the escape method of the main function:

const xRE = require('js-xre');

const searchText = '12.6';  // might come from an <input type="text" />
const anchorStart = true;   // might come from an <input type="checkbox" />
const anchorEnd = false;    // might come from an <input type="checkbox" />

const search = xRE `
  ${anchorStart ? '^' : ''}
  ${xRE.escape(searchText)}
  ${anchorEnd ? '$' : ''}
` `gx`;

console.log(search);
// /^12\.6/g

Usage as a regular function

xRE can also be called as a regular (non-tagged-template) function. This could be useful if you wanted to create an extended regular expression based on user input in a <textarea>, say.

For instance:

// earlier: <script src="js-xre.js"></script>

const make = (tag) => document.body.appendChild(document.createElement(tag));
const source = make('textarea');
const flags = make('input');
const output = make('div');

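// Show the compiled RegExp as plain text; textContent avoids interpreting
// characters such as < in the pattern as HTML.
source.oninput = flags.oninput = () => {
  output.textContent = xRE(source.value)(flags.value);
};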