pico-regex-builder
v0.1.1
The tiniest and fastest builder for maintainable Regular Expressions (Regex).
Pico Regex Builder
The smallest and fastest builder for maintainable regular expressions.
Note: this library is a slimmed-down version of TS Regex Builder. It removes some verbose features, such as character classes and escapes, and keeps only the core features needed to build complex yet maintainable regexes.
For a full-fledged regex builder library, use TS Regex Builder.
Goal
Regular expressions are a powerful tool for matching text patterns, yet they are notorious for their hard-to-parse syntax, especially in the case of more complex patterns.
This library lets you create regular expressions in a structured way, making them easy to write and review. It provides a domain-specific language for defining regular expressions, which are then converted into JavaScript-native RegExp objects for fast execution.
// Regular JS RegExp
const hexColor = /^#?([a-fA-F0-9]{6}|[a-fA-F0-9]{3})$/;
// TS Regex Builder DSL
const hexDigit = /[a-fA-F0-9]/;
const hexColor = buildRegExp([
  startOfString,
  optional("#"),
  capture(
    choiceOf(
      repeat(hexDigit, 6), // #rrggbb
      repeat(hexDigit, 3), // #rgb
    ),
  ),
  endOfString,
]);
Installation
npm install pico-regex-builder
or
yarn add pico-regex-builder
or
pnpm add pico-regex-builder
Basic usage
import * as r from "pico-regex-builder";
// /Hello (\w+)/
const regex = r.buildRegExp(["Hello ", r.capture(/\w+/)]);
Regex domain-specific language
TS Regex Builder allows you to build complex regular expressions using a domain-specific language.
Terminology:
- regex construct (RegexConstruct) - common name for all regex constructs like character classes, quantifiers, and anchors.
- regex element (RegexElement) - a fundamental building block of a regular expression, defined as either a regex construct, a string, or a RegExp literal (/.../).
- regex sequence (RegexSequence) - a sequence of regex elements forming a regular expression. For developer convenience, it also accepts a single element instead of an array.
Most of the regex constructs accept a regex sequence as their argument.
Examples of sequences:
- single element (construct): capture('Hello')
- single element (string): 'Hello'
- single element (RegExp literal): /Hello/
- array of elements: ['USD', /\d+/, /Hello/]
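The terminology above can be sketched as TypeScript types. This is only an illustration of the definitions given here: the RegexConstruct shape and the toElements helper are hypothetical and are not taken from the library's source.

```typescript
// Hypothetical stand-in for a construct; the library's real type may differ.
interface RegexConstruct {
  kind: "construct";
  source: string; // already-encoded pattern fragment
}

// A regex element: a construct, a plain string, or a RegExp literal.
type RegexElement = RegexConstruct | string | RegExp;

// A regex sequence: an array of elements, or a single element for convenience.
type RegexSequence = RegexElement[] | RegexElement;

// Hypothetical helper: normalize a sequence so callers can always iterate an array.
function toElements(sequence: RegexSequence): RegexElement[] {
  return Array.isArray(sequence) ? sequence : [sequence];
}

console.log(toElements("Hello").length); // 1
console.log(toElements(["USD", /\d+/, /Hello/]).length); // 3
```

Accepting a single element as a sequence is what lets calls such as capture('Hello') skip the array wrapper.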
Regex constructs can be composed into a tree structure:
const currencyCode = repeat(/[A-Z]/, 3);
const currencyAmount = buildRegExp([
  choiceOf("$", "€", currencyCode), // currency
  capture(
    /\d+/, // integer part
    optional([".", repeat(/\d/, 2)]), // fractional part
  ),
]);
See Types API doc for more info.
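For reference, the currencyAmount example above should correspond to a native pattern along these lines. This is a hand-written approximation, assuming that string elements such as "." are matched literally and that the alternatives end up inside a non-capturing group; the exact source the builder emits may differ.

```typescript
// Hand-written approximation of the currencyAmount pattern above.
// Assumption: the builder's real output may differ in details.
const currencyAmountNative = /(?:\$|€|[A-Z]{3})(\d+(?:\.\d{2})?)/;

// Capture group 1 holds the amount (integer part plus optional fraction).
const usd = "USD100.50".match(currencyAmountNative);
const dollar = "$42".match(currencyAmountNative);

console.log(usd?.[1]); // "100.50"
console.log(dollar?.[1]); // "42"
```

Comparing the two forms shows the trade-off the DSL targets: the native literal is shorter, while the tree version names each part.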
Regex Builders
| Builder | Regex Syntax | Description |
| ---------------------------------------- | ------------ | ----------------------------------- |
| buildRegExp(...) | /.../ | Create RegExp instance |
| buildRegExp(..., { ignoreCase: true }) | /.../i | Create RegExp instance with flags |
See Builder API doc for more info.
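The ignoreCase option in the table maps to the native i flag. A minimal sketch using the standard RegExp constructor as a stand-in for the builder call (the library itself is not imported here):

```typescript
// Native stand-ins for buildRegExp(...) and buildRegExp(..., { ignoreCase: true }).
const caseSensitive = new RegExp("hello");
const caseInsensitive = new RegExp("hello", "i");

console.log(caseSensitive.test("HELLO")); // false
console.log(caseInsensitive.test("HELLO")); // true
console.log(caseInsensitive.ignoreCase); // true
```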
Regex Constructs
| Construct | Regex Syntax | Notes |
| ------------------- | ------------ | ------------------------------- |
| choiceOf(x, y, z) | x\|y\|z | Match one of provided sequences |
| capture(...) | (...) | Create a capture group |
See Constructs API doc for more info.
Note: TS Regex Builder does not have a construct for non-capturing groups. Such groups are implicitly added when required.
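To see why such implicit groups matter, consider an alternation followed by more pattern. The sketch below uses native RegExp literals only (not the library) and compares naive concatenation with the grouped form.

```typescript
// Naive concatenation: the alternation splits the whole pattern,
// so this reads as (^USD) OR (EUR\d+$).
const naive = /^USD|EUR\d+$/;

// A non-capturing group limits the alternation to the currency codes.
const grouped = /^(?:USD|EUR)\d+$/;

console.log(naive.test("USD")); // true (digits are not required on this branch)
console.log(grouped.test("USD")); // false
console.log(grouped.test("USD100")); // true

// The group does not capture: the match array holds only the full match.
console.log("USD100".match(grouped)?.length); // 1
```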
Quantifiers
| Quantifier                        | Regex Syntax | Description                                        |
| --------------------------------- | ------------ | -------------------------------------------------- |
| zeroOrMore(x)                     | x*           | Zero or more occurrences of a pattern              |
| oneOrMore(x)                      | x+           | One or more occurrences of a pattern               |
| optional(x)                       | x?           | Zero or one occurrence of a pattern                |
| repeat(x, n)                      | x{n}         | Pattern repeats an exact number of times           |
| repeat(x, { min: n })             | x{n,}        | Pattern repeats at least the given number of times |
| repeat(x, { min: n1, max: n2 })   | x{n1,n2}     | Pattern repeats between n1 and n2 times            |
See Quantifiers API doc for more info.
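The Regex Syntax column can be checked directly with native RegExp literals. The sketch below does not use the library; it only demonstrates the native behavior that each quantifier in the table corresponds to.

```typescript
// x*, x+, x? (zeroOrMore, oneOrMore, optional)
const zeroOrMoreA = /^a*$/;
const oneOrMoreA = /^a+$/;
const optionalB = /^ab?$/;

// x{n}, x{n,}, x{n1,n2} (the three forms of repeat)
const exactlyThree = /^\d{3}$/;
const atLeastTwo = /^\d{2,}$/;
const twoToFour = /^\d{2,4}$/;

console.log(zeroOrMoreA.test("")); // true
console.log(oneOrMoreA.test("")); // false
console.log(optionalB.test("a")); // true
console.log(exactlyThree.test("12")); // false
console.log(atLeastTwo.test("12345")); // true
console.log(twoToFour.test("12345")); // false
```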
Assertions
| Assertion | Regex Syntax | Description |
| ------------------------- | ------------ | ------------------------------------------------------------------------ |
| startOfString | ^ | Match the start of the string (or the start of a line in multiline mode) |
| endOfString | $ | Match the end of the string (or the end of a line in multiline mode) |
| wordBoundary | \b | Match the start or end of a word without consuming characters |
| lookahead(...) | (?=...) | Match subsequent text without consuming it |
| negativeLookahead(...) | (?!...) | Reject subsequent text without consuming it |
| lookbehind(...) | (?<=...) | Match preceding text without consuming it |
| negativeLookbehind(...) | (?<!...) | Reject preceding text without consuming it |
See Assertions API doc for more info.
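The assertions in the table do not consume characters, which is easiest to see on native patterns. The sketch below uses plain RegExp objects rather than the library; the lookbehind is built with the RegExp constructor only so that the snippet compiles under older TypeScript targets.

```typescript
// Anchors: without the multiline flag, ^ matches only at the start of the string.
const startsWithB = /^b/;
const lineStartsWithB = /^b/m;
console.log(startsWithB.test("a\nb")); // false
console.log(lineStartsWithB.test("a\nb")); // true

// Word boundary: "cat" must stand alone as a word.
const wholeWordCat = /\bcat\b/;
console.log(wholeWordCat.test("concat")); // false
console.log(wholeWordCat.test("a cat sat")); // true

// Lookahead: the " USD" suffix is required but not part of the match.
const amountBeforeUsd = /\d+(?= USD)/;
console.log("100 USD".match(amountBeforeUsd)?.[0]); // "100"

// Negative lookahead: reject words that start with "test".
const notTestPrefixed = /^(?!test)\w+/;
console.log(notTestPrefixed.test("testing")); // false
console.log(notTestPrefixed.test("production")); // true

// Lookbehind: the "$" prefix is required but not part of the match.
const amountAfterDollar = new RegExp("(?<=\\$)\\d+");
console.log("$42".match(amountAfterDollar)?.[0]); // "42"
```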
Examples
See Examples.
Performance
Regular expressions created with this library are constructed at runtime, so avoid building them where the construction would run repeatedly, e.g., inside loops or frequently called functions. We recommend creating a top-level constant for each required regex.
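A minimal sketch of that recommendation follows. The buildWordRegex function is a hypothetical stand-in for any builder call (it uses the native RegExp constructor and a counter) so that the number of constructions can be observed.

```typescript
// Hypothetical stand-in for a builder call; counts how often a regex is constructed.
let constructions = 0;
function buildWordRegex(): RegExp {
  constructions += 1;
  return new RegExp("\\w+");
}

// Recommended: build once at the top level and reuse.
const WORD = buildWordRegex();

function firstWordHoisted(text: string): string | undefined {
  return text.match(WORD)?.[0];
}

// Anti-pattern: rebuilds the regex on every call.
function firstWordRebuilt(text: string): string | undefined {
  return text.match(buildWordRegex())?.[0];
}

const inputs = ["alpha beta", "gamma", "delta epsilon"];

for (const input of inputs) firstWordHoisted(input);
const afterHoisted = constructions; // still 1

for (const input of inputs) firstWordRebuilt(input);
const afterRebuilt = constructions; // 1 + 3 = 4

console.log(afterHoisted, afterRebuilt); // 1 4
```

Both functions return the same results; only the construction cost differs.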
Contributing
See the contributing guide to learn how to contribute to the repository and the development workflow. See the project guidelines to understand our core principles.
License
MIT
Inspiration
TS Regex Builder is inspired by the Swift Regex Builder API.
Reference
- ECMAScript Regular Expression BNF Grammar
- Unicode Regular Expressions
- Swift Evolution 351: Regex Builder DSL
- Swift Regex Builder API docs
- TS Regex Builder
Made with create-react-native-library
