semreg
v1.3.0
A TypeScript library for building regular expressions in a readable, maintainable way using a functional, pipe-based approach
SemReg: Semantic Regular Expressions
SemReg is a TypeScript library for building regular expressions in a readable, maintainable way. It uses a functional, pipe-based approach that allows developers to compose regex patterns with a clear and expressive syntax.
Installation
npm install semreg
# or
yarn add semreg
# or
pnpm add semreg
Key Features
- 🔍 Readable Syntax: Replace cryptic regex patterns with a clear, expressive API
- 🧩 Composable: Build complex patterns by combining simple, reusable components
- 🛠️ Fully Typed: Complete TypeScript support with helpful type definitions
- 🧪 Well Tested: Comprehensive test suite ensures reliability
Basic Usage
import {
  regex,
  startOfLine,
  endOfLine,
  letters,
  digits,
  oneOrMore,
  anyOf,
  literal,
} from "semreg";
// Create a simple regex for validating usernames (letters, digits, and underscores)
const usernameRegex = regex(
  startOfLine,
  oneOrMore(anyOf(letters, digits, literal("_"))),
  endOfLine
);
// Test the regex
usernameRegex.test("john_doe123"); // true
usernameRegex.test("invalid-username"); // false
API Reference
Core Function
regex(...components): Combines multiple components to produce a RegExp object
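To make the idea of combining components concrete, here is a minimal sketch of how such a combinator could work. It is not SemReg's actual implementation: it assumes each component is simply a fragment of regex source text, and `combine` is a hypothetical stand-in for `regex`.

```typescript
// Hypothetical model: a component is a fragment of regex source text.
type Component = string;

// Hypothetical stand-in for regex(...components): join the fragments in
// order and compile the result with the native RegExp constructor.
function combine(...components: Component[]): RegExp {
  return new RegExp(components.join(""));
}

const start: Component = "^";
const end: Component = "$";
const wordChars: Component = "[a-zA-Z0-9_]+";

const username = combine(start, wordChars, end);

console.log(username.source); // ^[a-zA-Z0-9_]+$
console.log(username.test("john_doe123")); // true
console.log(username.test("invalid-username")); // false
```

Because the result is an ordinary `RegExp`, it works with `test`, `exec`, `String.prototype.match`, and the other standard methods.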
Position Operators
- startOfLine: Matches the start of a line (^)
- endOfLine: Matches the end of a line ($)
- wordBoundary: Matches a word boundary (\b)
- nonWordBoundary: Matches a non-word boundary (\B)
- startOfInput: Matches the start of the input (\A)
- endOfInput: Matches the end of the input (\Z)
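The tokens `^`, `$`, `\b`, and `\B` are ordinary regex syntax, so their behavior can be checked with native JavaScript `RegExp` literals. The sketch below does not use SemReg; it only illustrates what these four anchors match and how the native `m` flag changes `^` and `$`.

```typescript
const text = "alpha\nbeta\ngamma";

// Without the m flag, ^ and $ anchor to the whole input.
const wholeInput = /^beta$/;
// With the m flag, ^ and $ also match at line breaks.
const perLine = /^beta$/m;

console.log(wholeInput.test(text)); // false
console.log(perLine.test(text)); // true

// \b matches between a word character and a non-word character.
const wholeWord = /\bcat\b/;
console.log(wholeWord.test("the cat sat")); // true
console.log(wholeWord.test("concatenate")); // false

// \B matches only where there is no word boundary.
const insideWord = /\Bcat\B/;
console.log(insideWord.test("concatenate")); // true
console.log(insideWord.test("the cat sat")); // false
```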
Character Generators
- letters: Matches any alphabetic character ([a-zA-Z])
- lowerLetters: Matches lowercase letters ([a-z])
- upperLetters: Matches uppercase letters ([A-Z])
- digits: Matches any digit ([0-9])
- literal(str): Matches the literal string provided, with special characters escaped
- whitespace(): Matches any whitespace character (\s)
- nonWhitespace(): Matches any non-whitespace character (\S)
- word(): Matches any word character (alphanumeric + underscore) (\w)
- nonWord(): Matches any non-word character (\W)
- any(): Matches any character except newline (.)
- range(from, to): Matches any character within the specified range ([from-to])
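Escaping matters because characters such as `+`, `?`, and `.` carry special meaning in a pattern. The following sketch shows one common way to escape a string before embedding it in a native `RegExp`. The helper name `escapeLiteral` is hypothetical and is not taken from SemReg; how `literal(str)` escapes internally is not documented here.

```typescript
// Hypothetical helper, not part of SemReg's documented API.
// Prefixes every regex metacharacter with a backslash.
function escapeLiteral(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

const escaped = escapeLiteral("1+1=2?");
console.log(escaped); // 1\+1=2\?

// The escaped pattern matches only the exact text.
const exact = new RegExp("^" + escaped + "$");
console.log(exact.test("1+1=2?")); // true
console.log(exact.test("11=2")); // false

// Without escaping, + and ? act as quantifiers and change the meaning.
const unescaped = new RegExp("^1+1=2?$");
console.log(unescaped.test("11=2")); // true
```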
Quantifiers
- oneOrMore(component): Matches one or more occurrences (+)
- zeroOrMore(component): Matches zero or more occurrences (*)
- optional(component): Matches zero or one occurrence (?)
- repeat(component, options): Generic quantification. Use with exactly(n), atLeast(n), or between(min, max) to specify repetitions.
- exactly(n): Helper for repeat. Specifies exactly n occurrences ({n}).
- atLeast(n): Helper for repeat. Specifies at least n occurrences ({n,}).
- between(min, max): Helper for repeat. Specifies between min and max occurrences ({min,max}).
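The three counted forms map directly onto brace quantifiers. As a rough sketch, the helpers below build those suffixes as plain strings and compile them with the native `RegExp` constructor. The names end in `Suffix` or `Fragment` to make clear they are hypothetical illustrations, not SemReg's own functions.

```typescript
// Hypothetical helpers: each returns a brace quantifier as text.
const exactlySuffix = (n: number): string => `{${n}}`;
const atLeastSuffix = (n: number): string => `{${n},}`;
const betweenSuffix = (min: number, max: number): string => `{${min},${max}}`;

// Wrap the fragment in a non-capturing group so the quantifier applies
// to the whole fragment, not only to its last character.
const repeatFragment = (fragment: string, suffix: string): string =>
  `(?:${fragment})${suffix}`;

const fiveDigits = new RegExp("^" + repeatFragment("[0-9]", exactlySuffix(5)) + "$");
const pin = new RegExp("^" + repeatFragment("[0-9]", betweenSuffix(4, 6)) + "$");
const tld = new RegExp("^" + repeatFragment("[a-zA-Z]", atLeastSuffix(2)) + "$");

console.log(fiveDigits.source); // ^(?:[0-9]){5}$
console.log(fiveDigits.test("12345")); // true
console.log(pin.test("123")); // false
console.log(tld.test("com")); // true
```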
Groups
- group(...components): Creates a capturing group ((...))
- nonCapturingGroup(...components): Creates a non-capturing group ((?:...))
- namedGroup(name, ...components): Creates a named capturing group ((?<name>...))
- numberedBackreference(n): Backreference to the nth capturing group (\n).
- namedBackreference(name): Backreference to a named capturing group (\k<name>).
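To show what the group syntax in parentheses does at match time, this sketch uses native `RegExp` literals rather than SemReg. It assumes a runtime and compile target that support named groups (ES2018 or later).

```typescript
// Named capturing groups expose their matches by name.
const isoDate = /^(?<year>[0-9]{4})-(?<month>[0-9]{2})-(?<day>[0-9]{2})$/;
const dateMatch = isoDate.exec("2024-03-15");
console.log(dateMatch?.groups?.year); // 2024
console.log(dateMatch?.groups?.month); // 03

// A numbered backreference repeats whatever group 1 captured.
const doubledWord = /\b([a-z]+) \1\b/;
console.log(doubledWord.test("this is is a typo")); // true
console.log(doubledWord.test("this is a test")); // false

// A named backreference requires the same quote character on both sides.
const quoted = /(?<quote>["'])[a-z ]*\k<quote>/;
console.log(quoted.test(`say "hello"`)); // true
console.log(quoted.test(`say "hello'`)); // false
```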
Compositors
- anyOf(...components): Matches any of the specified patterns ([...])
- sequence(...components): Defines an explicit sequence of patterns
Logical Operators
- or(...components): Creates an alternation between patterns (|)
- not(component): Creates a negated character set for the given component ([^...])
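The difference between a character set, an alternation, and a negated set is easiest to see side by side. The sketch below uses native `RegExp` literals only and makes no claim about the exact source text SemReg emits. It also shows why alternation is usually wrapped in a group: `|` has the lowest precedence of any regex operator.

```typescript
// A character set matches exactly one character from the set.
const vowel = /^[aeiou]$/;
console.log(vowel.test("e")); // true
console.log(vowel.test("ea")); // false

// Alternation chooses between whole sub-patterns.
const protocol = /^(?:http|https|ftp)$/;
console.log(protocol.test("https")); // true
console.log(protocol.test("smtp")); // false

// A negated set matches one character that is NOT in the set.
const nonDigit = /^[^0-9]$/;
console.log(nonDigit.test("x")); // true
console.log(nonDigit.test("7")); // false

// Without a group, the anchors bind to single branches:
// this reads as (^cat) OR (dog$).
const ungrouped = /^cat|dog$/;
console.log(ungrouped.test("catalog")); // true
const grouped = /^(?:cat|dog)$/;
console.log(grouped.test("catalog")); // false
```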
Examples
Email Validation
import {
  regex,
  startOfLine,
  endOfLine,
  letters,
  digits,
  literal,
  anyOf,
  oneOrMore,
  repeat,
  exactly,
  atLeast,
  between,
} from "semreg";
const emailRegex = regex(
  startOfLine,
  oneOrMore(anyOf(letters, digits, literal("._%+-"))),
  literal("@"),
  oneOrMore(anyOf(letters, digits, literal(".-"))),
  literal("."),
  repeat(letters, atLeast(2)),
  endOfLine
);
// Testing valid emails
emailRegex.test("user@example.com"); // true
// Testing invalid emails
emailRegex.test("invalid-email"); // false
emailRegex.test("@missingusername.com"); // false
emailRegex.test("user@domain"); // false
URL Validation
import {
  regex,
  startOfLine,
  endOfLine,
  letters,
  digits,
  literal,
  anyOf,
  oneOrMore,
  optional,
  zeroOrMore,
  repeat,
  exactly,
  atLeast,
  between,
  nonCapturingGroup,
  or,
} from "semreg";
const urlRegex = regex(
  startOfLine,
  or(literal("http"), literal("https")),
  literal("://"),
  optional(nonCapturingGroup(literal("www."))),
  oneOrMore(anyOf(letters, digits, literal(".-"))),
  literal("."),
  repeat(letters, between(2, 6)),
  optional(
    nonCapturingGroup(
      literal("/"),
      zeroOrMore(anyOf(letters, digits, literal("/._-")))
    )
  ),
  endOfLine
);
// Testing valid URLs
urlRegex.test("http://example.com"); // true
urlRegex.test("https://www.google.com"); // true
urlRegex.test("https://dev.to/path/to/resource"); // true
// Testing invalid URLs
urlRegex.test("ftp://example.com"); // false
urlRegex.test("http:/example.com"); // false
urlRegex.test("example.com"); // false
Custom Patterns
You can create your own reusable patterns:
import { regex, startOfLine, endOfLine, oneOrMore, letters, digits, literal, anyOf } from "semreg";
// Create a reusable pattern for alphanumeric strings
const alphanumeric = () => oneOrMore(anyOf(letters, digits));
// Use it in different contexts
const usernameRegex = regex(startOfLine, alphanumeric(), endOfLine);
const productCodeRegex = regex(
  startOfLine,
  literal("PROD-"),
  alphanumeric(),
  endOfLine
);
TODO
Operators that could be implemented soon
1. Lookahead and Lookbehind
- positiveLookahead(...): Positive lookahead ((?=...))
- negativeLookahead(...): Negative lookahead ((?!...))
- positiveLookbehind(...): Positive lookbehind ((?<=...))
- negativeLookbehind(...): Negative lookbehind ((?<!...))
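Since these operators are listed as not yet implemented, the sketch below shows the underlying syntax with native `RegExp` literals only. It assumes a runtime that supports lookbehind (ES2018 or later). Lookarounds check the surrounding text without consuming it.

```typescript
// Positive lookahead: "q" only when followed by "u".
console.log(/q(?=u)/.test("quit")); // true
console.log(/q(?=u)/.test("qatar")); // false

// Negative lookahead: "q" only when NOT followed by "u".
console.log(/q(?!u)/.test("qatar")); // true

// Positive lookbehind: digits that directly follow a "$" sign.
const price = /(?<=\$)[0-9]+/;
const priceMatch = "cost: $42".match(price);
console.log(priceMatch?.[0]); // 42

// Negative lookbehind: digits not preceded by "$" or another digit.
const plainNumber = /(?<![$0-9])[0-9]+/;
const plainMatch = "cost: $42, qty 7".match(plainNumber);
console.log(plainMatch?.[0]); // 7
```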
2. Flags and Options
- caseInsensitive: Enable case-insensitive matching (i)
- global: Enable global matching (g)
- multiline: Enable multiline matching (m)
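Until flag support lands, the effect of each flag can be seen with the native `RegExp` constructor and literals. This sketch does not use SemReg and does not suggest how its flag API will eventually look.

```typescript
// i: ignore letter case.
const greeting = new RegExp("^hello$", "i");
console.log(greeting.test("HeLLo")); // true

// g: return every match instead of only the first.
const allNumbers = "a1b22c333".match(/[0-9]+/g);
console.log(allNumbers); // [ '1', '22', '333' ]

// m: let ^ and $ match at each line break (combined here with g).
const tLines = "one\ntwo\nthree".match(/^t[a-z]+$/gm);
console.log(tLines); // [ 'two', 'three' ]
```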
License
MIT
