unitas
v0.2.5
<img src="doc/logo.png" style="margin-left: 16px" height="64" align="right" alt="Unitas logo">
unitas — composing parsers into a unified whole
A lightweight, TypeScript-first parser combinator library for building expressive and composable parsers.
Features
- Parser Combinators: Compose small parsers into complex ones using combinators like many, choice, sequence, and more
- Terminals: Factory functions for common patterns (char, string, regex, etc.)
- Primitives: Pre-built parser instances ready to use (digit, letter, whitespace, etc.)
- TypeScript: Full TypeScript support with generic types and inference
- Tree-shakeable: ESM-only with separate exports for combinators, terminals, primitives, and utils
- No dependencies: Zero external runtime dependencies
Note: This library is in active development. The API may change before v1.0.0.
Installation
npm install unitas
Quick Start
CSV parser: parsing comma-separated values with quoted fields
import { grammar, run } from 'unitas';
import { choice, inner, separatedBy } from 'unitas/combinators';
import { char, regex } from 'unitas/terminals';
import { letters } from 'unitas/primitives';
const csv = grammar({
  row: (p) => separatedBy(p.value, char(',')),
  value: (p) => choice(p.quoted, p.unquoted),
  quoted: () => inner(char('"'), regex(/^[^"]*/), char('"')),
  unquoted: () => letters,
});
run(csv.row, 'a,b,c'); // ['a', 'b', 'c']
run(csv.row, '"a,b",c'); // ['a,b', 'c']
JSON value parser: parsing simple JSON values
import { grammar, run } from 'unitas';
import { choice, map, quoted } from 'unitas/combinators';
import { string } from 'unitas/terminals';
import { bool, digits, letters } from 'unitas/primitives';
const json = grammar({
  value: (p) => choice(p.string, p.number, p.bool, p.null),
  string: () => quoted(letters),
  number: () => digits,
  bool: () => bool,
  null: () => map(string('null'), () => null),
});
run(json.value, '"hello"'); // 'hello'
run(json.value, '42'); // 42
run(json.value, 'true'); // true
run(json.value, 'null'); // null
Query string parser: parsing URL query parameters
import { grammar, run } from 'unitas';
import { map, outer, separatedBy } from 'unitas/combinators';
import { char } from 'unitas/terminals';
import { letters } from 'unitas/primitives';
type Query = {
  params: Record<string, string>;
  param: [string, string];
  key: string;
  value: string;
};
const query = grammar<Query>({
  params: (p) =>
    map(separatedBy(p.param, char('&')), (pairs) =>
      Object.fromEntries(pairs),
    ),
  param: (p) => outer(p.key, char('='), p.value),
  key: () => letters,
  value: () => letters,
});
run(query.params, 'foo=bar&baz=qux'); // { foo: 'bar', baz: 'qux' }
INI file section: parsing section headers and key-value pairs
import { grammar, run } from 'unitas';
import { map, outer, sequence } from 'unitas/combinators';
import { char, regex } from 'unitas/terminals';
import { letters, nl } from 'unitas/primitives';
import { pick } from 'unitas/utils';
const ini = grammar({
  section: (p) =>
    map(
      sequence(char('['), p.name, char(']'), nl, p.entry),
      pick(1, 4),
      ([name, entry]) => ({ name, entry }),
    ),
  name: () => letters,
  entry: (p) => outer(p.key, char('='), p.value),
  key: () => letters,
  value: () => regex(/^[^\n]+/),
});
run(ini.section, '[database]\nhost=localhost'); // { name: 'database', entry: ['host', 'localhost'] }
Table of Contents
- Core
- Terminals
- Primitives
- Combinators: attempt, bind, braced, bracketed, chainLeft, chainLeft1, chainRight, chainRight1, choice, concat, consume, endBy, endBy1, exactly, first, flag, fold, fold1, foldRight, foldRight1, fuse, guard, inner, interleaved, last, left, lexeme, many, many1, manyAtLeast, manyAtMost, manyBetween, manyTill, map, node, not, nth, optional, optionalConsume, optionalSeparatedBy, outer, padded, parenthesized, peek, postfix, prefix, pure, quoted, recover, right, separatedBy, separatedBy1, separatedEndBy, separatedEndBy1, separatedUntil, sequence, skip, skipMany, skipMany1, surrounded, unless, until, validate, value, when
- Utils
Core Concepts
The Parser Type
A Parser<T> is a function that takes an input string and returns a Result<T>. The generic T represents the type of value the parser produces.
type Parser<T> = (input: string) => Result<T>;
The Result Type
Every parser returns a Result<T> which is either:
- Success — The parser matched and produced a value
- Failure — The parser did not match
type Success<T> = { ok: true; value: T; remaining: string };
type Failure = { ok: false; error?: string };
type Result<T> = Success<T> | Failure;
The remaining string is crucial: it represents what input is left after the parser has done its work. This is how we "consume" input and chain parsers together.
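To make the role of remaining concrete, here is a minimal sketch of chaining two parsers by hand. It redeclares the types above locally so it runs on its own. The helpers oneChar and andThen are hypothetical names written for this illustration, not unitas exports, and the library's own combinators may be implemented differently.

```typescript
// Local copies of the README's types so the sketch is self-contained.
type Success<T> = { ok: true; value: T; remaining: string };
type Failure = { ok: false; error?: string };
type Result<T> = Success<T> | Failure;
type Parser<T> = (input: string) => Result<T>;

// Hypothetical helper: match one exact character at the start of the input.
const oneChar = (expected: string): Parser<string> => (input) =>
  input.charAt(0) === expected
    ? { ok: true, value: expected, remaining: input.slice(1) }
    : { ok: false, error: `expected ${expected}` };

// Hypothetical helper: run `first`, then run `second` on whatever `first` left over.
const andThen = <A, B>(first: Parser<A>, second: Parser<B>): Parser<[A, B]> => (input) => {
  const a = first(input);
  if (a.ok === false) return a;
  const b = second(a.remaining); // the second parser only ever sees the leftover input
  if (b.ok === false) return b;
  return { ok: true, value: [a.value, b.value], remaining: b.remaining };
};

const ab = andThen(oneChar('a'), oneChar('b'));
const chained = ab('abc');
// chained is { ok: true, value: ['a', 'b'], remaining: 'c' }
```

Each step hands its remaining string to the next one, so the final remaining is whatever none of the parsers consumed.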
Creating a Parser
Use create to wrap a parsing function:
import { create, success, failure } from 'unitas';
const parser = create<string>((input) => {
  if (input.startsWith('hello')) {
    return success('hello', input.slice(5));
  }
  return failure('expected "hello"');
});
Understanding the Monadic Nature
Parsers are monadic, which means they follow certain laws that make them composable:
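- Left identity: wrapping a value a in a parser that always succeeds and then chaining a function f behaves like calling f(a) directly
- Right identity: chaining a parser with a step that simply succeeds with the same value returns an equivalent result
- Associativity: how chained parsers are grouped doesn't affect the final result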
The practical implication is that you can chain and combine parsers predictably.
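The three laws can be checked on concrete inputs. The sketch below uses local stand-ins for pure and bind whose behavior is assumed from the examples in the Combinators section; digit, take, and shout are simplified helpers written for this illustration, and the comparison helper same is hypothetical. It spot-checks the laws on single inputs rather than proving them.

```typescript
type Success<T> = { ok: true; value: T; remaining: string };
type Failure = { ok: false; error?: string };
type Result<T> = Success<T> | Failure;
type Parser<T> = (input: string) => Result<T>;

// Assumed semantics: `pure` succeeds with a value and consumes nothing.
const pure = <T>(value: T): Parser<T> => (input) => ({ ok: true, value, remaining: input });

// Assumed semantics: `bind` runs a parser, then feeds its value to `next` on the leftover input.
const bind = <A, B>(parser: Parser<A>, next: (value: A) => Parser<B>): Parser<B> => (input) => {
  const result = parser(input);
  return result.ok === false ? result : next(result.value)(result.remaining);
};

// Simplified helpers for the demonstration.
const digit: Parser<number> = (input) =>
  /^[0-9]/.test(input)
    ? { ok: true, value: Number(input.charAt(0)), remaining: input.slice(1) }
    : { ok: false, error: 'expected digit' };
const take = (count: number): Parser<string> => (input) =>
  input.length >= count
    ? { ok: true, value: input.slice(0, count), remaining: input.slice(count) }
    : { ok: false, error: `expected ${count} characters` };
const shout = (text: string): Parser<string> => pure(text.toUpperCase());

// Two parsers count as equivalent here if they give the same result on one input.
const same = <T>(left: Parser<T>, right: Parser<T>, input: string): boolean =>
  JSON.stringify(left(input)) === JSON.stringify(right(input));

// Left identity: bind(pure(a), f) behaves like f(a).
const leftIdentity = same(bind(pure(2), take), take(2), 'abc');
// Right identity: bind(p, pure) behaves like p.
const rightIdentity = same(bind(digit, (n) => pure(n)), digit, '3abc');
// Associativity: regrouping the chain gives the same result.
const associativity = same(
  bind(bind(digit, take), shout),
  bind(digit, (n) => bind(take(n), shout)),
  '3abcd',
);
```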
Success Results
When a parser successfully matches, it returns:
{ ok: true, value: 'hello', remaining: ' world' }
  │         │               │
  │         │               └── What's left to parse
  │         └── The parsed value
  └── Always true for success
Failure Results
When a parser fails, it returns:
{ ok: false } // Generic failure
{ ok: false, error: 'expected a' } // Failure with message
The error field is optional: you can always add meaningful error messages later using label.
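As an illustration of attaching a message after the fact, the sketch below wraps a parser that fails without an error. The wrapper withLabel is a hypothetical stand-in written for this example; the "expected ..." wording is assumed from the label example in the Core section, and the library's label may behave differently in other respects.

```typescript
type Success<T> = { ok: true; value: T; remaining: string };
type Failure = { ok: false; error?: string };
type Result<T> = Success<T> | Failure;
type Parser<T> = (input: string) => Result<T>;

// A parser that fails generically, with no error message.
const letterX: Parser<string> = (input) =>
  input.charAt(0) === 'x'
    ? { ok: true, value: 'x', remaining: input.slice(1) }
    : { ok: false };

// Hypothetical wrapper: leave successes untouched, replace failures with a described one.
const withLabel = <T>(parser: Parser<T>, description: string): Parser<T> => (input) => {
  const result = parser(input);
  return result.ok === false ? { ok: false, error: `expected ${description}` } : result;
};

const labelled = withLabel(letterX, 'letter x');
const bare = letterX('abc');       // { ok: false }
const described = labelled('abc'); // { ok: false, error: 'expected letter x' }
```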
Core (unitas)
Core provides the fundamental types and functions for building parsers.
failure('unexpected input'); // { ok: false, error: 'unexpected input' }

type Math = {
  expr: number;
  term: number;
  value: number;
};
const g = grammar<Math>({
  expr: (p) =>
    chainLeft1(
      p.term,
      map(char('+'), () => (l, r) => l + r),
    ),
  term: (p) =>
    choice(
      p.value,
      map(sequence(char('('), p.expr, char(')')), ([, v]) => v),
    ),
  value: () => digits,
});
run(g.expr, '1+2'); // 3
run(g.expr, '1+2+3'); // 6
run(g.expr, '(1+2)'); // 3

label(char('x'), 'letter x')(''); // { ok: false, error: 'expected letter x' }

lazy(() => char('a'))('abc'); // { ok: true, value: 'a', remaining: 'bc' }

match(success('hello', ''), { success: (v) => v, failure: () => 'failed' }); // 'hello'

const memoDigits = memoize(digits);
memoDigits('123'); // { ok: true, value: 123, remaining: '' }

create((input) => success('parsed', input.slice(6)))('hello world'); // { ok: true, value: 'parsed', remaining: 'world' }

run(string('hello'), 'hello'); // 'hello'

success('hello', ' world'); // { ok: true, value: 'hello', remaining: ' world' }
Terminals (unitas/terminals)
Terminals are the basic building blocks that match specific parts of the input. They don't combine other parsers — they directly inspect the input string.
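To show what "directly inspect the input string" can look like, here is a rough sketch of a takeWhile-style terminal built on locally declared types. The name takeWhileSketch is hypothetical, and the choice to succeed with an empty string when nothing matches is an assumption of this sketch, not a documented property of the library's takeWhile.

```typescript
type Success<T> = { ok: true; value: T; remaining: string };
type Failure = { ok: false; error?: string };
type Result<T> = Success<T> | Failure;
type Parser<T> = (input: string) => Result<T>;

// Factory: given a predicate, build a parser that consumes leading characters
// for as long as the predicate holds. It looks only at the raw input string.
const takeWhileSketch = (predicate: (character: string) => boolean): Parser<string> => (input) => {
  let end = 0;
  while (end < input.length && predicate(input.charAt(end))) {
    end += 1;
  }
  return { ok: true, value: input.slice(0, end), remaining: input.slice(end) };
};

const untilX = takeWhileSketch((character) => character !== 'x');
const taken = untilX('abcx');
// taken is { ok: true, value: 'abc', remaining: 'x' }
```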
char('A')('ABC'); // { ok: true, value: 'A', remaining: 'BC' }

charOf(['a', 'b', 'c'])('abc'); // { ok: true, value: 'a', remaining: 'bc' }

noneOf(['a', 'b', 'c'])('xyz'); // { ok: true, value: 'x', remaining: 'yz' }

oneOf(['hello', 'hell', 'help'])('helpful'); // { ok: true, value: 'help', remaining: 'ful' }

regex(/^\w+/)('hello world'); // { ok: true, value: 'hello', remaining: ' world' }

satisfy((c) => c === 'a')('abc'); // { ok: true, value: 'a', remaining: 'bc' }

string('hello')('hello world'); // { ok: true, value: 'hello', remaining: ' world' }

stringOf('abc')('abcdef'); // { ok: true, value: 'a', remaining: 'bcdef' }

take(3)('abcdef'); // { ok: true, value: 'abc', remaining: 'def' }

takeWhile((c) => c !== 'x')('abcx'); // { ok: true, value: 'abc', remaining: 'x' }

token('let')('let x'); // { ok: true, value: 'let', remaining: 'x' }
token('let')('let1'); // { ok: true, value: 'let', remaining: '1' }

word('let')('let x'); // { ok: true, value: 'let', remaining: 'x' }
word('let')('let1'); // { ok: false }
word('if')('if (x)'); // { ok: true, value: 'if', remaining: '(x)' }
Primitives (unitas/primitives)
Primitives are pre-built parser instances ready to use. Unlike terminals which are factory functions (like char('x')), primitives are constants you can pass directly to combinators.
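The difference is easiest to see side by side. In this sketch, charSketch is a factory that must be called to obtain a parser, while digitSketch already is one, so both end up being passed to a combinator in the same way. All three names, including the either combinator, are hypothetical and defined locally; they are not the library's implementations.

```typescript
type Success<T> = { ok: true; value: T; remaining: string };
type Failure = { ok: false; error?: string };
type Result<T> = Success<T> | Failure;
type Parser<T> = (input: string) => Result<T>;

// Terminal-style: a factory. Calling it with a character produces a parser.
const charSketch = (expected: string): Parser<string> => (input) =>
  input.charAt(0) === expected
    ? { ok: true, value: expected, remaining: input.slice(1) }
    : { ok: false, error: `expected ${expected}` };

// Primitive-style: a constant. It is already a parser and takes no configuration.
const digitSketch: Parser<number> = (input) =>
  /^[0-9]/.test(input)
    ? { ok: true, value: Number(input.charAt(0)), remaining: input.slice(1) }
    : { ok: false, error: 'expected digit' };

// Hypothetical combinator: try the first parser, fall back to the second.
const either = <A, B>(first: Parser<A>, second: Parser<B>): Parser<A | B> => (input) => {
  const result = first(input);
  return result.ok === false ? second(input) : result;
};

// The factory is invoked, the constant is passed as-is.
const signOrDigit = either(charSketch('-'), digitSketch);
const sign = signOrDigit('-5');  // { ok: true, value: '-', remaining: '5' }
const seven = signOrDigit('7x'); // { ok: true, value: 7, remaining: 'x' }
```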
alphaNum('a1'); // { ok: true, value: 'a', remaining: '1' }
alphaNum('1a'); // { ok: true, value: '1', remaining: 'a' }

alphaNums('abc123'); // { ok: true, value: 'abc123', remaining: '' }

anyChar('abc'); // { ok: true, value: 'a', remaining: 'bc' }

bool('true'); // { ok: true, value: true, remaining: '' }
bool('false'); // { ok: true, value: false, remaining: '' }
bool('trueABC'); // { ok: true, value: true, remaining: 'ABC' }

crlf('\r\nabc'); // { ok: true, value: '\r\n', remaining: 'abc' }

digit('5abc'); // { ok: true, value: 5, remaining: 'abc' }

digits('123abc'); // { ok: true, value: 123, remaining: 'abc' }

eof(''); // { ok: true, value: null, remaining: '' }

eol('\nabc'); // { ok: true, value: '\n', remaining: 'abc' }

float('1.23'); // { ok: true, value: 1.23, remaining: '' }
float('-2.5'); // { ok: true, value: -2.5, remaining: '' }
float('1.23abc'); // { ok: true, value: 1.23, remaining: 'abc' }

hexDigit('fF9'); // { ok: true, value: 'f', remaining: 'F9' }

hexDigits('deadbeef'); // { ok: true, value: 'deadbeef', remaining: '' }

identifier('variable_name'); // { ok: true, value: 'variable_name', remaining: '' }

integer('42'); // { ok: true, value: 42, remaining: '' }
integer('-7'); // { ok: true, value: -7, remaining: '' }
integer('123abc'); // { ok: true, value: 123, remaining: 'abc' }

letter('abc'); // { ok: true, value: 'a', remaining: 'bc' }

letters('abc123'); // { ok: true, value: 'abc', remaining: '123' }

line('hello\nworld'); // { ok: true, value: 'hello', remaining: '\nworld' }

literal('foo-bar'); // { ok: true, value: 'foo-bar', remaining: '' }
literal('123abc'); // { ok: true, value: '123abc', remaining: '' }

lowercase('abc'); // { ok: true, value: 'a', remaining: 'bc' }

lowercases('abcDEF'); // { ok: true, value: 'abc', remaining: 'DEF' }

nl('\ntext'); // { ok: true, value: '\n', remaining: 'text' }

number('42'); // { ok: true, value: 42, remaining: '' }
number('3.14'); // { ok: true, value: 3.14, remaining: '' }
number('-7'); // { ok: true, value: -7, remaining: '' }
number('-2.5'); // { ok: true, value: -2.5, remaining: '' }

octDigit('7abc'); // { ok: true, value: '7', remaining: 'abc' }

octDigits('0777abc'); // { ok: true, value: '0777', remaining: 'abc' }

position('abc'); // { ok: true, value: 3, remaining: 'abc' }

rest('hello'); // { ok: true, value: 'hello', remaining: '' }

space(' abc'); // { ok: true, value: ' ', remaining: 'abc' }

spaces(' abc'); // { ok: true, value: ' ', remaining: 'abc' }

tab('\ttext'); // { ok: true, value: '\t', remaining: 'text' }

uppercase('ABC'); // { ok: true, value: 'A', remaining: 'BC' }

uppercases('ABCdef'); // { ok: true, value: 'ABC', remaining: 'def' }

whitespace(' abc'); // { ok: true, value: ' ', remaining: 'abc' }

whitespaces(' abc'); // { ok: true, value: ' ', remaining: 'abc' }
Combinators (unitas/combinators)
Combinators are functions that take one or more parsers and return a new parser. They are the "glue" that lets you compose complex parsers from simple ones.
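As a rough illustration of that "glue", the sketch below builds many-style and map-style combinators from scratch on locally declared types and composes them. These are simplified stand-ins written for this example and should not be read as the library's implementation. The guard against parsers that succeed without consuming input is a design choice of this sketch.

```typescript
type Success<T> = { ok: true; value: T; remaining: string };
type Failure = { ok: false; error?: string };
type Result<T> = Success<T> | Failure;
type Parser<T> = (input: string) => Result<T>;

// A small parser to combine: one ASCII letter.
const letter: Parser<string> = (input) =>
  /^[A-Za-z]/.test(input)
    ? { ok: true, value: input.charAt(0), remaining: input.slice(1) }
    : { ok: false, error: 'expected letter' };

// many-style: apply a parser repeatedly and collect the values (zero or more).
const many = <T>(parser: Parser<T>): Parser<T[]> => (input) => {
  const values: T[] = [];
  let remaining = input;
  for (;;) {
    const result = parser(remaining);
    // Stop on failure, or on a success that consumed nothing (avoids an infinite loop).
    if (result.ok === false || result.remaining === remaining) break;
    values.push(result.value);
    remaining = result.remaining;
  }
  return { ok: true, value: values, remaining };
};

// map-style: transform the value of a successful result, pass failures through.
const map = <A, B>(parser: Parser<A>, transform: (value: A) => B): Parser<B> => (input) => {
  const result = parser(input);
  return result.ok === false
    ? result
    : { ok: true, value: transform(result.value), remaining: result.remaining };
};

// Composition: many letters, joined back into a single string.
const lettersSketch = map(many(letter), (characters) => characters.join(''));
const parsedLetters = lettersSketch('abc123');
// parsedLetters is { ok: true, value: 'abc', remaining: '123' }
```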
attempt(string('hello'))('hello world'); // { ok: true, value: 'hello', remaining: ' world' }

bind(digits, (n) => take(n))('3abc'); // { ok: true, value: 'abc', remaining: '' }

braced(string('hi'))('{hi}'); // { ok: true, value: 'hi', remaining: '' }

bracketed(string('hi'))('[hi]'); // { ok: true, value: 'hi', remaining: '' }

chainLeft(digits, operation)('1+2+3'); // { ok: true, value: 6, remaining: '' }
chainLeft(digits, operation)('10-3+2'); // { ok: true, value: 9, remaining: '' }

chainLeft1(digits, operation)('1+2+3'); // { ok: true, value: 6, remaining: '' }
chainLeft1(digits, operation)('8/2*3'); // { ok: true, value: 12, remaining: '' }

chainRight(digits, operation)('2-1-1'); // { ok: true, value: 2, remaining: '' }
chainRight(digits, operation)('4/2/2'); // { ok: true, value: 4, remaining: '' }

chainRight1(digits, operation)('2-1-1'); // { ok: true, value: 2, remaining: '' }
chainRight1(digits, operation)('4/2/2'); // { ok: true, value: 4, remaining: '' }

choice(string('hello'), string('world'))('hello'); // { ok: true, value: 'hello', remaining: '' }

concat(many(letter))('abc123'); // { ok: true, value: 'abc', remaining: '123' }
concat(many(letter), '-')('abc123'); // { ok: true, value: 'a-b-c', remaining: '123' }

consume(string('hello'))('hello world'); // { ok: true, value: null, remaining: ' world' }

endBy(string('item'), char(';'))('item;item;item;'); // { ok: true, value: ['item', 'item', 'item'], remaining: '' }

endBy1(string('item'), char(';'))('item;item;item;'); // { ok: true, value: ['item', 'item', 'item'], remaining: '' }

exactly(char('a'), 3)('aaa'); // { ok: true, value: ['a', 'a', 'a'], remaining: '' }

first(sequence(char('a'), digit))('a1bc'); // { ok: true, value: 'a', remaining: 'bc' }

flag(string('*'))('*abc'); // { ok: true, value: true, remaining: 'abc' }
flag(string('*'))('abc'); // { ok: true, value: false, remaining: 'abc' }

fold(digit, [], (acc, d) => [...acc, d])('123'); // { ok: true, value: [1, 2, 3], remaining: '' }

fold1(digit, 0, (acc, d) => acc + d)('123'); // { ok: true, value: 6, remaining: '' }

foldRight(digit, [], (acc, d) => [...acc, d])('123'); // { ok: true, value: [3, 2, 1], remaining: '' }

foldRight1(digit, [], (acc, d) => [...acc, d])('123'); // { ok: true, value: [3, 2, 1], remaining: '' }

fuse(char('a'), char('b'), char('c'))('abc'); // { ok: true, value: 'abc', remaining: '' }
fuse(string('hello'), char(' '), string('world'))('hello world'); // { ok: true, value: 'hello world', remaining: '' }

guard(true, string('hello'))('hello'); // { ok: true, value: 'hello', remaining: '' }
guard(false, string('hello'))('hello'); // { ok: false }

inner(char('('), string('hi'), char(')'))('(hi)'); // { ok: true, value: 'hi', remaining: '' }

interleaved(char('a'), char(','))('a,a,a'); // { ok: true, value: ['a', ',', 'a', ',', 'a'], remaining: '' }

last(sequence(char('a'), char('b')))('ab'); // { ok: true, value: 'b', remaining: '' }

left(string('hello'), string('world'))('helloworld'); // { ok: true, value: 'hello', remaining: '' }

lexeme(string('hello'))('hello world'); // { ok: true, value: 'hello', remaining: 'world' }

many(char('a'))('aaa'); // { ok: true, value: ['a', 'a', 'a'], remaining: '' }

many1(char('a'))('aaa'); // { ok: true, value: ['a', 'a', 'a'], remaining: '' }

manyAtLeast(char('a'), 2)('aaa'); // { ok: true, value: ['a', 'a', 'a'], remaining: '' }

manyAtMost(char('a'), 2)('aaa'); // { ok: true, value: ['a', 'a'], remaining: 'a' }

manyBetween(char('a'), 2, 3)('aaa'); // { ok: true, value: ['a', 'a', 'a'], remaining: '' }

manyTill(char('a'), char('b'))('aaab'); // { ok: true, value: ['a', 'a', 'a'], remaining: '' }

map(string('hello'), (v) => v.toUpperCase())('hello'); // { ok: true, value: 'HELLO', remaining: '' }

node('binop', { left: digits, op: char('+'), right: digits })('1+2'); // { ok: true, value: { type: 'binop', left: 1, op: '+', right: 2 }, remaining: '' }
node('number', { value: digits })('123'); // { ok: true, value: { type: 'number', value: 123 }, remaining: '' }

not(string('hello'))('world'); // { ok: true, value: null, remaining: 'world' }

nth(sequence(char('a'), char('b'), char('c')), 1)('abc'); // { ok: true, value: 'b', remaining: '' }

optional(string('hello'))('hello'); // { ok: true, value: 'hello', remaining: '' }
optional(string('hello'))('world'); // { ok: true, value: null, remaining: 'world' }

optionalConsume(string('hello'))('hello world'); // { ok: true, value: undefined, remaining: ' world' }
optionalConsume(string('hello'))('world'); // { ok: true, value: undefined, remaining: 'world' }

optionalSeparatedBy(digits, char(','))('1,2'); // { ok: true, value: [1, 2], remaining: '' }
optionalSeparatedBy(digits, char(','))(',1'); // { ok: true, value: [null, 1], remaining: '' }
optionalSeparatedBy(digits, char(','))('1,'); // { ok: true, value: [1], remaining: '' }

outer(char('('), string('hi'), char(')'))('(hi)'); // { ok: true, value: ['(', ')'], remaining: '' }

padded(string('hi'))(' hi '); // { ok: true, value: 'hi', remaining: '' }

parenthesized(string('hi'))('(hi)'); // { ok: true, value: 'hi', remaining: '' }

peek(string('hello'))('hello world'); // { ok: true, value: 'hello', remaining: 'hello world' }

postfix(
  char('a'),
  map(char('!'), () => (x) => x),
)('a!'); // { ok: true, value: 'a', remaining: '' }

prefix(
  map(char('-'), () => (x) => -x),
  digit,
)('-5'); // { ok: true, value: -5, remaining: '' }

pure(42)('abc'); // { ok: true, value: 42, remaining: 'abc' }
pure('ok')(''); // { ok: true, value: 'ok', remaining: '' }

quoted(string('hello'))('"hello"'); // { ok: true, value: 'hello', remaining: '' }

recover(string('hello'), 'default')('world'); // { ok: true, value: 'default', remaining: 'world' }

right(string('hello'), string('world'))('helloworld'); // { ok: true, value: 'world', remaining: '' }

separatedBy(char('a'), char(','))('a,a,a'); // { ok: true, value: ['a', 'a', 'a'], remaining: '' }

separatedBy1(char('a'), char(','))('a,a,a'); // { ok: true, value: ['a', 'a', 'a'], remaining: '' }

separatedEndBy(char('a'), char(';'))('a;a;a;'); // { ok: true, value: ['a', 'a', 'a'], remaining: '' }

separatedEndBy1(char('a'), char(';'))('a;a;a;'); // { ok: true, value: ['a', 'a', 'a'], remaining: '' }

separatedUntil(char('a'), char(','), char(';'))('a,a,a;'); // { ok: true, value: ['a', 'a', 'a'], remaining: '' }

sequence(char('a'), char('b'), char('c'))('abc'); // { ok: true, value: ['a', 'b', 'c'], remaining: '' }

skip(char('a'), 2)('aabc'); // { ok: true, value: null, remaining: 'bc' }

skipMany(char('a'))('aaabc'); // { ok: true, value: null, remaining: 'bc' }

skipMany1(char('a'))('aaabc'); // { ok: true, value: null, remaining: 'bc' }

surrounded(char('['), string('hi'), char(']'))('[hi]'); // { ok: true, value: 'hi', remaining: '' }
surrounded(char('a'), char('b'), char('c'))('abc'); // { ok: true, value: 'b', remaining: '' }

unless(false, string('hello'))('hello'); // { ok: true, value: 'hello', remaining: '' }
unless(true, string('hello'))('hello'); // { ok: true, value: null, remaining: 'hello' }

until(char('a'), char('b'))('baaa'); // { ok: true, value: [], remaining: 'baaa' }
until(char('a'), char('b'))('aaba'); // { ok: true, value: ['a', 'a'], remaining: 'ba' }

validate(digit, (n) => n > 5)('7'); // { ok: true, value: 7, remaining: '' }
validate(digit, (n) => n > 5)('3'); // { ok: false }

value(string('true'), true)('true'); // { ok: true, value: true, remaining: '' }
value(string('null'), null)('null'); // { ok: true, value: null, remaining: '' }

when(flag(char('*')), pure('many'), pure('one'))('*rest'); // { ok: true, value: 'many', remaining: 'rest' }
when(flag(char('*')), pure('many'), pure('one'))('abc'); // { ok: true, value: 'one', remaining: 'abc' }
Utils (unitas/utils)
Utils are utility functions for working with parser results, arrays, and function composition.
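The examples in this section share a curried shape: the first call configures the utility and the second call receives the data, which makes the configured function easy to hand to something like map. The sketch below imitates that shape with two hypothetical stand-ins, pickSketch and joinSketch. Their behavior is inferred from the pick and join examples only and may not match the library in edge cases.

```typescript
// pickSketch(...indices) returns a function that keeps only the elements at those indices.
const pickSketch = (...indices: number[]) => <T>(items: readonly T[]): T[] =>
  indices.map((index) => items[index]);

// joinSketch(separator) returns a function that concatenates array elements into a string.
const joinSketch = (separator = '') => (items: readonly unknown[]): string =>
  items.join(separator);

// Configure once, then apply to data, e.g. the pieces of a parsed '[database]' section.
const nameAndEntry = pickSketch(1, 4);
const picked = nameAndEntry(['[', 'database', ']', '\n', 'host=localhost']);
// picked is ['database', 'host=localhost']

// Curried utilities nest without intermediate variables.
const dashed = joinSketch('-')(pickSketch(0, 2)(['a', 'b', 'c']));
// dashed is 'a-c'
```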
filter([1, 2, 3])([1, 2, 3, 4, 5]); // [4, 5]
filter([1, 2], true)([1, false, 3]); // [3]

flatten()([1, [2, [3]]]); // [1, 2, [3]]
flatten(2)([1, [2, [3]]]); // [1, 2, 3]

join()([1, 2, 3]); // '123'
join('-')([1, 2, 3]); // '1-2-3'

pick(0, 2)(['a', 'b', 'c']); // ['a', 'c']
pick(2, 4)(['a', 'b', 'c', 'd', 'e']); // ['c', 'e']

pipe(lexeme)(letters)('xyz abc'); // { ok: true, value: 'xyz', remaining: 'abc' }

pop()([1, 2, 3]); // 3

shift()([1, 2, 3]); // 1

spread()(1, 2, 3); // [1, 2, 3]
License
MIT
