npm package discovery and stats viewer.

Discover Tips

  • General search

    [free text search, go nuts!]

  • Package details

    pkg:[package-name]

  • User packages

    @[username]

Sponsor

Optimize Toolset

I’ve always been into building performant and accessible sites, but lately I’ve been taking it extremely seriously. So much so that I’ve been building a tool to help me optimize and monitor the sites that I build to make sure that I’m making an attempt to offer the best experience to those who visit them. If you’re into performant, accessible and SEO friendly sites, you might like it too! You can check it out at Optimize Toolset.

About

Hi, 👋, I’m Ryan Hefner  and I built this site for me, and you! The goal of this site was to provide an easy way for me to check the stats on my npm packages, both for prioritizing issues and updates, and to give me a little kick in the pants to keep up on stuff.

As I was building it, I realized that I was actually using the tool to build the tool, and figured I might as well put this out there and hopefully others will find it to be a fast and useful way to search and browse npm packages as I have.

If you’re interested in other things I’m working on, follow me on Twitter or check out the open source projects I’ve been publishing on GitHub.

I am also working on a Twitter bot for this site to tweet the most popular, newest, random packages from npm. Please follow that account now and it will start sending out packages soon–ish.

Open Software & Tools

This site wouldn’t be possible without the immense generosity and tireless efforts from the people who make contributions to the world and share their work via open source initiatives. Thank you 🙏

© 2024 – Pkg Stats / Ryan Hefner

@danielx/hera

v0.8.13

Published

Small and fast parsing expression grammars

Downloads

191

Readme

Hera

I sing of golden-throned Hera whom Rhea bare. Queen of the Immortals is she, surpassing all in beauty: she is the sister and wife of loud-thundering Zeus,--the glorious one whom all the blessed throughout high Olympos reverence and honour even as Zeus who delights in thunder.

— Homeric Hymn 12 to Hera (trans. Evelyn-White) (Greek epic C7th to 4th B.C.)

Build Coverage Status

The mother of all parsers.

Quickstart

npm install @danielx/hera
import Hera from "@danielx/hera"
import { readFileSync, writeFileSync } from "fs"

const filename = "cool-grammar.hera"
const source = readFileSync(filename, "utf-8")

rules = Hera.parse(source, {
  filename
})

// Generate parser.js source code
parserSource = Hera.compile(rules)
writeFileSync("cool-parser.js", parserSource)

import { createRequire } from "module";
const require = createRequire(import.meta.url);
parser = require("./cool-parser")
// or generate parser object
parser = Hera.generate(source)

parser.parse("text that my cool parser should parse")

Overview

Hera uses Parsing Expression grammars to create parsers for programatic languages.

Hera grammars are indentation based, with each rule consisting of a name, indented choices that the name could expand to, and an optional further-indented code block (handler) for each choice:

RuleName
  Choice1
  Choice2 ->
    ...code...

Parsing makes heavy use of the built-in regular expression capabilities of JavaScript. Terminals are either literal strings or regular expressions. Rules are composed of choices or sequences of other rules and terminals.

The first rule listed in the grammar is the starting point. Each choice for the rule is checked in order, returning on the first match.

Definitions

Rule: A named production. The name is written on one unindented line by itself, and the choices (possible expansions) are written on separate lines with common indentation. For example:

RuleName
  Choice1
  Choice2

Choices are attempted in order, and the first one to succeed wins. Note that this property is recursive, so may involve backtracking. Each choice can be any expression, as defined below, together with an optional handler.

Expression: An expression can be a sequence, choice expression, or repetition of terminals, rule names, or expressions (recursive sequences, choice expressions, or repetitions). When mixing sequences, choice expressions, and repetitions, use parentheses to separate them. For example, Part ( "," Part )* is a sequence of a rule name and a repetition of a terminal and a rule name, representing one or more Parts separated by commas.

Choice expression (/): A short inline way to specify a choice between two or more subchoices. For example, This / That / Other matches This or That or Other, whichever succeeds first. This is equivalent to AnonymousRule where

AnonymousRule
  This
  That
  Other

Sequence ( ): One thing after another, separated by spaces. For example, "(" Expr ")" matches the character "(" followed by a match of Expr followed by the character ")". Sequences with more than one part return an array of the parts.

Terminal ("...", /.../): A string literal (surrounded by double quotes) or a regular expression (normally surrounded by forward slashes). Simple regular expressions consisting of just . or character classes like [A-Z][a-z]* do not need surrounding slashes. In any case, the entire terminal must be matched at the exact position. (For regular expressions, this is as if the expressions started with ^ and it was applied to the rest of the string.) Terminals return a string when they match. If the entire choice of a rule is a regular expression, then the groups of the regular expression are available as $1, $2, ... and the matching string is available as $0.

Repetition (*, +): ...* means "zero or more expansions of ...", and ...+ means one or more repetitions of Choice. Repetitions return an array of the matches.

Lookahead predicates (&, !): &... and !... assert the existence or non-existence, respectively, of a match of ..., without advancing the position or consuming any input. For example, &/\s/ is like the look-ahead regular expression /(?=\s)/.

Stringify (*): *... matches ... but returns just the string of the input that matched, instead of the computed return value from the matching process (from handlers and the arrays from sequences and repetitions).

Handler: A mapping from the matched choice to a language primitive. Handlers are attached to rule choices by adding -> after the choice. Optionally, -> can be preceded by a return type annotation of the form ::type. The most general handler is JavaScript/TypeScript code indented beneath the choice, which returns the desired value for the matched choice. This code can also refer to the default value (strings for terminals, arrays for sequences or repetitions) via $0, or to the nth matching item in the topmost sequence via $n. Each item in the topmost sequence can also be named via a :name suffix (for example, Block:name), and then the code can also refer to it as name. If the expansion is a single regular expression, $n instead refers to the nth group in the regex. The $n notation can also be put on the same line as the -> as a shorthand for return $n on a separate line; this also works for simple expressions like JavaScript literals. JavaScript code can return the special value $skip to indicate a failed match.

Comment (#...): Outside of handlers, lines starting with # (after possible indentation) are treated as comments. Inside handlers, use JavaScript // or /*...*/ comments.

Code Blocks

You can use three backticks to create a code block that is inserted directly into the compiled file. These are useful for creating utility functions or adding exports.

```
function toInt(n) {
  parseInt(n)
}
```

Number
  [0-9]+ ->
    return toInt($0)

Demos

If these demos are not interactive then view this page at https://danielx.net/hera/docs/README.html


URL Parser https://tools.ietf.org/html/rfc3986

#! hera url

Math example.

#! hera math

Hera is self generating:

#! hera hera

Token location example

#! hera
Grammar
  Punctuation? A+ Punctuation? ->
    return [].concat($1, $2, $3)

A
  ("a" / "A") ->
    return {type: "A", loc: $loc, value: $1}

Punctuation
  "!" / "." / "?" ->
    return {type: "Punctuation", loc: $loc, value: $1}

Regex Groups

#! hera
Phone
  /1-(\d{3})-(\d{3})-(\d{4})/ -> [1, 2, 3]

#! hera
Grammar
  NamedMapping NamedMapping

NamedMapping
  Punctuation -> ["P", 0]

Punctuation
  "."

Changelog

  • 0.7.3
    • Using compiled parser instead of VM for performance boost.
    • Publishing TypeScript types with package.
    • cjs loader
      require("@danielx/hera/register")
      const {parse} = require("./parser.hera")
    • esbuild plugin require("@danielx/hera/esbuild-plugin")
    • hera CLI tool with experimental TypeScript output support hera --types < cool-app.hera > parser.ts
    • Added support for number literals in structural handlers.
    • Changed structural handling to use $1, etc. instead of 1 for positional variables.
    • Added prefix $ text operator.
    • Added . any character matcher.
    • Fixed structural mapping bug where in ["R", $1] the $1 would take the first element of the result rather than the whole result on non-sequence handlers.
  • 0.7.2

Experiments

Compiling parsers to TypeScript.

Glossary

EOS - End of statement

EOF - End of file/input

V2 Ideas

Easier way to output a string from a portion of a matching sequence. Maybe add a caret/select prefix operator.

Optimize option, sequence, and repetition of regexes (combine together) to reduce calls to invoke.

Splat in mapping and other convience mappings.

Named arguments to handlers.

Reduce backtracking on common subsequence:

RuleBody
  Indent Sequence EOS (Indent ^Sequence EOS)+ -> ["/", [2, 4...]]
  Indent Sequence EOS -> 2

The above rule should be able to be made efficient (won't need to backtrack all the way to the beginning) since it has a common subsequence it should be able to re-use the work already done.

One alternative is to make it one rule with an optional section and add logic into the handler, but that seems crude.


#! setup
require("./interactive")(register)