@prostojs/parser

v0.6.0

Parse anything — a composable, hooks-based parser toolkit

Build a parser for anything — in minutes, not months.

Stop writing ad-hoc regex spaghetti or reaching for heavyweight parser generators. @prostojs/parser gives you composable building blocks: define your nodes, wire them together, and get a working parser with structured output — fast.

Why This Parser?

It's LEGO for parsers. Each node is a self-contained piece — a tag, a string, a comment, an attribute. Snap them together and you have a full grammar. Need to change something? Swap one block, everything else stays.

Output is built during parsing. Hooks fire as tokens are matched — onOpen, onClose, onContent, onChild. Your data is in its final shape the moment parsing ends. No AST-to-output conversion step. No tree walking.

Near-zero boilerplate. Write data: { tag: '', attrs: {} } and it just works — auto-cloned per match, regex named groups auto-mapped to fields. A full XML-to-JSON parser is ~400 lines.

Competitive performance. Used to parse XML, this general-purpose toolkit comes within 4-36% of fast-xml-parser, a dedicated XML-only library. For most formats you'll parse, no dedicated alternative exists, and this is fast enough.

Install

npm install @prostojs/parser

30-Second Overview

Every parser is a tree of Nodes. Each node knows how to start, how to end, and what it can contain:

import { Node, parse } from '@prostojs/parser'

// A string: starts with a quote, ends with the same quote
const string = new Node<{ quote: string }>({
  name: 'string',
  start: { token: /(?<quote>["'])/, omit: true },
  end: { token: (ctx) => ctx.node.data.quote, omit: true },
  data: { quote: '' },
})

// A key=value pair: key captured from regex, value from content
const pair = new Node<{ key: string; value: string }>({
  name: 'pair',
  start: { token: /(?<key>\w+)\s*=\s*/, omit: true },
  end: { token: /\n|$/, omit: true },
  recognizes: [string],
  data: { key: '', value: '' },
  mapContent: 'value',
})

// Root: contains pairs, closes at EOF
const root = new Node({ name: 'root', eofClose: true, recognizes: [pair] })

const result = parse(root, 'name = "Alice"\nage = "30"')
// result.content → [ParsedNode{key:'name', value:'Alice'}, ...]

That's a working config file parser. No grammar files, no build step, no code generation.

How It Works

1. Define Nodes

A node is a pattern with a start token, an end token, and typed data:

const comment = new Node<{ text: string }>({
  name: 'comment',
  start: { token: '<!--', omit: true },
  end: { token: '-->', omit: true },
  data: { text: '' },
  mapContent: 'text',  // auto-joins text content into data.text
})

Tokens can be strings, RegExps (with named capture groups), or dynamic functions:

// String — exact match
start: '{'

// RegExp — captures data automatically
start: { token: /<(?<tag>\w+)/, omit: true }

// Dynamic — computed from current node's data
end: { token: (ctx) => `</${ctx.node.data.tag}>`, omit: true }
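The three token kinds can be pictured as one resolution step: a dynamic token is first turned into a string or RegExp, and the result is then searched for in the source. The following is a minimal standalone sketch of that idea, not the library's implementation. `SketchToken` and `findToken` are hypothetical names, and the dynamic function here receives the node data directly rather than the `ctx` object shown above.

```typescript
// Hypothetical types and helper for illustration; not exported by @prostojs/parser.
type SketchToken<T> = string | RegExp | ((data: T) => string | RegExp)

// Returns the index of the first match at or after `from`, or -1 if there is none.
function findToken<T>(source: string, token: SketchToken<T>, data: T, from = 0): number {
  // Dynamic tokens are computed from the current data before matching.
  const resolved = typeof token === 'function' ? token(data) : token
  if (typeof resolved === 'string') return source.indexOf(resolved, from)
  // Copy the RegExp with the global flag so that `lastIndex` sets the search start.
  const flags = resolved.flags.includes('g') ? resolved.flags : resolved.flags + 'g'
  const pattern = new RegExp(resolved.source, flags)
  pattern.lastIndex = from
  const match = pattern.exec(source)
  return match ? match.index : -1
}

// The end token depends on the tag name captured earlier.
const endIndex = findToken<{ tag: string }>('<b>hi</b>', (d) => `</${d.tag}>`, { tag: 'b' })
```

Here `endIndex` points at the start of `</b>`, which shows how a closing token can follow whatever the start token captured.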

Token modifiers:

  • omit — strip the token from node content
  • eject — don't consume the match, let the parent handle it
  • backslash — ignore the token if preceded by \
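To make the `backslash` modifier concrete, here is a minimal standalone sketch of the idea. It is not the library's code: `findUnescaped` is a hypothetical helper, and it assumes that a single `\` immediately before the token is enough to escape it.

```typescript
// Hypothetical helper, not part of @prostojs/parser.
// Returns the index of the first `token` in `source` that is not
// immediately preceded by a backslash, or -1 if there is none.
function findUnescaped(source: string, token: string, from = 0): number {
  let index = source.indexOf(token, from)
  while (index !== -1) {
    const escaped = index > 0 && source[index - 1] === '\\'
    if (!escaped) return index
    index = source.indexOf(token, index + 1)
  }
  return -1
}

// The input is: a b \ " c d "
// The quote at index 3 is escaped, so the closing quote is found at index 6.
const closing = findUnescaped('ab\\"cd"', '"')
```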

2. Compose Them

Tell each node what children it can contain:

const root = new Node({ name: 'root', eofClose: true })
root.recognize(comment, tag, cdata)
tag.recognize(attribute, innerContent)
innerContent.recognize(comment, tag, cdata)

That's your grammar. No separate DSL — it's just JavaScript.

3. Add Hooks to Shape Output

Hooks fire during parsing — use them to build your output in its final format:

tag
  .onOpen((node, match) => {
    // start token matched — node.data is ready (named groups already mapped)
    // return false to reject this match
  })
  .onChild((child, node) => {
    // a child node was fully parsed
    // route its data wherever you need it
    if (child.node === attribute) {
      node.data.attrs[child.data.key] = child.data.value
    }
  })
  .onContent((text, node) => {
    // text is about to be added — transform or suppress it
    return text.trim()
  })
  .onClose((node) => {
    // end token matched — finalize the output
  })

4. Parse

import { parse } from '@prostojs/parser'

const result = parse(root, sourceString)
// result: ParsedNode with .content, .data, .start, .end

Key Features

Named Group Auto-Mapping

Regex named groups map directly to data fields — available before onOpen fires:

const tag = new Node<{ tag: string }>({
  start: { token: /<(?<tag>\w+)/ },
  data: { tag: '' },
})
.onOpen((node) => {
  console.log(node.data.tag) // already populated
})
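The mapping itself can be illustrated with built-in RegExp features alone. The sketch below copies every matched named group onto a data object; `applyNamedGroups` is a hypothetical name, and the library's actual mapping rules (for example, how it treats groups with no matching field) may differ.

```typescript
// Standalone illustration; `applyNamedGroups` is not a library export.
function applyNamedGroups<T extends Record<string, unknown>>(data: T, match: RegExpExecArray): T {
  const target: Record<string, unknown> = data
  // `match.groups` is undefined when the pattern has no named groups.
  for (const [key, value] of Object.entries(match.groups ?? {})) {
    // Optional groups that did not participate in the match are left alone.
    if (value !== undefined) target[key] = value
  }
  return data
}

const match = /<(?<tag>\w+)/.exec('<div class="box">')
const data = match ? applyNamedGroups({ tag: '' }, match) : { tag: '' }
// data.tag is now 'div'
```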

Plain Data Templates

No factory functions. Just declare a plain object — it's auto-cloned per match with an optimized cloner:

data: { tag: '', attrs: {}, children: [] }
// primitives → spread clone
// objects/arrays → shallow clone
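Why cloning matters: without it, every match would share the same `attrs` object and `children` array. The sketch below shows one way such a per-match clone could work, following the two comments above. `cloneTemplate` is a hypothetical function, and the library's optimized cloner may be implemented differently.

```typescript
// Hypothetical sketch of per-match template cloning.
function cloneTemplate<T extends Record<string, unknown>>(template: T): T {
  // Primitives are copied by the top-level spread.
  const copy: Record<string, unknown> = { ...template }
  for (const [key, value] of Object.entries(copy)) {
    if (Array.isArray(value)) {
      copy[key] = [...value] // shallow array clone
    } else if (value !== null && typeof value === 'object') {
      copy[key] = { ...value } // shallow object clone
    }
  }
  return copy as T
}

const template = { tag: '', attrs: {} as Record<string, string>, children: [] as string[] }
const first = cloneTemplate(template)
const second = cloneTemplate(template)

first.attrs.id = 'main'
first.children.push('text')
// `second` and `template` are unaffected by the mutations above.
```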

mapContent

Auto-join all text content into a data field on node close. Replaces the most common onClose pattern:

data: { text: '' },
mapContent: 'text',
// equivalent to: .onClose(node => { node.data.text = textContent(node) })

Utilities

import { textContent, children, findChild, findChildren, walk, printTree } from '@prostojs/parser'

textContent(node)              // joined string content
children(node)                 // child ParsedNodes (no strings)
findChild(node, targetNode)    // first child of a specific node type
findChildren(node, targetNode) // all children of a specific node type
walk(node, (child, depth) => { ... })  // depth-first walk
printTree(node)                // debug visualization
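For intuition, two of these helpers can be approximated over a simplified tree. The sketch below is an assumption-heavy illustration, not the library's code: `SketchNode` is a hypothetical stand-in for `ParsedNode` whose `content` mixes strings and child nodes, `sketchTextContent` joins only direct string entries, and `sketchWalk` starts counting depth at 0.

```typescript
// Assumed, simplified node shape for illustration only.
interface SketchNode {
  name: string
  content: Array<string | SketchNode>
}

// Join the node's direct string entries (assumption: nested text is not included).
function sketchTextContent(node: SketchNode): string {
  return node.content.filter((item): item is string => typeof item === 'string').join('')
}

// Depth-first walk over child nodes only, skipping strings.
function sketchWalk(
  node: SketchNode,
  visit: (child: SketchNode, depth: number) => void,
  depth = 0,
): void {
  for (const item of node.content) {
    if (typeof item !== 'string') {
      visit(item, depth)
      sketchWalk(item, visit, depth + 1)
    }
  }
}

const tree: SketchNode = {
  name: 'root',
  content: ['a', { name: 'tag', content: ['inner', { name: 'comment', content: [] }] }, 'b'],
}

const visited: string[] = []
sketchWalk(tree, (child, depth) => visited.push(`${depth}:${child.name}`))
// visited → ['0:tag', '1:comment']; sketchTextContent(tree) → 'ab'
```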

Node Options Reference

| Option | Type | Description |
|--------|------|-------------|
| name | string | Identifier (for debugging / printTree) |
| start | TokenDef \| TokenDef[] | Start token(s) |
| end | TokenDef \| TokenDef[] | End token(s) |
| recognizes | Node[] | Child nodes this node can contain |
| skip | Token \| Token[] | Tokens to silently skip (e.g. whitespace) |
| bad | Token \| Token[] | Tokens that trigger a parse error |
| eofClose | boolean | Allow this node to close at end of input |
| data | T \| (() => T) | Data template (auto-cloned) or factory |
| mapContent | string | Auto-join text content into this data field |
| hooks | NodeHooks<T> | Inline hook definitions |

Error Handling

import { ParseError } from '@prostojs/parser'

try {
  parse(root, source)
} catch (e) {
  if (e instanceof ParseError) {
    console.log(e.message) // includes line, column, and context
  }
}

parse throws a ParseError on unclosed nodes and bad tokens, with precise source positions.
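As background on what a "precise source position" involves, the sketch below converts a character offset into a line and column. `positionAt` is a hypothetical helper written for this illustration; it assumes `\n` line endings and 1-based line and column numbers, which may not match the library's own conventions.

```typescript
// Hypothetical helper: convert a 0-based character offset into a
// 1-based line and column, as an error message might report them.
function positionAt(source: string, offset: number): { line: number; column: number } {
  const lines = source.slice(0, offset).split('\n')
  const last = lines[lines.length - 1] ?? ''
  return { line: lines.length, column: last.length + 1 }
}

// The second value is missing its closing quote.
const source = 'name = "Alice"\nage = "30'
const position = positionAt(source, source.indexOf('"30'))
// position → { line: 2, column: 7 }
```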

Examples

Each example is a standalone parser showcasing different aspects of the API. All source is in the examples/ directory on GitHub.

| Example | What it parses | Highlights |
|---------|---------------|------------|
| XML-to-JSON | Full XML → JSON (fast-xml-parser compatible) | Dynamic end tokens, hooks-based output, entity decoding, ~400 lines |
| JSON | JSON strings → JS values | onContent for bare primitives, state tracking for key/value disambiguation |
| Math Evaluator | 2 + 3 * (4 - 1) → 11 | Recursive group nodes, result computed during parsing, no AST |
| Template String | Hello, {{name}}! → parts array | Minimal 2-node parser, mapContent for zero-hook data capture |
| CSS Selector | div.cls > span:hover → structured parts | Dynamic quote matching, regex tokenization in onContent |
| URL Parser | URLs → protocol/host/path/query/hash | Named group auto-mapping, eject for boundary detection |
| ESM Analyzer | JS/TS source → imports, exports, unused | String/comment nodes as "shields" against false positives |

Migration from v0.5

See MIGRATION.md for a comprehensive guide.

License

MIT