flex-parse

v0.4.0

A flexible document/ml parser, allowing non-standard practices and preferring a greedy 'preserve all' approach

Flex-Parse

⚠️ This library is in its early stages of development. Expect bugs, and expect them to be plentiful. Until a v1.0.0 release, this message will persist.

Flex-parse is a document parser that tries to abide by the following rules:

  1. Regardless of syntactic rules, if it can be understood, it will be parsed
  2. Preserve everything unless otherwise noted by the user through options

It returns a document structure utilizing virty, allowing you to query and manipulate the results.

A Note on Performance

Flex-parse doesn't aim to be the fastest parser out there. As I near v1.0.0, I'll try to get some benchmarking done, but if you need performance above all else, this is not the library for you.

Installation

Until v1.0.0, the GitHub repository will likely be the best place to install from. Development may be sporadic and/or rapid at times, and I won't always push the latest updates to npm.

# Latest release
pnpm i "https://github.com/jacoblockett/flex-parse"
# Release hosted on NPM
pnpm i flex-parse

Usage

Signature:

function parse(
	data: string | Buffer,
	options?: {
		ignoreEmptyText?: boolean
		onText?: (text: string) => string
		trimAttributes?: boolean
		trimText?: boolean
		truncateAttributes?: boolean
		truncateText?: boolean
	}
): Node

Basic:

import fp from "flex-parse"

const data = `<body>
    <h1>Hello, world!</h1>
    <h2>Sub-header</h2>
    <div id="some-id" bool-attr>
        <!-- todo -->
    </div>
</body>`

const parsed = fp(data)

console.log(parsed)

Using this basic example, you'll receive a structure that, when unmasked, looks something like this:

{
	"type": "element",
	"tagName": "ROOT",
	"children": [
		{
			"type": "element",
			"tagName": "body",
			"children": [
				{ "type": "text", "value": "\n    " },
				{
					"type": "element",
					"tagName": "h1",
					"children": [{ "type": "text", "value": "Hello, world!" }]
				},
				{ "type": "text", "value": "\n    " },
				{
					"type": "element",
					"tagName": "h2",
					"children": [{ "type": "text", "value": "Sub-header" }]
				},
				{ "type": "text", "value": "\n    " },
				{
					"type": "element",
					"tagName": "div",
					"attributes": { "id": "some-id", "bool-attr": "" },
					"children": [
						{ "type": "text", "value": "\n        " },
						{ "type": "comment", "value": "<!-- todo -->" },
						{ "type": "text", "value": "\n    " }
					]
				},
				{ "type": "text", "value": "\n" }
			]
		}
	]
}

💡 Flex-parse will always wrap the provided data in a "ROOT" element.
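
The plain-object form shown above can be walked with an ordinary recursive function. The sketch below is only an illustration: it works on that JSON shape, not on the virty nodes the parser actually returns, and `collectText` is a hypothetical helper that is not part of flex-parse or virty.

```javascript
// Plain-object form of a parsed document, shortened from the output above.
const tree = {
	type: "element",
	tagName: "ROOT",
	children: [
		{
			type: "element",
			tagName: "body",
			children: [
				{ type: "text", value: "\n    " },
				{ type: "element", tagName: "h1", children: [{ type: "text", value: "Hello, world!" }] },
				{ type: "text", value: "\n    " },
				{ type: "comment", value: "<!-- todo -->" },
				{ type: "text", value: "\n" }
			]
		}
	]
}

// Hypothetical helper: gathers the value of every text node, depth-first.
function collectText(node) {
	if (node.type === "text") return [node.value]
	if (node.type !== "element") return [] // comment nodes are skipped
	return (node.children ?? []).flatMap(child => collectText(child))
}

const textValues = collectText(tree)
console.log(JSON.stringify(textValues))
```

Because nothing is discarded by default, the structural whitespace between tags shows up in the result alongside the visible text.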

To learn what each parser option does, keep reading. To learn more about the returned structure and how to use it, see my virty library.

Options

Every option defaults to preserving as much of the original data as possible. You must opt in explicitly if you want quality-of-life (QOL) results, such as ignoring empty/structural text nodes.

// Default value of each option; its type is noted in the trailing comment
const options = {
	ignoreEmptyText: false,    // boolean
	onText: undefined,         // (text: string) => string
	trimAttributes: false,     // boolean
	trimText: false,           // boolean
	truncateAttributes: false, // boolean
	truncateText: false        // boolean
}

Future Options Plans

🥸 Plans change. Not every option listed here is guaranteed to ship. The eventual implementation may differ from the notes below, names might change, etc.

  • [ ] ignoreAttributes (ignores all attributes, removing them from the results)
  • [ ] ignoreCommentNodes (ignores all comment nodes, removing them from the results)
  • [ ] ignoreElementNodes (ignores all element nodes, removing them from the results)
  • [ ] ignoreTextNodes (ignores all text nodes, removing them from the results)
  • [ ] mustNotContainElementNodes (a list of case-sensitive element tag names that will throw an error if they contain any element nodes as a direct descendant)
  • [ ] mustNotContainTextNodes (a list of case-sensitive element tag names that will throw an error if they contain any text nodes as a direct descendant)
  • [ ] mustNotContainTextNodesStrict (a list of case-sensitive element tag names that will throw an error if they contain any text nodes as a direct or nested descendant)
  • [ ] mustNotSelfClose (a list of case-sensitive element tag names that will throw an error if they self-close)
  • [ ] mustPreserveWhitespace (a list of case-sensitive element tag names that, regardless of other options, will preserve their whitespace)
  • [ ] mustSelfClose (a list of case-sensitive element tag names that will throw an error if they don't self-close)
  • [ ] onAttribute (event fired when an attribute value is about to be pushed)
  • [ ] onComment (event fired when a comment node is about to be pushed)
  • [ ] onElement (event fired when an element node is about to be pushed)
  • [x] ~~onText (event fired when a text node is about to be pushed)~~
  • [ ] parseChildrenAsText (a list of case-sensitive element tag names whose children will be parsed as text only; useful for script tags in HTML, etc.)
  • [ ] parseAttributes (parses attributes into normalized js values, such as boolean attributes, numbers, dates, etc.)

🛣️ Roadmap to v1

This list isn't exhaustive and will likely grow. These are some of the things a version 1 release requires that I haven't yet had the chance to implement parsing logic for:

  • [ ] CDATA
  • [ ] HTML tags that imply closure without needing an explicit closing tag
  • [ ] HTML foreign context elements
  • [ ] Processing instruction (PI) elements
  • [ ] Namespaces

ignoreEmptyText

| Type | Default Value | Description |
| - | - | - |
| boolean | false | Ignores any empty or whitespace-only text nodes, removing them from the resulting structure. |

Example:

const html = `<body>
	<div id="main"></div>
</body>`
const parsedWithEmptyText = fp(html)
const parsedWithoutEmptyText = fp(html, { ignoreEmptyText: true })

// toObject is a wrapper function that creates an object from a virty node
console.dir(toObject(parsedWithEmptyText.firstChild), { depth: null })
console.dir(toObject(parsedWithoutEmptyText.firstChild), { depth: null })

Output:

$ node example.js
{
  type: 'element',
  tagName: 'body',
  children: [
    { type: 'text', value: '\n\t' },
    { type: 'element', tagName: 'div', attributes: { id: 'main' } },
    { type: 'text', value: '\n' }
  ]
}
{
  type: 'element',
  tagName: 'body',
  children: [ { type: 'element', tagName: 'div', attributes: { id: 'main' } } ]
}
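
The effect on the output above can be approximated on the plain-object form. This is a minimal sketch, not the parser's implementation: `dropEmptyText` is a hypothetical helper, and it assumes "whitespace-only" means whatever `String.prototype.trim` reduces to an empty string.

```javascript
// Hypothetical helper: returns a copy of a plain-object tree without
// empty or whitespace-only text nodes. The input is left unmodified.
function dropEmptyText(node) {
	if (!Array.isArray(node.children)) return { ...node }
	const children = node.children
		.filter(child => !(child.type === "text" && child.value.trim() === ""))
		.map(child => dropEmptyText(child))
	return { ...node, children }
}

const body = {
	type: "element",
	tagName: "body",
	children: [
		{ type: "text", value: "\n\t" },
		{ type: "element", tagName: "div", attributes: { id: "main" } },
		{ type: "text", value: "\n" }
	]
}

const cleaned = dropEmptyText(body)
console.log(JSON.stringify(cleaned, null, 2))
```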

onText

| Type | Default Value | Description |
| - | - | - |
| function | undefined | A function that fires every time a new text node has been parsed and written to the structure. Its return value will replace whatever the original text was. |

Signature:

function onText(text: string): string

Example:

const html = "<div><span>First</span> <span>Second</span></div>"
const parsed = fp(html, {
	onText: text => {
		if (text === "Second") return "Last"

		return text
	}
})

console.log(parsed.firstChild.lastChild.text)

Output:

$ node example.js
Last
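
Since onText accepts a single function, several transforms have to be combined into one callback before being passed in. A minimal sketch of one way to do that follows; `pipeText`, `renameSecond`, and `shout` are hypothetical names used for illustration and are not part of flex-parse.

```javascript
// Hypothetical helper: chains several text transforms, left to right, into one
// callback with the (text: string) => string shape that onText expects.
const pipeText = (...transforms) => text =>
	transforms.reduce((current, transform) => transform(current), text)

const renameSecond = text => (text === "Second" ? "Last" : text)
const shout = text => text.toUpperCase()

const onText = pipeText(renameSecond, shout)

console.log(onText("First"))  // FIRST
console.log(onText("Second")) // LAST
```

The combined callback could then be passed as `fp(html, { onText })`. Order matters here: `shout` runs after `renameSecond`, so the comparison against "Second" still sees the original casing.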

trimAttributes

| Type | Default Value | Description |
| - | - | - |
| boolean | false | Trims leading and trailing whitespace surrounding each attribute value. |

Example:

const html = `<p class=" lorem ">Lorem ipsum dolor sit amet...</p>`
const parsedWithoutTrimmedAttributes = fp(html)
const parsedWithTrimmedAttributes = fp(html, { trimAttributes: true })

console.log(`"${parsedWithoutTrimmedAttributes.firstChild.attributes.class}"`)
console.log(`"${parsedWithTrimmedAttributes.firstChild.attributes.class}"`)

Output:

$ node example.js
" lorem "
"lorem"

trimText

| Type | Default Value | Description |
| - | - | - |
| boolean | false | Trims leading and trailing whitespace surrounding each text node. |

Example:

const html = `<p>  Lorem ipsum dolor sit amet...  </p>`
const parsedWithoutTrimmedText = fp(html)
const parsedWithTrimmedText = fp(html, { trimText: true })

console.log(`"${parsedWithoutTrimmedText.firstChild.text}"`)
console.log(`"${parsedWithTrimmedText.firstChild.text}"`)

Output:

$ node example.js
"  Lorem ipsum dolor sit amet...  "
"Lorem ipsum dolor sit amet..."

💡 trimText does not remove text nodes whose value becomes "" after trimming. If you need to remove empty or whitespace-only text nodes, use ignoreEmptyText instead.
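
The difference between the two options can be illustrated on a plain array of text values. This is a sketch under an assumption: it treats trimming as equivalent to `String.prototype.trim`, which the README does not confirm, and it does not call the parser at all.

```javascript
// Text values as they might appear in three sibling text nodes.
const textNodes = ["  Lorem ipsum  ", "\n\t", "dolor"]

// Assumed effect of trimText: every value is trimmed, every node is kept.
const trimmed = textNodes.map(value => value.trim())

// Assumed effect of ignoreEmptyText: whitespace-only values are dropped,
// and the remaining values are left untouched.
const withoutEmpty = textNodes.filter(value => value.trim() !== "")

console.log(JSON.stringify(trimmed))      // ["Lorem ipsum","","dolor"]
console.log(JSON.stringify(withoutEmpty)) // ["  Lorem ipsum  ","dolor"]
```

Trimming alone leaves an empty string where the whitespace-only node was, which is the case the note above describes.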


truncateAttributes

| Type | Default Value | Description |
| - | - | - |
| boolean | false | Truncates all whitespace within each attribute value into a single U+0020 space value. |

Example:

const html = `<p class="a  b    c">Lorem ipsum dolor sit amet...</p>`
const parsedWithoutTruncatedAttributes = fp(html)
const parsedWithTruncatedAttributes = fp(html, { truncateAttributes: true })

console.log(`"${parsedWithoutTruncatedAttributes.firstChild.attributes.class}"`)
console.log(`"${parsedWithTruncatedAttributes.firstChild.attributes.class}"`)

Output:

$ node example.js
"a  b    c"
"a b c"

truncateText

| Type | Default Value | Description |
| - | - | - |
| boolean | false | Truncates all whitespace within each text node into a single U+0020 space value. |
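
truncateAttributes and truncateText describe the same transformation applied to different values. A minimal sketch of that transformation follows, assuming "whitespace" means what JavaScript's `\s` matches (the README does not say which characters count). `collapseWhitespace` is a hypothetical helper, not a flex-parse export.

```javascript
// Assumed equivalent of the truncate options: every run of whitespace
// becomes one U+0020 space. Leading and trailing runs are collapsed to a
// single space as well; they are not removed.
const collapseWhitespace = value => value.replace(/\s+/g, " ")

const attributeResult = collapseWhitespace("a  b    c")
const textResult = collapseWhitespace("Lorem   ipsum     dolor sit amet...  ")
const multilineResult = collapseWhitespace("line one\n\tline two")

console.log(JSON.stringify(attributeResult)) // "a b c"
console.log(JSON.stringify(textResult))      // "Lorem ipsum dolor sit amet... "
console.log(JSON.stringify(multilineResult)) // "line one line two"
```

Note the trailing space that survives in the second result: collapsing is not trimming, so the trim options are still needed to remove whitespace at the edges.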

Example:

const html = `<p>Lorem   ipsum     dolor sit amet...  </p>`
const parsedWithoutTruncatedText = fp(html)
const parsedWithTruncatedText = fp(html, { truncateText: true })

console.log(`"${parsedWithoutTruncatedText.firstChild.text}"`)
console.log(`"${parsedWithTruncatedText.firstChild.text}"`)

Output:

$ node example.js
"Lorem   ipsum     dolor sit amet...  "
"Lorem ipsum dolor sit amet... "