@pehotin/html-parser
v1.0.5
Small, strict HTML‑like parser that produces a simple AST using a finite state machine.
html-parser
Features
- Finite state machine parser: implemented on top of @prender-company/fsm.
- Simple AST: root, element, and text nodes with attributes.
- Strict error handling: reports line and column on parse errors.
- Node.js friendly: plain CommonJS, no runtime dependencies beyond the FSM and graceful-fs (used by the playground).
Installation
From this repository:
    npm install
If you publish this package, you can then install it in another project as:
    npm install html-parser
Basic usage
    const HtmlParser = require('html-parser')
    const source = `<div class="wrapper">
    <span>Hello</span>
    </div>`
    const ast = new HtmlParser(source).parse()
    console.log(JSON.stringify(ast, null, 2))
parse() returns the root AST node. On invalid input, the parser logs a message like:
    Parse error: Unexpected character 'x' on line 3 at column 5
and returns undefined.
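Because parse() reports failure by returning undefined rather than throwing, callers may want to check the result before using it. The following is a minimal sketch of one way to do that; parseOrThrow is a hypothetical helper that is not part of this package, and it takes the parser class as an argument so it makes no assumption about how the package is imported.

```javascript
// Hypothetical helper, not part of the package: converts the documented
// "returns undefined on invalid input" behaviour into an exception.
function parseOrThrow(Parser, source) {
  const ast = new Parser(source).parse()
  if (ast === undefined) {
    throw new Error('HTML parse failed; see the logged "Parse error" message')
  }
  return ast
}

// Usage (assumes the package is installed under the name used above):
// const HtmlParser = require('html-parser')
// const ast = parseOrThrow(HtmlParser, '<div></div>')
```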
AST structure
The AST is built from two node types, defined in src/types.js.
Root and element nodes (Node):
    {
      type: 'root' | 'element',
      tagName: 'div', // empty string for the root node
      attrs: [
        { name: 'class', value: 'wrapper' },
        // ...
      ],
      children: [ /* Node | TextNode */ ]
    }
Text nodes (TextNode):
    { type: 'text', value: 'Hello' }
Attributes without an explicit value are treated as boolean and will have value: true.
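To illustrate how this structure can be consumed, here is a small sketch that walks an AST of the shape described above. The sample tree is written by hand rather than produced by the parser, and collectText and attrsToObject are hypothetical helper names, not functions exported by this package.

```javascript
// Hand-written AST in the documented shape (not real parser output).
const ast = {
  type: 'root',
  tagName: '',
  attrs: [],
  children: [
    {
      type: 'element',
      tagName: 'div',
      attrs: [
        { name: 'class', value: 'wrapper' },
        { name: 'hidden', value: true } // attribute without an explicit value
      ],
      children: [
        {
          type: 'element',
          tagName: 'span',
          attrs: [],
          children: [{ type: 'text', value: 'Hello' }]
        }
      ]
    }
  ]
}

// Concatenate the values of all text nodes, depth first.
function collectText(node) {
  if (node.type === 'text') return node.value
  return node.children.map(collectText).join('')
}

// Convert an attrs array into a plain object keyed by attribute name.
function attrsToObject(attrs) {
  const result = {}
  for (const { name, value } of attrs) result[name] = value
  return result
}

console.log(collectText(ast)) // Hello
console.log(attrsToObject(ast.children[0].attrs))
```

Only text nodes lack a children array, so checking type === 'text' first is enough to keep the recursion safe for this node shape.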
Playground
There is a small playground that reads playground/index.act, parses it, and prints the resulting AST:
    npm run play
Under the hood, this runs node playground/index.js, which:
- reads playground/index.act,
- parses it with HtmlParser,
- logs the AST using Node's util.inspect.
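The three steps above can be sketched roughly as follows. This is not the actual contents of playground/index.js: runPlayground is a hypothetical function, and the parser class and file-system module are passed in as arguments so the sketch does not assume how the real script imports them. The depth: null option is an assumption as well; it tells util.inspect to print nested children in full instead of truncating them.

```javascript
const util = require('util')

// Hypothetical sketch of the playground steps, not the real script.
function runPlayground(Parser, fs, file) {
  // 1. Read the input file as text.
  const source = fs.readFileSync(file, 'utf8')
  // 2. Parse it into an AST.
  const ast = new Parser(source).parse()
  // 3. Format the AST; depth: null avoids collapsing deep nodes to [Object].
  return util.inspect(ast, { depth: null })
}

// Usage (assumes the package name used above and graceful-fs are installed):
// const HtmlParser = require('html-parser')
// const fs = require('graceful-fs')
// console.log(runPlayground(HtmlParser, fs, 'playground/index.act'))
```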
Scripts
The following npm scripts are available:
- npm start: runs node ./src/index.js (exports the parser; mainly for quick manual testing).
- npm run build: transpiles src/ into dist/ using Babel.
- npm run play: runs the playground described above.
License
MIT © Artem P
