xml-to-html-converter
v0.2.0
Zero dependency XML to HTML converter for Node environments
A zero-dependency Node.js package for converting XML to HTML. Currently in pre-1.0.0 development, building the foundation one functional part at a time. Full XML-to-HTML conversion is the goal of v1.0.0.
v0.x.x: XML Node Extraction & Scaffolding
Version 0.2.x is focused entirely on parsing raw XML into a structured tree of nodes. The scaffold function walks an XML string and produces an array of XmlNode objects, each carrying its role, its raw source text, and its position in the document, both globally across the full document and locally within its parent.

    interface XmlAttribute {
      name: string;
      value: string;
    }

    interface XmlNode {
      role: XmlNodeRole;
      raw: string;
      xmlTag?: string;
      xmlInner?: string;
      xmlAttributes?: XmlAttribute[];
      globalIndex: number;
      localIndex: number;
      children?: XmlNode[];
      malformed?: true;
    }

    type XmlNodeRole =
      | "closeTag"
      | "comment"
      | "doctype"
      | "openTag"
      | "processingInstruction"
      | "selfTag"
      | "textLeaf";

This scaffold is the foundation everything else will be built on. No transformation, no HTML output, no opinions about content, just an accurate, traversable representation of what the XML says.
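Because every node carries both indices, a consumer can walk the tree without any extra bookkeeping. The following is a minimal sketch of such a walk, not part of the package: `walk`, `sample`, and `outline` are hypothetical names, the tree is written by hand rather than produced by `scaffold`, and it assumes `globalIndex` follows depth-first document order, as it does in the example output in this README.

```typescript
// Local copy of the XmlNode fields this sketch reads (trimmed from the README).
interface XmlNode {
  role: string;
  raw: string;
  globalIndex: number;
  localIndex: number;
  children?: XmlNode[];
}

// Visit every node depth-first, parents before their children.
function walk(
  nodes: XmlNode[],
  visit: (node: XmlNode, depth: number) => void,
  depth: number = 0
): void {
  for (const node of nodes) {
    visit(node, depth);
    if (node.children) walk(node.children, visit, depth + 1);
  }
}

// Hand-written tree shaped like scaffold output for <a><b>hi</b><c/></a>.
const sample: XmlNode[] = [
  {
    role: "openTag", raw: "<a>", globalIndex: 0, localIndex: 0,
    children: [
      {
        role: "openTag", raw: "<b>", globalIndex: 1, localIndex: 0,
        children: [{ role: "textLeaf", raw: "hi", globalIndex: 2, localIndex: 0 }],
      },
      { role: "selfTag", raw: "<c/>", globalIndex: 3, localIndex: 1 },
    ],
  },
];

// One line per node: global position, nesting depth, and raw source.
const outline: string[] = [];
walk(sample, (node, depth) => {
  outline.push(`${node.globalIndex} depth=${depth} ${node.raw}`);
});
```

Here `globalIndex` keeps counting across the whole document, while `localIndex` restarts inside each parent (`<c/>` is global 3 but local 1).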
Where I am right now
v0.x is building the scaffold and the first render pass.
- scaffold(xml) reads any XML string and returns a nested node tree
- Every node knows its role, its raw source string, its globalIndex in the document, and its localIndex within its parent
- Tag nodes (openTag, selfTag) also carry xmlTag, xmlInner, and xmlAttributes: the parsed tag name, the raw attribute string, and the structured attribute array
- Broken XML never causes a throw: malformed nodes are flagged with malformed: true in place and the tree is built regardless
- render(nodes) takes the scaffold output and converts it to an HTML string: every XML element becomes a <div> with data-tag and data-attrs-* attributes

v1.0.0 is when this package becomes what it says it is: a full XML-to-HTML converter. Everything before that is the work to get there.
Install

    npm install xml-to-html-converter

Usage
Parsing XML into a node tree

    import { scaffold } from "xml-to-html-converter";

    const tree = scaffold(`
    <?xml version="1.0" encoding="UTF-8"?>
    <bookstore>
      <book category="cooking">
        <title lang="en">Everyday Italian</title>
      </book>
    </bookstore>
    `);

scaffold returns a flat array of root-level nodes. Each openTag node carries its children nested inside it:

    [
      {
        "role": "processingInstruction",
        "raw": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>",
        "globalIndex": 0,
        "localIndex": 0
      },
      {
        "role": "openTag",
        "raw": "<bookstore>",
        "globalIndex": 1,
        "localIndex": 1,
        "children": [
          {
            "role": "openTag",
            "raw": "<book category=\"cooking\">",
            "xmlTag": "book",
            "xmlInner": "category=\"cooking\"",
            "xmlAttributes": [{ "name": "category", "value": "cooking" }],
            "globalIndex": 2,
            "localIndex": 0,
            "children": [
              {
                "role": "openTag",
                "raw": "<title lang=\"en\">",
                "xmlTag": "title",
                "xmlInner": "lang=\"en\"",
                "xmlAttributes": [{ "name": "lang", "value": "en" }],
                "globalIndex": 3,
                "localIndex": 0,
                "children": [
                  {
                    "role": "textLeaf",
                    "raw": "Everyday Italian",
                    "globalIndex": 4,
                    "localIndex": 0
                  }
                ]
              }
            ]
          }
        ]
      }
    ]

Converting the tree to HTML

    import { scaffold, render } from "xml-to-html-converter";

    const html = render(
      scaffold(`
    <bookstore>
      <book category="cooking">
        <title lang="en">Everyday Italian</title>
      </book>
    </bookstore>
    `),
    );

render walks the node tree and converts every XML element to a <div>. The original tag name is preserved in data-tag and each attribute becomes its own data-attrs-* attribute:

    <div data-tag="bookstore">
      <div data-tag="book" data-attrs-category="cooking">
        <div data-tag="title" data-attrs-lang="en">Everyday Italian</div>
      </div>
    </div>

Processing instructions and doctypes are dropped. Comments are passed through unchanged.
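To make the mapping concrete, here is a rough sketch of how a render pass along these lines could be written. It is not the package's implementation: `renderSketch` and `escapeQuotes` are hypothetical names, the output is compact rather than indented, stray close tags are dropped as an assumption, and attribute values are assumed to be verbatim XML source, so only double quotes are escaped. CDATA sections inside text leaves are not handled.

```typescript
// Local copy of the XmlNode fields this sketch reads (trimmed from the README).
interface XmlAttribute { name: string; value: string; }
interface XmlNode {
  role: string;
  raw: string;
  xmlTag?: string;
  xmlAttributes?: XmlAttribute[];
  children?: XmlNode[];
}

// Keep a value safe inside a double-quoted HTML attribute.
function escapeQuotes(value: string): string {
  return value.replace(/"/g, "&quot;");
}

function renderSketch(nodes: XmlNode[]): string {
  let html = "";
  for (const node of nodes) {
    switch (node.role) {
      case "openTag":
      case "selfTag": {
        // Every element becomes a <div>; the tag name moves into data-tag.
        let attrs = ` data-tag="${escapeQuotes(node.xmlTag || "")}"`;
        for (const attr of node.xmlAttributes || []) {
          attrs += ` data-attrs-${attr.name}="${escapeQuotes(attr.value)}"`;
        }
        html += `<div${attrs}>${renderSketch(node.children || [])}</div>`;
        break;
      }
      case "comment":
      case "textLeaf":
        html += node.raw; // passed through verbatim
        break;
      default:
        break; // processingInstruction and doctype dropped; closeTag too (assumption)
    }
  }
  return html;
}

// Hand-written tree matching the bookstore example above.
const sample: XmlNode[] = [
  { role: "processingInstruction", raw: '<?xml version="1.0"?>' },
  {
    role: "openTag", raw: "<bookstore>", xmlTag: "bookstore",
    children: [
      {
        role: "openTag", raw: '<book category="cooking">', xmlTag: "book",
        xmlAttributes: [{ name: "category", value: "cooking" }],
        children: [
          {
            role: "openTag", raw: '<title lang="en">', xmlTag: "title",
            xmlAttributes: [{ name: "lang", value: "en" }],
            children: [{ role: "textLeaf", raw: "Everyday Italian" }],
          },
        ],
      },
    ],
  },
];

const html = renderSketch(sample);
```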
Node Shape
Every node in the tree has the following fields:
| Field | Type | Description |
| --------------- | ---------------- | --------------------------------------------------------------------------------------------------------------------- |
| role | XmlNodeRole | What kind of node this is |
| raw | string | The exact source string, untouched |
| xmlTag | string | Tag name only, e.g. "book" or "env:Envelope". Present on openTag, selfTag, closeTag |
| xmlInner | string | Everything after the tag name inside the brackets, verbatim. Present on openTag and selfTag when attributes exist |
| xmlAttributes | XmlAttribute[] | Parsed array of { name, value } attribute objects. Present on openTag and selfTag when attributes exist |
| globalIndex | number | Position in the entire document (never resets) |
| localIndex | number | Position within the parent's children array |
| children | XmlNode[] | Present only on openTag - the nested nodes inside |
| malformed | true | Present only when the structure is broken |
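With these fields, common lookups need only a few lines. The helpers below are an illustrative sketch: `findByTag` and `getAttribute` are hypothetical names that the package does not export, and the tree is written by hand rather than produced by `scaffold`.

```typescript
// Local copy of the XmlNode fields this sketch reads (trimmed from the README).
interface XmlAttribute { name: string; value: string; }
interface XmlNode {
  role: string;
  raw: string;
  xmlTag?: string;
  xmlAttributes?: XmlAttribute[];
  children?: XmlNode[];
}

// Return the value of the first attribute called `name`, or undefined.
function getAttribute(node: XmlNode, name: string): string | undefined {
  for (const attr of node.xmlAttributes || []) {
    if (attr.name === name) return attr.value;
  }
  return undefined;
}

// Collect every openTag or selfTag node whose xmlTag matches, at any depth.
function findByTag(nodes: XmlNode[], tag: string): XmlNode[] {
  const found: XmlNode[] = [];
  for (const node of nodes) {
    if (node.xmlTag === tag && node.role !== "closeTag") found.push(node);
    if (node.children) {
      for (const match of findByTag(node.children, tag)) found.push(match);
    }
  }
  return found;
}

// Hand-written fragment of the bookstore tree.
const sample: XmlNode[] = [
  {
    role: "openTag", raw: '<book category="cooking">', xmlTag: "book",
    xmlAttributes: [{ name: "category", value: "cooking" }],
    children: [
      {
        role: "openTag", raw: '<title lang="en">', xmlTag: "title",
        xmlAttributes: [{ name: "lang", value: "en" }],
        children: [{ role: "textLeaf", raw: "Everyday Italian" }],
      },
    ],
  },
];

const titles = findByTag(sample, "title");
const lang = getAttribute(titles[0], "lang");
```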
Node Roles
| Role | Has children | Description |
| ----------------------- | ------------ | --------------------------------------------------- |
| openTag | yes | An opening tag, e.g. <book category="web"> |
| selfTag | no | A self-closing tag, e.g. <br/> |
| closeTag | no | Only appears when stray (no matching open) |
| processingInstruction | no | e.g. <?xml version="1.0"?> |
| comment | no | e.g. <!-- a comment --> |
| textLeaf | no | Text content between tags, including CDATA sections |
| doctype | no | e.g. <!DOCTYPE html> or <!DOCTYPE root [...]> |
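Since the role is a closed union, TypeScript can check that a consumer handles every role. The sketch below shows one way to do that with an exhaustive switch. `describeNode` is a hypothetical helper, not a package export, and the type declarations are trimmed local copies of the ones in this README.

```typescript
// Local copies of the README types, trimmed to the fields used here.
type XmlNodeRole =
  | "closeTag"
  | "comment"
  | "doctype"
  | "openTag"
  | "processingInstruction"
  | "selfTag"
  | "textLeaf";

interface XmlNode {
  role: XmlNodeRole;
  raw: string;
  xmlTag?: string;
  children?: XmlNode[];
}

// Produce a short human-readable label for a node, one case per role.
function describeNode(node: XmlNode): string {
  switch (node.role) {
    case "openTag":
      return `element <${node.xmlTag || "?"}> with ${(node.children || []).length} child node(s)`;
    case "selfTag":
      return `empty element <${node.xmlTag || "?"}/>`;
    case "closeTag":
      return `stray closing tag ${node.raw}`;
    case "processingInstruction":
      return "processing instruction";
    case "comment":
      return "comment";
    case "doctype":
      return "doctype declaration";
    case "textLeaf":
      return `text (${node.raw.length} characters)`;
    default: {
      // If a role is ever added to the union, this assignment stops compiling.
      const unreachable: never = node.role;
      return unreachable;
    }
  }
}
```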
Malformed XML
scaffold never throws. No matter what the input looks like, it always returns a complete tree. Malformed structures are flagged with malformed: true in place and the walk continues.
Four cases are handled:
- Unclosed tags - a tag that opens but never closes gets malformed: true; its children are still collected
- Stray closing tags - a </tag> with no matching open surfaces as a closeTag token with malformed: true
- Unclosed brackets - a < with no matching > captures the remainder as a malformed token
- Excessive nesting - documents nested beyond 500 levels have the deepest open tag flagged malformed: true to prevent a stack overflow

    const tree = scaffold("<root><unclosed><valid>text</valid></root>");

    [
      {
        "role": "openTag",
        "raw": "<root>",
        "globalIndex": 0,
        "localIndex": 0,
        "malformed": true,
        "children": [
          {
            "role": "openTag",
            "raw": "<unclosed>",
            "globalIndex": 1,
            "localIndex": 0,
            "malformed": true,
            "children": [
              {
                "role": "openTag",
                "raw": "<valid>",
                "globalIndex": 2,
                "localIndex": 0,
                "children": [
                  {
                    "role": "textLeaf",
                    "raw": "text",
                    "globalIndex": 3,
                    "localIndex": 0
                  }
                ]
              }
            ]
          }
        ]
      }
    ]

Exports

    import { scaffold, render, isMalformed } from "xml-to-html-converter";
    import type {
      XmlNode,
      XmlNodeRole,
      XmlAttribute,
      MalformedXmlNode,
    } from "xml-to-html-converter";

| Export             | Kind     | Description                                         |
| ------------------ | -------- | --------------------------------------------------- |
| scaffold | function | Parses an XML string and returns a node tree |
| render | function | Converts a node tree to an HTML string |
| isMalformed | function | Type guard, narrows XmlNode to MalformedXmlNode |
| XmlNode | type | The shape of every node in the tree |
| XmlNodeRole | type | Union of all valid role strings |
| XmlAttribute | type | Shape of a parsed attribute { name, value } |
| MalformedXmlNode | type | XmlNode narrowed to { malformed: true } |
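A type guard like isMalformed is most useful when gathering every flagged node for reporting. The sketch below uses a local stand-in, `isMalformedSketch`, so that it runs on its own. It assumes the real guard simply checks the malformed flag, which this README does not spell out, and `collectMalformed` is a hypothetical helper.

```typescript
// Local copies of the README types, trimmed to the fields used here.
interface XmlNode {
  role: string;
  raw: string;
  globalIndex: number;
  localIndex: number;
  children?: XmlNode[];
  malformed?: true;
}
type MalformedXmlNode = XmlNode & { malformed: true };

// Stand-in for the package's isMalformed export (assumed behaviour).
function isMalformedSketch(node: XmlNode): node is MalformedXmlNode {
  return node.malformed === true;
}

// Gather every flagged node in document order, at any depth.
function collectMalformed(nodes: XmlNode[]): MalformedXmlNode[] {
  const found: MalformedXmlNode[] = [];
  for (const node of nodes) {
    if (isMalformedSketch(node)) found.push(node);
    if (node.children) {
      for (const inner of collectMalformed(node.children)) found.push(inner);
    }
  }
  return found;
}

// Hand-written copy of the malformed example tree from this README.
const sample: XmlNode[] = [
  {
    role: "openTag", raw: "<root>", globalIndex: 0, localIndex: 0, malformed: true,
    children: [
      {
        role: "openTag", raw: "<unclosed>", globalIndex: 1, localIndex: 0, malformed: true,
        children: [
          {
            role: "openTag", raw: "<valid>", globalIndex: 2, localIndex: 0,
            children: [{ role: "textLeaf", raw: "text", globalIndex: 3, localIndex: 0 }],
          },
        ],
      },
    ],
  },
];

const flagged = collectMalformed(sample).map((node) => node.raw);
```

With the real package, the same walk should work by importing isMalformed in place of the local stand-in.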
Requirements
Node.js >=20.0.0
