@payello/module-xml

v0.0.2

Published

2 years ago

Payello XML Module


payellodev

XMLParser

The XMLParser module turns XML documents into JavaScript objects.


Usage

First load the module into your script using one of the following:

// ES modules
import module_xml from "@payello/module-xml"
// CommonJS
const module_xml = require("@payello/module-xml")

Then create an XMLParser object to parse XML documents:

const options = {
    parse: {
        trimValues: false
    }
}

const xmlParser = new module_xml.XMLParser(options)

Finally, use the XMLParser to parse an XML document.

xmlParser.parse("<message><to>Bob</to><body>Hi Bob!</body></message>")

Output:

{
  "message": {
    "to": "Bob",
    "body": "Hi Bob!"
  }
}

Options

Parse Options

parse.attributes

Type: boolean | undefined
Default: true

Defines if the parser should parse attributes.

xml = `<root foo="bar">text</root>`

// When set to false, attributes are dropped from the output
options.parse.attributes = false

output = {
    "root": "text"
}

parse.emptyAttributes

Type: boolean | undefined
Default: undefined

Defines if the parser should parse attributes with no value as true.

xml = `<root foo="bar" enabled></root>`

// When set to false
options.parse.emptyAttributes = false
output = { 
    "root": { 
        "@_foo": "bar"
    }
}

// When set to true
options.parse.emptyAttributes = true
output = { 
    "root": { 
        "@_foo": "bar",
        "@_enabled": true
    }
}

parse.namespaces

Type: boolean | undefined
Default: true

Defines if the parser should keep namespace prefixes in tag & attribute names. When set to false, the prefixes are removed.

xml = `<myns:root coolns:foo="bar"></myns:root>`

// When set to true
options.parse.namespaces = true
output = { 
    "myns:root": { 
        "@_coolns:foo": "bar"
    }
}

// When set to false
options.parse.namespaces = false
output = { 
    "root": { 
        "@_foo": "bar"
    }
}
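
The effect of setting parse.namespaces to false can be pictured with a small helper. This is only a sketch: `stripNamespace` is a hypothetical name, not part of the module, and it assumes the prefix is everything up to the first colon.

```javascript
// Hypothetical helper (not part of the module's API): drops the
// namespace prefix from a tag or attribute name.
function stripNamespace(name) {
  const colon = name.indexOf(":");
  // Names without a colon have no prefix and are returned unchanged.
  return colon === -1 ? name : name.slice(colon + 1);
}

console.log(stripNamespace("myns:root"));  // root
console.log(stripNamespace("coolns:foo")); // foo
```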

parse.excludedPaths

Type: string[]
Default: []

Array of JPaths to exclude from parsing. Output of these paths will be the raw string of the node content including whitespace and any nested tags.

You can also specify a global tag to not process, such as *.pre to not process any <pre> tags or *.script to not process any <script> tags.

xml = `
<root>
    <a>
        Dont parse this!
        <c>Or this</c>
    </a>
    <b>Boom</b>
</root>`

options.parse.excludedPaths = [ "root.a" ]
output = { 
    "root": {
        "a": "\n        Dont parse this!\n        <c>Or this</c>\n    ",
        "b": "Boom"
    }
}
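
One way to read the two path forms described above (an exact JPath such as root.a, and a global form such as *.pre) is sketched below. `isExcluded` is a hypothetical helper, and the assumption that the global form compares only the last segment of the path is ours, not something the module documents.

```javascript
// Hypothetical matcher, not the module's implementation.
// Assumes JPaths are dot-separated, as in "root.a".
function isExcluded(jPath, excludedPaths) {
  return excludedPaths.some((pattern) => {
    if (pattern.startsWith("*.")) {
      // Global form: compare only the final tag name of the path.
      const tagName = pattern.slice(2);
      return jPath.split(".").pop() === tagName;
    }
    // Exact form: the whole path must match.
    return jPath === pattern;
  });
}

console.log(isExcluded("root.a", ["root.a"]));       // true
console.log(isExcluded("root.b", ["root.a"]));       // false
console.log(isExcluded("html.body.pre", ["*.pre"])); // true
```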

parse.unpairedTags

Type: string[]
Default: []

Defines a list of tags which don't have a matching closing tag. For example <br> in HTML.

xml = `
<html>
    Hello world<br>
</html>`

// When not set
options.parse.unpairedTags = []
// Parsing throws an Error:
// "Expected closing tag 'br' (opened in line 3, col 16) instead of closing tag 'html'.:4:1"

// When set
options.parse.unpairedTags = ["br"]
output = {
    "html": { 
        "br": '',
        '#value': 'Hello world'
    }
}

parse.entities

Type: boolean
Default: true

When true, entities (also known as variables) are supported.

xml = `
<!DOCTYPE note [
    <!ENTITY writer "Noseman">
    <!ENTITY copyright "Copyright 2022 Big Nose Studios">
]>
<root>
    <writer>&writer;</writer>
    <copyright>&copyright;</copyright>
</root>`

// When set to false
options.parse.entities = false
output = { 
    "root": {
        "writer": '&writer;',
        "copyright": '&copyright;' 
    }
}

// When set to true
options.parse.entities = true
output = { 
    "root": {
        "writer": 'Noseman',
        "copyright": 'Copyright 2022 Big Nose Studios' 
    }
}

parse.htmlEntities

Type: boolean
Default: false

When true, HTML entities such as `&euro;` (€) will be parsed.

The following HTML entities are supported:

| Result | Description | Entity | Decimal |
| :- | :- | :- | :- |
| \n | non-breaking space | `&nbsp;` | `&#160;` |
| < | less than | `&lt;` | `&#60;` |
| > | greater than | `&gt;` | `&#62;` |
| & | ampersand | `&amp;` | `&#38;` |
| " | double quotation mark | `&quot;` | `&#34;` |
| ' | single quotation mark (apostrophe) | `&apos;` | `&#39;` |
| ¢ | cent | `&cent;` | `&#162;` |
| £ | pound | `&pound;` | `&#163;` |
| ¥ | yen | `&yen;` | `&#165;` |
| € | euro | `&euro;` | `&#8364;` |
| © | copyright | `&copy;` | `&#169;` |
| ® | registered trademark | `&reg;` | `&#174;` |
| ₹ | Indian Rupee | `&inr;` | `&#8377;` |
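
To illustrate what decoding the entities listed above amounts to, here is a minimal sketch. `decodeHtmlEntities` and `NAMED_ENTITIES` are hypothetical names, the map covers only a subset of the list, and this is not the module's own implementation.

```javascript
// Hypothetical lookup covering a subset of the supported entities.
const NAMED_ENTITIES = new Map([
  ["lt", "<"], ["gt", ">"], ["amp", "&"], ["quot", '"'], ["apos", "'"],
  ["cent", "¢"], ["pound", "£"], ["yen", "¥"], ["euro", "€"],
  ["copy", "©"], ["reg", "®"], ["inr", "₹"],
]);

// Replaces named (&euro;) and decimal (&#8364;) entities in one pass.
// Unknown names are left untouched.
function decodeHtmlEntities(text) {
  return text.replace(/&(#\d+|[a-z]+);/g, (match, body) => {
    if (body.startsWith("#")) {
      return String.fromCodePoint(Number(body.slice(1)));
    }
    return NAMED_ENTITIES.has(body) ? NAMED_ENTITIES.get(body) : match;
  });
}

console.log(decodeHtmlEntities("5 &euro; &lt; 6 &pound;")); // 5 € < 6 £
console.log(decodeHtmlEntities("&#8364;"));                 // €
```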

parse.trimValues

Type: boolean
Default: true

Defines if the parser should remove leading and trailing whitespace from values.

xml = `<root foo="  bar   ">    hello    </root>`

// When set to false
options.parse.trimValues = false
output = { 
    "root": {
        "@_foo": "  bar   ",
        "#value": "    hello    "
    }
}

// When set to true
options.parse.trimValues = true
output = { 
    "root": {
        "@_foo": "bar",
        "#value": "hello"
    }
}

parse.isArray

Type: (name: string, jPath?: string, isLeafNode?: boolean, isAttribute?: boolean) => boolean
Default: (name) => false

A method to determine if a tag should be parsed as an array. If true then the output of the tag will be an array, otherwise it will be an object.

xml = `<root>Hello world</root>`

// When false
options.parse.isArray = (name) => { return false }
output = { 
    "root": "Hello world"
}

// When true
options.parse.isArray = (name) => { return true }
output = { 
    "root": [
        "Hello world"
    ]
}
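
In practice the callback usually decides per path rather than returning a constant. The sketch below forces a fixed set of JPaths to arrays; the paths and the `ALWAYS_ARRAY` name are made up for illustration, and the dot-separated JPath format is assumed from the parse.excludedPaths example.

```javascript
// Hypothetical list of paths that should always be arrays,
// even when only one element is present in the document.
const ALWAYS_ARRAY = new Set(["order.items.item", "order.notes.note"]);

// Matches the documented signature; attributes are never turned into arrays.
const isArray = (name, jPath, isLeafNode, isAttribute) =>
  !isAttribute && ALWAYS_ARRAY.has(jPath);

const options = { parse: { isArray } };

console.log(isArray("item", "order.items.item", true, false)); // true
console.log(isArray("id", "order.id", true, false));           // false
```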

Output Options

output.raw

Type: boolean | undefined
Default: undefined

When true, the output is an array of all parsed tags in document order, which gives a more verbose and detailed object. Attributes are always grouped under the ":@" property, regardless of the output.attributeGroup option. This can be helpful when building an XML document from the JS object, because it preserves the original structure and hierarchy of the XML data.

xml = `
<root foo="bar">
    <hello>world</hello>
</root>`

options.output.raw = true

output = [
  {
    "root": [
      {
        "hello": [
          {
            "#value": "world"
          }
        ]
      }
    ],
    ":@": {
      "@_foo": "bar"
    }
  }
]

output.attributePrefix

Type: string
Default: "@_"

Sets the prefix for attribute names.

xml = `<root foo="bar"></root>`

options.output.attributePrefix = "@attr_"

output = { 
    "root": { 
        "@attr_foo": "bar" 
    } 
}

output.attributeGroup

Type: string | undefined
Default: undefined

When defined, all attributes will be grouped under the given name.

xml = `<root foo="bar"></root>`

options.output.attributeGroup = "attr"

output = {
    "root": {
        "attr": {
            "@_foo": "bar" 
        }
    }
}

output.tagValueName

Type: string
Default: "#value"

Sets the name for the value of a tag.

xml = `
<a>
    text
    <b>alpha</b>
</a>`

options.output.tagValueName = "$text"

output = {
    "a": {
        "b": "alpha",
        "$text": "text"
    }
}

output.tagValueAlways

Type: boolean | undefined
Default: undefined

If set to true then the node value property will always be created.

xml = `<root>Hello world</root>`

// When set to false
options.output.tagValueAlways = false
output = { 
    "root": "Hello world"
}

// When set to true
options.output.tagValueAlways = true
output = { 
    "root": {
        "#value": "Hello world"
    }
}

output.commentName

Type: string | undefined
Default: undefined

If defined then comments will also be parsed and output with the given name.

xml = `
<root>
    <!--This is a comment-->
    Hello world
</root>`

// When undefined
options.output.commentName = undefined
output = { 
    "root": "Hello world"
}

// When set
options.output.commentName = "#comment"
output = { 
    "root": {
        "#comment": "This is a comment",
        "#value": "Hello world"
    }
}

output.declarationTags

Type: boolean | undefined
Default: undefined

When true, includes the <?xml> tag in output.

output.cdataName

Type: string | undefined
Default: undefined

If defined, then CDATA values are parsed to the given name.

xml = `<root>name:<![CDATA[<some>Jack</some>]]><![CDATA[Jack]]></root>`

// When undefined
options.output.cdataName = undefined
output = { 
    "root": "name:<some>Jack</some>Jack"
}

// When set to a string
options.output.cdataName = "__cdata"
output = { 
    "root": {
        "__cdata": [
            "<some>Jack</some>",
            "Jack"
        ],
        "#value": "name:"
    }
}

output.piTags

Type: boolean | undefined
Default: undefined

When true, includes PI tags in the output. PI (processing instruction) tags are tags which begin with <?.

Transform Options

transform.tag.name

Type: undefined | (tagName: string) => string
Default: undefined

A method executed on each tag name, which lets you alter tag names to a required format. The option is disabled when undefined.

xml = `<root>Hello world</root>`

options.transform.tag.name = (tagName) => tagName.toUpperCase()
output = { 
    "ROOT": "Hello world"
}
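
A common use for this option is normalising tag names to camelCase. The following is a small sketch of such a callback; `toCamelCase` is a hypothetical name and the tag names are made up.

```javascript
// Converts kebab-case or snake_case tag names to camelCase.
// Names without separators are returned unchanged.
const toCamelCase = (tagName) =>
  tagName.replace(/[-_]+([a-zA-Z0-9])/g, (_, ch) => ch.toUpperCase());

const options = { transform: { tag: { name: toCamelCase } } };

console.log(toCamelCase("order-line-item")); // orderLineItem
console.log(toCamelCase("root"));            // root
```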

transform.tag.value

Type: undefined | (tagName: string, val: any, jPath?: string, hasAttributes?: boolean, isLeafNode?: boolean) => any
Default: undefined

A method to be executed for each tag value parsed. The output of the method will be used as the output value of the tag.

tagName: string // Name of the tag (e.g. root)
val: any // Value of the tag, after being trimmed if enabled
jPath?: string // JPath of the tag
hasAttributes?: boolean // Boolean indicating if the tag has attributes
isLeafNode?: boolean // Boolean indicating if the tag is a leaf node
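
As an illustration of how these parameters might be used together, the sketch below converts the value at one specific path into a Date and leaves everything else alone. The path "order.created" and the function name `tagValue` are hypothetical; only the parameter list comes from the signature above.

```javascript
// Example callback following the documented signature.
// Only leaf string values at the (made-up) path "order.created" are converted.
function tagValue(tagName, val, jPath, hasAttributes, isLeafNode) {
  if (isLeafNode && jPath === "order.created" && typeof val === "string") {
    const parsed = new Date(val);
    // Fall back to the original string when it is not a valid date.
    return Number.isNaN(parsed.getTime()) ? val : parsed;
  }
  return val;
}

const options = { transform: { tag: { value: tagValue } } };

const created = tagValue("created", "2022-01-15T00:00:00Z", "order.created", false, true);
console.log(created instanceof Date);                   // true
console.log(tagValue("id", 7, "order.id", false, true)); // 7
```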

transform.tag.castValue

Type: boolean
Default: true

Defines if the parser should cast numeric & boolean values to their respective types. For more control over number casting, see the transform.castNumber options.

xml = `<root>1234</root>`

// When set to false
options.transform.tag.castValue = false
output = { 
    "root": "1234" // string
}

// When set to true
options.transform.tag.castValue = true
output = { 
    "root": 1234 // number
}

// Boolean value example
xml = `<root>true</root>`
output = { 
    "root": true // boolean
}

// Nested in the middle of value example
xml = `
<root>
    56<nested>910</nested>78
</root>`

output = { 
    "root": {
        "nested": 910, // number
        "#value": "5678" // string
    }
}

// Nested at the end of value example
xml = `
<root>
    5678<nested>910</nested>
</root>`

output = { 
    "root": {
        "nested": 910, // number
        "#value": 5678 // number
    }
}

transform.attribute.name

Type: undefined | (attrName: string) => string
Default: undefined

A method executed on each attribute name, which lets you alter attribute names to a required format. The option is disabled when undefined.

xml = `<root foo="bar">Hello world</root>`

options.transform.attribute.name = (attrName) => attrName.toUpperCase()
output = { 
    "root": {
        "@_FOO": "bar",
        "#value": "Hello world"
    }
}

transform.attribute.value

Type: undefined | (attrName: string, val: string, jPath?: string) => any
Default: undefined

A method to be executed for each attribute value parsed. The output of the method will be used as the output value of the attribute.

attrName: string // Name of the attribute
val: string // Value of the attribute, after being trimmed if enabled
jPath?: string // JPath of the attribute

transform.attribute.castValue

Type: boolean
Default: false

Defines if the parser should cast numeric & boolean values on attributes to their respective types. For more control over number casting, see the transform.castNumber options.

xml = `<root foo="bar" index="123" enabled="true"></root>`

// When set to false
options.transform.attribute.castValue = false
output = { 
    "root": {
        "@_foo": "bar", // string
        "@_index": "123", // string
        "@_enabled": "true" // string
    }
}

// When set to true
options.transform.attribute.castValue = true
output = { 
    "root": {
        "@_foo": "bar", // string
        "@_index": 123, // number
        "@_enabled": true // boolean
    }
}

transform.castNumber.hex

Type: true | undefined
Default: true

If set to true then hexadecimal strings will also be parsed to numbers (e.g. "0x2f" = 47).

transform.castNumber.leadingZeros

Type: true | undefined
Default: true

If set to false, then values with a leading 0 (e.g. "0123") will not be parsed as numbers. Decimals such as 0.0 and 0.5 are not impacted.

transform.castNumber.skipLike

Type: RegExp | undefined
Default: undefined

If set and the Regular Expression (regex) test succeeds on the value, then number parsing will be skipped.

transform.castNumber.eNotation

Type: true | undefined
Default: undefined

If true then exponential notation strings will also be parsed to numbers (e.g. "12e7" = 120000000).
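
The four castNumber options can be read as a series of checks applied before a string is converted. The sketch below shows one way those checks might combine, using the documented defaults; `castNumber` here is a hypothetical standalone function, and the order of the checks is an assumption, not the module's actual implementation.

```javascript
// Hypothetical sketch of the castNumber rules. Returns a number when the
// value qualifies, otherwise returns the original string unchanged.
function castNumber(value, { hex = true, leadingZeros = true, skipLike, eNotation } = {}) {
  const text = value.trim();
  // skipLike: a matching value is never converted.
  if (skipLike && skipLike.test(text)) return value;
  // hex: "0x2f" style strings.
  if (/^0x[0-9a-f]+$/i.test(text)) return hex ? Number.parseInt(text, 16) : value;
  // eNotation: "12e7" style strings.
  if (/^[+-]?\d+(\.\d+)?e[+-]?\d+$/i.test(text)) return eNotation ? Number(text) : value;
  // Anything that is not a plain integer or decimal stays a string.
  if (!/^[+-]?\d+(\.\d+)?$/.test(text)) return value;
  // leadingZeros: "0123" is skipped when false; "0", "0.0" and "0.5" are not affected.
  if (!leadingZeros && /^[+-]?0\d/.test(text)) return value;
  return Number(text);
}

console.log(castNumber("0x2f"));                          // 47
console.log(castNumber("0123", { leadingZeros: false })); // 0123
console.log(castNumber("12e7", { eNotation: true }));     // 120000000
```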
