xml-xsd-engine
v1.7.3
Zero-dependency XML parser, XSD parser, schema model and validation engine written in TypeScript
xml-xsd-engine
Fast, zero-dependency XML + XSD validation engine for Node.js and browsers — written in TypeScript from scratch.
✔ XML parser (DOM + SAX + Streaming) ✔ XSD schema validation
✔ XPath 1.0 engine (compiled, two-tier LRU) ✔ XPath 2.0 functions (22)
✔ XML ↔ JSON transforms ✔ XSLT-lite transformations
✔ Schema cache (LRU + TTL + content-hash) ✔ Batch validation (concurrent)
✔ XSD → TypeScript codegen (streaming) ✔ XSD → JSON Schema Draft 7 (streaming)
✔ CLI xml-validate + xml-format ✔ Secure by default (XXE-safe)
✔ Namespace Engine ✔ Validation Pipeline (7 stages)
✔ Strict / Lax / Recover modes ✔ Profiling hooks (per-stage)
✔ Schema Compiler (type resolution) ✔ Schema Preflight (self-consistency)
✔ Identity Constraints (xs:key/unique/keyref) ✔ xs:assert XPath 1.0 lite
✔ SAX Instrumentation ✔ Source-mapped errors (line/col)
✔ SHA-256 content-hash cache keying ✔ DTD internal entity expansion
✔ xml:id / xml:base processing ✔ xs:list / xs:union hardening
✔ XML diff (diffXml) ✔ Schema inference (inferSchema)
✔ Fragment / subtree validation ✔ Encoding support (ISO-8859-1, ...)
✔ PSVI annotations — typed output (v1.7) ✔ Canonical XML C14N 1.0 (v1.7)
✔ xs:redefine support (v1.7) ✔ Mixed content model fix (v1.7)
✔ SchemaMerger — unified import/include (v1.7) ✔ XPath cache control (v1.7)
✔ Browser / Deno / Bun entry points ✔ Tree-shaking (sideEffects:false)
Quick Start
npm install xml-xsd-engine

import { parseXml, parseXsd, validate } from 'xml-xsd-engine';
const xml = `<book><title>Dune</title></book>`;
const xsd = `
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="book">
<xs:complexType>
<xs:sequence>
<xs:element name="title" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>`;
const result = validate(parseXml(xml), parseXsd(xsd));
console.log(result.valid); // true
console.log(result.errorCount); // 0
Why this library?
Most JavaScript XML libraries only parse XML. xml-xsd-engine is a complete XML processing stack in a single zero-dependency package:
| Feature | xml-xsd-engine | fast-xml-parser | xml2js | xmldom |
|---------|:-:|:-:|:-:|:-:|
| XML parsing (DOM) | ✅ | ✅ | ✅ | ✅ |
| SAX / event streaming | ✅ | ✅ | ✅ | ✅ |
| XPath queries (compiled, cached) | ✅ | ❌ | ❌ | ✅ |
| XSD schema validation | ✅ | ❌ | ❌ | ❌ |
| XML ↔ JSON transforms | ✅ | ✅ | ✅ | ❌ |
| XSLT transformations | ✅ | ❌ | ❌ | ❌ |
| XSD → TypeScript codegen | ✅ | ❌ | ❌ | ❌ |
| XSD → JSON Schema Draft 7 | ✅ | ❌ | ❌ | ❌ |
| Namespace Engine | ✅ | ❌ | ❌ | ✅ |
| Validation Pipeline (7 stages) | ✅ | ❌ | ❌ | ❌ |
| Strict / Lax / Recover modes | ✅ | ❌ | ❌ | ❌ |
| Structured error codes | ✅ | ❌ | ❌ | ❌ |
| Source-mapped errors (line/col) | ✅ | ❌ | ❌ | ❌ |
| Identity constraints (xs:key/unique/keyref) | ✅ | ❌ | ❌ | ❌ |
| Schema Preflight | ✅ | ❌ | ❌ | ❌ |
| XML diff (diffXml) | ✅ | ❌ | ❌ | ❌ |
| Schema inference | ✅ | ❌ | ❌ | ❌ |
| PSVI typed output (v1.7) | ✅ | ❌ | ❌ | ❌ |
| Canonical XML / C14N (v1.7) | ✅ | ❌ | ❌ | ❌ |
| xs:redefine support (v1.7) | ✅ | ❌ | ❌ | ❌ |
| Streaming code generation (v1.7) | ✅ | ❌ | ❌ | ❌ |
| Browser / Deno / Bun entry points | ✅ | ❌ | ❌ | ❌ |
| TypeScript-first | ✅ | ✅ | ⚠️ | ⚠️ |
| Zero runtime deps | ✅ | ✅ | ❌ | ❌ |
| XXE safe by default | ✅ | ⚠️ | ⚠️ | ⚠️ |
Installation
npm install xml-xsd-engine
# or
yarn add xml-xsd-engine
pnpm add xml-xsd-engine
Requirements: Node.js ≥ 18 · ESM or CommonJS · Zero runtime dependencies
Core Features
Parse XML
import { parseXml } from 'xml-xsd-engine';
const doc = parseXml('<catalog><book id="1"><title>Dune</title></book></catalog>');
doc.root?.tagName; // 'catalog'
doc.root?.childElements[0].getAttribute('id'); // '1'
doc.root?.childElements[0].childElements[0].textContent; // 'Dune'
Validate against XSD
import { parseXml, parseXsd, validate } from 'xml-xsd-engine';
const result = validate(parseXml(xml), parseXsd(xsd));
console.log(result.valid); // true | false
console.log(result.errorCount); // number
result.issues.forEach(e =>
console.error(`${e.path} @${e.line}:${e.col} [${e.code}] ${e.message}`)
);
Validation Pipeline (7 stages)
import { runPipeline, ValidationPipeline } from 'xml-xsd-engine';
// Convenience function
const pipe = runPipeline(xmlString, schema, { profile: true, psvi: true });
console.log('totalMs:', pipe.totalMs);
pipe.stages.forEach(s =>
console.log(` ${s.name}: ${s.success ? '✓' : '✗'} ${s.durationMs.toFixed(2)}ms`)
);
// Reusable pipeline (compile schema once)
const pipeline = new ValidationPipeline(schema, { mode: 'lax', recover: true });
pipeline.precompileSchema(); // cached after first call
const r = pipeline.run(xmlString);
Strict / Lax / Recover modes
// strict (default) — unknown elements/attributes are hard errors
const strict = validate(doc, schema, { mode: 'strict' });
// lax — unknown elements produce warnings instead of errors
const lax = validate(doc, schema, { mode: 'lax' });
// recover — continue past structural errors to collect all issues
const all = validate(doc, schema, { recover: true, collectAll: true });
PSVI — Typed Validation Output (v1.7)
import { validate, extractPsvi } from 'xml-xsd-engine';
const result = validate(doc, schema, { psvi: true });
// Convert WeakMap → serialisable path→annotation map
const annotated = extractPsvi(result.psvi!, doc);
for (const [path, ann] of annotated) {
console.log(path, ann.xsdType, ann.validity, ann.normalizedValue);
}
// /catalog xs:anyType valid null
// /catalog/item[1] ItemType valid null
// /catalog/item[1]/id xs:string valid '001'
Canonical XML (v1.7)
import { canonicalize, clearCanonicalCache } from 'xml-xsd-engine';
const c14n = canonicalize(doc); // inclusive C14N 1.0
const exc = canonicalize(doc, { mode: 'exclusive' }); // exclusive (WS-Security)
const cmt = canonicalize(doc, { withComments: true }); // preserve comments
const sub = canonicalize(doc.root!.childElements[0]); // subtree only
clearCanonicalCache(); // invalidate subtree memo after DOM mutation
XPath Engine (compiled, two-tier LRU cache)
import { xpath, compileXPath, configureXPathCache } from 'xml-xsd-engine';
// Ad-hoc queries
const books = xpath(doc, '//book[@lang="en"]') as XmlElement[];
const count = xpath(doc, 'count(//book)') as number;
// Compiled expressions (fastest — avoids reparse)
const expr = compileXPath('//book/title');
const titles = expr.evaluate(doc);
// Tune two-tier LRU cache (default: cold=512, hot=128)
configureXPathCache({ coldMax: 1024, hotMax: 256 });
Identity Constraints
// xs:key / xs:unique / xs:keyref evaluated automatically by ValidationEngine
const result = validate(doc, schema); // identity checks run in stage 6
// Violations produce structured issues:
// code: 'IDENTITY_DUPLICATE_KEY', category: 'identity'
Schema Preflight
import { validateSchema, checkSchema } from 'xml-xsd-engine';
// From XSD source — catch errors at load time
const issues = validateSchema(xsdSource);
const errors = issues.filter(i => i.severity === 'error');
if (errors.length) throw new Error(errors.map(e => e.message).join('\n'));
// From compiled SchemaModel
const issues2 = checkSchema(schema);
Schema Cache
import { SchemaCache, globalSchemaCache } from 'xml-xsd-engine';
const cache = new SchemaCache({ maxSize: 50, ttlMs: 3_600_000 });
const schema = cache.getOrParse('my-schema', xsdSource); // compile once, ~900k ops/s on hit
cache.invalidateByContent('my-schema', newSource); // invalidates if hash changed
XSD → TypeScript / JSON Schema
import { generateTypeScript, generateJsonSchema,
generateTypeScriptStream, generateJsonSchemaStream } from 'xml-xsd-engine';
// Synchronous
const ts = generateTypeScript(schema, { typePrefix: 'I', exportAll: true });
const jsonSchema = generateJsonSchema(schema, { useDefinitions: true });
// Streaming — ≈4 KB chunks, no peak memory spike on large schemas (v1.7)
for await (const chunk of generateTypeScriptStream(schema)) outputStream.write(chunk);
for await (const chunk of generateJsonSchemaStream(schema)) outputStream.write(chunk);
XML Diff
import { diffXml } from 'xml-xsd-engine';
const changes = diffXml(doc1, doc2, { ignoreWhitespace: true });
// [{ type: 'modified', path: '/items/item[1]' },
// { type: 'added', path: '/items/item[2]' }]
Schema Inference
import { inferSchema } from 'xml-xsd-engine';
const { model, toXsdString } = inferSchema([doc1, doc2], {
targetNamespace: 'http://example.com',
inferSimpleTypes: true,
inferRequired: true,
});
console.log(toXsdString()); // valid xs:schema document
Fragment & Subtree Validation
import { validateFragment, validateSubtree } from 'xml-xsd-engine';
const r1 = validateFragment('<price currency="USD">19.99</price>', schema);
const r2 = validateSubtree(element, schema, { rootType: 'PriceType' });
Batch Validation
import { BatchValidator } from 'xml-xsd-engine';
const report = await new BatchValidator('schema.xsd', { concurrency: 8 })
.validateFiles(['a.xml', 'b.xml', 'c.xml']);
console.log(report.passed, '/', report.total, 'in', report.durationMs, 'ms');
SAX Parser (O(depth) memory)
import { parseSax } from 'xml-xsd-engine';
parseSax(xml, {
startElement: e => console.log('open', e.event.localName, e.event.namespaceURI),
endElement: e => console.log('close', e.event.localName),
text: e => console.log('text', e.value),
});
SAX Instrumentation
import { SaxInstrumentation } from 'xml-xsd-engine';
const instr = new SaxInstrumentation();
instr.onElementStart = ({ path, localName, depth, line, col }) =>
console.log(`depth=${depth} path=${path} @${line}:${col}`);
instr.parse(xml);
XML ↔ JSON
import { xmlToJson, jsonToXmlString } from 'xml-xsd-engine';
const obj = xmlToJson(doc, { attributePrefix: '@', collapseText: true });
const xml = jsonToXmlString({ book: { '@id': '1', title: 'Dune' } });
XSLT Transformation
import { transformXml } from 'xml-xsd-engine';
const result = transformXml(sourceXml, stylesheetXml);
Streaming (large files)
import { parseXmlFileStream, parseXmlStream } from 'xml-xsd-engine';
const doc = await parseXmlFileStream('./large.xml');
const doc2 = await parseXmlStream(nodeReadableStream);
Async Schema Loader
import { parseXsdAsync } from 'xml-xsd-engine';
const schema = await parseXsdAsync(mainXsdSource, async (location) => {
const res = await fetch(`https://schemas.example.com/${location}`);
return res.ok ? res.text() : null;
});
DTD Internal Entities
const xml = `<!DOCTYPE root [<!ENTITY greeting "Hello!">]><root>&greeting;</root>`;
parseXml(xml).root!.textContent; // 'Hello!'
xml:id / xml:base
import { parseXml, resolveXmlBase } from 'xml-xsd-engine';
const doc = parseXml('<root><a xml:id="id1" xml:base="http://ex.com/docs/"/></root>');
doc.xmlIds.get('id1'); // → <a> element (O(1) lookup)
resolveXmlBase('http://ex.com/docs/', '../images/logo.png');
// → 'http://ex.com/images/logo.png'
XPath 2.0 Functions
import { upperCase, lowerCase, matchesXPath, tokenize,
distinctValues, avg, stringJoin } from 'xml-xsd-engine';
upperCase('hello'); // 'HELLO'
matchesXPath('foo123', '\\d+'); // true
distinctValues([1, 2, 1, 3]); // [1, 2, 3]
tokenize('a b c', ' '); // ['a', 'b', 'c']
avg([1, 2, 3]); // 2
SHA-256 Utilities
import { sha256Hex, sha256Short } from 'xml-xsd-engine';
sha256Hex('hello'); // 64-char hex SHA-256
sha256Short('hello'); // first 16 chars (for cache keys)
Platform Entry Points
// Browser / Cloudflare Workers / WASM (no fs/path)
import { parseXml, validate } from 'xml-xsd-engine/browser';
// Deno
import { parseXml, denoReadXmlFile } from 'xml-xsd-engine/deno';
// Bun
import { parseXml, bunReadXmlFile } from 'xml-xsd-engine/bun';
Profiling
const result = runPipeline(xmlSource, schema, {
profile: true,
onProfile: (e) => console.log(e.stage, e.durationMs),
});
result.stages.forEach(s => console.log(s.name, s.durationMs));
CLI
xml-validate
# Basic validation
xml-validate document.xml schema.xsd
# Multiple files with glob
xml-validate data/*.xml schema.xsd --format compact
# Output formats: text (default), compact, json, github, junit
xml-validate document.xml schema.xsd --format json
xml-validate document.xml schema.xsd --format junit --output results.xml
xml-validate document.xml schema.xsd --format github # GitHub Actions annotations
# Validation modes
xml-validate document.xml schema.xsd --lax # allow unknown elements
xml-validate document.xml schema.xsd --recover # collect all errors
# Well-formedness only (no schema)
xml-validate --well-formed document.xml
# Watch mode (re-validates on file change, 200ms debounce)
xml-validate --watch document.xml schema.xsd
# Show source code frame around errors
xml-validate document.xml schema.xsd --code-frame
# Silent (exit code only)
xml-validate document.xml schema.xsd --silent && echo "OK"
Exit codes: 0 valid · 1 validation errors · 2 parse error · 3 file not found
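A wrapper script can branch on these exit codes instead of parsing CLI output. The sketch below only encodes the four documented codes; the helper names describeExitCode and shouldFailBuild are hypothetical and not part of the package.

```typescript
// Hypothetical helper: labels for the documented xml-validate exit codes.
type ExitStatus =
  | 'valid'
  | 'validation-errors'
  | 'parse-error'
  | 'file-not-found'
  | 'unknown';

function describeExitCode(code: number): ExitStatus {
  switch (code) {
    case 0: return 'valid';
    case 1: return 'validation-errors';
    case 2: return 'parse-error';
    case 3: return 'file-not-found';
    default: return 'unknown'; // anything outside the documented range
  }
}

// Hypothetical policy: treat every non-zero code as a build failure.
function shouldFailBuild(code: number): boolean {
  return code !== 0;
}
```

In a Node.js script, the status returned by the child process that ran xml-validate could be passed straight to describeExitCode.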
xml-format
xml-format document.xml # pretty-print to stdout
xml-format document.xml --indent 4 # 4-space indent
xml-format document.xml --declaration # add <?xml ...?> declaration
xml-format document.xml --in-place # overwrite file
xml-format document.xml --output formatted.xml
Architecture
XML string
│
▼
XmlLexer single-pass state machine, no regex, CRLF norm, line/col
│ token stream
▼
XmlParser / SaxParser stack-based DOM OR event stream (O(depth) memory)
│ └── SaxInstrumentation (path + depth + namespace)
▼
NamespaceEngine scoped prefix→URI, O(decls) per push, default-NS
│
▼
ValidationPipeline ──── SchemaCompilerLite ──── SchemaModel (from XsdParser)
│ 7 stages: (compile once) + SchemaMerger (import/include/redefine)
│ parse
│ namespace
│ schema-compile
│ structure-validate ──── ValidationEngine
│ type-validate ──── TypeValidator (44 types + all facets)
│ identity-check ──── IdentityConstraintEngine (xs:key/unique/keyref)
│ post-process ──── AssertionEvaluator (xs:assert) + Psvi
│
▼
ValidationResult
│
├── formatResult (text/compact/json/github/junit)
├── extractPsvi (WeakMap → Map<path, PsviAnnotation>)
└── XmlCanonical (C14N 1.0 inclusive/exclusive)
Security
Built-in protections — all on by default:
| Threat | Protection |
|--------|-----------|
| XXE injection | SYSTEM/PUBLIC entity URIs never fetched |
| Billion Laughs | Entity expansion capped at 10,000 |
| Deep nesting DoS | maxDepth limit (default 500) |
| Attribute flood | maxAttributes limit (default 256) |
| Giant text node | maxTextLength limit (default 10 MB) |
| Node count bomb | maxNodeCount limit (default 1,000,000) |
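To make the depth and node-count limits concrete, here is a minimal sketch of how a parser could enforce them while walking elements. It is an illustration under assumptions, not the library's code: the LimitGuard class and its method names are hypothetical, and only the default values (500 and 1,000,000) come from the table above.

```typescript
// Hypothetical guard a parser could call on every element open/close.
interface Limits {
  maxDepth: number;
  maxNodeCount: number;
}

class LimitGuard {
  private depth = 0;
  private nodes = 0;
  private limits: Limits;

  constructor(limits: Limits = { maxDepth: 500, maxNodeCount: 1_000_000 }) {
    this.limits = limits;
  }

  // Call when an element start tag is seen.
  openElement(): void {
    this.depth += 1;
    this.nodes += 1;
    if (this.depth > this.limits.maxDepth) {
      throw new RangeError(`maxDepth ${this.limits.maxDepth} exceeded`);
    }
    if (this.nodes > this.limits.maxNodeCount) {
      throw new RangeError(`maxNodeCount ${this.limits.maxNodeCount} exceeded`);
    }
  }

  // Call when an element end tag is seen; depth never drops below zero.
  closeElement(): void {
    this.depth = Math.max(0, this.depth - 1);
  }
}
```

Failing fast at the first violation keeps memory use bounded on hostile input, which is the point of having the limits on by default.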
import { parseXml } from 'xml-xsd-engine';
// Custom limits for untrusted input
const doc = parseXml(untrustedXml, {
maxDepth: 100,
maxNodeCount: 50_000,
maxTextLength: 500_000,
maxAttributes: 64,
});
v1.7.0 Highlights
New features
| Feature | Description |
|---------|-------------|
| PSVI | Post-Schema-Validation Infoset — per-element type annotation (xsdType, normalizedValue, validity) |
| Canonical XML | W3C C14N 1.0 — inclusive and exclusive modes, subtree canonicalization, clearCanonicalCache() |
| xs:redefine | Load + merge a base schema, then override named components. Required for UBL, ebXML |
| Mixed content fix | mixed="true" interleaved text nodes no longer produce false-positive errors |
| SchemaMerger | Centralizes xs:import/include/redefine with SHA-256 content-hash caching |
| XPath cache control | configureXPathCache(), xpathCacheStats() — two-tier LRU (cold 512 + hot 128) |
| Streaming codegen | generateTypeScriptStream() / generateJsonSchemaStream() — ≈4 KB chunks |
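The two-tier LRU named in the XPath cache control row can be pictured roughly as follows. This is a minimal sketch and not the library's implementation: the TwoTierLru class, the tierOf helper, and the promotion rule (entries start in the cold tier and move to the hot tier on first reuse, with hot evictions dropped outright) are all assumptions. Only the default sizes (cold 512, hot 128) come from the table.

```typescript
// Hypothetical two-tier LRU: new keys land in "cold", reused keys move to "hot".
class TwoTierLru<V> {
  private cold = new Map<string, V>();
  private hot = new Map<string, V>();
  private coldMax: number;
  private hotMax: number;

  constructor(coldMax = 512, hotMax = 128) {
    this.coldMax = coldMax;
    this.hotMax = hotMax;
  }

  get(key: string): V | undefined {
    if (this.hot.has(key)) {
      const value = this.hot.get(key) as V;
      this.hot.delete(key); // re-insert to mark as most recently used
      this.hot.set(key, value);
      return value;
    }
    if (this.cold.has(key)) {
      const value = this.cold.get(key) as V;
      this.cold.delete(key);
      this.insert(this.hot, this.hotMax, key, value); // promote on reuse
      return value;
    }
    return undefined;
  }

  set(key: string, value: V): void {
    if (this.hot.has(key)) {
      this.hot.delete(key);
      this.hot.set(key, value);
      return;
    }
    this.cold.delete(key);
    this.insert(this.cold, this.coldMax, key, value);
  }

  // Reports which tier currently holds the key, if any.
  tierOf(key: string): 'hot' | 'cold' | undefined {
    if (this.hot.has(key)) return 'hot';
    if (this.cold.has(key)) return 'cold';
    return undefined;
  }

  // Map preserves insertion order, so the first key is the least recently used.
  private insert(tier: Map<string, V>, max: number, key: string, value: V): void {
    tier.set(key, value);
    if (tier.size > max) {
      const oldest = tier.keys().next().value as string;
      tier.delete(oldest);
    }
  }
}
```

The appeal of splitting tiers is that a burst of one-off expressions only churns the cold tier, while expressions that are reused stay resident in the hot tier.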
Performance improvements
| Fix | Impact |
|-----|--------|
| XPath two-tier LRU | +24% first-expression throughput, +4× hot-tier hits |
| Validation traversal memoization | +5% validate throughput |
| XmlDiff fingerprint cache | +40% diff throughput |
| Canonical XML delta saves | O(decls) per element (was O(scope)) |
| PSVI object pool | −60% annotation heap allocation |
| IdentityConstraintEngine pre-index | O(1) skip for constraint-free elements |
Bug fixes (critical)
- result.valid was always true: _errorCount was not incremented in v1.6.0. Now fixed.
- Canonical XML C14N spec §2.4 violation: inclusive C14N now renders all in-scope namespaces
- XmlDiff false "no change": recursive fingerprints now encode child subtrees correctly
- xs:assert non-deterministic order: now sorted by test string before evaluation
Breaking Changes
v1.6.x → v1.7.0
No breaking API changes. v1.7.0 is fully backward-compatible.
Behavior changes (bug fixes):
| Area | Change |
|------|--------|
| result.valid | Was always true in v1.6.0. Now accurately reflects whether errors occurred. |
| Canonical XML | Inclusive C14N output may differ for documents with inherited namespace scopes (now spec-correct). |
| XmlDiff | May detect more changes in documents that were previously reported as identical. |
| xs:assert | Errors emitted in deterministic order (sorted by test string). |
v1.5.x → v1.6.0
No breaking changes. Added: XML diff, schema inference, fragment validation, encoding support, source-mapped errors.
v1.4.x → v1.5.0
| Change | Migration |
|--------|-----------|
| StartElementEvent.nsDeclarations new required field | Add nsDeclarations: new Map() if constructing manually |
| Three new XmlErrorCode values | Add cases to exhaustive switch(err.code) blocks |
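For the second row, an exhaustive switch is one that fails to compile when a new union member is left unhandled. The sketch below shows the usual TypeScript pattern with a never-typed fallback. The ErrorCode union is a hypothetical stand-in: only IDENTITY_DUPLICATE_KEY appears in this README, the other two members are invented for illustration, and the real XmlErrorCode type is larger.

```typescript
// Hypothetical subset; the real XmlErrorCode union has more members.
type ErrorCode = 'PARSE_ERROR' | 'TYPE_MISMATCH' | 'IDENTITY_DUPLICATE_KEY';

// Accepts only `never`, so an unhandled union member becomes a compile error.
function assertNever(value: never): never {
  throw new Error(`Unhandled error code: ${String(value)}`);
}

function categorize(code: ErrorCode): string {
  switch (code) {
    case 'PARSE_ERROR':
      return 'syntax';
    case 'TYPE_MISMATCH':
      return 'type';
    case 'IDENTITY_DUPLICATE_KEY':
      return 'identity';
    default:
      // If a new member is added to ErrorCode and not handled above,
      // `code` is no longer `never` here and the compiler flags this call.
      return assertNever(code);
  }
}
```

With this pattern, upgrading to a version that adds error codes surfaces every switch that needs a new case at build time rather than at runtime.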
v1.3.x → v1.4.0
| Change | Migration |
|--------|-----------|
| ValidationResult.summary shape changed | Replace .summary.errors → .summary.errorCount |
| Strict mode is now the default | Set mode: 'lax' if documents have undeclared elements |
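Code that has to run against both sides of the summary rename during an upgrade can read the count through a small shim. This is a sketch under assumptions: both interface shapes are guessed from the field names in the table (each field is assumed to be a number), and errorCountOf is a hypothetical helper, not a package export.

```typescript
// Assumed v1.3.x shape, based only on the field name in the migration table.
interface LegacySummary {
  errors: number;
}

// Assumed v1.4+ shape.
interface CurrentSummary {
  errorCount: number;
}

// Hypothetical shim: prefers the new field and falls back to the old one.
function errorCountOf(summary: LegacySummary | CurrentSummary): number {
  return 'errorCount' in summary ? summary.errorCount : summary.errors;
}
```

Once every caller is on v1.4 or later, the shim can be deleted and .summary.errorCount read directly.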
Documentation
| Document | Description |
|----------|-------------|
| Quick Start | 10 common recipes in 5 minutes |
| Usage Guide | Full recipes for every feature |
| Feature Reference | All 49 features from v1.0–v1.7.0 |
| API Reference | Complete export reference |
| Architecture | Internal design + data flow |
| Error Codes | All XmlErrorCode values by category |
| Migration Guide | Upgrading between versions |
| Testing Guide | Test structure, coverage, how to write tests |
| Performance Guide | Baselines, tuning, benchmarks |
| CLI Reference | xml-validate and xml-format flags |
| Security Policy | Threat model + secure defaults |
| Compatibility | Runtime matrix, XSD compliance table |
| Roadmap | Planned features + good-to-have list |
| Completed Features | What shipped in v1.0–v1.7.0 |
| Issues Tracker | P0–P3 issues resolved in v1.7.0 |
| Changelog | Version history |
Examples
Runnable TypeScript examples in examples/:
| Example | What it shows |
|---------|---------------|
| 01-basic-parse | DOM navigation, XPath, serialization |
| 02-xsd-validation | Schema validation, error handling |
| 03-streaming-parser | SAX push/pull API, file streaming |
| 04-xml-to-json | xmlToJson, jsonToXml, round-trip |
| 05-xslt-transform | transformXml, sort, filter, AVTs |
| 06-cli-validation | All CLI flags + programmatic equivalents |
| 07-sax-instrumentation | SaxInstrumentation, path tracking, error recovery |
| 08-schema-preflight | validateSchema, checkSchema, filtering issues |
| 09-dtd-entities | DTD entity expansion, expandDtdEntities option |
| 10-xml-id-base | xml:id uniqueness, xml:base propagation |
| 11-xs-assert | AssertionEvaluator, xs:assert in XSD |
| 12-schema-cache-hash | Content-hash keying, invalidateByContent |
npx ts-node examples/01-basic-parse/index.ts
Roadmap
| Version | Focus | Status |
|----------|-------|--------|
| v1.2 | XML↔JSON, XSLT-lite, Schema Cache, Batch Validation | ✅ Released |
| v1.3 | Code generation (TS + JSON Schema), async loader, --watch, --code-frame | ✅ Released |
| v1.4 | Namespace Engine, Validation Pipeline, Schema Compiler, compiled XPath, xml-format, strict/lax/recover/profile | ✅ Released |
| v1.5 | Browser/Deno/Bun, SAX Instrumentation, Schema Preflight, Identity Constraints, xs:assert, XPath 2.0, SHA-256, DTD, xml:id/xml:base | ✅ Released |
| v1.6 | Source-mapped errors, XML diff, Schema inference, Encoding, Fragment validation | ✅ Released |
| v1.7 | PSVI, Canonical XML C14N, xs:redefine, Mixed content fix, SchemaMerger, XPath cache control, streaming codegen, 15 performance fixes, 1 critical bug fix | ✅ Released (2026-04-20) |
| v1.7 | Streaming validation (SAX→validator, no DOM required), DFA content-model | 🔜 Planned |
| v2.0 | Streaming-first architecture, full DFA compilation, full XSD 1.0 compliance | 🔜 Planned |
→ Full roadmap — including good-to-have features and community requests
Contributing
Contributions are welcome!
All PRs run through CI (build + 2276 tests + ESLint + TypeScript strict check).
Current coverage: stmts 87.11% · branches 76.28% · functions 77.42% · lines 89.33%
Run npm run test:coverage to see the full per-file breakdown.
License
MIT © 2024–2026 xml-xsd-engine contributors
