@lexbuild/core
v1.26.0
Core AST definitions, parsing infrastructure, and format-agnostic renderers for the LexBuild ecosystem.
@lexbuild/core
Shared infrastructure for the LexBuild legal-XML-to-Markdown pipeline. Provides streaming XML parsing, AST definitions, Markdown rendering, YAML frontmatter generation, and cross-reference link resolution used by all source packages.
Note: This is a foundational library. Most users should install @lexbuild/cli for the command-line tool, or a source package (@lexbuild/usc, @lexbuild/ecfr, @lexbuild/fr) for programmatic access.
Install
npm install @lexbuild/core
# or
pnpm add @lexbuild/core
Quick Start
import { XMLParser, ASTBuilder, renderDocument, generateFrontmatter, createLinkResolver } from "@lexbuild/core";
import { createReadStream } from "node:fs";

// 1. Parse XML via streaming SAX
const parser = new XMLParser();
const builder = new ASTBuilder({
  emitAt: "section",
  onEmit: (node, context) => {
    // 2. Each completed section is emitted here
    const frontmatter = generateFrontmatter(/* ... */);
    const resolver = createLinkResolver("relative");
    const markdown = renderDocument(node, frontmatter, {
      linkStyle: "relative",
      resolveLink: resolver.resolve,
    });
    // 3. Write markdown to file
  },
});

// Forward SAX events from the parser to the AST builder
parser.on("openElement", (name, attrs) => builder.onOpenElement(name, attrs));
parser.on("closeElement", (name) => builder.onCloseElement(name));
parser.on("text", (text) => builder.onText(text));

await parser.parseStream(createReadStream("usc01.xml"));
Multi-Level Emit
emitAt also accepts a ReadonlySet<LevelType>. Deeper levels fire first (sections before their containing title), and emitted nodes remain attached to their parents, so a higher-level emission sees the full subtree. An emitted node stays attached to its parent whenever any enclosing stack frame is itself an emit target. Do not derive this from LEVEL_TYPES index ordering, which breaks on USLM's permissive nesting (e.g. an appendix inside a part).
// Collect emitted nodes by level type
const byLevel = new Map<LevelType, LevelNode[]>();
const builder = new ASTBuilder({
  emitAt: new Set(["section", "chapter", "title"]),
  onEmit: (node, context) => {
    const bucket = byLevel.get(node.levelType) ?? [];
    bucket.push(node);
    byLevel.set(node.levelType, bucket);
  },
});
This is how the @lexbuild/usc and @lexbuild/ecfr converters produce multiple output granularities from a single parse.
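The attach-to-parent rule for multi-level emit can be illustrated with a toy stack model that is independent of the real ASTBuilder. This is a minimal sketch under stated assumptions, not the library's implementation: `Frame`, `ParseEvent`, and `simulate` are hypothetical names invented here, and the sketch assumes that an emitted node with no enclosing emit target is detached from its parent. It shows deeper levels emitting first and a node being kept on its parent whenever any enclosing frame is an emit target.

```typescript
// Toy model of multi-level emit. Every name below is hypothetical and
// is NOT part of the @lexbuild/core API.
type Frame = { level: string; children: Frame[] };
type ParseEvent = { kind: "open"; level: string } | { kind: "close" };

function simulate(events: ParseEvent[], emitAt: ReadonlySet<string>): Frame[] {
  const stack: Frame[] = [];
  const emitted: Frame[] = [];
  for (const ev of events) {
    if (ev.kind === "open") {
      stack.push({ level: ev.level, children: [] });
      continue;
    }
    const node = stack.pop();
    if (!node) throw new Error("unbalanced close event");
    const isTarget = emitAt.has(node.level);
    if (isTarget) emitted.push(node); // deeper nodes close first, so they emit first
    // Gate: look at the enclosing frames themselves, not at any level ordering.
    const enclosingTarget = stack.some((frame) => emitAt.has(frame.level));
    const parent = stack[stack.length - 1];
    // Assumption: non-target nodes always stay attached; emitted nodes stay
    // attached only when some enclosing frame will itself be emitted.
    if (parent && (!isTarget || enclosingTarget)) parent.children.push(node);
  }
  return emitted;
}

// title > chapter > (section, section)
const events: ParseEvent[] = [
  { kind: "open", level: "title" },
  { kind: "open", level: "chapter" },
  { kind: "open", level: "section" },
  { kind: "close" },
  { kind: "open", level: "section" },
  { kind: "close" },
  { kind: "close" },
  { kind: "close" },
];

const multi = simulate(events, new Set(["section", "title"]));
console.log(multi.map((n) => n.level).join(", ")); // section, section, title
```

Because the gate inspects the stack rather than comparing positions in an ordered level list, it gives the same answer for unusual nesting, such as an appendix inside a part.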
API Reference
XML Parsing
| Export | Description |
|--------|-------------|
| XMLParser | Streaming SAX parser wrapping saxes with namespace normalization. Supports USLM (namespaced) and namespace-free XML (eCFR) via the defaultNamespace option. |
AST Builder
| Export | Description |
|--------|-------------|
| ASTBuilder | Stack-based USLM XML-to-AST builder with configurable emit-at-level streaming. emitAt accepts LevelType (single) or ReadonlySet<LevelType> (multi-level emit). Handles the full USLM 1.0 element vocabulary. Source packages for other formats provide their own builders. |
Rendering
| Export | Description |
|--------|-------------|
| renderDocument() | Render a section node with frontmatter to a complete Markdown file |
| renderSection() | Render a section-level node to Markdown body text |
| renderNode() | Render any AST node to Markdown |
| generateFrontmatter() | Generate a YAML frontmatter block from FrontmatterData |
| createLinkResolver() | Create a cross-reference link resolver supporting USC, CFR, and fallback URLs |
Types
| Export | Description |
|--------|-------------|
| ASTNode | Union type for all AST nodes |
| LevelNode | Hierarchical structural node (title, chapter, section, etc.) |
| ContentNode | Text content block (content, chapeau, continuation, proviso) |
| InlineNode | Inline text formatting (bold, italic, ref, footnoteRef, etc.) |
| NoteNode | Note block (editorial, statutory, amendment, etc.) |
| TableNode | Table with headers and rows |
| SourceCreditNode | Enactment source citation |
| FrontmatterData | Full frontmatter field definitions |
| EmitContext | Context passed with emitted nodes (ancestors, document metadata) |
| SourceType | "usc" \| "ecfr" \| "fr" |
| LegalStatus | "official_legal_evidence" \| "official_prima_facie" \| "authoritative_unofficial" |
Constants
| Export | Description |
|--------|-------------|
| FORMAT_VERSION | Output format version ("1.1.0") |
| GENERATOR | Generator string for frontmatter metadata |
| LEVEL_TYPES | Ordered array of level types (title → subsubitem) |
| BIG_LEVELS | Set of structural levels above section |
| USLM_NS | USLM namespace URI |
| XHTML_NS | XHTML namespace URI |
File System Utilities
| Export | Description |
|--------|-------------|
| writeFile() | Write with ENFILE/EMFILE retry and exponential backoff |
| writeFileIfChanged() | Write only if content differs. Returns true if written, false if skipped (mtime preserved). Used by converters for incremental updates. |
| mkdir() | Recursive mkdir with retry |
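The two techniques named in the table, retrying on file-descriptor exhaustion with exponential backoff and skipping writes when content is unchanged, can be sketched with Node's built-in `node:fs/promises` API. This is an illustrative sketch only, not the package's source: `withFdRetry`, `writeIfChangedSketch`, and the attempt and delay defaults are hypothetical names and values chosen for the example.

```typescript
import { readFile, writeFile } from "node:fs/promises";

// Hypothetical helpers illustrating the techniques; NOT the @lexbuild/core source.
const RETRYABLE_CODES = new Set(["ENFILE", "EMFILE"]);

async function withFdRetry<T>(
  op: () => Promise<T>,
  maxAttempts = 5, // assumed default, chosen for illustration
  baseDelayMs = 50, // assumed default, chosen for illustration
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await op();
    } catch (err) {
      const code = (err as { code?: string }).code;
      const retryable = code !== undefined && RETRYABLE_CODES.has(code);
      if (!retryable || attempt + 1 >= maxAttempts) throw err;
      // Exponential backoff: baseDelayMs, then 2x, 4x, ...
      const delayMs = baseDelayMs * 2 ** attempt;
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
}

async function writeIfChangedSketch(path: string, content: string): Promise<boolean> {
  try {
    const existing = await withFdRetry(() => readFile(path, "utf8"));
    if (existing === content) return false; // skipped: file is not touched, so mtime is preserved
  } catch (err) {
    // A missing file just means there is nothing to compare against.
    if ((err as { code?: string }).code !== "ENOENT") throw err;
  }
  await withFdRetry(() => writeFile(path, content, "utf8"));
  return true;
}
```

Returning a boolean lets a converter count written versus skipped files, and leaving unchanged files untouched keeps their modification times stable for incremental workflows.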
Compatibility
- Node.js >= 22
- ESM only: no CommonJS build
- TypeScript: ships .d.ts type declarations
- Zero browser dependencies: Node.js runtime only
Monorepo Context
This package is part of the LexBuild monorepo, managed with Turborepo and pnpm workspaces. All packages use changesets for lockstep versioning.
packages/
├── core/ ← you are here
├── usc/ # depends on core
├── ecfr/ # depends on core
├── fr/ # depends on core
└── cli/ # depends on core, usc, ecfr, fr
Development
pnpm turbo build --filter=@lexbuild/core # Build
pnpm turbo test --filter=@lexbuild/core # Run tests
pnpm turbo typecheck --filter=@lexbuild/core
pnpm turbo lint --filter=@lexbuild/core
Related Packages
| Package | Description |
|---------|-------------|
| @lexbuild/cli | CLI tool for downloading and converting legal XML |
| @lexbuild/usc | U.S. Code (USLM XML) converter and downloader |
| @lexbuild/ecfr | eCFR (Code of Federal Regulations) converter and downloader |
| @lexbuild/fr | Federal Register converter and downloader |
