@lexbuild/cli
v1.10.0
Compiler for legal and civic texts. Converts disparate statutory data into structured formats optimized for AI, RAG, and semantic search.
@lexbuild/cli
Download and convert U.S. legal XML into structured Markdown optimized for AI, RAG pipelines, and semantic search. Supports the U.S. Code (54 titles, 60,000+ sections) and the eCFR (50 titles, 200,000+ sections).
Install
# Global install
npm install -g @lexbuild/cli
# Or run directly
npx @lexbuild/cli --help
Quick Start
# U.S. Code — download and convert all 54 titles
lexbuild download-usc --all
lexbuild convert-usc --all
# eCFR — download and convert all 50 titles
lexbuild download-ecfr --all
lexbuild convert-ecfr --all
# Start small — a single title
lexbuild download-usc --titles 1 && lexbuild convert-usc --titles 1
lexbuild download-ecfr --titles 17 && lexbuild convert-ecfr --titles 17
Commands
download-usc
Download U.S. Code XML from the OLRC. Auto-detects the latest release point.
lexbuild download-usc --all # All 54 titles
lexbuild download-usc --titles 1-5,8,11 # Specific titles
lexbuild download-usc --all --release-point 119-73not60 # Pin a release
| Option | Default | Description |
|--------|---------|-------------|
| --titles <spec> | — | Title(s): 1, 1-5, 1-5,8,11 |
| --all | — | Download all 54 titles (single bulk zip) |
| -o, --output <dir> | ./downloads/usc/xml | Output directory |
| --release-point <id> | auto-detected | Pin a specific OLRC release point |
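The `--titles` spec combines single numbers and inclusive ranges separated by commas. As a rough illustration of how such a spec expands, here is a minimal sketch. `expandTitleSpec` is a hypothetical helper, not part of the `@lexbuild/cli` API, and the CLI's own parser may treat whitespace, duplicates, or out-of-range values differently.

```typescript
// Hypothetical helper: expands a spec such as "1-5,8,11" into title numbers.
// Illustration only; this is not the parser used by @lexbuild/cli.
function expandTitleSpec(spec: string): number[] {
  const titles = new Set<number>();
  for (const part of spec.split(",")) {
    const token = part.trim();
    // Accept either a single number ("8") or an inclusive range ("1-5").
    const match = /^(\d+)(?:-(\d+))?$/.exec(token);
    if (!match) {
      throw new Error(`Invalid title spec segment: "${token}"`);
    }
    const start = Number(match[1]);
    const end = match[2] !== undefined ? Number(match[2]) : start;
    if (start > end) {
      throw new Error(`Range start exceeds end: "${token}"`);
    }
    for (let n = start; n <= end; n++) {
      titles.add(n);
    }
  }
  // Deduplicated and sorted ascending.
  return Array.from(titles).sort((a, b) => a - b);
}

console.log(expandTitleSpec("1-5,8,11").join(", ")); // 1, 2, 3, 4, 5, 8, 11
```

A wrapper script could use a helper like this to decide which titles to process before invoking `lexbuild download-usc --titles <spec>`.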
convert-usc
Convert downloaded USC XML to Markdown.
lexbuild convert-usc --all # All downloaded titles
lexbuild convert-usc --titles 1 -g chapter # Chapter-level output
lexbuild convert-usc --titles 26 --dry-run # Preview without writing
lexbuild convert-usc ./downloads/usc/xml/usc01.xml # Direct file path
| Option | Default | Description |
|--------|---------|-------------|
| --titles <spec> | — | Title(s) to convert |
| --all | — | Convert all titles in input directory |
| -i, --input-dir <dir> | ./downloads/usc/xml | Input XML directory |
| -o, --output <dir> | ./output | Output directory |
| -g, --granularity | section | section, chapter, or title |
| --link-style | plaintext | plaintext, canonical, or relative |
| --no-include-source-credits | — | Exclude source credits |
| --no-include-notes | — | Exclude all notes |
| --include-editorial-notes | — | Include editorial notes only |
| --include-statutory-notes | — | Include statutory notes only |
| --include-amendments | — | Include amendment notes only |
| --dry-run | — | Parse and report without writing |
| -v, --verbose | — | Verbose file output |
download-ecfr
Download eCFR XML. Defaults to the ecfr.gov API (daily-updated); govinfo bulk data available as fallback.
lexbuild download-ecfr --all # All 50 titles (eCFR API)
lexbuild download-ecfr --titles 1-5,17 # Specific titles
lexbuild download-ecfr --all --date 2026-01-01 # Point-in-time download
lexbuild download-ecfr --all --source govinfo # Govinfo bulk fallback
| Option | Default | Description |
|--------|---------|-------------|
| --titles <spec> | — | Title(s): 1, 1-5, 1-5,17 |
| --all | — | Download all 50 titles |
| -o, --output <dir> | ./downloads/ecfr/xml | Output directory |
| --source | ecfr-api | ecfr-api (daily) or govinfo (bulk) |
| --date <YYYY-MM-DD> | current | Point-in-time date (ecfr-api only) |
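Because `--date` expects a `YYYY-MM-DD` value, a wrapper script may want to check the argument before calling the CLI. The following is a minimal sketch under that assumption. `isValidDateArg` is a hypothetical helper and says nothing about how `lexbuild` itself validates the option or which dates the eCFR API can serve.

```typescript
// Hypothetical pre-flight check for a --date value (YYYY-MM-DD).
// Not part of @lexbuild/cli; it only verifies the shape and that the
// calendar date exists.
function isValidDateArg(value: string): boolean {
  const match = /^(\d{4})-(\d{2})-(\d{2})$/.exec(value);
  if (!match) {
    return false;
  }
  const year = Number(match[1]);
  const month = Number(match[2]);
  const day = Number(match[3]);
  // Round-trip through a UTC date: impossible dates such as 2026-02-30
  // roll over into another month and fail the comparison below.
  const date = new Date(Date.UTC(year, month - 1, day));
  return (
    date.getUTCFullYear() === year &&
    date.getUTCMonth() === month - 1 &&
    date.getUTCDate() === day
  );
}

console.log(isValidDateArg("2026-01-01")); // true
console.log(isValidDateArg("2026-02-30")); // false
```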
convert-ecfr
Convert downloaded eCFR XML to Markdown.
lexbuild convert-ecfr --all # All downloaded titles
lexbuild convert-ecfr --titles 17 -g part # Part-level output
lexbuild convert-ecfr --all --dry-run # Preview without writing
lexbuild convert-ecfr ./downloads/ecfr/xml/ECFR-title17.xml # Direct file path
| Option | Default | Description |
|--------|---------|-------------|
| --titles <spec> | — | Title(s) to convert |
| --all | — | Convert all titles in input directory |
| -i, --input-dir <dir> | ./downloads/ecfr/xml | Input XML directory |
| -o, --output <dir> | ./output | Output directory |
| -g, --granularity | section | section, part, chapter, or title |
| --link-style | plaintext | plaintext, canonical, or relative |
| --no-include-source-credits | — | Exclude source credits |
| --no-include-notes | — | Exclude all notes |
| --include-editorial-notes | — | Include editorial/regulatory notes only |
| --include-statutory-notes | — | Include statutory notes only |
| --include-amendments | — | Include amendment notes only |
| --dry-run | — | Parse and report without writing |
| -v, --verbose | — | Verbose file output |
Output Structure
U.S. Code
| Granularity | Example Path |
|---|---|
| section (default) | output/usc/title-01/chapter-01/section-1.md |
| chapter | output/usc/title-01/chapter-01/chapter-01.md |
| title | output/usc/title-01.md |
eCFR
| Granularity | Example Path |
|---|---|
| section (default) | output/ecfr/title-17/chapter-IV/part-240/section-240.10b-5.md |
| part | output/ecfr/title-17/chapter-IV/part-240.md |
| chapter | output/ecfr/title-17/chapter-IV/chapter-IV.md |
| title | output/ecfr/title-17.md |
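Downstream tooling often needs to locate a converted section on disk. The sketch below rebuilds the section-level example paths from the two tables. Both helpers are hypothetical, and the two-digit zero-padding of title and chapter numbers is inferred from those examples rather than from the converter's source, so verify it against real output before relying on it.

```typescript
// Hypothetical helpers that mirror the section-level example paths above.
// The padding rule is an assumption inferred from the documented examples.
function pad2(value: number): string {
  return value < 10 ? `0${value}` : String(value);
}

function uscSectionPath(
  outputDir: string,
  title: number,
  chapter: number,
  section: string,
): string {
  return `${outputDir}/usc/title-${pad2(title)}/chapter-${pad2(chapter)}/section-${section}.md`;
}

function ecfrSectionPath(
  outputDir: string,
  title: number,
  chapter: string, // eCFR chapters use Roman numerals in the examples, e.g. "IV"
  part: number,
  section: string,
): string {
  return `${outputDir}/ecfr/title-${pad2(title)}/chapter-${chapter}/part-${part}/section-${section}.md`;
}

console.log(uscSectionPath("output", 1, 1, "1"));
// output/usc/title-01/chapter-01/section-1.md
console.log(ecfrSectionPath("output", 17, "IV", 240, "240.10b-5"));
// output/ecfr/title-17/chapter-IV/part-240/section-240.10b-5.md
```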
Every file includes YAML frontmatter with source metadata (source, legal_status, identifier, hierarchy context) followed by the legal text in Markdown. Section and chapter/part granularities generate _meta.json sidecar files and README.md summaries per title.
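When feeding these files into a RAG or indexing pipeline, it is common to separate the frontmatter from the legal text. Here is a minimal sketch that assumes the frontmatter is delimited by `---` lines, as is conventional for YAML frontmatter. `splitFrontmatter` is a hypothetical helper, and the sample values are placeholders rather than real LexBuild output.

```typescript
interface SplitDocument {
  frontmatter: string; // raw YAML text between the --- delimiters
  body: string; // Markdown that follows the closing delimiter
}

// Hypothetical helper: splits a Markdown file into raw frontmatter and body.
// It does not parse the YAML; pass `frontmatter` to a YAML parser for that.
function splitFrontmatter(text: string): SplitDocument {
  const lines = text.split(/\r?\n/);
  if (lines[0] !== "---") {
    return { frontmatter: "", body: text };
  }
  const close = lines.indexOf("---", 1);
  if (close === -1) {
    // No closing delimiter: treat the whole file as body.
    return { frontmatter: "", body: text };
  }
  return {
    frontmatter: lines.slice(1, close).join("\n"),
    body: lines.slice(close + 1).join("\n"),
  };
}

// Placeholder content; real files carry the fields described above.
const sample = ["---", "source: example", "identifier: example-id", "---", "Body text."].join("\n");
const parts = splitFrontmatter(sample);
console.log(parts.frontmatter.split("\n").length); // 2
console.log(parts.body); // Body text.
```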
Performance
The full U.S. Code — all 54 titles, 60,000+ sections, ~85 million estimated tokens — converts in about 20–30 seconds on modern hardware. SAX streaming keeps memory bounded for even the largest titles (100MB+ XML).
Compatibility
- Node.js >= 22
- ESM only — no CommonJS build
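Scripts that wrap the CLI can fail early when the runtime is too old. The following sketch checks a version string against the minimum major version stated above. `meetsMinimumNode` is a hypothetical helper, not something the package exports.

```typescript
// Hypothetical guard for the documented Node.js >= 22 requirement.
// Accepts version strings with or without a leading "v".
function meetsMinimumNode(version: string, minMajor: number = 22): boolean {
  const match = /^v?(\d+)\./.exec(version);
  if (!match) {
    return false;
  }
  return Number(match[1]) >= minMajor;
}

console.log(meetsMinimumNode("22.3.0")); // true
console.log(meetsMinimumNode("v20.11.1")); // false
```

In a real script, the argument would typically be `process.versions.node`.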
Monorepo Context
This is the published CLI for the LexBuild monorepo. It depends on @lexbuild/core, @lexbuild/usc, and @lexbuild/ecfr for all conversion and download logic.
pnpm turbo build --filter=@lexbuild/cli
pnpm turbo typecheck --filter=@lexbuild/cli
Related Packages
| Package | Description |
|---------|-------------|
| @lexbuild/core | Shared parsing, AST, and rendering infrastructure |
| @lexbuild/usc | U.S. Code converter — programmatic API |
| @lexbuild/ecfr | eCFR converter — programmatic API |
