docx-to-builder
v1.0.0
Parse any .docx template and generate a ready-to-run JavaScript builder that reproduces it exactly using the docx npm package.
docx-to-builder
Parse any .docx template and generate a ready-to-run JavaScript builder that reproduces it exactly.
You have a branded Word template. You want to generate documents from it programmatically — with AI-written content, dynamic data, or automation pipelines. But Pandoc can't faithfully reproduce your layout, and docxtemplater requires manually adding {placeholders} to every field.
docx-to-builder takes a different approach: it reads your .docx template's raw XML, understands every formatting decision, and writes a JavaScript file that recreates that exact layout using the docx npm package.
How it works
your-template.docx → [docx-to-builder] → your-template-builder.js
↓
node your-template-builder.js
↓
branded-output.docx ✓
- Parse: reads the .docx XML directly (no third-party Python libraries needed)
- Extract: captures colors, fonts, spacing, borders, tables, images, headers, footers
- Infer: detects [bracketed placeholders] and maps them to data.fieldName references
- Generate: writes a complete ES module with buildDocument(data, outputPath)
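The Parse step can be pictured with a short sketch. This is not the tool's actual code. It assumes only that a .docx file is a ZIP archive whose main body lives in word/document.xml, and it uses the Python standard library to pull out paragraph text. The helper names read_paragraphs and make_minimal_docx are hypothetical.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used by word/document.xml
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def read_paragraphs(docx_file):
    """Return the text of each paragraph in a .docx (path or file-like object)."""
    with zipfile.ZipFile(docx_file) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for p in root.iter(f"{{{W_NS}}}p"):
        # A paragraph's text is split across runs; join every w:t inside it.
        text = "".join(t.text or "" for t in p.iter(f"{{{W_NS}}}t"))
        paragraphs.append(text)
    return paragraphs


def make_minimal_docx(lines):
    """Build a tiny in-memory archive for demonstration (hypothetical helper).

    Only word/document.xml is written, so Word itself would not open it.
    """
    body = "".join(f"<w:p><w:r><w:t>{line}</w:t></w:r></w:p>" for line in lines)
    xml = f'<w:document xmlns:w="{W_NS}"><w:body>{body}</w:body></w:document>'
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/document.xml", xml)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    sample = make_minimal_docx(["Prepared for [Client Name]", "Overview"])
    print(read_paragraphs(sample))
```

A real template also carries styles, numbering, headers, footers, and relationships in other archive members, so an actual parser would need to read more than this one file.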
Quick start
Requirements: Python 3.8+ · Node.js · docx npm package
# 1. Generate a builder from your template
python3 docx-to-builder.py my-template.docx
# 2. Install docx in your project (if you haven't already)
npm install docx
# 3. Run the builder
node my-template-builder.js
# Output: my-template-output.docx
Or install via npm and run with npx:
npx docx-to-builder my-template.docx
Template conventions
Your template can use any Word formatting. To make fields dynamic, use bracketed placeholders:
| In your template | Generated JS |
|---|---|
| [Client Name] | data.clientName |
| [Proposal Title] | data.title |
| [Month Day, Year] | data.date |
| [Draft / Final] | data.status |
| [Your custom field] | data.yourCustomField |
Any text not in brackets is treated as a static label and kept as-is.
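The mapping in the table can be approximated with a small sketch. It finds bracketed text with a regular expression and converts the label to camelCase. The shorter names in the table (title, date, status) are reproduced here through a lookup copied from that table; how the real generator derives them is not documented, so ALIASES, to_camel_case, field_name, and find_fields are all illustrative assumptions.

```python
import re

# Aliases copied from the table above. How the real generator derives them
# is not documented, so this lookup is an assumption for illustration.
ALIASES = {
    "Proposal Title": "title",
    "Month Day, Year": "date",
    "Draft / Final": "status",
}

# Matches [some label] without nested brackets.
PLACEHOLDER = re.compile(r"\[([^\[\]]+)\]")


def to_camel_case(label):
    """Convert a label such as 'Your custom field' to 'yourCustomField'."""
    words = re.findall(r"[A-Za-z0-9]+", label)
    if not words:
        return ""
    first, rest = words[0].lower(), words[1:]
    return first + "".join(w[:1].upper() + w[1:].lower() for w in rest)


def field_name(label):
    """Prefer a known alias, otherwise fall back to camelCase."""
    return ALIASES.get(label.strip()) or to_camel_case(label)


def find_fields(text):
    """Return (placeholder label, data field) pairs in order of appearance."""
    return [(m.group(1), field_name(m.group(1))) for m in PLACEHOLDER.finditer(text)]


if __name__ == "__main__":
    print(find_fields("Prepared for [Client Name] on [Month Day, Year]"))
```

Text with no brackets yields an empty list, which matches the rule that unbracketed text is kept as a static label.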
What gets extracted
| Template element | Extracted |
|---|---|
| Body paragraphs | Text, font, size, color, bold, italic, all-caps, spacing, alignment |
| Section headings | With red rule lines, borders, spacing |
| Bullet lists | Level, indent, font |
| Tables | Cell widths, shading, borders, margins, content |
| Inline images | Size (EMU → pt), relationship to file, auto-copied to assets/ |
| Headers & footers | Text, tab stops, border rules, page number fields |
| Page margins | Per-section |
| Brand colors | Auto-named constants (COLOR.accent, COLOR.body, etc.) |
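The "EMU → pt" note in the table refers to unit conversion. WordprocessingML stores measurements in several units: image extents in English Metric Units (914,400 per inch), font sizes in half-points, and most spacing and margins in twentieths of a point (twips). A minimal sketch of those conversions follows; the function names are hypothetical and the tool's own helpers may differ.

```python
EMU_PER_INCH = 914_400
POINTS_PER_INCH = 72
EMU_PER_POINT = EMU_PER_INCH // POINTS_PER_INCH  # 12700
TWIPS_PER_POINT = 20


def emu_to_pt(emu):
    """Image extents (wp:extent cx/cy) are stored in EMU."""
    return emu / EMU_PER_POINT


def half_points_to_pt(half_points):
    """Font sizes (w:sz) are stored in half-points."""
    return half_points / 2


def twips_to_pt(twips):
    """Spacing and page margins are stored in twentieths of a point."""
    return twips / TWIPS_PER_POINT


if __name__ == "__main__":
    print(emu_to_pt(1_828_800))   # a 2-inch-wide image
    print(half_points_to_pt(22))  # w:sz="22"
    print(twips_to_pt(1440))      # a 1-inch margin
```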
Generated file structure
// Brand colors extracted from template
const COLOR = { accent: '2C4A6E', body: '444444', ... };
const FONT = 'Calibri';
const LOGO_PATH = join(__dirname, 'assets/logo.png');
// Header and footer — exact replica of template
function buildHeader(data) { ... }
function buildFooter() { ... }
// Document body — all layout hardcoded, bracketed fields wired to data object
function buildContent(data) { ... }
// Main entry point — call this from your code or pipeline
export async function buildDocument(data, outputPath) { ... }
// Data object — auto-populated from inferred placeholders
const data = {
title: 'Proposal Title',
clientName: 'Client Name',
date: '...',
};
Wiring in your own content
After generation, the data object at the bottom of the file maps directly to the bracketed fields the generator found. Replace the example values with your real data — or import buildDocument and pass a data object from your own code:
import { buildDocument } from './my-template-builder.js';
await buildDocument({
title: 'Cloud Migration Proposal',
clientName: 'Acme Corp',
date: 'April 1, 2026',
status: 'Final',
}, 'output/acme-proposal.docx');
Example
The examples/ folder contains:
- sample-template.docx: a generic proposal template (Meridian Consulting)
- sample-builder.js: the builder generated from that template
Run the example:
cd examples
npm install docx
node sample-builder.js
# → output/sample-output.docx
Limitations
- Inline images: fully extracted. The image file is automatically copied to assets/ next to the generated builder, so no manual step is needed.
- Complex graphics: SmartArt, shapes, charts, and WordArt use a different XML format and are skipped; the surrounding paragraph text is preserved.
- Multi-section documents: section breaks are noted as comments; the generated builder uses a single section (the last one's margins). Multi-section support is on the roadmap.
- Dynamic content: the generator wires up bracketed placeholders automatically, but long static template paragraphs become literal strings. Replace them with data.fieldName references for full dynamic control.
- Table of contents: TOC fields are not regenerated.
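Some of these limitations can be checked before running the generator. The sketch below is a hypothetical pre-flight helper, not part of docx-to-builder. It scans the contents of word/document.xml for more than one section, a TOC field, and elements from the chart, SmartArt (diagram), and shape namespaces. The name preflight and the warning wording are assumptions.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Namespaces of drawing content that is not a plain inline picture.
GRAPHIC_NAMESPACES = {
    "http://schemas.openxmlformats.org/drawingml/2006/chart": "chart",
    "http://schemas.openxmlformats.org/drawingml/2006/diagram": "SmartArt",
    "http://schemas.microsoft.com/office/word/2010/wordprocessingShape": "shape",
}


def preflight(document_xml):
    """Return warnings for features listed under Limitations (illustrative)."""
    root = ET.fromstring(document_xml)
    warnings = []

    # Each section, including the last one, has one w:sectPr element.
    sections = sum(1 for _ in root.iter(f"{{{W}}}sectPr"))
    if sections > 1:
        warnings.append(f"{sections} sections found; only one will be generated")

    # A table of contents is a complex field whose instruction starts with TOC.
    for instr in root.iter(f"{{{W}}}instrText"):
        if (instr.text or "").strip().upper().startswith("TOC"):
            warnings.append("table of contents field found; it is not regenerated")
            break

    kinds = set()
    for element in root.iter():
        tag = element.tag
        if isinstance(tag, str) and tag.startswith("{"):
            namespace = tag[1:].split("}", 1)[0]
            if namespace in GRAPHIC_NAMESPACES:
                kinds.add(GRAPHIC_NAMESPACES[namespace])
    for kind in sorted(kinds):
        warnings.append(f"{kind} content found; it will be skipped")
    return warnings
```

To run it on a real template, pass the bytes returned by zipfile.ZipFile(path).read("word/document.xml"). WordArt and legacy VML shapes are not covered by this sketch.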
CLI options
python3 docx-to-builder.py <template.docx> [--output <builder.js>]
# Examples:
python3 docx-to-builder.py proposal.docx
python3 docx-to-builder.py proposal.docx --output src/builders/proposal-builder.js
Contributing
Issues and PRs welcome. If your template produces unexpected output, open an issue and attach the template (or a sanitized version of it).
License
MIT — see LICENSE
