@packback/html-to-docx
v1.4.4
A library-agnostic service for converting HTML content to Microsoft Word DocX documents. Works in both Angular frontend applications and Node.js backend environments.
A library-agnostic service for converting HTML content with all its oddities to Microsoft Word DocX documents. Works in both browser and Node.js environments.
Key Features
- Library Agnostic: Accepts any DOM Document object, not tied to specific HTML parsers
- Node.js Compatible: Works in server environments using JSDOM
- Browser Compatible: Works in frontend applications using native DOMParser
- Comprehensive HTML Support: Handles formatting, lists, images, headers, and more
- Document Styling: Configurable fonts, sizes, and citation formats (APA, MLA, Chicago)
- Self-Contained: All dependencies are local to avoid circular imports
Installation
npm install @packback/html-to-docx
# or
yarn add @packback/html-to-docx
Local Development
For local development with hot reload, add a path mapping to your tsconfig.json:
{
"compilerOptions": {
"paths": {
"@packback/html-to-docx": ["../../../packages/html-to-docx/src/index.ts"]
}
}
}
This points directly to the TypeScript source files, so changes are reflected immediately without rebuilding. For Packback frontend developers, this path is already configured in frontend/questions-frontend/src/tsconfig.app.dev.json - just uncomment it.
Alternative (slower): If you need to test the built package output, update package.json:
"@packback/html-to-docx": "file:../packages/html-to-docx"
Note: This requires running yarn build after every change.
Usage Examples
Frontend (Browser)
import { HtmlToDocxService } from '@packback/html-to-docx';
// Convert HTML string
const docxDocument = await HtmlToDocxService.convertHtmlToDocument({
htmlContent: '<p>Hello <strong>world</strong>!</p>',
documentSettings: {
font_family: 'arial',
font_size: 12,
format_style: 'apa'
},
includeHeader: true,
includeFooter: false
});
// Convert pre-parsed document (library agnostic)
const html = '<p>Hello <strong>world</strong>!</p>'; // any HTML string
const parser = new DOMParser();
const parsedDocument = parser.parseFromString(html, 'text/html');
const docxFromDocument = await HtmlToDocxService.convertHtmlToDocument({
htmlContent: '', // Not used when document is provided
document: parsedDocument,
documentSettings: { font_family: 'open-sans', font_size: 12 }
});
// With references/bibliography page
const sources = [
{
citation: [
{ resolved: true, text: 'Smith, J.' },
{ resolved: true, text: ' (2023). ' },
{ resolved: true, text: 'Book Title', format: 'italic' },
{ resolved: true, text: '. Publisher.' }
],
citation_format: 'apa'
}
];
const docxWithSources = await HtmlToDocxService.convertHtmlToDocument({
htmlContent: '<p>Hello <strong>world</strong>!</p>',
documentSettings: { format_style: 'apa' },
sources
});
Backend (Node.js)
import { HtmlToDocxService } from '@packback/html-to-docx';
import { JSDOM } from 'jsdom';
import { Packer } from 'docx';
import fs from 'fs/promises';
// Make Node constants available globally
const jsdom = new JSDOM('');
global.Node = jsdom.window.Node;
async function convertHtml(htmlContent, outputPath) {
// Parse HTML using JSDOM
const jsdom = new JSDOM(htmlContent);
const document = jsdom.window.document;
// Convert to DocX
const docxDocument = await HtmlToDocxService.convertHtmlToDocument({
htmlContent: '',
document,
documentSettings: {
font_family: 'times-new-roman',
font_size: 11,
format_style: 'mla'
}
});
// Save to file
const buffer = await Packer.toBuffer(docxDocument);
await fs.writeFile(outputPath, buffer);
}
Document Settings
Font Families
- arial - Arial
- open-sans - Open Sans (default)
- times-new-roman - Times New Roman
Font Sizes
- 10 - 10 point
- 11 - 11 point
- 12 - 12 point (default)
Format Styles
- apa - APA formatting
- chicago - Chicago style
- mla - MLA formatting
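When settings arrive from untrusted input such as a request body, it can help to validate them against the values listed above before calling the converter. The following is a minimal sketch, not part of the package: the type names and the `normalizeSettings` helper are hypothetical, and only the option values and the defaults (`open-sans`, 12 point) come from this README.

```typescript
// Hypothetical types mirroring the documented option values; the package's
// real exported types may be named differently.
type FontFamily = 'arial' | 'open-sans' | 'times-new-roman';
type FontSize = 10 | 11 | 12;
type FormatStyle = 'apa' | 'chicago' | 'mla';

interface DocumentSettings {
  font_family?: FontFamily;
  font_size?: FontSize;
  format_style?: FormatStyle;
}

const FONT_FAMILIES: readonly string[] = ['arial', 'open-sans', 'times-new-roman'];
const FONT_SIZES: readonly number[] = [10, 11, 12];
const FORMAT_STYLES: readonly string[] = ['apa', 'chicago', 'mla'];

// Reject unsupported values and fill in the documented defaults.
// No default is documented for format_style, so it stays undefined when absent.
function normalizeSettings(input: Record<string, unknown>): DocumentSettings {
  const family = input.font_family;
  const size = input.font_size;
  const style = input.format_style;

  if (family !== undefined && !(typeof family === 'string' && FONT_FAMILIES.includes(family))) {
    throw new Error(`Unsupported font_family: ${String(family)}`);
  }
  if (size !== undefined && !(typeof size === 'number' && FONT_SIZES.includes(size))) {
    throw new Error(`Unsupported font_size: ${String(size)}`);
  }
  if (style !== undefined && !(typeof style === 'string' && FORMAT_STYLES.includes(style))) {
    throw new Error(`Unsupported format_style: ${String(style)}`);
  }

  return {
    font_family: (family as FontFamily | undefined) ?? 'open-sans',
    font_size: (size as FontSize | undefined) ?? 12,
    format_style: style as FormatStyle | undefined,
  };
}
```

The returned object can then be passed as `documentSettings`.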
References and Bibliography
The package supports automatic generation of properly formatted references/bibliography pages based on citation data. When sources are provided, a references page is automatically appended to the document with appropriate formatting for the selected citation style.
Features
- Automatic Page Break: A page break is inserted before the references section
- Style-Specific Formatting:
- APA: "References" title (bold), double-spaced entries
- MLA: "Works Cited" title, double-spaced entries
- Chicago: "Bibliography" title, single-spaced within entries, double-spaced between
- Hanging Indentation: All entries use 0.5-inch hanging indentation
- Alphabetical Sorting: Sources are automatically sorted by first author/text
- Format Preservation: Italics and other formatting from citations are preserved
- Filtering: Only resolved citation pieces are included; placeholder text is omitted
Source Data Format
Sources should be provided as an array of objects with:
- citation: Array of citation pieces (text, resolved status, optional format)
- citation_format: The citation style ('apa', 'mla', or 'chicago')
Only sources matching the document's format_style will be included in the references page.
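The selection rules described in this section (match on format style, keep only resolved pieces, sort alphabetically) can be illustrated with a short sketch. This is not the package's implementation: the `CitationPiece` and `Source` interfaces and the `selectReferences` helper are hypothetical names, and dropping entries that have no resolved pieces at all is an assumption.

```typescript
// Hypothetical shapes based on the source data format described above.
interface CitationPiece {
  resolved: boolean;
  text: string;
  format?: string; // e.g. 'italic'
}

interface Source {
  citation: CitationPiece[];
  citation_format: 'apa' | 'mla' | 'chicago';
}

// Plain-text form of an entry, built from resolved pieces only.
function entryText(source: Source): string {
  return source.citation
    .filter((piece) => piece.resolved)
    .map((piece) => piece.text)
    .join('');
}

// Keep sources whose citation_format matches the document style, drop
// unresolved pieces, skip entries left empty (an assumption), and sort
// the remainder alphabetically by their text.
function selectReferences(
  sources: Source[],
  formatStyle: Source['citation_format']
): Source[] {
  return sources
    .filter((source) => source.citation_format === formatStyle)
    .map((source) => ({
      ...source,
      citation: source.citation.filter((piece) => piece.resolved),
    }))
    .filter((source) => source.citation.length > 0)
    .sort((a, b) => entryText(a).localeCompare(entryText(b)));
}
```

Running a helper like this before conversion gives a preview of which entries to expect on the references page for a given `format_style`.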
Supported HTML Features
Text Formatting
- Bold: <strong>, <b>
- Italic: <em>, <i>
- Underline: <u>
- Subscript: <sub>
- Superscript: <sup>
Structure
- Paragraphs: <p>
- Headers: Custom Quill header classes (dd-title-header, dd-h1-header, etc.)
- Lists: <ul>, <ol> with data-list attributes
- Links: <a href=""> with proper hyperlink styling
Layout
- Alignment: .ql-align-center, .ql-align-right, .ql-align-justify
- Indentation: .ql-indent-1 through .ql-indent-9
- Line Height: .ql-line-height-1, .ql-line-height-1-5, .ql-line-height-2
- Page Breaks: .page-break class
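To make the class naming scheme concrete, the sketch below reads the layout classes listed above from a `class` attribute value. It is an illustration only and says nothing about how the package parses them internally; `LayoutHints` and `parseLayoutClasses` are hypothetical names.

```typescript
interface LayoutHints {
  alignment?: 'center' | 'right' | 'justify';
  indentLevel?: number; // 1 through 9
  lineHeight?: number; // 1, 1.5, or 2
  pageBreak: boolean;
}

// Read the supported layout classes from a class attribute value.
// Unknown or out-of-range classes are ignored.
function parseLayoutClasses(className: string): LayoutHints {
  const hints: LayoutHints = { pageBreak: false };

  for (const name of className.split(/\s+/).filter(Boolean)) {
    const align = /^ql-align-(center|right|justify)$/.exec(name);
    const indent = /^ql-indent-([1-9])$/.exec(name);
    const lineHeight = /^ql-line-height-(1|1-5|2)$/.exec(name);

    if (align) {
      hints.alignment = align[1] as LayoutHints['alignment'];
    } else if (indent) {
      hints.indentLevel = Number(indent[1]);
    } else if (lineHeight) {
      // '1-5' encodes 1.5; the hyphen stands in for the decimal point.
      hints.lineHeight = Number(lineHeight[1].replace('-', '.'));
    } else if (name === 'page-break') {
      hints.pageBreak = true;
    }
  }

  return hints;
}
```

A helper like this is handy in tests that check whether editor output uses only classes the converter supports.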
Media
- Images: <img> elements with URL support
- Alt Text: Proper fallback handling for failed image loads
Command Line Interface
The package includes a CLI tool for converting HTML files to DOCX from the command line, useful for testing and integration with other systems (e.g., PHP applications).
⚠️ Security Warning
The CLI reads files without validation. Never pass user-controlled input as file paths, as attackers could read sensitive files. Always validate and sanitize paths before use (restrict directories, validate extensions, block path traversal).
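The checks recommended above (restrict directories, validate extensions, block path traversal) can be sketched as a small helper that runs before any path reaches the CLI. This is one possible approach rather than something the package provides: `resolveSafePath` is a hypothetical name, and the sketch does not resolve symlinks, so a stricter version might also compare real paths.

```typescript
import * as path from 'node:path';

// Resolve a user-supplied path inside an allowed base directory and reject
// anything that escapes it or has an unexpected extension.
function resolveSafePath(
  baseDir: string,
  userPath: string,
  allowedExtensions: string[]
): string {
  const base = path.resolve(baseDir);
  const resolved = path.resolve(base, userPath);

  // Block path traversal: the resolved path must stay strictly inside base.
  const relative = path.relative(base, resolved);
  const escapes =
    relative === '' ||
    relative === '..' ||
    relative.startsWith('..' + path.sep) ||
    path.isAbsolute(relative);
  if (escapes) {
    throw new Error('Path escapes the allowed directory');
  }

  // Validate the extension against an allow-list such as ['.html'].
  const extension = path.extname(resolved).toLowerCase();
  if (!allowedExtensions.includes(extension)) {
    throw new Error(`Extension not allowed: ${extension || '(none)'}`);
  }

  return resolved;
}
```

For example, `resolveSafePath('/srv/uploads', name, ['.html'])` could guard the input file and a second call with `['.docx']` the output file.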
Installation
# Install globally
npm install -g @packback/html-to-docx
# Or build locally and use node directly (recommended for development)
cd packages/html-to-docx
yarn install && yarn build
Usage
# Using node directly (preserves quotes properly)
node dist/cli.js input.html output.docx
# With custom font and size
node dist/cli.js input.html output.docx --font arial --size 11
# With formatting style
node dist/cli.js input.html output.docx --style apa
# If installed globally
html-to-docx input.html output.docx --style apa
CLI Options
- --font <name> - Font family: arial, open-sans, times-new-roman (default: open-sans)
- --size <number> - Font size: 10, 11, 12 (default: 12)
- --style <name> - Format style: apa, mla, chicago
- --header-title <text> - Page header title
- --header-last-name <text> - Page header last name
- --header-page-numbers - Include page numbers in header
- --footer <text> - Footer text
- --sources <path> - Path to JSON file containing sources for references/bibliography page
- -h, --help - Show help message
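When another program drives the CLI, building the argument list in code avoids shell quoting problems with multi-word values. The sketch below assumes the package was installed globally so that `html-to-docx` is on the PATH; the `CliOptions` interface and both helper names are hypothetical, and only the flags listed above are used.

```typescript
import { execFile } from 'node:child_process';
import { promisify } from 'node:util';

// Hypothetical option bag covering the documented flags.
interface CliOptions {
  font?: string;
  size?: number;
  style?: string;
  headerTitle?: string;
  headerLastName?: string;
  headerPageNumbers?: boolean;
  footer?: string;
  sources?: string;
}

// Translate options into the documented command-line flags.
function buildCliArgs(input: string, output: string, options: CliOptions = {}): string[] {
  const args = [input, output];
  if (options.font) args.push('--font', options.font);
  if (options.size !== undefined) args.push('--size', String(options.size));
  if (options.style) args.push('--style', options.style);
  if (options.headerTitle) args.push('--header-title', options.headerTitle);
  if (options.headerLastName) args.push('--header-last-name', options.headerLastName);
  if (options.headerPageNumbers) args.push('--header-page-numbers');
  if (options.footer) args.push('--footer', options.footer);
  if (options.sources) args.push('--sources', options.sources);
  return args;
}

const execFileAsync = promisify(execFile);

// execFile passes each argument as-is without a shell, so values such as
// 'The Baroque Period' need no extra quoting.
async function convertWithCli(input: string, output: string, options: CliOptions = {}): Promise<void> {
  await execFileAsync('html-to-docx', buildCliArgs(input, output, options));
}
```

To run a locally built copy instead, the same arguments could be passed to `node` with `dist/cli.js` prepended.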
Examples
Sample HTML files are provided in the examples/ directory:
# Simple example with basic formatting
node dist/cli.js examples/simple-example.html output.docx
# Full Quill document with title page and MLA style
node dist/cli.js examples/sample-quill.html output.docx --font times-new-roman --size 12 --style mla
# With page header and numbers (APA style) - use quotes for multi-word values
node dist/cli.js examples/sample-quill.html output.docx \
--style apa \
--header-title 'The Baroque Period' \
--header-last-name Koves \
--header-page-numbers
# With custom footer
node dist/cli.js examples/simple-example.html output.docx \
--footer 'Copyright 2025 - All Rights Reserved'
# With references/bibliography page from sources.json
node dist/cli.js examples/sample-quill.html output.docx \
--style apa \
--sources examples/sources.json
The sources.json file should contain an array of sources with citation data:
[
{
"citation": [
{ "resolved": true, "text": "Smith, J." },
{ "resolved": true, "text": " (2023). " },
{ "resolved": true, "text": "Book Title", "format": "italic" },
{ "resolved": true, "text": ". Publisher." }
],
"citation_format": "apa"
}
]
With these sources and --style apa, an APA-formatted references page should appear as the last page of the output document.

Node.js Compatibility
When running in Node.js environments:
- Set global.Node = jsdom.window.Node to provide DOM constants
- Use the document parameter instead of htmlContent
- Import JSDOM for HTML parsing
Dependencies
- docx: DocX document generation
- Local utilities: Self-contained formatting and styling utilities
- DOM API: Browser DOMParser or Node.js JSDOM
Development
Setup
# Install dependencies
yarn install
# Build the package
yarn build
# Run tests
yarn test
# Run tests in watch mode
yarn test:watch
# Lint code
yarn lint
Testing
The package includes comprehensive tests that run in both browser and Node.js environments using Jest with JSDOM.
Building
The TypeScript source is compiled to CommonJS format in the dist/ directory with type definitions.
License
MIT - see LICENSE file for details.
