docx-pdf-converter
v1.2.2
Published
A Node.js tool to convert DOCX files to PDF using Mammoth for accurate Word document parsing and Puppeteer for high-quality PDF generation. Ideal for document automation and batch processing.
Downloads
317
Maintainers
Readme
DOCX to PDF Converter
Convert DOCX files to PDF using Mammoth and Puppeteer with preserved formatting and layout.
Installation
npm install docx-pdf-converterBasic Usage
const fs = require('fs')
const { convertDocxToPdf } = require('docx-pdf-converter')
async function convert() {
const docxBuffer = fs.readFileSync('path/to/input.docx')
// Return as Buffer (default)
const result = await convertDocxToPdf(docxBuffer, 'input.docx')
fs.writeFileSync(result.filename, result.buffer)
// Return as File
const fileResult = await convertDocxToPdf(docxBuffer, 'input.docx', { returnType: 'file' })
console.log(`File saved as ${fileResult.filename}`)
// Return as Base64
const base64Result = await convertDocxToPdf(docxBuffer, 'input.docx', { returnType: 'base64' })
console.log(`Base64 string: ${base64Result.base64.slice(0, 30)}...`)
}
convert()Advanced Usage with Formatting Options
The converter now supports advanced formatting options to preserve page layout, size, margins, and more:
const fs = require('fs')
const { convertDocxToPdf, extractDocumentMetadata } = require('docx-pdf-converter')
async function convertWithFormatting() {
const docxBuffer = fs.readFileSync('path/to/input.docx')
// Optional: Analyze document to determine best configuration
const metadata = await extractDocumentMetadata(docxBuffer)
console.log('Document analysis:', metadata)
// Basic options (original parameters)
const options = {
returnType: 'file',
outputDir: './output'
}
// Formatting options (new parameter)
const formatOptions = {
pageConfig: {
format: 'A4', // 'A4', 'Letter', 'Legal', etc.
margin: {
top: '1in',
right: '1in',
bottom: '1in',
left: '1in'
}
},
preserveHeaders: true // Attempts to preserve headers and footers
}
// Convert with formatting options
const result = await convertDocxToPdf(
docxBuffer,
'input.docx',
options,
formatOptions
)
console.log(`File saved as ${result.filename}`)
}
convertWithFormatting()Smart Conversion (Automatic Document Analysis)
For the best results with minimal configuration, use the smart converter:
const fs = require('fs')
const { smartConvertDocxToPdf } = require('docx-pdf-converter')
async function smartConvert() {
const docxBuffer = fs.readFileSync('path/to/input.docx')
// Smart converter automatically analyzes the document
// and applies appropriate formatting
const result = await smartConvertDocxToPdf(
docxBuffer,
'input.docx',
{ returnType: 'file', outputDir: './output' }
)
console.log(`PDF created at: ${result.filename}`)
}
smartConvert()API Reference
convertDocxToPdf(fileBuffer, originalFilename, options, formatOptions)
Converts a DOCX file buffer to PDF.
- fileBuffer (Buffer): DOCX file as a Buffer
- originalFilename (string): Original filename (used for output naming)
- options (Object): Basic conversion options
- returnType (string): 'buffer' (default), 'file', or 'base64'
- outputDir (string): Output directory for saved files (default: system temp dir)
- formatOptions (Object): Advanced formatting options (optional)
- pageConfig (Object): Page configuration
- format (string): Page format ('A4', 'Letter', 'Legal', etc.)
- margin (Object): Page margins with top, right, bottom, left properties
- preserveHeaders (boolean): Whether to preserve headers and footers (default: true)
- pageConfig (Object): Page configuration
Returns an object with one or more of:
- buffer: Buffer of the PDF (if returnType is 'buffer')
- filename: Name of the output file
- base64: Base64 string of the PDF (if returnType is 'base64')
extractDocumentMetadata(fileBuffer)
Analyzes a DOCX file to extract metadata useful for formatting decisions.
- fileBuffer (Buffer): DOCX file as a Buffer
Returns an object with:
- suggestedFormat: Recommended page format based on content
- contentLength: Approximate content length
- hasHeaders: Whether the document appears to have headers
- hasFooters: Whether the document appears to have footers
- messages: Any messages from the analysis process
smartConvertDocxToPdf(fileBuffer, originalFilename, options)
Automatically analyzes a document and converts it with optimized formatting.
- fileBuffer (Buffer): DOCX file as a Buffer
- originalFilename (string): Original filename (used for output naming)
- options (Object): Basic conversion options (same as convertDocxToPdf)
Returns the same output as convertDocxToPdf.
Troubleshooting
If you encounter issues with the layout or formatting in the converted PDF:
- Try using the
smartConvertDocxToPdffunction for automatic optimization - Experiment with different page formats and margins
- If headers/footers aren't appearing correctly, try setting
preserveHeaders: false
Requirements
- Node.js 12 or higher
- This package uses Puppeteer which will download a Chrome browser during installation
License
MIT
