@zola_do/document-manipulator
v0.2.9
PDF and DOCX merge, conversion, and template population for NestJS.
Overview
@zola_do/document-manipulator provides:
- PDF Merging — Combine multiple PDFs into one
- DOCX Merging — Combine multiple Word documents
- Document Conversion — Convert between formats (DOCX ↔ PDF ↔ HTML)
- Template Population — Fill DOCX templates with data
- MinIO Integration — Store and retrieve documents
Installation
# Install individually
npm install @zola_do/document-manipulator
# Or via meta package
npm install @zola_do/nestjs-shared
Dependencies
npm install @zola_do/minio
npm install docx-templates pdf-lib @scholarcy/docx-merger
npm install axios form-data
npm install libreoffice-convert
Note: For document conversion, LibreOffice must be installed on the system.
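Because conversion depends on a system-level LibreOffice install, it can help to check for the binary once at startup rather than discovering the problem on the first conversion. The following is a minimal sketch, not part of this package: `findOnPath` is a hypothetical helper, and the candidate binary names are assumptions that may need adjusting for your platform.

```typescript
import { existsSync } from 'node:fs';
import { delimiter, join } from 'node:path';

/**
 * Returns the full path of the first candidate binary found in the given
 * PATH string, or null if none of them exists in any PATH directory.
 * Hypothetical helper; not exported by @zola_do/document-manipulator.
 */
function findOnPath(
  candidates: string[],
  pathEnv: string = process.env.PATH ?? '',
): string | null {
  const dirs = pathEnv.split(delimiter).filter((dir) => dir.length > 0);
  for (const dir of dirs) {
    for (const name of candidates) {
      const fullPath = join(dir, name);
      if (existsSync(fullPath)) {
        return fullPath;
      }
    }
  }
  return null;
}

// Example: warn early if LibreOffice does not appear to be installed.
// The binary names below are assumptions about common installs.
const sofficePath = findOnPath(['soffice', 'libreoffice', 'soffice.exe']);
if (sofficePath === null) {
  console.warn('LibreOffice not found on PATH; document conversion will fail.');
}
```

Passing the PATH string as a parameter keeps the helper easy to test and lets you point it at a custom install location.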
Operations notes
- LibreOffice / soffice: Runs as a subprocess during conversion. In production, enforce timeouts on conversion calls in your app, bound temporary directories, and avoid running converters as root. On Linux, consider systemd MemoryMax/CPUQuota for the worker process, or dedicated job containers.
- Cloud conversion: For serverless or locked-down containers without LibreOffice, route conversions through an external API by implementing a thin wrapper in your app (keep secrets and quotas in your infrastructure, not in this library).
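The timeout advice above can be sketched with a small promise wrapper. This is an illustration under assumptions, not something the library provides: `withTimeout` is a hypothetical helper, and it only stops waiting for the result; it does not terminate the LibreOffice subprocess, so process-level limits are still needed.

```typescript
/**
 * Rejects if `work` has not settled within `ms` milliseconds.
 * Hypothetical helper: it abandons the wait but does not kill any
 * subprocess that `work` may have started.
 */
function withTimeout<T>(
  work: Promise<T>,
  ms: number,
  label = 'operation',
): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${ms} ms`)),
      ms,
    );
  });
  // Whichever settles first wins; the timer is always cleared afterwards.
  return Promise.race([work, timeout]).finally(() => {
    if (timer !== undefined) {
      clearTimeout(timer);
    }
  });
}

// Usage at a hypothetical call site inside an async method:
// const pdf = await withTimeout(
//   this.docService.convertDocument(docxBuffer, '.pdf'),
//   30_000,
//   'DOCX to PDF conversion',
// );
```

The 30-second figure in the usage comment is only a placeholder; a sensible limit depends on document size and host capacity.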
Quick Start
1. Register Modules
import { Module } from '@nestjs/common';
import { DocumentManipulatorModule } from '@zola_do/document-manipulator';
import { MinIoModule } from '@zola_do/minio';

@Module({
  imports: [MinIoModule, DocumentManipulatorModule],
})
export class AppModule {}
2. Use Service
import { Injectable } from '@nestjs/common';
import { DocumentManipulatorService } from '@zola_do/document-manipulator';

@Injectable()
export class ReportService {
  // Note: this.minioService and this.getInvoicePdf are application-specific
  // and must be provided by your own code; they are not shown here.
  constructor(private readonly docService: DocumentManipulatorService) {}

  async mergeInvoices(invoiceIds: string[]) {
    // Get PDF buffers
    const pdfBuffers = await Promise.all(
      invoiceIds.map((id) => this.getInvoicePdf(id)),
    );
    // Merge PDFs
    const mergedBuffer = await this.docService.mergePdf(pdfBuffers);
    // Upload to MinIO
    return await this.minioService.uploadBuffer(
      mergedBuffer,
      `merged-invoice-${Date.now()}.pdf`,
      'application/pdf',
      'invoices',
    );
  }
}
PDF Operations
Merge PDFs
Combine multiple PDF buffers into one:
async mergePdfs(pdfBuffers: Buffer[]) {
  const merged = await this.docService.mergePdf(pdfBuffers);
  return merged; // Single PDF Buffer
}
Example: Merge invoice PDFs
async generateMergedInvoice(orderId: string): Promise<Buffer> {
  const order = await this.orderService.findOne(orderId);
  // Get each item's PDF
  const itemPdfs = await Promise.all(
    order.items.map((item) => this.getItemPdf(item.id)),
  );
  // Add cover page
  const coverPage = await this.generateCoverPage(order);
  const allPdfs = [coverPage, ...itemPdfs];
  return this.docService.mergePdf(allPdfs);
}
Convert Document to PDF
Convert DOCX to PDF (requires LibreOffice):
async convertToPdf(docxBuffer: Buffer): Promise<Buffer> {
  const pdfBuffer = await this.docService.convertDocument(docxBuffer, '.pdf');
  return pdfBuffer;
}
DOCX Operations
Merge DOCX Files
Combine multiple Word documents:
async mergeDocx(docxBuffers: Buffer[]) {
  const merged = await this.docService.mergeDocx(docxBuffers);
  return merged; // Single DOCX Buffer
}
Example: Merge contract sections
async generateContract(contractId: string): Promise<Buffer> {
  const contract = await this.contractService.findOne(contractId);
  const sections = [
    await this.loadTemplate('contracts/header.docx'),
    await this.loadTemplate(`contracts/${contract.type}/body.docx`),
    await this.loadTemplate('contracts/terms.docx'),
    await this.loadTemplate('contracts/signatures.docx'),
  ];
  return this.docService.mergeDocx(sections);
}
Convert DOCX to HTML
async convertToHtml(docxBuffer: Buffer): Promise<Buffer> {
  const htmlBuffer = await this.docService.convertDocument(docxBuffer, '.html');
  return htmlBuffer;
}
Convert DOCX to PDF
async convertDocxToPdf(docxBuffer: Buffer): Promise<Buffer> {
  const pdfBuffer = await this.docService.convertDocument(docxBuffer, '.pdf');
  return pdfBuffer;
}
Template Population
Fill DOCX templates with data:
async populateTemplate(
  templateBuffer: Buffer,
  data: Record<string, any>,
): Promise<Buffer> {
  const result = await this.docService.populateTemplate(templateBuffer, data);
  return result;
}
Example: Generate personalized document
async generateOfferLetter(employeeId: string): Promise<Buffer> {
  const employee = await this.employeeService.findOne(employeeId);
  const template = await this.loadTemplate('offer-letter.docx');
  return this.docService.populateTemplate(template, {
    candidateName: employee.fullName,
    position: employee.position,
    startDate: employee.startDate.toLocaleDateString(),
    salary: employee.salary.toLocaleString(),
    benefits: employee.benefits,
    companyName: 'Acme Corporation',
    hrName: 'Jane Smith',
    offerDate: new Date().toLocaleDateString(),
  });
}
Document Conversion
Supported Conversions
| Input | Output | Method      |
| ----- | ------ | ----------- |
| DOCX  | PDF    | LibreOffice |
| DOCX  | HTML   | LibreOffice |
| DOCX  | ODT    | LibreOffice |
| DOC   | DOCX   | LibreOffice |
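The table above can be mirrored in application code to reject unsupported requests before a conversion is attempted. This is a minimal sketch under assumptions: `SUPPORTED_CONVERSIONS`, `normalizeExtension`, and `isSupportedConversion` are hypothetical names that are not exported by this package, and the mapping simply restates the table rather than querying the library.

```typescript
// Restates the Supported Conversions table: input extension -> allowed outputs.
// Hypothetical constant; keep it in sync with the table by hand.
const SUPPORTED_CONVERSIONS: Record<string, readonly string[]> = {
  '.docx': ['.pdf', '.html', '.odt'],
  '.doc': ['.docx'],
};

/** Lowercases an extension and ensures it starts with a dot. */
function normalizeExtension(ext: string): string {
  const lower = ext.trim().toLowerCase();
  return lower.startsWith('.') ? lower : `.${lower}`;
}

/** Returns true when the input/output pair appears in the table. */
function isSupportedConversion(inputExt: string, outputExt: string): boolean {
  const targets = SUPPORTED_CONVERSIONS[normalizeExtension(inputExt)];
  return targets !== undefined && targets.includes(normalizeExtension(outputExt));
}

// Example: guard a conversion request coming from user input.
const requested = { from: 'DOCX', to: 'pdf' };
if (!isSupportedConversion(requested.from, requested.to)) {
  throw new Error(`Unsupported conversion: ${requested.from} to ${requested.to}`);
}
```

Normalizing the extensions first means callers can pass `pdf`, `.PDF`, or `.pdf` interchangeably.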
Using LibreOffice
Note: LibreOffice must be installed for conversion to work.
# macOS
brew install libreoffice
# Ubuntu/Debian
sudo apt install libreoffice
# Windows
# Download from https://www.libreoffice.org/download/download/
Remote Conversion Service
For serverless environments without LibreOffice:
async convertViaRemote(docxBuffer: Buffer): Promise<Buffer> {
  // Configure remote endpoint in environment
  // DOCUMENT_CONVERT_ENDPOINT=https://convert.example.com
  return await this.docService.convertDocument(docxBuffer, '.pdf');
}
FileHelperService
Utility service for file operations:
import { BadRequestException, Injectable } from '@nestjs/common';
import {
  DocumentManipulatorService,
  FileHelperService,
} from '@zola_do/document-manipulator';

@Injectable()
export class DocumentService {
  constructor(
    private readonly docService: DocumentManipulatorService,
    private readonly fileHelper: FileHelperService,
  ) {}

  async processDocument(buffer: Buffer, filename: string) {
    // Get file extension
    const ext = this.fileHelper.getFileExtension(filename);
    // Get MIME type
    const mime = this.fileHelper.getMimeType(filename);
    // Validate file type
    if (!this.fileHelper.isValidDocumentType(ext)) {
      throw new BadRequestException('Invalid document type');
    }
    // Get file size
    const size = this.fileHelper.getFileSize(buffer);
    // ... process document
  }
}
FileHelper Methods
class FileHelperService {
  getFileExtension(filename: string): string;
  getMimeType(filename: string): string;
  isValidDocumentType(extension: string): boolean;
  isValidImageType(extension: string): boolean;
  getFileSize(buffer: Buffer): number;
  sanitizeFilename(filename: string): string;
}
Document Architecture
┌───────────────────────────────────────────────────────────┐
│                Document Manipulation Flow                 │
├───────────────────────────────────────────────────────────┤
│                                                           │
│  Source Documents                                         │
│            ┌─────────┐ ┌─────────┐ ┌─────────┐            │
│            │ DOCX 1  │ │ DOCX 2  │ │ DOCX 3  │            │
│            └────┬────┘ └────┬────┘ └────┬────┘            │
│                 │           │           │                 │
│                 └───────────┼───────────┘                 │
│                             │                             │
│                             ▼                             │
│                    ┌─────────────────┐                    │
│                    │    Document     │                    │
│                    │   Manipulator   │                    │
│                    │     Service     │                    │
│                    └────────┬────────┘                    │
│                             │                             │
│               ┌─────────────┼─────────────┐               │
│               │             │             │               │
│               ▼             ▼             ▼               │
│          ┌──────────┐  ┌──────────┐  ┌──────────┐         │
│          │  merge   │  │ populate │  │ convert  │         │
│          │   PDFs   │  │ template │  │ formats  │         │
│          └────┬─────┘  └────┬─────┘  └────┬─────┘         │
│               │             │             │               │
│               ▼             ▼             ▼               │
│          ┌──────────┐  ┌──────────┐  ┌──────────┐         │
│          │  Merged  │  │  Filled  │  │  Output  │         │
│          │   PDF    │  │ Template │  │  Format  │         │
│          └──────────┘  └──────────┘  └──────────┘         │
│                                                           │
└───────────────────────────────────────────────────────────┘
Environment Variables
| Variable | Description | Required |
| --------------------------- | ----------------------------- | ----------- |
| DOCUMENT_CONVERT_ENDPOINT | Remote conversion service URL | No |
| MINIO_ENDPOINT | MinIO server endpoint | For storage |
| MINIO_PORT | MinIO port | For storage |
| MINIO_ACCESSKEY | MinIO access key | For storage |
| MINIO_SECRETKEY | MinIO secret key | For storage |
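Reading these variables in one place at startup makes misconfiguration visible early. The sketch below is an assumption about how an application might do that, not how the library reads its configuration: `loadDocumentEnv` and `DocumentEnvConfig` are hypothetical names, and treating the MinIO variables as all-or-nothing is a design choice made for this example.

```typescript
interface DocumentEnvConfig {
  convertEndpoint?: string;
  minio?: { endpoint: string; port: number; accessKey: string; secretKey: string };
}

/**
 * Validates the variables listed in the table above.
 * Hypothetical helper; the MinIO group is treated as all-or-nothing.
 */
function loadDocumentEnv(env: Record<string, string | undefined>): DocumentEnvConfig {
  const config: DocumentEnvConfig = {};

  const endpoint = env.DOCUMENT_CONVERT_ENDPOINT;
  if (endpoint) {
    new URL(endpoint); // Throws if the value is not a valid absolute URL.
    config.convertEndpoint = endpoint;
  }

  const minioKeys = ['MINIO_ENDPOINT', 'MINIO_PORT', 'MINIO_ACCESSKEY', 'MINIO_SECRETKEY'];
  const missing = minioKeys.filter((key) => !env[key]);
  if (missing.length < minioKeys.length) {
    // At least one MinIO variable is set, so require all of them.
    if (missing.length > 0) {
      throw new Error(`Missing MinIO variables: ${missing.join(', ')}`);
    }
    const port = Number(env.MINIO_PORT);
    if (!Number.isInteger(port) || port < 1 || port > 65535) {
      throw new Error(`MINIO_PORT is not a valid port: ${env.MINIO_PORT}`);
    }
    config.minio = {
      endpoint: env.MINIO_ENDPOINT as string,
      port,
      accessKey: env.MINIO_ACCESSKEY as string,
      secretKey: env.MINIO_SECRETKEY as string,
    };
  }
  return config;
}

// Example: validate the real environment once during bootstrap.
const documentEnv = loadDocumentEnv(process.env);
```

Taking the environment as a parameter, rather than reading `process.env` inside the function, keeps the validation testable with plain objects.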
API Reference
Module
DocumentManipulatorModule.forRoot(options?: DocumentManipulatorModuleOptions)
Service
class DocumentManipulatorService {
  // PDF Operations
  async mergePdf(pdfBuffers: Buffer[]): Promise<Buffer>;
  // DOCX Operations
  async mergeDocx(docxBuffers: Buffer[]): Promise<Buffer>;
  // Template Operations
  async populateTemplate(
    templateBuffer: Buffer,
    data: Record<string, any>,
  ): Promise<Buffer>;
  // Conversion
  async convertDocument(
    buffer: Buffer,
    outputExtension: '.pdf' | '.html' | '.odt',
  ): Promise<Buffer>;
}
FileHelperService
class FileHelperService {
  getFileExtension(filename: string): string;
  getMimeType(filename: string): string;
  isValidDocumentType(extension: string): boolean;
  isValidImageType(extension: string): boolean;
  getFileSize(buffer: Buffer): number;
  sanitizeFilename(filename: string): string;
}
Common Use Cases
1. Generate Merged Invoice Report
async generateInvoiceReport(date: Date): Promise<Buffer> {
  const orders = await this.orderService.findByDate(date);
  const orderPdfs = await Promise.all(
    orders.map((order) => this.generateOrderPdf(order.id)),
  );
  const header = await this.generateReportHeader(date);
  return this.docService.mergePdf([header, ...orderPdfs]);
}
2. Create Package Documents
async createContractPackage(contractId: string): Promise<Buffer> {
  const contract = await this.contractService.findOne(contractId);
  // Load each template buffer, then fill it with the contract data.
  // this.loadTemplate is the application helper used in the earlier examples.
  const fill = async (name: string) =>
    this.docService.populateTemplate(await this.loadTemplate(name), contract);
  const [cover, terms, specs, pricing] = await Promise.all([
    fill('cover.docx'),
    fill('terms.docx'),
    fill('specs.docx'),
    fill('pricing.docx'),
  ]);
  // Merge all DOCX
  return this.docService.mergeDocx([cover, terms, specs, pricing]);
}
3. Export to Multiple Formats
async exportDocument(documentId: string, format: string): Promise<Buffer> {
  const doc = await this.documentService.findOne(documentId);
  switch (format) {
    case 'pdf':
      return this.docService.convertDocument(doc.buffer, '.pdf');
    case 'html':
      return this.docService.convertDocument(doc.buffer, '.html');
    default:
      return doc.buffer; // Return original DOCX
  }
}
Troubleshooting
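Several of the questions below come down to passing a buffer that is not the kind of file the caller expects. A minimal pre-flight check can inspect the leading bytes before merging or converting. This is a sketch under assumptions: `detectDocumentKind` is a hypothetical helper that is not part of this package, and it identifies only the container format (a DOCX file is a ZIP archive, and a legacy DOC file is an OLE compound file), not whether the document inside is valid.

```typescript
import { Buffer } from 'node:buffer';

type DocumentKind = 'pdf' | 'zip' | 'ole' | 'empty' | 'unknown';

/**
 * Classifies a buffer by its leading magic bytes.
 * Hypothetical helper: 'zip' covers DOCX (and any other ZIP-based file),
 * 'ole' covers legacy DOC (and other OLE compound files).
 */
function detectDocumentKind(buffer: Buffer): DocumentKind {
  if (buffer.length === 0) {
    return 'empty';
  }
  const startsWith = (bytes: number[]): boolean =>
    buffer.length >= bytes.length && bytes.every((byte, i) => buffer[i] === byte);

  if (startsWith([0x25, 0x50, 0x44, 0x46])) return 'pdf'; // "%PDF"
  if (startsWith([0x50, 0x4b, 0x03, 0x04])) return 'zip'; // "PK\x03\x04"
  if (startsWith([0xd0, 0xcf, 0x11, 0xe0, 0xa1, 0xb1, 0x1a, 0xe1])) return 'ole';
  return 'unknown';
}

// Example: refuse to merge anything that does not start like a PDF.
function assertAllPdf(buffers: Buffer[]): void {
  buffers.forEach((buffer, index) => {
    const kind = detectDocumentKind(buffer);
    if (kind !== 'pdf') {
      throw new Error(`Buffer at index ${index} is not a PDF (detected: ${kind})`);
    }
  });
}
```

Reporting the index and detected kind in the error makes it easier to trace which upstream source produced the bad buffer.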
Q: LibreOffice not found?
Ensure LibreOffice is installed and in PATH:
which libreoffice
# or
which soffice
Q: Conversion fails?
Check that the input buffer is a valid document:
if (buffer.length === 0) {
  throw new Error('Empty buffer');
}
Q: PDF merge produces blank pages?
Ensure PDF buffers are valid:
// Verify PDF header
if (!buffer.subarray(0, 4).equals(Buffer.from('%PDF'))) {
  throw new Error('Invalid PDF buffer');
}
Related Packages
- @zola_do/docx — DOCX template processing
- @zola_do/minio — Required for storage
License
ISC
