doctopdf-convertor
v0.1.1
Composable document to PDF conversion toolkit for Node.js.
doctopdf
Composable document-to-PDF conversion toolkit for Node.js with a focus on security, extensibility, and reliability.
Features
- Plugin-based architecture: register multiple converters and prioritise them per request.
- Flexible inputs: convert from file paths or raw buffers with automatic MIME/extension detection.
- Security-first defaults: optional directory allow-lists, secure temporary files, and overwrite protection.
- Extensible: author custom converters for specialised formats beyond LibreOffice.
- Tested: unit tests cover orchestration, error handling, and security boundaries.
The bundled LibreOfficeConverter requires LibreOffice (soffice) to be installed and available on the host system.
Installation
npm install doctopdf-convertor
The library targets Node.js >=16.20. If you rely on the LibreOffice converter, ensure soffice is present on the system PATH (or supply the binary location via options).
Quick start
import fs from "node:fs/promises";
import { createDefaultDocumentConverter } from "doctopdf-convertor";
const converter = createDefaultDocumentConverter();
const result = await converter.convertToPdf("report.docx");
if (result.buffer) {
  await fs.writeFile("report.pdf", result.buffer);
  console.log("PDF generated!");
}
Targeting a specific output file
import path from "node:path";
import { DocumentConverter, LibreOfficeConverter } from "doctopdf-convertor";
const converter = new DocumentConverter({
  converters: [new LibreOfficeConverter()],
  allowedOutputRoot: path.resolve("output"), // optional security fence
});
await converter.convertToPdf("invoice.doc", {
  target: {
    kind: "file",
    path: path.resolve("output/invoice.pdf"),
    overwrite: true,
  },
});
If overwrite is omitted (defaults to false), the call fails when the file already exists.
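One common way to get this fail-if-exists behaviour in Node.js is the "wx" file flag. The sketch below is illustrative only: writeGuarded and demo are hypothetical names, and nothing here is taken from the library's source.

```typescript
import { mkdtemp, rm, writeFile } from "node:fs/promises";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical helper, not part of doctopdf-convertor.
async function writeGuarded(
  filePath: string,
  data: Uint8Array,
  overwrite = false
): Promise<void> {
  // "wx" creates the file and fails with EEXIST if it already exists;
  // "w" truncates and replaces an existing file.
  await writeFile(filePath, data, { flag: overwrite ? "w" : "wx" });
}

// Small demonstration in a throwaway temporary directory.
async function demo(): Promise<string[]> {
  const dir = await mkdtemp(join(tmpdir(), "overwrite-demo-"));
  const target = join(dir, "invoice.pdf");
  const log: string[] = [];
  try {
    await writeGuarded(target, Buffer.from("first"));
    log.push("created");
    try {
      await writeGuarded(target, Buffer.from("second")); // overwrite omitted
      log.push("replaced");
    } catch (error) {
      log.push((error as NodeJS.ErrnoException).code ?? "unknown");
    }
    await writeGuarded(target, Buffer.from("third"), true);
    log.push("overwritten");
  } finally {
    await rm(dir, { recursive: true, force: true });
  }
  return log;
}
```

Because the existence check and the write happen in a single system call, this approach avoids the race that a separate "does the file exist?" check followed by a write would have.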
Supplying buffers
When converting files received over the network you can pass buffers and (optionally) metadata hints:
const result = await converter.convertToPdf({
  kind: "buffer",
  data: incomingBuffer,
  originalFilename: "presentation.pptx", // helps guess extensions
});
Restricting inputs and outputs
You can enforce directory allow-lists to prevent accidental traversal outside controlled locations:
const converter = new DocumentConverter({
  converters: [new LibreOfficeConverter()],
  allowedInputRoots: [path.resolve("/srv/uploads")],
  allowedOutputRoot: path.resolve("/srv/pdf-cache"),
});
Conversions that target paths outside these directories throw a SecurityError.
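To make the idea of a directory allow-list concrete, here is a minimal sketch of how such a containment check is often written with node:path. It is an assumption about the general technique, not the library's actual implementation; isInsideRoot and isAllowed are hypothetical helper names.

```typescript
import * as path from "node:path";

// Hypothetical helper: true when `candidate` is `root` itself or lies below it.
function isInsideRoot(root: string, candidate: string): boolean {
  const resolvedRoot = path.resolve(root);
  const resolvedCandidate = path.resolve(candidate); // collapses "." and ".." segments
  const relative = path.relative(resolvedRoot, resolvedCandidate);
  if (relative === "") {
    return true; // the root directory itself
  }
  // Outside means the relative path climbs upwards or points at another root/drive.
  const climbsUp = relative === ".." || relative.startsWith(`..${path.sep}`);
  return !climbsUp && !path.isAbsolute(relative);
}

// Hypothetical helper: accept a path if any allow-listed root contains it.
function isAllowed(roots: readonly string[], candidate: string): boolean {
  return roots.some((root) => isInsideRoot(root, candidate));
}
```

With this sketch, isInsideRoot("/srv/pdf-cache", "/srv/pdf-cache/../etc/passwd") is false because the path resolves to /srv/etc/passwd, and a sibling such as /srv/pdf-cache-evil/x.pdf is rejected as well. Note that path.resolve does not follow symbolic links; a stricter check could resolve both paths with fs.realpath first.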
Extending with custom converters
Implement the Converter interface to support other back-ends:
import type {
  Converter,
  ConversionIntent,
  ConverterContext,
  ConversionResult,
} from "doctopdf-convertor";
import { DocumentConverter, LibreOfficeConverter } from "doctopdf-convertor";
class MarkdownConverter implements Converter {
  id = "markdown";
  supportedSources = ["buffer", "path"] as const;
  outputFormats = ["pdf"] as const;
  async isAvailable() {
    return true;
  }
  async canConvert(intent: ConversionIntent) {
    return (
      intent.source.kind === "buffer" &&
      intent.source.mimeType === "text/markdown"
    );
  }
  async convert(
    intent: ConversionIntent,
    context: ConverterContext
  ): Promise<ConversionResult> {
    const pdfBuffer = await renderMarkdownToPdf(intent.source); // application-specific helper
    return {
      converterId: this.id,
      targetFormat: "pdf",
      buffer: pdfBuffer,
      warnings: [],
      metadata: {},
    };
  }
}
const converter = new DocumentConverter({
  converters: [new MarkdownConverter(), new LibreOfficeConverter()],
});
You can prioritise converters at call time via the preferredConverters option.
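The exact shape of preferredConverters is not shown in this README. Assuming it is a list of converter ids, the reordering could work roughly as in the sketch below. NamedConverter and orderByPreference are hypothetical names used only for illustration, and the id "libreoffice" is likewise an assumption.

```typescript
// Hypothetical minimal shape: only the id matters for ordering.
interface NamedConverter {
  id: string;
}

// Move preferred converters to the front, keeping registration order otherwise.
function orderByPreference<T extends NamedConverter>(
  registered: readonly T[],
  preferredIds: readonly string[] = []
): T[] {
  const rank = new Map<string, number>();
  preferredIds.forEach((id, index) => {
    if (!rank.has(id)) rank.set(id, index); // first mention wins
  });
  // Converters that are not preferred share one rank after all preferred ones.
  const fallback = preferredIds.length;
  const rankOf = (converter: T) => rank.get(converter.id) ?? fallback;
  // Array.prototype.sort is stable, so ties keep their registration order.
  return [...registered].sort((a, b) => rankOf(a) - rankOf(b));
}

const registered = [{ id: "markdown" }, { id: "libreoffice" }];
const ordered = orderByPreference(registered, ["libreoffice"]);
console.log(ordered.map((converter) => converter.id).join(", "));
```

Under these assumptions a per-request preference only changes the order in which converters are tried; a converter that is not preferred still remains available as a fallback.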
Error handling
All failures surface as ConversionError subclasses:
- UnsupportedFormatError indicates no converter accepted the input.
- ConverterUnavailableError signals missing runtime dependencies (for example, the LibreOffice binary).
- SecurityError is raised when guard rails (output directory, overwrite protection) are violated.
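Because the three subclasses listed above share a common base, a caller can branch on them with instanceof. The sketch below defines local stand-in classes so that it runs on its own; in real code the classes would come from the package instead. The toHttpStatus helper and the chosen status codes are an illustrative application-level decision, not something the library prescribes.

```typescript
// Local stand-ins mirroring the hierarchy described above (not the library's classes).
class ConversionError extends Error {}
class UnsupportedFormatError extends ConversionError {}
class ConverterUnavailableError extends ConversionError {}
class SecurityError extends ConversionError {}

// Hypothetical helper: translate a conversion failure into an HTTP status code.
function toHttpStatus(error: unknown): number {
  if (error instanceof UnsupportedFormatError) return 415; // Unsupported Media Type
  if (error instanceof SecurityError) return 403; // Forbidden
  if (error instanceof ConverterUnavailableError) return 503; // Service Unavailable
  return 500; // any other ConversionError or unexpected failure
}
```

Checking the specific subclasses before falling back to a generic status keeps unexpected errors visible instead of silently mapping them to a client error.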
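// Assumes the error classes are exported from the package root.
import { UnsupportedFormatError } from "doctopdf-convertor";
try {
  await converter.convertToPdf("slides.key");
} catch (error) {
  if (error instanceof UnsupportedFormatError) {
    console.warn("No converter could process this file.");
  } else {
    throw error; // let other failures propagate
  }
}
Testing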
Unit tests validate orchestration and security behaviours (mocking the LibreOffice converter). Run them with:
npm test
Examples
Run the bundled demo after building the library:
npm run example:basic
The script in examples/basic converts the provided sample.docx into a PDF (requires LibreOffice). You can pass another filename from that directory (for example sample.txt) to try different inputs. See examples/README.md for more details.
Roadmap ideas
- Additional built-in converters (for example wkhtmltopdf or Ghostscript).
- Fine-grained output metadata (page count, word count, embedded fonts).
- Optional sandbox execution helpers (Docker, Firejail, or similar).
Contributions are welcome. Open an issue or submit a pull request with your improvement ideas.
