@noctuatech/pdf-cleaner
v0.2.3
A library for cleaning PDF files by filtering out specified operations. Remove text or other unwanted elements from PDFs.
PDF Cleaner
A module for easily removing text and other content from a PDF.
npm i @noctuatech/pdf-cleaner
API
This package exposes its functionality through the cleaner() initializer, which prepares the library and returns the class that provides the methods described below.
All methods operate on a PDF provided as a Uint8Array (a Node.js Buffer also works, since Buffer is a subclass of Uint8Array) and return a Uint8Array containing the modified PDF bytes.
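Because both input and output are raw bytes, a quick sanity check before writing a result to disk can be handy. The sketch below is not part of @noctuatech/pdf-cleaner: looksLikePdf and toBytes are hypothetical helper names, and the check only assumes that a PDF file starts with the ASCII marker "%PDF-" (some readers tolerate leading junk, which this check would reject).

```typescript
// Illustrative helper, not part of @noctuatech/pdf-cleaner.
// Returns true when the byte array begins with the ASCII marker "%PDF-".
function looksLikePdf(bytes: Uint8Array): boolean {
  const marker = [0x25, 0x50, 0x44, 0x46, 0x2d]; // "%PDF-"
  if (bytes.length < marker.length) return false;
  return marker.every((byte, i) => bytes[i] === byte);
}

// Builds sample byte arrays from ASCII strings, for demonstration only.
function toBytes(text: string): Uint8Array {
  return new Uint8Array(text.split("").map((ch) => ch.charCodeAt(0)));
}

const pdfLike = looksLikePdf(toBytes("%PDF-1.7\n"));
const notPdf = looksLikePdf(toBytes("hello world"));
```

The same check can be applied to the bytes read from disk before they are passed to the library, or to the Uint8Array a method returns.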
cleaner()
Initializes the library and returns the PDFDocument class.
import { cleaner } from "@noctuatech/pdf-cleaner";
const PDFCleaner = await cleaner();
PDFDocument.filterOperations
Filters content stream operators according to the provided list and mode (see Mode enum below).
import { cleaner, Mode } from "@noctuatech/pdf-cleaner";
import fs from "node:fs/promises";
const PDFCleaner = await cleaner();
const doc = await PDFCleaner.fromBytes(
  await fs.readFile("./test.pdf")
);
// "BI", "ID" and "EI" are the operators that delimit inline images in a content stream.
const embeddedImagesRemoved = await doc.filterOperations(
  ["BI", "ID", "EI"],
  Mode.Remove
);
Cleaner.removeText
Removes text drawing operations from the PDF and returns the cleaned PDF bytes.
import { cleaner } from "@noctuatech/pdf-cleaner";
import fs from "node:fs/promises";
const PDFCleaner = await cleaner();
const doc = await PDFCleaner.fromBytes(
  await fs.readFile("./test.pdf")
);
const documentWithNoText = await doc.removeText();
Cleaner.leaveOnlyText
Keeps only text drawing operators and removes other content.
import { cleaner } from "@noctuatech/pdf-cleaner";
import fs from "node:fs/promises";
const PDFCleaner = await cleaner();
const doc = await PDFCleaner.fromBytes(
  await fs.readFile("./test.pdf")
);
const documentWithOnlyText = await doc.leaveOnlyText();
Types / enums
The Mode enum has two values:
enum Mode {
  Keep = 0,
  Remove = 1,
}
Mode.Keep: when used with filterOperations, keeps the listed operators and removes all others.
Mode.Remove: removes the listed operators and keeps all others.
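To make the difference between the two modes concrete, here is a minimal, self-contained sketch that models the documented semantics on a plain list of operator names. It is not the library's implementation: applyMode is a hypothetical helper, the Mode enum is re-declared locally so the snippet runs without the package, and the operator sequence is made up for illustration.

```typescript
// Local copy of the Mode enum shown above, so the sketch runs without the package.
enum Mode {
  Keep = 0,
  Remove = 1,
}

// Hypothetical helper: applies the documented Keep/Remove semantics
// to a plain list of operator names (not to a real PDF content stream).
function applyMode(operators: string[], listed: string[], mode: Mode): string[] {
  return operators.filter((op) => {
    const isListed = listed.indexOf(op) !== -1;
    // Keep: retain only listed operators. Remove: retain only unlisted ones.
    return mode === Mode.Keep ? isListed : !isListed;
  });
}

// Made-up sequence: a text object (BT ... ET) followed by a filled rectangle.
const sample = ["BT", "Tf", "Tj", "ET", "re", "f"];
const textOps = ["BT", "Tf", "Tj", "ET"];

const kept = applyMode(sample, textOps, Mode.Keep);
const removed = applyMode(sample, textOps, Mode.Remove);
```

With the same list of operators, the two modes partition the sequence: every operator ends up in exactly one of kept or removed.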
