@file-type/pdf
v0.3.0
file-type plugin to parse PDF files
Detector plugin for file-type that identifies PDF (Portable Document Format) files and selected PDF-based subtypes.
This plugin goes beyond simple magic-number detection and can inspect the internal PDF structure to distinguish between generic PDF files and specific producer formats such as Adobe Illustrator (.ai).
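For contrast, the "simple magic-number detection" mentioned above can be sketched as follows. This is an illustration only, not the plugin's implementation, and `hasPdfMagic` is a hypothetical helper name. It relies on the fact that a PDF file begins with the ASCII header `%PDF-`. A check like this cannot separate PDF-based subtypes from generic PDFs, which is presumably why the plugin inspects the internal structure as well.

```javascript
// Illustrative sketch only; not the plugin's actual implementation.
// A PDF file begins with the ASCII header "%PDF-" followed by a version such as "1.7".
const PDF_MAGIC = [0x25, 0x50, 0x44, 0x46, 0x2d]; // "%PDF-"

// Hypothetical helper: returns true when the bytes start with the PDF header.
function hasPdfMagic(bytes) {
  if (bytes.length < PDF_MAGIC.length) return false;
  return PDF_MAGIC.every((value, index) => bytes[index] === value);
}

const encoder = new TextEncoder();
console.log(hasPdfMagic(encoder.encode('%PDF-1.7\n'))); // true
console.log(hasPdfMagic(encoder.encode('<svg>')));      // false
```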
Scope
This detector is designed for well-formed PDF files and established PDF-based subtypes. Support for corrupted or non-conforming PDFs is intentionally limited and only considered when a deviation is both common and widely accepted.
Installation
npm install @file-type/pdf
Usage
The following example shows how to add the PDF detector to file-type:
import { FileTypeParser } from 'file-type';
import { detectPdf } from '@file-type/pdf';

// Register the PDF detector as a custom detector
const parser = new FileTypeParser({
  customDetectors: [detectPdf],
});

// Detect the type of a file on disk
const fileType = await parser.fromFile('example.pdf');
console.log(fileType);
PDF/A detection
When a PDF is identified as PDF/A (based on XMP metadata such as pdfaid:part), the detector returns a result that includes an additional flag:
export interface PdfTypeResult extends FileTypeResult {
  archive?: boolean;
}

Example result for a PDF/A file:
{
  ext: 'pdf',
  mime: 'application/pdf',
  archive: true
}

This allows consumers to distinguish archival PDFs from regular PDFs without introducing non-standard extensions or MIME types.
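To illustrate what "based on XMP metadata such as pdfaid:part" can mean in practice, here is a minimal sketch that looks for the PDF/A part number in an XMP packet that has already been extracted as text. The plugin's real logic may differ. `findPdfAPart` and the regular expression are assumptions made for this example, which only covers the two common XMP serializations (element form and attribute form).

```javascript
// Illustrative sketch only; the plugin's real detection logic may differ.
// XMP can serialize a property either as an element or as an attribute:
//   <pdfaid:part>1</pdfaid:part>   or   pdfaid:part="1"
const PDFA_PART_PATTERN =
  /<pdfaid:part>\s*(\d+)\s*<\/pdfaid:part>|pdfaid:part\s*=\s*["'](\d+)["']/;

// Hypothetical helper: returns the PDF/A part number, or undefined if absent.
function findPdfAPart(xmpText) {
  const match = PDFA_PART_PATTERN.exec(xmpText);
  if (!match) return undefined;
  // Group 1 is set for the element form, group 2 for the attribute form.
  return Number(match[1] ?? match[2]);
}

console.log(findPdfAPart('<rdf:Description pdfaid:part="2" pdfaid:conformance="B"/>')); // 2
console.log(findPdfAPart('<pdfaid:part>1</pdfaid:part>'));                              // 1
```

A production detector would also need to locate and decode the metadata stream inside the PDF, which this sketch deliberately leaves out.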
Supported file formats
.ai / application/illustrator: Adobe Illustrator
.pdf / application/pdf: Generic Portable Document Format files, including PDF/A (archival PDF)
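As a rough sketch of how a consumer might branch on these results, the helper below maps a detection result to a human-readable label. `describeDetection` and its labels are hypothetical and not part of the plugin. The sketch assumes results shaped like the ones documented above, with `undefined` meaning that no type was detected.

```javascript
// Hypothetical helper (not part of the plugin): maps a detection result to a label.
function describeDetection(result) {
  if (!result) return 'unknown'; // no type detected
  if (result.ext === 'ai') return 'Adobe Illustrator';
  if (result.ext === 'pdf') {
    // The archive flag is optional, so compare against true explicitly.
    return result.archive === true ? 'PDF/A (archival)' : 'PDF';
  }
  return `other (${result.ext})`;
}

console.log(describeDetection({ ext: 'pdf', mime: 'application/pdf', archive: true })); // PDF/A (archival)
console.log(describeDetection({ ext: 'ai', mime: 'application/illustrator' }));         // Adobe Illustrator
```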
License
This project is licensed under the MIT License. Feel free to use, modify, and distribute it as needed.
