xpdf-wrapper

v0.1.0

Node.js wrapper for Xpdf command-line tools

📄 xpdf-wrapper

A powerful Node.js wrapper for Xpdf command-line tools

Extract text, images, fonts, and metadata from PDF files with ease


🌟 Why xpdf-wrapper?

xpdf-wrapper brings the power of Xpdf's battle-tested PDF processing tools to Node.js. Whether you need to extract text for search indexing, convert PDFs to images, or analyze document metadata, this library provides a clean, modern API with full TypeScript support.

✨ Key Features

| Feature | Description |
|---------|-------------|
| 📄 Complete Xpdf Suite | All 9 tools included: pdftotext, pdftops, pdftoppm, pdftopng, pdftohtml, pdfinfo, pdfimages, pdffonts, pdfdetach |
| 🔄 Buffer Support | Process PDFs directly from memory - no need to save temporary files |
| 📝 Direct Text Output | pdftotext returns extracted text directly in result.text |
| 🎯 TypeScript First | Complete type definitions for all tools and options |
| ⚡ Zero Config | Xpdf binaries are automatically downloaded on install |
| 🔀 Flexible API | Choose between standalone functions or the unified Xpdf class |
| 🚀 Batch Processing | Process multiple PDFs or run multiple operations concurrently |


📦 Installation

# Using npm
npm install xpdf-wrapper

# Using yarn
yarn add xpdf-wrapper

# Using pnpm
pnpm add xpdf-wrapper

Note: Xpdf binaries are automatically downloaded for your platform (Windows, macOS, Linux) during installation.


🚀 Quick Start

Basic Text Extraction

import { pdftotext } from "xpdf-wrapper";

// Extract text from a PDF file
const result = await pdftotext("./document.pdf");
console.log(result.text);
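
For search indexing it is often useful to work page by page. The sketch below splits extracted text into pages. It assumes the text uses form feed characters as page separators, which is the default behaviour of the Xpdf pdftotext tool, and that the wrapper does not disable page breaks. `splitPages` is a hypothetical helper, not part of xpdf-wrapper.

```typescript
// Hypothetical helper (not part of xpdf-wrapper): split extracted text into pages.
// Assumption: pages are separated by form feed characters ("\f").
function splitPages(text: string): string[] {
  const pages = text.split("\f");
  // A form feed after the last page leaves one empty trailing element; drop it.
  if (pages.length > 1 && pages[pages.length - 1] === "") {
    pages.pop();
  }
  return pages;
}

// Stand-in for result.text from pdftotext()
const sampleText = "Page one\fPage two\f";
console.log(splitPages(sampleText).length); // 2
```

In real use, `result.text` from the example above would take the place of `sampleText`.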

Working with Buffers

import { pdftotext } from "xpdf-wrapper";
import { readFileSync } from "fs";

// Process PDF directly from a Buffer
const pdfBuffer = readFileSync("./document.pdf");
const result = await pdftotext(pdfBuffer);
console.log(result.text);

Get PDF Metadata

import { pdfinfo } from "xpdf-wrapper";

const result = await pdfinfo("./document.pdf");
console.log(result.stdout);
// Example output:
// Creator:        Microsoft Word
// Producer:       Adobe PDF Library
// CreationDate:   Wed Dec 25 12:00:00 2024
// Pages:          5
// File size:      102400 bytes
// ...
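
If you use the standalone function and want fields rather than raw text, the `Key: value` lines in `result.stdout` can be parsed by hand. The sketch below is one way to do it; `parsePdfInfo` is a hypothetical helper, and the sample string only mirrors the example output above.

```typescript
// Hypothetical helper: turn pdfinfo-style "Key: value" lines into an object.
function parsePdfInfo(stdout: string): Record<string, string> {
  const info: Record<string, string> = {};
  for (const line of stdout.split(/\r?\n/)) {
    // Split on the first colon only, so values that contain colons
    // (dates, for instance) stay intact.
    const idx = line.indexOf(":");
    if (idx <= 0) continue; // skip blank or malformed lines
    const key = line.slice(0, idx).trim();
    const value = line.slice(idx + 1).trim();
    if (key) info[key] = value;
  }
  return info;
}

// Stand-in for result.stdout
const sampleStdout = [
  "Creator:        Microsoft Word",
  "Producer:       Adobe PDF Library",
  "Pages:          5",
  "File size:      102400 bytes",
].join("\n");

const parsed = parsePdfInfo(sampleStdout);
console.log(Number(parsed["Pages"])); // 5
console.log(parsed["File size"]);     // 102400 bytes
```

All values come back as strings, so numeric fields such as `Pages` need an explicit conversion.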

📚 API Reference

Available Tools

xpdf-wrapper provides wrappers for all 9 Xpdf command-line tools:

| Tool | Function | Description |
|------|----------|-------------|
| pdftotext | pdftotext() | Extract text content from PDF |
| pdftops | pdftops() | Convert PDF to PostScript |
| pdftoppm | pdftoppm() | Convert PDF pages to PPM images |
| pdftopng | pdftopng() | Convert PDF pages to PNG images |
| pdftohtml | pdftohtml() | Convert PDF to HTML |
| pdfinfo | pdfinfo() | Get PDF metadata and information |
| pdfimages | pdfimages() | Extract embedded images from PDF |
| pdffonts | pdffonts() | List fonts used in PDF |
| pdfdetach | pdfdetach() | Extract file attachments from PDF |

Standalone Functions

All tool wrappers accept either a file path (string) or a Buffer as input:

import { readFileSync } from "fs";
import {
  pdftotext,
  pdftops,
  pdftoppm,
  pdftopng,
  pdftohtml,
  pdfinfo,
  pdfimages,
  pdffonts,
  pdfdetach
} from "xpdf-wrapper";

// Using a file path, with options
const text = await pdftotext("./document.pdf", undefined, { layout: true });

// Using a Buffer, with options
const buffer = readFileSync("./document.pdf");
const info = await pdfinfo(buffer, { rawDates: true });

// Without options
const fonts = await pdffonts("./document.pdf");
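
When the input is a Buffer that came from an upload or a network response, a cheap sanity check before calling a tool can catch obviously wrong data. The sketch below checks for the `%PDF-` signature at the start of the data. `looksLikePdf` is a hypothetical helper, not part of xpdf-wrapper, and it is deliberately strict: files with leading bytes before the header would be rejected.

```typescript
// Hypothetical helper: check that binary data starts with the "%PDF-" signature.
// Accepts Uint8Array, so a Node.js Buffer (a Uint8Array subclass) works too.
const PDF_SIGNATURE = [0x25, 0x50, 0x44, 0x46, 0x2d]; // "%PDF-"

function looksLikePdf(data: Uint8Array): boolean {
  if (data.length < PDF_SIGNATURE.length) return false;
  return PDF_SIGNATURE.every((byte, i) => data[i] === byte);
}

// "%PDF-1.7" as raw bytes
const pdfHeader = new Uint8Array([0x25, 0x50, 0x44, 0x46, 0x2d, 0x31, 0x2e, 0x37]);
// "hello" as raw bytes
const notPdf = new Uint8Array([0x68, 0x65, 0x6c, 0x6c, 0x6f]);

console.log(looksLikePdf(pdfHeader)); // true
console.log(looksLikePdf(notPdf));    // false
```

A check like this only filters out clearly invalid input; the Xpdf tools remain the authority on whether a file can actually be processed.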

The Xpdf Class

For more structured results and batch operations, use the Xpdf class:

import { Xpdf } from "xpdf-wrapper";
import { readFileSync } from "fs";

const xpdf = new Xpdf();

// Extract text with parsed result
const textResult = await xpdf.pdfToText("./document.pdf");
console.log(textResult.text);

// Get PDF info with parsed metadata
const infoResult = await xpdf.pdfInfo("./document.pdf");
console.log(infoResult.info.Pages);      // 5
console.log(infoResult.info.Creator);    // "Microsoft Word"

// List fonts with parsed output
const fontsResult = await xpdf.pdfFonts("./document.pdf");
console.log(fontsResult.fonts);          // Array of font objects

// Works with Buffers too
const buffer = readFileSync("./document.pdf");
const result = await xpdf.pdfInfo(buffer);

Processing Multiple PDFs

Pass an array to process multiple PDF files:

import { Xpdf } from "xpdf-wrapper";
import { readFileSync } from "fs";

const xpdf = new Xpdf();

// Process multiple PDFs
const results = await xpdf.pdfInfo([
  "./document1.pdf",
  "./document2.pdf",
  "./document3.pdf"
]);

// Results is an array
results.forEach((result, index) => {
  console.log(`Document ${index + 1}: ${result.info.Pages} pages`);
});

// Mix file paths and Buffers
const buffer = readFileSync("./document2.pdf");
const mixedResults = await xpdf.pdfToText([
  "./document1.pdf",
  buffer,
  "./document3.pdf"
]);
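
The README does not say how many Xpdf processes run at once when an array is passed. For very large lists, one cautious option is to feed the array in smaller chunks. The sketch below shows the idea; `chunk` and `processInChunks` are hypothetical helpers, and `processBatch` stands in for a call such as `(batch) => xpdf.pdfInfo(batch)`.

```typescript
// Hypothetical helper: split a list into consecutive chunks of at most `size` items.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// Hypothetical helper: process chunks one after another and keep result order.
async function processInChunks<T, R>(
  items: readonly T[],
  size: number,
  processBatch: (batch: T[]) => Promise<R[]>
): Promise<R[]> {
  const results: R[] = [];
  for (const batch of chunk(items, size)) {
    results.push(...(await processBatch(batch)));
  }
  return results;
}

// Stand-in for a real batch call; returns the length of each path.
const fakeProcess = async (batch: string[]): Promise<number[]> =>
  batch.map((p) => p.length);

processInChunks(["a.pdf", "bb.pdf", "ccc.pdf"], 2, fakeProcess).then((lengths) =>
  console.log(lengths) // [ 5, 6, 7 ]
);
```

Chunks run sequentially here, which trades some speed for a bounded number of inputs in flight at any time.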

Batch Operations

Run multiple operations on the same PDF(s) concurrently:

import { Xpdf } from "xpdf-wrapper";

const xpdf = new Xpdf();

// Run multiple operations on a single PDF
const results = await xpdf.batch("./document.pdf", [
  "pdfInfo",
  "pdfFonts", 
  "pdfToText"
]);

// Access results by operation name
console.log("Page count:", results.pdfInfo?.info.Pages);
console.log("Fonts used:", results.pdfFonts?.fonts);
console.log("Text content:", results.pdfToText?.text);
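
The optional chaining above suggests that an entry in the batch result may be absent. The sketch below shows one way to summarize such a result without assuming every operation is present. The `BatchResultsLike` interface is an assumption modelled on the property names used above, not the library's actual type, and `summarizeBatch` is a hypothetical helper.

```typescript
// Assumed shape, modelled on the property names used in the example above.
interface BatchResultsLike {
  pdfInfo?: { info: Record<string, unknown> };
  pdfFonts?: { fonts: unknown[] };
  pdfToText?: { text: string };
}

// Hypothetical helper: build a short summary, tolerating missing entries.
function summarizeBatch(results: BatchResultsLike): string[] {
  const summary: string[] = [];
  summary.push(
    results.pdfInfo
      ? `pages: ${String(results.pdfInfo.info["Pages"])}`
      : "pdfInfo: no result"
  );
  summary.push(
    results.pdfFonts
      ? `fonts: ${results.pdfFonts.fonts.length}`
      : "pdfFonts: no result"
  );
  summary.push(
    results.pdfToText
      ? `text: ${results.pdfToText.text.length} chars`
      : "pdfToText: no result"
  );
  return summary;
}

// Mock result with one operation missing
const mock: BatchResultsLike = {
  pdfInfo: { info: { Pages: 5 } },
  pdfToText: { text: "hello" },
};
console.log(summarizeBatch(mock));
// [ 'pages: 5', 'pdfFonts: no result', 'text: 5 chars' ]
```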

⚙️ Configuration

Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| NODE_XPDF_BIN_DIR | <package>/bin | Custom path to Xpdf binaries |
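
The table can be read as a simple fallback rule: use NODE_XPDF_BIN_DIR when it is set, otherwise the `bin` directory inside the package. The sketch below illustrates that rule only. `resolveBinDir` is a hypothetical function, not the library's actual resolution logic, and it does not cover how the variable interacts with the `binDir` constructor option, which the README does not state.

```typescript
// Hypothetical illustration of the fallback described in the table.
// `env` would typically be process.env; `packageDir` is the package's install directory.
function resolveBinDir(
  env: Record<string, string | undefined>,
  packageDir: string
): string {
  const custom = env["NODE_XPDF_BIN_DIR"]?.trim();
  // An unset or empty variable falls back to <package>/bin.
  return custom ? custom : `${packageDir}/bin`;
}

console.log(resolveBinDir({ NODE_XPDF_BIN_DIR: "/opt/xpdf/bin" }, "/app/node_modules/xpdf-wrapper"));
// /opt/xpdf/bin
console.log(resolveBinDir({}, "/app/node_modules/xpdf-wrapper"));
// /app/node_modules/xpdf-wrapper/bin
```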

Custom Options

Configure the Xpdf class with custom options:

import { Xpdf } from "xpdf-wrapper";

const xpdf = new Xpdf({
  // Custom binary directory
  binDir: "/opt/xpdf/bin",
  
  // Runtime options
  run: {
    timeoutMs: 30000,  // 30 second timeout
  }
});

Tool-Specific Options

Each tool supports its own set of options matching the Xpdf CLI:

import { pdftotext, pdfinfo, pdftopng } from "xpdf-wrapper";

// pdftotext options
await pdftotext("./doc.pdf", undefined, {
  firstPage: 1,
  lastPage: 10,
  layout: true,        // Maintain original layout
  table: true,         // Table mode
  lineEnd: "unix",     // Line endings: "unix" | "dos" | "mac"
  enc: "UTF-8",        // Output encoding
  ownerPassword: "secret",
  userPassword: "secret"
});

// pdfinfo options
await pdfinfo("./doc.pdf", {
  firstPage: 1,
  lastPage: 5,
  box: true,           // Print page box info
  meta: true,          // Print metadata
  rawDates: true,      // Print dates in raw format
});

// pdftopng options
await pdftopng("./doc.pdf", "./output", {
  firstPage: 1,
  lastPage: 1,
  resolution: 300,     // DPI
  mono: true,          // Monochrome output
  gray: true,          // Grayscale output
});
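
The `firstPage` and `lastPage` options make it possible to process a long document in slices. The sketch below computes 1-based, inclusive page ranges that could be spread into the options object of a call such as `pdftotext`. `pageWindows` and `PageRange` are hypothetical names, not part of xpdf-wrapper.

```typescript
// Hypothetical type: matches the firstPage/lastPage option names used above.
interface PageRange {
  firstPage: number;
  lastPage: number;
}

// Hypothetical helper: cover pages 1..totalPages with windows of at most windowSize pages.
function pageWindows(totalPages: number, windowSize: number): PageRange[] {
  if (!Number.isInteger(totalPages) || totalPages < 0) {
    throw new RangeError("totalPages must be a non-negative integer");
  }
  if (!Number.isInteger(windowSize) || windowSize < 1) {
    throw new RangeError("windowSize must be a positive integer");
  }
  const ranges: PageRange[] = [];
  for (let first = 1; first <= totalPages; first += windowSize) {
    ranges.push({
      firstPage: first,
      lastPage: Math.min(first + windowSize - 1, totalPages),
    });
  }
  return ranges;
}

console.log(pageWindows(25, 10));
// [
//   { firstPage: 1, lastPage: 10 },
//   { firstPage: 11, lastPage: 20 },
//   { firstPage: 21, lastPage: 25 }
// ]
```

Each range could then be passed along with other options, for example `{ ...range, layout: true }`, with the total page count taken from `pdfinfo`.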

📁 Examples

The examples/ directory contains working examples:

| Example | Description |
|---------|-------------|
| buffer-example.ts | Working with PDF Buffers |
| pdftotext-example.ts | Text extraction examples |
| pdfinfo-example.ts | Getting PDF metadata |
| batch-example.ts | Batch processing examples |

Running Examples

# First, build the project
npm run build

# Then run an example
npx tsx examples/buffer-example.ts
npx tsx examples/pdftotext-example.ts
npx tsx examples/pdfinfo-example.ts
npx tsx examples/batch-example.ts

🛠️ Development

# Clone the repository
git clone https://github.com/iqbal-rashed/xpdf-wrapper.git
cd xpdf-wrapper

# Install dependencies
npm install

# Build the project
npm run build

# Run tests
npm test

# Run tests in watch mode
npm run test:watch

# Lint the code
npm run lint

# Format the code
npm run format

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add some amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

📋 Requirements

  • Node.js 18.0 or higher
  • Platforms: Windows, macOS, Linux (binaries auto-downloaded)
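
An application that depends on this package may want to fail early on an unsupported runtime. The sketch below checks a version string against the Node.js 18 minimum listed above. `meetsNodeRequirement` is a hypothetical helper, not something the package exports.

```typescript
// Hypothetical helper: compare the major version of a Node.js version string
// (with or without a leading "v") against a minimum major version.
function meetsNodeRequirement(version: string, minMajor = 18): boolean {
  const majorPart = version.replace(/^v/, "").split(".")[0] ?? "";
  const major = Number.parseInt(majorPart, 10);
  return Number.isInteger(major) && major >= minMajor;
}

console.log(meetsNodeRequirement("18.0.0"));   // true
console.log(meetsNodeRequirement("v16.20.2")); // false
console.log(meetsNodeRequirement("20.11.1"));  // true
```

At runtime the input would typically be `process.versions.node`.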

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.


Made with ❤️ by Rashed Iqbal

Star this repo if you find it helpful!