afpp

v2.5.2

Async Fast PDF Parser for Node.js — dependency-light, TypeScript-first, production-ready.

afpp — A modern, dependency-light PDF parser for Node.js.

Built for performance, reliability, and developer sanity.


Overview

afpp (Another PDF Parser, Properly) is a Node.js library for extracting text and images from PDF files without manual native build steps, event-loop blocking, or fragile runtime assumptions.

The project was created to address recurring problems encountered with existing PDF tooling in the Node.js ecosystem:

  • Excessive bundle sizes and transitive dependencies
  • Native build steps (canvas, ImageMagick, Ghostscript)
  • Browser-specific assumptions (window, DOM, canvas)
  • Poor TypeScript support
  • Unreliable handling of encrypted PDFs
  • Performance and memory inefficiencies

afpp focuses on predictable behavior, explicit APIs, and production-ready defaults.


Key Features

  • No manual build step required — prebuilt native binaries are bundled automatically via @napi-rs/canvas
  • Fully asynchronous, non-blocking architecture
  • First-class TypeScript support
  • Supports local files, buffers, and remote URLs
  • Handles encrypted PDFs
  • Configurable concurrency and rendering scale
  • Minimal and auditable dependency graph

Requirements

  • Node.js >= 22.14.0

Installation

Install using your preferred package manager:

npm install afpp
# or
yarn add afpp
# or
pnpm add afpp

Quick Start

All parsing functions accept the same input types:

  • string (file path)
  • Buffer
  • Uint8Array
  • URL
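
As a rough illustration, the accepted inputs can be modeled as a union type with a small classifier. This is only a sketch: `PdfInputSketch` and `describeInput` are hypothetical names chosen for this example, not exports of afpp. Because Node's `Buffer` is a subclass of `Uint8Array`, one byte-array branch covers both.

```typescript
// Hypothetical union mirroring the documented input types.
// Buffer extends Uint8Array in Node.js, so Uint8Array covers it here.
type PdfInputSketch = string | Uint8Array | URL;

// Classify an input so calling code can log it or branch on it.
function describeInput(input: PdfInputSketch): 'path' | 'bytes' | 'url' {
  if (typeof input === 'string') return 'path';
  if (input instanceof URL) return 'url';
  return 'bytes';
}

console.log(describeInput('./document.pdf')); // 'path'
console.log(describeInput(new Uint8Array([0x25, 0x50, 0x44, 0x46]))); // 'bytes'
console.log(describeInput(new URL('https://pdfobject.com/pdf/sample.pdf'))); // 'url'
```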

Extract Text from a PDF

import { pdf2string } from 'afpp';

const pages = await pdf2string('./document.pdf');
console.log(pages); // ['Page 1 text', 'Page 2 text', ...]
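
Since the result is one string per page, ordinary array code is enough for post-processing. The sketch below finds which pages mention a term; `findPagesContaining` is a hypothetical helper written for this example, and the sample array stands in for real `pdf2string` output.

```typescript
// Return the 1-based page numbers whose text contains the search term.
// `pages` has the shape shown above: one string per page, in page order.
function findPagesContaining(pages: string[], term: string): number[] {
  const needle = term.toLowerCase();
  const hits: number[] = [];
  pages.forEach((text, index) => {
    if (text.toLowerCase().indexOf(needle) !== -1) hits.push(index + 1);
  });
  return hits;
}

// Sample data standing in for the result of pdf2string.
const samplePages = ['Invoice summary', 'Line items', 'Invoice terms and totals'];
console.log(findPagesContaining(samplePages, 'invoice')); // [1, 3]
```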

Render PDF Pages as Images

import { pdf2image } from 'afpp';

(async () => {
  const url = new URL('https://pdfobject.com/pdf/sample.pdf');
  const images = await pdf2image(url);

  console.log(images); // [Buffer, Buffer, ...]
})();

Streaming API (Large PDFs)

For large PDFs, use streaming functions to process pages incrementally without loading all results into memory:

import { writeFile } from 'fs/promises';

import { streamPdf2image, streamPdf2string } from 'afpp';

// Stream images - process each page as it's rendered
for await (const { pageNumber, pageCount, data } of streamPdf2image(
  './large.pdf',
)) {
  await writeFile(`page-${pageNumber}.png`, data);
  console.log(`Processed ${pageNumber}/${pageCount}`);
}

// Stream text - process each page as it's extracted
for await (const { pageNumber, data } of streamPdf2string('./large.pdf')) {
  console.log(`Page ${pageNumber}: ${data.substring(0, 100)}...`);
}

Benefits:

  • Lower peak memory usage
  • Faster time-to-first-result
  • Built-in progress tracking via pageNumber and pageCount
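
To show why incremental delivery helps, here is a self-contained sketch of the consumption pattern. It uses a stand-in async generator (`fakePageStream`) rather than afpp itself, and `PageResultSketch` and `takePages` are hypothetical names. Whether afpp stops rendering when a consumer exits early is not stated in this README; the sketch only shows the general behavior of generator-based sources.

```typescript
// Shape of each streamed item, as destructured in the example above.
interface PageResultSketch<T> {
  pageNumber: number;
  pageCount: number;
  data: T;
}

// Stand-in source (NOT afpp): yields one fake page at a time.
async function* fakePageStream(
  pageCount: number,
): AsyncGenerator<PageResultSketch<string>> {
  for (let pageNumber = 1; pageNumber <= pageCount; pageNumber++) {
    yield { pageNumber, pageCount, data: `text of page ${pageNumber}` };
  }
}

// Record progress for up to `limit` pages, then stop early.
// Breaking out of `for await` finishes the generator, so this stand-in
// source never produces the remaining pages.
async function takePages(
  source: AsyncIterable<PageResultSketch<string>>,
  limit: number,
): Promise<string[]> {
  const progress: string[] = [];
  if (limit <= 0) return progress;
  for await (const { pageNumber, pageCount } of source) {
    progress.push(`${pageNumber}/${pageCount}`);
    if (progress.length >= limit) break;
  }
  return progress;
}

takePages(fakePageStream(5), 2).then((progress) => console.log(progress)); // ['1/5', '2/5']
```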

Extract PDF Metadata

import { getPdfMetadata } from 'afpp';

const metadata = await getPdfMetadata('./document.pdf');
console.log(metadata.pageCount); // e.g. 9
console.log(metadata.isEncrypted); // false
console.log(metadata.title); // 'My Document' or undefined
console.log(metadata.creationDate); // Date object or undefined

// Encrypted PDF
const meta = await getPdfMetadata('./secure.pdf', { password: 'secret' });
console.log(meta.isEncrypted); // true

Low-Level Parsing API

For advanced use cases, parsePdf exposes page-level control and transformation.

import { parsePdf } from 'afpp';

(async () => {
  const response = await fetch('https://pdfobject.com/pdf/sample.pdf');
  const buffer = Buffer.from(await response.arrayBuffer());

  const result = await parsePdf(buffer, {}, (pageContent) => pageContent);
  console.log(result);
})();
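
The third argument above is an identity function. A non-trivial transform might look like the sketch below. This README does not document the type of `pageContent`, so the union used here (text as a string, rendered pages as bytes) is an assumption, and `PageContentSketch` and `normalizePage` are hypothetical names.

```typescript
// Assumed shape: text pages arrive as strings, rendered pages as bytes.
// The README does not document this type, so treat it as a guess.
type PageContentSketch = string | Uint8Array;

// Example transform: tidy whitespace in text, report size for binary content.
function normalizePage(content: PageContentSketch): string {
  if (typeof content === 'string') {
    return content.replace(/\s+/g, ' ').trim();
  }
  return `[binary page, ${content.byteLength} bytes]`;
}

console.log(normalizePage('  Hello \n  world  ')); // 'Hello world'
console.log(normalizePage(new Uint8Array(3))); // '[binary page, 3 bytes]'
```

If the assumption holds, a function like this could be passed in place of `(pageContent) => pageContent`.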

Configuration

All public APIs accept a shared options object.

const result = await parsePdf(buffer, {
  concurrency: 5,
  imageEncoding: 'jpeg',
  password: 'STRONG_PASS',
  scale: 4,
});
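
To make the `concurrency` setting concrete, the following is a generic bounded-concurrency mapper: at most `limit` tasks are in flight at any moment. It is an illustration of the concept only, not afpp's internal implementation, and `mapWithConcurrency` is a hypothetical helper.

```typescript
// Generic bounded-concurrency map: at most `limit` tasks run at once.
// Illustration only; this is not how afpp is implemented internally.
async function mapWithConcurrency<T, R>(
  items: T[],
  limit: number,
  task: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;

  // Each worker repeatedly claims the next unprocessed index.
  async function worker(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await task(items[index], index);
    }
  }

  const workerCount = Math.max(1, Math.min(limit, items.length));
  const workers: Promise<void>[] = [];
  for (let i = 0; i < workerCount; i++) workers.push(worker());
  await Promise.all(workers);
  return results; // same order as `items`, regardless of completion order
}

// Simulated page work: five "pages", at most two in flight at a time.
mapWithConcurrency([1, 2, 3, 4, 5], 2, async (page) => `page ${page} done`).then(
  (out) => console.log(out),
);
```

Higher limits trade memory for throughput, which is presumably why the documented default is a conservative 1.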

AfppParseOptions

| Option          | Type                                  | Default | Description                                                                        |
| --------------- | ------------------------------------- | ------- | ---------------------------------------------------------------------------------- |
| concurrency     | number \| 'auto'                      | 1       | Number of pages processed in parallel. Use 'auto' for CPU-based scaling.           |
| imageEncoding   | 'png' \| 'jpeg' \| 'webp' \| 'avif'   | 'png'   | Output format for rendered images                                                  |
| password        | string                                | —       | Password for encrypted PDFs                                                        |
| scale           | number                                | 1.0     | Rendering scale. Valid range: 0.1–10. (1.0 = 72 DPI, 2.0 = 144 DPI, 3.0 = 216 DPI) |
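
The scale-to-DPI relationship in the table is linear (72 DPI per unit of scale), so expected output resolution can be estimated up front. The helpers below are a sketch with hypothetical names (`scaleToDpi`, `renderedSize`). The rounding is an assumption, and the renderer's actual pixel dimensions may differ slightly.

```typescript
// A scale of 1.0 corresponds to 72 DPI, per the options table.
const BASE_DPI = 72;

// Convert a rendering scale to DPI, rejecting values outside the
// documented 0.1 to 10 range (NaN is rejected as well).
function scaleToDpi(scale: number): number {
  if (!(scale >= 0.1 && scale <= 10)) {
    throw new RangeError(`scale must be between 0.1 and 10, got ${scale}`);
  }
  return scale * BASE_DPI;
}

// Estimated pixel size of a rendered page, given its size in PDF points
// (1 pt = 1/72 inch). Rounding to whole pixels is an assumption.
function renderedSize(widthPt: number, heightPt: number, scale: number) {
  return {
    width: Math.round(widthPt * scale),
    height: Math.round(heightPt * scale),
  };
}

console.log(scaleToDpi(2)); // 144
console.log(renderedSize(612, 792, 2)); // US Letter at 2x: { width: 1224, height: 1584 }
```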

PdfMetadata

Returned by getPdfMetadata. All fields except pageCount and isEncrypted are optional — absent metadata fields are undefined, never empty strings.

| Field              | Type      | Description                                      |
| ------------------ | --------- | ------------------------------------------------ |
| pageCount          | number    | Total number of pages                            |
| isEncrypted        | boolean   | Whether the document required a password to open |
| title              | string?   | Document title                                   |
| author             | string?   | Document author                                  |
| subject            | string?   | Document subject                                 |
| creator            | string?   | Application that created the document            |
| producer           | string?   | PDF producer application                         |
| creationDate       | Date?     | Document creation date                           |
| modificationDate   | Date?     | Document last modification date                  |
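
Because absent fields are `undefined` rather than empty strings, the nullish coalescing operator (`??`) is a natural fit for supplying fallbacks. The sketch below mirrors the table in a local interface; `PdfMetadataSketch` and `summarizeMetadata` are hypothetical names for this example, not types exported by afpp.

```typescript
// Hypothetical local mirror of the fields in the table above (not imported from afpp).
interface PdfMetadataSketch {
  pageCount: number;
  isEncrypted: boolean;
  title?: string;
  author?: string;
  subject?: string;
  creator?: string;
  producer?: string;
  creationDate?: Date;
  modificationDate?: Date;
}

// Build a one-line summary, substituting placeholders for absent fields.
function summarizeMetadata(meta: PdfMetadataSketch): string {
  const title = meta.title ?? '(untitled)';
  const author = meta.author ?? '(unknown author)';
  const created = meta.creationDate
    ? meta.creationDate.toISOString().slice(0, 10)
    : 'n/a';
  const lock = meta.isEncrypted ? ', encrypted' : '';
  return `${title} by ${author}, ${meta.pageCount} pages, created ${created}${lock}`;
}

console.log(summarizeMetadata({ pageCount: 9, isEncrypted: false }));
// (untitled) by (unknown author), 9 pages, created n/a
```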


Design Principles

  • Node-first: No browser globals or DOM assumptions
  • Explicit over implicit: No magic configuration
  • Fail fast: Clear errors instead of silent corruption
  • Production-oriented: Optimized for long-running processes

Contributing

See CONTRIBUTING.md for development setup and pull request guidelines.


License

MIT © Richard Solár