pdf-oxide-wasm

v0.3.17

Fast, zero-dependency PDF toolkit for Node.js, browsers, and serverless edge runtimes. Extract text, convert to markdown/HTML, search, fill forms, create and edit PDFs — all from WebAssembly.

Built on the pdf-oxide Rust core. No native binaries, no system dependencies.

Why pdf-oxide-wasm

| Feature | pdf-oxide-wasm | pdf-parse | pdf-lib | pdfjs-dist |
|---|---|---|---|---|
| Text extraction | Yes | Yes | No | Yes |
| Markdown / HTML output | Yes | No | No | No |
| PDF creation | Yes | No | Yes | No |
| Form field read/write | Yes | No | Partial | No |
| Full-text search (regex) | Yes | No | No | No |
| Image extraction | Yes | No | No | No |
| Merge, encrypt, edit | Yes | No | Yes | No |
| Serverless / edge runtimes | Yes | No | No | No |
| Zero native dependencies | Yes | Yes | Yes | No |
| WebAssembly-based | Yes | No | No | No |
| TypeScript types included | Yes | No | Yes | Yes |
| License | MIT / Apache-2.0 | MIT | MIT | Apache-2.0 |

Install

npm install pdf-oxide-wasm

Quick Start

Extract text (Node.js — CommonJS)

const { WasmPdfDocument } = require("pdf-oxide-wasm");
const fs = require("fs");

const bytes = new Uint8Array(fs.readFileSync("document.pdf"));
const doc = new WasmPdfDocument(bytes);

console.log(`Pages: ${doc.pageCount()}`);
console.log(doc.extractText(0));       // plain text from page 0
console.log(doc.toMarkdown(0));        // markdown from page 0
console.log(doc.toHtml(0));            // HTML from page 0

doc.free();
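
The calls above read page 0 only. To walk every page, a loop over pageCount() should be enough. The sketch below assumes page indices are zero-based and contiguous, as the example suggests; extractPages is a hypothetical helper name, not part of the package.

```javascript
// Hypothetical helper: collect the plain text of every page.
// Assumes `doc` exposes pageCount() and extractText(pageIndex) as in the
// quick-start example, with zero-based page indices.
function extractPages(doc) {
  const pages = [];
  const count = doc.pageCount();
  for (let i = 0; i < count; i++) {
    pages.push({ page: i, text: doc.extractText(i) });
  }
  return pages;
}
```

Because the helper only relies on those two methods, it can be exercised with a stub object before wiring it to a real WasmPdfDocument.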

Extract text (ESM / TypeScript)

import { WasmPdfDocument } from "pdf-oxide-wasm";
import { readFile } from "fs/promises";

const bytes = new Uint8Array(await readFile("document.pdf"));
const doc = new WasmPdfDocument(bytes);

const text = doc.extractAllText();
const markdown = doc.toMarkdownAll();

doc.free();
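
Both examples call doc.free() as the last statement, so an exception thrown during extraction would skip it. One way to guard against that is a try/finally wrapper. This is a minimal sketch; withDocument is a hypothetical name, and it assumes only that the document object has a free() method.

```javascript
// Hypothetical helper: run `fn` against a document and always release it.
// Intended for synchronous callbacks; if `fn` returns a promise, free()
// would run before that promise settles.
function withDocument(doc, fn) {
  try {
    return fn(doc);
  } finally {
    doc.free(); // runs whether fn returned or threw
  }
}
```

Usage would look like `const text = withDocument(new WasmPdfDocument(bytes), (d) => d.extractAllText());`.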

Create a PDF from Markdown

import { WasmPdf } from "pdf-oxide-wasm";

const pdf = WasmPdf.fromMarkdown("# Invoice\n\nTotal: $42.00", "Invoice", "Acme Corp");
const bytes = pdf.toBytes(); // Uint8Array — write to file or send as response
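
Since fromMarkdown takes a plain string, the Markdown can be assembled from data first. The sketch below builds an invoice body; invoiceMarkdown and the { label, amount } item shape are assumptions made for illustration, and how list items render depends on the library's Markdown support.

```javascript
// Hypothetical helper: build the Markdown string for a simple invoice.
// The item shape ({ label, amount }) is an assumption for this example.
function invoiceMarkdown(items) {
  const total = items.reduce((sum, item) => sum + item.amount, 0);
  const lines = items.map(
    (item) => `- ${item.label}: $${item.amount.toFixed(2)}`
  );
  return ["# Invoice", "", ...lines, "", `Total: $${total.toFixed(2)}`].join("\n");
}
```

The result can then be passed along as in the example above, for instance `WasmPdf.fromMarkdown(invoiceMarkdown(items), "Invoice", "Acme Corp")`.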

Search inside a PDF

const results = doc.search("quarterly revenue", true); // case-insensitive
// Returns: [{ page, text, bbox, start_index, end_index, span_boxes }]
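
Search results arrive as a flat list, so grouping them by page is a common next step. The sketch below assumes the result is an iterable of plain objects with the page and text fields listed above; groupMatchesByPage is a hypothetical helper name.

```javascript
// Hypothetical helper: group search results by page index.
// Assumes each result carries the `page` and `text` fields shown above.
function groupMatchesByPage(results) {
  const byPage = new Map();
  for (const match of results) {
    if (!byPage.has(match.page)) byPage.set(match.page, []);
    byPage.get(match.page).push(match.text);
  }
  return byPage;
}
```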

Read and fill form fields

const fields = doc.getFormFields();
// [{ name, field_type, value, tooltip, bounds, is_readonly, is_required }]

doc.setFormFieldValue("name", "Jane Doe");
doc.setFormFieldValue("agree_terms", true);

const filledPdf = doc.saveToBytes(); // Uint8Array
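
When filling many fields at once, it can help to check names against getFormFields() before writing. The following sketch uses only getFormFields() and setFormFieldValue() plus the name and is_readonly properties shown above; fillForm and the shape of its report are hypothetical.

```javascript
// Hypothetical helper: apply a { fieldName: value } map to a document's form.
// Skips read-only fields and reports names that are not present in the form.
function fillForm(doc, values) {
  const fields = new Map(doc.getFormFields().map((f) => [f.name, f]));
  const report = { filled: [], skippedReadonly: [], unknown: [] };
  for (const [name, value] of Object.entries(values)) {
    const field = fields.get(name);
    if (!field) {
      report.unknown.push(name);
    } else if (field.is_readonly) {
      report.skippedReadonly.push(name);
    } else {
      doc.setFormFieldValue(name, value);
      report.filled.push(name);
    }
  }
  return report;
}
```

Calling saveToBytes() afterwards, as in the example above, would then serialize the filled form.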

Encrypt a PDF (AES-256)

const encrypted = doc.saveEncryptedToBytes(
  "user-password",
  "owner-password",
  true,  // allow print
  false, // deny copy
);
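
Positional booleans are easy to mix up, so a thin wrapper with named options may read better at call sites. This is a sketch under assumptions: saveEncrypted is a hypothetical name, and it forwards only the four arguments shown in the example above, since the order of any further permission flags is not stated here.

```javascript
// Hypothetical wrapper: give names to the positional arguments of
// saveEncryptedToBytes. ownerPassword is passed through unchanged and may be
// undefined, since the API table marks it optional.
function saveEncrypted(doc, options) {
  const { userPassword, ownerPassword, allowPrint = true, allowCopy = false } = options;
  if (!userPassword) throw new Error("userPassword is required");
  return doc.saveEncryptedToBytes(userPassword, ownerPassword, allowPrint, allowCopy);
}
```

For example, `saveEncrypted(doc, { userPassword: "user-password", ownerPassword: "owner-password" })` would forward the same four values as the call above.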

Features

Text Extraction — plain text, Markdown, and HTML output formats. Character-level and span-level extraction with bounding boxes, font names, sizes, weights, colors, and italic flags.

Format Conversion — convert any page or all pages to Markdown (with heading detection, images, form fields), HTML (with optional CSS layout preservation), or structured plain text.

Full-Text Search — regex and literal search across all pages or a single page. Case-insensitive, whole-word, and max-results options. Returns match positions with bounding boxes.

Image Extraction — extract image metadata (dimensions, color space, bits per component, bounding boxes) and raw image bytes as PNG.

Form Fields — read all AcroForm fields (text, button, choice, signature). Get/set individual field values. Export form data as FDF or XFDF. Flatten forms into static content. XFA detection.

PDF Creation — generate PDFs from Markdown, HTML, plain text, or images (PNG/JPEG). Multi-image support (one page per image). Set title, author metadata.

PDF Editing — set document metadata (title, author, subject, keywords). Rotate pages, set MediaBox/CropBox, crop margins. Erase (whiteout) regions. Reposition, resize, and set bounds on images. Flatten or apply redactions. Merge PDFs. Embed files.

Encryption — AES-256 encryption with granular permissions (print, copy, modify, annotate).

Document Structure — bookmarks/outline (table of contents), annotations (links, comments, form widgets), page labels, XMP metadata, vector paths.
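
The image extraction feature above describes its output as PNG bytes. Before writing such bytes to disk, a quick signature check can catch mix-ups. The sketch below takes a single Uint8Array, because the exact return shape of extractImageBytes is not described here; isPng is a hypothetical helper, while the 8-byte signature comes from the PNG format itself.

```javascript
// PNG files start with a fixed 8-byte signature.
const PNG_SIGNATURE = [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a];

// Returns true when `bytes` (a Uint8Array) begins with the PNG signature.
function isPng(bytes) {
  return (
    bytes.length >= PNG_SIGNATURE.length &&
    PNG_SIGNATURE.every((value, i) => bytes[i] === value)
  );
}
```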

API Reference

WasmPdfDocument — read, extract, search, and edit existing PDFs

| Method | Description |
|---|---|
| new(data) | Load PDF from Uint8Array |
| pageCount() | Number of pages |
| version() | PDF version as [major, minor] |
| authenticate(password) | Decrypt an encrypted PDF |
| hasStructureTree() | Check for Tagged PDF structure |
| Text Extraction | |
| extractText(page) | Plain text from one page |
| extractAllText() | Plain text from all pages |
| extractChars(page) | Character-level data with positions |
| extractSpans(page) | Span-level data with positions |
| Format Conversion | |
| toMarkdown(page, headings?, images?, forms?) | Markdown from one page |
| toMarkdownAll(headings?, images?, forms?) | Markdown from all pages |
| toHtml(page, layout?, headings?, forms?) | HTML from one page |
| toHtmlAll(layout?, headings?, forms?) | HTML from all pages |
| toPlainText(page) | Plain text with layout |
| toPlainTextAll() | Plain text all pages |
| Search | |
| search(pattern, caseInsensitive?, literal?, wholeWord?, max?) | Search all pages |
| searchPage(page, pattern, ...) | Search one page |
| Images | |
| extractImages(page) | Image metadata (dimensions, color space, bbox) |
| extractImageBytes(page) | Image data as PNG Uint8Array |
| pageImages(page) | Image placement info (bounds, matrix) |
| Forms | |
| getFormFields() | All form fields with types and values |
| getFormFieldValue(name) | Get a single field value |
| setFormFieldValue(name, value) | Set a field value |
| exportFormData(format?) | Export as FDF or XFDF |
| hasXfa() | Check for XFA form data |
| flattenForms() | Flatten all form fields |
| flattenFormsOnPage(page) | Flatten fields on one page |
| Document Structure | |
| getOutline() | Bookmarks / table of contents |
| getAnnotations(page) | Page annotations |
| extractPaths(page) | Vector paths (lines, curves) |
| pageLabels() | Page label ranges |
| xmpMetadata() | XMP metadata |
| Editing | |
| setTitle(title) | Set document title |
| setAuthor(author) | Set document author |
| setSubject(subject) | Set document subject |
| setKeywords(keywords) | Set document keywords |
| setPageRotation(page, degrees) | Set page rotation |
| rotatePage(page, degrees) | Rotate page by degrees |
| rotateAllPages(degrees) | Rotate all pages |
| pageMediaBox(page) | Get MediaBox |
| setPageMediaBox(page, llx, lly, urx, ury) | Set MediaBox |
| pageCropBox(page) | Get CropBox |
| setPageCropBox(page, llx, lly, urx, ury) | Set CropBox |
| cropMargins(left, right, top, bottom) | Crop all page margins |
| eraseRegion(page, llx, lly, urx, ury) | Whiteout a region |
| eraseRegions(page, rects) | Whiteout multiple regions |
| repositionImage(page, name, x, y) | Move an image |
| resizeImage(page, name, w, h) | Resize an image |
| setImageBounds(page, name, x, y, w, h) | Set image bounds |
| flattenPageAnnotations(page) | Flatten page annotations |
| flattenAllAnnotations() | Flatten all annotations |
| applyPageRedactions(page) | Apply redactions on page |
| applyAllRedactions() | Apply all redactions |
| mergeFrom(data) | Merge another PDF |
| embedFile(name, data) | Embed a file |
| Save | |
| saveToBytes() | Save edits → Uint8Array |
| saveEncryptedToBytes(userPwd, ownerPwd?, ...) | Save with AES-256 encryption |
| free() | Release WASM memory |

WasmPdf — create new PDFs

| Method | Description |
|---|---|
| fromMarkdown(content, title?, author?) | Create PDF from Markdown |
| fromHtml(content, title?, author?) | Create PDF from HTML |
| fromText(content, title?, author?) | Create PDF from plain text |
| fromImageBytes(data) | Create PDF from image (PNG/JPEG) |
| fromMultipleImageBytes(images) | Create multi-page PDF from images |
| toBytes() | Get PDF as Uint8Array |
| size | PDF size in bytes |

Platform Compatibility

Works without modification in:

  • Node.js 18+ (CommonJS and ESM)
  • Browsers — Chrome, Firefox, Safari, Edge
  • Cloudflare Workers — runs in V8 isolates with WASM support
  • Deno — native WASM support
  • Bun — native WASM support

No native binaries, no node-gyp, no postinstall scripts. Install and use immediately.
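
Across these runtimes the PDF bytes arrive in different containers: a Buffer from fs in Node.js, an ArrayBuffer from fetch in browsers and Workers. Since the document constructor takes a Uint8Array, a small normalizer keeps loading code portable. This is a sketch; toUint8Array is a hypothetical helper, and any module initialization a particular bundler or runtime may need is not covered here.

```javascript
// Hypothetical helper: normalize common binary inputs to a plain Uint8Array.
function toUint8Array(input) {
  if (input instanceof ArrayBuffer) return new Uint8Array(input);
  if (ArrayBuffer.isView(input)) {
    // Covers Uint8Array, Node.js Buffer, DataView and other typed arrays.
    // Creates a view over the same bytes instead of copying them.
    return new Uint8Array(input.buffer, input.byteOffset, input.byteLength);
  }
  throw new TypeError("expected an ArrayBuffer or a typed-array view");
}
```

With fetch, this could be used as `toUint8Array(await (await fetch(url)).arrayBuffer())`, where url is whatever location serves the PDF.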

Performance

pdf-oxide-wasm is built on a Rust PDF parser compiled to WebAssembly. The Rust core (pdf_oxide) reports a mean extraction time of 0.8 ms across 3,830 test PDFs with a 100% success rate, which the project describes as the fastest PDF text extraction available in Rust. Compiling to WASM keeps performance close to native, without garbage collection overhead or child process spawning.
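
To see how these figures translate to a particular runtime and set of documents, a rough timing loop is often enough. The sketch below is not the project's benchmark methodology; meanTimeMs is a hypothetical helper that relies on the global performance.now(), which is available in browsers and in Node.js 18+.

```javascript
// Hypothetical helper: mean wall-clock time of `fn` over `runs` calls, in ms.
// Includes loop overhead and makes no attempt at warm-up, so treat the result
// as a rough indication only.
function meanTimeMs(fn, runs = 100) {
  const start = performance.now();
  for (let i = 0; i < runs; i++) fn();
  return (performance.now() - start) / runs;
}
```

A call such as `meanTimeMs(() => doc.extractText(0))` would time repeated extraction of the first page of an already loaded document.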

Full Documentation

Complete guide with examples: Getting Started with WASM

Rust library documentation: docs.rs/pdf_oxide

License

MIT OR Apache-2.0