mac-ocr

v1.0.0

macOS CLI for OCR and searchable PDFs using Apple's Vision framework. Recognize text in images and PDFs, stream PDF pages, and add an invisible selectable text layer over scanned pages.

Readme

Tip: Useful for AI agents too. Instead of spending vision tokens reading documents, an agent can run mac-ocr locally for free. A skill is bundled so agents know how to use it.

Features

  • Read text from an image: mac-ocr photo.png
  • Read text from many images: mac-ocr *.png
  • Stream text from a PDF, page by page: mac-ocr scan.pdf --format jsonl
  • Turn an image into a searchable PDF: mac-ocr searchable-pdf photo.png → photo.ocr.pdf
  • Add a selectable text layer to a scanned PDF: mac-ocr searchable-pdf scan.pdf → scan.ocr.pdf

Install

npm install -g mac-ocr

Or run it without installing:

npx mac-ocr receipt.jpg

Requirements: macOS 10.15+. The npm package ships a prebuilt universal binary, so no Xcode or Swift toolchain is needed.

Recognize text

OCR is the default action — you don't need a subcommand:

mac-ocr receipt.jpg                 # text → stdout
mac-ocr page1.png page2.png         # multiple images
mac-ocr scan.pdf                    # multi-page PDF
cat screenshot.png | mac-ocr        # stdin
mac-ocr https://example.com/a.png   # URL (simple GET)

Default output is plain text. Use JSON when you need bounding boxes, confidence, or page metadata:

mac-ocr receipt.jpg --format json
mac-ocr document.pdf --format jsonl   # one JSON object per page, streamed

PDF pages stream as they're recognized, so with a large document you see the first page's text right away.
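
Because --format jsonl emits one JSON object per page, a consumer can parse the stream incrementally instead of waiting for the process to exit. The following is a minimal sketch of a chunk-tolerant line splitter. `createJsonlParser` is a hypothetical helper, and the `page` and `text` field names in the usage are assumptions for illustration only; the actual output schema is described in docs/CLI.md.

```javascript
// Hypothetical helper: incrementally parse JSONL text that arrives in arbitrary chunks.
function createJsonlParser() {
  let buffer = ''
  return {
    // Feed a chunk of text; returns the objects completed by this chunk.
    push(chunk) {
      buffer += chunk
      const lines = buffer.split('\n')
      buffer = lines.pop() // the last piece may be an incomplete line
      return lines
        .filter((line) => line.trim() !== '')
        .map((line) => JSON.parse(line))
    },
    // Call once the stream ends to parse a final line that has no trailing newline.
    flush() {
      const rest = buffer.trim()
      buffer = ''
      return rest === '' ? [] : [JSON.parse(rest)]
    },
  }
}

// Usage with made-up data; the "page" and "text" field names are assumptions.
const parser = createJsonlParser()
const first = parser.push('{"page":1,"text":"Hello"}\n{"page":2,')
const second = parser.push('"text":"World"}\n')
console.log(first.length, second[0].text) // prints "1 World"
```

In a real script the chunks would come from the child process's stdout, with each parsed object handled as soon as `push` returns it.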

Save text to files

mac-ocr ~/Screenshots/*.png -o '[dir]/[name].txt'   # a .txt next to each image
mac-ocr scan.pdf -o notes.md                        # recognized text to a chosen .txt/.md file
mac-ocr receipts/*.pdf -o out/                      # one file per input in out/
grep -rli "invoice" ~/Screenshots                    # then search with normal tools

-o takes a file, a directory (out/), or a filename template built from placeholders ([name], [ext], [dir], [page]). Quote templates, since […] is a glob pattern in zsh. Whatever the extension, the content is the plain recognized text.
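
To make the template idea concrete, here is a rough sketch of how placeholder expansion could work. `expandTemplate` is a hypothetical function, not mac-ocr's implementation. In particular, whether [ext] includes the leading dot and how [dir] is rendered for a bare filename are assumptions made for this example.

```javascript
// Hypothetical sketch of placeholder expansion; not mac-ocr's actual code.
// Assumptions: [ext] has no leading dot, and [dir] is "." for a bare filename.
function expandTemplate(template, inputPath, page) {
  const slash = inputPath.lastIndexOf('/')
  const dir = slash === -1 ? '.' : inputPath.slice(0, slash) || '/'
  const base = inputPath.slice(slash + 1)
  const dot = base.lastIndexOf('.')
  const name = dot > 0 ? base.slice(0, dot) : base
  const ext = dot > 0 ? base.slice(dot + 1) : ''
  const values = { dir, name, ext, page: page === undefined ? '' : String(page) }
  // Replace the known placeholders; any other [token] is left untouched.
  return template.replace(/\[(dir|name|ext|page)\]/g, (_, key) => values[key])
}

console.log(expandTemplate('[dir]/[name].txt', '/Users/me/Screenshots/shot.png'))
// prints "/Users/me/Screenshots/shot.txt"
```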

Create a searchable PDF

searchable-pdf takes a PDF or an image and writes a PDF that looks identical to the source but whose text is selectable and searchable. By default it writes [name].ocr.pdf next to each input — one searchable PDF per input (inputs are never merged):

mac-ocr searchable-pdf scan.pdf            # writes scan.ocr.pdf
mac-ocr searchable-pdf photo.jpg            # image → one-page photo.ocr.pdf
mac-ocr searchable-pdf *.pdf                # writes <name>.ocr.pdf for each

Use -o to control the destination — a directory, a [name] template, a fixed file, or - for stdout:

mac-ocr searchable-pdf scan.pdf -o out/              # out/scan.ocr.pdf
mac-ocr searchable-pdf scan.pdf -o '[name]-ocr.pdf'  # scan-ocr.pdf
mac-ocr searchable-pdf scan.pdf -o searchable.pdf    # fixed path
mac-ocr searchable-pdf scan.pdf -o - > out.pdf       # stdout; redirect to a new file, since the shell truncates the target first

A fixed path or - (stdout) takes a single input; for multiple inputs use a directory or a [name] template.
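
The destination rules above can be summarized as a small decision function. This is a sketch of the documented behavior only. `resolveDestinations` is a hypothetical name, and it assumes that a trailing slash marks a directory and that a template is any destination containing [name]; the real tool may decide these cases differently, for example by checking the filesystem.

```javascript
// Hypothetical sketch of the -o rules for searchable-pdf; not the tool's implementation.
// Assumptions: a trailing "/" marks a directory; a template is any dest containing "[name]".
function resolveDestinations(inputs, dest) {
  const stem = (p) => p.slice(p.lastIndexOf('/') + 1).replace(/\.[^.]+$/, '')
  const dirOf = (p) => (p.includes('/') ? p.slice(0, p.lastIndexOf('/') + 1) : '')

  if (dest === undefined) {
    // Default: [name].ocr.pdf next to each input.
    return inputs.map((p) => `${dirOf(p)}${stem(p)}.ocr.pdf`)
  }
  if (dest.endsWith('/')) {
    return inputs.map((p) => `${dest}${stem(p)}.ocr.pdf`)
  }
  if (dest.includes('[name]')) {
    return inputs.map((p) => dest.split('[name]').join(stem(p)))
  }
  // A fixed path or "-" (stdout) can only receive one PDF.
  if (inputs.length > 1) {
    throw new Error('A fixed path or "-" takes a single input; use a directory or a [name] template')
  }
  return [dest]
}

console.log(resolveDestinations(['scan.pdf'], 'out/')[0])           // prints "out/scan.ocr.pdf"
console.log(resolveDestinations(['scan.pdf'], '[name]-ocr.pdf')[0]) // prints "scan-ocr.pdf"
```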

Pages that already have selectable text are skipped — only scanned pages get OCR. A PDF that needs no OCR at all passes through unchanged. To OCR every page regardless, pass --ocr-all-pages. The finer points (what survives a rewrite, how "already has text" is decided) are in docs/CLI.md.

In an interactive terminal you get a live [page/total] progress counter. Piped or redirected runs are silent on success, so scripts stay clean.

Options

Both OCR and searchable-pdf accept the recognition options:

| Flag | Effect |
|------|--------|
| --fast | Faster, lower-accuracy recognition |
| --password <password> | Password for an encrypted PDF (or set MAC_OCR_PDF_PASSWORD) |
| -l, --language <code> | Recognition language (BCP-47, repeatable). e.g. -l en-US -l ja-JP |
| -c, --confidence <0–1> | Drop observations below this confidence |
| -w, --custom-words <word> | Add custom vocabulary (repeatable) |
| --custom-words-file <path> | Custom vocabulary file, one word per line |
| --no-language-correction | Disable language correction |
| --min-text-height <0–1> | Ignore text shorter than this fraction of image height |
| --pdf-dpi <auto\|72–600> | PDF rasterization DPI (default auto) |
| --roi <x,y,w,h> | Region of interest: restrict recognition to a normalized region (top-left origin) |
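
The --roi values are fractions of the image, measured from the top-left corner. As an illustration, the sketch below converts such a region to pixel coordinates and to a normalized rectangle with a bottom-left origin, which is the convention Vision's regionOfInterest uses. `roiToPixels` and `roiToBottomLeft` are hypothetical helpers; how mac-ocr performs this conversion internally is not stated in this README.

```javascript
// Hypothetical helpers illustrating normalized-region math; not mac-ocr's code.
// roi = { x, y, w, h } as fractions (0 to 1) of the image, origin at the top-left.
function roiToPixels(roi, imageWidth, imageHeight) {
  return {
    x: Math.round(roi.x * imageWidth),
    y: Math.round(roi.y * imageHeight),
    width: Math.round(roi.w * imageWidth),
    height: Math.round(roi.h * imageHeight),
  }
}

// Flip the vertical axis to get a normalized rectangle with a bottom-left origin.
function roiToBottomLeft(roi) {
  return { x: roi.x, y: 1 - roi.y - roi.h, w: roi.w, h: roi.h }
}

// The top half of a 1000×800 image, i.e. --roi 0,0,1,0.5:
const topHalf = { x: 0, y: 0, w: 1, h: 0.5 }
console.log(roiToPixels(topHalf, 1000, 800)) // x 0, y 0, width 1000, height 400
console.log(roiToBottomLeft(topHalf))        // same box, with y measured from the bottom: 0.5
```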

mac-ocr <file>

| Flag | Effect |
|------|--------|
| -f, --format <text\|json\|jsonl> | Output format (default text) |
| -o, --output <path> | Output path, directory, or template ([name], [ext], [dir], [page]). Default: stdout. Any extension is accepted, e.g. .txt or .md. |
| --max-candidates <1–10> | Alternative text candidates per observation |

mac-ocr searchable-pdf <file>

| Flag | Effect |
|------|--------|
| -o, --output <dest> | Output path, [name] template, directory, or - for stdout. Default: [name].ocr.pdf next to each input. |
| --ocr-all-pages | OCR every page, including pages that already have selectable text (skipped by default) |

List the recognition languages available on your macOS version with mac-ocr languages (add --fast for the fast recognizer's set).

See docs/CLI.md for the full reference — every command and flag, plus the JSON output schema.

Node.js API

The same package exposes a typed, promise-based API that wraps the binary. Inputs are image or PDF bytes — read files or fetch URLs in your own code and pass the bytes:

npm install mac-ocr

import fs from 'node:fs/promises'
import { ocr, createSearchablePdf, supportedLanguages } from 'mac-ocr'

// Recognize text in an image or single-page PDF
const result = await ocr(await fs.readFile('receipt.jpg'))
console.log(result.text)
for (const { text, confidence, boundingBox } of result.observations) { /* … */ }

// Multi-page PDF: stream pages as they finish…
for await (const page of ocr.pages(await fs.readFile('book.pdf'))) {
    console.log(page.page, '/', page.pageCount, page.text)
}
// …or collect the whole thing into an array
const pages = await Array.fromAsync(ocr.pages(await fs.readFile('book.pdf')))

// Build a searchable PDF (returns the PDF bytes)
const pdf = await createSearchablePdf(await fs.readFile('scan.pdf'), { fast: true })
await fs.writeFile('scan.ocr.pdf', pdf)

// Recognition languages supported on this macOS version (for ocr and createSearchablePdf)
const languages = await supportedLanguages()

Options mirror the CLI flags (like { fast: true } above), plus an AbortSignal for cancellation. Failures throw a MacOcrError with a kind you can branch on. See docs/NODE.md for every option, the result types, and error handling.
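
Since each observation carries its own confidence, results can be post-processed in plain JavaScript. The sketch below is one possible helper (the name `splitByConfidence` is hypothetical) that separates observations at a threshold. It relies only on the `observations`, `text`, and `confidence` fields shown above, and is exercised here with made-up data rather than real OCR output.

```javascript
// Hypothetical helper: partition observations by a confidence threshold.
// Expects the result shape used above: { observations: [{ text, confidence, ... }] }.
function splitByConfidence(result, threshold = 0.5) {
  const accepted = []
  const review = []
  for (const observation of result.observations) {
    if (observation.confidence >= threshold) {
      accepted.push(observation)
    } else {
      review.push(observation)
    }
  }
  return { accepted, review }
}

// Made-up data standing in for a real ocr() result.
const sample = {
  observations: [
    { text: 'TOTAL', confidence: 0.98 },
    { text: '$1Z.50', confidence: 0.31 },
  ],
}
const { accepted, review } = splitByConfidence(sample, 0.5)
console.log(accepted.map((o) => o.text), review.map((o) => o.text))
```

Unlike the confidence option, which drops low-confidence observations outright, this keeps them in a separate list so they can be checked by hand.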

How it works

mac-ocr is a native Swift binary built on Apple's Vision framework (VNRecognizeTextRequest). Recognition happens entirely on-device — nothing is uploaded. The searchable-PDF layer is invisible text drawn with Core Graphics + Core Text, placed word by word where Vision found each word.

Agent Skills

The package bundles an agent skill covering the CLI and Node API — set up skills-npm in your project and coding agents discover it automatically.