pi-ocr

v1.3.15

Pi extension: Zero-setup multi-backend OCR — MinerU (free cloud), Ollama (local GPU, LaTeX formulas), Pix2Text (local Python). Extract text, formulas, and tables from images and PDFs. Default: zero config, works out of the box.

⚡ Zero setup. Works out of the box.

Default backend is MinerU — a free cloud API. No GPU, no API key, no pip install. Just pi install and /ocr.

OCR for Pi Coding Agent. Bridges the multimodal gap for non-vision LLMs like DeepSeek: when your model can't see images, pi_ocr reads them for you.


Quickstart

pi install npm:pi-ocr
/ocr ./screenshot.png
/ocr ./paper.pdf

That's all. MinerU (free cloud API) is the default — zero config.

The pi_ocr tool takes only a file path. Backend, model, and task are configured by the user via /ocr settings — the AI doesn't need to manage them.


Backends

Switch anytime with /ocr (no args).

| | Backend | Best for | Setup |
|---|---|---|---|
| ☁️ | MinerU (default) | PDFs, general docs | None |
| ☁️ | MinerU Pro | Large PDFs, vlm accuracy | API token |
| 🦙 | Ollama | Math formulas → LaTeX | GPU + 2.2GB model |
| 🔤 | Tesseract | Plain text (~30MB) | brew install tesseract |
| 📐 | Pix2Text | Math + text, GPU/CPU | pip install pix2text |

💡 Unsure which backend to pick? See the benchmark with real test results and the ground truth for comparison.


MinerU (default)

Free cloud API. Images are wrapped as PDF so language-aware OCR applies.

Limits: ≤10MB, ≤20 pages/request. PDFs >20 pages auto-split via pypdfium2.
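
The auto-split rule above can be sketched as a page-range calculation. This is a minimal illustration, not pi-ocr's actual code: `split_page_ranges` is a hypothetical helper, and the 20-page cap comes from the free-tier limit stated above. The extension itself does the splitting via pypdfium2; this sketch only computes which pages would land in each chunk.

```python
def split_page_ranges(total_pages: int, max_pages: int = 20) -> list[tuple[int, int]]:
    """Return half-open (start, end) page ranges of at most max_pages pages each."""
    if total_pages < 0 or max_pages < 1:
        raise ValueError("total_pages must be >= 0 and max_pages >= 1")
    return [
        (start, min(start + max_pages, total_pages))
        for start in range(0, total_pages, max_pages)
    ]


if __name__ == "__main__":
    # A 45-page PDF would need three requests: pages 0-19, 20-39, and 40-44.
    print(split_page_ranges(45))
```

A PDF with 20 pages or fewer yields a single range, so it would go out as one request without splitting.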


MinerU Pro (vlm model)

Higher accuracy via token-based precision API. ≤200MB, ≤200 pages — no splitting needed.

Get a free token at mineru.net/apiManage, then set it in /ocr settings. 1000 pages/day high-priority.


Ollama

Local GPU OCR via glm-ocr — state-of-the-art formula recognition (94.6 OmniDocBench). Outputs LaTeX.

# macOS
brew install ollama && ollama pull glm-ocr
brew install poppler   # multi-page PDFs

# Linux
curl -fsSL https://ollama.com/install.sh | sh
ollama pull glm-ocr
sudo apt install poppler-utils

Tesseract

Classic OCR engine. Ultra-lightweight (~30MB). No formula support — use Ollama or Pix2Text for math.

brew install tesseract              # macOS
sudo apt install tesseract-ocr      # Linux

Supports Chinese: brew install tesseract-lang (auto-installed on macOS).


Pix2Text

Mathpix alternative — handles text + formulas on GPU (CUDA/MPS) or CPU. Auto-detects best device.

pip install pix2text

First run downloads ONNX models (~50MB).


Settings

Open with /ocr (no args).

| Setting | Description |
|---|---|
| OCR Backend | Switch between MinerU, Ollama, Pix2Text, Tesseract |
| MinerU: Split PDF >20 pages | Auto-split large PDFs into free-tier chunks |
| MinerU Pro Token | API token from mineru.net/apiManage |
| Ollama Model | Vision model (glm-ocr, minicpm-v, etc.) |
| Clear OCR temp files | Remove cached OCR output from /tmp |


Output Behavior

Results ≤2000 chars are returned inline in the tool response. Longer results are written to a temp file (/tmp/pi-ocr-*.md); the tool response includes the file path for the AI to read.
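
The inline-versus-file rule can be sketched as follows. This is an illustration under assumptions rather than the extension's real implementation: `deliver_result` and its return shape are hypothetical, and Python's `tempfile` writes to the system temp directory, which is `/tmp` on most Linux systems but may differ elsewhere.

```python
import tempfile

INLINE_LIMIT = 2000  # characters; the threshold stated above


def deliver_result(text: str, limit: int = INLINE_LIMIT) -> dict:
    """Return short results inline; write longer ones to a temp .md file."""
    if len(text) <= limit:
        return {"inline": text, "path": None}
    # delete=False keeps the file on disk so it can be read after this call.
    with tempfile.NamedTemporaryFile(
        mode="w", encoding="utf-8", prefix="pi-ocr-", suffix=".md", delete=False
    ) as handle:
        handle.write(text)
        return {"inline": None, "path": handle.name}
```

Returning only a path for long results keeps the tool response small; the AI reads the file when it actually needs the full text.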


Commands

| Command | Description |
|---|---|
| /ocr | Open settings (backend, model, split toggle, clear cache) |
| /ocr <file> | OCR a file |
| /ocr <file> formula | Math LaTeX output (Ollama backend) |
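
To make the three command forms concrete, here is a rough sketch of how they could be told apart. `parse_ocr_command` and the action names are hypothetical and only mirror the table above; how pi-ocr actually parses its arguments is not documented here.

```python
import shlex


def parse_ocr_command(line: str) -> dict:
    """Classify an /ocr invocation as 'settings', 'ocr', or 'formula'."""
    parts = shlex.split(line)  # honors quotes, so paths may contain spaces
    if not parts or parts[0] != "/ocr":
        raise ValueError("not an /ocr command")
    args = parts[1:]
    if not args:
        return {"action": "settings", "file": None}
    if len(args) == 1:
        return {"action": "ocr", "file": args[0]}
    if len(args) == 2 and args[1] == "formula":
        return {"action": "formula", "file": args[0]}
    raise ValueError(f"unrecognized arguments: {args[1:]}")
```

For example, `parse_ocr_command('/ocr "my scan.png" formula')` yields the `formula` action with `my scan.png` as the file.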

Troubleshooting

MinerU 429 → Wait a minute or switch backend.

MinerU Pro 401 → Regenerate token at mineru.net/apiManage.

"Is Ollama running?" → ollama serve

"pdftoppm not found" → brew install poppler / sudo apt install poppler-utils

"python3 not found" (Pix2Text) → pip install pix2text

"tesseract not found" → brew install tesseract / sudo apt install tesseract-ocr
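
Most of the "not found" errors above come down to a missing executable on PATH, which can be checked before running OCR. The sketch below is an assumption-laden illustration: `missing_tools` and the `REQUIRED_TOOLS` mapping are hypothetical and inferred from the error messages listed here, not taken from pi-ocr's source.

```python
import shutil
from typing import Callable, Optional

# Executables each local backend appears to rely on, inferred from the
# troubleshooting messages above (an assumption, not pi-ocr's own list).
REQUIRED_TOOLS = {
    "ollama": ["ollama", "pdftoppm"],
    "tesseract": ["tesseract"],
    "pix2text": ["python3"],
}


def missing_tools(
    backend: str,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> list[str]:
    """Return the executables needed by `backend` that are not on PATH."""
    return [tool for tool in REQUIRED_TOOLS.get(backend, []) if which(tool) is None]


if __name__ == "__main__":
    for name in REQUIRED_TOOLS:
        print(name, "missing:", missing_tools(name))
```

The `which` parameter exists only so the lookup can be swapped out in tests; by default it uses `shutil.which`. A running Ollama server is a separate concern that this check does not cover.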


License

MIT