@nova-mind-cloud/pdf-parser-mcp

v1.0.4

MCP Server for PDF parsing and content extraction

Downloads

9

@gdm-pixel/pdf-parser-mcp

License: NMCL | Subscription Required

🔐 Subscription-Based PDF Parser MCP

PDF parsing and text extraction.

⚠️ Requires Nova-Mind Cloud subscription - Starting at €39/month


💎 Open Source Code + Cloud Services

Code is open - Audit, learn, modify freely
🔐 Usage requires subscription - Backend authentication & infrastructure

👉 View pricing | Plans: €39 / €89 / €149 per month


✨ Features

  • 📄 PDF Text Extraction - Extract all text content from PDF files
  • 📊 Metadata Extraction - Get PDF metadata (title, author, pages, etc.)
  • 📂 Batch Processing - Parse multiple PDFs in a directory
  • 🔍 Page Limiting - Extract specific page ranges
  • Fast Processing - Efficient parsing engine
  • 🌐 Cross-platform - Works on Windows, macOS, Linux

📦 Installation

With Claude Desktop

Add to your claude_desktop_config.json:

{
  "mcpServers": {
    "pdf-parser": {
      "command": "npx",
      "args": ["-y", "@gdm-pixel/pdf-parser-mcp@latest"]
    }
  }
}

Config location:

  • Windows: %APPDATA%\Claude\claude_desktop_config.json
  • macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
  • Linux: ~/.config/Claude/claude_desktop_config.json
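
If you prefer to script the edit rather than change the file by hand, the following is a minimal sketch. It assumes the three config locations listed above and the server entry from the JSON snippet; the helper names `config_path` and `add_pdf_parser` are hypothetical and are not part of this package.

```python
import json
from pathlib import PurePosixPath, PureWindowsPath

# Server entry copied from the configuration snippet in this README.
SERVER_ENTRY = {"command": "npx", "args": ["-y", "@gdm-pixel/pdf-parser-mcp@latest"]}

CONFIG_NAME = "claude_desktop_config.json"


def config_path(platform: str, home: str, appdata: str = "") -> str:
    """Return the documented config location.

    `platform` follows Python's sys.platform values: "win32", "darwin", "linux".
    `appdata` is the value of %APPDATA% and is only used on Windows.
    """
    if platform == "win32":
        return str(PureWindowsPath(appdata) / "Claude" / CONFIG_NAME)
    if platform == "darwin":
        return str(
            PurePosixPath(home) / "Library" / "Application Support" / "Claude" / CONFIG_NAME
        )
    return str(PurePosixPath(home) / ".config" / "Claude" / CONFIG_NAME)


def add_pdf_parser(config_text: str) -> str:
    """Merge the pdf-parser entry into existing config JSON, keeping other servers."""
    config = json.loads(config_text) if config_text.strip() else {}
    config.setdefault("mcpServers", {})["pdf-parser"] = SERVER_ENTRY
    return json.dumps(config, indent=2)
```

Merging instead of overwriting keeps any other MCP servers already present in the file. Restart Claude Desktop after saving so that it reloads the configuration.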

🚀 Usage

Extract Text from PDF

Extract text from this PDF: C:\Documents\report.pdf

Returns:

  • Full text content
  • Page count
  • Metadata (if available)

Extract with Page Limit

Extract first 10 pages from C:\Documents\long-document.pdf

Useful for:

  • Large documents
  • Quick previews
  • Specific chapters

Get PDF Metadata

Analyze this PDF file: C:\Documents\report.pdf

Returns:

  • Title
  • Author
  • Creator
  • Creation date
  • Number of pages
  • PDF version

List PDFs in Directory

List all PDF files in C:\Documents\Reports

Returns:

  • All PDF files found
  • File sizes
  • File paths

📋 Available Tools

parse-pdf

Extract text content from a PDF file.

Parameters:

  • filePath (required) - Absolute path to PDF file
  • options (optional) - Parsing options
    • extractMetadata (boolean) - Extract metadata (default: true)
    • maxPages (number) - Maximum pages to extract

Example:

{
  "filePath": "C:\\Documents\\report.pdf",
  "options": {
    "extractMetadata": true,
    "maxPages": 10
  }
}
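
For clients that talk to the server directly, the arguments above would travel inside an MCP `tools/call` request. The sketch below builds such a request in Python. The helper names `is_absolute` and `parse_pdf_request` are hypothetical, and the validation rules are assumptions drawn from the parameter descriptions in this README, not the server's actual checks.

```python
from pathlib import PurePosixPath, PureWindowsPath


def is_absolute(path: str) -> bool:
    """True for absolute Windows (drive letter) or POSIX (leading slash) paths."""
    return PureWindowsPath(path).is_absolute() or PurePosixPath(path).is_absolute()


def parse_pdf_request(file_path, max_pages=None, extract_metadata=True, request_id=1):
    """Build a JSON-RPC 2.0 `tools/call` request for the parse-pdf tool."""
    if not is_absolute(file_path):
        raise ValueError(f"filePath must be absolute: {file_path!r}")
    if max_pages is not None and max_pages < 1:
        raise ValueError("maxPages must be a positive integer")

    options = {"extractMetadata": extract_metadata}
    if max_pages is not None:
        # Omit maxPages entirely when no limit is wanted.
        options["maxPages"] = max_pages

    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "parse-pdf",
            "arguments": {"filePath": file_path, "options": options},
        },
    }
```

Calling `parse_pdf_request("C:\\Documents\\report.pdf", max_pages=10)` produces a request whose `arguments` match the example above.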

analyze-pdf

Get detailed PDF information without extracting text.

Parameters:

  • filePath (required) - Absolute path to PDF file

Returns:

  • Metadata (title, author, dates)
  • Page count
  • PDF version
  • File size

list-pdf-files

List all PDF files in a directory.

Parameters:

  • directory (required) - Directory path to scan

Returns:

  • Array of PDF file paths
  • File sizes
  • File names
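
To illustrate the kind of result described above, here is a rough local equivalent in Python. It is not the package's implementation. The field names `name`, `path`, and `sizeBytes` are hypothetical, and the sketch assumes a non-recursive scan, which this README does not specify.

```python
from pathlib import Path


def list_pdf_files(directory: str) -> list[dict]:
    """List the PDF files directly inside `directory` with their sizes."""
    root = Path(directory)
    if not root.is_dir():
        raise NotADirectoryError(directory)
    return [
        {"name": p.name, "path": str(p.resolve()), "sizeBytes": p.stat().st_size}
        for p in sorted(root.iterdir())
        # Match the extension case-insensitively so "REPORT.PDF" is included.
        if p.is_file() and p.suffix.lower() == ".pdf"
    ]
```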

🔧 Troubleshooting

File Not Found

Problem: "File not found" or "Cannot read PDF" errors

Solutions:

  • Verify that the file path is absolute, not relative
  • Check that the file exists at the specified location
  • Ensure the file has a .pdf extension
  • Verify that you have read permission for the file
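
The four checks above can be run in one pass before calling the tool. This is a minimal sketch, meant to run on the machine where the PDF lives; the function name `preflight` and its messages are hypothetical and do not reflect the server's own error handling.

```python
import os
from pathlib import Path


def preflight(file_path: str) -> list[str]:
    """Return the problems that could explain a "File not found" error."""
    path = Path(file_path)
    problems = []
    if not path.is_absolute():
        problems.append("path is not absolute")
    if path.suffix.lower() != ".pdf":
        problems.append("file does not have a .pdf extension")
    if not path.is_file():
        problems.append("file does not exist at this location")
    elif not os.access(path, os.R_OK):
        # Only meaningful once we know the file exists.
        problems.append("file is not readable by the current user")
    return problems
```

An empty list means none of the four causes applies, so the problem is more likely in the PDF itself.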

Extraction Fails

Problem: PDF opens but text extraction fails

Solutions:

  • Check that the PDF is not password-protected
  • Verify that the PDF contains actual text, not scanned images
  • For image-based PDFs, run OCR first (not included in this tool)
  • Try extracting fewer pages with maxPages

Garbled Text

Problem: Extracted text is unreadable or has weird characters

Solutions:

  • The PDF may use a non-standard encoding
  • Open the file in a different PDF reader to verify its content
  • Check whether the PDF is corrupted
  • Some encrypted PDFs may produce garbled output

Performance Issues

Problem: Large PDFs take too long to parse

Solutions:

  • Use the maxPages option to limit extraction
  • Process PDFs in smaller chunks
  • Close other applications to free up memory
  • Consider splitting large PDFs into smaller files

💡 Use Cases

Document Analysis

Extract and summarize this PDF report: C:\Reports\Q4-2024.pdf

Batch Processing

List all PDFs in C:\Documents then extract text from each

Research & Data Extraction

Extract first 5 pages from C:\Papers\research.pdf and find key findings

Content Migration

Extract all text from old PDFs in C:\Archive for new system

Metadata Inspection

Analyze these PDFs and show their metadata: C:\Downloads\*.pdf

📝 Notes

Supported PDF Types

  • ✅ Text-based PDFs (created from Word, LaTeX, etc.)
  • ✅ PDFs with embedded fonts
  • ✅ Multi-page documents
  • ❌ Image-only PDFs (requires OCR)
  • ❌ Password-protected PDFs
  • ⚠️ Scanned documents (may need OCR)

Performance Tips

  • Use maxPages for large documents
  • Process PDFs in batches
  • Extract metadata first to check page count
  • Close other applications for large files
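
The "metadata first" tip can be sketched as follows: take the page count reported by analyze-pdf and only set maxPages when the document exceeds a budget. The function name `parse_arguments_for` and the default budget of 50 pages are illustrative assumptions, not values from this package.

```python
def parse_arguments_for(file_path: str, page_count: int, page_budget: int = 50) -> dict:
    """Build parse-pdf arguments, capping extraction for long documents.

    `page_count` is the number of pages reported by a prior analyze-pdf call.
    """
    if page_budget < 1:
        raise ValueError("page_budget must be a positive integer")
    # Metadata was already fetched via analyze-pdf, so skip it here.
    options = {"extractMetadata": False}
    if page_count > page_budget:
        options["maxPages"] = page_budget
    return {"filePath": file_path, "options": options}
```

Short documents are parsed in full, while long ones are truncated to the budget so that a single call stays fast.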

Limitations

  • No OCR support (image-based PDFs)
  • No password-protected PDF support
  • No PDF editing/creation
  • Text extraction only (no images)

🔒 Security

This tool:

  • ✅ Only reads PDF files (no modifications)
  • ✅ Works with local files only
  • ✅ No data sent to external services
  • ✅ No file system modifications
  • ✅ Requires explicit file paths

📄 License

MIT © Charles Annoni



🙏 Credits

Created by Charles Annoni (GDM-Pixel)

Part of the Nova-Mind ecosystem, an AI-powered coaching platform.


🆘 Need OCR?

For image-based PDFs or scanned documents, you'll need an OCR solution. Consider:

  • Adobe Acrobat Pro (commercial)
  • Tesseract OCR (open source)
  • Online OCR services
  • Dedicated OCR MCP server (coming soon)