n8n-nodes-mineru

v0.1.10

Free and comprehensive document parsing capabilities

📖 Introduction

n8n-nodes-mineru is an n8n community node package that integrates the MinerU document parsing API, giving you free and comprehensive document parsing capabilities. It parses PDF, Word, PowerPoint, and image files, and automatically recognizes text, tables, formulas, and image content.

✨ Key Features

  • 🚀 Multi-format Support: Supports PDF, DOC, DOCX, PPT, PPTX, PNG, JPG, JPEG and other formats
  • 🧠 Intelligent Recognition: Automatically recognizes text, tables, formulas, and images in documents
  • 🌐 Dual Service Modes: Supports both online API service and local self-deployed service
  • 📊 Multiple Output Formats: Supports Markdown, JSON, DOCX, HTML, LaTeX and other output formats
  • 🔧 Flexible Configuration: Provides rich parameter configuration options to meet different scenario requirements
  • 🌍 Multi-language Support: Supports Chinese, English, and automatic language detection

📦 Included Nodes

1. MinerU Node

  • Function: Uses the MinerU online API service to parse documents
  • Features: Automatically creates a parsing task, waits for the result, and returns it as a ZIP file
  • Use Case: Users who want to use the official API service

2. MinerU Custom Service Node

  • Function: Connects to a self-deployed MinerU API server
  • Features: Supports local file upload and offers more custom configuration options
  • Use Case: Users who self-deploy MinerU or need more control

🛠️ Installation

Method 1: Install via n8n Community Nodes

  1. Open the n8n interface
  2. Go to Settings > Community Nodes
  3. Click Install Community Node
  4. Enter the package name: n8n-nodes-mineru
  5. Click Install

Method 2: Install via npm

# Execute in n8n root directory
npm install n8n-nodes-mineru

Method 3: Manual Installation

# Clone repository
git clone https://github.com/opendatalab/awsome-mineru.git
cd awsome-mineru/n8n-nodes-mineru

# Install dependencies
npm install

# Build project
npm run build

# Link to n8n (for development environment)
npm link

🔑 Credential Configuration

MinerU API Credentials

  1. Create new credentials in n8n
  2. Select MinerU API type
  3. Enter your API Token
  4. Save credentials

Get API Token: obtain a valid token from the MinerU official website.

📋 Usage Guide

MinerU Node Usage

  1. Add Node: Add the "MinerU" node to your workflow
  2. Configure Credentials: Select the MinerU API credentials you created
  3. Set Parameters:
    • Document URL: Link to the document to be parsed (required)
    • Enable OCR: Whether to enable image text recognition
    • Enable Formula Recognition: Whether to recognize mathematical formulas
    • Enable Table Recognition: Whether to recognize table structures
    • Document Language: The main language of the document
    • Extra Export Format: Additional output formats to produce
    • Model Version: The MinerU model version to use
  4. Execute Node: The node automatically creates a parsing task and waits for it to complete
  5. Get Results: Once parsing completes, the node returns a ZIP file containing all results

MinerU Custom Service Node Usage

  1. Deploy Service: Deploy a MinerU API server first
  2. Add Node: Add the "MinerU Custom Service" node to your workflow
  3. Configure Parameters:
    • API Version: Select V1 or V2
    • File URL: Link to the document to be parsed
    • API Server Address: Your MinerU server address
    • Output Directory: Output directory for parsing results
    • Configure the remaining parameters according to the selected API version
  4. Execute Node: The node calls your server directly to parse the document

🔧 Parameter Description

Common Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| Document URL | String | - | URL address of the document to be parsed |
| Enable OCR | Boolean | false | Whether to enable optical character recognition |
| Enable Formula Recognition | Boolean | true | Whether to recognize mathematical formulas |
| Enable Table Recognition | Boolean | true | Whether to recognize table structures |
| Document Language | Option | Chinese | Main language of the document |

MinerU Node Specific Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| Data ID | String | - | Optional data identifier |
| Page Range | String | - | Specify the page range to parse |
| Extra Export Format | Multi-select | [] | Additional output formats besides default |
| Polling Interval | Number | 5 | Interval time to check task status (seconds) |
| Maximum Wait Time | Number | 10 | Maximum time to wait for task completion (minutes) |
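The Polling Interval and Maximum Wait Time parameters describe a poll-until-done loop. The sketch below illustrates that general pattern only; it is not the node's actual implementation. The `check_status` callable and the state names `"done"` and `"failed"` are hypothetical placeholders, and the defaults simply mirror the table above.

```python
import time
from typing import Callable


def wait_for_task(
    check_status: Callable[[], str],
    polling_interval_s: float = 5.0,   # "Polling Interval" default (seconds)
    max_wait_min: float = 10.0,        # "Maximum Wait Time" default (minutes)
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll check_status() until it reports a terminal state or time runs out.

    "done" and "failed" are assumed state names used for illustration only.
    """
    deadline = clock() + max_wait_min * 60
    while True:
        state = check_status()
        if state in ("done", "failed"):
            return state
        # Give up if the next poll would land after the deadline.
        if clock() + polling_interval_s > deadline:
            raise TimeoutError(f"task not finished after {max_wait_min} minutes")
        sleep(polling_interval_s)
```

With the defaults, a task that never finishes is polled roughly every 5 seconds and abandoned with a `TimeoutError` after about 10 minutes; raising Maximum Wait Time extends the deadline without changing how often the status is checked.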

Custom Service Node Specific Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| API Server Address | String | http://localhost:8000 | MinerU server address |
| Output Directory | String | ./output | Output directory for parsing results |
| Backend Engine | Option | pipeline | Processing engine type |
| Return Markdown | Boolean | true | Whether to return Markdown format results |

🌟 Usage Examples

Example 1: Parse PDF Document and Extract Text

{
  "nodes": [
    {
      "name": "MinerU",
      "type": "n8n-nodes-mineru.mineru",
      "parameters": {
        "url": "https://example.com/document.pdf",
        "isOcr": true,
        "enableFormula": true,
        "enableTable": true,
        "language": "ch",
        "extraFormats": ["docx", "html"]
      }
    }
  ]
}

Example 2: Parse Multiple Format Documents Using Custom Service

{
  "nodes": [
    {
      "name": "MinerU Custom Service",
      "type": "n8n-nodes-mineru.mineruCustom",
      "parameters": {
        "apiVersion": "v2",
        "fileUrl": "https://example.com/presentation.pptx",
        "serverUrl": "http://your-mineru-server:8000",
        "langList": "auto",
        "formulaEnable": true,
        "tableEnable": true,
        "returnMd": true
      }
    }
  ]
}

🚀 Advanced Usage

Batch Document Processing

Combine the MinerU node with other n8n nodes to process documents in batches:

  1. Use an HTTP Request node to get the document list
  2. Use a Split In Batches node to split the list into batches
  3. Use the MinerU node to parse each document
  4. Use a Merge node to combine the results
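The batching steps above can be sketched outside n8n as plain Python, for example to prepare parameter sets before feeding them to a workflow. This is an illustration under assumptions: `split_in_batches` and `build_node_parameters` are hypothetical helpers, and the parameter keys are copied from Example 1 with the defaults listed under Common Parameters.

```python
from typing import Dict, Iterator, List, Sequence


def split_in_batches(items: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


def build_node_parameters(url: str) -> Dict[str, object]:
    """Build one MinerU parameter set; keys follow Example 1 of this README."""
    return {
        "url": url,
        "isOcr": False,         # table default: OCR off
        "enableFormula": True,  # table default: formula recognition on
        "enableTable": True,    # table default: table recognition on
        "language": "ch",       # value used in Example 1
    }


if __name__ == "__main__":
    urls = [f"https://example.com/doc-{i}.pdf" for i in range(5)]
    for batch in split_in_batches(urls, batch_size=2):
        print([build_node_parameters(u)["url"] for u in batch])
```

Smaller batches keep the number of parsing tasks in flight low, at the cost of more workflow iterations.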

Result Post-processing

After parsing completes, you can:

  1. Use a Move Binary Data node to process the returned files
  2. Use an HTTP Request node to upload the results to cloud storage
  3. Use an Email node to send the parsing results
  4. Use a Webhook node to trigger subsequent processes
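If you post-process the returned ZIP file outside n8n, a sketch like the following pulls out the Markdown results. It assumes the archive contains one or more `.md` files; the README does not document the archive layout, so the file names in the usage comment are hypothetical.

```python
import io
import zipfile
from typing import Dict


def read_markdown_from_zip(zip_bytes: bytes) -> Dict[str, str]:
    """Return {member name: text} for every .md file in the archive."""
    results: Dict[str, str] = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            if name.lower().endswith(".md"):
                # Assumes UTF-8 encoded Markdown.
                results[name] = archive.read(name).decode("utf-8")
    return results


# Usage (hypothetical file name):
# with open("result.zip", "rb") as fh:
#     for name, text in read_markdown_from_zip(fh.read()).items():
#         print(name, len(text))
```

Working on bytes rather than a file path lets the same function handle binary data read from disk or received over HTTP.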

🔍 Troubleshooting

Common Issues

Q: Node execution fails with "API Token verification failed"
A: Check that your API Token is correct and that you obtained it from the MinerU official website.

Q: Document parsing times out
A: Increase the "Maximum Wait Time" parameter, or check whether the document is too large.

Q: Custom service connection failed
A: Make sure your MinerU server is running and the network connection is stable.

Q: Some document formats cannot be parsed
A: Confirm that the document format is in the supported list and that the file is not corrupted.
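For the last question, a quick pre-check of the file extension can catch unsupported inputs before a task is created. The sketch below uses only the formats named under Key Features; the README says "and other formats", so treat this list as an assumption rather than the complete set. `has_supported_extension` is a hypothetical helper, not part of the package.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Formats explicitly listed in this README; the real list may be longer.
SUPPORTED_EXTENSIONS = {
    ".pdf", ".doc", ".docx", ".ppt", ".pptx", ".png", ".jpg", ".jpeg",
}


def has_supported_extension(document_url: str) -> bool:
    """Check the URL path's extension, ignoring case and query strings."""
    path = urlparse(document_url).path
    return PurePosixPath(path).suffix.lower() in SUPPORTED_EXTENSIONS
```

This only inspects the URL, so a link without a file extension is reported as unsupported even if the server would return a valid PDF.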

Debugging Tips

  1. Enable Node Debug: Enable the "Continue On Fail" option in the node settings
  2. Check Error Logs: Check the n8n error logs for detailed information
  3. Test Connection: Test with a simple document first to confirm the connection works
  4. Check Parameters: Ensure all required parameters are set correctly
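When a custom service connection fails, it can help to confirm that the server address is reachable at all before debugging the workflow. The following sketch checks only that a TCP connection can be opened to the host and port in the API Server Address; it does not call any MinerU endpoint, and `can_connect` is a hypothetical helper.

```python
import socket
from urllib.parse import urlparse


def can_connect(server_url: str, timeout_s: float = 3.0) -> bool:
    """Return True if a TCP connection to the URL's host and port succeeds."""
    parsed = urlparse(server_url)
    if parsed.hostname is None:
        return False
    # Fall back to the scheme's usual port when none is given.
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    try:
        with socket.create_connection((parsed.hostname, port), timeout=timeout_s):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Default address from the Custom Service parameter table.
    print(can_connect("http://localhost:8000"))
```

A `True` result rules out firewall and addressing problems but does not prove the MinerU service itself is healthy.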

🤝 Contributing

We welcome community contributions! If you want to contribute to the project:

  1. Fork this repository
  2. Create your feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

📄 License

This project is licensed under the MIT License - see the LICENSE.md file for details.

If this project helps you, please give us a ⭐️!