@bergetai/n8n-nodes-berget-ai-ocr

v1.0.4

n8n node for Berget AI OCR document processing

n8n-nodes-berget-ai-ocr

n8n node for Berget AI OCR document processing - extract text from PDFs, images, and documents.

Installation

Community Nodes (Recommended)

  1. Open n8n
  2. Go to Settings > Community Nodes
  3. Click Install a community node
  4. Enter: @bergetai/n8n-nodes-berget-ai-ocr
  5. Click Install

Manual Installation

# In your n8n project
npm install @bergetai/n8n-nodes-berget-ai-ocr

Configuration

  1. Add the node to your workflow
  2. Configure the node parameters:
    • API Key: Your Berget AI API key
    • Document Type: URL or Base64
    • Document Source: The document URL, or the base64-encoded document data
    • Processing Mode: Synchronous or asynchronous
    • Options: OCR method, output format, table mode, etc.

Features

  • Multiple Input Types: URLs and base64 encoded documents
  • Async Processing: Handle large documents asynchronously
  • Multiple OCR Engines: EasyOCR, Tesseract, RapidOCR, etc.
  • Table Extraction: Accurate or fast table processing
  • Multiple Formats: Markdown and JSON output
  • Image Support: Include images in output
  • Document Types: PDF, DOCX, PPTX, HTML support

Supported Document Types

  • PDF - Portable Document Format
  • DOCX - Microsoft Word documents
  • PPTX - Microsoft PowerPoint presentations
  • HTML - Web pages and HTML documents
  • Images - JPG, PNG, TIFF, etc.

OCR Engines

  • EasyOCR - Recommended, supports 80+ languages
  • Tesseract - Classic OCR engine
  • RapidOCR - Fast processing
  • OCR Mac - macOS native OCR
  • TesserOCR - Python wrapper for Tesseract

Examples

Basic Document Processing

{
  "operation": "process",
  "documentType": "url",
  "documentUrl": "https://example.com/document.pdf",
  "async": false,
  "options": {
    "outputFormat": "md",
    "tableMode": "accurate",
    "ocrMethod": "easyocr"
  }
}

Async Processing for Large Documents

{
  "operation": "process",
  "documentType": "url",
  "documentUrl": "https://example.com/large-document.pdf",
  "async": true,
  "options": {
    "outputFormat": "json",
    "tableMode": "fast",
    "ocrMethod": "rapidocr"
  }
}

Base64 Document Processing

{
  "operation": "process",
  "documentType": "base64",
  "documentData": "JVBERi0xLjQKJcOkw7zDtsO...",
  "options": {
    "outputFormat": "md",
    "includeImages": true
  }
}
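
The documentData value in the example above looks like standard base64 (JVBERi0x decodes to %PDF-1). A minimal Python sketch of producing such a value from raw bytes follows. The helper name build_base64_params is hypothetical, and whether the node accepts other encodings (for example a data: URI prefix) is not stated here, so treat this as an assumption.

```python
import base64


def build_base64_params(raw: bytes, output_format: str = "md") -> dict:
    """Wrap raw document bytes in parameters shaped like the example above.

    Hypothetical helper: the field names are copied from the example,
    everything else is an assumption.
    """
    return {
        "operation": "process",
        "documentType": "base64",
        # Standard base64 without line breaks, decoded to str so the
        # resulting dict can be serialized as JSON.
        "documentData": base64.b64encode(raw).decode("ascii"),
        "options": {"outputFormat": output_format},
    }
```

To encode a local file, read it in binary mode first, e.g. build_base64_params(open("document.pdf", "rb").read()).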

Output Format

Synchronous Processing

{
  "content": "# Document Title\n\nExtracted text content...",
  "usage": {
    "pages": 5,
    "characters": 2492
  },
  "metadata": {
    "filename": "document.pdf",
    "pageCount": 5,
    "fileType": "application/pdf",
    "processingTime": 7787
  },
  "processing_mode": "synchronous"
}
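
A downstream step usually needs only a few of these fields. The Python sketch below pulls them out of a synchronous response, assuming the response has exactly the shape shown above. The function name summarize_sync_result is hypothetical and not part of the package.

```python
import json


def summarize_sync_result(raw_json: str) -> dict:
    """Extract the text and a few counters from a synchronous response.

    Assumes the shape of the example above; the optional blocks are read
    with .get() so a missing "usage" or "metadata" does not raise.
    """
    result = json.loads(raw_json)
    usage = result.get("usage", {})
    metadata = result.get("metadata", {})
    return {
        "text": result["content"],
        "pages": usage.get("pages"),
        "characters": usage.get("characters"),
        "filename": metadata.get("filename"),
    }
```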

Asynchronous Processing

{
  "taskId": "d11234-5678-9101-1121",
  "status": "pending",
  "resultUrl": "/v1/ocr/result/d11234-5678-9101-1121",
  "processing_mode": "asynchronous",
  "message": "Document processing started. Use the taskId to check status."
}
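
One way to consume this response is to poll resultUrl until the status is no longer "pending". The Python sketch below rests on assumptions: the shape of the result endpoint's reply is not documented here, so the loop only checks for "pending" (the one status shown above), and the HTTP call is passed in as a fetch callable rather than hard-coded. The name wait_for_result and its parameters are hypothetical.

```python
import time
from typing import Callable


def wait_for_result(
    task: dict,
    fetch: Callable[[str], dict],
    interval: float = 2.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll task["resultUrl"] until the reply is no longer pending.

    `fetch` takes the result URL and returns the decoded JSON reply.
    Assumption: a reply whose "status" is anything other than "pending"
    (or is absent) is treated as final.
    """
    url = task["resultUrl"]
    for attempt in range(max_attempts):
        reply = fetch(url)
        if reply.get("status") != "pending":
            return reply
        # Wait before the next attempt, but not after the last one.
        if attempt < max_attempts - 1:
            sleep(interval)
    raise TimeoutError(f"task {task.get('taskId')} still pending after {max_attempts} attempts")
```

Injecting fetch keeps the base URL and authentication out of the sketch. In practice it would wrap an authenticated HTTP GET against the Berget AI API.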

Processing Modes

Synchronous (Default)

  • Immediate processing and response
  • Best for small to medium documents
  • Response includes extracted content directly

Asynchronous

  • Background processing for large documents
  • Returns a task ID for status checking
  • Results are fetched with a separate API call, using the returned task ID
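
Both modes report a processing_mode field in the example responses, so a workflow step can branch on it. The Python sketch below assumes the two values seen in those examples ("synchronous" and "asynchronous") are the only ones. The function name handle_response is hypothetical.

```python
def handle_response(response: dict) -> tuple[str, str]:
    """Return ("content", text) for a synchronous response or
    ("task", task_id) for an asynchronous one.

    Assumes the processing_mode values from the example responses.
    """
    mode = response.get("processing_mode")
    if mode == "synchronous":
        return ("content", response["content"])
    if mode == "asynchronous":
        return ("task", response["taskId"])
    raise ValueError(f"unexpected processing_mode: {mode!r}")
```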

Advanced Options

Table Extraction

  • Accurate: Slower but better table structure recognition
  • Fast: Quicker processing with basic table extraction

Output Formats

  • Markdown: Clean, readable text format
  • JSON: Structured data with metadata

OCR Options

  • Perform OCR: Enable/disable text extraction
  • Table Structure: Extract table layouts
  • Include Images: Embed images as base64
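
These options map onto the options object used in the Examples section. The Python sketch below builds that object and rejects values not seen in those examples. The key names (outputFormat, tableMode, ocrMethod, includeImages) come from the examples, while the helper build_options and its validation are assumptions. The OCR method is passed through unchecked because only "easyocr" and "rapidocr" appear in the examples and the identifiers for the other engines are not listed.

```python
# Values taken from the examples in this README; the node may accept more.
OUTPUT_FORMATS = {"md", "json"}
TABLE_MODES = {"accurate", "fast"}


def build_options(
    output_format: str = "md",
    table_mode: str = "accurate",
    ocr_method: str = "easyocr",
    include_images: bool = False,
) -> dict:
    """Build an options object shaped like the ones in the examples."""
    if output_format not in OUTPUT_FORMATS:
        raise ValueError(f"outputFormat must be one of {sorted(OUTPUT_FORMATS)}")
    if table_mode not in TABLE_MODES:
        raise ValueError(f"tableMode must be one of {sorted(TABLE_MODES)}")
    return {
        "outputFormat": output_format,
        "tableMode": table_mode,
        "ocrMethod": ocr_method,  # not validated: full identifier list unknown
        "includeImages": include_images,
    }
```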

Use Cases

  • Document Digitization: Convert scanned PDFs to text
  • Data Extraction: Extract structured data from forms
  • Content Analysis: Process documents for AI analysis
  • Archive Processing: Digitize historical documents
  • Invoice Processing: Extract data from invoices
  • Contract Analysis: Process legal documents

Pricing

OCR processing is billed per page. See current pricing at berget.ai/models.
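
Because billing is per page, a rough estimate is the page count multiplied by the per-page price. The Python sketch below shows that arithmetic. The function estimate_cost is hypothetical and the price must be supplied by the caller, since no actual price is given here.

```python
from decimal import Decimal


def estimate_cost(pages: int, price_per_page: Decimal) -> Decimal:
    """Estimate cost as pages * price_per_page.

    `price_per_page` is a caller-supplied value taken from the current
    price list; nothing here reflects actual Berget AI pricing.
    """
    if pages < 0:
        raise ValueError("pages must be non-negative")
    # Decimal avoids binary floating-point rounding in currency arithmetic.
    return pages * price_per_page
```

The pages argument could come from usage.pages in a synchronous response.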

Testing

# Test node structure
npm test

# Test with real API
BERGET_API_KEY=your-key npm test

# Link locally for n8n testing
npm run test:local

Support

License

MIT License - See LICENSE file for details.