n8n-nodes-ocr-ai
v1.0.3-104
n8n community node for OCR and document extraction using multiple AI providers (Gemini, OpenAI, Claude, Grok, Vertex)
This is an n8n community node for document extraction using AI. It wraps the ocr-ai library (v1.0.3) to provide OCR and structured data extraction capabilities.
Features
- Multi-provider support: Gemini, OpenAI, Claude, Grok, and Vertex AI
- Text extraction: Extract plain text from documents
- JSON extraction: Extract structured data matching a custom schema
- Multiple input types: Binary data, URL, or Base64
- Supported formats: PDF, images (jpg, png, gif, webp, bmp, tiff), text files
Installation
Community Nodes (Recommended)
- Go to Settings > Community Nodes
- Select Install
- Enter `n8n-nodes-ocr-ai` in the package name field
- Accept the risks and click Install
Manual Installation
cd ~/.n8n/nodes
pnpm install n8n-nodes-ocr-ai
Credentials
Configure credentials based on the provider you want to use:
| Provider | Credential Type | Required Fields |
|----------|----------------|-----------------|
| Gemini | OCR AI Gemini API | API Key |
| OpenAI | OCR AI OpenAI API | API Key |
| Claude | OCR AI Claude API | API Key |
| Grok | OCR AI Grok API | API Key |
| Vertex AI | OCR AI Vertex AI | Project ID, Location |
Usage
Extract Text
- Add the OCR AI node to your workflow
- Select a provider and configure credentials
- Choose Extract Text operation
- Select input type (Binary, URL, or Base64)
- Configure the input source
- Run the workflow
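For the Base64 input type, the document has to be encoded before it reaches the node. The following is a minimal sketch, assuming a Node.js environment such as an upstream script. `toBase64` is a hypothetical helper name, not part of this package. This README does not say whether the node expects a bare Base64 string or a data URI, so the sketch produces the bare string.

```typescript
import { Buffer } from "node:buffer";

// Hypothetical helper: encode raw document bytes as a Base64 string.
function toBase64(bytes: Uint8Array): string {
  return Buffer.from(bytes).toString("base64");
}

// Example input: the ASCII bytes of "Hello".
const sample = new Uint8Array([72, 101, 108, 108, 111]);
const encoded = toBase64(sample);
console.log(encoded); // SGVsbG8=
```

In practice the bytes would come from reading a PDF or image file rather than from a literal array.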
Extract JSON
- Add the OCR AI node
- Choose Extract JSON operation
- Define a JSON schema describing the structure you want to extract
- The node will return structured data matching your schema
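A downstream step may want to confirm that the returned data really has the requested shape. The following is a minimal sketch, assuming the schema notation used in this README: type names as strings, and a one-element array as the template for list items. `Schema` and `matchesSchema` are hypothetical names, not part of this node or of ocr-ai, and the sample values are invented for illustration.

```typescript
// Schema notation: a leaf is a type name, an array holds one template
// element, and an object maps field names to sub-schemas.
type Schema = "string" | "number" | Schema[] | { [key: string]: Schema };

function matchesSchema(value: unknown, schema: Schema): boolean {
  if (typeof schema === "string") {
    // Leaf: compare against the JavaScript typeof of the value.
    return typeof value === schema;
  }
  if (Array.isArray(schema)) {
    if (!Array.isArray(value)) return false;
    const template = schema[0];
    // An empty template array accepts any array.
    if (template === undefined) return true;
    return value.every((item) => matchesSchema(item, template));
  }
  // Object: every field named in the schema must be present and match.
  if (typeof value !== "object" || value === null || Array.isArray(value)) {
    return false;
  }
  const record = value as Record<string, unknown>;
  return Object.entries(schema).every(([key, sub]) => matchesSchema(record[key], sub));
}

const invoiceSchema: Schema = {
  invoice_number: "string",
  date: "string",
  total: "number",
  items: [{ description: "string", quantity: "number", price: "number" }],
};

// Invented sample output, for illustration only.
const extracted = {
  invoice_number: "INV-001",
  date: "2024-01-15",
  total: 42.5,
  items: [{ description: "Widget", quantity: 2, price: 21.25 }],
};

console.log(matchesSchema(extracted, invoiceSchema)); // true
console.log(matchesSchema({ ...extracted, total: "42.50" }, invoiceSchema)); // false
```

The check is deliberately strict about types and lenient about extra fields. Whether the node itself enforces the schema this way is not stated in this README.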
Example Schema
{
  "invoice_number": "string",
  "date": "string",
  "total": "number",
  "items": [{
    "description": "string",
    "quantity": "number",
    "price": "number"
  }]
}
Options
| Option | Description |
|--------|-------------|
| Custom Prompt | Guide the extraction with a custom prompt |
| Language | Set extraction language (default: auto) |
| Model | Override the default model for the provider |
| Temperature | Control randomness (0-2) |
| Max Tokens | Maximum tokens in response |
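If option values are computed by an earlier step, it can help to check them before they reach the node. The following is a minimal sketch. The `ExtractionOptions` field names and `validateOptions` are hypothetical and do not reflect the node's internal parameter names. The 0-2 temperature range comes from the table above, while treating Max Tokens as a positive integer is an assumption.

```typescript
// Hypothetical option shape mirroring the Options table.
interface ExtractionOptions {
  customPrompt?: string;
  language?: string;
  model?: string;
  temperature?: number;
  maxTokens?: number;
}

// Returns a list of human-readable problems; an empty list means the options look sane.
function validateOptions(options: ExtractionOptions): string[] {
  const problems: string[] = [];
  const { temperature, maxTokens } = options;
  // Range taken from the table: temperature is documented as 0-2.
  if (temperature !== undefined && !(temperature >= 0 && temperature <= 2)) {
    problems.push("temperature must be between 0 and 2");
  }
  // Assumption: a token limit only makes sense as a positive integer.
  if (maxTokens !== undefined && (!Number.isInteger(maxTokens) || maxTokens <= 0)) {
    problems.push("maxTokens must be a positive integer");
  }
  return problems;
}

console.log(validateOptions({ temperature: 0.2, maxTokens: 1024 })); // []
console.log(validateOptions({ temperature: 3 })); // [ 'temperature must be between 0 and 2' ]
```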
Default Models
| Provider | Default Model |
|----------|--------------|
| Gemini | gemini-1.5-flash |
| OpenAI | gpt-4o |
| Claude | claude-sonnet-4-20250514 |
| Grok | grok-2-vision-1212 |
| Vertex AI | gemini-2.0-flash |
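The Model option and these defaults combine as "use the override if given, otherwise the provider's default". The following is a minimal sketch of that lookup. The lowercase provider keys and the `resolveModel` name are assumptions made for illustration, and the node's actual implementation may differ.

```typescript
type Provider = "gemini" | "openai" | "claude" | "grok" | "vertex";

// Defaults copied from the Default Models table.
const DEFAULT_MODELS: Record<Provider, string> = {
  gemini: "gemini-1.5-flash",
  openai: "gpt-4o",
  claude: "claude-sonnet-4-20250514",
  grok: "grok-2-vision-1212",
  vertex: "gemini-2.0-flash",
};

// Use the override when it is a non-blank string, else fall back to the default.
function resolveModel(provider: Provider, override?: string): string {
  const trimmed = override?.trim();
  return trimmed ? trimmed : DEFAULT_MODELS[provider];
}

console.log(resolveModel("openai")); // gpt-4o
console.log(resolveModel("gemini", "gemini-2.0-flash")); // gemini-2.0-flash
```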
Development
# Install dependencies
pnpm install
# Build
pnpm build
# Development mode
pnpm dev
# Lint
pnpm lint
License
MIT
Dependencies
- ocr-ai v1.0.3
