n8n-nodes-ldxhub
v0.8.1
✅ Free to try — 25,000 credits/month, no credit card required
✅ One key for everything — OpenAI, Anthropic, Google, AWS, Azure, xAI
✅ 30-second sign-up — GitHub, Google, or email; your API key is shown immediately
n8n community node for LDX hub — AI-powered document processing platform: structured data extraction (StructFlow), XLIFF translation refinement (RefineLoop), layout-preserving OCR (RenderOCR), text-based PDF conversion (CastDoc), and plain-text/JSONL extraction (ExtractDoc).
Table of Contents
- Features
- Installation
- Prerequisites
- Credentials Setup
- Usage
- AI Agent Integration
- Polling Settings
- Troubleshooting
- Support
- Changelog
- License
Features
- StructFlow: Extract structured JSON from unstructured text using AI models (medical records, customer feedback, legal documents, and more)
- RefineLoop: Iteratively improve XLIFF translation quality using frontier AI models (Google Gemini, Anthropic Claude, OpenAI GPT, and more)
- RenderOCR: Convert PDFs and images to Word/Excel/PowerPoint with layout-preserving OCR (via industry-leading OCR engines)
- CastDoc: Convert text-based PDFs to Word/Excel/PowerPoint without OCR (high-fidelity layout preservation for digital-born documents)
- ExtractDoc: Extract plain text or JSONL from PDF/DOCX/XLSX/PPTX using the `ki/extract` engine (no AI, no OCR, free tier); ideal as a preprocessing step before StructFlow
- One credential for all AI providers: OpenAI, Anthropic Claude, Google Gemini, AWS Nova, Azure OpenAI, xAI Grok, all accessed through a single LDX hub API key
- HTTP long-polling architecture — compatible with n8n Cloud execution model
- Proven at scale: tested with 1.19M-character academic papers
Installation
In your n8n instance:
- Go to Settings → Community Nodes
- Click Install
- Enter `n8n-nodes-ldxhub` and confirm
Prerequisites
- An LDX hub account
- n8n version supporting community nodes (v1.x or later)
- A paid or free tier subscription (free tier includes 25,000 credits/month, suitable for evaluation)
Credentials Setup
- Sign up at the LDX hub DevPortal — free, no credit card. Use GitHub, Google, or email; your API key is shown immediately after sign-up.
- Choose a subscription plan — start with Free (25,000 credits/month) to evaluate, or pick Starter / Standard / Pro for production. See pricing for details.
- Click your account email in the top-right of the DevPortal, then select My Subscriptions
- Under API Keys, copy the Current Key
- In n8n, create a new LDXhub API credential:
  - Base URL: `https://gw.ldxhub.io` (default; leave as-is for production)
  - API Key: paste the key from step 4
- Click Save — n8n will automatically test the credential by listing available models
If the credential test fails, verify:
- The API key is active (shown with a green dot in the DevPortal)
- The Base URL has no trailing slash
- Your network allows outbound HTTPS to `gw.ldxhub.io`
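The Base URL checks above can be run locally before pasting the value into n8n. This is a minimal sketch, assuming Python 3.9+ is available; the helper name `base_url_problems` is hypothetical and is not part of the node or of LDX hub.

```python
from urllib.parse import urlparse


def base_url_problems(base_url: str) -> list[str]:
    """Return the problems found in a Base URL (an empty list means it looks fine)."""
    problems = []
    parsed = urlparse(base_url)
    if parsed.scheme != "https":
        problems.append("scheme should be https")
    if not parsed.netloc:
        problems.append("host is missing")
    if base_url.endswith("/"):
        problems.append("remove the trailing slash")
    return problems


print(base_url_problems("https://gw.ldxhub.io"))   # []
print(base_url_problems("https://gw.ldxhub.io/"))  # ['remove the trailing slash']
```

The sketch only inspects the string; it does not confirm that the API key is active or that the network can reach the gateway.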
Usage
💡 Quick start: Import ready-to-use example workflows from `examples/`. Each example needs an LDXhub API credential, and its placeholder input file path must be updated.
- Add LDXhub API credentials (see Credentials Setup)
- Add the LDXhub node to your workflow
- Select a resource and operation (see below)
StructFlow — extract structured data from text
- Resource: StructFlow → Operation: Run Extraction Job
- Configure Model, System Prompt, and Example Output
- Choose Input Mode:
- Inline Inputs: Provide ID + Data pairs directly in the workflow (good for small batches, quick prototyping)
- Binary File: Provide a JSONL file as binary input (good for large batches, or as part of an ExtractDoc → StructFlow pipeline)
Examples: Inline mode · Binary mode
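For Binary File mode, the JSONL file can be prepared outside n8n. The sketch below assumes one JSON object per line, mirroring the ID + Data pairs of Inline Inputs. The field names `id` and `data` and the helper `to_jsonl` are assumptions for illustration only; check the LDX hub documentation or the bundled example workflows for the exact schema.

```python
import json


def to_jsonl(records: list[tuple[str, str]]) -> str:
    """Serialize (id, data) pairs as JSONL: one JSON object per line."""
    lines = []
    for record_id, data in records:
        # Field names "id" and "data" are assumed, not confirmed by this README.
        lines.append(json.dumps({"id": record_id, "data": data}, ensure_ascii=False))
    return "".join(line + "\n" for line in lines)


records = [
    ("fb-001", "The app crashes when I upload a photo."),
    ("fb-002", "Great support, my issue was solved in minutes."),
]
jsonl_text = to_jsonl(records)
print(jsonl_text, end="")
```

Writing `jsonl_text` to a file with UTF-8 encoding gives a file that an upstream n8n node could read as binary input.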
RefineLoop — XLIFF translation refinement
- Resource: RefineLoop → Operation: Run Refinement Job
- Provide an XLIFF file via binary input
- Choose an AI model and set max revisions
Example: RefineLoop workflow
RenderOCR — PDF/image to Office
- Resource: RenderOCR → Operation: Run Conversion Job
- Provide a PDF or image file via binary input
- Choose an OCR engine, target language, and output format (docx/xlsx/pptx)
Example: RenderOCR workflow
CastDoc — text-based PDF to Office (no OCR)
- Resource: CastDoc → Operation: Run Conversion Job
- Provide a PDF file via binary input
- Choose an engine and output format (docx/xlsx/pptx)
Example: CastDoc workflow
ExtractDoc — plain text or JSONL extraction (no AI, no OCR)
- Resource: ExtractDoc → Operation: Run Conversion Job
- Provide a PDF / DOCX / XLSX / PPTX file via binary input
- Choose the `ki/extract` engine and output format (`text` or `jsonl`)
- Input: Binary file (PDF / DOCX / XLSX / PPTX)
- Output: Plain text (`.txt`) or JSONL (`.jsonl`)
- Use case: Preprocessing step before StructFlow (the Accordion pattern); also useful as a standalone free text extractor
- Pricing: Free tier (no AI, no OCR)
- Engine: `ki/extract`
AI Agent Integration
The LDXhub node is marked as usableAsTool: true, so it can be attached to an AI Agent node as a tool. This enables agentic workflows where an AI agent autonomously decides when to extract structured data, translate documents, or convert files using LDX hub.
Example use cases:
- Customer support agent that extracts structured complaint data from incoming emails (StructFlow)
- Document processing agent that automatically OCRs and translates uploaded PDFs (RenderOCR → RefineLoop)
- Knowledge base ingestion agent that converts and structures diverse document formats (CastDoc → StructFlow)
Polling Settings
For large documents, jobs may take several minutes. The node polls until completion:
| Setting | Default | Description |
|---|---|---|
| Max Polling Attempts | 180 | Maximum number of poll requests |
| Server Wait Seconds | 10 | Server-side long-poll wait per request |
Theoretical max wait (in seconds) = Max Polling Attempts × Server Wait Seconds.
The defaults give 180 × 10 = 1,800 seconds (30 minutes). For longer documents, increase Max Polling Attempts
(e.g., 360 for 60 minutes).
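The wait-time arithmetic above can be checked with a few lines of Python. This is only a sketch of the formula; the function names are hypothetical and nothing here calls the node or the LDX hub API.

```python
import math


def max_wait_seconds(max_polling_attempts: int, server_wait_seconds: int) -> int:
    """Theoretical maximum wait: attempts multiplied by the per-request wait."""
    return max_polling_attempts * server_wait_seconds


def attempts_for(target_minutes: float, server_wait_seconds: int = 10) -> int:
    """Smallest Max Polling Attempts value that covers target_minutes."""
    return math.ceil(target_minutes * 60 / server_wait_seconds)


print(max_wait_seconds(180, 10) / 60)  # 30.0 (the default window, in minutes)
print(attempts_for(60))                # 360
print(attempts_for(45))                # 270
```

Rounding up in `attempts_for` keeps the configured window at least as long as the target.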
n8n Cloud users: your plan's workflow execution timeout applies independently. Check your plan's limits.
Troubleshooting
401 Unauthorized
- The API key is invalid, revoked, or expired
- Roll the key from My Subscriptions → API Keys → Roll API Key in the LDX hub DevPortal, then update the n8n credential
400 Bad Request — invalid file_id
- The binary input is missing or the binary field name is incorrect
- Check that the previous node outputs a binary property matching the Input Binary Field setting (default: `data`)
Job times out / polling exhausted
- Large documents may exceed the default 30-minute window
- Increase Max Polling Attempts in Polling Settings
- For n8n Cloud, also check your plan's workflow execution timeout
StructFlow Inline mode — empty results
- Ensure Inputs collection has at least one record with non-empty ID and Data fields
- Verify Example Output is valid JSON
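Both checks above can be scripted before a run. This is a minimal sketch, assuming each inline record is represented as a dictionary with `id` and `data` keys; that representation and the helper name `inline_input_problems` are illustrative and do not reflect the node's internal format.

```python
import json


def inline_input_problems(inputs: list[dict], example_output: str) -> list[str]:
    """Return the problems found in inline inputs and the Example Output string."""
    problems = []
    if not inputs:
        problems.append("Inputs collection is empty")
    for index, record in enumerate(inputs):
        # Treat missing, None, and whitespace-only values as empty.
        if not str(record.get("id") or "").strip():
            problems.append(f"record {index}: ID is empty")
        if not str(record.get("data") or "").strip():
            problems.append(f"record {index}: Data is empty")
    try:
        json.loads(example_output)
    except json.JSONDecodeError as error:
        problems.append(f"Example Output is not valid JSON: {error.msg}")
    return problems


good = inline_input_problems(
    [{"id": "1", "data": "Patient reports mild fever."}],
    '{"symptom": "fever"}',
)
print(good)  # []

# Unquoted keys are a common reason for Example Output failing to parse.
print(inline_input_problems([{"id": "", "data": "text"}], "{symptom: fever}"))
```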
Credit limit exceeded
- Your subscription's monthly credit allowance has been reached
- Check usage in the DevPortal's My Subscriptions page
- Upgrade your plan or wait for the next billing period
Support
- Product: https://ldxlab.io/ldxhub
- Documentation: https://gw.portal.ldxhub.io/introduction
- Bug reports & feature requests: GitHub Issues
Changelog
See CHANGELOG.md for the full version history.
License
Copyright (c) 2026 Kawamura International Co., Ltd.
