n8n-nodes-rag-architect
v1.0.1
🧠 Transform any website into AI-ready knowledge chunks directly in n8n
This is an n8n community node for RAG-Architect - a powerful tool that converts web content into structured, AI-ready knowledge chunks.
Features
- 🌐 URL Processing - Extract and chunk content from any public URL
- 📄 Structure-Aware Chunking - Preserves document hierarchy and context
- 🔒 PII Scrubbing - Automatic redaction of emails, phones, and sensitive data
- ❓ Q&A Generation - Optional AI-generated question-answer pairs
- 🔗 Multiple Output Formats - n8n, LangChain, LlamaIndex, or raw JSON
- ⚡ Async Support - Process in background or wait for completion
Installation
Community Nodes (Recommended)
- Go to Settings > Community Nodes
- Click Install
- Enter n8n-nodes-rag-architect
- Click Install
Manual Installation
cd ~/.n8n/nodes
npm install n8n-nodes-rag-architect
Prerequisites
You need an Apify API token to use this node:
- Create an account at apify.com
- Go to Settings > Integrations
- Copy your API token
Usage
Basic Example: Process URLs
- Add the RAG-Architect node to your workflow
- Configure your Apify credentials
- Enter URLs to process (one per line or comma-separated)
- Choose your output format
- Execute!
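Downstream of the node, each chunk arrives as one n8n item. A sketch of a Code node that prepares chunks for embedding by prepending the contextHeader to the content (the sample item mirrors the output shape documented below; the function name is illustrative):

```javascript
// Prepend the contextHeader so the embedding model sees the
// source/section context together with the chunk text.
function toEmbeddingInput(chunk) {
  const header = chunk.contextHeader ? chunk.contextHeader + "\n" : "";
  return header + chunk.content;
}

// Sample input shaped like one chunk item from the node.
const items = [
  {
    json: {
      _type: "chunk",
      content: "RAG-Architect converts pages into chunks.",
      contextHeader: "[Source: example.com | Section: Features]",
      metadata: { source_url: "https://example.com" },
    },
  },
];

// In a real Code node, `items` is provided by n8n and this is the return value.
const out = items.map((item) => ({
  json: { text: toEmbeddingInput(item.json), metadata: item.json.metadata },
}));
```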
Operations
Process URLs
Transform web content into knowledge chunks.
Inputs:
- URLs - One or more URLs to process
- Output Format - n8n, LangChain, LlamaIndex, or raw
- Wait for Completion - Block until processing finishes, or return immediately and check the run later
Options:
- Generate Q&A Pairs
- PII Scrubbing settings
- Chunk size configuration
- Header splitting rules
Output:
{
"_type": "chunk",
"id": "chunk_abc123",
"content": "The extracted content...",
"contextHeader": "[Source: example.com | Section: Features]",
"metadata": {
"source_url": "https://example.com/docs",
"title": "Documentation",
"section": "Features",
"word_count": 150
},
"questions": [
{
"question": "What are the main features?",
"answer": "The main features include..."
}
]
}
Get Run Status
Check the status of an async processing run.
Get Results
Fetch results from a completed run.
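When not waiting for completion, the usual pattern is to poll the run status and fetch results once it succeeds. A generic polling sketch (the status strings follow Apify's run states; the status getter is injected here rather than tied to the node's exact fields):

```javascript
// Poll an async run until it finishes, fails, or times out.
// `getStatus` is any async function returning the current run status,
// e.g. a call to the node's "Get Run Status" operation.
async function waitForRun(getStatus, { intervalMs = 2000, maxAttempts = 30 } = {}) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const status = await getStatus();
    if (status === "SUCCEEDED") return status;
    if (status === "FAILED" || status === "ABORTED") {
      throw new Error(`Run ended with status ${status}`);
    }
    // Back off between polls to avoid hammering the API.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error("Timed out waiting for run to finish");
}
```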
Workflow Examples
Simple Knowledge Base Builder
[Manual Trigger] → [RAG-Architect] → [Pinecone Insert]
Customer Support Bot Pipeline
[Webhook] → [RAG-Architect] → [OpenAI Embeddings] → [Vector Store] → [AI Agent]
Documentation Sync
[Schedule] → [RAG-Architect] → [Transform] → [Notion Update]
Configuration Options
Chunking Configuration
| Option | Default | Description |
|--------|---------|-------------|
| Min Chunk Size | 100 | Minimum characters per chunk |
| Max Chunk Size | 2000 | Maximum characters per chunk |
| Overlap Size | 50 | Characters overlapping between chunks |
| Split On | ##, ### | Markdown headers to split on |
| Preserve Tables | true | Keep tables intact |
| Preserve Code | true | Keep code blocks intact |
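To illustrate how Max Chunk Size and Overlap Size interact, here is a deliberately simplified character-based chunker (the real chunker is structure-aware and also honors minimum size, header splits, tables, and code blocks; this sketch only shows the sliding-window overlap):

```javascript
// Split text into windows of at most `maxSize` characters, stepping
// back `overlap` characters so consecutive chunks share context.
function chunkText(text, { maxSize = 2000, overlap = 50 } = {}) {
  const chunks = [];
  let start = 0;
  while (start < text.length) {
    const end = Math.min(start + maxSize, text.length);
    chunks.push(text.slice(start, end));
    if (end === text.length) break;
    start = end - overlap; // the last `overlap` chars repeat in the next chunk
  }
  return chunks;
}
```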
PII Configuration
| Option | Default | Description |
|--------|---------|-------------|
| Enabled | true | Enable PII scrubbing |
| Redact Emails | true | Replace emails with [EMAIL] |
| Redact Phones | true | Replace phones with [PHONE] |
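The kind of redaction this performs can be sketched with simple regexes (the service's actual patterns are not published, so these are illustrative, not exhaustive):

```javascript
// Illustrative PII patterns; real-world email/phone detection is
// considerably more involved than these regexes.
const EMAIL_RE = /[\w.+-]+@[\w-]+\.[\w.-]+/g;
const PHONE_RE = /\+?\d[\d\s().-]{7,}\d/g;

function scrubPII(text, { redactEmails = true, redactPhones = true } = {}) {
  let out = text;
  if (redactEmails) out = out.replace(EMAIL_RE, "[EMAIL]");
  if (redactPhones) out = out.replace(PHONE_RE, "[PHONE]");
  return out;
}
```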
Q&A Generation
| Option | Default | Description |
|--------|---------|-------------|
| Generate Q&A | false | Generate question-answer pairs |
| Questions/Chunk | 3 | Number of Q&A pairs per chunk |
Output Formats
n8n Format (Recommended)
Optimized for n8n workflows with clean structure:
{
"_type": "chunk",
"content": "...",
"contextHeader": "...",
"metadata": {...}
}
LangChain Format
Compatible with LangChain document loaders:
{
"page_content": "...",
"metadata": {...}
}
LlamaIndex Format
Compatible with LlamaIndex documents:
{
"text": "...",
"metadata": {...}
}
Error Handling
The node supports n8n's standard error handling:
- Stop on Error - Workflow stops on first error
- Continue on Fail - Errors are captured but workflow continues
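For reference, the three output formats above carry the same information under different field names, so converting between them is a simple remapping. A sketch using the field names from this README (the helper names are illustrative):

```javascript
// n8n format -> LangChain document shape.
function toLangChain(chunk) {
  return { page_content: chunk.content, metadata: chunk.metadata };
}

// n8n format -> LlamaIndex document shape.
function toLlamaIndex(chunk) {
  return { text: chunk.content, metadata: chunk.metadata };
}
```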
Resources
Support
- Issues: GitHub Issues
- Author: Jason Pellerin (@ai_solutionist)
- Website: jasonpellerinfreelance.com
License
MIT License - see LICENSE for details.
Built with 🧠 by Jason Pellerin
