n8n-nodes-rag-architect
v1.0.1
🧠 Transform any website into AI-ready knowledge chunks directly in n8n
This is an n8n community node for RAG-Architect - a powerful tool that converts web content into structured, AI-ready knowledge chunks.
Features
- 🌐 URL Processing - Extract and chunk content from any public URL
- 📄 Structure-Aware Chunking - Preserves document hierarchy and context
- 🔒 PII Scrubbing - Automatic redaction of emails, phones, and sensitive data
- ❓ Q&A Generation - Optional AI-generated question-answer pairs
- 🔗 Multiple Output Formats - n8n, LangChain, LlamaIndex, or raw JSON
- ⚡ Async Support - Process in background or wait for completion
Installation
Community Nodes (Recommended)
- Go to Settings > Community Nodes
- Click Install
- Enter n8n-nodes-rag-architect
- Click Install
Manual Installation
cd ~/.n8n/nodes
npm install n8n-nodes-rag-architect
Prerequisites
You need an Apify API token to use this node:
- Create an account at apify.com
- Go to Settings > Integrations
- Copy your API token
Usage
Basic Example: Process URLs
- Add the RAG-Architect node to your workflow
- Configure your Apify credentials
- Enter URLs to process (one per line or comma-separated)
- Choose your output format
- Execute!
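Downstream of the node, each chunk arrives as one n8n item. A sketch of a Code node that prepares chunks for embedding by prepending the contextHeader to the content (the sample item mirrors the output shape documented below; the function name is illustrative):

```javascript
// Prepend the contextHeader so the embedding model sees the
// source/section context together with the chunk text.
function toEmbeddingInput(chunk) {
  const header = chunk.contextHeader ? chunk.contextHeader + "\n" : "";
  return header + chunk.content;
}

// Sample input shaped like one chunk item from the node.
const items = [
  {
    json: {
      _type: "chunk",
      content: "RAG-Architect converts pages into chunks.",
      contextHeader: "[Source: example.com | Section: Features]",
      metadata: { source_url: "https://example.com" },
    },
  },
];

// In a real Code node, `items` is provided by n8n and this is the return value.
const out = items.map((item) => ({
  json: { text: toEmbeddingInput(item.json), metadata: item.json.metadata },
}));
```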
Operations
Process URLs
Transform web content into knowledge chunks.
Inputs:
- URLs - One or more URLs to process
- Output Format - n8n, LangChain, LlamaIndex, or raw
- Wait for Completion - Block until processing finishes, or return immediately and check the run later
Options:
- Generate Q&A Pairs
- PII Scrubbing settings
- Chunk size configuration
- Header splitting rules
Output:
{
"_type": "chunk",
"id": "chunk_abc123",
"content": "The extracted content...",
"contextHeader": "[Source: example.com | Section: Features]",
"metadata": {
"source_url": "https://example.com/docs",
"title": "Documentation",
"section": "Features",
"word_count": 150
},
"questions": [
{
"question": "What are the main features?",
"answer": "The main features include..."
}
]
}
Get Run Status
Check the status of an async processing run.
Get Results
Fetch results from a completed run.
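When not waiting for completion, the usual pattern is to poll the run status and fetch results once it succeeds. A generic polling sketch (the status strings follow Apify's run states; the status getter is injected here rather than tied to the node's exact fields):

```javascript
// Poll an async run until it finishes, fails, or times out.
// `getStatus` is any async function returning the current run status,
// e.g. a call to the node's "Get Run Status" operation.
async function waitForRun(getStatus, { intervalMs = 2000, maxAttempts = 30 } = {}) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const status = await getStatus();
    if (status === "SUCCEEDED") return status;
    if (status === "FAILED" || status === "ABORTED") {
      throw new Error(`Run ended with status ${status}`);
    }
    // Back off between polls to avoid hammering the API.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error("Timed out waiting for run to finish");
}
```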
Workflow Examples
Simple Knowledge Base Builder
[Manual Trigger] → [RAG-Architect] → [Pinecone Insert]
Customer Support Bot Pipeline
[Webhook] → [RAG-Architect] → [OpenAI Embeddings] → [Vector Store] → [AI Agent]
Documentation Sync
[Schedule] → [RAG-Architect] → [Transform] → [Notion Update]
Configuration Options
Chunking Configuration
| Option | Default | Description |
|--------|---------|-------------|
| Min Chunk Size | 100 | Minimum characters per chunk |
| Max Chunk Size | 2000 | Maximum characters per chunk |
| Overlap Size | 50 | Characters overlapping between chunks |
| Split On | ##, ### | Markdown headers to split on |
| Preserve Tables | true | Keep tables intact |
| Preserve Code | true | Keep code blocks intact |
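To illustrate how Max Chunk Size and Overlap Size interact, here is a deliberately simplified character-based chunker (the real chunker is structure-aware and also honors minimum size, header splits, tables, and code blocks; this sketch only shows the sliding-window overlap):

```javascript
// Split text into windows of at most `maxSize` characters, stepping
// back `overlap` characters so consecutive chunks share context.
function chunkText(text, { maxSize = 2000, overlap = 50 } = {}) {
  const chunks = [];
  let start = 0;
  while (start < text.length) {
    const end = Math.min(start + maxSize, text.length);
    chunks.push(text.slice(start, end));
    if (end === text.length) break;
    start = end - overlap; // the last `overlap` chars repeat in the next chunk
  }
  return chunks;
}
```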
PII Configuration
| Option | Default | Description |
|--------|---------|-------------|
| Enabled | true | Enable PII scrubbing |
| Redact Emails | true | Replace emails with [EMAIL] |
| Redact Phones | true | Replace phones with [PHONE] |
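The kind of redaction this performs can be sketched with simple regexes (the service's actual patterns are not published, so these are illustrative, not exhaustive):

```javascript
// Illustrative PII patterns; real-world email/phone detection is
// considerably more involved than these regexes.
const EMAIL_RE = /[\w.+-]+@[\w-]+\.[\w.-]+/g;
const PHONE_RE = /\+?\d[\d\s().-]{7,}\d/g;

function scrubPII(text, { redactEmails = true, redactPhones = true } = {}) {
  let out = text;
  if (redactEmails) out = out.replace(EMAIL_RE, "[EMAIL]");
  if (redactPhones) out = out.replace(PHONE_RE, "[PHONE]");
  return out;
}
```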
Q&A Generation
| Option | Default | Description |
|--------|---------|-------------|
| Generate Q&A | false | Generate question-answer pairs |
| Questions/Chunk | 3 | Number of Q&A pairs per chunk |
Output Formats
n8n Format (Recommended)
Optimized for n8n workflows with clean structure:
{
"_type": "chunk",
"content": "...",
"contextHeader": "...",
"metadata": {...}
}
LangChain Format
Compatible with LangChain document loaders:
{
"page_content": "...",
"metadata": {...}
}
LlamaIndex Format
Compatible with LlamaIndex documents:
{
"text": "...",
"metadata": {...}
}
Error Handling
The node supports n8n's standard error handling:
- Stop on Error - Workflow stops on first error
- Continue on Fail - Errors are captured but workflow continues
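For reference, the three output formats above carry the same information under different field names, so converting between them is a simple remapping. A sketch using the field names from this README (the helper names are illustrative):

```javascript
// n8n format -> LangChain document shape.
function toLangChain(chunk) {
  return { page_content: chunk.content, metadata: chunk.metadata };
}

// n8n format -> LlamaIndex document shape.
function toLlamaIndex(chunk) {
  return { text: chunk.content, metadata: chunk.metadata };
}
```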
Resources
Support
- Issues: GitHub Issues
- Author: Jason Pellerin (@ai_solutionist)
- Website: jasonpellerinfreelance.com
License
MIT License - see LICENSE for details.
Built with 🧠 by Jason Pellerin
