n8n-nodes-hyper-reader

v1.0.1

Published

2 months ago

n8n community node for Hyper-Reader - Agent-ready web scraper optimized for LLMs


aisolutionist

n8n-community-node-package n8n web-scraper llm ai claude gpt-4 gemini apify

n8n-nodes-hyper-reader

🦉 Agent-ready web scraper optimized for Claude, GPT-4, and Gemini

This is an n8n community node for Hyper-Reader - a high-fidelity, LLM-optimized web content extraction tool.

Features

🎯 Agent-Optimized Output - Pre-formatted for Claude, GPT-4, Gemini, or SearchGPT
🧹 85% Noise Reduction - Strips ads, navbars, tracking, and junk
🕵️ Elite Stealth Mode - Bypasses anti-bot protection
📸 Vision Support - Capture screenshots for Vision AI analysis
🔗 Deep Read - Follow internal links for broader context
⚡ Fast - 1-second response times via Standby Mode

Installation

Community Nodes (Recommended)

1. Go to Settings > Community Nodes.
2. Click Install.
3. Enter n8n-nodes-hyper-reader as the package name.
4. Click Install to confirm.

Manual Installation

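# Create the custom nodes directory if it does not exist, then install; restart n8n afterwards
mkdir -p ~/.n8n/nodes
cd ~/.n8n/nodes
npm install n8n-nodes-hyper-reader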

Prerequisites

You need an Apify API token:

1. Create an account at apify.com.
2. Go to Settings > Integrations.
3. Copy your API token.

Usage

Operations

Scrape URL

Extract clean content from a single URL. Example output:

{
  "url": "https://example.com/article",
  "title": "Article Title",
  "content": "# Article Title\n\nClean markdown content...",
  "metadata": {
    "author": "John Doe",
    "publishDate": "2026-01-15"
  },
  "wordCount": 1250,
  "tokensSaved": 8500,
  "processingTime": 1240
}
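As a rough illustration of consuming this result downstream (for instance in a Code node or a standalone script), the sketch below builds a one-line summary from the fields in the example output. The `summarize` helper is hypothetical, and treating the `metadata` fields as optional is an assumption, since the source does not say which fields are always present.

```python
# Sample result copied from the example output above.
result = {
    "url": "https://example.com/article",
    "title": "Article Title",
    "content": "# Article Title\n\nClean markdown content...",
    "metadata": {"author": "John Doe", "publishDate": "2026-01-15"},
    "wordCount": 1250,
    "tokensSaved": 8500,
    "processingTime": 1240,
}


def summarize(item: dict) -> str:
    """Build a one-line summary; metadata fields are treated as optional (assumption)."""
    meta = item.get("metadata") or {}
    author = meta.get("author", "unknown author")
    date = meta.get("publishDate", "unknown date")
    return f'{item["title"]} by {author} ({date}), {item["wordCount"]} words'


summary = summarize(result)
print(summary)
```

Falling back to placeholder values keeps the summary usable when Include Metadata is turned off.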

Scrape Multiple

Process multiple URLs in one run.

Deep Read

Scrape a URL and automatically follow internal links to gather more context. Perfect for:

Documentation sites
Multi-page articles
Product catalogs

Get Run Status

Check progress of an async scraping job.
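The source does not describe what a status check returns. Since the node runs on Apify (see Prerequisites), the sketch below assumes the status follows Apify's run lifecycle (`READY`, `RUNNING`, `SUCCEEDED`, `FAILED`, `TIMED-OUT`, `ABORTED`). The `wait_for_run` helper and the injected `fetch_status` callable are hypothetical stand-ins for however the status is actually retrieved.

```python
import time
from typing import Callable

# Assumed terminal statuses, based on Apify's run lifecycle.
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED", "TIMED-OUT", "ABORTED"}


def wait_for_run(
    fetch_status: Callable[[], str],
    poll_seconds: float = 2.0,
    max_polls: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the run reaches a terminal status or max_polls is exhausted."""
    for _ in range(max_polls):
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        sleep(poll_seconds)
    raise TimeoutError(f"run still not finished after {max_polls} polls")


# Stand-in for the real status call: a canned sequence of statuses.
fake_statuses = iter(["READY", "RUNNING", "RUNNING", "SUCCEEDED"])
final = wait_for_run(lambda: next(fake_statuses), sleep=lambda _s: None)
print(final)
```

Injecting `fetch_status` and `sleep` keeps the loop testable without network access or real delays.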

Agent Presets

| Preset | Description |
|--------|-------------|
| Claude | XML-structured Markdown with clear sections |
| GPT-4 | Citation-heavy format with source references |
| Gemini | Compact, efficient Markdown |
| SearchGPT | Web-search optimized with key facts |
| Raw | Clean Markdown, no agent optimization |

Stealth Levels

| Level | Use Case |
|-------|----------|
| 1 - Basic | Fast, for simple sites |
| 2 - Standard | Good for most sites |
| 3 - Elite | LinkedIn, Amazon, protected sites |

Workflow Examples

AI Research Assistant

[Webhook] → [Hyper-Reader] → [OpenAI] → [Google Sheets] → [Slack]

Scrape article → Summarize with AI → Log to Google Sheets → Notify team in Slack

Competitor Monitoring

[Schedule] → [Hyper-Reader (Multiple)] → [Compare] → [Alert]

Daily scrape → Compare changes → Alert on updates
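The source does not say how the Compare step works. One common approach, sketched here as an assumption, is to store a hash of each page's cleaned content and flag URLs whose hash differs on the next run. The `fingerprint` and `changed_urls` helpers are hypothetical.

```python
import hashlib


def fingerprint(content: str) -> str:
    """Hash whitespace-normalized content so trivial reflows do not trigger alerts."""
    normalized = " ".join(content.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def changed_urls(previous: dict[str, str], current_pages: dict[str, str]) -> list[str]:
    """Return URLs whose content is new or differs from the stored fingerprint."""
    return sorted(
        url
        for url, content in current_pages.items()
        if previous.get(url) != fingerprint(content)
    )


# Fingerprints stored from the previous run.
yesterday = {
    "https://example.com/pricing": fingerprint("# Pricing\n\nPlan A: $10"),
    "https://example.com/about": fingerprint("# About\n\nWe build things."),
}
# Cleaned content from today's scrape; only the pricing page really changed.
today = {
    "https://example.com/pricing": "# Pricing\n\nPlan A: $12",
    "https://example.com/about": "# About\n\nWe build   things.",
}
changed = changed_urls(yesterday, today)
print(changed)
```

Hashing the cleaned Markdown, not the raw HTML, avoids false alerts from rotating ads or tracking markup.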

RAG Pipeline

[Trigger] → [Hyper-Reader (Deep Read)] → [Embeddings] → [Vector Store]

Deep read docs → Generate embeddings → Store for retrieval
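Before the Embeddings step, long pages usually need to be split into chunks. The sketch below shows one possible strategy: split the Markdown at level-1 and level-2 headings, then cap each chunk's length. The `chunk_by_heading` helper and the 1,000-character default are illustrative assumptions, not behavior of the node.

```python
import re


def chunk_by_heading(markdown: str, max_chars: int = 1000) -> list[str]:
    """Split Markdown at level-1/level-2 headings, then cap each chunk's length."""
    # Zero-width split: each "# " or "## " heading line starts a new section.
    sections = re.split(r"(?m)^(?=#{1,2} )", markdown)
    chunks: list[str] = []
    for section in sections:
        text = section.strip()
        # Hard-wrap oversized sections so no chunk exceeds max_chars.
        for start in range(0, len(text), max_chars):
            chunks.append(text[start:start + max_chars])
    return chunks


doc = "# Guide\n\nIntro text.\n\n## Setup\n\nInstall it.\n\n## Usage\n\nRun it.\n"
chunks = chunk_by_heading(doc)
for chunk in chunks:
    print(repr(chunk))
```

Splitting at headings keeps each chunk topically coherent, which tends to help retrieval quality.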

Vision Analysis

[Hyper-Reader (Vision)] → [GPT-4 Vision] → [Report]

Capture screenshot → Analyze with Vision AI → Generate report

Configuration Options

| Option | Default | Description |
|--------|---------|-------------|
| Agent Preset | Claude | Target AI optimization |
| Output Format | markdown | markdown, json, or html_cleaned |
| Stealth Level | 2 | Anti-bot protection (1-3) |
| Capture Screenshot | false | Enable Vision mode |
| Include Metadata | true | Title, author, date |
| Preserve Links | true | Keep hyperlinks |
| Include Images | false | Image URLs in output |
| Exclude Selectors | nav, footer... | CSS selectors to remove |
| Max Content Length | 0 | Truncation limit (0 = none) |
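To make the defaults and value ranges concrete, here is a minimal sketch that mirrors the options as a Python dataclass with basic validation. The attribute names are illustrative, not the node's real parameter keys. Exclude Selectors is left out because its default is elided in the source, and measuring Max Content Length in characters is an assumption.

```python
from dataclasses import dataclass

OUTPUT_FORMATS = {"markdown", "json", "html_cleaned"}


@dataclass
class ScrapeOptions:
    # Defaults mirror the documented options; attribute names are illustrative.
    agent_preset: str = "Claude"
    output_format: str = "markdown"
    stealth_level: int = 2
    capture_screenshot: bool = False
    include_metadata: bool = True
    preserve_links: bool = True
    include_images: bool = False
    max_content_length: int = 0  # 0 means no truncation

    def __post_init__(self) -> None:
        if self.output_format not in OUTPUT_FORMATS:
            raise ValueError(f"unsupported output format: {self.output_format}")
        if not 1 <= self.stealth_level <= 3:
            raise ValueError("stealth level must be between 1 and 3")
        if self.max_content_length < 0:
            raise ValueError("max content length cannot be negative")

    def truncate(self, content: str) -> str:
        """Apply the Max Content Length rule (assumed to count characters)."""
        if self.max_content_length == 0:
            return content
        return content[: self.max_content_length]


defaults = ScrapeOptions()
short = ScrapeOptions(max_content_length=5)
print(defaults.stealth_level, short.truncate("Hello, world"))
```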

Deep Read Options

| Option | Default | Description |
|--------|---------|-------------|
| Depth | 1 | Link levels to follow (1-3) |
| Max Pages | 10 | Maximum pages to extract |
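The two limits interact: Depth bounds how far from the start page links are followed, and Max Pages caps the total. The sketch below illustrates this with a breadth-first traversal over an in-memory link graph. It is not the node's actual crawler. The `plan_deep_read` helper is hypothetical, and counting the start page toward Max Pages is an assumption.

```python
from collections import deque


def plan_deep_read(
    start: str,
    links: dict[str, list[str]],
    depth: int = 1,
    max_pages: int = 10,
) -> list[str]:
    """Return pages in breadth-first order, honoring depth and max_pages."""
    visited = [start]  # the start page counts toward max_pages (assumption)
    seen = {start}
    queue = deque([(start, 0)])
    while queue and len(visited) < max_pages:
        url, level = queue.popleft()
        if level == depth:
            continue  # do not follow links beyond the configured depth
        for target in links.get(url, []):
            if target not in seen and len(visited) < max_pages:
                seen.add(target)
                visited.append(target)
                queue.append((target, level + 1))
    return visited


# Toy site: each key maps to the internal links found on that page.
site = {
    "/docs": ["/docs/install", "/docs/usage"],
    "/docs/install": ["/docs/install/docker"],
    "/docs/usage": ["/docs", "/docs/faq"],
}
shallow = plan_deep_read("/docs", site, depth=1)
capped = plan_deep_read("/docs", site, depth=2, max_pages=4)
print(shallow)
print(capped)
```

With `depth=1` only the start page and its direct links are collected. With `depth=2` and `max_pages=4` the traversal stops as soon as the fourth page is reached.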

Output Example

Claude Preset Output

<document>
<metadata>
<title>How to Build AI Agents</title>
<source>https://example.com/ai-agents</source>
<author>Jane Smith</author>
<date>2026-01-15</date>
</metadata>

<content>
# How to Build AI Agents

AI agents are autonomous systems that...

## Key Components

1. **Perception** - How agents understand their environment
2. **Reasoning** - Decision-making processes
3. **Action** - Executing tasks in the world

...
</content>
</document>
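For readers who want to produce or test against this layout themselves, the following sketch assembles the same structure from plain fields. It is a reconstruction from the example only, not Hyper-Reader's implementation. The `to_claude_document` helper is hypothetical, as is the choice to XML-escape metadata values while leaving the Markdown body untouched.

```python
from xml.sax.saxutils import escape


def to_claude_document(title: str, source: str, author: str, date: str, markdown: str) -> str:
    """Wrap Markdown content in the XML layout from the Claude preset example."""
    fields = (("title", title), ("source", source), ("author", author), ("date", date))
    # Escape &, < and > in metadata so the wrapper tags stay well-formed.
    metadata = "\n".join(f"<{tag}>{escape(value)}</{tag}>" for tag, value in fields)
    return (
        "<document>\n"
        f"<metadata>\n{metadata}\n</metadata>\n\n"
        f"<content>\n{markdown.strip()}\n</content>\n"
        "</document>"
    )


wrapped = to_claude_document(
    title="How to Build AI Agents",
    source="https://example.com/ai-agents",
    author="Jane Smith",
    date="2026-01-15",
    markdown="# How to Build AI Agents\n\nAI agents are autonomous systems that...\n",
)
print(wrapped)
```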

GPT-4 Preset Output

# How to Build AI Agents

Source: https://example.com/ai-agents
Author: Jane Smith | Published: January 15, 2026

AI agents are autonomous systems that... [1]

## Key Components

1. **Perception** - How agents understand their environment [2]
...

---
[1] Definition adapted from Russell & Norvig
[2] See also: Embodied AI research

Token Savings

Hyper-Reader strips ~85% of web page noise:

| Metric | Raw HTML | Hyper-Reader | Savings |
|--------|----------|--------------|---------|
| Characters | 150,000 | 22,500 | 85% |
| Tokens (GPT-4) | ~37,500 | ~5,625 | ~$0.032/page |
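The arithmetic behind these figures can be checked directly. In the sketch below, the ratio of about 4 characters per token and the price of roughly $1 per million tokens are inferred from the table's own numbers. Neither is stated in the source.

```python
# Figures taken from the token savings table.
raw_chars, clean_chars = 150_000, 22_500
raw_tokens, clean_tokens = 37_500, 5_625
savings_per_page_usd = 0.032

# Percentage reductions, computed with integer arithmetic.
char_reduction_pct = (raw_chars - clean_chars) * 100 // raw_chars
token_reduction_pct = (raw_tokens - clean_tokens) * 100 // raw_tokens

# Values implied by the table, not stated in it.
chars_per_token = raw_chars / raw_tokens
tokens_saved = raw_tokens - clean_tokens
implied_usd_per_million = savings_per_page_usd / tokens_saved * 1_000_000

print(f"character reduction: {char_reduction_pct}%")
print(f"token reduction: {token_reduction_pct}%")
print(f"tokens saved per page: {tokens_saved}")
print(f"implied price: ${implied_usd_per_million:.2f} per million tokens")
```

Actual savings depend on the input-token price of the model in use, so the per-page dollar figure scales with that price.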

Pricing

| Tier | Price | Best For |
|------|-------|----------|
| Standard | $1 / 1,000 pages | Blogs, news, docs |
| Elite | $5 / 1,000 pages | LinkedIn, Amazon, protected sites |
| Pro Monthly | $49 / month | Standby Mode, unlimited proxy |
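For budgeting, the per-page tiers translate into a simple linear cost. The sketch below uses the Standard and Elite rates listed above. The `scrape_cost` helper is hypothetical, and Pro Monthly is left out because the source does not say how page charges apply on that plan.

```python
# USD per 1,000 pages, taken from the Standard and Elite tiers.
RATE_PER_1000_PAGES = {"standard": 1.00, "elite": 5.00}


def scrape_cost(pages: int, tier: str = "standard") -> float:
    """Estimate the cost in USD of scraping `pages` pages on a per-page tier."""
    if pages < 0:
        raise ValueError("pages cannot be negative")
    return pages * RATE_PER_1000_PAGES[tier] / 1000


standard_cost = scrape_cost(2_500)
elite_cost = scrape_cost(2_500, tier="elite")
print(f"standard: ${standard_cost:.2f}, elite: ${elite_cost:.2f}")
```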

Support

Issues: GitHub Issues
Author: Jason Pellerin (@ai_solutionist)
Website: jasonpellerinfreelance.com

License

MIT License - see LICENSE for details.

Built with 🦉 by Jason Pellerin

Transform web chaos into agent-ready intelligence.
