n8n-nodes-crw (v0.2.0)
n8n community node for CRW — the open-source web scraper built for AI agents.
Scrape, crawl, and extract web data directly in your n8n workflows. Works with both self-hosted CRW and fastcrw.com cloud.
Installation
Via n8n UI
- Go to Settings > Community Nodes
- Click Install a community node
- Enter n8n-nodes-crw
- Click Install
Via Environment Variable
# Docker
docker run -e EXTRA_COMMUNITY_PACKAGES=n8n-nodes-crw n8nio/n8n
# docker-compose
environment:
  - EXTRA_COMMUNITY_PACKAGES=n8n-nodes-crw

Setup — Pick One
Option A: Cloud (fastcrw.com) — Quickest Start
Sign up at fastcrw.com and get 500 free credits. Then add credentials in n8n:
| Field | Value |
|---|---|
| Base URL | https://fastcrw.com/api (default) |
| API Key | crw_live_... from fastcrw.com |
Option B: Self-hosted with binary (free, no limits)
curl -fsSL https://raw.githubusercontent.com/us/crw/main/install.sh | bash
crw # starts on http://localhost:3000

| Field | Value |
|---|---|
| Base URL | http://localhost:3000 |
| API Key | (leave empty) |
Option C: Self-hosted with Docker
docker run -d -p 3000:3000 ghcr.io/us/crw:latest

Same credentials as Option B.
Operations
Scrape
Scrape a single URL and return its content in one or more formats.
- URL — The page to scrape
- Output Formats — markdown, html, rawHtml, plainText, links, json
- Only Main Content — Strip nav/footer/sidebar (default: true)
- Additional Options — JS rendering, CSS selectors, XPath, custom headers, proxy, stealth mode, JSON schema for LLM extraction
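As a rough sketch of what these parameters look like as a request body: the helper below mirrors the node's Scrape fields. The field names (`url`, `formats`, `onlyMainContent`) and the idea of a flat JSON body are assumptions for illustration, not a confirmed CRW API schema — check your server's docs for the real shape.

```python
import json

def build_scrape_payload(url, formats=("markdown",), only_main_content=True, **options):
    """Build a request body mirroring the node's Scrape parameters.

    Field names here are hypothetical; consult the CRW server's API
    reference for the actual schema.
    """
    payload = {
        "url": url,
        "formats": list(formats),          # e.g. markdown, html, links, json
        "onlyMainContent": only_main_content,  # strip nav/footer/sidebar
    }
    payload.update(options)  # e.g. custom headers, proxy, stealth, JSON schema
    return payload

payload = build_scrape_payload(
    "https://example.com",
    formats=("markdown", "links"),
)
print(json.dumps(payload))
```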
Crawl
Crawl a website starting from a URL. Returns content from multiple pages.
- URL — Starting URL
- Max Depth — How many links deep to follow (default: 2)
- Max Pages — Maximum pages to crawl (default: 100)
- Wait for Completion — Poll until done, or return job ID immediately
- Poll Interval / Max Wait Time — Control polling behavior
Each crawled page is returned as a separate n8n item for downstream processing.
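The Wait for Completion behavior is an ordinary poll-with-timeout loop. This sketch shows the logic the Poll Interval and Max Wait Time options control; the `status` values and response shape are stand-ins, not CRW's confirmed job schema.

```python
import time

def wait_for_crawl(check_status, poll_interval=2.0, max_wait=120.0):
    """Poll a crawl job until it reaches a terminal state.

    check_status is any callable returning a dict with a "status" key;
    "completed" and "failed" are assumed terminal values here.
    """
    deadline = time.monotonic() + max_wait
    while True:
        job = check_status()
        if job["status"] in ("completed", "failed"):
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError("crawl did not finish within max_wait")
        time.sleep(poll_interval)

# Usage with a stand-in status function that completes on the third poll:
calls = iter([
    {"status": "running"},
    {"status": "running"},
    {"status": "completed"},
])
result = wait_for_crawl(lambda: next(calls), poll_interval=0.01)
print(result["status"])
```

With Wait for Completion off, the node would instead return the job ID right away and leave this loop to a later Check Crawl Status step.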
Check Crawl Status
Check the status of a crawl job by its ID. Returns status, progress, and page data.
Cancel Crawl
Cancel a running crawl job by its ID.
Map
Discover all URLs on a website. Each discovered URL is returned as a separate n8n item.
- URL — The site to map
- Max Depth — How deep to discover links (default: 2)
- Use Sitemap — Whether to use the site's sitemap (default: true)
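To illustrate what sitemap-based discovery does conceptually: a sitemap is an XML document listing page URLs in `<loc>` elements, and each extracted URL becomes a separate item. This is a generic illustration of the technique, not CRW's actual implementation.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

def urls_from_sitemap(xml_text):
    """Extract page URLs from a sitemap XML document."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.iter(SITEMAP_NS + "loc")]

sitemap = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/docs</loc></url>
</urlset>"""

for url in urls_from_sitemap(sitemap):
    print(url)  # each discovered URL would become a separate n8n item
```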
AI Agent Tool
This node supports usableAsTool: true, so it can be used as a tool by n8n's AI Agent node. Set the environment variable:
N8N_COMMUNITY_PACKAGES_ALLOW_TOOL_USAGE=true

Example Workflows
Scrape and save to Google Sheets
[CRW: Scrape] → [Google Sheets: Append Row]

Crawl site for RAG pipeline
[CRW: Crawl] → [Embeddings: OpenAI] → [Pinecone: Upsert]

Map and batch scrape
[CRW: Map] → [CRW: Scrape] → [OpenAI: Summarize] → [Slack: Post]

Links
License
MIT — this node wrapper is MIT licensed. The CRW server itself is AGPL-3.0.
