
n8n-nodes-crawl4ai-enhanced

v0.3.6

n8n nodes for Crawl4AI web crawler and data extraction with LLM-powered multiple item extraction

Crawl4AI for n8n

This project provides n8n integration for Crawl4AI, a powerful web crawling and data extraction tool. It consists of two main nodes:

  1. Crawl4AI: Basic Crawler - For general web crawling and content extraction
  2. Crawl4AI: Content Extractor - For extracting structured data using CSS selectors, LLM, or JSON

Features

Basic Crawler Node

  • Crawl Single URL - Extract content from a single web page
  • Crawl Multiple URLs - Process multiple web pages in one operation
  • Process Raw HTML - Extract content from raw HTML without crawling

Content Extractor Node

  • CSS Selector Extractor - Extract structured data using CSS selectors
  • LLM Extractor - Use AI to extract structured data from webpages
  • JSON Extractor - Extract and process JSON data from web pages

Installation

  1. Clone this repository into your n8n custom nodes directory
  2. Run npm install to install dependencies
  3. Restart your n8n instance

Usage

Setting up credentials

Before using the nodes, you need to set up Crawl4AI API credentials:

  1. Go to Settings > Credentials > New
  2. Select Crawl4AI API
  3. Configure connection settings:
    • Connection Mode: Direct or Docker
    • Authentication: Configure as needed
    • LLM Settings: Enable and configure if needed for AI extraction

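Once saved, the credential resolves to a plain settings object inside the nodes. A minimal sketch of that shape, assuming illustrative field names rather than the exact credential schema:

```typescript
// Illustrative shape of Crawl4AI API credentials as the nodes might consume them.
// Property names are assumptions for this sketch, not the published schema.
interface Crawl4aiCredentials {
  connectionMode: 'direct' | 'docker'; // how the node reaches Crawl4AI
  dockerUrl?: string;                  // base URL when using Docker mode
  apiToken?: string;                   // optional token for authentication
  enableLlm: boolean;                  // whether AI extraction is configured
  llmProvider?: string;                // e.g. an OpenAI-compatible provider
  llmApiKey?: string;
}

const credentials: Crawl4aiCredentials = {
  connectionMode: 'docker',
  dockerUrl: 'http://localhost:11235',
  enableLlm: false,
};
```
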
Basic Crawler Usage

The Basic Crawler node allows you to crawl web pages and extract their content:

  1. Add the "Crawl4AI: Basic Crawler" node to your workflow
  2. Select an operation (Crawl Single URL, Crawl Multiple URLs, or Process Raw HTML)
  3. Configure the required parameters
  4. Run the workflow to extract content

Content Extractor Usage

The Content Extractor node allows you to extract structured data from web pages:

  1. Add the "Crawl4AI: Content Extractor" node to your workflow
  2. Select an extraction method (CSS Selector, LLM, or JSON)
  3. Configure the extraction parameters
  4. Run the workflow to extract structured data
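For the CSS Selector method, the extraction parameters amount to a schema mapping selectors to field names. A sketch in the spirit of Crawl4AI's JSON-CSS extraction schema (selector values and field names are illustrative):

```typescript
// A hypothetical CSS extraction schema: one repeated container selector
// plus per-field selectors, in the style of Crawl4AI's JSON-CSS strategy.
const cssSchema = {
  name: 'articles',
  baseSelector: 'article.post',  // each match becomes one extracted item
  fields: [
    { name: 'title', selector: 'h2 a', type: 'text' },
    { name: 'url', selector: 'h2 a', type: 'attribute', attribute: 'href' },
    { name: 'summary', selector: 'p.excerpt', type: 'text' },
  ],
};
```
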

Configuration Options

Browser Options

  • Headless Mode: Run browser in headless mode
  • Enable JavaScript: Enable JavaScript execution
  • Viewport Size: Set browser viewport dimensions
  • Timeout: Maximum time to wait for page load
  • User Agent: Override browser user agent
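Taken together, these settings map onto a single options object; a sketch with assumed property names:

```typescript
// Illustrative browser options object covering the settings above.
// Property names are assumptions, not the node's exact parameter names.
const browserOptions = {
  headless: true,                          // run without a visible browser window
  javaScriptEnabled: true,                 // allow scripts to execute on the page
  viewport: { width: 1280, height: 720 },  // browser viewport dimensions
  timeoutMs: 30_000,                       // maximum wait for page load
  userAgent: 'Mozilla/5.0 (compatible; ExampleCrawler/1.0)', // optional override
};
```
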

Crawler Options

  • Cache Mode: Control caching behavior
  • JavaScript Code: Execute custom JS on the page
  • CSS Selector: Focus crawling on specific elements
  • Excluded Tags: Skip specific HTML tags
  • Check Robots.txt: Respect robots.txt rules
  • Word Count Threshold: Filter content by word count
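As with the browser settings, these can be pictured as one options object; a sketch with assumed property names and values:

```typescript
// Illustrative crawler options object covering the settings above.
// Names and allowed values are assumptions for this sketch.
const crawlerOptions = {
  cacheMode: 'enabled',                       // caching behavior
  jsCode: 'window.scrollTo(0, document.body.scrollHeight);', // custom JS to run
  cssSelector: 'main.content',                // restrict crawling to this element
  excludedTags: ['nav', 'footer', 'script'],  // HTML tags to skip
  checkRobotsTxt: true,                       // respect robots.txt rules
  wordCountThreshold: 10,                     // drop blocks shorter than this
};
```
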

LLM Extraction Options

  • Extraction Instructions: Instructions for the AI
  • Schema Fields: Define structured data schema
  • LLM Provider: Choose AI model provider
  • Temperature: Control randomness of AI responses
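A sketch of how these LLM extraction options might fit together, with assumed property names and an assumed provider-identifier format:

```typescript
// Illustrative LLM extraction settings covering the options above.
// Property names and the provider string format are assumptions.
const llmExtraction = {
  instructions: 'Extract each product with its name and price.',
  schemaFields: [
    { name: 'name', type: 'string', description: 'Product name' },
    { name: 'price', type: 'number', description: 'Price in USD' },
  ],
  provider: 'openai/gpt-4o-mini', // provider/model identifier
  temperature: 0,                 // low randomness suits structured extraction
};
```
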

Project Structure

nodes/
  ├── Crawl4aiBasicCrawler/
  │   ├── Crawl4aiBasicCrawler.node.ts         # Main node file
  │   ├── crawl4ai.svg                        # Icon
  │   ├── actions/
  │   │   ├── operations.ts                   # Operations definition
  │   │   ├── router.ts                       # Router handler
  │   │   ├── crawlSingleUrl.operation.ts     # Single URL crawl operation
  │   │   ├── crawlMultipleUrls.operation.ts  # Multiple URL crawl operation
  │   │   └── processRawHtml.operation.ts     # Raw HTML processing operation
  │   └── helpers/
  │       ├── interfaces.ts                   # Interface definitions
  │       ├── utils.ts                        # Common utilities
  │       ├── apiClient.ts                    # API client
  │       └── formatters.ts                   # Formatting tools
  │
  └── Crawl4aiContentExtractor/
      ├── Crawl4aiContentExtractor.node.ts    # Main node file
      ├── crawl4ai.svg                        # Icon
      ├── actions/
      │   ├── operations.ts                   # Operations definition
      │   ├── router.ts                       # Router handler
      │   ├── cssExtractor.operation.ts       # CSS selector extraction operation
      │   ├── llmExtractor.operation.ts       # LLM extraction operation
      │   └── jsonExtractor.operation.ts      # JSON extraction operation
      └── helpers/
          ├── interfaces.ts                   # Interface definitions
          ├── utils.ts                        # Common utilities
          ├── apiClient.ts                    # API client
          └── formatters.ts                   # Formatting tools

credentials/
  └── Crawl4aiApi.credentials.ts              # Credentials definition

License

MIT

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Release Notes

  • 0.3.5 (rollback): Reverted the recently added "enhanced Browser/Crawler/LLM configuration" features due to instability, restoring the behavior prior to those features for reliability. Versions 0.3.1–0.3.4 are deprecated on npm; please use 0.3.5 or later.