@michaelvanlaar/n8n-nodes-defuddle

v0.2.6

n8n node to extract main content from webpages using Defuddle library

This is an n8n community node that extracts the main content from webpages using the Defuddle library. It provides a simple way to clean HTML content and extract the most relevant parts of a webpage, similar to a browser's reader mode.

n8n is a fair-code licensed workflow automation platform.

Installation

Follow the installation guide in the n8n community nodes documentation.

Community Nodes in n8n Settings (Recommended)

  1. Go to Settings > Community Nodes in your n8n instance
  2. Select Install
  3. Enter @michaelvanlaar/n8n-nodes-defuddle in Enter npm package name
  4. Click Install

Manual Installation

To get started, install the package in your n8n root directory:

npm install @michaelvanlaar/n8n-nodes-defuddle

For Docker-based deployments, add the following line before the font installation command in your n8n Dockerfile:

RUN cd /usr/local/lib/node_modules/n8n && npm install @michaelvanlaar/n8n-nodes-defuddle

Operations

The Defuddle node extracts clean content from HTML pages. It accepts HTML input (typically from an HTTP Request node) and returns structured, readable content.

Usage

Basic Workflow

  1. Add an HTTP Request node to fetch the webpage HTML
  2. Add the Defuddle node after it
  3. Configure the Defuddle node with the HTML source (default: {{$json.data}})

Example Workflow

HTTP Request → Defuddle → [Your processing nodes]

Configuration Options

HTML Source (Required)

The HTML content to extract from. By default, this is set to {{$json.data}} which references the data from the previous HTTP Request node.

URL (Optional)

The original URL of the page. This helps Defuddle resolve relative links and extract better metadata.
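
As a rough illustration of why the page URL matters: a relative link found in the HTML can only be made absolute when a base URL is known. The sketch below uses the standard WHATWG URL constructor. The resolveLink helper is hypothetical and is not part of this node or of Defuddle.

```typescript
// Hypothetical helper: resolve a possibly relative href against the page URL.
// If the href cannot be parsed (for example, a relative href with no base URL),
// it is returned unchanged.
function resolveLink(href: string, pageUrl?: string): string {
	try {
		return new URL(href, pageUrl).href;
	} catch {
		return href;
	}
}
```

With a page URL of https://example.com/blog/post, the href /images/a.png resolves to https://example.com/images/a.png. Without a page URL, the same href stays relative.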

Content Format

Choose the output format for the extracted content:

  • HTML Only (default): Return content as HTML
  • Markdown Only: Convert content to Markdown (content field will contain Markdown)
  • HTML + Markdown: Return both HTML (content) and Markdown (contentMarkdown)
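
The mapping from format to output fields can be sketched as follows. This only models the behavior described in the list above. The ContentFormat values and the shapeContent function are assumptions made for illustration, not the node's actual parameter values or code.

```typescript
// Hypothetical option values standing in for the three formats listed above.
type ContentFormat = 'html' | 'markdown' | 'both';

interface ContentFields {
	content: string;
	contentMarkdown?: string;
}

// Decide which fields carry HTML and which carry Markdown for each format.
function shapeContent(format: ContentFormat, html: string, markdown: string): ContentFields {
	switch (format) {
		case 'html':
			return { content: html };
		case 'markdown':
			// Markdown Only: the content field itself holds Markdown.
			return { content: markdown };
		case 'both':
			return { content: html, contentMarkdown: markdown };
	}
}
```

Note that contentMarkdown only exists in the combined format. In Markdown Only mode, downstream nodes should read content.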

Options

  • Remove Images: Strip all images from the extracted content
  • Remove Exact Selectors: Remove elements matching exact ad/button selectors (default: enabled)
  • Remove Partial Selectors: Remove elements matching partial ad/button selectors (default: enabled)
  • Debug Mode: Enable verbose logging for troubleshooting
  • Output Fields: Select which fields to include in the output:
    • Content (main extracted content)
    • Content Markdown (Markdown version, only when using "HTML + Markdown" format)
    • Title
    • Author
    • Description
    • Domain
    • Word Count
    • Published Date
    • Image (main article image)
    • Schema.org Data (structured data)

Output

The node returns a JSON object containing the selected fields. If no output fields are selected, it returns content, title, author, and description.

Example output (HTML Only):

{
	"content": "<p>The main article content...</p>",
	"title": "Article Title",
	"author": "Author Name",
	"description": "Article summary"
}

Example output (HTML + Markdown):

{
	"content": "<p>The main article content...</p>",
	"contentMarkdown": "The main article content...",
	"title": "Article Title",
	"author": "Author Name",
	"description": "Article summary"
}

All available fields:

  • content: Main article content (HTML or Markdown depending on format selection)
  • contentMarkdown: Markdown version (only when using "HTML + Markdown" format)
  • title: Article title
  • author: Author name
  • description: Article summary/description
  • domain: Website domain
  • wordCount: Total word count
  • published: Publication date
  • image: Main article image URL
  • schemaOrgData: Structured data from Schema.org markup
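
For downstream code that consumes this output, the field list can be written as a type, and the output-field selection can be sketched as a small filter. The field names come from the list above. The value types, the DefuddleOutput name, and the pickFields helper are assumptions for illustration and may not match the node's implementation.

```typescript
// Field names from the list above; value types are assumptions.
interface DefuddleOutput {
	content: string;
	contentMarkdown?: string;
	title?: string;
	author?: string;
	description?: string;
	domain?: string;
	wordCount?: number;
	published?: string;
	image?: string;
	schemaOrgData?: unknown;
}

type FieldName = keyof DefuddleOutput;

// Default selection when no output fields are chosen.
const DEFAULT_FIELDS: FieldName[] = ['content', 'title', 'author', 'description'];

// Hypothetical helper: keep only the selected fields that have a value.
function pickFields(full: DefuddleOutput, selected: FieldName[] = []): Record<string, unknown> {
	const names = selected.length > 0 ? selected : DEFAULT_FIELDS;
	const out: Record<string, unknown> = {};
	for (const name of names) {
		const value = full[name];
		if (value !== undefined) out[name] = value;
	}
	return out;
}
```

Calling pickFields(result) yields the four default fields, while pickFields(result, ['title', 'wordCount']) yields only those two.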

Compatibility

  • Requires n8n version 1.20.0 or above
  • Node.js 20 or higher required (as of version 0.2.0)
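
If you want to verify the Node.js requirement in a script before installing, a minimal check might look like this. The meetsNodeRequirement helper is hypothetical and is not shipped with the package.

```typescript
// Hypothetical helper: compare a Node.js version string such as "20.11.1"
// or "v18.19.0" against the minimum major version stated above.
function meetsNodeRequirement(version: string, minMajor = 20): boolean {
	const major = parseInt(version.replace(/^v/, ''), 10);
	return !isNaN(major) && major >= minMajor;
}
```

In a Node.js script, the running version is available as process.versions.node and can be passed directly to this function.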

Development

Testing

This package includes comprehensive automated tests to ensure reliability and prevent regressions.

Running Tests:

npm test                # Run test suite
npm run test:watch      # Run tests in watch mode for development
npm run test:coverage   # Generate coverage report

Testing Framework:

  • Jest with TypeScript support (ts-jest)
  • 47 test cases covering all node features
  • 80% code coverage target
  • Automated pre-commit hooks via Husky

Test Categories:

  1. Feature tests (content extraction, format conversion, output filtering)
  2. Security tests (JSDOM sandboxing, XSS prevention, script blocking)
  3. Error handling (missing input, invalid HTML, continueOnFail)
  4. Edge cases (large documents, Unicode, malformed HTML)
  5. Integration tests (n8n interface mocking, batch processing)
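
To make the error-handling category more concrete, here is a minimal sketch of the usual n8n per-item pattern: each item is processed in a try/catch, and continueOnFail decides whether an error becomes an output item or aborts the run. The Item and ExecContext types are simplified stand-ins for n8n's INodeExecutionData and IExecuteFunctions so that the example is self-contained. The extract function and the error message are hypothetical, and this is not the package's actual source.

```typescript
// Simplified stand-ins for n8n types (assumptions for this sketch).
interface Item {
	json: Record<string, unknown>;
	pairedItem?: { item: number };
}

interface ExecContext {
	getInputData(): Item[];
	continueOnFail(): boolean;
}

// Hypothetical extraction step that rejects missing or empty input.
function extract(html: unknown): Record<string, unknown> {
	if (typeof html !== 'string' || html.trim() === '') {
		throw new Error('No HTML content provided');
	}
	return { content: html };
}

function executeAll(ctx: ExecContext): Item[] {
	const results: Item[] = [];
	ctx.getInputData().forEach((item, index) => {
		try {
			results.push({ json: extract(item.json['data']), pairedItem: { item: index } });
		} catch (error) {
			// Without continueOnFail, the first failing item stops the whole run.
			if (!ctx.continueOnFail()) throw error;
			const message = error instanceof Error ? error.message : String(error);
			results.push({ json: { error: message }, pairedItem: { item: index } });
		}
	});
	return results;
}
```

Keeping pairedItem set on both success and error items lets n8n trace each output back to the input item that produced it, which is what the batch-processing tests would exercise.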

Quality Assurance

Pre-commit hooks automatically run:

  1. Linting (ESLint with n8n community node rules)
  2. Tests (Jest test suite)
  3. Build (TypeScript compilation and icon copying)

This ensures all commits maintain code quality and passing tests.

Claude Code Integration

This project integrates Claude Code with Context7 MCP for enhanced AI-assisted development, providing access to current n8n documentation.

Setup

1. Environment File Setup

Create an environment configuration file:

cp .env.example .env

This copies the template to a new .env file in your project root, where you'll store your API credentials.

2. API Key Registration

Obtain your authentication key from context7.com, then add it to your .env:

CONTEXT7_API_KEY=your-api-key-here

The project .gitignore automatically prevents this file from being committed to version control.
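
If you script against the same variable yourself, it can help to fail early when the key is missing or still set to the placeholder. The sketch below is a hypothetical helper, not part of the project, and it makes no assumption about how the MCP configuration itself reads the key.

```typescript
// Hypothetical helper: read the Context7 key from an environment-like object
// and reject a missing value or the placeholder from the example above.
function readContext7Key(env: Record<string, string | undefined>): string {
	const key = env['CONTEXT7_API_KEY']?.trim();
	if (!key || key === 'your-api-key-here') {
		throw new Error('CONTEXT7_API_KEY is not set; add it to your .env file');
	}
	return key;
}
```

In Node.js, process.env can be passed as the argument once the .env file has been loaded by whatever mechanism you use.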

3. MCP Configuration

The project includes a .mcp.json file that pre-configures the MCP server settings. No additional setup is needed—the integration is ready once your .env file contains a valid API key.

Context7 Slash Commands

This project includes slash commands for Claude Code that provide quick access to n8n documentation via Context7.

/context7:n8n [topic]

Pulls official n8n documentation into the conversation context to assist with development tasks.

Usage:

/context7:n8n

Fetches general n8n documentation relevant to the current task (e.g., community node development, node structure, testing).

With optional topic:

/context7:n8n node development
/context7:n8n credentials
/context7:n8n IExecuteFunctions
/context7:n8n parameters

Focuses the documentation retrieval on a specific topic.

When to use:

  • Developing or maintaining n8n community nodes
  • Working with n8n APIs (IExecuteFunctions, INodeType, INodeProperties, etc.)
  • Troubleshooting node-related issues
  • Understanding n8n conventions and best practices
  • Working with credentials, webhooks, or polling triggers
  • Checking for API changes or updated patterns

Recommended Usage Scenarios

Use Context7 integration when:

  • Learning unfamiliar or newly-released n8n APIs
  • Resolving complex node development challenges
  • Implementing features requiring deep knowledge of n8n internals
  • Confirming best practices or verifying API changes
  • Working with credentials, webhooks, or polling triggers

Avoid using it for:

  • Following established code patterns already present in the codebase
  • Standard refactoring tasks
  • Similar features already implemented elsewhere

Resources

Version History

0.2.7 (Upcoming)

(No changes yet)

0.2.6

  • Security: Force form-data to patched version 4.0.4+ via npm overrides to address prototype pollution vulnerability
  • New Feature: Add Context7 MCP integration with /n8n-docs slash command for accessing n8n documentation
  • Testing Infrastructure: Add comprehensive Jest testing with 47 test cases (~100% coverage)
    • Feature tests: content extraction, format conversion, output filtering, Defuddle options
    • Security tests: JSDOM sandboxing, script blocking, XSS prevention
    • Error handling tests: missing input, invalid HTML, continueOnFail behavior
    • Edge case tests: large documents, Unicode, malformed HTML, empty content
    • Integration tests: IExecuteFunctions mocking, batch processing, pairedItem indexing
  • Quality Assurance: Add Husky pre-commit hooks (lint → test → build)
  • Dependency Updates:
    • n8n-workflow: updated to 1.115.0
    • Development dependencies updated to latest versions
  • Documentation:
    • Add comprehensive release checklist (.claude/release-checklist.md)
    • Add OpenSpec documentation system for tracking changes
    • Add Conventional Commits and gitmoji guidelines
    • Archive completed OpenSpec changes
  • Development Workflow: Update prepublishOnly to include automated testing (build + lint + test)

0.2.5

  • Update development dependencies to latest versions:
    • @typescript-eslint/eslint-plugin: 8.45.0 → 8.46.1
    • @typescript-eslint/parser: 8.45.0 → 8.46.1
    • typescript-eslint: 8.45.0 → 8.46.1
    • eslint-plugin-n8n-nodes-base: 1.16.3 → 1.16.4

0.2.4

  • Add LICENSE.md file

0.2.3

  • Documentation improvements and workflow standardization

0.2.2

  • Updated README.md with complete version history

0.2.1

  • Fixed peer dependency conflict by downgrading jsdom to v24.x to match defuddle's requirements
  • Resolves npm install errors when installing via n8n Community Nodes

0.2.0

  • Dependency updates for security and compatibility
  • Updated TypeScript to v5.9 (better performance and type checking)
  • Updated ESLint to v9 with flat config
  • Updated Prettier to v3.6
  • Updated gulp to v5
  • Improved type safety in code
  • Breaking change: Now requires Node.js 20 or higher

0.1.0

  • Initial release
  • HTML content extraction with Defuddle library
  • Markdown conversion support (HTML Only, Markdown Only, HTML + Markdown)
  • Configurable output fields
  • Security hardening with sandboxed JSDOM

License

MIT

Alternative Custom Node With Similar Features

n8n-nodes-webpage-content-extractor is based on the Readability library, which also powers Firefox's Reader View.