
n8n-nodes-vector-store-processor

v1.8.15

Published

n8n node for intelligent document chunking and processing for vector store ingestion with Smart Qdrant Vector Store supporting Ollama and OpenAI embeddings

n8n-nodes-vector-store-processor

This is an n8n community node that intelligently processes and chunks documents for vector store ingestion with enhanced structure analysis and markdown support. Perfect for RAG (Retrieval-Augmented Generation) workflows and AI applications.

n8n is a fair-code licensed workflow automation platform.

Features

  • Intelligent Document Chunking: Splits documents into semantically meaningful chunks optimized for vector embeddings
  • Markdown Support: Parses markdown headings, lists, and structure for better organization
  • Structure Analysis: Automatically detects chapters, sections, and content hierarchy
  • Flexible Processing Modes:
    • Run once for all items (combine multiple documents into one knowledge base)
    • Run once for each item (process documents separately)
  • Rich Metadata: Includes document title, chapter, section, content type, chunk indices, and more
  • ASCII Sanitization: Ensures namespace compatibility with all vector stores
  • Binary File Support: Process text from binary files (.txt, .md, .pdf) or text fields
  • Configurable Chunk Size: Control the maximum size of text chunks for optimal embedding
  • Global Chunk Indexing: Maintains sequential chunk numbering across entire documents
  • Content Type Classification: Automatically categorizes content (examples, basics, advanced, etc.)

Installation

Follow the installation guide in the n8n community nodes documentation.

Community Nodes (Recommended)

  1. Go to Settings > Community Nodes
  2. Select Install
  3. Enter n8n-nodes-vector-store-processor in the Enter npm package name field
  4. Agree to the risks and select Install

Manual Installation

To install manually, navigate to your n8n installation directory and run:

npm install n8n-nodes-vector-store-processor

⚠️ Memory Management Requirements

IMPORTANT: For optimal memory management when using the Smart Qdrant Vector Store node with large documents, you must start n8n with the --expose-gc flag to enable garbage collection:

# For systemd service (recommended)
sudo systemctl edit n8n
# Add this line under [Service]:
Environment="NODE_OPTIONS=--expose-gc"

# Or start n8n directly with:
NODE_OPTIONS="--expose-gc" n8n start

# Or for Docker:
docker run -e NODE_OPTIONS="--expose-gc" n8nio/n8n

Why is this needed?

  • The Smart Qdrant Vector Store processes documents in batches and triggers garbage collection after each batch
  • This prevents memory buildup when processing large documents or many documents
  • Without --expose-gc, memory will still be managed by Node.js but less efficiently
  • The "Clear Memory" option in the node will work best with this flag enabled
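
This batch-level collection only works when Node exposes `global.gc`, which is exactly what `--expose-gc` does. A minimal guard, using a hypothetical helper name `maybeCollect` for illustration, looks like:

```javascript
// Triggers a garbage-collection pass if n8n was started with
// --expose-gc; otherwise defers to Node's default GC scheduling.
// (maybeCollect is an illustrative helper, not part of the node's API.)
function maybeCollect() {
  if (typeof global.gc === 'function') {
    global.gc(); // available only when Node runs with --expose-gc
    return true;
  }
  return false;
}

maybeCollect();
```

Without the flag the guard simply falls through, so nothing breaks; memory is just reclaimed on Node's own schedule rather than immediately after each batch.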

Operations

The Vector Store Processor node provides the following configuration options:

Mode

  • Run Once for All Items: Combines all input items into a single document before processing
  • Run Once for Each Item: Processes each input item as a separate document

Input Type

  • Text Field: Process text from a JSON field
  • Binary File: Process text from a binary file (supports .txt, .md, .pdf text extraction, etc.)
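
For Text Field input with the default field name text, an incoming n8n item might look like this (illustrative sample data):

```json
{
  "json": {
    "text": "# My Document\n\nThis is the content to be chunked..."
  }
}
```

For Binary File input, the node instead reads the file attached to the item's binary property (default: data).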

Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| Mode | Options | Run Once for Each Item | Processing mode |
| Input Type | Options | Text Field | Source of text data |
| Text Field | String | text | Name of the field containing text (when Input Type is Text Field) |
| Binary Property | String | data | Name of the binary property (when Input Type is Binary File) |
| Document Title | String | (auto-detect) | Override the document title |
| Chunk Size | Number | 2000 | Maximum characters per chunk |
| Namespace | String | (auto-generate) | Namespace for vector store organization |
| Parse Markdown | Boolean | true | Enable markdown structure parsing |

Usage

Basic Example: Process a Single Document for Vector Store

1. Add a "Read Binary File" node or "HTTP Request" node to get your document
2. Add the "Vector Store Processor" node
3. Configure:
   - Mode: Run Once for Each Item
   - Input Type: Binary File (or Text Field if you have text)
   - Parse Markdown: true
   - Chunk Size: 2000
4. Connect to a Vector Store node (Pinecone, Qdrant, Supabase Vector, etc.)
5. Connect to an embeddings node (OpenAI Embeddings, etc.)

Example: Combine Multiple Documents into One Knowledge Base

1. Add a node that outputs multiple items (e.g., "Read Files From Folder")
2. Add the "Vector Store Processor" node
3. Configure:
   - Mode: Run Once for All Items
   - Input Type: Binary File
   - Chunk Size: 2000
4. All documents will be combined and chunked together as one knowledge base
5. Connect to your vector store for ingestion

Example Output

Each chunk produces an output item with:

{
  "pageContent": "This is the actual text content of the chunk...",
  "metadata": {
    "document_title": "My Document",
    "chapter": "Introduction",
    "section": "Getting Started",
    "content_type": "overview",
    "chunk_index": 0,
    "local_chunk_index": 0,
    "chapter_index": 0,
    "total_chunks": 15,
    "namespace": "my-document",
    "source_file": "document.md",
    "character_count": 1850,
    "processing_timestamp": "2025-01-15T10:30:00.000Z"
  },
  "document_title": "My Document",
  "document_title_clean": "my-document",
  "chapter": "Introduction",
  "section": "Getting Started",
  "chapter_clean": "introduction",
  "section_clean": "getting-started",
  "namespace": "my-document"
}
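
The *_clean fields and the namespace above are ASCII slugs of the detected titles. A sketch of this kind of sanitization (the node's actual implementation may differ):

```javascript
// Lower-cases, strips accents/non-ASCII, and hyphenates whitespace so
// the result is safe as a vector-store namespace (illustrative sketch).
function toCleanSlug(value) {
  return value
    .normalize('NFKD')                // split accented chars into base + mark
    .replace(/[\u0300-\u036f]/g, '')  // drop the combining marks
    .toLowerCase()
    .replace(/[^a-z0-9\s-]/g, '')     // keep only ASCII letters, digits, spaces, hyphens
    .trim()
    .replace(/\s+/g, '-');            // collapse whitespace into single hyphens
}

console.log(toCleanSlug('My Document'));     // "my-document"
console.log(toCleanSlug('Getting Started')); // "getting-started"
```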

Markdown Support

When Parse Markdown is enabled, the node recognizes:

  • Headings: #, ##, ###, etc. for chapter and section detection
  • Structure: Automatically organizes content by heading hierarchy
  • Lists: Preserves list formatting in chunks
  • Code Blocks: Keeps code blocks intact when possible
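
Heading detection of this sort can be sketched with a small classifier (illustrative only, not the node's exact code):

```javascript
// Classifies a markdown line: H1/H2 headings map to chapters,
// H3-H6 to sections, everything else is body text (illustrative sketch).
function classifyLine(line) {
  const match = /^(#{1,6})\s+(.*)$/.exec(line.trim());
  if (!match) return { kind: 'text' };
  const depth = match[1].length;
  return {
    kind: depth <= 2 ? 'chapter' : 'section',
    depth,
    title: match[2].trim(),
  };
}

console.log(classifyLine('## Introduction'));
// { kind: 'chapter', depth: 2, title: 'Introduction' }
```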

How It Works

  1. Title Extraction: Automatically detects document title from:

    • Metadata fields (title, info.Title, metadata['dc:title'])
    • File name
    • First heading in markdown
    • First meaningful line of text
  2. Structure Analysis:

    • Detects chapters (H1, H2 headings or specific patterns)
    • Identifies sections (H3-H6 headings or subsection patterns)
    • Classifies content type (examples, basics, advanced, etc.)
  3. Intelligent Chunking:

    • Splits by paragraphs first
    • Falls back to sentence splitting for long paragraphs
    • Respects chunk size limits
    • Filters out very short chunks
  4. Metadata Enrichment:

    • Global chunk indexing across entire document
    • Local chunk indexing within sections
    • Content type classification
    • Timestamp and source tracking
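
The chunking steps above can be sketched as follows (a simplified illustration; the real node also carries structure metadata through each chunk):

```javascript
// Paragraph-first chunker: pack whole paragraphs up to maxSize,
// split oversized paragraphs at sentence boundaries, and filter out
// very short chunks (simplified illustration of the steps above).
function chunkText(text, maxSize = 2000, minSize = 20) {
  const paragraphs = text.split(/\n\s*\n/).map(p => p.trim()).filter(Boolean);
  const pieces = [];
  for (const paragraph of paragraphs) {
    if (paragraph.length <= maxSize) {
      pieces.push(paragraph);
      continue;
    }
    // Fallback: sentence splitting for paragraphs over the limit.
    let current = '';
    for (const sentence of paragraph.split(/(?<=[.!?])\s+/)) {
      if (current && current.length + sentence.length + 1 > maxSize) {
        pieces.push(current);
        current = sentence;
      } else {
        current = current ? current + ' ' + sentence : sentence;
      }
    }
    if (current) pieces.push(current);
  }
  // Drop fragments too short to embed usefully.
  return pieces.filter(piece => piece.length >= minSize);
}
```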

Compatibility

  • Tested with n8n version 1.0.0+
  • Works with all vector store nodes (Pinecone, Qdrant, Supabase, etc.)
  • Compatible with LangChain nodes

Resources

  • n8n community nodes documentation

License

MIT