n8n-nodes-ollama-embeddings
Community nodes for generating text embeddings via Ollama in your n8n workflows. Supports multiple embedding models, including Qwen, EmbeddingGemma, Nomic, and more, for vector stores, similarity search, and AI applications.
🌟 Features
- Two Specialized Nodes:
  - Ollama Embeddings: For vector store integration (Supabase, Qdrant, PGVector, etc.)
  - Ollama Embeddings Tool: For direct embedding generation in workflows
- Multi-Model Support: Works with Qwen, EmbeddingGemma, Nomic, Snowflake and more
- LangChain Compatible: Seamlessly integrates with n8n's AI ecosystem
- Flexible Dimensions: Auto-detection and validation based on model capabilities
- Instruction-Aware: Optimized embeddings for queries vs documents
- Batch Processing: Efficient bulk embedding generation
- Ollama Integration: Direct connection to Ollama for embedding generation
- No Middleware Required: Works directly with Ollama's API endpoints
- Compact Format Option: Single-line embeddings for easy copy/paste
📦 Installation
In n8n
- Go to Settings > Community Nodes
- Search for `n8n-nodes-qwen-embedding`
- Click Install
Manual Installation
```bash
npm install n8n-nodes-qwen-embedding
```
🚀 Prerequisites
You need to have Ollama installed and running with an embedding model:
- Install Ollama: visit https://ollama.com for installation instructions
- Pull an embedding model:
```bash
# Qwen3-Embedding models (specialized for embeddings):
ollama pull qwen3-embedding:0.6b         # 1024 dimensions, 32K context

# EmbeddingGemma models (Google's lightweight embeddings):
ollama pull embeddinggemma:300m          # 768 dimensions, 2K context
ollama pull embeddinggemma:300m-bf16     # Higher-precision variant

# Nomic Embed models (performant embeddings):
ollama pull nomic-embed-text             # 768 dimensions, 8K context

# Snowflake Arctic Embed:
ollama pull snowflake-arctic-embed:110m  # 1024 dimensions
```
- Verify Ollama is running:
```bash
ollama list  # Should show your pulled models
```
Ollama will be available at http://localhost:11434 by default.
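If you want to sanity-check the endpoint itself, here is a minimal TypeScript sketch, assuming the default URL and the `qwen3-embedding:0.6b` model pulled above:

```typescript
// Minimal smoke test for Ollama's /api/embed endpoint.
// URL and model name are assumptions — use whichever model you pulled.
async function testOllamaEmbed(): Promise<void> {
  const response = await fetch("http://localhost:11434/api/embed", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "qwen3-embedding:0.6b",
      input: "hello world",
    }),
  });
  if (!response.ok) {
    throw new Error(`Ollama returned HTTP ${response.status}`);
  }
  // /api/embed responds with { embeddings: number[][] }
  const data = (await response.json()) as { embeddings: number[][] };
  console.log(`Received a ${data.embeddings[0].length}-dimensional embedding`);
}

testOllamaEmbed().catch(console.error);
```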
📊 Supported Models Comparison
| Model | Dimensions | Context | Size | Best For |
|-------|------------|---------|------|----------|
| qwen3-embedding:0.6b | 32-1024 (flexible) | 32K tokens | ~639MB | General purpose, multilingual |
| embeddinggemma:300m | 128-768 (MRL) | 2K tokens | ~338MB | Lightweight, fast inference |
| nomic-embed-text | 768 (fixed) | 8K tokens | ~274MB | Balanced performance |
| snowflake-arctic-embed | 1024 (fixed) | 512 tokens | ~332MB | Short text, high precision |
Choosing a Model:
- Qwen: Best overall flexibility with largest context window (32K)
- EmbeddingGemma: Best for resource-constrained environments
- Nomic: Good balance of performance and context size
- Snowflake: Optimized for short text with high dimensional output
🔧 Setup
⚠️ CRITICAL: Ollama URL Configuration
ALWAYS remove trailing slashes from your Ollama URL!
✅ CORRECT: http://localhost:11434
❌ WRONG: http://localhost:11434/

Why this matters: A trailing slash creates a double slash in the API path (http://host:11434//api/embed). Many servers answer that path with a redirect, and HTTP clients that follow the redirect commonly re-issue the request as GET, resulting in HTTP 405 "Method Not Allowed" errors.
This is the #1 cause of 405 errors with this node. Always verify your Ollama URL format first.
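If you build requests against Ollama yourself, one defensive habit is to normalize the base URL before appending the API path. A small sketch (the helper name is hypothetical; the node may do something equivalent internally):

```typescript
// Strip trailing slashes so the API path never contains a double slash.
function normalizeOllamaUrl(rawUrl: string): string {
  return rawUrl.replace(/\/+$/, "");
}

const baseUrl = normalizeOllamaUrl("http://localhost:11434/");
const embedEndpoint = `${baseUrl}/api/embed`; // "http://localhost:11434/api/embed"
```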
1. Configure Credentials (Optional)
For self-hosted Ollama without authentication: You can skip credential configuration. The node will connect directly to your Ollama instance.
For authenticated Ollama instances:
- In n8n, go to Credentials > New
- Select Qwen Embedding API (Ollama)
- Enter:
  - Ollama URL: `http://localhost:11434` (NO trailing slash!)
  - Model Name: `qwen3-embedding:0.6b` (or your chosen model)
  - API Key: your authentication token (if required)
- IMPORTANT: Verify your URL has NO trailing slash before saving
- Click Test Connection to verify
2. Using the Nodes
Qwen Embedding (Vector Store Integration)
Connect to any vector store node:
```
[Vector Store] ← [Qwen Embedding]
   (Stores)      (Provides embeddings)
```
Use Cases:
- RAG applications
- Semantic search
- Document indexing
- Knowledge bases
Qwen Embedding Tool (Direct Usage)
Use in any workflow for direct embedding generation:
```
[Trigger] → [Qwen Embedding Tool] → [Process/Store]
```
Use Cases:
- Similarity calculations
- Text clustering
- Anomaly detection
- Content deduplication
⚙️ Configuration Options
Performance Mode
Controls timeout and retry behavior based on your hardware:
Auto-Detect (default): Automatically detects GPU/CPU on first request
- GPU detected (<1s response): 10s timeout, 2 retries
- CPU detected (>5s response): 60s timeout, 3 retries
- Works great for dynamic environments
GPU Optimized: Manual setting for GPU hardware
- 10 second timeout
- 2 retry attempts
- Best for NVIDIA GPU setups
CPU Optimized: Manual setting for CPU hardware
- 60 second timeout
- 3 retry attempts
- Prevents timeout errors on CPU-only systems
Custom: User-defined timeout and retry settings
- Set your own timeout (in milliseconds)
- Configure max retry attempts (0-5)
How Auto-Detection Works:
The first request measures the actual response time:
- Response < 1s → GPU detected → timeout = 10s
- Response > 5s → CPU detected → timeout = 60s
- 1s ≤ response ≤ 5s → keep the default 30s timeout (see the sketch below)
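As a rough sketch, the heuristic can be expressed like this (the thresholds are the ones listed above; the function and type names are hypothetical):

```typescript
// Timeout/retry profile derived from the first request's response time.
interface PerformanceProfile {
  timeoutMs: number;
  maxRetries: number;
}

function profileFromResponseTime(firstResponseMs: number): PerformanceProfile {
  if (firstResponseMs < 1000) {
    return { timeoutMs: 10_000, maxRetries: 2 }; // GPU detected
  }
  if (firstResponseMs > 5000) {
    return { timeoutMs: 60_000, maxRetries: 3 }; // CPU detected
  }
  return { timeoutMs: 30_000, maxRetries: 2 }; // ambiguous: keep the default
}
```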
Dimensions
Adjust embedding vector size based on model capabilities:
- Auto-detect (0): Uses optimal dimensions for the selected model
- Manual adjustment: Set specific dimensions (validated per model)
- Qwen: 32-1024 (flexible via MRL)
- EmbeddingGemma: 128-768 (flexible via MRL)
- Nomic: 768 (fixed)
- Snowflake: 1024 (fixed)
Implementation: Models with MRL support (Qwen, EmbeddingGemma) can adjust dimensions without retraining. Fixed-dimension models will show a warning if you try to adjust.
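For intuition, MRL-style dimension reduction can be done by keeping the first k components of a vector and re-normalizing it to unit length. This is an illustrative sketch only, not necessarily how the node implements it:

```typescript
// Truncate an MRL embedding to targetDims components, then restore unit length.
function truncateMrlEmbedding(embedding: number[], targetDims: number): number[] {
  const truncated = embedding.slice(0, targetDims);
  const norm = Math.sqrt(truncated.reduce((sum, v) => sum + v * v, 0));
  return truncated.map((v) => v / norm);
}
```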
Instruction Type
Optimize embeddings for specific use cases:
- None (default): Standard embeddings without special instructions
- Query: Optimized for search queries
  - Prefix: `"Instruct: Retrieve semantically similar text.\nQuery: "`
  - Use for: user questions, search inputs
- Document: Optimized for document storage
  - Prefix: `"Instruct: Represent this document for retrieval.\nDocument: "`
  - Use for: indexing documents, knowledge base entries
Performance Impact: 1-5% better semantic matching when query/document types match their use case.
Context Prefix
Add custom context to all texts before embedding:
Example:
```
Context Prefix: "Medical context:"
Input text:     "patient symptoms include fever"
Embedded as:    "Medical context: patient symptoms include fever"
```
Use Cases:
- Domain-specific context (legal, medical, technical)
- Language hints
- Task-specific framing
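To make the combination concrete, here is a sketch of how the instruction prefix and context prefix might compose before the text is sent to Ollama (the prefix strings are the ones documented above; the helper name is hypothetical):

```typescript
type InstructionType = "none" | "query" | "document";

// Instruction prefixes as documented in this README.
const INSTRUCTION_PREFIXES: Record<InstructionType, string> = {
  none: "",
  query: "Instruct: Retrieve semantically similar text.\nQuery: ",
  document: "Instruct: Represent this document for retrieval.\nDocument: ",
};

function buildEmbeddingInput(
  text: string,
  instruction: InstructionType = "none",
  contextPrefix = "",
): string {
  const context = contextPrefix ? `${contextPrefix} ` : "";
  return `${INSTRUCTION_PREFIXES[instruction]}${context}${text}`;
}

// buildEmbeddingInput("patient symptoms include fever", "none", "Medical context:")
// -> "Medical context: patient symptoms include fever"
```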
Return Format (Tool Only)
Controls output structure:
Full (default):
```json
{
  "embedding": [0.123, 0.456, ...],
  "dimensions": 1024,
  "text": "original text",
  "model": "qwen3-embedding:0.6b",
  "metadata": { ... }  // if includeMetadata enabled
}
```
Simplified:
```json
{
  "text": "original text",
  "vector": [0.123, 0.456, ...],
  "dimensions": 1024
}
```
Embedding Only:
```json
{
  "embedding": [0.123, 0.456, ...]
}
```
Include Metadata (Tool Only)
When enabled, adds processing metadata to the output:
```json
{
  "metadata": {
    "prefix": "Medical context:",         // Context prefix used
    "instruction": "query",               // Instruction type applied
    "timestamp": "2025-10-07T19:44:23Z",  // Processing time
    "batchSize": 5                        // Number of texts (batch mode only)
  }
}
```
Use Cases:
- Debugging embedding configurations
- Tracking when embeddings were generated
- Auditing processing parameters
Custom Timeout & Max Retries (Custom Mode Only)
Fine-tune request behavior:
- Custom Timeout: Milliseconds to wait before timeout (default: 30000)
- Max Retries: Number of retry attempts on failure (0-5, default: 2)
Retry Logic:
- Exponential backoff: 1s → 2s → 4s → 5s (capped)
- Clear console logging for debugging
- Automatic retry on transient failures (see the sketch below)
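A minimal sketch of this retry pattern (the delays match the sequence above; the function name is hypothetical):

```typescript
// Retry an async operation with exponential backoff: 1s -> 2s -> 4s -> 5s (capped).
async function withRetries<T>(fn: () => Promise<T>, maxRetries = 2): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
      if (attempt === maxRetries) break;
      const delayMs = Math.min(1000 * 2 ** attempt, 5000);
      console.warn(`Attempt ${attempt + 1} failed, retrying in ${delayMs}ms`);
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  throw lastError;
}
```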
Operation (Tool Only)
- Generate Embedding: Single text embedding
- Generate Batch Embeddings: Multiple texts in one request
📝 Examples
Example 1: Vector Store with RAG
1. Add "Supabase Vector Store" node
2. Add "Qwen Embedding" node
3. Connect Qwen to Vector Store's embedding input
4. Configure your collection and start indexingExample 2: Semantic Search
1. Add "Manual Trigger" node
2. Add "Qwen Embedding Tool" node (set to "Generate Embedding")
3. Add another "Qwen Embedding Tool" for documents
4. Add "Code" node to calculate similaritiesExample 3: Batch Processing
Example 3: Batch Processing

```json
// Input: Array of texts
{
  "texts": [
    "First document",
    "Second document",
    "Third document"
  ]
}

// Qwen Embedding Tool (Batch mode) output:
{
  "embeddings": [[...], [...], [...]],
  "count": 3,
  "dimensions": 1024
}
```

🔬 Technical Details
Model Information
Qwen Models:
- `qwen3-embedding:0.6b` - 1024d, 32K context, multilingual (100+ languages)
- MRL support for flexible dimensions (32-1024)

EmbeddingGemma Models:
- `embeddinggemma:300m` - 768d, 2K context, lightweight
- MRL support for flexible dimensions (128-768)
- Variants: bf16, q8, q4 for different precision/size tradeoffs

Nomic Models:
- `nomic-embed-text` - 768d, 8K context, balanced performance
- Fixed dimensions (no MRL)

Snowflake Models:
- `snowflake-arctic-embed` - 1024d, 512-token context, high precision
- Fixed dimensions (no MRL)
Performance Characteristics
- GPU Mode: ~200-270ms per embedding (NVIDIA GPU)
- CPU Mode: 5-10 seconds per embedding
- Auto-Detection: Automatically adjusts timeouts based on hardware
- Query Optimization: 1-5% performance boost with instruction type
- Batch Processing: Processes texts sequentially (Ollama API limitation)
- Timeout: Adaptive (10s GPU, 60s CPU, or 30s default)
API Compatibility
- Ollama API: `/api/embed` endpoint (POST method)
- Request Format: `{model: string, input: string}`
- Response Format: `{embeddings: number[][]}`
- Authentication: Optional (credentials not required for self-hosted)
🤝 Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
- Fork the repository
- Create your feature branch (`git checkout -b feature/AmazingFeature`)
- Commit your changes (`git commit -m 'Add some AmazingFeature'`)
- Push to the branch (`git push origin feature/AmazingFeature`)
- Open a Pull Request
📄 License
This project is licensed under the MIT License - see the LICENSE file for details.
🙏 Acknowledgments
- Qwen Team for the amazing embedding models
- n8n Community for support and feedback
- LangChain for the embedding interface
🐛 Troubleshooting
HTTP 405 "Method Not Allowed" Errors
Most Common Cause (90% of cases): Trailing slash in Ollama URL configuration.
```
# Check your credential configuration:
✅ Correct: http://localhost:11434
❌ Wrong:   http://localhost:11434/
```
Quick Fix:
- Go to n8n Credentials
- Edit your Ollama credential
- Remove the trailing slash from the URL
- Save and test again
For other 405 error causes, see the comprehensive HTTP 405 Troubleshooting Guide which covers:
- URL formatting issues (trailing slashes)
- n8n authentication system issues
- Working patterns vs anti-patterns
- Step-by-step verification
Model Not Found Errors
```bash
# Pull the correct model first:
ollama pull qwen3-embedding:0.6b

# Verify it's available:
ollama list
```
Performance Issues
- CPU mode timing out: Use Performance Mode "CPU Optimized" or "Auto-Detect"
- GPU not detected: Ensure NVIDIA drivers and Docker GPU runtime are configured
- Slow responses: check that Ollama is running with `ollama list`
🔗 Links
- NPM Package
- GitHub Repository
- Troubleshooting Guide
- n8n Community Nodes
- Qwen3-Embedding Paper
- Ollama Documentation
📮 Support
For issues and questions, please open an issue on the GitHub repository.
Made with ❤️ for the n8n community
