n8n-nodes-llm-chat
v0.1.0
Published
n8n nodes for LLM chat via OpenAI-compatible Chat Completions API (llama-server, Ollama /v1, vLLM, etc.)
Maintainers
Readme
n8n-nodes-llm-chat
Custom n8n nodes for chatting with LLMs via OpenAI-compatible Chat Completions API (/v1/chat/completions).
I built these nodes because I couldn't find convenient n8n nodes for quickly integrating with self-hosted LLMs. I run a bunch of models through llama-swap and needed a simple way to call them from n8n workflows. Provided as-is, without warranty, under the MIT license. Feel free to modify and extend.
This package provides two nodes:
| Node | Purpose | |------|---------| | LLM Chat Standalone | Standalone chat node. Place it anywhere in a workflow, send a message to an LLM, and get the response. | | LLM Chat Model | Model supplier for AI Agent. Connects to the standard n8n "AI Agent" node as a language model. |
Which node should I use?
LLM Chat Standalone — for simple, linear workflows. You send a prompt, you get a response. No agent, no tools, no memory. Think of it as "one-shot LLM call" — classify this text, summarize this document, rewrite this email.
LLM Chat Model — for AI Agent workflows. The agent decides what to do, uses tools, maintains conversation memory, and can take multiple steps. Think of it as "LLM brain for an agent" — the agent reasons, calls tools, and iterates.
Use cases
LLM Chat Standalone
You tell the LLM exactly what to do, and it does it. One prompt in, one response out. No tools, no memory, no decision-making. Use it when your workflow already handles the logic — you just need the LLM to transform, classify, or generate text at a specific step.
Email auto-reply draft
Webhook → LLM Chat Standalone (system prompt: "Write a short professional reply to this customer email") → Send Email
Ticket classification
Schedule Trigger → Read Tickets from DB → LLM Chat Standalone (system prompt: "Classify into one of: bug, feature request, support. Reply with category only.") → Switch (route by category) → Handle each type differently
Document summarization
Webhook → Read File (text content) → LLM Chat Standalone (system prompt: "Summarize in 3 bullet points") → Save to Notion
Batch processing
Read Spreadsheet → LLM Chat Standalone (processes each row: "Extract the company name and industry from this text") → Write Spreadsheet
Sentiment analysis
Poll RSS Feed → LLM Chat Standalone (system prompt: "Classify sentiment as: positive, neutral, or negative. Reply with one word only.") → Filter (only negative) → Send Alert
LLM Chat Model
This node provides the LLM to n8n's built-in AI Agent node. The AI Agent can use tools, remember conversation context, and take multiple steps to achieve a goal — but it needs a language model to think with. This node connects your self-hosted LLM to the AI Agent, so the agent can reason and decide what to do using your model instead of a cloud API. Use it when you want the LLM to be autonomous — you give it a goal, and it figures out how to achieve it.
Conversational agent with tools
AI Agent + LLM Chat Model + SerpAPI (web search) + Calculator → the agent decides when to search the web and when to calculate
RAG assistant
AI Agent + LLM Chat Model + Vector Store (retrieve relevant docs) + Window Buffer Memory → the agent answers questions based on your documents
IT helpdesk agent
Chat Trigger → AI Agent + LLM Chat Model + Tool (search knowledge base) + Tool (lookup system status) + Window Buffer Memory → the agent helps employees with IT issues using internal docs
Why not just use the HTTP Request node?
You could — but these nodes save you from:
- Manually constructing the
messagesarray with system/user roles - Parsing the
choices[0].message.contentfrom the response - Handling authentication (Bearer token) via n8n credentials
- Setting up LangChain
ChatOpenAIwith the right baseURL for AI Agent (LLM Chat Model)
Compatibility
Both nodes use the standard OpenAI Chat Completions API format. They work with any server that exposes /v1/chat/completions:
| Server | Endpoint | Notes |
|--------|----------|-------|
| llama-server (llama.cpp) | http://localhost:8080/v1/chat/completions | Native support |
| Ollama | http://localhost:11434/v1/chat/completions | Via OpenAI-compatible layer (experimental) |
| vLLM | http://localhost:8000/v1/chat/completions | Native support |
| LiteLLM | Depends on configuration | Proxy for multiple providers |
| TGI (HuggingFace) | http://localhost:8080/v1/chat/completions | Native support |
| Others | Any OpenAI-compatible API | Any server with /v1/chat/completions |
Note: Ollama has its own API (
/api/chat) which is not compatible with these nodes. Make sure you use the/v1/chat/completionsendpoint.
LLM Chat Standalone
A standalone node for chatting with an LLM. Does not require an AI Agent — works independently anywhere in a workflow.
Features
- Send messages with a system prompt to any OpenAI-compatible endpoint
- Two input modes: from previous node or direct input
- Configurable parameters: temperature, max_tokens, top_p, frequency_penalty, presence_penalty
- Optional API key support
- Can be used as a tool (usableAsTool)
Parameters
Main
| Parameter | Description |
|-----------|-------------|
| Input Source | "From Previous Node" or "Direct Input" |
| Input Field | Field from input data (text, message, output, response, data). Supports expressions: {{ $json.chatInput }} |
| Model | Model name (as recognized by your API endpoint) |
| System Prompt | System prompt |
Options
| Parameter | Default | Description | |-----------|---------|-------------| | Temperature | 1 | 0-2, controls randomness | | Max Tokens | - | Maximum tokens in the response | | Top P | 1 | 0-1, nucleus sampling | | Frequency Penalty | 0 | 0-2, frequency penalty | | Presence Penalty | 0 | 0-2, presence penalty | | Timeout (ms) | 60000 | Request timeout |
Response format
{
"response": "Response text from the model",
"model": "llama-3.1-8b"
}LLM Chat Model
A language model supplier for the standard n8n AI Agent node. Connects to AI Agent as a model (output type: AiLanguageModel).
How to use
- Add an AI Agent node to your workflow
- Add an LLM Chat Model node and connect it to the AI Agent via the "Model" connector
- Create credentials with your API endpoint
- Specify the model name
Parameters
| Parameter | Description | |-----------|-------------| | Model | Model name (e.g., "llama-3.1-8b", "mistral-7b") |
Options
| Parameter | Default | Description | |-----------|---------|-------------| | Temperature | 1 | 0-2, controls randomness | | Max Tokens | - | Maximum tokens in the response | | Top P | 1 | 0-1, nucleus sampling | | Frequency Penalty | 0 | 0-2, frequency penalty | | Presence Penalty | 0 | 0-2, presence penalty |
Credentials
Both nodes share the same credentials:
| Parameter | Description |
|-----------|-------------|
| API Endpoint | Base URL of your API (e.g., http://localhost:8080) |
| API Key | API key if required (leave empty if not needed) |
Note: SSRF protection
If your n8n instance has SSRF protection enabled (available since n8n 2.12.0, disabled by default), the LLM Chat Standalone node may be unable to reach localhost or private IP addresses because its HTTP requests go through n8n's SSRF filter.
To allow access to local LLM servers, add their IP ranges to the allowlist (this takes precedence over the blocklist):
N8N_SSRF_ALLOWED_IP_RANGES="127.0.0.1/8,10.0.0.0/8,172.16.0.0/12,192.168.0.0/16"The LLM Chat Model node uses LangChain's ChatOpenAI which makes HTTP requests directly, so n8n's SSRF protection does not apply to it.
License
MIT
