n8n-nodes-openrouter-cache-chat-model
v0.2.0
OpenRouter Chat Model with prompt caching support for n8n AI Agent workflows
n8n-nodes-openrouter-cache
OpenRouter Chat Model node for n8n with prompt caching support. It is a drop-in replacement for the built-in OpenRouter Chat Model that injects `cache_control` markers to reduce costs by up to 90% on Anthropic and Gemini models.
Features
- Prompt caching for Anthropic and Gemini models via OpenRouter (system message + optional last user message)
- Configurable TTL: 5-minute default or 1-hour extended cache
- Cache breakpoints: System message only, or system + last user message
- Caching enabled by default — the whole point of this node
- Tool call fix preserved from upstream (fixes empty arguments from Anthropic-via-OpenRouter)
- Uses the existing `openRouterApi` credential, no extra setup
Install
In your n8n instance:
- Go to Settings → Community Nodes
- Click Install
- Enter `n8n-nodes-openrouter-cache-chat-model`
- Click Install
Usage
- Add the OpenRouter Cache Chat Model node to your workflow
- Connect it to an AI Agent or AI Chain node
- Select your model (e.g., `anthropic/claude-sonnet-4-20250514` or `google/gemini-2.5-flash`)
- Caching is enabled by default; configure it under Options if needed
Options
| Option | Default | Description |
|--------|---------|-------------|
| Enable Prompt Caching | true | Toggle caching on/off |
| Cache TTL | 5 Minutes | 5 Minutes or 1 Hour (1h has 2x write cost on Anthropic) |
| Cache Breakpoints | System Message Only | System Message Only or System + Last User Message |
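To get a feel for the TTL trade-off, the following is a rough arithmetic sketch rather than a pricing reference. It assumes Anthropic-style multipliers relative to the normal input-token price: 1.25x for a 5-minute cache write, 2x for a 1-hour cache write (the figure quoted in the table), and 0.1x for a cache read. The names `WRITE_MULTIPLIER`, `READ_MULTIPLIER` and `relativePrefixCost` are hypothetical and not part of this node; check current provider pricing before relying on the numbers.

```typescript
// Assumed price multipliers relative to the normal input-token price.
// These are illustrative assumptions, not values read from this node.
const WRITE_MULTIPLIER = { "5m": 1.25, "1h": 2.0 } as const;
const READ_MULTIPLIER = 0.1;

type CacheTtl = keyof typeof WRITE_MULTIPLIER;

/**
 * Cost of sending the same cacheable prefix `calls` times, relative to
 * sending it uncached every time. Assumes exactly one cache write on the
 * first call and a cache hit on every later call (all within the TTL).
 */
function relativePrefixCost(calls: number, ttl: CacheTtl): number {
  if (!Number.isInteger(calls) || calls < 1) {
    throw new RangeError("calls must be a positive integer");
  }
  const cached = WRITE_MULTIPLIER[ttl] + (calls - 1) * READ_MULTIPLIER;
  const uncached = calls; // every call pays the full input price
  return cached / uncached;
}
```

Under these assumptions a single call is more expensive with caching than without (1.25x or 2x), ten calls within a 5-minute window cost roughly 0.215x the uncached price for the prefix, and the ratio approaches 0.1x (the "up to 90%" case) as the number of hits grows. The 1-hour TTL only pays off when calls are spaced too far apart for the 5-minute cache to survive.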
How it works
The node wraps the standard ChatOpenAI from LangChain with a custom fetch interceptor that:
- Intercepts outgoing API requests
- Transforms system message content from a string to content blocks with `cache_control: { type: "ephemeral" }`
- Optionally marks the last user message for caching too
- Sends the modified request to OpenRouter
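The steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the node's actual source: the type and function names (`CacheOptions`, `addCacheControl`, `createCachingFetch`, `FetchLike`) are hypothetical, the request body is assumed to be an OpenAI-style JSON payload with a `messages` array, and the `ttl: "1h"` field on the marker is assumed to be how the 1-hour option is expressed.

```typescript
type CacheControl = { type: "ephemeral"; ttl?: "1h" };
type ContentBlock = { type: "text"; text: string; cache_control?: CacheControl };
type ChatMessage = { role: string; content: string | ContentBlock[] };

// Hypothetical option shape mirroring the Options table.
interface CacheOptions {
  ttl: "5m" | "1h";
  breakpoints: "system" | "system_and_last_user";
}

// Minimal fetch-like signature so the sketch has no runtime dependencies.
type RequestInitLike = { body?: unknown; [key: string]: unknown };
type FetchLike = (input: unknown, init?: RequestInitLike) => Promise<unknown>;

/** Put a cache marker on a message, converting string content to blocks. */
function markForCaching(message: ChatMessage, cacheControl: CacheControl): ChatMessage {
  if (typeof message.content === "string") {
    return {
      ...message,
      content: [{ type: "text", text: message.content, cache_control: cacheControl }],
    };
  }
  // Already block-based: mark only the last block.
  const content = message.content.map((block, i, all) =>
    i === all.length - 1 ? { ...block, cache_control: cacheControl } : block,
  );
  return { ...message, content };
}

/** Mark system messages and, optionally, the last user message. */
function addCacheControl(messages: ChatMessage[], options: CacheOptions): ChatMessage[] {
  const cacheControl: CacheControl =
    options.ttl === "1h" ? { type: "ephemeral", ttl: "1h" } : { type: "ephemeral" };
  let lastUserIndex = -1;
  messages.forEach((m, i) => {
    if (m.role === "user") lastUserIndex = i;
  });
  return messages.map((m, i) => {
    const isLastUser = options.breakpoints === "system_and_last_user" && i === lastUserIndex;
    return m.role === "system" || isLastUser ? markForCaching(m, cacheControl) : m;
  });
}

/** Wrap a fetch-like function so outgoing JSON bodies get cache markers. */
function createCachingFetch(options: CacheOptions, baseFetch: FetchLike): FetchLike {
  return async (input, init) => {
    let nextInit = init;
    const body = init?.body;
    if (typeof body === "string") {
      try {
        const payload = JSON.parse(body);
        if (Array.isArray(payload?.messages)) {
          payload.messages = addCacheControl(payload.messages, options);
          nextInit = { ...init, body: JSON.stringify(payload) };
        }
      } catch {
        // Body was not JSON; forward the request unchanged.
      }
    }
    return baseFetch(input, nextInit);
  };
}
```

Keeping the message transformation in a pure function (`addCacheControl`) separate from the fetch wrapper makes it easy to unit test without any network access. The wrapper leaves non-JSON bodies and payloads without a `messages` array untouched.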
On the response side, it also fixes empty tool call arguments (a known issue with Anthropic models via OpenRouter).
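The README does not spell out how the tool call fix works, so the following is only a guess at its shape. It assumes the problem is a tool call whose `arguments` field arrives as an empty string (which is not valid JSON) and that the remedy is to substitute `"{}"`. The response types and the `fixEmptyToolArguments` name are hypothetical, and only non-streaming responses are handled.

```typescript
// Loose, hypothetical shapes for an OpenAI-style chat completion response.
type ToolCall = { id?: string; type?: string; function: { name: string; arguments?: string } };
type Choice = { message?: { tool_calls?: ToolCall[] } };
type ChatCompletion = { choices?: Choice[] };

/**
 * Return a copy of the response in which every tool call with missing or
 * blank arguments carries "{}" instead, so later JSON.parse calls succeed.
 */
function fixEmptyToolArguments(completion: ChatCompletion): ChatCompletion {
  // Deep copy via JSON so the caller's object is never mutated.
  const fixed: ChatCompletion = JSON.parse(JSON.stringify(completion));
  for (const choice of fixed.choices ?? []) {
    for (const call of choice.message?.tool_calls ?? []) {
      const args = call.function.arguments;
      if (typeof args !== "string" || args.trim() === "") {
        call.function.arguments = "{}";
      }
    }
  }
  return fixed;
}
```

Tool calls that already carry a non-empty argument string pass through unchanged, so the function is safe to apply to every response regardless of provider.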
Which models benefit?
| Provider | Caching | Notes |
|----------|---------|-------|
| Anthropic (Claude) | Explicit (this node adds it) | Min 1024-4096 tokens depending on model |
| OpenAI (GPT) | Automatic | No modification needed, but node won't break anything |
| DeepSeek | Automatic | Same as OpenAI |
| Google (Gemini 2.5) | Explicit (this node adds it) | Min 1,028 tokens (Flash) / 2,048 tokens (Pro). Fixed ~3-5min TTL |
Verifying cache hits
Check the OpenRouter Activity dashboard after running your workflow twice:
- First run: `cache_write_tokens > 0` (cache established)
- Second run: `cached_tokens > 0` (cache hit, reduced cost)
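If you copy those two counters out of the dashboard (or log them yourself), the check can be expressed as a small helper. This is an illustrative sketch: the flat `CacheCounters` shape and the `classifyCacheUse` name are assumptions, and where exactly the counters appear in an API response is not covered here.

```typescript
// Hypothetical flat record holding the two counters named above.
type CacheCounters = { cached_tokens?: number; cache_write_tokens?: number };
type CacheStatus = "hit" | "write" | "none";

/** Classify one run: a hit takes precedence over a write. */
function classifyCacheUse(counters: CacheCounters): CacheStatus {
  if ((counters.cached_tokens ?? 0) > 0) return "hit";
  if ((counters.cache_write_tokens ?? 0) > 0) return "write";
  return "none";
}
```

A result of `"none"` on both runs usually means the prompt fell below the provider's minimum token threshold or caching was disabled in Options.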
Development

    npm install
    npm run build

For local testing, copy dist/ to your n8n custom nodes directory:

    cp -r dist/ ~/.n8n/custom/node_modules/n8n-nodes-openrouter-cache-chat-model/dist/
    cp package.json ~/.n8n/custom/node_modules/n8n-nodes-openrouter-cache-chat-model/

License
MIT
