npm package discovery and stats viewer.

Discover Tips

  • General search

    [free text search, go nuts!]

  • Package details

    pkg:[package-name]

  • User packages

    @[username]

Sponsor

Optimize Toolset

I’ve always been into building performant and accessible sites, but lately I’ve been taking it extremely seriously. So much so that I’ve been building a tool to help me optimize and monitor the sites that I build to make sure that I’m making an attempt to offer the best experience to those who visit them. If you’re into performant, accessible and SEO friendly sites, you might like it too! You can check it out at Optimize Toolset.

About

Hi, 👋, I’m Ryan Hefner  and I built this site for me, and you! The goal of this site was to provide an easy way for me to check the stats on my npm packages, both for prioritizing issues and updates, and to give me a little kick in the pants to keep up on stuff.

As I was building it, I realized that I was actually using the tool to build the tool, and figured I might as well put this out there and hopefully others will find it to be a fast and useful way to search and browse npm packages as I have.

If you’re interested in other things I’m working on, follow me on Twitter or check out the open source projects I’ve been publishing on GitHub.

I am also working on a Twitter bot for this site to tweet the most popular, newest, random packages from npm. Please follow that account now and it will start sending out packages soon–ish.

Open Software & Tools

This site wouldn’t be possible without the immense generosity and tireless efforts from the people who make contributions to the world and share their work via open source initiatives. Thank you 🙏

© 2026 – Pkg Stats / Ryan Hefner

@stabgan/openrouter-mcp-multimodal

v4.5.1

Published

MCP server for OpenRouter with text chat, image analysis + generation, audio analysis + generation, video analysis, and video generation (Veo 3.1 / Sora 2 Pro / Seedance / Wan)

Readme


Verified on MseeP

Access 300+ LLMs — Claude, Gemini, GPT, Llama, Qwen, Grok, and more — through OpenRouter via the Model Context Protocol. Analyze images, audio, and video. Generate images, speech, music, and video (Veo 3.1, Sora 2 Pro, Seedance, Wan). Chat with any model. Works with Claude Desktop, Cursor, Kiro, VS Code, Windsurf, Cline, and any MCP-compatible client. Every tool returns structured _meta.code errors so MCP clients can switch on failure modes without parsing strings.

One-Click Install

After clicking, the target client opens a confirmation prompt. You'll need to paste your OPENROUTER_API_KEY — the deeplink ships a placeholder so no secrets end up in shared links.

Why This One?

| Feature | Status | | :--- | :--- | | Text chat with 300+ models | ✅ | | Image analysis (vision) | ✅ Native with sharp optimization | | Audio analysis | ✅ Transcription + analysis, base64 auto-encoded | | Audio generation | ✅ Conversational, speech, and music with format auto-detection | | Image generation | ✅ Path-sandboxed disk output | | Video understanding | ✅ v3 — mp4, mpeg, mov, webm from files, URLs, or data URLs | | Video generation | ✅ v3 — Veo 3.1 / Sora 2 Pro / Seedance / Wan via async API with progress notifications | | Response caching | ✅ v4.5X-OpenRouter-Cache passthrough, zero tokens billed on hit, 80–300ms latency | | Web search plugin | ✅ v4.5online: true on chat_completion injects OpenRouter's Exa-backed plugin | | Rerank | ✅ v4.5rerank_documents tool against /rerank (Cohere, Fireworks) | | Health check | ✅ v4.5health_check verifies API key + OpenRouter reachability | | Reasoning tokens | ✅ v4.5 — passthrough of DeepSeek R1 / Gemini Thinking / Opus 4.7 traces on _meta.reasoning | | MCP 2025-06-18 spec | ✅ v4.5 — structured outputs (outputSchema), progress notifications, title + openWorldHint | | Auto image resize + compress | ✅ Configurable (defaults 800px max, JPEG 80%) | | Model search + validation | ✅ Filter by vision / audio / video modality | | Free model support | ✅ Default: free Nemotron VL | | Docker support | ✅ Multi-arch (amd64 + arm64), ~345 MB Alpine | | Retry-After + jitter | ✅ Honors Retry-After header, avoids thundering herd | | IPv4 + IPv6 SSRF blocklist | ✅ Covers mapped, compat, multicast, 6to4, Teredo, ORCHID | | Structured error taxonomy | ✅ Closed _meta.code so clients can switch on failure modes | | Reasoning-model awareness | ✅ Detects max_tokens cutoff during CoT, guides the caller | | MCP 2025 tool annotations | ✅ readOnlyHint / destructiveHint / idempotentHint on every tool |

Tools

| Tool | Description | | :--- | :--- | | chat_completion | Send messages to any OpenRouter model. Detects reasoning-model cutoffs. Supports provider routing (quantizations, ignore, sort, order, require_parameters, data_collection, allow_fallbacks), model suffixes (:nitro for fastest, :floor for cheapest, :exacto for Auto Exacto tool-calling), response caching (cache, cache_ttl, cache_clear), reasoning passthrough (include_reasoning), and web search (online, web_max_results). | | analyze_image | Analyze images from local files, URLs, or data URIs. Auto-optimized with sharp. Optional cache_input: true attaches cache_control: ephemeral for Anthropic / Gemini 2.5+ prompt caching. | | analyze_audio | Analyze/transcribe audio (WAV, MP3, FLAC, OGG, etc.) from files, URLs, or data URIs. Optional cache_input: true for prompt caching. | | analyze_video | Analyze/transcribe video (mp4, mpeg, mov, webm) from files, URLs, or data URIs. Optional cache_input: true for prompt caching. | | generate_image | Generate images from text prompts. Supports aspect_ratio (14 values), image_size (0.5K–4K), and max_tokens. Optional path-sandboxed disk save. | | generate_audio | Generate audio from text. Auto-detects format, wraps raw PCM in WAV. | | generate_video | Generate video via OpenRouter's async API (Veo 3.1 / Sora 2 Pro / Seedance / Wan). Submits, polls, downloads, saves. Emits MCP notifications/progress when the client sends a progressToken. | | generate_video_from_image | Image-to-video wrapper around generate_video. Narrower schema, higher tool-call hit rate. | | get_video_status | Resume polling a generate_video job by id. Download + save when complete. | | rerank_documents | Rerank candidate documents against a query via OpenRouter's /rerank endpoint. Supports Cohere and Fireworks rerankers. | | search_models | Search/filter models by name, provider, or capabilities (vision / audio / video). Paginated via offset / next_offset / has_more / total. | | get_model_info | Get pricing, context length, and capabilities for any model. | | validate_model | Check if a model ID exists on OpenRouter. | | health_check | Verify API-key validity, OpenRouter reachability, and return server + protocol versions. |

All error responses carry _meta.code from a closed taxonomy: INVALID_INPUT · UNSAFE_PATH · UPSTREAM_HTTP · UPSTREAM_TIMEOUT · UPSTREAM_REFUSED · UNSUPPORTED_FORMAT · RESOURCE_TOO_LARGE · ZDR_INCOMPATIBLE · MODEL_NOT_FOUND · JOB_FAILED · JOB_STILL_RUNNING · INTERNAL

Quick Start

Prerequisites

Get a free API key from openrouter.ai/keys.

Option 1: npx (no install)

{
  "mcpServers": {
    "openrouter": {
      "command": "npx",
      "args": ["-y", "@stabgan/openrouter-mcp-multimodal"],
      "env": {
        "OPENROUTER_API_KEY": "sk-or-v1-..."
      }
    }
  }
}

Option 2: Docker

{
  "mcpServers": {
    "openrouter": {
      "command": "docker",
      "args": [
        "run", "--rm", "-i",
        "-e", "OPENROUTER_API_KEY=sk-or-v1-...",
        "stabgan/openrouter-mcp-multimodal:latest"
      ]
    }
  }
}

Option 3: Global install

npm install -g @stabgan/openrouter-mcp-multimodal
{
  "mcpServers": {
    "openrouter": {
      "command": "openrouter-multimodal",
      "env": { "OPENROUTER_API_KEY": "sk-or-v1-..." }
    }
  }
}

Option 4: Smithery

npx -y @smithery/cli install @stabgan/openrouter-mcp-multimodal --client claude

Configuration

| Variable | Required | Default | Description | | :--- | :---: | :--- | :--- | | OPENROUTER_API_KEY | Yes | — | Your OpenRouter API key | | OPENROUTER_DEFAULT_MODEL | No | nvidia/nemotron-nano-12b-v2-vl:free | Default model for chat + analyze tools | | DEFAULT_MODEL | No | — | Alias for above | | OPENROUTER_MAX_TOKENS | No | — | Default max_tokens for chat_completion when not set in the request. Useful on low-credit / free-tier accounts to avoid the full-context-window reservation. | | OPENROUTER_PROVIDER_QUANTIZATIONS | No | — | CSV. Filter providers by quantization (e.g. fp16,int8). | | OPENROUTER_PROVIDER_IGNORE | No | — | CSV. Exclude these provider slugs (e.g. openai,anthropic). | | OPENROUTER_PROVIDER_SORT | No | — | price / throughput / latency. | | OPENROUTER_PROVIDER_ORDER | No | — | JSON array or CSV of provider IDs (e.g. ["meta-llama","google"]). | | OPENROUTER_PROVIDER_REQUIRE_PARAMETERS | No | — | true / false. Only use providers supporting every request parameter. | | OPENROUTER_PROVIDER_DATA_COLLECTION | No | — | allow / deny. Opt out of providers that log request data. | | OPENROUTER_PROVIDER_ALLOW_FALLBACKS | No | — | true / false. | | OPENROUTER_CACHE_RESPONSES | No | — | 1 / true. Enable response caching server-wide. Sends X-OpenRouter-Cache: true on chat + analyze_* calls unless overridden per-request with cache: false. Zero tokens billed on hits. | | OPENROUTER_INCLUDE_REASONING | No | — | 1 / true. Enable reasoning tokens passthrough server-wide for DeepSeek R1 / Gemini Thinking / Opus 4.7. Adds _meta.reasoning to chat_completion responses. | | OPENROUTER_MODEL_CACHE_TTL_MS | No | 3600000 | Model cache TTL (ms) | | OPENROUTER_IMAGE_MAX_DIMENSION | No | 800 | Longest edge for resize (px) | | OPENROUTER_IMAGE_JPEG_QUALITY | No | 80 | JPEG quality (1–100) | | OPENROUTER_IMAGE_FETCH_TIMEOUT_MS | No | 30000 | Image URL timeout | | OPENROUTER_IMAGE_MAX_DOWNLOAD_BYTES | No | 26214400 | Image URL size cap (~25 MB) | | OPENROUTER_IMAGE_MAX_REDIRECTS | No | 8 | Image URL redirect cap | | OPENROUTER_IMAGE_MAX_DATA_URL_BYTES | No | 20971520 | Image data URL size cap (~20 MB) | | OPENROUTER_AUDIO_FETCH_TIMEOUT_MS | No | 30000 | Audio URL timeout | | OPENROUTER_AUDIO_MAX_DOWNLOAD_BYTES | No | 26214400 | Audio URL size cap (~25 MB) | | OPENROUTER_AUDIO_MAX_REDIRECTS | No | 8 | Audio URL redirect cap | | OPENROUTER_AUDIO_MAX_DATA_URL_BYTES | No | 20971520 | Audio data URL size cap | | OPENROUTER_DEFAULT_VIDEO_MODEL | No | google/gemini-2.5-flash | Default for analyze_video | | OPENROUTER_DEFAULT_VIDEO_GEN_MODEL | No | google/veo-3.1 | Default for generate_video | | OPENROUTER_VIDEO_FETCH_TIMEOUT_MS | No | 60000 | Video URL timeout | | OPENROUTER_VIDEO_MAX_DOWNLOAD_BYTES | No | 104857600 | Video URL size cap (~100 MB) | | OPENROUTER_VIDEO_MAX_REDIRECTS | No | 8 | Video URL redirect cap | | OPENROUTER_VIDEO_MAX_DATA_URL_BYTES | No | 104857600 | Video data URL size cap | | OPENROUTER_VIDEO_POLL_INTERVAL_MS | No | 15000 | Async video poll cadence | | OPENROUTER_VIDEO_MAX_WAIT_MS | No | 600000 | Max wait before returning a resumable handle | | OPENROUTER_VIDEO_GEN_MAX_BYTES | No | 268435456 | Generated video download cap (~256 MB) | | OPENROUTER_VIDEO_INLINE_MAX_BYTES | No | 10485760 | Inline video ceiling (~10 MB) | | OPENROUTER_OUTPUT_DIR | No | process.cwd() | Sandbox root for save_path | | OPENROUTER_ALLOW_UNSAFE_PATHS | No | — | 1 disables the sandbox | | OPENROUTER_LOG_LEVEL | No | info | error / warn / info / debug |

Security notes

  • Analyze tools can read local files and fetch HTTP(S) URLs. URL fetches block private/link-local/reserved IPv4 and IPv6 targets (SSRF mitigation) and cap response size.
  • Generate tools write to disk through a path sandbox: save_path is resolved against OPENROUTER_OUTPUT_DIR and any traversal attempt is rejected. Override with OPENROUTER_ALLOW_UNSAFE_PATHS=1.
  • IPv6 SSRF blocklist covers loopback, unspecified, IPv4-mapped, IPv4-compatible, link-local, site-local, ULA, multicast, documentation, Teredo, ORCHID, and 6to4 of private IPv4.

Usage Examples

# Chat
Use chat_completion to explain quantum computing in simple terms.

# Chat with provider routing — prefer cheapest provider, exclude OpenAI, opt out of data collection
Use chat_completion with model "anthropic/claude-3.5-sonnet", prompt "Summarize this",
provider { sort: "price", ignore: ["openai"], data_collection: "deny" }

# Chat with :nitro variant for faster response
Use chat_completion with model "openai/gpt-4o:nitro", prompt "Reason step-by-step about this problem"

# Chat with :floor variant for cheapest provider of the requested model
Use chat_completion with model "mistralai/mistral-7b-instruct:floor", prompt "Quick check"

# Chat with response caching + reasoning passthrough (v4.5)
Use chat_completion with model "deepseek/deepseek-r1", prompt "Prove sqrt(2) is irrational",
cache: true, cache_ttl: 3600, include_reasoning: true
# → response.meta.cache = { status: "hit" | "miss", age, ttl }
# → response.meta.reasoning = "<upstream reasoning trace>"

# Chat with web search plugin (v4.5)
Use chat_completion with model "openai/gpt-4o", prompt "What shipped in OpenRouter last week?",
online: true, web_max_results: 5

# Rerank documents against a query (v4.5)
Use rerank_documents with query "best practices for MCP server auth",
documents: ["doc A text...", "doc B text...", "doc C text..."], top_n: 3

# Generate video from an image (v4.5)
Use generate_video_from_image with image "./frame.png", prompt "zoom out slowly",
model "google/veo-3.1", save to ./clip.mp4

# Health check (v4.5)
Use health_check
# → { ok: true, server_version: "4.5.0", protocol_version: "2025-06-18", api_key_valid: true, models_cached: 312 }

# Vision
Use analyze_image on /path/to/photo.jpg and tell me what you see.

# Audio transcription
Use analyze_audio on /path/to/recording.mp3 to transcribe it.

# Video understanding
Use analyze_video on /path/to/clip.mp4 — what happens at 00:15?

# Generate audio
Use generate_audio with prompt "Explain neural networks" and voice "alloy", save to ./response.wav

# Generate music
Use generate_audio with model "google/lyria-3-clip-preview" and prompt "upbeat jazz piano trio"

# Generate image
Use generate_image with prompt "a cat astronaut on mars", aspect_ratio "16:9", image_size "1K", save to ./cat.png

# Generate video
Use generate_video with model "google/veo-3.1", prompt "a calm river at sunrise",
resolution 720p, duration 4, save to ./river.mp4

# Resume a video job
Use get_video_status with video_id "vid_abc123" and save_path "./river.mp4"

Architecture

src/
├── index.ts                    # Entry, env validation, graceful shutdown
├── tool-handlers.ts            # 14 tools (annotated) + dispatch
├── model-cache.ts              # TTL + in-flight coalescing
├── openrouter-api.ts           # REST client (chat + /videos)
├── errors.ts                   # Closed ErrorCode enum
├── logger.ts                   # JSON-line structured logger
└── tool-handlers/
    ├── fetch-utils.ts          # SSRF, bounded fetch, data-URL parser
    ├── openrouter-errors.ts    # SDK/HTTP → ErrorCode classifier
    ├── completion-utils.ts     # Reasoning-model cutoff detection
    ├── path-safety.ts          # save_path sandbox
    ├── chat-completion.ts      # Text + multimodal chat
    ├── analyze-image.ts        # Vision analysis
    ├── analyze-audio.ts        # Audio transcription
    ├── analyze-video.ts        # Video understanding
    ├── generate-image.ts       # Image generation
    ├── generate-audio.ts       # Audio generation + streaming
    ├── generate-video.ts       # Video generation (async)
    ├── image-utils.ts          # Sharp optimization, MIME sniffing
    ├── audio-utils.ts          # Audio format detection
    ├── video-utils.ts          # Video format detection
    ├── search-models.ts        # Model search
    ├── get-model-info.ts       # Model detail lookup
    └── validate-model.ts       # Model existence check

Design Principles & Research

v4.5.0's design draws from three threads of research and industry guidance. Rather than building in isolation, every feature ties to a cited source so decisions can be re-examined later.

MCP-first design principles

We follow Phil Schmid's production guide for MCP servers (Jan 2026), which argues that an MCP server is "a user interface for AI agents, not a REST API wrapper":

  • Outcomes, not operations. Our tools like analyze_image and generate_video encapsulate a whole workflow (fetch, validate, invoke, save) rather than exposing raw OpenRouter primitives.
  • Flattened arguments. Top-level primitives with enums (aspect_ratio, image_size), no deeply nested configuration blobs. The one nested object (provider) is required by OpenRouter's routing schema.
  • Descriptions are context. Every tool description includes "Fails when:" and "Works with:" sections (see next section for the research backing).
  • Curated surface. 14 tools total. Each is a distinct outcome; no "helper" tools that exist only for internal composition.

Apigene's "12 Rules for Production MCP Deployment" (March 2026) guided the error-handling posture: structured errors with suggestions and retry_after_seconds on _meta beat raw error strings the agent has to interpret.

MCP 2025-06-18 spec compliance

  • Structured outputs. validate_model, get_model_info, search_models, rerank_documents, and health_check emit structuredContent with outputSchema, per §5.2.6-7. Agents can validate responses typefully.
  • Progress notifications. generate_video emits notifications/progress on every poll tick when the client passes a progressToken in _meta. Progress values are guaranteed strictly monotonic per spec.
  • Tool annotations. Every tool carries title + readOnlyHint + destructiveHint + idempotentHint + openWorldHint so clients can render appropriate UI affordances.

Research-backed tool-design decisions

These papers shaped specific v4.5.0 choices:

| Finding | Source | How it shaped v4.5.0 | | :--- | :--- | :--- | | Failure-mode docs and inter-tool relationships measurably improve tool-selection accuracy | Schlapbach, Convergence of SGD & MCP (Feb 2026) | Every tool description has explicit "Fails when:" (ErrorCode triggers) and "Works with:" (related tools). | | Tool-call success drops with parameter count and schema complexity | Fu et al., ROSBag MCP Server (Nov 2025) | generate_video_from_image is a narrower image-to-video wrapper around generate_video — fewer params, higher hit rate. | | Indirect prompt injection via tool-returned content is a real attack vector | Zhao et al., ClawGuard (Apr 2026) · Yu et al., Defense via Tool Result Parsing (Jan 2026) | analyze_image / analyze_audio / analyze_video tag their output _meta.content_is_untrusted: true. Downstream agents know to treat that text as data, not instructions. | | Provider-level tool-calling variance is large and persists across providers for the same model | OpenRouter Auto Exacto announcement (Mar 2026) | chat_completion documents the :exacto model suffix alongside :nitro / :floor. 80-88% error reduction on top tool-calling models. | | LLM JSON defects compound at scale | OpenRouter Response Healing (Dec 2025) | Structured outputs + outputSchema declarations give clients a parseable contract. (Response-healing plugin itself is opt-in on OpenRouter's side.) | | MCP servers are vulnerable to preference-manipulation and tool-poisoning attacks | Wang et al., MPMA (May 2025) · Turgut & Gümüş, CASCADE (Apr 2026) | Tool descriptions audited for injection surface; audit logging (logger.audit()) captures every paid-op invocation with a prompt preview for forensics. |

OpenRouter platform parity

v4.5.0 surfaces platform features shipped between Q4 2025 and Q2 2026:

Security posture

  • Path sandbox. All file writes (save_path) and reads (input_images, frame images) go through resolveSafeOutputPath / resolveSafeInputPath, which reject traversal escapes. Legacy bypass: OPENROUTER_ALLOW_UNSAFE_PATHS=1.
  • SSRF blocklist. Loopback, private, link-local, multicast, 6to4, Teredo, ORCHID, and IPv4-mapped IPv6 all rejected at the fetch layer.
  • Audit logging. logger.audit() emits a JSON line at level=audit for every generate_video, generate_audio, and generate_image call. Bypasses OPENROUTER_LOG_LEVEL so unintended spend is always traceable. 80-char prompt preview is the hard PII boundary.
  • Structured errors. Closed _meta.code taxonomy means agents switch on failure modes without regex-parsing free text. Rate-limit errors include retry_after_seconds derived from Retry-After headers.
  • No credential leakage. OPENROUTER_API_KEY is read once at startup, passed to the SDK, and never echoed in logs, tool responses, or error messages. Fatal-error logging whitelists fields explicitly (name / message / trimmed stack) — no raw error objects. Verified by an independent bug-hunter audit (Apr 2026).

Development

git clone https://github.com/stabgan/openrouter-mcp-multimodal.git
cd openrouter-mcp-multimodal
npm install
cp .env.example .env  # Add your API key
npm run build
npm start
npm test                    # 163 unit tests, <1s
npm run test:integration    # Live API tests
npm run lint
node scripts/live-e2e.mjs  # 16 live E2E scenarios

Upgrading from v2

v3 is additive — no tool schemas or env vars were removed.

  • Three new tools: analyze_video, generate_video, get_video_status
  • Structured _meta.code on every error response (text messages preserved)
  • save_path sandboxed by default — set OPENROUTER_OUTPUT_DIR or OPENROUTER_ALLOW_UNSAFE_PATHS=1
  • Reasoning-model awareness: content: null + finish_reason: length now returns INVALID_INPUT with a preview instead of empty string
  • IPv6 SSRF coverage extended to mapped, compat, multicast, 6to4, Teredo, ORCHID

Compatibility

Works with any MCP client: Kiro · Claude Desktop · Cursor · Windsurf · Cline · any MCP-compatible client.

License

Apache 2.0. See LICENSE. v1.0.0 through v3.2.0 were released under MIT; v4.0.0 relicensed to Apache 2.0 (Apache 2.0 is a permissive superset of MIT with explicit patent grant).

Contributing

Issues and PRs welcome. Please open an issue first for major changes.