mcp-plugin-vision

v0.1.0

MCP server that bridges text-only models with vision capabilities — image recognition and web page reading via any OpenAI-compatible vision model.

MCP server for Claude Code / Claude Desktop that bridges text-only models with vision capabilities:

  • Image recognition: reads an image from a local path, the clipboard, a Claude upload, a URL, or base64, then sends it to an OpenAI-compatible vision model
  • Web page reading: fetches web links, extracts the readable text, then asks the model to summarize it or answer questions about it

Useful with models, such as DeepSeek V4, that don't natively support image input.

No model included. You bring your own OpenAI-compatible API endpoint and key.

Quick Start (npx, recommended)

Add the following to your Claude Code .claude.json or to the mcpServers section of your Claude Desktop config:

{
  "mcpServers": {
    "vision-web-bridge": {
      "command": "npx",
      "args": ["-y", "mcp-plugin-vision"],
      "env": {
        "MODEL_BASE_URL": "https://api.openai.com/v1",
        "MODEL_API_KEY": "your-api-key",
        "MODEL_NAME": "gpt-4o",
        "ALLOW_LOCAL_IMAGE_PATHS": "true",
        "ALLOW_CLIPBOARD_IMAGES": "true"
      }
    }
  }
}

Restart Claude Code, then run /mcp; vision-web-bridge should show ✔ connected.

Provider Examples

Xiaomi MiMo:

"env": {
  "MODEL_BASE_URL": "https://api.xiaomimimo.com/v1",
  "MODEL_API_KEY": "sk-...",
  "MODEL_NAME": "mimo-v2-omni"
}

SiliconFlow:

"env": {
  "MODEL_BASE_URL": "https://api.siliconflow.cn/v1",
  "MODEL_API_KEY": "sk-...",
  "MODEL_NAME": "Qwen/Qwen3-VL-8B-Instruct"
}

OpenAI:

"env": {
  "MODEL_BASE_URL": "https://api.openai.com/v1",
  "MODEL_API_KEY": "sk-...",
  "MODEL_NAME": "gpt-4o"
}

Gemini (via OpenAI-compatible layer):

"env": {
  "MODEL_BASE_URL": "https://generativelanguage.googleapis.com/v1beta/openai",
  "MODEL_API_KEY": "AIza...",
  "MODEL_NAME": "gemini-2.0-flash"
}

Any OpenAI-compatible /v1 endpoint works.
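As a rough illustration of why the provider examples differ only in their env values: OpenAI-compatible providers expose the same chat completions route under their base URL. The sketch below assumes the server appends /chat/completions to MODEL_BASE_URL; the helper name chatCompletionsUrl is hypothetical and is not taken from the package source.

```javascript
// Hypothetical helper (not from the package): join a configured base URL
// with the chat completions route used by OpenAI-compatible APIs.
function chatCompletionsUrl(baseUrl) {
  // Trim trailing slashes so ".../v1/" and ".../v1" produce the same result.
  return baseUrl.replace(/\/+$/, "") + "/chat/completions";
}

console.log(chatCompletionsUrl("https://api.openai.com/v1"));
console.log(chatCompletionsUrl("https://generativelanguage.googleapis.com/v1beta/openai/"));
```

Under that assumption, switching providers means changing only the base URL, key, and model name.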

Alternative: Local Install

If you prefer to run from a local checkout:

git clone https://github.com/dangpolly927-eng/mcp-plugin-vision.git
cd mcp-plugin-vision
npm install

Then point the config at the local checkout:

{
  "mcpServers": {
    "vision-web-bridge": {
      "command": "node",
      "args": ["D:\\path\\to\\mcp-plugin-vision\\src\\server.mjs"],
      "env": {
        "MODEL_BASE_URL": "https://api.example.com/v1",
        "MODEL_API_KEY": "your-api-key",
        "MODEL_NAME": "your-vision-model"
      }
    }
  }
}

You can also use a .env file with --env-file-if-exists instead of inline env vars. See .env.example.

Requirements

  • Node.js >= 20

Capabilities

| Capability | macOS | Windows | Linux |
| --- | --- | --- | --- |
| MCP server | Supported | Supported | Supported |
| Local image path | Supported | Supported | Supported |
| Clipboard image | Supported | PowerShell / WinForms | wl-paste / xclip |
| Claude upload image | Supported | Best effort | Best effort |
| Web page reading | Supported | Supported | Supported |

Security Defaults

All dangerous features are opt-in (disabled by default):

| Feature | Default |
| --- | --- |
| ALLOW_LOCAL_IMAGE_PATHS | false |
| ALLOW_CLIPBOARD_IMAGES | false |
| ALLOW_PRIVATE_NETWORK_URLS | false |
| USE_JINA_READER | false |

Set a variable to "true" in the env block to enable that feature.
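One way to picture the opt-in rule is a strict check in which only the literal string "true" enables a feature. This is a sketch under that assumption; the helper isEnabled is hypothetical, and the package may accept other spellings.

```javascript
// Hypothetical helper (not from the package): a feature is enabled only
// when its environment variable is exactly the string "true".
function isEnabled(env, name) {
  return env[name] === "true";
}

const sampleEnv = { ALLOW_LOCAL_IMAGE_PATHS: "true", ALLOW_CLIPBOARD_IMAGES: "1" };

console.log(isEnabled(sampleEnv, "ALLOW_LOCAL_IMAGE_PATHS")); // enabled
console.log(isEnabled(sampleEnv, "ALLOW_CLIPBOARD_IMAGES"));  // "1" is not "true", so disabled
console.log(isEnabled(sampleEnv, "USE_JINA_READER"));         // unset, so disabled
```

Treating anything other than "true" as off keeps a typo or an unset variable from enabling a risky feature by accident.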

Tools

  • read_image_with_model — Read image from local path, clipboard, URL, base64, or latest Claude upload
  • read_links_with_model — Fetch and summarize web page content

Usage

Read the latest image uploaded to the Claude client:

Use read_image_with_model with use_latest_upload=true.

Read the current clipboard image:

Use read_image_with_model with use_clipboard=true and use_latest_upload=false.

Read a local image path after enabling ALLOW_LOCAL_IMAGE_PATHS=true:

Use read_image_with_model with image_path="/absolute/path/to/image.png".

Read web links:

Use read_links_with_model to summarize https://example.com/article
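The prompts above are written in natural language; the client turns them into MCP tool calls. As a hedged sketch, the image examples could correspond to tools/call requests like the ones below. The argument names come from the usage lines above, while the toolCall helper and the request ids are illustrative, and the tool may accept further arguments not shown here.

```javascript
// Hypothetical helper: build an MCP "tools/call" JSON-RPC request object.
function toolCall(id, name, args) {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// Latest image uploaded to the Claude client.
const latestUpload = toolCall(1, "read_image_with_model", { use_latest_upload: true });

// Current clipboard image (requires ALLOW_CLIPBOARD_IMAGES=true).
const clipboard = toolCall(2, "read_image_with_model", {
  use_clipboard: true,
  use_latest_upload: false,
});

// Explicit local path (requires ALLOW_LOCAL_IMAGE_PATHS=true).
const localPath = toolCall(3, "read_image_with_model", {
  image_path: "/absolute/path/to/image.png",
});

console.log(JSON.stringify(clipboard, null, 2));
```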

Tool Details

read_image_with_model

Supported image sources:

  • latest Claude upload;
  • public image URL;
  • base64 image;
  • data URL;
  • local image path, opt-in only;
  • clipboard image, opt-in only.

The tool returns the model response and a non-sensitive source label such as latest uploaded image, clipboard image, or local image path.
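To make the bridge concrete, here is a minimal sketch of how image bytes could be wrapped in an OpenAI-style chat completions body, with the image passed as a data URL. The message shape follows the OpenAI-compatible format; the function buildVisionRequest, its parameter names, and the size check are assumptions for illustration, not the package's actual code.

```javascript
// Hypothetical helper (not from the package): wrap image bytes in an
// OpenAI-style chat completions request body.
function buildVisionRequest({ model, prompt, imageBytes, mimeType, maxBytes = 10485760 }) {
  // Reject oversized images before encoding (default mirrors MAX_IMAGE_BYTES).
  if (imageBytes.length > maxBytes) {
    throw new Error(`Image is ${imageBytes.length} bytes; limit is ${maxBytes}`);
  }
  const dataUrl = `data:${mimeType};base64,${imageBytes.toString("base64")}`;
  return {
    model,
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: prompt },
          { type: "image_url", image_url: { url: dataUrl } },
        ],
      },
    ],
  };
}

const body = buildVisionRequest({
  model: "gpt-4o",
  prompt: "Describe this image.",
  imageBytes: Buffer.from([0x89, 0x50, 0x4e, 0x47]), // placeholder bytes, not a full PNG
  mimeType: "image/png",
});
console.log(body.messages[0].content[1].image_url.url);
```

The resulting object would be sent as the JSON body of a POST to the provider's chat completions route, with the API key in an Authorization: Bearer header.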

read_links_with_model

The tool extracts URLs from the user input, fetches readable page content locally, and asks the configured model to summarize or answer questions.
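The URL extraction step can be approximated with a regular expression. This is a simplified sketch, assuming only http and https links are of interest; extractUrls is a hypothetical name, and the package's own matching rules may differ.

```javascript
// Hypothetical helper (not from the package): pull http(s) URLs out of
// free-form text, de-duplicated in order of first appearance.
function extractUrls(text) {
  const matches = text.match(/https?:\/\/[^\s<>"')\]]+/g) ?? [];
  // Drop trailing punctuation that usually belongs to the sentence, not the URL.
  const cleaned = matches.map((m) => m.replace(/[.,;:!?]+$/, ""));
  return [...new Set(cleaned)];
}

console.log(
  extractUrls("Summarize https://example.com/article, then compare with https://example.org/b.")
);
```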

Private-network URLs are blocked by default. Optional Jina Reader fallback can be enabled with USE_JINA_READER=true, which sends the URL to Jina Reader.
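For a sense of what blocking private-network URLs involves, the sketch below flags loopback and private IPv4 hosts before any request is made. It is deliberately incomplete and not the package's implementation: isPrivateHost is a hypothetical name, private IPv6 ranges other than ::1 are ignored, and a real guard would also need to check the addresses a hostname resolves to.

```javascript
// Hypothetical check (not from the package): does the URL's host look like
// a loopback, link-local, or private IPv4 address?
function isPrivateHost(urlString) {
  const host = new URL(urlString).hostname.toLowerCase();
  if (host === "localhost" || host.endsWith(".localhost")) return true;
  if (host === "[::1]") return true; // IPv6 loopback
  const m = host.match(/^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/);
  if (!m) return false; // other hostnames would need a DNS-level check
  const a = Number(m[1]);
  const b = Number(m[2]);
  return (
    a === 0 ||
    a === 10 ||
    a === 127 ||
    (a === 172 && b >= 16 && b <= 31) ||
    (a === 192 && b === 168) ||
    (a === 169 && b === 254)
  );
}

console.log(isPrivateHost("http://192.168.1.10/admin"));
console.log(isPrivateHost("https://example.com/article"));
```

A guard like this would run only while ALLOW_PRIVATE_NETWORK_URLS is left at its default of false.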

Environment Variables

| Variable | Default | Description |
| --- | --- | --- |
| MODEL_BASE_URL | https://api.example.com/v1 | OpenAI-compatible /v1 endpoint |
| OPENAI_BASE_URL | unset | Fallback base URL if MODEL_BASE_URL is not set |
| MODEL_API_KEY | unset | API key for the model provider |
| MODEL_NAME | replace-with-your-vision-model | Chat or vision model name |
| CLAUDE_UPLOAD_DIRS | client-specific defaults | Override upload directories |
| CLAUDE_UPLOAD_DIRS_DELIMITER | platform default | Directory list delimiter |
| ALLOW_LOCAL_IMAGE_PATHS | false | Allow explicit local image paths |
| ALLOW_CLIPBOARD_IMAGES | false | Allow reading image data from clipboard |
| ALLOW_PRIVATE_NETWORK_URLS | false | Allow private-network web and image URLs |
| USE_JINA_READER | false | Allow Jina Reader fallback |
| MAX_IMAGE_BYTES | 10485760 | Maximum image size in bytes |
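The defaults and the OPENAI_BASE_URL fallback in the table can be read as the following resolution order. This is a sketch of that reading only; loadConfig and the returned field names are hypothetical, and the validation of MAX_IMAGE_BYTES is an added assumption.

```javascript
// Hypothetical loader (not from the package) applying the documented defaults.
function loadConfig(env) {
  const maxImageBytes = Number.parseInt(env.MAX_IMAGE_BYTES ?? "10485760", 10);
  if (!Number.isInteger(maxImageBytes) || maxImageBytes <= 0) {
    throw new Error("MAX_IMAGE_BYTES must be a positive integer");
  }
  return {
    // MODEL_BASE_URL wins, OPENAI_BASE_URL is the fallback, then the placeholder default.
    baseUrl: env.MODEL_BASE_URL || env.OPENAI_BASE_URL || "https://api.example.com/v1",
    apiKey: env.MODEL_API_KEY, // stays undefined when unset
    modelName: env.MODEL_NAME || "replace-with-your-vision-model",
    maxImageBytes,
  };
}

console.log(loadConfig({ OPENAI_BASE_URL: "https://api.openai.com/v1", MODEL_NAME: "gpt-4o" }));
```

In a running server the argument would typically be process.env.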

Development

npm test
npm run check:secrets

Before publishing, run:

npm pack --dry-run

Check the file list carefully. .env, logs, images, local screenshots, and personal paths must not be included.

Windows Notes

  • Use full absolute paths in claude_desktop_config.json.
  • Save JSON config as UTF-8 without BOM.
  • Restart Claude from the system tray after editing config.
  • Clipboard image reading uses PowerShell / Windows Forms.

License

MIT