@jettoblack/image_mcp

v1.0.1

MCP server for image summarization using OpenAI-compatible chat completion endpoints

Image Summarization MCP Server

A Model Context Protocol (MCP) server that accepts image files and sends them to an OpenAI-compatible chat completion endpoint for analysis, description, and comparison tasks.

Use Case

Many LLMs used for agentic coding are text-only and lack support for image inputs. This tool lets you delegate image description and analysis to a secondary model, so your primary model does not need to be multimodal. It works with both cloud and local LLMs through any server that exposes the OpenAI chat completion endpoint (including llama.cpp / llama-swap, Ollama, open-webui, OpenRouter, etc.).

For local models, gemma3:4b-it-qat works quite well, with a relatively small footprint and fast performance even on CPU-only machines.

Features

  • Accepts images via unified image_url parameter with multiple input formats
  • Supports custom_prompt to perform specific tasks other than just general description
  • Sends images to OpenAI-compatible chat completion endpoints
  • Returns detailed image descriptions
  • Configurable endpoint URL, API key, and model
  • Command-line interface for configuration
  • Comprehensive error handling
  • TypeScript support

Quick install from NPM

Add this to your global mcp_settings.json or project mcp.json:

  "image_summarization": {
    "command": "npx",
    "args": [
      "-y",
      "@jettoblack/image_mcp",
      "--api-key",
      "key",
      "--base-url",
      "http://localhost:8080/v1",
      "--model",
      "gemma3:4b-it-qat",
      "--timeout",
      "120000",
      "--max-retries",
      "3"
    ],
    "timeout": 300
  }

Replace the base URL, API key, model, and other values as required.

Configuration

The MCP server can be configured using environment variables, command-line arguments, or defaults.

Environment Variables

  • OPENAI_API_KEY: Your API key for the OpenAI-compatible service
  • OPENAI_BASE_URL: The base URL of the OpenAI-compatible service (default: http://localhost:9292/v1)
  • OPENAI_MODEL: The model to use for image analysis
  • OPENAI_TIMEOUT: Request timeout in milliseconds (default: 60000). When running local models you may need to increase this.
  • OPENAI_MAX_RETRIES: Maximum number of retry attempts (default: 3)

Command Line Arguments

npx -y @jettoblack/image_mcp \
  --api-key your-api-key \
  --base-url https://api.openai.com/v1 \
  --model gpt-4-vision-preview \
  --timeout 60000 \
  --max-retries 5

Configuration Priority

  1. Command-line arguments
  2. Environment variables
  3. Default values
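
The priority order above can be sketched as a small resolver. This is a minimal illustration, not the package's actual `config.ts`: the `ServerConfig` shape and the `parseArgs`, `parseNumber`, and `resolveConfig` names are hypothetical, while the flag names, environment variable names, and default values are the ones listed in this README.

```typescript
// Hypothetical config shape; field names are illustrative only.
interface ServerConfig {
  apiKey: string | undefined;
  baseUrl: string;
  model: string | undefined;
  timeoutMs: number;
  maxRetries: number;
}

// Defaults as documented in the environment variable list.
const DEFAULTS = {
  baseUrl: "http://localhost:9292/v1",
  timeoutMs: 60000,
  maxRetries: 3,
};

// Parse a numeric setting, treating missing or non-numeric input as unset.
function parseNumber(value: string | undefined): number | undefined {
  if (value === undefined || value.trim() === "") return undefined;
  const n = Number(value);
  return Number.isFinite(n) ? n : undefined;
}

// Collect "--flag value" pairs from an argv-style array.
function parseArgs(argv: string[]): Record<string, string | undefined> {
  const out: Record<string, string | undefined> = {};
  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i];
    const value = argv[i + 1];
    if (arg !== undefined && value !== undefined && arg.startsWith("--")) {
      out[arg.slice(2)] = value;
      i++; // skip the value we just consumed
    }
  }
  return out;
}

// Command-line arguments win over environment variables, which win over defaults.
function resolveConfig(
  argv: string[],
  env: Record<string, string | undefined>,
): ServerConfig {
  const args = parseArgs(argv);
  return {
    apiKey: args["api-key"] ?? env["OPENAI_API_KEY"],
    baseUrl: args["base-url"] ?? env["OPENAI_BASE_URL"] ?? DEFAULTS.baseUrl,
    model: args["model"] ?? env["OPENAI_MODEL"],
    timeoutMs:
      parseNumber(args["timeout"]) ??
      parseNumber(env["OPENAI_TIMEOUT"]) ??
      DEFAULTS.timeoutMs,
    maxRetries:
      parseNumber(args["max-retries"]) ??
      parseNumber(env["OPENAI_MAX_RETRIES"]) ??
      DEFAULTS.maxRetries,
  };
}
```

In this sketch, calling `resolveConfig(["--model", "gemma3:4b-it-qat"], { OPENAI_MODEL: "other-model" })` takes the model from the command line, while the base URL, timeout, and retry count fall back to the defaults.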

Dev Installation

  1. Clone the repository:
git clone https://github.com/jettoblack/image_mcp.git
cd image_mcp
  2. Install dependencies:
npm install
  3. Build the project:
npm run build
  4. Start the server:
node build/index.js

The server will start and listen on stdio for MCP protocol communications.
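
For orientation, here is a rough sketch of what a tool call looks like on the wire. MCP's stdio transport exchanges JSON-RPC 2.0 messages as newline-delimited JSON, so each request is a single line. The `buildToolCall` and `frameMessage` helpers are hypothetical names used only for illustration. A real client also has to complete the MCP `initialize` handshake before calling tools, which an MCP client library normally does for you.

```typescript
// Shape of a JSON-RPC 2.0 "tools/call" request as used by MCP.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper: build the request object for a given tool.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>,
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// Hypothetical helper: serialize to one line terminated by "\n".
// JSON.stringify without an indent argument escapes newlines inside strings,
// so the output never contains a raw line break.
function frameMessage(message: ToolCallRequest): string {
  return JSON.stringify(message) + "\n";
}

const line = frameMessage(
  buildToolCall(1, "summarize_image", { image_url: "/path/to/your/image.jpg" }),
);
```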

MCP Tool Installation (local build)

Add this to your global mcp_settings.json or project mcp.json:

  "image_summarizer": {
    "command": "node",
    "args": [
      "/path/to/image_mcp/build/index.js",
      "--api-key",
      "key",
      "--base-url",
      "http://localhost:9292/v1",
      "--model",
      "gemma3:4b-it-qat",
      "--timeout",
      "120000",
      "--max-retries",
      "3"
    ],
    "timeout": 300
  }

Usage

MCP Tools

The server provides two tools for image analysis:

summarize_image

Analyzes and describes a single image in detail.

Parameters

  • image_url (string): URL to the image file to analyze. Supports:
    • Absolute file paths
    • file:// URLs
    • HTTP/HTTPS URLs (will be downloaded and converted to base64)
    • Data URLs with base64 encoded image files
  • custom_prompt (string, optional): Custom prompt to use instead of the default image description prompt
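
The four accepted `image_url` formats can be told apart by their prefix. The sketch below is one plausible way to do that, not the server's actual implementation. `ImageSource` and `classifyImageUrl` are hypothetical names, and rejecting relative paths is an assumption based on the list above mentioning only absolute paths.

```typescript
type ImageSource = "data-url" | "http-url" | "file-url" | "file-path";

// Hypothetical helper: decide how an image_url value should be loaded.
function classifyImageUrl(imageUrl: string): ImageSource {
  const lower = imageUrl.toLowerCase();
  if (lower.startsWith("data:")) return "data-url";
  if (lower.startsWith("http://") || lower.startsWith("https://")) return "http-url";
  if (lower.startsWith("file://")) return "file-url";
  // Assumption: only absolute POSIX paths or Windows drive paths are accepted.
  if (/^(\/|[a-zA-Z]:[\\/])/.test(imageUrl)) return "file-path";
  throw new Error(`Unrecognized image_url: ${imageUrl}`);
}
```

Data URLs are checked first because they can be very long and need no further I/O, while the remaining cases each imply a different loading step (HTTP download or local file read).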

Example Usage

Using file path:

{
  "name": "summarize_image",
  "arguments": {
    "image_url": "/path/to/your/image.jpg"
  }
}

Using file:// URL:

{
  "name": "summarize_image",
  "arguments": {
    "image_url": "file:///path/to/your/image.jpg"
  }
}

Using HTTP/HTTPS URL:

{
  "name": "summarize_image",
  "arguments": {
    "image_url": "https://example.com/image.jpg"
  }
}

Using data URL with base64:

{
  "name": "summarize_image",
  "arguments": {
    "image_url": "data:image/png;base64,iVBORw0KGgoAAAANSUhEUg..."
  }
}

With custom prompt:

{
  "name": "summarize_image",
  "arguments": {
    "image_url": "/path/to/your/image.jpg",
    "custom_prompt": "What objects are visible in this image?"
  }
}

compare_images

Compares 2 or more images and describes their similarities and differences.

Parameters

  • image_urls (array of strings): Array of image URLs to compare (minimum 2 images required). Each URL supports:
    • Absolute file paths
    • file:// URLs
    • HTTP/HTTPS URLs (will be downloaded and converted to base64)
    • Data URLs with base64 encoded image files
  • custom_prompt (string, optional): Custom prompt to use instead of the default image comparison prompt

Example Usage

Comparing two images:

{
  "name": "compare_images",
  "arguments": {
    "image_urls": [
      "/path/to/image1.jpg",
      "/path/to/image2.jpg"
    ]
  }
}

Comparing images with a custom prompt:

{
  "name": "compare_images",
  "arguments": {
    "image_urls": [
      "https://example.com/image1.jpg",
      "https://example.com/image2.jpg"
    ],
    "custom_prompt": "Compare these UI screenshots and describe the differences in color themes."
  }
}
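
The parameter rules for compare_images (an array of at least two string URLs plus an optional string prompt) can be checked before any image is loaded. The following is a minimal sketch under that reading. `CompareImagesArgs` and `validateCompareArgs` are hypothetical names, and the error messages are illustrative rather than the server's actual output.

```typescript
interface CompareImagesArgs {
  image_urls: string[];
  custom_prompt?: string;
}

// Hypothetical validator for the arguments of a compare_images call.
function validateCompareArgs(input: unknown): CompareImagesArgs {
  if (typeof input !== "object" || input === null) {
    throw new Error("arguments must be an object");
  }
  const { image_urls, custom_prompt } = input as Record<string, unknown>;

  if (!Array.isArray(image_urls) || !image_urls.every((u) => typeof u === "string")) {
    throw new Error("image_urls must be an array of strings");
  }
  if (image_urls.length < 2) {
    throw new Error("image_urls must contain at least 2 images");
  }

  const result: CompareImagesArgs = { image_urls: image_urls as string[] };
  if (typeof custom_prompt === "string") {
    result.custom_prompt = custom_prompt;
  } else if (custom_prompt !== undefined) {
    throw new Error("custom_prompt must be a string when provided");
  }
  return result;
}
```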

Testing

Running Tests

Run the test suite:

npm test

The test suite includes:

  • Unit tests for image processing functionality
  • Integration tests that require a mock server
  • Tests for both summarize_image and compare_images tools

Mock Server Testing

The project includes a mock OpenAI-compatible server for testing purposes.

  1. Start the mock server in a separate terminal:
node tests/mock-server.js

The mock server will start on http://localhost:9293 and provides endpoints for:

  • GET /v1/models - Lists available models
  • POST /v1/chat/completions - Mock chat completions with image support
  • POST /v1/test/image-process - Test endpoint for image processing validation

  2. Set environment variables for the mock server:
export OPENAI_BASE_URL=http://localhost:9293/v1
export OPENAI_API_KEY=test-key
export OPENAI_MODEL=test-model-vision
  3. Run the integration tests:
npm test tests/integration.test.ts

Real OpenAI-Compatible Server Testing

To test with a real OpenAI-compatible endpoint:

  1. Set up your environment variables:
export OPENAI_API_KEY=your-actual-api-key
export OPENAI_BASE_URL=https://api.openai.com/v1
export OPENAI_MODEL=gpt-4-vision-preview

Or for other OpenAI-compatible services:

export OPENAI_API_KEY=your-service-api-key
export OPENAI_BASE_URL=https://your-service-endpoint/v1
export OPENAI_MODEL=your-vision-model

  2. Start the MCP server:
node build/index.js
  3. Send test requests using an MCP client or test the tools directly.

Manual Testing

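You can manually test the MCP server with an MCP client. The server itself communicates over stdio, so the curl request below only works if the server is exposed over HTTP, for example through an MCP proxy or bridge listening at the URL shown: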

# Test with a local image file
curl -X POST http://localhost:8080/sse \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
      "name": "summarize_image",
      "arguments": {
        "image_url": "/path/to/your/test/image.jpg"
      }
    }
  }'

API Reference

OpenAI-Compatible API Integration

The server sends requests to the OpenAI-compatible chat completion endpoint with the following structure:

{
  "model": "your-model",
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "Describe this image in detail, including all text."
        },
        {
          "type": "image_url",
          "image_url": {
            "url": "data:image/png;base64,..."
          }
        }
      ]
    }
  ],
  "stream": false
}
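
The request structure above can be assembled with a small builder. This is a sketch rather than the package's `openai-client.ts`: `buildRequest` and the type names are hypothetical, the default prompt is copied from the example above, and sending several images as additional `image_url` parts in the same user message is an assumption about how compare_images might work.

```typescript
type ContentPart =
  | { type: "text"; text: string }
  | { type: "image_url"; image_url: { url: string } };

interface ChatCompletionRequest {
  model: string;
  messages: { role: "user"; content: ContentPart[] }[];
  stream: false;
}

// Default prompt as shown in the example request in this README.
const DEFAULT_PROMPT = "Describe this image in detail, including all text.";

// Hypothetical builder: one text part followed by one image part per data URL.
function buildRequest(
  model: string,
  dataUrls: string[],
  customPrompt?: string,
): ChatCompletionRequest {
  const content: ContentPart[] = [
    { type: "text", text: customPrompt ?? DEFAULT_PROMPT },
    ...dataUrls.map((url): ContentPart => ({ type: "image_url", image_url: { url } })),
  ];
  return { model, messages: [{ role: "user", content }], stream: false };
}
```

Calling `buildRequest("your-model", ["data:image/png;base64,..."])` yields an object with the same shape as the JSON shown above.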

Supported Image Formats

  • JPEG (.jpg, .jpeg)
  • PNG (.png)
  • GIF (.gif)
  • WebP (.webp)
  • SVG (.svg)
  • BMP (.bmp)
  • TIFF (.tiff)
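
Because images are sent to the endpoint as base64 data URLs, each supported extension has to map to a MIME type. The sketch below shows one way to do that lookup. `MIME_BY_EXTENSION`, `mimeTypeForPath`, and `toDataUrl` are hypothetical names, and the MIME types are the standard registered ones for these formats rather than values taken from the package source.

```typescript
// Standard MIME types for the extensions listed above.
const MIME_BY_EXTENSION: Record<string, string | undefined> = {
  ".jpg": "image/jpeg",
  ".jpeg": "image/jpeg",
  ".png": "image/png",
  ".gif": "image/gif",
  ".webp": "image/webp",
  ".svg": "image/svg+xml",
  ".bmp": "image/bmp",
  ".tiff": "image/tiff",
};

// Hypothetical helper: look up the MIME type from a file path's extension.
function mimeTypeForPath(path: string): string {
  const dot = path.lastIndexOf(".");
  const ext = dot === -1 ? "" : path.slice(dot).toLowerCase();
  const mime = MIME_BY_EXTENSION[ext];
  if (mime === undefined) {
    throw new Error(`Unsupported image format: ${ext || "(no extension)"}`);
  }
  return mime;
}

// Hypothetical helper: wrap already base64-encoded bytes in a data URL.
function toDataUrl(mime: string, base64: string): string {
  return `data:${mime};base64,${base64}`;
}
```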

Error Handling

The server includes comprehensive error handling for:

  • Invalid image files
  • Unsupported image formats
  • Missing API keys
  • Network connectivity issues
  • API response errors

Development

Project Structure

src/
├── config.ts           # Configuration management
├── image-processor.ts  # Image processing utilities
├── index.ts            # Main MCP server
└── openai-client.ts    # OpenAI-compatible API client

Building

npm run build

Testing

npm test

License

This project is licensed under the MIT License.

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests
  5. Submit a pull request

Support

For issues and questions, please open an issue on the GitHub repository.

Tips

Tips and donations are always appreciated and help fund future development.