@abyssbug/vision-mcp

v0.1.0, published 3 months ago

Local, API-free Vision MCP server for image and video analysis

Vulnerabilities: 0 high, 0 medium, 0 low

Maintainer: abyssbug

Keywords: mcp, vision, image, video, ffmpeg, api-free

Local, API-free Vision MCP server for image and video analysis. Works with any MCP-compatible client (OpenCode, Claude, etc.) without requiring external API keys.

Features

Image Analysis: Process images with optional resizing/compression
Video Analysis: Extract frames using ffmpeg with uniform or scene-based sampling
No API Keys: Preprocessing runs entirely locally; the server needs no external API keys, and your chosen model does the analysis
Provider Agnostic: Compatible with GLM 4.6/4.5, Claude, and other vision-capable models

Installation

For OpenCode/Claude Desktop

Add to your MCP configuration:

{
  "mcpServers": {
    "vision-mcp": {
      "command": "npx",
      "args": [
        "-y",
        "@abyssbug/vision-mcp"
      ]
    }
  }
}

Prerequisites

Node.js >= 22.0.0
ffmpeg and ffprobe (for video analysis)

Install ffmpeg (the packages below also provide ffprobe):

# macOS
brew install ffmpeg

# Ubuntu/Debian
sudo apt install ffmpeg

# Windows
# Download a build from https://ffmpeg.org/download.html and add ffmpeg and ffprobe to your PATH
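
To confirm the prerequisites before wiring the server into a client, a small check along the following lines may help. This is a sketch, not part of the package: the helper names `nodeMajor`, `hasBinary`, and `missingPrerequisites` are hypothetical, and the check relies only on the fact that both ffmpeg and ffprobe accept `-version`.

```typescript
import { spawnSync } from "node:child_process";

// Extract the major version from a string such as "v22.3.0" or "22.3.0".
function nodeMajor(version: string): number {
  const match = /^v?(\d+)\./.exec(version);
  return match ? Number(match[1]) : NaN;
}

// Return true when `<name> -version` can be started and exits with status 0.
function hasBinary(name: string): boolean {
  const result = spawnSync(name, ["-version"], { stdio: "ignore" });
  return result.error === undefined && result.status === 0;
}

// Hypothetical helper: list every prerequisite that is not met.
function missingPrerequisites(): string[] {
  const missing: string[] = [];
  // A NaN major version (unparseable string) also counts as missing.
  if (!(nodeMajor(process.version) >= 22)) missing.push("Node.js >= 22.0.0");
  for (const bin of ["ffmpeg", "ffprobe"]) {
    if (!hasBinary(bin)) missing.push(bin);
  }
  return missing;
}

console.log(missingPrerequisites());
```

An empty array means everything needed for both image and video analysis is available.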

Usage

Image Analysis

Call the image_analysis tool with:

{
  "path": "./image.png",
  "maxWidth": 1024
}
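
MCP clients normally issue this call for you. For debugging, it can help to see the message on the wire: MCP uses JSON-RPC 2.0, and a tool invocation is a `tools/call` request. The sketch below wraps the arguments above in that envelope; `buildToolCall` is a hypothetical helper and the `id` value is arbitrary.

```typescript
// Shape of an MCP tool invocation (JSON-RPC 2.0, method "tools/call").
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper that wraps tool arguments in the request envelope.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>,
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

const request = buildToolCall(1, "image_analysis", {
  path: "./image.png",
  maxWidth: 1024,
});
console.log(JSON.stringify(request, null, 2));
```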

Video Analysis

Call the video_analysis tool with:

{
  "path": "./video.mp4",
  "maxFrames": 12,
  "width": 1024,
  "strategy": "uniform"
}
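
To give a feel for what the `uniform` strategy implies, here is one plausible way to choose timestamps. It is an assumption for illustration, not the package's confirmed behaviour: the video is split into `maxFrames` equal segments and one frame is taken at the midpoint of each. The function name `uniformTimestamps` is hypothetical.

```typescript
// Pick `maxFrames` timestamps (in seconds) spread evenly across a video.
// Each timestamp sits at the midpoint of one of `maxFrames` equal segments,
// so the very first and very last instants of the video are never sampled.
function uniformTimestamps(durationSec: number, maxFrames: number): number[] {
  if (!(durationSec > 0) || !Number.isInteger(maxFrames) || maxFrames < 1) {
    return [];
  }
  const segment = durationSec / maxFrames;
  return Array.from({ length: maxFrames }, (_, i) => (i + 0.5) * segment);
}

// A 60-second video sampled with maxFrames = 12 yields 2.5, 7.5, ..., 57.5.
console.log(uniformTimestamps(60, 12));
```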

Configuration

Optionally set these environment variables to override the default limits:

MAX_BYTES=52428800      # Max file size in bytes (default: 50 MB)
FRAME_LIMIT=24          # Max frames per video (default: 24)
DEFAULT_WIDTH=1024      # Default resize width in pixels (default: 1024)
TEMP_DIR=/tmp           # Temp directory (default: system temp)
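
The variables above could be read with fallbacks roughly as follows. This is a sketch under assumptions: the `Limits` type and the helpers `positiveInt` and `readLimits` are hypothetical, and treating invalid values as "use the default" is a design choice made here, not documented behaviour.

```typescript
interface Limits {
  maxBytes: number; // MAX_BYTES
  frameLimit: number; // FRAME_LIMIT
  defaultWidth: number; // DEFAULT_WIDTH
  tempDir: string | undefined; // TEMP_DIR; undefined means "use the system temp directory"
}

// Parse a positive integer, falling back to the default when the variable
// is unset, empty, or not a valid positive integer.
function positiveInt(raw: string | undefined, fallback: number): number {
  const value = Number(raw);
  return raw !== undefined && Number.isInteger(value) && value > 0 ? value : fallback;
}

function readLimits(env: Record<string, string | undefined>): Limits {
  return {
    maxBytes: positiveInt(env["MAX_BYTES"], 52428800),
    frameLimit: positiveInt(env["FRAME_LIMIT"], 24),
    defaultWidth: positiveInt(env["DEFAULT_WIDTH"], 1024),
    tempDir: env["TEMP_DIR"] || undefined,
  };
}

// Only FRAME_LIMIT is overridden here; everything else keeps its default.
const limits = readLimits({ FRAME_LIMIT: "12" });
console.log(limits);
```

In a real process, `readLimits(process.env)` would be the natural call.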

How It Works

No Inference: This MCP server only preprocesses media (resizing, frame extraction); it runs no model itself
Model Agnostic: Your chosen model performs the actual vision understanding
Local Processing: All operations happen locally with ffmpeg and sharp
Base64 Output: Returns processed media as base64-encoded content
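
As an illustration of the base64 output step, the sketch below wraps raw image bytes in an MCP image content block (`type`, `data`, `mimeType`). The helper name `toImageContent` is hypothetical, and the four sample bytes are only the start of a PNG signature, not a valid image.

```typescript
import { Buffer } from "node:buffer";

// MCP image content block: a base64 payload plus its MIME type.
interface ImageContent {
  type: "image";
  data: string;
  mimeType: string;
}

// Hypothetical helper: encode raw bytes for return to the client.
function toImageContent(bytes: Uint8Array, mimeType: string): ImageContent {
  return { type: "image", data: Buffer.from(bytes).toString("base64"), mimeType };
}

// First four bytes of the PNG signature, used as stand-in image data.
const sample = new Uint8Array([0x89, 0x50, 0x4e, 0x47]);
console.log(toImageContent(sample, "image/png").data); // prints "iVBORw=="
```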

Tools

`image_analysis`

Validates and optionally resizes images
Returns base64-encoded image content
Supports local paths and URLs
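
The resize step can be pictured as a width cap that preserves aspect ratio. The sketch below computes target dimensions only; it assumes `maxWidth` is an upper bound and that smaller images are left untouched, which is comparable to sharp's `withoutEnlargement` resize option. `Dimensions` and `fitWidth` are hypothetical names.

```typescript
interface Dimensions {
  width: number;
  height: number;
}

// Scale down to `maxWidth` while keeping the aspect ratio.
// Images that are already narrow enough are returned unchanged (no upscaling).
function fitWidth(original: Dimensions, maxWidth: number): Dimensions {
  if (original.width <= maxWidth) return { ...original };
  const scale = maxWidth / original.width;
  return {
    width: maxWidth,
    height: Math.max(1, Math.round(original.height * scale)),
  };
}

// A 4032x3024 photo capped at 1024 pixels wide becomes 1024x768.
console.log(fitWidth({ width: 4032, height: 3024 }, 1024));
```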

`video_analysis`

Extracts frames using ffmpeg
Supports uniform and scene-based sampling
Returns multiple base64-encoded frame images
Includes timestamps for each frame
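
For scene-based sampling, ffmpeg's `select` filter exposes a `scene` score that can be thresholded. The sketch below builds an argument list for such a command. It is an assumption about how this could be done, not the package's actual invocation: the option names, the `sceneFrameArgs` helper, and the 0.4 threshold are hypothetical, and timestamp reporting is not covered.

```typescript
// Hypothetical option names; only the ffmpeg flags and filters are real.
interface SceneOptions {
  input: string;
  outputPattern: string; // for example "frame-%03d.jpg"
  maxFrames: number;
  width: number;
  threshold: number; // scene-change score in [0, 1]; 0.4 is an assumed starting point
}

function sceneFrameArgs(o: SceneOptions): string[] {
  // select keeps frames whose scene-change score exceeds the threshold;
  // scale resizes to the target width, and -2 keeps the aspect ratio
  // while rounding the height to an even number.
  const filter = `select='gt(scene,${o.threshold})',scale=${o.width}:-2`;
  return [
    "-i", o.input,
    "-vf", filter,
    "-fps_mode", "vfr", // emit only the selected frames (older ffmpeg: -vsync vfr)
    "-frames:v", String(o.maxFrames),
    o.outputPattern,
  ];
}

console.log(
  sceneFrameArgs({
    input: "./video.mp4",
    outputPattern: "frame-%03d.jpg",
    maxFrames: 12,
    width: 1024,
    threshold: 0.4,
  }).join(" "),
);
```

The array is meant to be passed to `spawn("ffmpeg", args)` without a shell, so the single quotes inside the filter string are interpreted by ffmpeg's filter parser rather than by the shell.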

License

MIT
