sight-mcp

v1.0.4

Published

9 months ago

MCP server for AI-powered image and video analysis - supports OpenAI, Claude, and multimodal vision APIs with local file and URL processing

Sight MCP - AI Vision Analysis Server

A powerful MCP server that brings AI vision capabilities to Claude Desktop. Analyze images and videos using OpenAI GPT-4o, Claude, or any compatible vision API.

中文版本: See README_CN.md for the full Chinese documentation.

Features

🖼️ Image Analysis: Analyze PNG, JPG, JPEG files (max 5MB)
🎥 Video Analysis: Analyze MP4, MOV, M4V files (max 8MB)
🌐 Remote URL Support: Process images and videos from HTTP/HTTPS URLs
🔧 Multi-API Support: Works with OpenAI, Anthropic, and compatible APIs
📁 Local File Processing: Secure file validation and automatic encoding
🛡️ Error Handling: Comprehensive error management and validation

Quick Start

1. Install the Server

Option A: npx (Recommended - No Installation)

# Just add to Claude Desktop config below - nothing to install!

Option B: Global Install

npm install -g sight-mcp
# Or: bun install -g sight-mcp

2. Configure Claude Desktop

Add this to your Claude Desktop config (~/Library/Application Support/Claude/claude_desktop_config.json):

{
  "mcpServers": {
    "sight-mcp": {
      "type": "stdio",
      "command": "npx",
      "args": ["-y", "sight-mcp"],
      "env": {
        "OPENAI_API_KEY": "your-openai-api-key",
        "API_URL": "https://api.openai.com/v1/chat/completions",
        "MODEL": "gpt-4o"
      }
    }
  }
}

3. Configure Your API Provider

Required Environment Variables:

OPENAI_API_KEY (required): Your API key
API_URL (required): API endpoint URL
MODEL (required): Model name

OpenAI GPT-4o

"env": {
  "OPENAI_API_KEY": "sk-your-openai-key",
  "API_URL": "https://api.openai.com/v1/chat/completions",
  "MODEL": "gpt-4o"
}

Anthropic Claude

"env": {
  "OPENAI_API_KEY": "sk-ant-your-claude-key",
  "API_URL": "https://api.anthropic.com/v1/messages",
  "MODEL": "claude-3-5-sonnet-20241022"
}

Zhipu AI GLM-4.5v

"env": {
  "OPENAI_API_KEY": "your-zhipu-api-key",
  "API_URL": "https://open.bigmodel.cn/api/paas/v4/chat/completions",
  "MODEL": "glm-4v"
}

Other Compatible APIs

Any API that follows the OpenAI vision format will work. Just update the API_URL and MODEL accordingly.

4. Restart Claude Desktop

After updating the config, restart Claude Desktop and you'll see the vision tools available!

Usage Examples

Once configured, you can use these tools in Claude:

Image Analysis

Analyze this image: /path/to/photo.jpg

What do you see in this screenshot? /Users/desktop/screen.png

Video Analysis

Analyze the video at https://example.com/demo.mp4 and describe what happens

What's in this video file? /path/to/recordings.mov

Using in Claude Code

When using Claude Code, add this MCP server with:

claude mcp add sight-mcp --env OPENAI_API_KEY=your_api_key --env API_URL=https://api.openai.com/v1/chat/completions --env MODEL=gpt-4o -- npx -y sight-mcp

Or for different providers:

# Anthropic Claude
claude mcp add sight-mcp --env OPENAI_API_KEY=your_claude_key --env API_URL=https://api.anthropic.com/v1/messages --env MODEL=claude-3-5-sonnet-20241022 -- npx -y sight-mcp

# Zhipu AI
claude mcp add sight-mcp --env OPENAI_API_KEY=your_zhipu_key --env API_URL=https://open.bigmodel.cn/api/paas/v4/chat/completions --env MODEL=glm-4v -- npx -y sight-mcp

After adding, the tools are available directly in Claude Code conversations as:

mcp__sight-mcp__analyze_image
mcp__sight-mcp__analyze_video

Supported File Formats

Images

Formats: PNG, JPG, JPEG
Max Size: 5MB
Source: Local files or HTTP/HTTPS URLs

Videos

Formats: MP4, MOV, M4V
Max Size: 8MB
Source: Local files or HTTP/HTTPS URLs

Available Tools

`analyze_image`

Analyzes images using AI vision models.

Parameters:

image (string): Local file path or remote URL to the image
prompt (string): What you want to know about the image

`analyze_video`

Analyzes videos using AI vision models.

Parameters:

video (string): Local file path or remote URL to the video
prompt (string): What you want to know about the video

Troubleshooting

Common Issues

"Server not found" error:

Make sure Claude Desktop is restarted after config changes
Check that your API key is valid and has credits
Verify the API_URL is correct for your provider

"File too large" error:

Images: Max 5MB
Videos: Max 8MB
Try compressing files or using URLs for larger files

"Unsupported format" error:

Images: Use PNG, JPG, or JPEG
Videos: Use MP4, MOV, or M4V

API authentication errors:

Double-check your OPENAI_API_KEY
Ensure the key has vision capabilities enabled
Verify the API_URL matches your provider

Debug Mode

Add "DEBUG": "true" to your environment variables to see detailed logs:

"env": {
  "OPENAI_API_KEY": "your-key",
  "API_URL": "https://api.openai.com/v1/chat/completions",
  "MODEL": "gpt-4o",
  "DEBUG": "true"
}

Contributing

Fork the repository
Create a feature branch (git checkout -b feature/amazing-feature)
Commit your changes (git commit -m 'Add amazing feature')
Push to the branch (git push origin feature/amazing-feature)
Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Support

📧 Security Issues: Please report via private GitHub issue