kolosal-vision-mcp
v1.0.0
MCP server for Kolosal Vision - AI-powered image analysis and OCR
Kolosal Vision MCP
An MCP (Model Context Protocol) server that provides AI-powered image analysis and OCR using the Kolosal Vision API. Seamlessly integrate vision capabilities into Claude Desktop, Cursor IDE, or any MCP-compatible client.
✨ Features
- 🖼️ Image Analysis - Analyze images with natural language queries
- 🔗 URL Support - Automatically downloads and processes images from URLs
- 📁 Local File Support - Directly analyze images from your filesystem
- 📝 Base64 Support - Accepts base64-encoded images
- 🎯 Structured Responses - Returns organized analysis with key observations
- 🔄 Multiple Formats - Supports JPEG, PNG, GIF, WebP, and BMP
📦 Installation
Using npx (Recommended)
No installation needed! Just configure your MCP client to use:
npx kolosal-vision-mcp
Global Installation
npm install -g kolosal-vision-mcp
Local Installation
npm install kolosal-vision-mcp
🔑 Configuration
Get Your API Key
- Visit Kolosal AI
- Sign up or log in to your account
- Generate an API key from your dashboard
Setup with Cursor IDE
Add this configuration to your Cursor MCP settings (~/.cursor/mcp.json):
{
  "mcpServers": {
    "kolosal-vision": {
      "command": "npx",
      "args": ["-y", "kolosal-vision-mcp"],
      "env": {
        "KOLOSAL_API_KEY": "your_api_key_here"
      }
    }
  }
}
Setup with Claude Desktop
Add this to your Claude Desktop config:
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json
{
  "mcpServers": {
    "kolosal-vision": {
      "command": "npx",
      "args": ["-y", "kolosal-vision-mcp"],
      "env": {
        "KOLOSAL_API_KEY": "your_api_key_here"
      }
    }
  }
}
Alternative: Using Global Installation
If you installed globally, replace the command configuration:
{
  "mcpServers": {
    "kolosal-vision": {
      "command": "kolosal-vision-mcp",
      "env": {
        "KOLOSAL_API_KEY": "your_api_key_here"
      }
    }
  }
}
🛠️ Tool: analyze_image
Parameters
| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| image | string | Yes | Image source: URL, local file path, or base64-encoded data |
| description | string | Yes | What to analyze (e.g., "Describe this image", "Extract text") |
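For clients that speak MCP directly rather than through an IDE, a call to this tool is a JSON-RPC `tools/call` request. The following is a minimal sketch, assuming the standard MCP request shape; the `buildAnalyzeImageRequest` helper and the `id` value are illustrative and not part of this package.

```typescript
// Arguments accepted by analyze_image, per the parameter table above.
interface AnalyzeImageArgs {
  image: string;       // URL, local file path, or base64-encoded data
  description: string; // what to analyze
}

// Hypothetical helper: wraps the arguments in an MCP "tools/call" JSON-RPC request.
function buildAnalyzeImageRequest(id: number, args: AnalyzeImageArgs) {
  // Both parameters are required, so reject empty values early.
  if (args.image.trim() === "" || args.description.trim() === "") {
    throw new Error("Both 'image' and 'description' are required");
  }
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "analyze_image", arguments: args },
  };
}

const request = buildAnalyzeImageRequest(1, {
  image: "https://example.com/image.jpg",
  description: "Extract text",
});
console.log(JSON.stringify(request, null, 2));
```

Most users never build this request by hand, since Cursor and Claude Desktop construct it from the prompt.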
Supported Image Sources
- URLs - https://example.com/image.jpg
- Local files - /path/to/image.png or ./relative/path.jpg
- Base64 - Raw base64-encoded image data
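Because a single `image` string can carry any of these three sources, something has to decide which one it is. The sketch below shows one possible heuristic; `classifyImageSource` is a hypothetical function, and the package's actual detection logic may differ.

```typescript
type ImageSource = "url" | "base64" | "file";

// Hypothetical classifier: the package's real detection logic may differ.
function classifyImageSource(image: string): ImageSource {
  const value = image.trim();
  // http(s) URLs are downloaded before analysis.
  if (/^https?:\/\//i.test(value)) return "url";
  // Anything that looks like a path, or ends in a known image extension, is a local file.
  if (/^(\.{0,2}\/|~\/|[A-Za-z]:\\)/.test(value) || /\.(jpe?g|png|gif|webp|bmp)$/i.test(value)) {
    return "file";
  }
  // Raw base64: only base64 alphabet characters, optional padding, length divisible by 4.
  if (value.length > 0 && value.length % 4 === 0 && /^[A-Za-z0-9+/]+={0,2}$/.test(value)) {
    return "base64";
  }
  // Fall back to a file path for everything else (for example, "images/cat").
  return "file";
}

console.log(classifyImageSource("https://example.com/image.jpg"));
console.log(classifyImageSource("./relative/path.jpg"));
```

One caveat with any string-only heuristic: base64-encoded JPEG data begins with `/9j/`, which looks like an absolute path. This sketch treats such input as a file path, so a real implementation would likely also need to check whether the file exists.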
Supported Formats
- JPEG / JPG
- PNG
- GIF
- WebP
- BMP
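To check locally whether a file is really in one of these formats, independent of its extension, you can inspect its leading bytes. This is a sketch using the well-known file signatures for each format; `detectImageFormat` and `hasBytes` are hypothetical helpers, and whether the server itself inspects file contents or relies on extensions is not stated here.

```typescript
type ImageFormat = "jpeg" | "png" | "gif" | "webp" | "bmp";

// True when `bytes` contains `signature` starting at `offset`.
function hasBytes(bytes: Uint8Array, signature: number[], offset = 0): boolean {
  if (bytes.length < offset + signature.length) return false;
  return signature.every((b, i) => bytes[offset + i] === b);
}

// Hypothetical helper: identifies a supported format from its leading bytes.
function detectImageFormat(bytes: Uint8Array): ImageFormat | null {
  if (hasBytes(bytes, [0xff, 0xd8, 0xff])) return "jpeg";
  if (hasBytes(bytes, [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a])) return "png";
  if (hasBytes(bytes, [0x47, 0x49, 0x46, 0x38])) return "gif"; // "GIF8"
  // WebP is a RIFF container with "WEBP" at byte offset 8.
  if (hasBytes(bytes, [0x52, 0x49, 0x46, 0x46]) && hasBytes(bytes, [0x57, 0x45, 0x42, 0x50], 8)) {
    return "webp";
  }
  if (hasBytes(bytes, [0x42, 0x4d])) return "bmp"; // "BM"
  return null; // unsupported, for example a PDF ("%PDF")
}

// First four bytes of a typical JPEG file.
console.log(detectImageFormat(new Uint8Array([0xff, 0xd8, 0xff, 0xe0])));
```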
💡 Usage Examples
In Cursor IDE
Simply reference an image file and ask questions:
Analyze @./photos/product.jpg and describe what you see
What text is visible in @./screenshots/document.png?
Example Prompts
- "What objects are in this image?"
- "Describe the scene in detail"
- "Extract any visible text (OCR)"
- "What is the main subject?"
- "Describe the colors and composition"
- "Are there any people? What are they doing?"
- "What brand logos are visible?"
- "Is this image appropriate for a professional website?"
Response Format
The tool returns structured responses:
## Image Analysis
[Detailed analysis based on your query]
## Details
1. [Key observation 1]
2. [Key observation 2]
3. [Key observation 3]
...
🔧 Development
Prerequisites
- Node.js 18+
- npm or yarn
Setup
# Clone the repository
git clone https://github.com/madebyaris/kolosal-vision-mcp.git
cd kolosal-vision-mcp
# Install dependencies
npm install
# Build
npm run build
# Run in development mode (watch)
npm run dev
Project Structure
kolosal-vision-mcp/
├── src/
│ └── index.ts # Main MCP server implementation
├── dist/ # Compiled JavaScript
├── package.json
├── tsconfig.json
└── README.md
🐛 Troubleshooting
"KOLOSAL_API_KEY environment variable is not set"
Make sure you've added your API key to the MCP configuration's env section.
"Invalid image format"
Ensure your image is in a supported format (JPEG, PNG, GIF, WebP, or BMP). PDF files are not currently supported.
"Failed to download image"
Check that the URL is publicly accessible and returns a valid image. Some URLs require authentication or otherwise reject automated requests.
MCP Server Not Loading
- Restart your IDE/client after configuration changes
- Check the MCP configuration JSON syntax
- Verify the API key is correct
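The JSON syntax and API key checks above can be automated. The sketch below is one way to do that, assuming the server entry is named `kolosal-vision` as in the configuration examples; `checkKolosalConfig` is a hypothetical helper, not something the package provides.

```typescript
// Hypothetical checker for an MCP client config such as ~/.cursor/mcp.json.
// Returns a list of problems; an empty list means the basic checks passed.
function checkKolosalConfig(jsonText: string): string[] {
  const problems: string[] = [];
  let config: any;
  try {
    config = JSON.parse(jsonText);
  } catch (err) {
    return [`Invalid JSON: ${(err as Error).message}`];
  }
  // Assumes the entry is named "kolosal-vision", as in the examples above.
  const server = config?.mcpServers?.["kolosal-vision"];
  if (!server) {
    problems.push('Missing "mcpServers" entry named "kolosal-vision"');
    return problems;
  }
  if (typeof server.command !== "string" || server.command === "") {
    problems.push('"command" must be a non-empty string');
  }
  const key = server.env?.KOLOSAL_API_KEY;
  if (typeof key !== "string" || key === "" || key === "your_api_key_here") {
    problems.push("KOLOSAL_API_KEY is missing or still set to the placeholder");
  }
  return problems;
}

// A trailing comma is a common cause of a config that silently fails to load.
console.log(checkKolosalConfig('{ "mcpServers": { }, }'));
```

This only checks the shape of the file. It cannot tell whether the key is actually valid for the Kolosal API.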
📄 License
MIT © Aris Setiawan
🔗 Links
- Kolosal AI - Get your API key
- MCP Documentation - Learn more about MCP
- GitHub Repository
- npm Package
🤝 Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
- Fork the repository
- Create your feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add some amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
