imagen-mcp

v1.0.0

Published

7 months ago

MCP server for Google AI Studio (Gemini) image generation with support for text-to-image, image editing, and multi-turn conversational image creation

marcxavier

mcp mcp-server model-context-protocol gemini imagen image-generation ai google-ai claude text-to-image image-editing dall-e-alternative

Imagen MCP Server

A production-grade Model Context Protocol (MCP) server that brings Google's Gemini AI image generation capabilities to Claude and other MCP-compatible AI assistants.

Ask Claude to generate, edit, or compose images in conversation; the server makes the Gemini API calls and saves the results locally.

✨ Features

Text-to-Image Generation: Create stunning images from text descriptions
Image Editing: Modify existing images with natural language instructions
Multi-Image Composition: Combine and blend multiple images together
Multi-Turn Conversations: Iteratively refine images through back-and-forth dialogue
Production-Ready: Comprehensive error handling, validation, and logging
Minimal Configuration: Only an API key is required; every other setting has a sensible default

🎯 What Can It Do?

Generate Images

"Can you create a photorealistic image of a sunset over mountains?"

Edit Images

"I have an image at /path/to/photo.jpg - can you add a wizard hat to the cat?"

Compose Multiple Images

"Combine these two images: put the dress from image1.jpg onto the model in image2.jpg"

Iterative Refinement

"Generate a blue sports car"
"Now make it a convertible"
"Change the color to red"

All of this happens naturally through conversation with Claude - no need to know about tools or parameters!

📋 Prerequisites

Node.js 18 or higher
Google AI Studio API Key (free tier available)

🚀 Quick Start

1. Get Your API Key

Get a free API key from Google AI Studio: 👉 https://aistudio.google.com/app/apikey

2. Install via Claude Desktop

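Add this server to your Claude configuration. With the Claude Code CLI:

# Using npx (recommended - always gets latest version)
# Options go before the "--" separator; everything after it is the server command
claude mcp add imagen-mcp --env GEMINI_API_KEY=your_api_key_here -- npx -y imagen-mcp@latest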

Or manually edit your Claude Desktop config file:

macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json

{
  "mcpServers": {
    "imagen-mcp": {
      "command": "npx",
      "args": ["-y", "imagen-mcp@latest"],
      "env": {
        "GEMINI_API_KEY": "your_api_key_here"
      }
    }
  }
}

3. Restart Claude Desktop

Close and reopen Claude Desktop for the changes to take effect.

4. Start Creating!

Just talk to Claude naturally:

You: "Create a minimalist logo for a coffee shop called 'Morning Brew'"

You: "Generate a photo of a cat wearing sunglasses on a beach"

You: "I have a photo at ~/Desktop/photo.jpg - can you remove the background?"

🎨 Usage Examples

Basic Image Generation

You: Can you create an illustration of a futuristic city at night?

Claude will automatically use the generate_image tool and return the image.

Image Editing

You: I have an image at /path/to/room.jpg - can you change the blue sofa to a brown leather one?

Claude will use the edit_image tool, load your image, and return the edited version.

Multi-Image Composition

You: I have two images:
- /path/to/dress.jpg
- /path/to/model.jpg

Can you put the dress on the model?

Claude will use the compose_images tool to blend them together.

Conversational Editing

You: Generate a landscape with mountains and a lake

Claude: [generates image and shows path]

You: Can you add a sunset to this image?

Claude: [edits the previous image with sunset]

You: Perfect! Now make it look more dramatic

Claude automatically maintains the conversation context using the continue_editing tool.

⚙️ Configuration

Only GEMINI_API_KEY is required. Every other setting is optional and falls back to a sensible default.

Environment Variables

You can customize behavior with these environment variables:

# Required
GEMINI_API_KEY=your_api_key_here

# Optional
OUTPUT_DIR=./images                    # Where to save generated images
MAX_IMAGE_SIZE_MB=20                   # Max input image size
CONVERSATION_TIMEOUT_MIN=30            # How long conversations stay active
LOG_LEVEL=info                         # Logging verbosity (debug|info|warn|error)
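As an illustration of how these variables and their defaults fit together, here is a minimal TypeScript sketch of a config loader. It is not the server's actual code: the names loadConfig, ServerConfig, and parsePositiveNumber are hypothetical, and falling back to the default on an invalid value is an assumption. Only the variable names, default values, and the error message come from this README.

```typescript
// Hypothetical config loader; the real server's internals may differ.
type LogLevel = "debug" | "info" | "warn" | "error";

interface ServerConfig {
  apiKey: string;
  outputDir: string;
  maxImageSizeMb: number;
  conversationTimeoutMin: number;
  logLevel: LogLevel;
}

const LOG_LEVELS: readonly LogLevel[] = ["debug", "info", "warn", "error"];

// Parse a positive number, falling back to the default when the value
// is missing or invalid (an assumed policy, not documented behavior).
function parsePositiveNumber(raw: string | undefined, fallback: number): number {
  if (raw === undefined || raw.trim() === "") return fallback;
  const value = Number(raw);
  return Number.isFinite(value) && value > 0 ? value : fallback;
}

function loadConfig(env: Record<string, string | undefined>): ServerConfig {
  const apiKey = env.GEMINI_API_KEY?.trim();
  if (!apiKey) {
    // Same message as the one listed under Troubleshooting.
    throw new Error("Configuration errors: GEMINI_API_KEY is required");
  }
  const level = env.LOG_LEVEL as LogLevel | undefined;
  return {
    apiKey,
    outputDir: env.OUTPUT_DIR ?? "./images",
    maxImageSizeMb: parsePositiveNumber(env.MAX_IMAGE_SIZE_MB, 20),
    conversationTimeoutMin: parsePositiveNumber(env.CONVERSATION_TIMEOUT_MIN, 30),
    logLevel: level !== undefined && LOG_LEVELS.includes(level) ? level : "info",
  };
}
```

In a Node.js process this would typically be called as loadConfig(process.env).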

Adding Environment Variables to Claude

Update your config with additional environment variables:

{
  "mcpServers": {
    "imagen-mcp": {
      "command": "npx",
      "args": ["-y", "imagen-mcp@latest"],
      "env": {
        "GEMINI_API_KEY": "your_api_key_here",
        "OUTPUT_DIR": "/Users/you/Pictures/ai-generated",
        "LOG_LEVEL": "debug"
      }
    }
  }
}

🛠️ Advanced Usage

Specifying Aspect Ratios

You can request specific aspect ratios:

You: Generate a 16:9 landscape image of a mountain range

Available aspect ratios:

1:1 (square, default)
16:9 (widescreen)
9:16 (portrait/mobile)
4:3, 3:4 (standard photo)
21:9 (ultrawide)
2:3, 3:2, 4:5, 5:4 (various photo ratios)
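The list above could be checked in code along these lines. This is a sketch only: the function names and the whitespace-tolerant parsing are assumptions, and the server may validate ratios differently. The ratio list and the 1:1 default are taken from this section.

```typescript
// Supported ratios as listed in this README.
const SUPPORTED_ASPECT_RATIOS = [
  "1:1", "16:9", "9:16", "4:3", "3:4", "21:9", "2:3", "3:2", "4:5", "5:4",
] as const;

type AspectRatio = (typeof SUPPORTED_ASPECT_RATIOS)[number];

// Type guard: narrows an arbitrary string to a supported ratio.
function isAspectRatio(value: string): value is AspectRatio {
  return (SUPPORTED_ASPECT_RATIOS as readonly string[]).includes(value);
}

// Hypothetical helper: returns the default when nothing was requested,
// and rejects anything outside the supported list.
function resolveAspectRatio(requested?: string): AspectRatio {
  if (requested === undefined) return "1:1";
  const normalized = requested.replace(/\s+/g, "");
  if (!isAspectRatio(normalized)) {
    throw new Error(
      `Unsupported aspect ratio "${requested}". Supported: ${SUPPORTED_ASPECT_RATIOS.join(", ")}`
    );
  }
  return normalized;
}
```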

Working with Large Images

For images larger than 20MB, consider:

Resizing them first
Increasing MAX_IMAGE_SIZE_MB (up to 100MB is supported)
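A size check consistent with the limits above might look like the following sketch. The function name, the 1 MB = 1024 * 1024 bytes convention, and clamping the configured limit to 100MB are assumptions; only the 20MB default, the 100MB ceiling, and the "Image file too large" wording come from this README.

```typescript
const MAX_SUPPORTED_LIMIT_MB = 100; // upper bound stated in this README
const BYTES_PER_MB = 1024 * 1024;   // assumed definition of a megabyte

// Hypothetical check: throws when the input exceeds the effective limit.
function checkImageSize(sizeBytes: number, limitMb: number = 20): void {
  const effectiveLimitMb = Math.min(limitMb, MAX_SUPPORTED_LIMIT_MB);
  if (sizeBytes > effectiveLimitMb * BYTES_PER_MB) {
    const actualMb = (sizeBytes / BYTES_PER_MB).toFixed(1);
    throw new Error(
      `Image file too large: ${actualMb}MB exceeds the ${effectiveLimitMb}MB limit`
    );
  }
}
```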

Managing Conversations

Conversations automatically expire after 30 minutes of inactivity (configurable via CONVERSATION_TIMEOUT_MIN). To start fresh:

You: Let's start a new image generation - create a photo of a dog

Claude will automatically handle conversation management.
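To make the expiry behavior concrete, here is a minimal sketch of an in-memory store with an inactivity timeout. The class and field names are hypothetical and the real server may track more state; the injectable clock is there only so the behavior can be tested without waiting.

```typescript
interface Conversation {
  id: string;
  lastImagePath: string;
  lastActiveMs: number;
}

// Hypothetical in-memory store; entries expire after a period of inactivity.
class ConversationStore {
  private readonly conversations = new Map<string, Conversation>();
  private readonly timeoutMs: number;
  private readonly now: () => number;

  constructor(timeoutMin: number = 30, now: () => number = Date.now) {
    this.timeoutMs = timeoutMin * 60_000;
    this.now = now;
  }

  // Create or update a conversation and mark it as active now.
  touch(id: string, lastImagePath: string): void {
    this.conversations.set(id, { id, lastImagePath, lastActiveMs: this.now() });
  }

  // Look up a conversation; each successful access resets the inactivity timer.
  get(id: string): Conversation {
    const conversation = this.conversations.get(id);
    if (!conversation || this.now() - conversation.lastActiveMs > this.timeoutMs) {
      this.conversations.delete(id);
      throw new Error("Conversation not found or expired");
    }
    conversation.lastActiveMs = this.now();
    return conversation;
  }
}
```

Because the store lives in memory, conversations would also be lost whenever the server process restarts.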

📁 File Locations

Generated images are saved to:

Default: ./images/ (relative to where the server runs)
Custom: Set via OUTPUT_DIR environment variable

Image files are named with timestamps for easy organization:

generated_2025-01-14_10-30-45.png
edited_2025-01-14_10-31-22.png
composed_2025-01-14_10-32-10.png
conv_xyz123_2025-01-14_10-33-05.png
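The naming pattern shown above can be reproduced with a small helper such as the one below. This is a sketch: the function names are hypothetical, and formatting the timestamp in UTC is an assumption, since this README does not say which time zone the server uses.

```typescript
type ImageKind = "generated" | "edited" | "composed";

function pad2(n: number): string {
  return String(n).padStart(2, "0");
}

// Formats a date as YYYY-MM-DD_HH-MM-SS (UTC is an assumption).
function formatTimestamp(date: Date): string {
  const day = [date.getUTCFullYear(), pad2(date.getUTCMonth() + 1), pad2(date.getUTCDate())].join("-");
  const time = [pad2(date.getUTCHours()), pad2(date.getUTCMinutes()), pad2(date.getUTCSeconds())].join("-");
  return `${day}_${time}`;
}

function imageFileName(kind: ImageKind, date: Date = new Date()): string {
  return `${kind}_${formatTimestamp(date)}.png`;
}

function conversationFileName(conversationId: string, date: Date = new Date()): string {
  return `conv_${conversationId}_${formatTimestamp(date)}.png`;
}
```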

🔧 Troubleshooting

"Configuration errors: GEMINI_API_KEY is required"

Solution: Make sure you've added your API key to the environment configuration. Get one from https://aistudio.google.com/app/apikey

"Rate limit exceeded"

Solution: You've hit the free tier limit. Wait a few minutes or upgrade your API quota at https://aistudio.google.com/

"Image file too large"

Solution: Your input image exceeds the size limit (default 20MB). Either:

Use a smaller image
Set MAX_IMAGE_SIZE_MB higher in your environment config

"Conversation not found or expired"

Solution: Conversations expire after 30 minutes of inactivity by default (see CONVERSATION_TIMEOUT_MIN). Start a new one by making a fresh request without referencing previous edits.

Server not showing in Claude

Solution:

Check that your config file syntax is valid JSON
Make sure you restarted Claude Desktop after making changes
Check Claude's developer console for error messages
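The first check can be automated. The sketch below is a hypothetical helper, not part of this package: it parses the config text and reports the most common problems with the imagen-mcp entry. The expected shape (mcpServers, command, env.GEMINI_API_KEY) follows the JSON example in Quick Start.

```typescript
// Hypothetical validator: returns a list of problems, empty if none were found.
function checkClaudeConfig(jsonText: string, serverName: string = "imagen-mcp"): string[] {
  let parsed: unknown;
  try {
    parsed = JSON.parse(jsonText);
  } catch (err) {
    return [`Config is not valid JSON: ${(err as Error).message}`];
  }

  const servers = (parsed as { mcpServers?: Record<string, unknown> } | null)?.mcpServers;
  const entry = servers?.[serverName] as
    | { command?: unknown; env?: Record<string, unknown> }
    | undefined;
  if (!entry) {
    return [`No "${serverName}" entry under "mcpServers"`];
  }

  const problems: string[] = [];
  if (typeof entry.command !== "string") {
    problems.push(`"${serverName}.command" must be a string such as "npx"`);
  }
  const key = entry.env?.GEMINI_API_KEY;
  if (typeof key !== "string" || key === "" || key === "your_api_key_here") {
    problems.push("GEMINI_API_KEY is missing or still set to the placeholder");
  }
  return problems;
}
```

A trailing comma after the last property is a frequent cause of the "not valid JSON" result, because JSON does not allow it.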

🏗️ Development

Want to contribute or run locally?

# Clone the repository
git clone https://github.com/yourusername/imagen-mcp.git
cd imagen-mcp

# Install dependencies
npm install

# Copy environment template
cp .env.example .env

# Add your API key to .env
echo "GEMINI_API_KEY=your_key_here" >> .env

# Build
npm run build

# Run in development mode
npm run dev

📚 Technical Details

Available Tools (for LLM)

The server exposes four tools that Claude can use:

generate_image: Create images from text prompts
edit_image: Modify existing images with instructions
compose_images: Combine 2-3 images together
continue_editing: Multi-turn conversational editing

Users never interact with these directly - Claude handles tool selection automatically.

Architecture

MCP SDK: Official TypeScript SDK for Model Context Protocol
Gemini API: Google's gemini-2.5-flash-image model
Transport: stdio (standard input/output) for Claude Desktop
State Management: In-memory conversation store with auto-expiry
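For readers curious about what travels over the stdio transport: MCP messages are JSON-RPC 2.0 objects, written one per line. The sketch below builds a tools/call request of the kind a client would send to invoke generate_image. The argument names prompt and aspect_ratio are assumptions for illustration; this README does not document the tools' input schemas.

```typescript
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Build a JSON-RPC request that asks the server to run a tool.
function buildToolCall(id: number, name: string, args: Record<string, unknown>): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// stdio framing: one JSON message per line, terminated by a newline.
// JSON.stringify escapes newlines inside strings, so the frame stays on one line.
function frameMessage(message: ToolCallRequest): string {
  return JSON.stringify(message) + "\n";
}

// Argument names below are hypothetical.
const exampleFrame = frameMessage(
  buildToolCall(1, "generate_image", {
    prompt: "a futuristic city at night",
    aspect_ratio: "16:9",
  })
);
```

In normal use Claude Desktop produces these messages itself, so users never need to write them by hand.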

🔒 Security & Privacy

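API Key: Stored locally in your Claude config and sent only to Google's API
Images: Generated files are saved locally on your machine; prompts and input images are sent to Google's Gemini API for processing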
No Tracking: No analytics or telemetry
Rate Limiting: Enforced by Google's API (not by this server)

📄 License

MIT License - see LICENSE file for details

🙏 Acknowledgments

Built with Model Context Protocol by Anthropic
Powered by Google Gemini AI
Inspired by the amazing MCP community

Made with ❤️ for the MCP community