text2image-mcp

v2.4.0

Published

6 days ago

MCP server for Google's Gemini image generation API (Flash + Pro) with cross-platform support


kishor-kukreja

mcp text2image gemini image-generation ai google-ai claude-code cursor image-editing text-to-image

Nano-Banana MCP Server 🍌

🤖 This project was entirely generated by Claude Code - an AI coding assistant that can create complete, production-ready applications from scratch.

A Model Context Protocol (MCP) server that provides AI image generation and editing capabilities using Google's Gemini API. Supports both Gemini 2.5 Flash (fast, efficient) and Gemini 3 Pro (high quality, 4K, search grounding). Generate stunning images, edit existing ones, and iterate on your creations with simple text prompts.

✨ Features

🎨 Generate Images: Create new images from text descriptions
✏️ Edit Images: Modify existing images with text prompts
🔄 Multi-Turn Editing: Continue editing with full conversation context via chat sessions
🖼️ Multiple Reference Images: Up to 3 (Flash) or 14 (Pro) reference images for style transfer and guidance
🤖 Dual Model Support: Choose between Flash (speed) and Pro (quality) per request
📐 Aspect Ratios: 10 aspect ratio options from 1:1 to 21:9
🔍 Resolution Control: Generate at 1K, 2K, or 4K resolution (Pro model)
🌐 Google Search Grounding: Generate images from real-time data — weather, news, charts (Pro model)
🌍 Cross-Platform: Smart file paths for Windows, macOS, and Linux
🔧 Easy Setup: Simple configuration with API key
📁 Auto File Management: Automatic image saving with organized naming

🤖 Models

Nano-Banana supports two Gemini models:

| Model | ID | Best For |
|---|---|---|
| Flash (default) | gemini-2.5-flash-image | Fast generation, high-volume tasks, free-tier friendly |
| Pro | gemini-3-pro-image-preview | High quality, 4K resolution, search grounding, complex prompts |

Auto-upgrade: If you request Pro-only features (2K/4K resolution, Google Search grounding) while using Flash, the server automatically upgrades to Pro for that request.
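The auto-upgrade rule can be pictured with a small sketch. This is not the server's actual code: `resolveModel` and `GenerateRequest` are hypothetical names, and only the rule itself (2K/4K resolution or search grounding forces Pro) and the model IDs come from this README.

```typescript
type Model = "flash" | "pro";
type Resolution = "1K" | "2K" | "4K";

// Hypothetical request shape, limited to the fields that affect model choice.
interface GenerateRequest {
  model?: Model;
  resolution?: Resolution;
  useGoogleSearch?: boolean;
}

// Model IDs as listed in the Models table.
const MODEL_IDS: Record<Model, string> = {
  flash: "gemini-2.5-flash-image",
  pro: "gemini-3-pro-image-preview",
};

// Returns the model used for one request: Flash by default,
// Pro when explicitly requested or when a Pro-only feature is present.
function resolveModel(req: GenerateRequest): Model {
  const requested: Model = req.model ?? "flash";
  const needsPro =
    req.resolution === "2K" ||
    req.resolution === "4K" ||
    req.useGoogleSearch === true;
  return needsPro ? "pro" : requested;
}
```

Under these assumptions, a request with `resolution: "4K"` and no `model` resolves to `"pro"`, while a plain prompt stays on `"flash"`.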

🔑 Setup

1. Get your Gemini API key:
   - Visit Google AI Studio
   - Create a new API key
   - Copy it for configuration
2. Configure the MCP server: See the configuration examples below for your specific client (Claude Code, Cursor, or other MCP clients).

💻 Usage with Claude Code

Configuration:

Add this to your Claude Code MCP settings:

Option A: With environment variable (Recommended - Most Secure)

{
  "mcpServers": {
    "nano-banana": {
      "command": "npx",
      "args": ["text2image-mcp"],
      "env": {
        "GEMINI_API_KEY": "your-gemini-api-key-here"
      }
    }
  }
}

Option B: Without environment variable (supply the key through a system environment variable or the configure_gemini_token tool)

{
  "mcpServers": {
    "nano-banana": {
      "command": "npx",
      "args": ["text2image-mcp"]
    }
  }
}

Usage Examples:

Generate an image of a sunset over mountains

Generate a 4K image of a product mockup using the pro model

Edit this image to add some birds in the sky with a 16:9 aspect ratio

Continue editing to make it more dramatic

🎯 Usage with Cursor

Configuration:

Add to your Cursor MCP configuration:

Option A: With environment variable (Recommended)

{
  "nano-banana": {
    "command": "npx",
    "args": ["text2image-mcp"],
    "env": {
      "GEMINI_API_KEY": "your-gemini-api-key-here"
    }
  }
}

Option B: Without environment variable (supply the key through a system environment variable or the configure_gemini_token tool)

{
  "nano-banana": {
    "command": "npx",
    "args": ["text2image-mcp"]
  }
}

Usage Examples:

Ask Cursor to generate images for your app
Create mockups and prototypes
Generate assets for your projects

🔧 For Other MCP Clients

If you're using a different MCP client, you can configure text2image-mcp using any of these methods:

Configuration Methods

Method A: Environment Variable in MCP Config (Recommended)

{
  "nano-banana": {
    "command": "npx",
    "args": ["text2image-mcp"],
    "env": {
      "GEMINI_API_KEY": "your-gemini-api-key-here"
    }
  }
}

Method B: System Environment Variable

export GEMINI_API_KEY="your-gemini-api-key-here"
npx text2image-mcp

Method C: Using the Configure Tool

npx text2image-mcp
# The server will prompt you to configure when first used
# This creates a local .nano-banana-config.json file

🌐 Remote Server Mode

Run text2image-mcp as an HTTP server so anyone can connect with their own Gemini API key — no local install needed.

Quick Start

# Install and build
npm install && npm run build

# Start the remote server (default port 3000)
npm run start:remote

# Or with a custom port
PORT=8080 npm run start:remote

Docker Deployment

# Build the image
docker build -t text2image-mcp .

# Run the container
docker run -p 3000:3000 text2image-mcp

Deploy to any container platform (Railway, Render, Fly.io, etc.) by pointing to the Dockerfile.

Client Configuration

Connect any MCP client using a URL and your Gemini API key as a Bearer token:

{
  "mcpServers": {
    "text2image": {
      "url": "https://your-server.com/mcp",
      "headers": {
        "Authorization": "Bearer YOUR_GEMINI_API_KEY"
      }
    }
  }
}

Endpoints

| Method | Path | Description |
|--------|------|-------------|
| POST | /mcp | MCP JSON-RPC (initialize + tool calls) |
| GET | /mcp | SSE stream for existing sessions |
| DELETE | /mcp | Terminate a session |
| GET | /health | Health check + active session count |
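For clients that talk to the endpoints directly, the first request is a JSON-RPC `initialize` sent to `POST /mcp`. The sketch below only builds that request. `buildInitializeRequest` and `HttpRequestSpec` are hypothetical names, and the `Accept` header, `protocolVersion` value, and `clientInfo` fields are assumptions based on the MCP Streamable HTTP transport, not values confirmed by this server.

```typescript
// Plain description of an HTTP request, so the sketch needs no network access.
interface HttpRequestSpec {
  method: "GET" | "POST" | "DELETE";
  url: string;
  headers: Record<string, string>;
  body?: string;
}

// Builds the initial JSON-RPC "initialize" request for POST /mcp.
function buildInitializeRequest(baseUrl: string, apiKey: string): HttpRequestSpec {
  return {
    method: "POST",
    // Strip trailing slashes so the path is always exactly "/mcp".
    url: `${baseUrl.replace(/\/+$/, "")}/mcp`,
    headers: {
      "Content-Type": "application/json",
      // Assumption: the server may answer with JSON or an SSE stream.
      Accept: "application/json, text/event-stream",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify({
      jsonrpc: "2.0",
      id: 1,
      method: "initialize",
      params: {
        protocolVersion: "2025-03-26", // assumed protocol revision
        capabilities: {},
        clientInfo: { name: "example-client", version: "0.0.1" }, // placeholder values
      },
    }),
  };
}
```

The result can be passed to `fetch(spec.url, { method: spec.method, headers: spec.headers, body: spec.body })`. Most MCP clients do this for you once the URL and header are set in their configuration.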

How It Works

Auth: Your Gemini API key is passed as Authorization: Bearer <key> — the server never stores it
Sessions: Each connection gets an isolated server instance with independent state (chat history, last image, etc.)
Timeout: Idle sessions are cleaned up after 30 minutes
Tools: Remote mode exposes 4 tools (generate_image, edit_image, continue_editing, get_last_image_info) — config tools are excluded since the API key comes from the auth header
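The auth and timeout behaviour described above can be sketched as follows. This is an illustration under stated assumptions, not the server's implementation: `extractBearerToken` and `SessionStore` are hypothetical names, and only the `Authorization: Bearer <key>` format and the 30-minute idle timeout come from this README.

```typescript
// Idle timeout from the README: 30 minutes, in milliseconds.
const IDLE_TIMEOUT_MS = 30 * 60 * 1000;

// Pulls the key out of an "Authorization: Bearer <key>" header value.
// Returns null when the header is missing or uses another scheme.
function extractBearerToken(header: string | undefined): string | null {
  if (!header) return null;
  const match = /^Bearer\s+(\S+)$/i.exec(header.trim());
  return match?.[1] ?? null;
}

// Hypothetical in-memory session registry keyed by session ID.
// It tracks only the last activity time; the API key is never stored here.
class SessionStore {
  private lastSeen = new Map<string, number>();

  // Records activity for a session (creates it on first use).
  touch(id: string, now: number): void {
    this.lastSeen.set(id, now);
  }

  // Removes sessions idle for at least the timeout and returns their IDs.
  sweep(now: number): string[] {
    const removed: string[] = [];
    this.lastSeen.forEach((seenAt, id) => {
      if (now - seenAt >= IDLE_TIMEOUT_MS) {
        this.lastSeen.delete(id);
        removed.push(id);
      }
    });
    return removed;
  }

  get size(): number {
    return this.lastSeen.size;
  }
}
```

Passing `now` explicitly keeps the sketch deterministic. A real server would more likely call `Date.now()` and run the sweep on a timer.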

🛠️ Available Commands

`generate_image`

Create a new image from a text prompt. Supports model selection, aspect ratios, resolution, and search grounding.

generate_image({
  prompt: "A futuristic city at night with neon lights",
  model?: "flash" | "pro",           // default: "flash"
  aspectRatio?: "1:1" | "2:3" | "3:2" | "3:4" | "4:3" | "4:5" | "5:4" | "9:16" | "16:9" | "21:9",
  resolution?: "1K" | "2K" | "4K",   // Pro model only
  useGoogleSearch?: true              // Pro model only — real-time data
})

`edit_image`

Edit a specific image file with optional reference images, model selection, and output control.

edit_image({
  imagePath: "/path/to/image.png",
  prompt: "Add a rainbow in the sky",
  referenceImages?: ["/path/to/reference.jpg"],  // up to 3 (Flash) or 14 (Pro)
  model?: "flash" | "pro",
  aspectRatio?: "16:9",
  resolution?: "2K"                              // Pro model only
})
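The reference image limit depends on the model. A minimal sketch of such a check is shown below, assuming a hypothetical `checkReferenceImages` helper. The limits (3 for Flash, 14 for Pro) come from this README; how the server actually reports a violation is not documented here.

```typescript
type Model = "flash" | "pro";

// Per-model limits as stated in the Features list.
const MAX_REFERENCE_IMAGES: Record<Model, number> = { flash: 3, pro: 14 };

// Returns an error message when too many reference images are supplied,
// or null when the request is within the limit.
function checkReferenceImages(model: Model, referenceImages: string[] = []): string | null {
  const limit = MAX_REFERENCE_IMAGES[model];
  if (referenceImages.length > limit) {
    return `Model "${model}" accepts at most ${limit} reference images, got ${referenceImages.length}`;
  }
  return null;
}
```

With these limits, four reference images are rejected on `"flash"` but accepted on `"pro"`.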

`continue_editing`

Continue editing the last generated/edited image with multi-turn conversation context.

continue_editing({
  prompt: "Make it more colorful",
  referenceImages?: ["/path/to/style.jpg"],
  aspectRatio?: "16:9",
  resolution?: "2K"
})

`get_last_image_info`

Get information about the last generated image.

get_last_image_info()

`configure_gemini_token`

Configure your Gemini API key.

configure_gemini_token({
  apiKey: "your-gemini-api-key"
})

`get_configuration_status`

Check if the API key is configured.

get_configuration_status()

📐 Supported Aspect Ratios

| Aspect Ratio | Use Case |
|---|---|
| 1:1 | Square: social media posts, profile pictures |
| 2:3 / 3:2 | Portrait / landscape photos |
| 3:4 / 4:3 | Classic photo format |
| 4:5 / 5:4 | Instagram portrait / landscape |
| 9:16 / 16:9 | Vertical / horizontal video, stories |
| 21:9 | Ultra-wide, cinematic |
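When building prompts or UI around these ratios, it can help to validate the value before calling a tool. The sketch below is illustrative only: `isAspectRatio` and `orientation` are hypothetical helpers, and the list of ratios is copied from the `generate_image` signature.

```typescript
// The ten ratios accepted by generate_image, per this README.
const ASPECT_RATIOS = [
  "1:1", "2:3", "3:2", "3:4", "4:3", "4:5", "5:4", "9:16", "16:9", "21:9",
] as const;

type AspectRatio = (typeof ASPECT_RATIOS)[number];

// Type guard: narrows an arbitrary string to a supported ratio.
function isAspectRatio(value: string): value is AspectRatio {
  return (ASPECT_RATIOS as readonly string[]).indexOf(value) !== -1;
}

// Classifies a ratio by comparing its width and height parts.
function orientation(ratio: AspectRatio): "square" | "portrait" | "landscape" {
  const parts = ratio.split(":").map(Number);
  const width = parts[0] ?? 0;
  const height = parts[1] ?? 0;
  if (width === height) return "square";
  return width < height ? "portrait" : "landscape";
}
```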

⚙️ Configuration Priority

The MCP server loads your API key in the following priority order:

🥇 MCP Configuration Environment Variables (Highest Priority)
- Set in your claude_desktop_config.json or MCP client config
- Most secure as it's contained within the MCP configuration
- Example: "env": { "GEMINI_API_KEY": "your-key" }
🥈 System Environment Variables
- Set in your shell/system environment
- Example: export GEMINI_API_KEY="your-key"
🥉 Local Configuration File (Lowest Priority)
- Created when using the configure_gemini_token tool
- Stored as .nano-banana-config.json in current directory
- Automatically ignored by Git and NPM

💡 Recommendation: Use the first option (environment variables in the MCP config) for the best security and convenience.
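The priority order amounts to "first non-empty source wins". The following sketch shows that lookup with the three sources passed in explicitly. `resolveApiKey`, `KeySources`, and `KeyOrigin` are hypothetical names; how the server itself distinguishes the first two sources is not documented in this README.

```typescript
// The three places a key can come from, highest priority first.
interface KeySources {
  mcpConfigEnv?: string;    // "env" block in the MCP client config
  systemEnv?: string;       // shell or system GEMINI_API_KEY
  localConfigFile?: string; // value stored in .nano-banana-config.json
}

type KeyOrigin = "mcp-config-env" | "system-env" | "local-config-file";

// Returns the first non-empty key and where it came from, or null if none is set.
function resolveApiKey(sources: KeySources): { key: string; origin: KeyOrigin } | null {
  const candidates: Array<[KeyOrigin, string | undefined]> = [
    ["mcp-config-env", sources.mcpConfigEnv],
    ["system-env", sources.systemEnv],
    ["local-config-file", sources.localConfigFile],
  ];
  for (const [origin, value] of candidates) {
    if (value !== undefined && value.trim() !== "") {
      return { key: value.trim(), origin };
    }
  }
  return null;
}
```

For example, when both a system variable and a local config file are present, the system variable is used and the file is ignored.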

📁 File Storage

Images are automatically saved to platform-appropriate locations:

Windows: %USERPROFILE%\Documents\nano-banana-images\
macOS/Linux: ./generated_imgs/ (in current directory)
System directories: ~/nano-banana-images/ (when run from system paths)

File naming convention:

Generated images: generated-[timestamp]-[id].png
Edited images: edited-[timestamp]-[id].png
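Put together, the storage rules and naming convention look roughly like the sketch below. It is a reading of the rules above, not the server's code: `resolveOutputDir`, `buildFileName`, and `RuntimeInfo` are hypothetical, the order in which the rules are checked is an assumption, and the README does not say how a "system path" is detected or what format the timestamp and ID take, so those are plain inputs here.

```typescript
type ImageKind = "generated" | "edited";

// Inputs the decision depends on, passed in so the sketch stays pure.
interface RuntimeInfo {
  platform: string;         // e.g. "win32", "darwin", "linux"
  homeDir: string;          // %USERPROFILE% on Windows, ~ elsewhere
  cwd: string;              // current working directory
  cwdIsSystemPath: boolean; // assumption: detection logic is not documented
}

// Picks the output directory following the three rules listed above.
function resolveOutputDir(info: RuntimeInfo): string {
  if (info.platform === "win32") {
    return `${info.homeDir}\\Documents\\nano-banana-images`;
  }
  if (info.cwdIsSystemPath) {
    return `${info.homeDir}/nano-banana-images`;
  }
  return `${info.cwd}/generated_imgs`;
}

// Builds a name of the form generated-[timestamp]-[id].png or edited-[timestamp]-[id].png.
function buildFileName(kind: ImageKind, timestamp: number, id: string): string {
  return `${kind}-${timestamp}-${id}.png`;
}
```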

🎨 Example Workflows

Basic Image Generation

generate_image - Create your base image
continue_editing - Refine and improve
continue_editing - Add final touches
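On the wire, each step of this workflow is an MCP `tools/call` request. MCP clients normally build these for you, so the sketch below is only meant to show the shape of the sequence. `buildToolCall` is a hypothetical helper and the prompts are made-up examples.

```typescript
// JSON-RPC envelope for an MCP tool invocation.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Wraps a tool name and its arguments in a tools/call request.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>,
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// The three steps of the basic workflow, in order.
const basicWorkflow: ToolCallRequest[] = [
  buildToolCall(1, "generate_image", { prompt: "A lighthouse on a rocky coast at dusk" }),
  buildToolCall(2, "continue_editing", { prompt: "Add storm clouds rolling in" }),
  buildToolCall(3, "continue_editing", { prompt: "Warm up the light from the lamp" }),
];
```

Because `continue_editing` works on the last image in the session, the second and third calls carry only a prompt and no image path.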

High-Resolution Pro Generation

generate_image with model: "pro", resolution: "4K" - Create a high-quality base
continue_editing - Iterate with full conversation context

Search-Grounded Image

generate_image with useGoogleSearch: true - Generate from real-time data
- Example: "Visualize the current weather forecast for Tokyo as an infographic"
- Example: "Create a chart showing recent tech stock performance"

Social Media Assets

generate_image with aspectRatio: "9:16" - Vertical story format
generate_image with aspectRatio: "16:9" - YouTube thumbnail
generate_image with aspectRatio: "1:1" - Instagram post

Style Transfer

generate_image - Create base content
edit_image with reference images - Apply a style from reference
continue_editing - Fine-tune the result

Iterative Design

generate_image - Start with a concept
get_last_image_info - Check current state
continue_editing - Make adjustments
Repeat until satisfied

🔧 Development

This project was created with Claude Code and uses the following technologies:

TypeScript - Type-safe development
Node.js - Runtime environment
Zod - Schema validation
Google GenAI - Image generation API (Flash + Pro models)
MCP SDK - Model Context Protocol

Local Development

# Clone the repository
git clone https://github.com/kishorkukreja/Nano-Banana-MCP.git
cd Nano-Banana-MCP

# Install dependencies
npm install

# Run in development mode
npm run dev

# Build for production
npm run build

# Run tests
npm test

📋 Requirements

Node.js 18.0.0 or higher
Gemini API key from Google AI Studio
Compatible with Claude Code, Cursor, and other MCP clients

🤝 Contributing

This project was generated by Claude Code, but contributions are welcome! Please feel free to:

Report bugs
Suggest new features
Submit pull requests
Improve documentation

📄 License

MIT License - see LICENSE file for details.

🙏 Acknowledgments

Claude Code - For generating this entire project
Google AI - For the Gemini 2.5 Flash and Gemini 3 Pro Image APIs
Anthropic - For the Model Context Protocol
Open Source Community - For the amazing tools and libraries

📞 Support

🐛 Issues: GitHub Issues
📖 Documentation: This README and inline code comments
💬 Discussions: GitHub Discussions

✨ Generated with Claude Code
