video-gen-mcp v1.0.0
MCP server for Google Veo 3.1 video generation
Video Generation MCP Server
A Model Context Protocol (MCP) server that enables AI assistants to generate videos using Google's Veo 3.1 model through the Google GenAI SDK.
Features
- Text-to-Video Generation: Create videos from detailed text prompts
- Image-to-Video Generation: Generate videos starting from a reference image
- Reference Images: Support for up to 3 reference images for character/object/style consistency
- Video Configuration: Customize aspect ratio, negative prompts, and random seeds
- MCP Compatible: Works seamlessly with Claude Desktop, Cursor, and other MCP clients
Prerequisites
- Node.js 18 or higher
- Google GenAI API key (from Google AI Studio or Vertex AI)
- An MCP-compatible AI assistant (e.g., Claude Desktop, Cursor)
Installation
1. Clone or download this repository
2. Install dependencies:

   ```bash
   npm install
   ```

3. Build the project:

   ```bash
   npm run build
   ```

4. Set up your Google GenAI API key:

   ```bash
   export GOOGLE_GENAI_API_KEY='your-api-key-here'
   ```

You can get an API key from:
- Google AI Studio (for Gemini API)
- Google Cloud Console (for Vertex AI)
Usage
With Claude Desktop
Add this server to your Claude Desktop configuration file:
MacOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json
```json
{
  "mcpServers": {
    "video-gen": {
      "command": "node",
      "args": ["/absolute/path/to/video-gen-mcp/dist/index.js"],
      "env": {
        "GOOGLE_GENAI_API_KEY": "your-api-key-here"
      }
    }
  }
}
```

With Cursor
Add to your Cursor MCP settings:
```json
{
  "mcpServers": {
    "video-gen": {
      "command": "node",
      "args": ["/absolute/path/to/video-gen-mcp/dist/index.js"],
      "env": {
        "GOOGLE_GENAI_API_KEY": "your-api-key-here"
      }
    }
  }
}
```

Available Tools
1. generate_video
Generate a video from a text prompt.
Parameters:
- prompt (string, required): Detailed description of the video to generate
- config (object, optional):
  - aspectRatio (string): "16:9", "9:16", or "1:1"
  - negativePrompt (string): Things to avoid in the video
  - seed (number): Random seed for reproducible generation
- referenceImages (array, optional): Up to 3 reference images
  - imageBytes (string): Base64-encoded image data
  - mimeType (string): Image MIME type (default: "image/png")
  - referenceType (string): "asset" or "style"
- outputPath (string, optional): Where to save the video (default: ~/Videos with a unique timestamped filename)
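Under the hood, an MCP client invokes this tool with a standard `tools/call` request. A hypothetical call using the parameters above might look like this (the prompt and values are purely illustrative):

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "generate_video",
    "arguments": {
      "prompt": "A cinematic drone shot over snowy mountains at dawn",
      "config": {
        "aspectRatio": "16:9",
        "negativePrompt": "text overlays, watermarks",
        "seed": 42
      }
    }
  }
}
```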
Example:
Generate a video of a cinematic shot of a majestic lion walking through the savannah at sunset, in 16:9 aspect ratio

2. generate_video_from_image
Generate a video starting from an image (image-to-video).
Parameters:
- prompt (string, required): Description of the video to generate
- imageBytes (string, required): Base64-encoded starting image
- imageMimeType (string, optional): MIME type (default: "image/png")
- config (object, optional): Same as generate_video
- referenceImages (array, optional): Additional reference images
- outputPath (string, optional): Where to save the video (default: ~/Videos with a unique timestamped filename)
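Note that imageBytes expects base64 text, not raw bytes. A minimal Node.js helper for preparing it (the file name in the example is illustrative):

```typescript
import { readFileSync } from "node:fs";

// Read an image file and encode it as base64 for the imageBytes parameter.
function encodeImage(path: string): string {
  return readFileSync(path).toString("base64");
}

// Example: const imageBytes = encodeImage("kitten.png");
```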
Example:
Using this image of a sleeping kitten, generate a video showing the kitten waking up and stretching

Example Prompts
When using the server through Claude or another AI assistant, you can ask:
Simple text-to-video:
"Generate a video of a close-up shot of a butterfly landing on a flower"

With configuration:
"Create a 9:16 vertical video of a waterfall in a lush forest. Avoid any people or animals in the scene."

Image-to-video:
"I have a photo of a city street. Can you create a video that brings it to life with cars moving and people walking?"

With reference images:
"Generate a video of a character walking through a marketplace. Use these reference images to maintain consistent appearance."
Video Generation Tips
- Be Descriptive: Provide detailed prompts including camera movement, lighting, mood, and action
- Specify Technical Details: Include aspect ratio, duration preferences, and shot types
- Use Reference Images: For character consistency across multiple shots
- Negative Prompts: Explicitly mention what you want to avoid
- Be Patient: Video generation is not instant; expect roughly 1-3 minutes per video
Architecture
The server follows a clean, layered architecture:
```
src/
├── index.ts          # MCP server setup and tool handlers
├── video-service.ts  # Google GenAI SDK integration
└── types.ts          # Type definitions and Zod schemas
```

Development
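Veo generation is asynchronous: the SDK returns a long-running operation that the service layer must poll until it completes. A generic sketch of that polling pattern (the function name, shape, and interval are assumptions for illustration, not this repo's actual code):

```typescript
// Poll a long-running operation until its `done` flag is set.
// `refresh` fetches the latest operation state (e.g. via the GenAI SDK).
async function pollUntilDone<T extends { done?: boolean }>(
  initial: T,
  refresh: (op: T) => Promise<T>,
  intervalMs = 10_000,
): Promise<T> {
  let op = initial;
  while (!op.done) {
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
    op = await refresh(op);
  }
  return op;
}
```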
Watch Mode

```bash
npm run dev
```

Testing Locally

```bash
# Set your API key
export GOOGLE_GENAI_API_KEY='your-key'

# Run the server (it will wait for MCP JSON-RPC messages on stdin)
npm start
```

Troubleshooting
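If you pipe messages in by hand, keep in mind that MCP clients begin with an initialize handshake before listing or calling tools. A minimal first message looks roughly like this (the protocolVersion and clientInfo values are illustrative and may differ by client):

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "initialize",
  "params": {
    "protocolVersion": "2024-11-05",
    "capabilities": {},
    "clientInfo": { "name": "manual-test", "version": "0.0.1" }
  }
}
```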
"API key not found" error
Make sure your GOOGLE_GENAI_API_KEY environment variable is set in the MCP configuration.
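When testing from a shell rather than an MCP client, a quick way to confirm the variable is visible to child processes (prints "set" only when the key is defined):

```shell
echo "${GOOGLE_GENAI_API_KEY:+set}"
```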
"Video generation failed" error
- Check that you have access to the Veo 3.1 model in your Google AI account
- Verify your API key has the necessary permissions
- Ensure your prompt doesn't violate content policies
Server not appearing in Claude Desktop
- Verify the path to dist/index.js is absolute
- Check that the build completed successfully (npm run build)
- Restart Claude Desktop after configuration changes
- Check Claude Desktop logs for error messages
API Rate Limits
Be aware of Google GenAI API rate limits:
- Free tier: Limited requests per minute
- Pay-as-you-go: Check Google AI pricing
License
MIT
Contributing
Contributions are welcome! Please feel free to submit issues or pull requests.
