@ugarchance/mcp-gemini-video-understanding
v0.3.0
MCP server for analyzing videos using Gemini API - converts videos to text for Claude Code. Supports local files (with Unicode/Turkish characters) and YouTube URLs.
MCP Gemini Video Understanding
An MCP (Model Context Protocol) server that uses Google's Gemini API to analyze videos and convert them to text descriptions that Claude Code can understand and act upon.
What is this?
This MCP server acts as a bridge between video content and Claude Code. When you have a video (screen recording, Loom video, YouTube tutorial, etc.), this server uses Gemini's powerful video understanding capabilities to extract meaningful text descriptions that Claude Code can then use to write code, fix bugs, or implement features.
Use Cases
- Bug Reproduction Videos: Record a video showing a bug → Get detailed steps to reproduce and debugging insights
- Design Mockups: Show a design in a video → Get implementation guidance with UI component breakdowns
- YouTube Tutorials: Share a tutorial URL → Extract key learnings and implementation steps
- Responsive Issues: Record layout problems → Get specific CSS fixes and responsive solutions
Installation
npm install -g @ugarchance/mcp-gemini-video-understanding
Or use directly with npx:
npx @ugarchance/mcp-gemini-video-understanding
Setup
1. Get a Gemini API Key
- Go to Google AI Studio
- Click "Get API Key"
- Create or select a project
- Copy your API key
2. Set Environment Variable
export GEMINI_API_KEY="your-api-key-here"
3. Configure Claude Code
Add to your claude_desktop_config.json:
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json
{
  "mcpServers": {
    "gemini-video": {
      "command": "npx",
      "args": [
        "-y",
        "@ugarchance/mcp-gemini-video-understanding"
      ],
      "env": {
        "GEMINI_API_KEY": "your-api-key-here"
      }
    }
  }
}
Or if installed globally:
{
  "mcpServers": {
    "gemini-video": {
      "command": "mcp-gemini-video",
      "env": {
        "GEMINI_API_KEY": "your-api-key-here"
      }
    }
  }
}
Usage
All tools support these common parameters:
- model (string, optional): Gemini model to use. Options:
  - gemini-2.5-pro: Most capable, best for complex analysis
  - gemini-2.5-flash: Default, balanced speed and quality
  - gemini-2.5-flash-lite: Fastest, lighter analysis
  - gemini-2.0-flash: Previous generation fast model
  - gemini-2.0-flash-exp: Experimental features
- output_file (string, optional): Path to save analysis. If the file exists, the cached result is used (no re-analysis!)
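To illustrate how the optional model parameter could be resolved, here is a minimal TypeScript sketch that falls back to the documented default and rejects names outside the list above. The helper name `resolveModel` and its error message are assumptions for illustration, not the server's actual code.

```typescript
// Model names as listed in this README; the server's internal list may differ.
const SUPPORTED_MODELS = [
  "gemini-2.5-pro",
  "gemini-2.5-flash",
  "gemini-2.5-flash-lite",
  "gemini-2.0-flash",
  "gemini-2.0-flash-exp",
] as const;

type GeminiModel = (typeof SUPPORTED_MODELS)[number];

// Default documented in this README.
const DEFAULT_MODEL: GeminiModel = "gemini-2.5-flash";

// Hypothetical helper: returns the requested model, or the default when omitted.
function resolveModel(requested?: string): GeminiModel {
  if (requested === undefined || requested === "") {
    return DEFAULT_MODEL;
  }
  const match = SUPPORTED_MODELS.find((m) => m === requested);
  if (match === undefined) {
    throw new Error(
      `Unsupported model "${requested}". Expected one of: ${SUPPORTED_MODELS.join(", ")}`
    );
  }
  return match;
}
```

Failing early on an unknown name gives a clearer message than whatever error the API would return for it.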
Tool 1: analyze_bug_video
Analyze a video showing a bug or error.
Parameters:
- video_path (string): Path to video file or YouTube URL
- is_youtube (boolean, optional): Set to true if using YouTube URL
- additional_context (string, optional): Extra context about the bug
- model (string, optional): Gemini model to use
- output_file (string, optional): Path to save analysis
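Claude Code normally constructs the tool call itself, but it can help to see how these parameters fit together. The sketch below builds the JSON-RPC `tools/call` request that an MCP client would send for this tool. The helper name `buildBugVideoCall` and the request `id` are illustrative assumptions.

```typescript
// Arguments for analyze_bug_video, mirroring the parameter list above.
interface AnalyzeBugVideoArgs {
  video_path: string;
  is_youtube?: boolean;
  additional_context?: string;
  model?: string;
  output_file?: string;
}

// Shape of an MCP "tools/call" request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: AnalyzeBugVideoArgs };
}

// Hypothetical helper that wraps the arguments in a tools/call envelope.
function buildBugVideoCall(id: number, args: AnalyzeBugVideoArgs): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "analyze_bug_video", arguments: args },
  };
}

// Example: a local file with the analysis saved to bug-analysis.md.
const request = buildBugVideoCall(1, {
  video_path: "/Users/me/Desktop/bug-demo.mp4",
  output_file: "bug-analysis.md",
});
console.log(JSON.stringify(request, null, 2));
```

The other three tools would use the same envelope with a different `name` and their own argument set.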
Example with Claude Code:
I have a bug video at /Users/me/Desktop/bug-demo.mp4
Save the analysis to bug-analysis.md and fix the issue.
With model selection:
Analyze /Users/me/Desktop/complex-bug.mp4 using gemini-2.5-pro
Save to analysis.txt and help me fix it.
Tool 2: analyze_design_video
Analyze a video showing a design mockup or feature demonstration.
Parameters:
- video_path (string): Path to video file or YouTube URL
- is_youtube (boolean, optional): Set to true if using YouTube URL
- tech_stack (string, optional): Technologies to use (e.g., "React with Tailwind")
- model (string, optional): Gemini model to use
- output_file (string, optional): Path to save analysis
Example with Claude Code:
I recorded a design mockup at /Users/me/Desktop/new-feature.mp4
Save analysis to design-spec.md then implement using React and Tailwind CSS.
Tool 3: analyze_tutorial_video
Analyze a YouTube tutorial to extract key learnings.
Parameters:
- video_url (string): YouTube URL
- focus_area (string, optional): Specific topic to focus on
- model (string, optional): Gemini model to use
- output_file (string, optional): Path to save analysis
Example with Claude Code:
Watch this tutorial: https://www.youtube.com/watch?v=xxxxx
Save the learnings to tutorial-notes.md then implement the auth system.
Using faster model for quick summaries:
Analyze https://www.youtube.com/watch?v=xxxxx with gemini-2.5-flash-lite
Just give me the key points.
Tool 4: analyze_responsive_issues
Analyze a video showing responsive design problems.
Parameters:
- video_path (string): Path to video file or YouTube URL
- is_youtube (boolean, optional): Set to true if using YouTube URL
- target_devices (string, optional): Target devices (e.g., "mobile, tablet")
- model (string, optional): Gemini model to use
- output_file (string, optional): Path to save analysis
Example with Claude Code:
I recorded responsive issues at /Users/me/Desktop/mobile-issues.mp4
Save analysis to responsive-fixes.md and fix the layout for mobile.
How It Works
- You record a video or find a YouTube URL
- You ask Claude Code to analyze it via MCP (optionally specifying model and output file)
- MCP Server checks if cached analysis exists (if output_file specified)
- If no cache: Sends video to Gemini API with chosen model
- Gemini analyzes video and returns detailed text description
- MCP Server saves result to file (if output_file specified)
- Claude Code receives the text and can now write/fix code based on it
Caching Strategy
When you specify an output_file:
- First run: Video is analyzed and result is saved to the file
- Subsequent runs: Cached file is read instantly (no API call, no cost!)
- To re-analyze: Delete the output file first
This is perfect for:
- Iterating on implementations without re-analyzing videos
- Sharing analysis results with team members
- Reducing API costs and latency
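The caching behavior described in this section can be sketched as a small read-through cache. This is an assumption about the general shape, not the server's actual implementation: the `TextStore` interface, the `Analyzer` type, and the `analyzeWithCache` and `memoryStore` names are all hypothetical, and storage is abstracted so the example runs without touching disk.

```typescript
// Minimal storage abstraction; a real server would presumably use the file system.
interface TextStore {
  read(path: string): string | undefined;
  write(path: string, text: string): void;
}

// Hypothetical stand-in for the Gemini video analysis call.
type Analyzer = (videoPath: string) => Promise<string>;

async function analyzeWithCache(
  videoPath: string,
  outputFile: string | undefined,
  store: TextStore,
  analyze: Analyzer
): Promise<{ text: string; fromCache: boolean }> {
  // Subsequent runs: return the saved analysis without calling the API.
  if (outputFile !== undefined) {
    const cached = store.read(outputFile);
    if (cached !== undefined) {
      return { text: cached, fromCache: true };
    }
  }
  // First run (or no output_file): analyze, then save if a path was given.
  const text = await analyze(videoPath);
  if (outputFile !== undefined) {
    store.write(outputFile, text);
  }
  return { text, fromCache: false };
}

// In-memory store used only to demonstrate the flow.
function memoryStore(): TextStore {
  const files = new Map<string, string>();
  return {
    read: (path) => files.get(path),
    write: (path, text) => {
      files.set(path, text);
    },
  };
}
```

Note that a cache keyed only on the output path cannot tell when the video itself has changed, which is consistent with the advice to delete the output file before re-analyzing.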
Supported Video Formats
- MP4
- MOV
- AVI
- WebM
- MKV
- FLV
- WMV
- 3GP
- MPEG
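A quick way to check a file against this list before sending it is to compare its extension, as in the sketch below. The helper name `isSupportedVideo` is hypothetical, and checking only the extension is a simplification: it says nothing about the codec inside the container.

```typescript
// Extensions taken from the list above (lowercase, without the dot).
const SUPPORTED_EXTENSIONS = new Set([
  "mp4", "mov", "avi", "webm", "mkv", "flv", "wmv", "3gp", "mpeg",
]);

// Hypothetical helper: true when the file name ends in a listed extension.
function isSupportedVideo(filePath: string): boolean {
  const dot = filePath.lastIndexOf(".");
  if (dot === -1 || dot === filePath.length - 1) {
    return false; // no extension at all
  }
  // toLowerCase() is locale-independent, so "AVI" maps to "avi" on any system.
  const ext = filePath.slice(dot + 1).toLowerCase();
  return SUPPORTED_EXTENSIONS.has(ext);
}

console.log(isSupportedVideo("/Users/me/Masaüstü/hata-kaydı.MP4")); // true
console.log(isSupportedVideo("/Users/me/Desktop/notes.txt")); // false
```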
Available Models
| Model | Speed | Quality | Best For | Cost |
|-------|-------|---------|----------|------|
| gemini-2.5-pro | Slow | Highest | Complex bugs, detailed designs | $$$ |
| gemini-2.5-flash | Fast | High | General use (default) | $$ |
| gemini-2.5-flash-lite | Fastest | Good | Quick summaries, simple videos | $ |
| gemini-2.0-flash | Fast | Good | Previous gen, reliable | $$ |
| gemini-2.0-flash-exp | Fast | Varies | Experimental features | $$ |
Limitations
- YouTube: Only public videos (not private or unlisted)
- File Size: Files >20MB automatically use Gemini's File API (may take longer to process)
- Video Length: Longer videos take more time to process
- Rate Limits: Subject to Gemini API rate limits
- Caching: Only works when output_file is specified
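The file size rule above amounts to a threshold check. The sketch below shows one way to express it, assuming that files at or below the limit are sent inline with the request and that "20MB" means 20 × 1024 × 1024 bytes. Both points are assumptions, as are the names `chooseUploadStrategy` and `INLINE_LIMIT_BYTES`.

```typescript
// Assumption: "20MB" is interpreted as 20 * 1024 * 1024 bytes.
const INLINE_LIMIT_BYTES = 20 * 1024 * 1024;

type UploadStrategy = "inline" | "file_api";

// Hypothetical helper: picks how a local video would be sent to Gemini.
function chooseUploadStrategy(sizeBytes: number): UploadStrategy {
  if (!Number.isFinite(sizeBytes) || sizeBytes < 0) {
    throw new RangeError(`Invalid file size: ${sizeBytes}`);
  }
  // Files larger than the limit go through the File API, which takes longer.
  return sizeBytes > INLINE_LIMIT_BYTES ? "file_api" : "inline";
}

console.log(chooseUploadStrategy(5 * 1024 * 1024)); // "inline"
console.log(chooseUploadStrategy(50 * 1024 * 1024)); // "file_api"
```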
Development
Local Development
# Clone the repo
git clone https://github.com/ugarchance/mcp-gemini-video-understanding
cd mcp-gemini-video-understanding
# Install dependencies
npm install
# Build
npm run build
# Test locally with Claude Code
# Add to claude_desktop_config.json:
{
  "mcpServers": {
    "gemini-video": {
      "command": "node",
      "args": ["/absolute/path/to/mcp-gemini-video-understanding/build/index.js"],
      "env": {
        "GEMINI_API_KEY": "your-key"
      }
    }
  }
}
Publishing to npm
# Update package.json with your npm username
npm login
npm publish
Troubleshooting
"GEMINI_API_KEY environment variable is required"
Make sure you've set the GEMINI_API_KEY in your claude_desktop_config.json under the env section.
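This error is the kind a server raises at startup when the key is missing from its environment. A minimal sketch of such a check follows. The helper name `requireApiKey` is hypothetical, and the actual server may perform the check differently.

```typescript
// Hypothetical startup check: reads the key from an environment-like object.
function requireApiKey(env: Record<string, string | undefined>): string {
  const key = env["GEMINI_API_KEY"]?.trim();
  if (!key) {
    // Same message as the error quoted above.
    throw new Error("GEMINI_API_KEY environment variable is required");
  }
  return key;
}

// In a Node.js server this would typically be called as requireApiKey(process.env).
console.log(requireApiKey({ GEMINI_API_KEY: "your-api-key-here" }));
```

Because the MCP client launches the server as a child process, a key exported only in your interactive shell may not reach it, which is why the config file's env section is the place to set it.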
"Error analyzing video"
- Check that the video file path is absolute (not relative)
- Verify the video format is supported
- For YouTube videos, ensure the URL is valid and the video is public
- Check Gemini API quotas and rate limits
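The first and third checks in this list can be automated before a call is made. The sketch below is a simplified pre-flight check under stated assumptions: the function names are hypothetical, the absolute-path test only recognizes POSIX and Windows drive-letter paths, and the set of accepted YouTube hosts is a guess. It cannot tell whether a video is public or whether quota remains.

```typescript
// Simplified absolute-path test: POSIX ("/...") or Windows drive ("C:\...").
function looksAbsolute(p: string): boolean {
  return p.startsWith("/") || /^[A-Za-z]:[\\/]/.test(p);
}

// Loose YouTube URL test; the hosts accepted here are an assumption.
function looksLikeYouTubeUrl(value: string): boolean {
  let url: URL;
  try {
    url = new URL(value);
  } catch {
    return false; // not a parseable URL
  }
  if (url.protocol !== "https:" && url.protocol !== "http:") {
    return false;
  }
  const host = url.hostname.replace(/^www\./, "");
  if (host === "youtu.be") {
    return url.pathname.length > 1; // short link must carry a video id
  }
  if (host === "youtube.com" || host === "m.youtube.com") {
    return url.pathname === "/watch" && url.searchParams.has("v");
  }
  return false;
}

// Returns a list of problems; an empty list means the input passed both checks.
function preflight(videoPath: string, isYouTube: boolean): string[] {
  const problems: string[] = [];
  if (isYouTube) {
    if (!looksLikeYouTubeUrl(videoPath)) {
      problems.push("not a recognizable YouTube URL");
    }
  } else if (!looksAbsolute(videoPath)) {
    problems.push("video path is relative; use an absolute path");
  }
  return problems;
}
```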
Tools not showing in Claude Code
- Restart Claude Code completely (Cmd+Q on Mac, not just close window)
- Check claude_desktop_config.json syntax is valid JSON
- Look at Claude Code logs: ~/Library/Logs/Claude/mcp*.log (macOS)
License
MIT
Contributing
Contributions welcome! Please open an issue or PR.
Credits
Built with:
- Gemini API for video understanding
- Model Context Protocol for Claude integration
