video-context-mcp-server

v1.3.1

Published

a month ago

A Model Context Protocol server that gives GitHub Copilot the ability to understand and analyze video content

Video Context MCP Server

An MCP server that gives coding assistants (GitHub Copilot, Cursor, Claude Code) the ability to understand and analyze video content and generate media (text-to-speech, images, videos, music) via the MiniMax API.

Official Documentation Tutorials Playlist

Features

🎬 Video Q&A — Ask questions about video content and get AI-powered answers
📝 Video Summarization — Generate structured summaries with key scenes and timelines
🖼️ Frame Extraction — Extract frames at specific timestamps or intervals
🔍 Timestamp Search — Find the exact moment when something happens in a video
📊 Video Metadata — Get duration, resolution, fps, codec, and other technical details
🎙️ Audio Transcription — Transcribe speech with paragraph-level timestamps ([MM:SS]) or export as SRT/VTT subtitles and JSON using Deepgram, AssemblyAI, Groq/Whisper, or Gemini
🔊 Speaker Diarization — Identify who said what (Deepgram and AssemblyAI)
🔊 Audio-Enhanced Analysis — Auto-transcribes audio and injects transcripts into AI prompts for richer results (all non-Gemini video providers: GLM, MiniMax-M3, Kimi, Qwen, MiMo)
🔄 Multi-Provider Support — Choose between GLM-4.6V, Qwen3.7, Kimi K2.6, Gemini, MiMo-V2.5, or MiniMax-M3
🎯 Smart Video Handling — Extracts keyframes from long videos to reduce token usage
🗣️ Text-to-Speech ⚠️ Experimental — Convert text to natural speech audio (MiniMax TTS)
🖼️ Image Generation ⚠️ Experimental — Generate images from text prompts (MiniMax image-01)
🎬 Video Generation ⚠️ Experimental — Generate videos from text or image prompts (MiniMax Hailuo)
🎵 Music Generation ⚠️ Experimental — Generate music from prompts and lyrics (MiniMax music-2.6)
🔒 Sensitive Redaction ⚠️ Experimental — Blur, pixelate, or blackout API keys and secrets in screen recordings (manual coordinates or AI-assisted detection)
⭐ Pro tier — Extended frame extraction, multi-platform downloads, higher resolution, media generation. Learn more ↓

Quick Start

Prerequisites

Node.js 20+
A supported MCP client (VS Code + GitHub Copilot Chat, Cursor, or Claude Code)
At least one video and one audio API key — see API Keys below

Install

npm install -g video-context-mcp-server

Updating & version check

npm ls -g video-context-mcp-server   # installed version
npm outdated -g video-context-mcp-server        # check for updates
npm install -g video-context-mcp-server@latest  # update

Configure Your MCP Client

In VS Code, you can configure the MCP server globally (for all projects) or at the workspace level.

Global configuration

Open VS Code → Settings → MCP and add the server configuration.

Workspace-level configuration

Create or update .vscode/mcp.json in your workspace.

Important: This file contains sensitive API keys. Never commit it to source control. Ensure it is added to your .gitignore file.

Minimal configuration

You only need one API key to get started. The example below uses Gemini, which has a free tier, requires no credit card, and serves as both a video and an audio provider:

{
  "servers": {
    "videoMcp": {
      "type": "stdio",
      "command": "video-context-mcp",
      "env": {
        "GEMINI_API_KEY": "your-gemini-key"
      }
    }
  }
}

Full configuration (all providers)

Set all keys to enable the full fallback chain. If one provider is unavailable or rate-limited, the server automatically tries the next:

{
  "servers": {
    "videoMcp": {
      "type": "stdio",
      "command": "video-context-mcp",
      "env": {
        "GEMINI_API_KEY": "your-gemini-key",
        "Z_AI_API_KEY": "your-zai-key",
        "DASHSCOPE_API_KEY": "your-dashscope-key",
        "MOONSHOT_API_KEY": "your-moonshot-key",
        "MIMO_API_KEY": "your-mimo-key",
        "DEEPGRAM_API_KEY": "your-deepgram-key",
        "ASSEMBLYAI_API_KEY": "your-assemblyai-key",
        "GROQ_API_KEY": "your-groq-key",
        "MINIMAX_API_KEY": "your-minimax-key"
      }
    }
  }
}

Open Copilot Chat — the MCP server starts automatically when tools are needed.

In Cursor, you can configure the MCP server globally (for all projects) or at the project level. The configuration format is the same for both:

{
  "mcpServers": {
    "videoMcp": {
      "command": "video-context-mcp",
      "env": {
        "GEMINI_API_KEY": "your-gemini-key",
        "Z_AI_API_KEY": "your-zai-key",
        "DASHSCOPE_API_KEY": "your-dashscope-key",
        "MOONSHOT_API_KEY": "your-moonshot-key",
        "MIMO_API_KEY": "your-mimo-key",
        "DEEPGRAM_API_KEY": "your-deepgram-key",
        "ASSEMBLYAI_API_KEY": "your-assemblyai-key",
        "GROQ_API_KEY": "your-groq-key",
        "MINIMAX_API_KEY": "your-minimax-key"
      }
    }
  }
}

Global configuration

Open Cursor → Settings → MCP and add the server configuration above.

Project-level configuration

Create a .cursor/mcp.json (or .mcp.json) in your project root with the configuration above.

Important: This file contains sensitive API keys. Never commit it to source control. Ensure it is added to your .gitignore file.

For Claude Code, use either the CLI or a project-level .mcp.json.

Option A — CLI (claude mcp add)

claude mcp add \
  --env GEMINI_API_KEY=your-gemini-key \
  --env Z_AI_API_KEY=your-zai-key \
  --env DASHSCOPE_API_KEY=your-dashscope-key \
  --env MOONSHOT_API_KEY=your-moonshot-key \
  --env MIMO_API_KEY=your-mimo-key \
  --env DEEPGRAM_API_KEY=your-deepgram-key \
  --env ASSEMBLYAI_API_KEY=your-assemblyai-key \
  --env GROQ_API_KEY=your-groq-key \
  videoMcp -- video-context-mcp

Verify with claude mcp list.

Option B — project-level .mcp.json

Create .mcp.json in your project root:

Important: This file contains sensitive API keys. Never commit it to source control. Ensure it is added to your .gitignore file.

{
  "mcpServers": {
    "videoMcp": {
      "command": "video-context-mcp",
      "env": {
        "GEMINI_API_KEY": "your-gemini-key",
        "Z_AI_API_KEY": "your-zai-key",
        "DASHSCOPE_API_KEY": "your-dashscope-key",
        "MOONSHOT_API_KEY": "your-moonshot-key",
        "MIMO_API_KEY": "your-mimo-key",
        "DEEPGRAM_API_KEY": "your-deepgram-key",
        "ASSEMBLYAI_API_KEY": "your-assemblyai-key",
        "GROQ_API_KEY": "your-groq-key"
      }
    }
  }
}

Antigravity stores its MCP configuration at ~/.gemini/antigravity/mcp_config.json.

Option A — MCP Store UI

Click the ... dropdown at the top of the Antigravity agent panel and open the MCP Store.
Click Manage MCP Servers → View raw config.
Add the videoMcp entry shown under Option B to your mcpServers object.

Option B — Direct Edit

Open ~/.gemini/antigravity/mcp_config.json and add videoMcp:

{
  "mcpServers": {
    "videoMcp": {
      "command": "video-context-mcp",
      "env": {
        "GEMINI_API_KEY": "your-gemini-key",
        "Z_AI_API_KEY": "your-zai-key",
        "DASHSCOPE_API_KEY": "your-dashscope-key",
        "MOONSHOT_API_KEY": "your-moonshot-key",
        "MIMO_API_KEY": "your-mimo-key",
        "DEEPGRAM_API_KEY": "your-deepgram-key",
        "ASSEMBLYAI_API_KEY": "your-assemblyai-key",
        "GROQ_API_KEY": "your-groq-key",
        "MINIMAX_API_KEY": "your-minimax-key"
      }
    }
  }
}

Codex stores MCP configuration in ~/.codex/config.toml (global) or .codex/config.toml in a trusted project root.

Option A — CLI

codex mcp add videoMcp \
  --env GEMINI_API_KEY=your-gemini-key \
  --env Z_AI_API_KEY=your-zai-key \
  --env DASHSCOPE_API_KEY=your-dashscope-key \
  --env MOONSHOT_API_KEY=your-moonshot-key \
  --env MIMO_API_KEY=your-mimo-key \
  --env DEEPGRAM_API_KEY=your-deepgram-key \
  --env ASSEMBLYAI_API_KEY=your-assemblyai-key \
  --env GROQ_API_KEY=your-groq-key \
  --env MINIMAX_API_KEY=your-minimax-key \
  -- video-context-mcp

Verify with codex mcp --help or type /mcp inside the Codex TUI.

Option B — config.toml

Add the following to ~/.codex/config.toml (or a project-scoped .codex/config.toml):

[mcp_servers.videoMcp]
command = "video-context-mcp"

[mcp_servers.videoMcp.env]
GEMINI_API_KEY = "your-gemini-key"
Z_AI_API_KEY = "your-zai-key"
DASHSCOPE_API_KEY = "your-dashscope-key"
MOONSHOT_API_KEY = "your-moonshot-key"
MIMO_API_KEY = "your-mimo-key"
DEEPGRAM_API_KEY = "your-deepgram-key"
ASSEMBLYAI_API_KEY = "your-assemblyai-key"
GROQ_API_KEY = "your-groq-key"
MINIMAX_API_KEY = "your-minimax-key"

Important: This file contains sensitive API keys. Never commit it to source control. Ensure it is added to your .gitignore file.

To run the server without a global install, use npx -y video-context-mcp-server@latest as the command instead of video-context-mcp. This adds a startup delay (an npm registry check on each launch) but always runs the latest published version.

VS Code example:

{
  "servers": {
    "videoMcp": {
      "type": "stdio",
      "command": "npx",
      "args": ["-y", "video-context-mcp-server@latest"],
      "env": {
        "GEMINI_API_KEY": "your-gemini-key",
        "Z_AI_API_KEY": "your-zai-key",
        "DASHSCOPE_API_KEY": "your-dashscope-key",
        "MOONSHOT_API_KEY": "your-moonshot-key",
        "MIMO_API_KEY": "your-mimo-key",
        "DEEPGRAM_API_KEY": "your-deepgram-key",
        "ASSEMBLYAI_API_KEY": "your-assemblyai-key",
        "GROQ_API_KEY": "your-groq-key",
        "MINIMAX_API_KEY": "your-minimax-key"
      }
    }
  }
}

Claude Code:

claude mcp add \
  --env GEMINI_API_KEY=your-gemini-key \
  --env Z_AI_API_KEY=your-zai-key \
  --env DASHSCOPE_API_KEY=your-dashscope-key \
  --env MOONSHOT_API_KEY=your-moonshot-key \
  --env MIMO_API_KEY=your-mimo-key \
  --env DEEPGRAM_API_KEY=your-deepgram-key \
  --env ASSEMBLYAI_API_KEY=your-assemblyai-key \
  --env GROQ_API_KEY=your-groq-key \
  --env MINIMAX_API_KEY=your-minimax-key \
  videoMcp -- npx -y video-context-mcp-server@latest

Tip: Set keys for all providers you can access — this maximises the fallback chain. If the default provider is unavailable or rate-limited, the server automatically retries with the next available one. At minimum you need one video key and one audio key, but more is better.
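Before starting a client, an env block can be checked against the minimum described above. The sketch below is illustrative only and is not part of the package. The key groupings are taken from the provider tables in this README, and the name check_env is hypothetical.

```python
# Illustrative sketch only; not part of video-context-mcp-server.
# Key groupings follow the provider tables in this README.
VIDEO_KEYS = {
    "GEMINI_API_KEY", "Z_AI_API_KEY", "DASHSCOPE_API_KEY",
    "MOONSHOT_API_KEY", "MIMO_API_KEY", "MINIMAX_API_KEY",
}
AUDIO_KEYS = {
    "DEEPGRAM_API_KEY", "ASSEMBLYAI_API_KEY", "GROQ_API_KEY",
    "GEMINI_API_KEY",  # Gemini reuses the same key for audio
}


def check_env(env):
    """Return a list of problems; an empty list means the minimum is met."""
    present = {name for name, value in env.items() if value.strip()}
    problems = []
    if not present & VIDEO_KEYS:
        problems.append("no video provider key set")
    if not present & AUDIO_KEYS:
        problems.append("no audio provider key set")
    return problems
```

A block containing only GEMINI_API_KEY passes both checks, because the same Gemini key is used for video and audio.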

Available Tools

| Tool | Description | Key Parameters |
| --- | --- | --- |
| analyze_video | Ask questions about video content | videoPath, question, provider? |
| summarize_video | Generate a structured video summary | videoPath, provider? |
| extract_frames | Extract frames from a video | videoPath, mode, count/intervalSec/timestamps/sceneThreshold, maxImages?, offset? |
| search_timestamp | Find when something specific happens | videoPath, query, provider? |
| get_video_info | Get video metadata | videoPath |
| transcribe_video | Transcribe audio/speech from a video | videoPath, provider?, language?, diarize?, translate?, outputFormat? |
| redact_sensitive ⚠️ | Blur, pixelate, or blackout sensitive regions in a video (experimental) | videoPath, effect?, regions?, detectionMode?, previewOnly?, provider? |
| text_to_speech ⚠️ | Convert text to natural speech audio (experimental) | text, model?, voice_id?, speed?, vol?, pitch?, emotion?, format? |
| generate_image ⚠️ | Generate images from a text prompt (experimental) | prompt, model?, aspect_ratio?, n?, prompt_optimizer? |
| generate_video ⚠️ | Generate video from text/image prompt (experimental) | prompt?, model?, first_frame_image?, duration?, resolution? |
| generate_music ⚠️ | Generate music from prompt + lyrics (experimental) | prompt, lyrics?, model?, is_instrumental?, lyrics_optimizer? |
| query_generation_task | Poll async generation task status + download result | task_id |

All tools that take a videoPath accept local file paths, file:// URIs, and remote http(s) URLs. Remote videos are downloaded automatically.
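The three input forms above can be told apart by URL scheme. The following is a minimal sketch of that distinction, not the server's own parsing logic; the function name and the returned labels are hypothetical.

```python
# Illustrative sketch only; the server's actual input handling may differ.
from urllib.parse import urlparse


def classify_video_input(value):
    """Label a videoPath value as remote, file_uri, or local_path."""
    scheme = urlparse(value).scheme.lower()
    if scheme in ("http", "https"):
        return "remote"      # would be downloaded before analysis
    if scheme == "file":
        return "file_uri"
    # Anything else, including Windows drive letters, is treated as a path.
    return "local_path"
```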

Usage Examples

Local files

"Analyze the video at ./demo.mp4 and tell me what happens in it"

"Summarize ./long-video.mp4"

"Extract 5 evenly-spaced frames from ./recording.mp4"

"Extract frames at scene changes from ./video.mp4" (set mode: scene)

"At what timestamp does the person wave in ./clip.mp4?"

"Transcribe ./interview.mp4 with speaker diarization using AssemblyAI"

"Transcribe this Spanish video and translate it to English: ./video.mp4"

"Generate SRT subtitles for ./interview.mp4" (set outputFormat: srt)

"Export a WebVTT subtitle file from ./talk.mp4" (set outputFormat: vtt)

Remote video files (direct URLs — works for all users)

"Analyze this video: https://example.com/product-demo.mp4"

"Get the metadata for https://cdn.example.com/sample.webm"

Platform videos (YouTube, Vimeo, TikTok, Bilibili, etc. — requires Pro for non-YouTube)

"Summarize this YouTube video: https://www.youtube.com/watch?v=dQw4w9WgXcQ"

"At what point does the speaker mention pricing in https://www.youtube.com/watch?v=abc123?"

"Transcribe the audio from https://vimeo.com/123456789"

Redacting sensitive regions

"Redact the API key visible in ./screen-recording.mp4 at region x=100 y=200 width=400 height=30"

"Preview the redaction before applying it: use previewOnly: true with ./recording.mp4"

"Blur the password field in ./demo.mp4 (region x=50 y=300 width=250 height=40) from 5s to 20s"

"Auto-detect and blur sensitive info in ./screen-recording.mp4" (set detectionMode: ai, allowRemoteDetection: true)

API Keys

If you need a MiniMax API key, get it from the MiniMax interface key page. It is used for the media generation tools: text-to-speech, image generation, video generation, music generation, and generation task polling.

Note: MiniMax media generation also requires account balance. Top up or recharge at https://platform.minimax.io/user-center/payment/balance.

Video Providers

Set all keys to get the full fallback chain. The server tries Gemini first, then MiniMax-M3, Kimi, Qwen, and MiMo, with GLM last because its free tier is aggressively rate-limited. With every key set, the server always has another provider to try when one is unavailable or rate-limited:

| Provider | Key | Link |
| --- | --- | --- |
| Gemini 3.5 Flash (default, free-tier) | GEMINI_API_KEY | Get key |
| GLM-4.6V (free-tier) | Z_AI_API_KEY | Get key |
| Qwen3.7 (paid) | DASHSCOPE_API_KEY | Get key |
| Kimi K2.6 (paid) | MOONSHOT_API_KEY | Get key |
| MiMo-V2.5 (paid) | MIMO_API_KEY | Get key |
| MiniMax-M3 (paid) | MINIMAX_API_KEY | Get key |

When a provider's API key is missing or its API call fails at runtime, tools automatically fall back through the remaining providers in priority order, starting from the configured default. GLM is last in the chain — its free tier applies aggressive rate-limit throttling, so all other providers are preferred when their keys are available. Kimi is placed ahead of Qwen because Kimi accepts native file upload up to 100 MB, so large local videos work out-of-the-box without setting up the optional S3 relay (which most users skip):

Gemini default (standard): Gemini → MiniMax-M3 → Kimi → Qwen → MiMo → GLM
MiniMax-M3 default: MiniMax-M3 → Gemini → Kimi → Qwen → MiMo → GLM
Kimi default: Kimi → Gemini → MiniMax-M3 → Qwen → MiMo → GLM
Qwen default: Qwen → Gemini → MiniMax-M3 → Kimi → MiMo → GLM
MiMo default: MiMo → Gemini → MiniMax-M3 → Kimi → Qwen → GLM
GLM default: GLM → Gemini → MiniMax-M3 → Kimi → Qwen → MiMo
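Every chain listed above follows one rule: the configured default goes first, and the remaining providers keep their standard order. The sketch below reproduces that rule. It is an illustration under that reading, not the server's implementation; the function name is hypothetical, and the provider identifiers are the values accepted by VIDEO_MCP_DEFAULT_PROVIDER.

```python
# Illustrative sketch only; the server's actual implementation may differ.
from typing import Optional, Set

STANDARD_ORDER = ["gemini", "m3", "kimi", "qwen", "mimo", "glm"]


def fallback_chain(default: str = "gemini",
                   available: Optional[Set[str]] = None):
    """Default provider first, then the rest in standard order.

    If `available` is given, providers without a configured key are skipped.
    """
    if default not in STANDARD_ORDER:
        raise ValueError(f"unknown provider: {default}")
    chain = [default] + [p for p in STANDARD_ORDER if p != default]
    if available is not None:
        chain = [p for p in chain if p in available]
    return chain
```

For example, fallback_chain("kimi") yields kimi, gemini, m3, qwen, mimo, glm, which matches the Kimi default line above.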

Audio Providers

Similarly, set all audio keys so transcription always has a fallback provider available:

| Provider | Key | Link |
| --- | --- | --- |
| Deepgram (default, $200 free credits) | DEEPGRAM_API_KEY | Get key |
| AssemblyAI ($50 free credits) | ASSEMBLYAI_API_KEY | Get key |
| Groq/Whisper (free-tier) | GROQ_API_KEY | Get key |
| Gemini (free-tier) | GEMINI_API_KEY | Reuses the same key as the video provider |

When an audio key is missing or an audio API call fails at runtime, tools automatically fall back through the remaining providers in priority order, starting from the configured default (e.g. with Deepgram default: Deepgram → AssemblyAI → Groq → Gemini).

Provider Comparison

Video Providers

| Feature | Gemini 3.5 Flash (default) | GLM-4.6V | Qwen3.7 | Kimi K2.6 | MiMo-V2.5 | MiniMax-M3 |
| --- | --- | --- | --- | --- | --- | --- |
| Price | Free tier available | Free tier available (GLM-4.6V-Flash) | $0.50 input / $3.00 output per 1M tokens | $0.60 input / $3.00 output per 1M tokens | $0.40 input / $2.00 output per 1M tokens | $0.30 / $1.20 (effective; permanent 50% off) ≤512K |
| Video formats | mp4, mpeg, mov, avi, flv, mpg, webm, wmv, 3gpp | mp4, avi, mov, wmv, webm, m4v | mp4, avi, mov, wmv, webm, m4v | mp4, mpeg, mov, avi, flv, mpg, webm, wmv, 3gpp | mp4, mov, avi, wmv | mp4, mov, avi, etc. (image frames; ≤100 per request) |
| Context window | 1M tokens | 128K tokens | 1M tokens | 256K tokens | 256K tokens | 1M tokens (built for long-video understanding) |
| Max file size | 2 GB | ~12 MB base64 / frames fallback / unlimited w/ S3↓ | ~10 MB base64 / frames fallback / unlimited w/ S3↓ | 100 MB | ~10 MB base64 / frames fallback / unlimited w/ S3↓ | 100 MB image batch budget / frames fallback |
| Best for | Default — free, no card required | Free, no card required | SOTA agentic coding | Paid — broadest format support | Paid — thinking mode; multimodal | Paid — best for long videos (1M context, 100-frame budget, prompt caching 5× cheaper on follow-up) |

Fallback order and rationale:

Gemini 3.5 Flash (default): free tier with no credit card required, a 1M-token context window, and 2 GB file support.
MiniMax-M3 (second): paid; its 1M-token context window and 100-image-per-request budget make it the best long-video fallback.
Kimi K2.6 (third): paid; native file upload up to 100 MB, so large local videos work without the S3 relay.
Qwen3.7 (fourth): paid at $0.50 input / $3.00 output per 1M tokens, with SOTA agentic coding performance. Local files are capped at about 10 MB base64, so larger files require the S3 relay or frame fallback.
MiMo-V2.5 (fifth): paid; Xiaomi's multimodal model with thinking-mode support ($0.40 input / $2.00 output per 1M tokens).
GLM-4.6V (last resort): free with no card required, but the free tier applies aggressive rate-limit throttling, so all other providers are preferred when their keys are available.

Set VIDEO_MCP_DEFAULT_PROVIDER to gemini, glm, qwen, kimi, mimo, or m3 to change the default provider used when a tool call does not pass provider. If a tool call includes provider, that per-call value takes precedence.
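The precedence described above (per-call argument, then environment variable, then the built-in default) can be sketched as follows. This is an illustration rather than the server's code: the function name is hypothetical, and raising an error on an unrecognized value is an assumption, since this README does not say how the server treats invalid values.

```python
# Illustrative sketch only; error handling here is an assumption.
import os
from typing import Mapping, Optional

VALID_PROVIDERS = ("gemini", "glm", "qwen", "kimi", "mimo", "m3")


def resolve_provider(call_provider: Optional[str] = None,
                     env: Mapping[str, str] = os.environ) -> str:
    """Per-call value wins, then VIDEO_MCP_DEFAULT_PROVIDER, then gemini."""
    choice = call_provider or env.get("VIDEO_MCP_DEFAULT_PROVIDER") or "gemini"
    if choice not in VALID_PROVIDERS:
        raise ValueError(f"unknown provider: {choice!r}")
    return choice
```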

ℹ️ Large local files with GLM, Qwen, or MiMo: All three providers have a 10–12 MB base64 limit for local files. When a file exceeds this limit, the server first tries to fall back to an upload-capable provider (Gemini or Kimi) if one is available in the fallback chain. Frame-based analysis (evenly-spaced keyframes sent as images) is used only as a last resort when no upload-capable provider is available, or when all upload-capable providers fail at runtime — no configuration needed. For the highest quality with large local videos, set up the optional S3 relay (below) — GLM, Qwen, and MiMo will receive a presigned URL to the full video, bypassing the limit entirely and taking priority over both fallbacks.

GLM-4.6V, Qwen3.7, and MiMo-V2.5 all accept direct video URLs, but base64-encoding a local file caps out at 10–12 MB. Above that limit, the server first tries to fall back to an upload-capable provider (Gemini or Kimi) if one is available, then falls back to frame-based analysis as a last resort. For the best results on large local videos, set AWS_S3_BUCKET — the server uploads the full video to S3 and passes a presigned URL to GLM, Qwen, and MiMo, bypassing the base64 limit entirely and taking priority over both fallbacks. No manual upload step needed.
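The routing order described above for large local files can be summarized as a small decision function. This is a sketch under stated assumptions, not the server's implementation: the limits are the approximate figures from the comparison table, the function name and return labels are hypothetical, and runtime failures of an upload-capable provider (which would also lead to the frame fallback) are not modeled. The per-provider upload limits of Gemini and Kimi are ignored as well.

```python
# Illustrative sketch only; limits are approximate figures from the
# comparison table, and the names used here are hypothetical.
BASE64_LIMIT_MB = {"glm": 12, "qwen": 10, "mimo": 10}
UPLOAD_CAPABLE = ("gemini", "kimi")


def route_local_video(provider, size_mb, s3_bucket=None, available=()):
    """Pick how a local file reaches a base64-limited provider."""
    if provider not in BASE64_LIMIT_MB:
        raise ValueError("this sketch only covers glm, qwen, and mimo")
    if size_mb <= BASE64_LIMIT_MB[provider]:
        return ("base64", provider)
    if s3_bucket:
        # The S3 relay takes priority over both fallbacks.
        return ("s3_presigned_url", provider)
    for candidate in UPLOAD_CAPABLE:
        if candidate in available:
            return ("native_upload", candidate)
    # Last resort: evenly spaced keyframes sent as images.
    return ("frames", provider)
```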

Why S3 works

GLM, Qwen, and MiMo require the serving endpoint to provide Content-Length and Content-Type headers alongside the video. S3 returns both headers automatically when a presigned URL is fetched.

Prerequisites

Before setting up the S3 relay, you'll need an AWS account and access credentials.

1. Create an AWS account

Go to aws.amazon.com and click Create an AWS Account.
Enter your email address, a password, and an AWS account name.
Choose the Basic Support — Free plan (sufficient for S3 relay usage).
Fill in your contact and billing information. A valid credit or debit card is required, but S3 usage within the free tier costs nothing.
Verify your identity via phone call or SMS.
Once confirmed, sign in to the AWS Management Console.

New AWS accounts include a 12-month Free Tier with 5 GB of S3 storage and 20,000 GET requests per month — more than enough for typical video analysis workflows.

2. Get your AWS Access Key ID and Secret Access Key

The S3 relay needs programmatic access to your S3 bucket. You'll create an IAM user with limited permissions:

In the AWS Console, search for IAM in the top search bar and open the IAM dashboard.
Click Users in the left sidebar, then Create user.
Enter a user name (e.g., video-mcp-s3) and click Next.
Under Permissions options, select Attach policies directly.

Click Create policy — this opens a new tab:

Switch to the JSON tab and paste the following minimum-permission policy:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:GetObject",
        "s3:DeleteObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::your-globally-unique-bucket-name",
        "arn:aws:s3:::your-globally-unique-bucket-name/*"
      ]
    }
  ]
}

Replace your-globally-unique-bucket-name with your actual globally unique bucket name (you'll create it in the next step).
Click Next, give the policy a name like VideoMcpS3Access, then Create policy.

Go back to the user creation tab, click the refresh icon, search for VideoMcpS3Access, select it, and click Next → Create user.
Open the newly created user, go to the Security credentials tab.
Under Access keys, click Create access key.
On Access key best practices & alternatives, choose Other or the closest equivalent programmatic/local-code option shown in your console, then click Next.
Optionally add a description tag, then click Create access key.
Copy and save the Access Key ID and Secret Access Key — you won't be able to see the secret key again after closing this dialog.

Never commit your Secret Access Key to version control or share it publicly. Only add it to your local MCP configuration or AWS credentials file.

3. (Optional) Install and configure the AWS CLI

The AWS CLI is only needed if you want to create buckets from the terminal or use the ~/.aws/credentials method instead of environment variables. If you plan to add credentials directly to your MCP env block, you can skip this step.

Install the AWS CLI

Windows: Download the installer from aws.amazon.com/cli or run:
```
winget install Amazon.AWSCLI
```

macOS:

curl "https://awscli.amazonaws.com/AWSCLIV2.pkg" -o "AWSCLIV2.pkg"
sudo installer -pkg AWSCLIV2.pkg -target /

Linux:

curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
unzip awscliv2.zip
sudo ./aws/install

Configure your credentials

Run the following command and paste your Access Key ID and Secret Access Key when prompted:

aws configure

You'll be asked for:

AWS Access Key ID — paste the key you saved earlier
AWS Secret Access Key — paste the secret key you saved earlier
Default region name — enter your preferred region (e.g., us-east-1)
Default output format — press Enter for json

This stores your credentials in ~/.aws/credentials and ~/.aws/config, which the MCP server reads automatically.

One-time setup

1. Create an S3 bucket

If you installed the AWS CLI:

aws s3 mb s3://your-globally-unique-bucket-name

Or create it manually in the S3 Console:

Open the S3 dashboard and click Create bucket.
Enter a globally unique bucket name (e.g., your-globally-unique-bucket-name).
Choose the AWS Region you want to use. This should match AWS_REGION in your MCP config or AWS CLI profile.
Leave Block all public access enabled. The bucket does not need to be public — the server uses presigned URLs.
Keep the default Object Ownership setting (ACLs disabled / Bucket owner enforced).
Leave the remaining settings at their defaults, then click Create bucket.

You do not need to add a bucket policy or make objects public. A private bucket works fine because the MCP server generates time-limited presigned URLs for each uploaded video.

2. Add `AWS_S3_BUCKET` to your MCP config

VS Code (.vscode/mcp.json)

{
  "servers": {
    "videoMcp": {
      "type": "stdio",
      "command": "video-context-mcp",
      "env": {
        "AWS_S3_BUCKET": "your-globally-unique-bucket-name",
        "GEMINI_API_KEY": "your-gemini-key"
      }
    }
  }
}

AWS Credential Resolution

The server resolves AWS credentials in this order — you only need to configure one:

1. Environment variables — add directly to your MCP env block (no AWS CLI needed):

"AWS_S3_BUCKET": "your-globally-unique-bucket-name",
"AWS_ACCESS_KEY_ID": "AKIA...",
"AWS_SECRET_ACCESS_KEY": "your-secret-key",
"AWS_REGION": "us-east-1"

2. ~/.aws/credentials — if the AWS CLI is configured, credentials are picked up automatically. Only AWS_S3_BUCKET is needed in your MCP config:
```
"AWS_S3_BUCKET": "your-globally-unique-bucket-name"
```
3. IAM instance role / ECS task role — for AWS-hosted environments.

How it works at runtime

Every time you analyze a local video (or a platform download like YouTube) with GLM, Qwen, or MiMo:

The server detects the file is too large for base64 encoding.
The file is uploaded to s3://your-globally-unique-bucket-name/video-mcp-relay/<uuid>.<ext>.
A presigned URL (valid for 1 hour) is passed to the AI provider.
The provider downloads the video directly from S3.
The object is kept in the bucket for reuse within the same session.
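The upload and presign steps above can be reproduced by hand with boto3. The sketch below is an illustration, not the server's code: it assumes boto3 is installed and that AWS credentials are resolvable, and the function names are hypothetical. Only the key layout (video-mcp-relay/<uuid>.<ext>) and the one-hour expiry come from this README.

```python
# Illustrative sketch only; assumes boto3 is installed and AWS credentials
# are resolvable. This is not the code the MCP server runs.
import uuid
from pathlib import Path

RELAY_PREFIX = "video-mcp-relay"


def relay_key(video_path):
    """Build an object key of the form video-mcp-relay/<uuid>.<ext>."""
    return f"{RELAY_PREFIX}/{uuid.uuid4()}{Path(video_path).suffix}"


def relay_to_s3(video_path, bucket, expires_in=3600):
    """Upload the file and return a presigned GET URL (default: 1 hour)."""
    import boto3  # imported lazily so relay_key works without boto3

    s3 = boto3.client("s3")
    key = relay_key(video_path)
    s3.upload_file(video_path, bucket, key)
    return s3.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_in,
    )
```

The returned URL can be passed to any tool in place of a local path, in the same way as a manually created presigned URL.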

Cleanup: Relayed S3 objects are deleted automatically when the MCP server session ends. Orphaned objects from crashed sessions are swept at next startup.

To keep objects in the bucket for reuse across sessions (useful for large files you analyze repeatedly):

"AWS_S3_RELAY_CLEANUP": "false"

Cost: AWS S3 free tier covers 5 GB storage + 20K GET requests/month for 12 months. After the free tier, storage costs roughly $0.023/GB/month — negligible for most use cases.

Manual presigned URLs (alternative)

You can also pass a presigned URL directly to any tool without configuring the relay:

aws s3 cp my-video.mp4 s3://your-globally-unique-bucket-name/my-video.mp4
aws s3 presign s3://your-globally-unique-bucket-name/my-video.mp4 --expires-in 3600
# → https://your-globally-unique-bucket-name.s3.amazonaws.com/my-video.mp4?X-Amz-...

Then pass the URL directly to analyze_video, summarize_video, or any other tool.

Audio Providers

| Feature | Deepgram (default) | AssemblyAI | Groq/Whisper | Gemini |
| --- | --- | --- | --- | --- |
| Price | Paid ($200 free credits) | Paid ($50 free credits) | Free tier available | Free tier available |
| Speaker diarization | Yes | Yes | No | No |
| English Translation | No | No | Yes | Yes |
| Best for | Default (fast, accurate) | High-quality diarization | Free/cost-conscious use | Users already using Gemini |

Note: "Translation" here means speech-to-English conversion (output English regardless of the spoken language). It is distinct from multilingual transcription — Deepgram and AssemblyAI both support transcribing dozens of languages natively, but they always output in the source language. Use Groq or Gemini when you need an English transcript of non-English audio.
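The capability rows in the table above can be used to narrow the provider list for a given request. The sketch below is illustrative only; the capability flags mirror the table, the function name is hypothetical, and how the server itself handles a request that no provider can satisfy is not documented here.

```python
# Illustrative sketch only; capability flags mirror the comparison table.
AUDIO_ORDER = ["deepgram", "assemblyai", "groq", "gemini"]
CAPABILITIES = {
    "deepgram": {"diarize"},
    "assemblyai": {"diarize"},
    "groq": {"translate"},
    "gemini": {"translate"},
}


def audio_candidates(diarize=False, translate=False):
    """List providers, in default fallback order, that support the request."""
    needed = set()
    if diarize:
        needed.add("diarize")
    if translate:
        needed.add("translate")
    return [p for p in AUDIO_ORDER if needed <= CAPABILITIES[p]]
```

According to the table, no single provider offers both diarization and English translation, so a request for both yields an empty list in this sketch.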

Set AUDIO_MCP_DEFAULT_PROVIDER to deepgram, assemblyai, groq, or gemini to change the default audio provider.

Environment Variables

Core

| Variable | Description | Default |
| --- | --- | --- |
| Z_AI_API_KEY | Z.AI API key for GLM-4.6V | — |
| DASHSCOPE_API_KEY | Alibaba Cloud API key for Qwen3.7 | — |
| MOONSHOT_API_KEY | Moonshot AI API key for Kimi K2.6 | — |
| GEMINI_API_KEY | Google API key for Gemini | — |
| MIMO_API_KEY | Xiaomi MiMo API key for MiMo-V2.5 | — |
| VIDEO_MCP_DEFAULT_PROVIDER | Default video provider used when a tool call does not pass provider; a per-call provider argument can override it | gemini |

S3 Relay

| Variable | Description | Default |
| --- | --- | --- |
| AWS_S3_BUCKET | S3 bucket name for automatic video relay. When set, local videos (and platform downloads) are uploaded to S3 and GLM/Qwen/MiMo receive a presigned URL, bypassing the 10–12 MB base64 limit. | — |
| AWS_ACCESS_KEY_ID | AWS access key ID. Required if you are not using ~/.aws/credentials or an IAM role. | — |
| AWS_SECRET_ACCESS_KEY | AWS secret access key. Required alongside AWS_ACCESS_KEY_ID. | — |
| AWS_REGION | AWS region for the S3 bucket (e.g. us-east-1). Required if not set in ~/.aws/config. | — |
| AWS_S3_RELAY_CLEANUP | Set to false to keep relayed S3 objects in the bucket for reuse across sessions. Default: relayed objects are deleted when the MCP server session ends and orphaned objects are swept at startup. | true |

Audio

| Variable | Description | Default |
| --- | --- | --- |
| DEEPGRAM_API_KEY | Deepgram API key | — |
| ASSEMBLYAI_API_KEY | AssemblyAI API key | — |
| GROQ_API_KEY | Groq API key for Whisper transcription | — |
| AUDIO_MCP_DEFAULT_PROVIDER | Default audio provider: deepgram, assemblyai, groq, or gemini. Falls back in that order when the selected key is missing. | deepgram |
| AUDIO_ENHANCE_VIDEO_ANALYSIS | Inject audio transcripts into analyze_video/summarize_video prompts (GLM/Kimi/Qwen/MiMo/M3). auto = transcribe when audio track detected; true = always; false = disabled. | auto |

Media Generation (MiniMax)

MiniMax media generation requires account balance. Top up or recharge at https://platform.minimax.io/user-center/payment/balance.

| Variable | Description | Default |
| --- | --- | --- |
| MINIMAX_API_KEY | MiniMax API key for all generation tools | — (required for generation) |
| MINIMAX_BASE_URL | Override MiniMax API base URL | https://api.minimax.io |
| MINIMAX_REQUEST_TIMEOUT_MS | Client-side timeout for MiniMax API requests | 240000 (4 min) |

Caching

Downloaded videos, extracted frames, audio tracks, and transcripts are cached together in a persistent per-video directory. Subsequent tool calls that reference the same video reuse cached artifacts instead of re-running ffmpeg, re-downloading, or re-calling audio provider APIs.

| Variable | Description | Default |
| --- | --- | --- |
| VIDEO_MCP_CACHE_TTL_MINUTES | How long video artifacts (video file, frames, audio, transcripts) are cached across tool calls (minutes). Set 0 to disable. | 43200 (30 days) |
| VIDEO_MCP_CACHE_MAX_ENTRIES | Max entries in the artifact cache. LRU eviction at the video level. Set 0 for unbounded. | 100 |
| VIDEO_MCP_CACHE_MAX_MB | Max total disk size for the cache (megabytes). LRU eviction when the limit is reached. Set 0 to disable. | 5120 (5 GB) |

Cache Storage Location

The cache is stored in a video-mcp-cache folder within your system's temporary directory:

Windows: %TEMP%\video-mcp-cache
(e.g., C:\Users\<you>\AppData\Local\Temp\video-mcp-cache)
macOS: $TMPDIR/video-mcp-cache
(e.g., /var/folders/xx/yyyy/T/video-mcp-cache)
Linux: /tmp/video-mcp-cache

Automatic Cleanup

The server automatically manages the cache by:

Startup Sweep: Removing any cache entries older than the TTL at server startup.
LRU Eviction: Evicting the least-recently-used video entry (all its artifacts together) when the VIDEO_MCP_CACHE_MAX_ENTRIES limit is reached.
Size Eviction: Evicting the least-recently-used video entry when the total cache size exceeds VIDEO_MCP_CACHE_MAX_MB (default 5 GB).
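
The two eviction rules above can be sketched as a single pass over the cache index. This is an illustration under stated assumptions, not the server's code: the `CacheEntry` shape and the `evictLru` helper are hypothetical, and a limit of 0 is treated as "no limit" for both settings.

```typescript
interface CacheEntry {
  videoId: string;    // one entry per video (all of its artifacts together)
  sizeMb: number;     // combined size of the entry's artifacts
  lastUsedMs: number; // last access time, epoch milliseconds
}

// Hypothetical helper: returns the entries that survive eviction,
// most recently used first. A limit of 0 is treated as "no limit".
function evictLru(entries: CacheEntry[], maxEntries: number, maxMb: number): CacheEntry[] {
  // Sort a copy so the least recently used entry sits at the end.
  const kept = entries.slice().sort((a, b) => b.lastUsedMs - a.lastUsedMs);
  let totalMb = 0;
  for (const entry of kept) totalMb += entry.sizeMb;

  while (kept.length > 0) {
    const overCount = maxEntries > 0 && kept.length > maxEntries;
    const overSize = maxMb > 0 && totalMb > maxMb;
    if (!overCount && !overSize) break;
    const evicted = kept.pop() as CacheEntry; // least recently used
    totalMb -= evicted.sizeMb;
  }
  return kept;
}

// 7000 MB of artifacts against the default 5120 MB limit: the oldest entry goes.
const survivors = evictLru(
  [
    { videoId: "a", sizeMb: 3000, lastUsedMs: 1 },
    { videoId: "b", sizeMb: 3000, lastUsedMs: 2 },
    { videoId: "c", sizeMb: 1000, lastUsedMs: 3 },
  ],
  100,
  5120,
);
```

Evicting whole video entries, rather than individual files, matches the description above that a video's artifacts are removed together.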

Cache Management CLI

video-context-mcp (or the short alias vmcp) ships with built-in cache management commands. Run them from any terminal — they complete immediately and do not start the MCP server.

# Show cache location, size, entry count, TTL, and per-entry breakdown
vmcp cache status

# Print the cache directory path (useful for scripting)
vmcp cache path
cd "$(vmcp cache path)"

# Copy the entire cache tree to ./video-mcp-cache (or a custom path)
vmcp cache copy

# macOS / Linux — absolute path
vmcp cache copy --dest=/Users/alice/projects/cache-backup
vmcp cache copy --dest=/tmp/cache-backup

# Windows — absolute path (both \ and / are supported)
vmcp cache copy --dest=C:\Users\alice\projects\cache-backup
vmcp cache copy --dest=C:/Users/alice/projects/cache-backup

# Relative path (resolved from CWD — works on all platforms)
vmcp cache copy --dest=./backup
vmcp cache copy --dest=../sibling/cache

# Paths with spaces — wrap the value in quotes
vmcp cache copy --dest="/Users/alice/My Projects/cache-backup"
vmcp cache copy --dest="C:\Users\alice\My Documents\cache-backup"

# Remove only entries older than the TTL (defaults to 30 days if not set)
vmcp cache clear:expired

# If you've set a custom TTL in mcp.json, pass it explicitly — CLI commands
# run as a separate process and don't inherit env vars from mcp.json
VIDEO_MCP_CACHE_TTL_MINUTES=10 vmcp cache clear:expired

# Delete ALL cache entries — prompts for confirmation
vmcp cache clear:all
vmcp cache clear:all --yes   # skip confirmation
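
Because the CLI reads its own process environment, `vmcp cache clear:expired` run without VIDEO_MCP_CACHE_TTL_MINUTES uses the 30-day default even if mcp.json sets something shorter. The sketch below shows that arithmetic. The helper names are hypothetical, the choice of which timestamp an entry is aged from is an assumption, and the special meaning of 0 is not modelled.

```typescript
const DEFAULT_TTL_MINUTES = 43200; // 30 days * 24 hours * 60 minutes

// Hypothetical helper: read the TTL from the given environment,
// falling back to the documented default when it is unset or invalid.
function ttlMinutes(env: Record<string, string | undefined>): number {
  const raw = env["VIDEO_MCP_CACHE_TTL_MINUTES"];
  const parsed = raw === undefined ? NaN : Number(raw);
  return isNaN(parsed) || parsed < 0 ? DEFAULT_TTL_MINUTES : parsed;
}

// An entry is expired once its age exceeds the TTL.
// timestampMs is whichever timestamp the cache tracks (an assumption here).
function isExpired(timestampMs: number, nowMs: number, ttl: number): boolean {
  return nowMs - timestampMs > ttl * 60 * 1000;
}

// A 20-minute-old entry: kept under the default TTL, removed under a 10-minute TTL.
const twentyMinutes = 20 * 60 * 1000;
const keptByDefault = !isExpired(0, twentyMinutes, ttlMinutes({}));
const removedWithOverride = isExpired(
  0,
  twentyMinutes,
  ttlMinutes({ VIDEO_MCP_CACHE_TTL_MINUTES: "10" }),
);
```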

Video Summarization

| Variable | Description | Default |
| --- | --- | --- |
| VIDEO_MCP_MAX_FRAMES | Max keyframes for summarize_video (GLM/Qwen/MiMo always; Kimi only when video > 100 MB; Gemini always uploads full video). Free: clamped 5–50. Pro: default 100; set 0 for uncapped. Either way, trailing frames are automatically dropped if the total payload exceeds the provider's size limit (12 MB for GLM, 10 MB for Qwen/MiMo, 80 MB for Kimi's frame fallback). | 50 free / 100 pro |
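
The frame cap and the payload trimming described above combine two separate steps, which the following sketch makes explicit. Both helpers (`resolveMaxFrames`, `trimToPayload`) are hypothetical names, and the handling of invalid or negative values is an assumption rather than documented behaviour.

```typescript
// Step 1: resolve the frame cap from VIDEO_MCP_MAX_FRAMES.
// Returns Infinity for "uncapped" (Pro with a value of 0).
function resolveMaxFrames(raw: string | undefined, isPro: boolean): number {
  const parsed = raw === undefined ? NaN : Number(raw);
  if (!isPro) {
    // Free tier: default 50, always clamped to the 5..50 range.
    const requested = isNaN(parsed) ? 50 : parsed;
    return Math.min(50, Math.max(5, Math.floor(requested)));
  }
  if (isNaN(parsed) || parsed < 0) return 100; // Pro default (assumed for invalid input)
  if (parsed === 0) return Infinity;           // Pro: 0 means uncapped
  return Math.floor(parsed);
}

// Step 2: keep leading frames until the provider's payload limit would be
// exceeded, then drop the trailing ones. Sizes and limit share one unit.
function trimToPayload(frameSizes: number[], limit: number): number[] {
  const kept: number[] = [];
  let total = 0;
  for (const size of frameSizes) {
    if (total + size > limit) break;
    total += size;
    kept.push(size);
  }
  return kept;
}

// Four 4 MB frames against GLM's 12 MB limit: the last frame is dropped.
const glmFrames = trimToPayload([4, 4, 4, 4], 12);
```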

Qwen

| Variable | Description | Default |
| --- | --- | --- |
| QWEN_BASE_URL | Override the DashScope API endpoint. Use for regional routing: Singapore (default), Virginia (https://dashscope-us.aliyuncs.com/compatible-mode/v1/chat/completions), or Beijing (https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions). | https://dashscope-intl.aliyuncs.com/compatible-mode/v1/chat/completions |
| QWEN_REQUEST_TIMEOUT_MS | Client-side timeout (ms) for Qwen API requests. DashScope applies a server-side limit of ~5 minutes, so values above ~285000 are unlikely to help. Increase to e.g. 270000 for extra headroom on large frame payloads. | 240000 (4 min) |

GLM

| Variable | Description | Default |
| --- | --- | --- |
| GLM_REQUEST_TIMEOUT_MS | Client-side timeout (ms) for GLM API requests. Z.AI has no hard server-side streaming timeout, so raising this value (e.g. 480000 for 8 minutes) can help when analysing long videos with many frames. | 240000 (4 min) |

MiMo

| Variable | Description | Default |
| --- | --- | --- |
| MIMO_REQUEST_TIMEOUT_MS | Client-side timeout (ms) for MiMo API requests. | 240000 (4 min) |

M3 (MiniMax-M3)

| Variable | Description | Default |
| --- | --- | --- |
| M3_REQUEST_TIMEOUT_MS | Client-side timeout (ms) for MiniMax-M3 API requests. MiniMax-M3 is image-only (100 images per request, 10 MB per image). | 240000 (4 min) |
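
The Qwen, GLM, MiMo, and M3 timeout variables all share the same 240000 ms default, so reading them can be sketched with one lookup. The helper name `requestTimeoutMs` is hypothetical, and falling back to the default for non-numeric or non-positive values is an assumption.

```typescript
type VideoProvider = "qwen" | "glm" | "mimo" | "m3";

// Timeout variable for each provider (names taken from the tables above).
const TIMEOUT_VAR: Record<VideoProvider, string> = {
  qwen: "QWEN_REQUEST_TIMEOUT_MS",
  glm: "GLM_REQUEST_TIMEOUT_MS",
  mimo: "MIMO_REQUEST_TIMEOUT_MS",
  m3: "M3_REQUEST_TIMEOUT_MS",
};

const DEFAULT_TIMEOUT_MS = 240000; // 4 minutes

// Hypothetical helper: resolve the client-side timeout for a provider.
function requestTimeoutMs(
  provider: VideoProvider,
  env: Record<string, string | undefined>,
): number {
  const raw = env[TIMEOUT_VAR[provider]];
  const parsed = raw === undefined ? NaN : Number(raw);
  // Missing, non-numeric, or non-positive values fall back to the default.
  return isNaN(parsed) || parsed <= 0 ? DEFAULT_TIMEOUT_MS : parsed;
}

// Raising the GLM timeout to 8 minutes for long videos, as suggested above.
const glmTimeout = requestTimeoutMs("glm", { GLM_REQUEST_TIMEOUT_MS: "480000" });
```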

yt-dlp (Platform Downloads)

| Variable | Description | Default |
| --- | --- | --- |
| YT_DLP_MAX_RESOLUTION | Max video height (px) for yt-dlp downloads. Free users capped at 720. | 720 |
| YT_DLP_PATH | Path to a custom yt-dlp binary. If unset, the server auto-detects common installs first and then uses the bundled binary from youtube-dl-exec if it is present. | — |
| YT_DLP_COOKIES_FILE | Path to a Netscape-format cookies file for authenticated downloads (age-restricted, private videos). | — |
| YT_DLP_IMPERSONATE | Browser to impersonate (e.g. chrome, firefox). Needed for sites like Udemy that block non-browser user-agents; requires a yt-dlp build with curl_cffi support. | — |

Pro

Complete Example Config

Warning: This file contains sensitive API keys. Never commit it to source control. Make sure .vscode/mcp.json is in your .gitignore.

{
  "servers": {
    "videoMcp": {
      "type": "stdio",
      "command": "video-context-mcp",
      "env": {
        "GEMINI_API_KEY": "your-gemini-key",
        "Z_AI_API_KEY": "your-zai-key",
        "DASHSCOPE_API_KEY": "your-dashscope-key",
        "MOONSHOT_API_KEY": "your-moonshot-key",
        "MIMO_API_KEY": "your-mimo-key",
        "DEEPGRAM_API_KEY": "your-deepgram-key",
        "ASSEMBLYAI_API_KEY": "your-assemblyai-key",
        "GROQ_API_KEY": "your-groq-key",
        "VIDEO_MCP_DEFAULT_PROVIDER": "gemini",
        "AUDIO_MCP_DEFAULT_PROVIDER": "deepgram",
        "AUDIO_ENHANCE_VIDEO_ANALYSIS": "auto",
        "VIDEO_MCP_CACHE_TTL_MINUTES": "43200",
        "VIDEO_MCP_CACHE_MAX_ENTRIES": "100",
        "VIDEO_MCP_CACHE_MAX_MB": "5120",
        "VIDEO_MCP_MAX_FRAMES": "50",
        "YT_DLP_MAX_RESOLUTION": "720",
        "YT_DLP_PATH": "/usr/local/bin/yt-dlp",
        "YT_DLP_COOKIES_FILE": "/path/to/cookies.txt",
        "YT_DLP_IMPERSONATE": "chrome",
        "AWS_S3_BUCKET": "your-globally-unique-bucket-name",
        "AWS_ACCESS_KEY_ID": "AKIA...",
        "AWS_SECRET_ACCESS_KEY": "your-secret-key",
        "AWS_REGION": "us-east-1",
        "AWS_S3_RELAY_CLEANUP": "true",
        "QWEN_BASE_URL": "https://dashscope-intl.aliyuncs.com/compatible-mode/v1/chat/completions",
        "QWEN_REQUEST_TIMEOUT_MS": "240000",
        "GLM_REQUEST_TIMEOUT_MS": "240000",
        "MIMO_REQUEST_TIMEOUT_MS": "240000",
        "M3_REQUEST_TIMEOUT_MS": "240000",
        "VIDEO_MCP_LICENSE_KEY": "your-license-key"
      }
    }
  }
}

Remove or omit any variables you don't need — unset optional vars fall back to their defaults.

The server first checks YT_DLP_PATH, then common system install locations, and then the bundled yt-dlp binary shipped by youtube-dl-exec.
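
The lookup order above can be sketched as a short chain of checks. The helper `resolveYtDlp`, the candidate paths, and the decision to fall through when YT_DLP_PATH points at a missing file are all assumptions for illustration. The `exists` check is passed in so the sketch does not depend on the filesystem.

```typescript
// Hypothetical helper: returns the first usable yt-dlp binary, or undefined.
function resolveYtDlp(
  configuredPath: string | undefined, // value of YT_DLP_PATH
  systemCandidates: string[],         // common install locations (assumed list)
  bundledPath: string,                // binary shipped by youtube-dl-exec
  exists: (path: string) => boolean,
): string | undefined {
  // 1. Explicit override.
  if (configuredPath && exists(configuredPath)) return configuredPath;
  // 2. Common system installs, in the order given.
  for (const candidate of systemCandidates) {
    if (exists(candidate)) return candidate;
  }
  // 3. Bundled binary, if it was downloaded at install time.
  return exists(bundledPath) ? bundledPath : undefined;
}

// Example with a fake filesystem: no override, Homebrew install present.
const present = ["/opt/homebrew/bin/yt-dlp", "/bundled/yt-dlp"];
const fakeExists = (path: string): boolean => present.indexOf(path) !== -1;
const resolved = resolveYtDlp(
  undefined,
  ["/usr/local/bin/yt-dlp", "/opt/homebrew/bin/yt-dlp"],
  "/bundled/yt-dlp",
  fakeExists,
);
```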

If you use platform URLs often, install yt-dlp directly:

brew install yt-dlp          # macOS
winget install yt-dlp        # Windows
pip install yt-dlp           # cross-platform

Then set YT_DLP_PATH to the installed binary. If you installed video-context-mcp-server globally and the bundled binary is missing, repair it with:

npm rebuild youtube-dl-exec

To skip the bundled download at install time:

YOUTUBE_DL_SKIP_DOWNLOAD=true npm install -g video-context-mcp-server

Some videos require authentication. Export cookies from your browser and point yt-dlp at them.

Step 1 — Export cookies. Install Get cookies.txt LOCALLY (Chrome/Edge) or cookies.txt (Firefox). Navigate to the platform while logged in, export in Netscape format, and save the file (e.g. ~/cookies-youtube.txt).

Step 2 — Configure. Add YT_DLP_COOKIES_FILE to your MCP config env block:

{
  "env": {
    "YT_DLP_COOKIES_FILE": "C:/Users/you/cookies-youtube.txt"
  }
}

Restart the MCP server. Keep your cookies file private — never commit it to source control.

Pro

The free tier covers all core functionality. A pro license unlocks five power features:

| Feature | Free | Pro |
| --- | --- | --- |
| 🖼️ Extended frame extraction (summarize_video) | Capped at 50 frames | Default 100; set 0 for uncapped |
| 🌐 Platform video downloads | YouTube only | Almost all video platforms |
| 📺 Download resolution | Capped at 720p | Uncapped |
| 🗣️ Media generation (TTS, image, video, music) ⚠️ | Not available | Full access (experimental) |
| 🔒 Sensitive video redaction ⚠️ | Not available | Full access (experimental) |

Local files and direct http(s):// video URLs work for all users — the platform gate only applies to yt-dlp URLs (YouTube, Vimeo, TikTok, Bilibili, etc.).

When a free-tier limit is reached, the tool falls back gracefully with a notice — it never hard-blocks.
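
As an illustration of "fall back with a notice" for the download resolution limit, the sketch below returns a capped value together with an optional message instead of raising an error. The `capResolution` helper, the result shape, and the notice text are hypothetical. Only the 720p cap comes from the table above.

```typescript
interface CappedResolution {
  height: number;  // height (px) that will actually be requested
  notice?: string; // present only when a free-tier cap was applied
}

const FREE_MAX_HEIGHT = 720;

// Hypothetical helper: apply the free-tier cap without failing the request.
function capResolution(requestedHeight: number, isPro: boolean): CappedResolution {
  if (isPro || requestedHeight <= FREE_MAX_HEIGHT) {
    return { height: requestedHeight };
  }
  return {
    height: FREE_MAX_HEIGHT,
    notice: "Free tier: download resolution capped at 720p.", // illustrative wording
  };
}

// A free user asking for 1080p still gets a download, at 720p, plus a notice.
const freeResult = capResolution(1080, false);
```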

Getting a License

License keys are valid for 1 year.

🚀 LAUNCH PROMO

For a limited time, a Pro license costs $10.

Purchase a Pro License on Gumroad

Add VIDEO_MCP_LICENSE_KEY to your MCP config env block and restart the server:

{
  "env": {
    "VIDEO_MCP_LICENSE_KEY": "<your-license-key>"
  }
}

Troubleshooting

video-context-mcp: command not found (or vmcp: command not found) — Make sure Node.js is installed (node -v). Re-run npm install -g video-context-mcp-server, or use npx -y video-context-mcp-server@latest if global binaries aren't on PATH.
MCP server not appearing — Restart the client app after config changes. Validate JSON syntax. For Claude Code, verify with claude mcp list.
Missing API key errors — Set only the keys for providers you use. When a key is missing, tools automatically fall back to the next available provider and include a notice in the response.

Funding

If you find this project helpful, please consider supporting its development:

Solana (SOL)

CWZccD3Ny3XotFZtnkcyzP3hapmu3ExknN1PF4rEvP3u

Development

This section is for contributors building from source. If you're just using the server, follow the Quick Start instead.

From Source

git clone <repo-url> && cd video-mcp
npm install
npm run build

Use this .vscode/mcp.json to run the local build (never commit this file):

{
  "servers": {
    "videoMcp": {
      "type": "stdio",
      "command": "node",
      "args": ["${workspaceFolder}/dist/index.js"],
      "env": {
        "GEMINI_API_KEY": "your-gemini-key",
        "Z_AI_API_KEY": "your-zai-key",
        "DASHSCOPE_API_KEY": "your-dashscope-key",
        "MOONSHOT_API_KEY": "your-moonshot-key",
        "MIMO_API_KEY": "your-mimo-key",
        "VIDEO_MCP_DEFAULT_PROVIDER": "gemini",
        "DEEPGRAM_API_KEY": "your-deepgram-key",
        "ASSEMBLYAI_API_KEY": "your-assemblyai-key",
        "GROQ_API_KEY": "your-groq-key",
        "MINIMAX_API_KEY": "your-minimax-key",
        "AUDIO_MCP_DEFAULT_PROVIDER": "deepgram",
        "AUDIO_ENHANCE_VIDEO_ANALYSIS": "auto",
        "VIDEO_MCP_CACHE_TTL_MINUTES": "43200",
        "VIDEO_MCP_CACHE_MAX_ENTRIES": "100",
        "VIDEO_MCP_CACHE_MAX_MB": "5120",
        "VIDEO_MCP_MAX_FRAMES": "0",
        "YT_DLP_MAX_RESOLUTION": "720",
        "YT_DLP_PATH": "/usr/local/bin/yt-dlp",
        "YT_DLP_COOKIES_FILE": "/path/to/cookies.txt",
        "YT_DLP_IMPERSONATE": "chrome",
        "VIDEO_MCP_LICENSE_KEY": "your-license-key"
      }
    }
  }
}

Debugging in VS Code

Add a dev block to .vscode/mcp.json for auto-restart and debug integration:

"dev": {
  "watch": "src/**/*.ts",
  "debug": { "type": "node" }
}

This causes verbose MCP protocol logs in Output → MCP: videoMcp — that's expected. Remove dev.debug or the full dev block to reduce noise.

License

Credits

MCP SDK by Anthropic
Kimi K2.6 by Moonshot AI
GLM-4.6V by Z.AI
Qwen3.7 by Alibaba Cloud
MiMo-V2.5 by Xiaomi
Deepgram for audio transcription
AssemblyAI for audio transcription
Groq/Whisper for audio transcription
Gemini for audio transcription and as a fallback provider
ffmpeg for video processing
