video-understand

v0.1.0

CLI tool that enables AI agents to understand and analyze videos.

video-understand analyze video.mp4 "What happens in this video?"

Works with local files, YouTube URLs, and HTTP video URLs. Outputs clean markdown. Designed to be invoked by AI agents (Claude Code, Cursor, Copilot, OpenClaw, etc.) via Bash.

Supports multiple AI providers: Gemini (Google) and Kimi (Moonshot AI).

Install

npm install -g video-understand

Requires Node.js 18+.
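
A wrapper script can check the Node.js requirement before installing. The sketch below is a minimal POSIX shell check; `meets_node_requirement` is a hypothetical helper name and is not part of this package.

```shell
# Hypothetical helper: succeeds when a Node.js version string is 18 or newer.
meets_node_requirement() {
  version="${1#v}"        # strip the leading "v" from e.g. "v18.19.0"
  major="${version%%.*}"  # keep only the major component
  case "$major" in
    ''|*[!0-9]*) return 1 ;;  # empty or non-numeric: treat as unsupported
  esac
  [ "$major" -ge 18 ]
}

# Usage (requires node on PATH):
#   meets_node_requirement "$(node --version)" || echo "Node.js 18+ is required" >&2
```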

Authentication

Gemini (default)

Get a Gemini API key from Google AI Studio.

# Option A: Environment variable
export GEMINI_API_KEY="your-key"

# Option B: CLI login
video-understand login --provider gemini --key "your-key"

Kimi (Moonshot AI)

Get a Moonshot API key from platform.moonshot.ai.

# Option A: Environment variable
export MOONSHOT_API_KEY="your-key"

# Option B: CLI login
video-understand login --provider kimi --key "your-key"
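
When scripting around the CLI, it can help to pick a provider based on which key is exported. The sketch below is a hypothetical wrapper helper, not the CLI's own logic. It only inspects the two environment variables named above, so it cannot see a key stored with `video-understand login`.

```shell
# Hypothetical wrapper helper: print the provider whose API key is exported.
# Gemini is checked first because this README lists it as the default.
detect_provider() {
  if [ -n "${GEMINI_API_KEY:-}" ]; then
    echo "gemini"
  elif [ -n "${MOONSHOT_API_KEY:-}" ]; then
    echo "kimi"
  else
    echo "no API key found in the environment" >&2
    return 1
  fi
}

# Usage:
#   provider="$(detect_provider)" &&
#     video-understand analyze video.mp4 "Describe what happens" --provider "$provider"
```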

Usage

Analyze a video

# Local file (Gemini or Kimi)
video-understand analyze video.mp4 "Describe what happens"
video-understand analyze video.mp4 "Describe what happens" --provider kimi

# YouTube URL (Gemini: no download needed; Kimi: downloads via yt-dlp then uploads)
video-understand analyze "https://www.youtube.com/watch?v=VIDEO_ID" "Summarize this"
video-understand analyze "https://www.youtube.com/watch?v=VIDEO_ID" "Summarize this" --provider kimi

# With timestamps
video-understand analyze video.mp4 "Key moments?" --timestamps

# JSON output
video-understand analyze video.mp4 "Describe" --json

# Save to file
video-understand analyze video.mp4 "Describe" -o analysis.md

# Use a specific model
video-understand analyze video.mp4 "Describe" --model gemini-3-pro-preview
video-understand analyze video.mp4 "Describe" --provider kimi --model kimi-k2.5

Upload + Ask (multi-turn)

# Upload first
video-understand upload video.mp4
video-understand upload video.mp4 --provider kimi

# Ask follow-up questions without re-uploading
video-understand ask "video.mp4" "What color is the car?"
video-understand ask "video.mp4" "How many people appear?" --provider kimi

File management

video-understand list                             # List uploaded files (default provider)
video-understand list --provider kimi             # List Kimi files
video-understand delete "video.mp4"               # Delete by name
video-understand delete "f8csbxsqrz9111fuxjki" --provider kimi  # Delete by file ID

Configuration

video-understand config        # Show current config
video-understand login --provider gemini --key "key"
video-understand login --provider kimi --key "key"

Agent Skill

This package includes an Agent Skill that teaches AI coding agents when and how to use the CLI.

Install the skill

From GitHub (works with 25+ agents):

npx skills add sifr42/video-understand

Manual (Claude Code):

cp -r skills/video-understand ~/.claude/skills/video-understand

The skill provides:

  • Command reference and examples
  • Installation and auth instructions
  • Guidance on when to use each command

Supported Formats

MP4, MPEG, MOV, AVI, FLV, MPG, WebM, WMV, 3GPP, MKV
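
A script can reject unsupported files before spending an upload. The sketch below checks the file extension only. The helper name and the extension spellings (for example `.3gp` and `.3gpp` for 3GPP) are assumptions, and the CLI itself may detect formats differently.

```shell
# Hypothetical pre-flight check: succeed if the file extension matches one of
# the formats listed above. The extension spellings are an assumption.
is_supported_format() {
  case "$1" in
    *.*) ;;            # has an extension, keep going
    *) return 1 ;;     # no extension at all
  esac
  # Lowercase the extension so "CLIP.MOV" and "clip.mov" are treated alike.
  ext="$(printf '%s' "${1##*.}" | tr '[:upper:]' '[:lower:]')"
  case "$ext" in
    mp4|mpeg|mov|avi|flv|mpg|webm|wmv|3gp|3gpp|mkv) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage:
#   is_supported_format "$file" || echo "unsupported format: $file" >&2
```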

Providers & Models

| Provider | Model | Default | Use case |
|----------|-------|---------|----------|
| gemini | gemini-3-flash-preview | ✓ | Fast, cost-effective |
| gemini | gemini-3-pro-preview | | Detailed, nuanced analysis |
| kimi | kimi-k2.5 | ✓ | Same as gemini models overall but requires yt-dlp for YouTube videos. Install: winget install yt-dlp (Windows), brew install yt-dlp (macOS), sudo apt install yt-dlp (Linux), or uv tool install yt-dlp (cross-platform). |
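
Because the Kimi provider depends on yt-dlp for YouTube inputs, a script may want to confirm the tool is on `PATH` first. This is a minimal sketch; both function names are hypothetical and the error text is illustrative, not output from the CLI.

```shell
# Succeed if the named executable can be found on PATH.
have_tool() {
  command -v "$1" >/dev/null 2>&1
}

# Hypothetical guard: the Kimi provider needs yt-dlp only for YouTube inputs.
check_kimi_youtube_ready() {
  have_tool yt-dlp && return 0
  echo "yt-dlp not found; install it before using --provider kimi with YouTube URLs" >&2
  return 1
}

# Usage:
#   check_kimi_youtube_ready &&
#     video-understand analyze "https://www.youtube.com/watch?v=VIDEO_ID" "Summarize this" --provider kimi
```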

How It Works

Gemini:

  1. Local files → Content-hashed → Reused if cached, otherwise uploaded → Polled until ready → Analyzed
  2. YouTube / HTTP URLs → Passed directly (native support, no upload)

Kimi:

  1. Local files → Content-hashed → Reused if cached, otherwise uploaded via Moonshot Files API → Analyzed
  2. YouTube / HTTP URLs → Downloaded to a temp file via yt-dlp (YouTube) or fetch (HTTP) → Uploaded → Analyzed → Temp file deleted
  3. Uploaded files persist indefinitely (no auto-expiry), but limits apply to how many files you can upload and to the total size of all uploaded files. See Kimi's File API documentation for details.
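
The routing described in the two lists above can be sketched in shell: classify the input, then decide whether a download step is needed. Both functions are hypothetical, and the YouTube host patterns are an assumption rather than the CLI's actual matching rules.

```shell
# Hypothetical classifier: print "youtube", "http", or "local" for an input.
classify_input() {
  case "$1" in
    http://youtu.be/*|https://youtu.be/*) echo "youtube" ;;
    http://youtube.com/*|https://youtube.com/*) echo "youtube" ;;
    http://*.youtube.com/*|https://*.youtube.com/*) echo "youtube" ;;
    http://*|https://*) echo "http" ;;
    *) echo "local" ;;
  esac
}

# Per the lists above: Gemini takes URLs directly, while Kimi downloads any
# remote input to a temp file before uploading it.
needs_download() {
  [ "$1" = "kimi" ] && [ "$(classify_input "$2")" != "local" ]
}
```

For example, `needs_download kimi "https://youtu.be/VIDEO_ID"` succeeds, while the same URL with `gemini` does not.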

The CLI handles upload, deduplication, prompt construction, and output formatting. Spinners are TTY-aware — they fall back to simple log lines when invoked by agents.
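
The same TTY-aware idea can be sketched in shell for wrapper scripts. This is an illustration of the technique only: which stream the CLI actually inspects is not documented here, so checking stderr (file descriptor 2) is an assumption, and both function names are hypothetical.

```shell
# Print "spinner" when stderr is an interactive terminal, "log" otherwise
# (for example when an agent captures or redirects the output).
progress_mode() {
  if [ -t 2 ]; then
    echo "spinner"
  else
    echo "log"
  fi
}

# Write a progress message to stderr in the style that suits the terminal.
report() {
  if [ "$(progress_mode)" = "spinner" ]; then
    printf '\r%s...' "$1" >&2   # redraw on the same line
  else
    printf '%s\n' "$1" >&2      # one plain log line per update
  fi
}
```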

Upload cache lives at ~/.video-understand/uploads.json — same file content won't be re-uploaded across sessions or directories.
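
Content-hash deduplication can be illustrated with a small shell sketch. It makes several assumptions: SHA-256 as the hash (the README only says files are content-hashed), a simplified plain-text cache instead of the real `uploads.json` schema, and hypothetical function names and file IDs.

```shell
# Print the SHA-256 hex digest of a file, using whichever tool is available.
content_hash() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum "$1" | awk '{ print $1 }'
  else
    shasum -a 256 "$1" | awk '{ print $1 }'
  fi
}

# Print the cached file ID for a hash, or fail if there is no entry.
# Cache format (simplified, hypothetical): one "<hash> <file-id>" pair per line.
cache_lookup() {
  [ -f "$1" ] || return 1
  awk -v h="$2" '$1 == h { print $2; found = 1; exit } END { exit !found }' "$1"
}

# Append a hash -> file ID mapping to the cache file.
cache_store() {
  printf '%s %s\n' "$2" "$3" >> "$1"
}
```

Because the key is derived from the file's bytes rather than its path, two copies of the same video in different directories resolve to the same cache entry.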

License

MIT