vidistill

v0.4.4

Video intelligence distiller — turn any video or audio file into structured notes, transcripts, and insights using Gemini.

Feed it a YouTube URL, local video, or audio file. It analyzes the content through multiple AI passes (scene analysis, transcript, visuals, code extraction, people, chat, implicit signals) and synthesizes everything into organized markdown output.

Install

npm install -g vidistill

Requires Node.js 22+ and ffmpeg.

Usage

vidistill [input] [options]

Arguments:

  • input — YouTube URL, local video, or audio file path (prompted interactively if omitted)

Options:

  • -c, --context — context about the video (e.g. "CS lecture", "product demo")
  • -o, --output — output directory (default: ./vidistill-output/)
  • -l, --lang <code> — output language (e.g. zh, ja, ko, es, fr, de, pt, ru, ar, hi)

Examples:

# Interactive mode — prompts for everything
vidistill

# YouTube video
vidistill "https://youtube.com/watch?v=dQw4w9WgXcQ"

# Local file with context
vidistill ./lecture.mp4 --context "distributed systems lecture"

# Audio file
vidistill ./podcast.mp3

# Custom output directory
vidistill ./demo.mp4 -o ./notes/

# Output in another language
vidistill ./lecture.mp4 --lang zh

API Key

vidistill needs a Gemini API key. It checks these sources in order:

  1. GEMINI_API_KEY environment variable
  2. ~/.vidistill/config.json
  3. Interactive prompt (with option to save for next time)

Get a key at ai.google.dev.
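
The lookup order above could be sketched roughly as follows. This is a minimal illustration, not vidistill's actual code: the `apiKey` field name inside config.json, the `resolveApiKey` function, and its callback parameters are all assumptions made for the example.

```typescript
// Hypothetical shape of ~/.vidistill/config.json; the real field name is not documented here.
interface VidistillConfig {
  apiKey?: string;
}

interface ResolvedKey {
  key: string;
  source: "env" | "config" | "prompt";
}

// Walks the three sources in the documented order and returns the first non-empty key.
function resolveApiKey(
  env: Record<string, string | undefined>,
  readConfig: () => VidistillConfig | undefined,
  promptUser: () => string
): ResolvedKey {
  // 1. GEMINI_API_KEY environment variable
  const fromEnv = env["GEMINI_API_KEY"]?.trim();
  if (fromEnv) return { key: fromEnv, source: "env" };

  // 2. Saved config file (assumed to hold the key under "apiKey")
  const fromConfig = readConfig()?.apiKey?.trim();
  if (fromConfig) return { key: fromConfig, source: "config" };

  // 3. Interactive prompt as the last resort
  const fromPrompt = promptUser().trim();
  if (!fromPrompt) throw new Error("No Gemini API key provided");
  return { key: fromPrompt, source: "prompt" };
}
```

Passing the environment and the config reader in as arguments keeps the sketch free of file-system access, so the precedence logic can be checked on its own.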

Output

vidistill creates a folder per video with structured files:

vidistill-output/my-video/
├── guide.md           # overview and navigation
├── transcript.md      # full timestamped transcript
├── combined.md        # transcript + visual notes merged
├── notes.md           # meeting/lecture notes
├── code/              # extracted and reconstructed source files
│   ├── *.ext          # individual source files
│   └── code-timeline.md  # code evolution timeline
├── people.md          # speakers and participants
├── chat.md            # chat messages and links
├── action-items.md    # tasks and follow-ups
├── insights.md        # implicit signals and analysis
├── links.md           # all URLs mentioned
├── prereqs.md         # prerequisite knowledge (when detected)
├── timeline.html      # interactive visual timeline
├── metadata.json      # processing metadata
└── raw/               # raw pass outputs

Which files are generated depends on the video content: for example, a coding tutorial gets code/, and a meeting gets people.md and action-items.md.

Speaker Naming

When multiple speakers are detected, use the rename-speakers subcommand to assign real names. The names replace the generic labels (SPEAKER_00, SPEAKER_01) in all output files:

# Interactive rename — prompts for each speaker
vidistill rename-speakers ./vidistill-output/my-meeting/

# List current speaker state
vidistill rename-speakers ./vidistill-output/my-meeting/ --list

# Quick rename a single speaker
vidistill rename-speakers ./vidistill-output/my-meeting/ --rename "Steven Kang" "Steven K."

# Merge two speakers (e.g. same person on different devices)
vidistill rename-speakers ./vidistill-output/my-meeting/ --merge "K Iphone" "Kristian"
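
To illustrate what replacing generic labels in an output file amounts to, here is a small sketch. The `applySpeakerNames` helper and the transcript line format are hypothetical; vidistill's own implementation and file layout may differ.

```typescript
// Replaces generic speaker labels (e.g. SPEAKER_00) with real names.
// Labels without an entry in `names` are left untouched.
// Mapping two labels to the same name has the effect of a merge.
function applySpeakerNames(text: string, names: Record<string, string>): string {
  return text.replace(/\bSPEAKER_\d+\b/g, (label: string) =>
    Object.prototype.hasOwnProperty.call(names, label) ? names[label] : label
  );
}

// Assumed transcript format, for illustration only.
const renamed = applySpeakerNames(
  "[00:01] SPEAKER_00: Hi\n[00:04] SPEAKER_01: Hello",
  { SPEAKER_00: "Steven K.", SPEAKER_01: "Kristian" }
);
// renamed === "[00:01] Steven K.: Hi\n[00:04] Kristian: Hello"
```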

How It Works

Supported video formats: MP4, MOV, WebM, MKV, AVI, MPEG, FLV, WMV, 3GPP. Supported audio formats: MP3, AAC, WAV, FLAC, OGG, M4A.

  1. Input — accepts YouTube URL directly or reads local file (video or audio), compresses if over 2GB
  2. Pass 0 — scene analysis to classify video type and determine processing strategy
  3. Pass 1 — transcript extraction with speaker identification
  4. Pass 2 — visual content extraction (screen states, diagrams, slides)
  5. Pass 3 — specialist passes based on video type:
    • 3c: chat and links (live streams) — per segment, runs 3x with consensus voting
    • 3d: implicit signals (all types) — per segment
    • 3b: people and social dynamics (meetings) — whole video
    • 3a: code reconstruction (coding videos) — whole video, runs 3x with consensus voting and validation
  6. Synthesis — cross-references all passes into unified analysis
  7. Output — generates structured markdown files
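
The "runs 3x with consensus voting" step in passes 3a and 3c is not specified further here. One plausible reading is a simple majority vote over the three runs, sketched below. The `majorityVote` function is hypothetical, and treating each candidate as a plain string is an assumption made for the example.

```typescript
interface Tally {
  value: string;
  count: number;
}

// Returns the candidate produced by a strict majority of runs,
// or undefined when no candidate reaches a majority.
function majorityVote(runs: string[]): string | undefined {
  const tallies: Tally[] = [];
  for (const run of runs) {
    const existing = tallies.filter((t) => t.value === run)[0];
    if (existing) {
      existing.count += 1;
    } else {
      tallies.push({ value: run, count: 1 });
    }
  }
  for (const tally of tallies) {
    if (tally.count * 2 > runs.length) return tally.value;
  }
  return undefined;
}

// Two of three runs agree, so their answer wins.
const agreed = majorityVote([
  "https://example.com/docs",
  "https://example.com/doc",
  "https://example.com/docs",
]);
// agreed === "https://example.com/docs"
```

With three runs, a single divergent result is outvoted, while three different results produce no winner and would need a fallback such as the validation step mentioned for pass 3a.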

Audio files skip visual passes and go straight to transcript, people, implicit signals, and synthesis.
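
Put together, the choice of passes might look something like this sketch. The pass names, the `VideoType` values, and the `planPasses` function are illustrative labels rather than vidistill's real identifiers. In the actual tool the video type comes out of Pass 0, so it would not be known up front.

```typescript
type MediaKind = "video" | "audio";
type VideoType = "coding" | "meeting" | "livestream" | "other";

// Lists the passes that would run for a given input, in the order described above.
function planPasses(kind: MediaKind, type: VideoType): string[] {
  if (kind === "audio") {
    // Audio skips the visual passes entirely.
    return ["transcript", "people", "implicit-signals", "synthesis"];
  }
  const passes = ["scene-analysis", "transcript", "visual"];
  if (type === "livestream") passes.push("chat-and-links"); // 3c
  passes.push("implicit-signals"); // 3d runs for all types
  if (type === "meeting") passes.push("people"); // 3b
  if (type === "coding") passes.push("code-reconstruction"); // 3a
  passes.push("synthesis");
  return passes;
}
```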

Long videos are segmented automatically. Passes that fail are skipped gracefully.
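
Skipping a failed pass can be pictured as a loop that catches each pass's error and carries on. The sketch below assumes a simplified synchronous `Pass` interface and a `runPasses` helper, both invented for illustration. The real pipeline calls a remote API and is presumably asynchronous.

```typescript
interface Pass {
  name: string;
  run: () => string;
}

interface PipelineResult {
  outputs: Record<string, string>;
  skipped: string[];
}

// Runs each pass in order. A pass that throws is recorded as skipped
// and does not stop the passes that follow it.
function runPasses(passes: Pass[]): PipelineResult {
  const outputs: Record<string, string> = {};
  const skipped: string[] = [];
  for (const pass of passes) {
    try {
      outputs[pass.name] = pass.run();
    } catch {
      skipped.push(pass.name);
    }
  }
  return { outputs, skipped };
}
```

Recording the skipped names rather than discarding them lets a later step, such as synthesis, know which inputs are missing.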

License

MIT