video-note-taking
v0.1.0
Elegant, modular tool to generate notes from videos, subtitles, or plain transcripts.
Video Note Taking
Elegant, modular tool to generate notes directly from videos, subtitles, or plain transcripts.
Quick start
Global Installation
- Install globally with pnpm:
pnpm i -g video-note-taking
- Ensure your Google Gemini API key is set:
export GEMINI_API_KEY=your_key
- Run:
vnt input.mp4 -o note.md
Alternative: Local Development
- Install dependencies and build:
pnpm install
pnpm build
- Run:
node dist/cli.js input.mp4 -o note.md
Usage
vnt <inputs...> -o note.md [options]
Options:
-o, --output <file> Output file path, or - for stdout (default: "note-<timestamp>.md")
--style <style> Note style (choices: "handwritten", "bulleted", "outline", "concise", default: "handwritten")
-l, --language <lang> Output language, e.g., zh-TW or en
--timestamps Include timestamps when available
--merge Merge multiple inputs into one note
--provider <name> Provider name (default: "google-genai")
--api-key <key> Provider API key (or use env GEMINI_API_KEY)
--model <name> Provider model name
--prompt <text> Custom system instruction override
--format <fmt> Output format (md|txt) (default: "md")
-h, --help Show help
Library usage
You can programmatically use the core API:
import { generateNotes } from "video-note-taking";
const { content } = await generateNotes(["input.mp4"], {
style: "handwritten",
language: "English",
providerOptions: { apiKey: process.env.GEMINI_API_KEY },
});
Notes
- The provider currently supports Google Gemini via @google/genai.
- Supported video MIME types include: mp4, mov, avi, mpeg/mpg, flv, webm, wmv, 3gpp.
- You can pass large video files directly (up to 2 GB per file via the Gemini Files API). The CLI uploads each video and asks Gemini to analyze its content and produce notes. The API keeps uploaded files for about 48 hours.
- If you also pass .srt or .txt, those will be combined as text context. When multiple inputs include videos, the request can include up to 10 videos at once.
- --merge is ignored when any input is a video, so that video references remain intact.
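The limits in these notes can be checked before anything is uploaded. The following is a minimal sketch of such a pre-flight check, not part of the package's API: `planInputs`, `InputFile`, and `Plan` are hypothetical names, the mapping from file extensions to the supported video types is an assumption, and 2 GB is taken here as 2 × 1024³ bytes.

```typescript
// Hypothetical pre-flight check for a list of inputs.
// None of these names come from video-note-taking; they only illustrate the notes above.

type InputFile = { path: string; sizeBytes: number };

interface Plan {
  videos: string[];      // inputs treated as videos
  texts: string[];       // .srt / .txt inputs used as text context
  mergeApplied: boolean; // whether --merge would still take effect
  problems: string[];    // human-readable limit violations
}

// Assumption: extensions that correspond to the video types listed in the notes.
const VIDEO_EXTENSIONS = new Set([
  "mp4", "mov", "avi", "mpeg", "mpg", "flv", "webm", "wmv", "3gp", "3gpp",
]);
const TEXT_EXTENSIONS = new Set(["srt", "txt"]);

const MAX_VIDEOS = 10;                  // "up to 10 videos at once"
const MAX_VIDEO_BYTES = 2 * 1024 ** 3;  // assumption: 2 GB read as 2 GiB

function extensionOf(path: string): string {
  const dot = path.lastIndexOf(".");
  return dot === -1 ? "" : path.slice(dot + 1).toLowerCase();
}

function planInputs(inputs: InputFile[], mergeRequested: boolean): Plan {
  const videos: string[] = [];
  const texts: string[] = [];
  const problems: string[] = [];

  for (const { path, sizeBytes } of inputs) {
    const ext = extensionOf(path);
    if (VIDEO_EXTENSIONS.has(ext)) {
      videos.push(path);
      if (sizeBytes > MAX_VIDEO_BYTES) {
        problems.push(`${path}: larger than 2 GB`);
      }
    } else if (TEXT_EXTENSIONS.has(ext)) {
      texts.push(path);
    } else {
      problems.push(`${path}: unrecognized extension ".${ext}"`);
    }
  }

  if (videos.length > MAX_VIDEOS) {
    problems.push(`${videos.length} videos given; at most ${MAX_VIDEOS} per request`);
  }

  // --merge is ignored as soon as any input is a video.
  const mergeApplied = mergeRequested && videos.length === 0;
  return { videos, texts, mergeApplied, problems };
}

// Usage: one video plus its subtitles, with --merge requested.
const plan = planInputs(
  [
    { path: "lecture.mp4", sizeBytes: 500_000_000 },
    { path: "lecture.srt", sizeBytes: 40_000 },
  ],
  true,
);
console.log(plan.mergeApplied); // false, because a video is present
console.log(plan.problems);     // []
```

A check like this only mirrors the limits stated above; the Gemini API remains the authority on what it accepts.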
