genai-yt
v0.2.1
AI-powered YouTube transcript extractor and analyzer using Claude Code or Cursor CLI
Features
- Transcript extraction - Extract raw transcripts from YouTube videos
- Speech-to-text fallback - Automatically falls back to yt-dlp + whisper-cpp when subtitles are unavailable
- AI-powered analysis - Summarize, extract insights, translate, or take notes using AI
- Multi-provider support - Claude Code CLI and Cursor CLI
- Custom prompts - Ask any question about a video with the ask command
- Multi-language - Specify transcript language and translation target
How It Works
flowchart TD
A[Start: genai-yt] --> B{Command?}
B -->|transcript| C[Fetch YouTube Transcript]
B -->|summary/insights/translate/memo/ask| D[Fetch YouTube Transcript]
C --> E{Subtitles available?}
D --> F{Subtitles available?}
E -->|Yes| G[Output raw text or timestamped]
E -->|No| H[yt-dlp + whisper-cpp]
H --> G
F -->|Yes| I[Build AI Prompt]
F -->|No| J[yt-dlp + whisper-cpp]
J --> I
I --> K{Provider}
K -->|Claude Code| L[Claude Code CLI]
K -->|Cursor CLI| M[Cursor CLI]
L --> N[Display AI Response]
M --> N
Prerequisites
For AI commands, you need at least one of:
- Claude Code CLI - Anthropic's official CLI
- Cursor Agent CLI - Cursor's agent CLI (command: agent)
Speech-to-Text (optional)
For videos without subtitles, install these tools for automatic fallback:
brew install yt-dlp whisper-cpp ffmpeg
Download a Whisper model (one-time setup):
mkdir -p ~/.local/share/whisper-cpp
cd ~/.local/share/whisper-cpp
# Base model (~142MB, fastest)
curl -LO https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-base.bin
# Or large-v3-turbo (~1.5GB download, more accurate than base)
curl -LO https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-large-v3-turbo.bin
Installation
# Global installation
npm install -g genai-yt
# Or use directly with npx (no installation required)
npx genai-yt transcript "<url>"
Usage
Extract Transcript (No AI)
# Raw transcript text
genai-yt transcript "<url>"
# With timestamps
genai-yt transcript "<url>" --timestamps
# Specify language
genai-yt transcript "<url>" --lang ko
AI-Powered Commands
All AI commands require the -p, --provider option.
# Summarize a video
genai-yt summary "<url>" -p claude-code
# Extract key insights
genai-yt insights "<url>" -p cursor-cli
# Translate transcript
genai-yt translate "<url>" -p claude-code --lang en
# Convert to organized notes
genai-yt memo "<url>" -p claude-code
# Ask a custom question
genai-yt ask "<url>" -p claude-code --prompt "Extract all investment advice from this video"
Common Options
# Specify AI model
genai-yt summary "<url>" -p cursor-cli --model claude-4.5-sonnet
# Specify transcript language
genai-yt summary "<url>" -p claude-code --transcript-lang ja
Commands
| Command | Description | AI Required |
|---------|-------------|-------------|
| transcript <url> | Extract raw transcript | No |
| summary <url> | Summarize video content | Yes |
| insights <url> | Extract key insights and action items | Yes |
| translate <url> | Translate transcript to target language | Yes |
| memo <url> | Convert to organized notes | Yes |
| ask <url> | Custom prompt-based analysis | Yes |
Options
transcript
| Option | Description | Default |
|--------|-------------|---------|
| --lang <lang> | Transcript language (e.g., ko, en, ja) | auto |
| --timestamps | Include timestamps | false |
AI Commands (summary, insights, translate, memo, ask)
| Option | Description | Default |
|--------|-------------|---------|
| -p, --provider <provider> | AI provider (claude-code or cursor-cli) | required |
| --model <model> | Model to use | provider default |
| --transcript-lang <lang> | Transcript language | auto |
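Because -p is required for every AI command, a small wrapper can choose the provider from whatever CLI is installed. The sketch below is not part of genai-yt. The function name pick_provider is hypothetical, and it assumes the Claude Code CLI is on PATH as claude and the Cursor Agent CLI as agent. Preferring claude-code when both are present is an arbitrary choice.

```shell
# pick_provider: print a value suitable for genai-yt's -p option, based on
# which provider CLI is found on PATH. Hypothetical helper, not part of genai-yt.
pick_provider() {
  if command -v claude >/dev/null 2>&1; then
    echo "claude-code"   # Claude Code CLI found
  elif command -v agent >/dev/null 2>&1; then
    echo "cursor-cli"    # Cursor Agent CLI found
  else
    return 1             # no provider CLI installed
  fi
}

# Usage (assumes at least one provider CLI is installed):
#   genai-yt summary "<url>" -p "$(pick_provider)"
```

Finding a CLI on PATH does not prove it is authenticated, so a provider chosen this way can still fail at run time.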
translate (additional)
| Option | Description | Default |
|--------|-------------|---------|
| --lang <lang> | Target language | en |
ask (additional)
| Option | Description | Default |
|--------|-------------|---------|
| --prompt <prompt> | Custom prompt | required |
Transcript Fallback
When YouTube subtitles are not available, genai-yt automatically falls back to local speech-to-text:
- Downloads audio using yt-dlp (converted to 16kHz WAV)
- Transcribes locally using whisper-cpp (no API key required)
- Returns the transcribed text in the same format
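For reference, the steps above can be approximated by hand. This is a rough sketch under assumptions, not genai-yt's actual implementation: the function names, file names, and the DRY_RUN and WHISPER_BIN variables are made up for illustration. It also assumes the whisper.cpp binary is called whisper-cli (older Homebrew builds installed it as whisper-cpp).

```shell
# run: print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# stt_fallback URL MODEL WORKDIR: manual approximation of the fallback steps.
stt_fallback() {
  local url="$1" model="$2" workdir="$3"
  local whisper_bin="${WHISPER_BIN:-whisper-cli}"  # assumed binary name

  # 1. Download the audio track and extract it as WAV (yt-dlp calls ffmpeg).
  run yt-dlp -x --audio-format wav -o "$workdir/audio.%(ext)s" "$url"

  # 2. Resample to 16 kHz mono 16-bit PCM, the input format whisper.cpp expects.
  run ffmpeg -y -i "$workdir/audio.wav" -ar 16000 -ac 1 -c:a pcm_s16le "$workdir/audio16k.wav"

  # 3. Transcribe locally; -otxt writes the text to <prefix>.txt, set by -of.
  run "$whisper_bin" -m "$model" -f "$workdir/audio16k.wav" -otxt -of "$workdir/transcript"
}

# Preview the commands without running anything:
#   DRY_RUN=1 stt_fallback "<url>" ~/.local/share/whisper-cpp/ggml-base.bin /tmp/stt
```

Running with DRY_RUN=1 first is a cheap way to confirm the paths and model file before starting a long download or transcription.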
genai-yt looks for the Whisper model in these locations (first match wins):
| Priority | Path |
|----------|------|
| 1 | WHISPER_MODEL environment variable |
| 2 | ~/.local/share/whisper-cpp/ggml-large-v3-turbo.bin |
| 3 | ~/.local/share/whisper-cpp/ggml-base.bin |
| 4 | ~/.local/share/whisper-cpp/ggml-small.bin |
| 5 | ~/.local/share/whisper-cpp/ggml-medium.bin |
| 6 | ~/.local/share/whisper-cpp/ggml-large-v3.bin |
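To see which model file would be picked on a given machine, the priority table above can be mirrored in a few lines of shell. This is only an illustration of the documented order, and find_whisper_model is a hypothetical name. One assumption is labeled in the code: when WHISPER_MODEL is set but the file does not exist, the sketch falls through to the default locations, which may differ from what genai-yt does.

```shell
# find_whisper_model: print the first existing model path, following the
# documented priority order. Returns non-zero if no model is found.
find_whisper_model() {
  local dir="$HOME/.local/share/whisper-cpp"
  local candidate

  # Priority 1: explicit override via the WHISPER_MODEL environment variable.
  # Assumption: a missing file here falls through to the defaults below.
  if [ -n "${WHISPER_MODEL:-}" ] && [ -f "$WHISPER_MODEL" ]; then
    printf '%s\n' "$WHISPER_MODEL"
    return 0
  fi

  # Priorities 2-6: fixed file names, checked in table order.
  for candidate in ggml-large-v3-turbo.bin ggml-base.bin ggml-small.bin \
                   ggml-medium.bin ggml-large-v3.bin; do
    if [ -f "$dir/$candidate" ]; then
      printf '%s\n' "$dir/$candidate"
      return 0
    fi
  done
  return 1
}

# Usage:
#   find_whisper_model || echo "no Whisper model found"
```

To use a model stored elsewhere, set the variable before running a command, for example export WHISPER_MODEL=/path/to/model.bin.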
Examples
# Get Korean transcript with timestamps
genai-yt transcript "https://youtu.be/dQw4w9WgXcQ" --lang ko --timestamps
# Summarize a tech talk
genai-yt summary "https://www.youtube.com/watch?v=VIDEO_ID" -p claude-code
# Translate Japanese video to English
genai-yt translate "https://youtu.be/VIDEO_ID" -p claude-code --transcript-lang ja --lang en
# Extract investment tips
genai-yt ask "https://youtu.be/VIDEO_ID" -p cursor-cli --prompt "List all actionable investment tips"
# Video without subtitles (auto STT fallback)
genai-yt transcript "https://www.youtube.com/watch?v=VIDEO_ID"
Requirements
- Node.js >= 18.0.0
- Claude Code CLI or Cursor CLI installed and authenticated (for AI commands)
- yt-dlp + whisper-cpp + ffmpeg (optional, for videos without subtitles)
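A quick way to check these requirements before a first run is a short preflight script. The sketch below is not shipped with genai-yt, and its helper names are hypothetical. It assumes the provider CLIs are on PATH as claude and agent, and that whisper-cpp installs a binary named either whisper-cli or whisper-cpp. It only checks that the tools exist, not that a provider is authenticated.

```shell
# has_cmd NAME: succeeds if NAME is an executable on PATH.
has_cmd() { command -v "$1" >/dev/null 2>&1; }

# node_ok VERSION: succeeds if a string like "v18.19.0" has major version >= 18.
node_ok() {
  local v="${1#v}"        # strip the leading "v"
  local major="${v%%.*}"  # keep the text before the first dot
  [ "$major" -ge 18 ] 2>/dev/null
}

# check_requirements: print one status line per requirement.
check_requirements() {
  local tool
  if has_cmd node && node_ok "$(node --version)"; then
    echo "ok       Node.js >= 18"
  else
    echo "missing  Node.js >= 18"
  fi
  # AI commands need at least one provider CLI.
  if has_cmd claude || has_cmd agent; then
    echo "ok       AI provider CLI (claude or agent)"
  else
    echo "missing  AI provider CLI (claude or agent)"
  fi
  # Optional speech-to-text tools for videos without subtitles.
  for tool in yt-dlp ffmpeg; do
    if has_cmd "$tool"; then echo "ok       $tool"; else echo "missing  $tool (optional)"; fi
  done
  if has_cmd whisper-cli || has_cmd whisper-cpp; then
    echo "ok       whisper-cpp"
  else
    echo "missing  whisper-cpp (optional)"
  fi
  return 0
}

# Usage:
#   check_requirements
```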
License
MIT
