notes-to-video
Turn notes (LaTeX, PDF, or plain text) into animated explainer videos in the style popularized by 3Blue1Brown — using Manim, TTS, and ffmpeg.
A Claude Code skill that handles the full pipeline: content extraction, narration writing with cue markers, Manim scene generation with audio-video sync, validation, rendering, and composition.
Demo
Feed it lecture notes, get an animated explainer video. 🔊 Turn sound on — the video has narration.
Section 11.5 (Variational Auto-Encoders), CS229 Lecture Notes — Andrew Ng & Tengyu Ma, Stanford University. Used as demo input with attribution.
🔊 Sound on. 3 min video, generated from notes in one command. Download
Install
This is a Claude Code skill, not a standalone CLI. The npm package is a one-shot installer that drops the skill files where Claude Code can find them.
1. Install the skill files:
```sh
npx notes-to-video
```

This copies:

- `skills/notes-to-video/` → `~/.claude/skills/notes-to-video/`
- `video_utils/` → `~/tools/video_utils/` (shared Python helpers the skill imports)
Re-run any time to upgrade.
2. Set up the Python environment (one-time, shared across projects):
```sh
python3 -m venv ~/tools/.venv
~/tools/.venv/bin/pip install manim edge-tts pydub faster-whisper
# Optional: chatterbox-tts (local voice cloning), torch with CUDA
```

3. (Optional) Add API keys for cloud TTS backends:
```sh
mkdir -p ~/tools/credentials
cat > ~/tools/credentials/.env <<'EOF'
MINIMAX_API_KEY=...
MINIMAX_GROUP_ID=...
OPENAI_API_KEY=...
EOF
```

Alternative install methods
Claude Code plugin:
```
/plugin marketplace add cymcymcymcym/notes-to-video
/plugin install notes-to-video@notes-to-video-marketplace
```

Manual: clone this repo, copy `skills/notes-to-video/` to `~/.claude/skills/` and `video_utils/` to `~/tools/video_utils/`.
Quick Start
Open Claude Code in your project and ask:
```
make a 3b1b-style video from my_notes.tex
```

Claude will:
- Extract key concepts from your notes
- Write a narration script with cue markers
- Generate Manim scenes synced to the narration
- Validate all scenes for visual issues
- Hand you the build command
Features
- Notes to video pipeline — feed in LaTeX, PDF, or plain text notes, get animated explainer videos
- Audio-video sync — cue-based system that synchronizes Manim animations to narration timestamps
- CText kerning fix — workaround for Manim's broken Pango kerning (manim #2844)
- 4 TTS backends — Edge-TTS (free, default), MiniMax (best quality), Chatterbox (local + voice cloning), OpenAI
- Scene validator — catches text overlaps, out-of-bounds elements, text overflow, and line-through-text issues before rendering
- Cross-platform — Linux, macOS, Windows
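To give a feel for what the scene validator looks for, overlap and out-of-bounds checks reduce to axis-aligned bounding-box geometry. This is a minimal sketch, not the shipped `validate_scenes.py` — the function names and padding value are assumptions; the frame size comes from Manim's defaults (8 scene units tall, about 14.22 wide):

```python
def boxes_overlap(a, b, pad=0.05):
    """True if two axis-aligned boxes (x_min, y_min, x_max, y_max) intersect.
    A small pad also flags near-touching text. Illustrative only."""
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b
    return (ax0 < bx1 + pad and bx0 < ax1 + pad
            and ay0 < by1 + pad and by0 < ay1 + pad)

def out_of_frame(box, half_w=14.22 / 2, half_h=8 / 2):
    """True if a box leaves Manim's default frame (about 14.22 x 8 scene
    units, centered on the origin)."""
    x0, y0, x1, y1 = box
    return x0 < -half_w or y0 < -half_h or x1 > half_w or y1 > half_h
```

The real validator works from rendered screenshots (see `review/` in the project layout), but the pass/fail logic for each element boils down to tests like these.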
TTS Options
| Backend | Quality | Cost | Requirements |
|---------|---------|------|--------------|
| Edge-TTS (default) | Good | Free | None |
| MiniMax | Best | ~$0.04/min | API key |
| Chatterbox | Good + voice cloning | Free | NVIDIA GPU |
| OpenAI TTS | Good | ~$0.06/min | API key |
How It Works
The core innovation is the cue-based audio-video sync system:
- Narration is written with `{CUE_NAME}` markers at visual event points
- TTS generates per-sentence audio and estimates cue positions by character ratio
- Manim scenes read cue timestamps and sync animations accordingly: `until()` fills gaps with slow animations, `sync()` waits for exact cue times
This produces smooth, naturally-paced videos where animations fire exactly when the narrator says the relevant keyword.
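The character-ratio estimate in step 2 can be sketched in a few lines. This is a minimal illustration that assumes speech progresses linearly through the sentence's characters; `cue_times` is a hypothetical helper, not the skill's actual estimator:

```python
import re

def cue_times(sentence: str, duration: float) -> dict[str, float]:
    """Map each {CUE} marker in a sentence to an estimated timestamp
    (seconds), proportional to its character offset in the spoken text."""
    spoken_len = 0
    offsets = {}
    for part in re.split(r"(\{[A-Z_]+\})", sentence):
        m = re.fullmatch(r"\{([A-Z_]+)\}", part)
        if m:
            offsets[m.group(1)] = spoken_len  # cue fires at this offset
        else:
            spoken_len += len(part)           # markers are not spoken
    return {name: duration * off / spoken_len for name, off in offsets.items()}

# A 4-second sentence whose cue sits exactly halfway through the characters:
times = cue_times("The encoder maps x {SHOW_Z}to a latent code z.", 4.0)
# → {"SHOW_Z": 2.0}
```

A forced aligner (e.g. Whisper word timestamps) would be more precise, but the character ratio needs no extra model and is close enough for sentence-length cues.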
Project Structure
Each video is a self-contained <project>/ subfolder. Adding a second video is zero migration — just create another.
```
final/                            # THE DELIVERABLE — what you watch/share
  <project>/
    <project>.pdf                 # source paper, if applicable
    <project>.mp4                 # final video
    <project>.srt                 # soft subtitles (sidecar)
    <project>_captioned.mp4       # optional: burned-in captions
intermediate/                     # everything else (heavy; .gitignored by default)
  <project>/
    src/
      video_<project>.py          # Manim scenes
      part_<project>_narration.py # narration with {CUE} markers
      generate_tts_<project>.py   # TTS runner
      build_<project>.py          # render + mux + caption
    assets/<project>/*.png        # extracted source figures
    audio/video_<project>/        # TTS output + durations.json
    media/videos/video_<project>/ # manim render cache
    review/video_<project>/       # validator screenshots
    output/                       # per-scene muxed MP4s
    plan_<project>.md             # scene-by-scene plan
```

The skill ships with:
```
video_utils/            # Bundled library (→ ~/tools/video_utils/ on install)
  manim_helpers.py      # CText, colors, sync helpers
  tts_edge.py           # Edge-TTS (free, default)
  tts_minimax.py        # MiniMax TTS (cloud)
  tts_local.py          # Chatterbox + Whisper (local)
  tts_openai.py         # OpenAI TTS (cloud)
  validate_scenes.py    # Scene validator
  captions.py           # SRT generator
skills/notes-to-video/
  SKILL.md              # Claude Code skill definition
```

Acknowledgments
- Manim Community — the animation engine this project is built on. Manim was originally created by Grant Sanderson (3Blue1Brown) and is now maintained by the Manim Community (MIT License).
- 3Blue1Brown — the visual style this project emulates. Grant Sanderson's videos set the standard for mathematical explainer animation.
- CS229 Lecture Notes — Andrew Ng & Tengyu Ma, Stanford University. Used as source material for the VAE demo video (Section 11.5), with attribution.
Disclaimer: This project is not affiliated with, endorsed by, or sponsored by 3Blue1Brown, Grant Sanderson, Stanford University, or the Manim Community. "3Blue1Brown" is used here as a descriptive reference to the visual style.
License
MIT — see LICENSE for details.
