vitutor
v0.1.1
Generate complete lesson videos (slides, narration, mp4, pdf) from a single prompt. Usable as a library or a CLI.
vitutor
Generate narrated lesson videos — slides, TTS narration, an MP4, and a printable PDF — from a single learner prompt.
Usable two ways:
- As a CLI: npx vitutor generate --prompt "如何使用 mac" --out ./lesson/ (the sample prompt means "how to use a Mac")
- As a library: import { generate } from "vitutor" from any Node 20+ process (e.g. the Electron main process of desktop-app).
The architecture mirrors the proven CLI-style pipeline from the edsger package: single process, synchronous, either produces files or throws a clear error. No background workers, no message queues, no half-finished pending rows.
Pipeline
prompt
│
▼ build (claude-agent-sdk, json_schema structured output)
├───► LessonPlan: { title, summary, slides[], sources[] }
│
▼ render (Playwright + ElevenLabs + ffmpeg-static)
├───► slides/slide-N.html (per-slide HTML)
├───► slides/slide-N.png (headless-chromium screenshot)
├───► audio/narration-N.mp3 (ElevenLabs TTS)
├───► segments/segment-N.mp4 (image + audio muxed)
├───► lesson.mp4 (concat of segments)
└───► lesson.pdf (deck PDF via page.pdf())
Progress callbacks fire at every stage boundary so UI callers can render a progress bar.
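As a rough illustration of what a UI caller might do with those callbacks, the sketch below turns one progress event into a text progress bar. The RenderProgress shape is an assumption: only the stage and percent fields appear in the library example, and the 0..100 range for percent is a guess, not something this README states.

```typescript
// Hypothetical event shape: only `stage` and `percent` are shown in the
// README's library example; everything else here is an assumption.
interface RenderProgress {
  stage: string;
  percent: number; // assumed to be in the range 0..100
}

// Format one progress event as a fixed-width text bar.
function formatProgress(p: RenderProgress, width = 20): string {
  // Clamp so that out-of-range or non-finite values cannot break the bar.
  const pct = Number.isFinite(p.percent)
    ? Math.min(100, Math.max(0, p.percent))
    : 0;
  const filled = Math.round((pct / 100) * width);
  const bar = "#".repeat(filled) + "-".repeat(width - filled);
  return `[${bar}] ${Math.round(pct)}% ${p.stage}`;
}
```

A caller could pass something like (p) => console.log(formatProgress(p)) as onRenderProgress; a desktop UI would more likely forward the same numbers to a real progress widget.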
Install
cd packages/vitutor
npm install
# Playwright needs Chromium downloaded once:
npx playwright install chromium
Node 20+ is required (Playwright and the Anthropic SDK have dropped support for older versions).
Credentials
- ElevenLabs (TTS): set ELEVENLABS_API_KEY, or pass --elevenlabs-key / tts.apiKey.
- Claude (build + slide agent): @anthropic-ai/claude-agent-sdk shells out to the claude CLI; authentication is whatever that CLI uses on your system (claude auth login once).
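The ElevenLabs lookup described above could be sketched roughly as follows. This is not the package's code: the precedence (explicit option first, then the environment variable) is an assumption, resolveTtsApiKey is a hypothetical helper name, and the MissingApiKeyError class here is a local stand-in for the error of the same name listed under failure modes.

```typescript
// Local stand-in for the MissingApiKeyError named in the failure table;
// the real class lives inside the package and may differ.
class MissingApiKeyError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "MissingApiKeyError";
  }
}

// Resolve the ElevenLabs key. Assumed precedence: an explicit value
// (flag or tts.apiKey) wins over the ELEVENLABS_API_KEY variable.
function resolveTtsApiKey(
  explicit: string | undefined,
  env: Record<string, string | undefined>,
): string {
  const key = explicit?.trim() || env.ELEVENLABS_API_KEY?.trim();
  if (!key) {
    throw new MissingApiKeyError(
      "Set ELEVENLABS_API_KEY or pass --elevenlabs-key / tts.apiKey",
    );
  }
  return key;
}
```

Passing the environment in as a parameter (for example process.env) keeps the helper easy to test without touching real credentials.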
CLI
vitutor generate \
--prompt "如何使用 mac" \
--out ./out/mac-lesson \
--locale zh-CN
vitutor build --prompt "intro to FFT" --output plan.json
vitutor render --input plan.json --out ./out/fft
vitutor preview-voice --text "你好,世界" --out sample.mp3
All commands support Ctrl-C mid-run: in-flight TTS requests and ffmpeg processes are aborted cleanly.
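One common way to get this kind of clean abort in Node is to route SIGINT into an AbortController and hand its signal to every in-flight operation. The sketch below shows that general pattern only; how vitutor wires cancellation internally is not documented here, and createCancellation is a hypothetical helper name.

```typescript
// Wire Ctrl-C (SIGINT) to an AbortController. The returned `signal` can be
// passed to anything that accepts an AbortSignal, such as fetch or
// child_process.spawn.
function createCancellation(): {
  signal: AbortSignal;
  cancel: () => void;
  dispose: () => void;
} {
  const controller = new AbortController();
  const cancel = () => controller.abort();
  // `once` removes the listener after the first Ctrl-C, so a second
  // Ctrl-C is no longer intercepted by this helper.
  process.once("SIGINT", cancel);
  // Call dispose() when the work finishes normally to avoid leaking
  // the listener.
  const dispose = () => {
    process.removeListener("SIGINT", cancel);
  };
  return { signal: controller.signal, cancel, dispose };
}
```

A caller would create one of these per command, pass signal down to network and subprocess calls, and call dispose() in a finally block.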
Library
import { generate } from "vitutor";
const result = await generate(
{ prompt: "如何使用 mac", locale: "zh-CN" },
{
outDir: "/path/to/output",
tts: { apiKey: process.env.ELEVENLABS_API_KEY },
onStage: (s) => console.log("stage:", s),
onRenderProgress: (p) => console.log(p.stage, p.percent),
},
);
console.log(result.videoPath); // /path/to/output/lesson.mp4
Or run build and render separately, which is useful when you want to persist the plan to a DB between steps:
import { build, render } from "vitutor";
const plan = await build({ prompt: "intro to FFT" });
// ... persist `plan` to Supabase here ...
const result = await render(plan, { outDir: "./out/fft" });
Integrating with desktop-app
The current desktop-app/src/main/learn/* is a subset of what this package does — it only handles the render step, and depends on a separate Claude CLI agent to write slides asynchronously. That async step is where lessons get stuck at status = 'pending'.
Migration path:
- Replace desktop-app/src/main/learn/pipeline.ts imports with import { render } from "vitutor".
- Delete the CLI agent dispatch (build-conversation.ts, agent-prompts.ts); build now runs in the Electron main process via import { build } from "vitutor".
- When a lesson is created in the UI, call build inline and await it; on success write slides to Supabase + status = 'ready'; on error set status = 'failed' with last_error. No more orphan rows.
- slide-capture.ts and pdf-export.ts in desktop-app used BrowserWindow and webContents.printToPDF. This package uses Playwright instead, which runs fine inside Electron's main process, but you can also keep the Electron implementations by not importing those two modules from here.
The protocol.ts file (custom lesson:// scheme) stays in desktop-app — it's Electron-specific.
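The third migration step (await the build, then write exactly one terminal status) might look roughly like the sketch below. Everything named here is hypothetical: LessonRowUpdate, toRowUpdate, and createLesson are illustrative names, buildFn stands in for the package's build, and save stands in for whatever Supabase write desktop-app uses. Only status, 'ready', 'failed', and last_error come from this README.

```typescript
// Hypothetical row update; the README only names `status` and `last_error`.
type LessonRowUpdate =
  | { status: "ready"; slides: unknown[]; last_error: null }
  | { status: "failed"; last_error: string };

type BuildOutcome =
  | { ok: true; slides: unknown[] }
  | { ok: false; error: unknown };

// Map a build outcome to exactly one terminal row state, so that no row
// is ever left at 'pending'.
function toRowUpdate(outcome: BuildOutcome): LessonRowUpdate {
  if (outcome.ok) {
    return { status: "ready", slides: outcome.slides, last_error: null };
  }
  const message =
    outcome.error instanceof Error
      ? outcome.error.message
      : String(outcome.error);
  return { status: "failed", last_error: message };
}

// Run the build and persist the result. Both dependencies are injected:
// `buildFn` stands in for vitutor's build, `save` for a database write.
async function createLesson(
  prompt: string,
  buildFn: (input: { prompt: string }) => Promise<{ slides: unknown[] }>,
  save: (update: LessonRowUpdate) => Promise<void>,
): Promise<LessonRowUpdate> {
  let update: LessonRowUpdate;
  try {
    const plan = await buildFn({ prompt });
    update = toRowUpdate({ ok: true, slides: plan.slides });
  } catch (error) {
    update = toRowUpdate({ ok: false, error });
  }
  await save(update);
  return update;
}
```

Keeping the outcome-to-row mapping in a pure function makes the "one terminal status per lesson" rule easy to unit test without a database.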
File layout
src/
build.ts # claude-agent-sdk → LessonPlan
build-prompts.ts # system + user prompts for the builder
render.ts # LessonPlan + outDir → mp4 + pdf
generate.ts # build + render orchestrator
slide-capture.ts # Playwright screenshot loop
pdf-export.ts # Playwright page.pdf()
slide-agent.ts # per-slide designer (html/css via Claude)
slide-design-cache.ts # content-addressable cache for designs
slide-designer-prompts.ts
slide-template.ts # deterministic fallback templates
tts.ts # ElevenLabs client + chunking
video-compose.ts # ffmpeg wrappers (segment + concat)
paths.ts # lesson outDir layout
design-tokens.ts # per-subject design-token type + normalizer
html-sanitizer.ts # sanitize agent-produced HTML/CSS
highlighter.ts # zero-dep code highlighter
concurrency.ts # mapConcurrent + concurrency heuristics
types.ts # LessonPlan / PlannedSlide / LessonSource
cli/index.ts # bin entrypoint
index.ts # library exports
Failure modes
How each failure surfaces, and whether that behavior has been verified by a test or is only expected from reading the code.
| Failure | Surfaces as | Verified |
| ------------------------------------- | ---------------------------------------------- | ---------- |
| Claude agent returns no JSON | LessonBuildError: ...no structured_output | ✅ mocked |
| Agent output fails zod validation | LessonBuildError: output failed validation | ✅ mocked |
| Agent returns an error-variant result | LessonBuildError: non-success result (...) | ✅ mocked |
| Agent stream ends with no result | LessonBuildError: without a result | ✅ mocked |
| Out-of-order / duplicate ordinals | Renumbered to contiguous 0..N-1 in input order | ✅ mocked |
| Slide missing narration_script | render throws slide ordinal N is missing | unverified |
| Render end-to-end (template path) | Produces mp4 + pdf + poster.png | ✅ smoke |
| ElevenLabs API key missing | MissingApiKeyError | unverified |
| ElevenLabs 4xx | Error: ElevenLabs TTS failed (4xx): ... | unverified |
| ElevenLabs 429 / 5xx | Retried with backoff; surfaces after exhaust | unverified |
| ffmpeg-static binary unresolvable | ffmpeg-static did not export a binary path | unverified |
| Playwright chromium not installed | Executable doesn't exist at ... | unverified |
| User cancels (Ctrl-C) during build | LessonBuildError: cancelled | ✅ mocked |
| User cancels (Ctrl-C) during render | RenderCancelledError | unverified |
Every one of these is synchronous — the caller sees the error immediately, not 10 minutes later via a silent pending row. The "unverified" rows are wired through and type-checked but haven't been exercised in tests yet.
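The ordinal row in the table ("renumbered to contiguous 0..N-1 in input order") describes a simple normalization. A minimal sketch of that idea, assuming slides are objects with a numeric ordinal field (renumberOrdinals is a hypothetical name, not the package's function):

```typescript
// Reassign ordinals to 0..N-1 following array (input) order, ignoring
// whatever ordinal values the agent produced. The input is not mutated.
function renumberOrdinals<T extends { ordinal: number }>(
  slides: readonly T[],
): T[] {
  return slides.map((slide, index) => ({ ...slide, ordinal: index }));
}

// Duplicate and out-of-order ordinals both collapse to positions.
const fixed = renumberOrdinals([
  { ordinal: 5, title: "Intro" },
  { ordinal: 5, title: "Setup" },
  { ordinal: 2, title: "Recap" },
]);
console.log(fixed.map((s) => `${s.ordinal}:${s.title}`).join(" "));
// prints "0:Intro 1:Setup 2:Recap"
```

Trusting array position over the agent's own numbering means duplicates and gaps never need to be treated as errors.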
Tests
- npm test runs vitest
- 54 tests total: 39 pure-function + 14 mocked build + 1 real end-to-end smoke test (render.smoke.test.ts) that launches real Chromium and real ffmpeg, stubs ElevenLabs, and verifies an mp4 + pdf land on disk (~3s)
- The smoke test self-skips if npx playwright install chromium hasn't been run
