@notedit/vtake
v1.0.1
Local video takeaway generator (skill-driven)
VTake
VTake turns a local video into a transcript, metadata, and an AI-composed
card-based takeaway video. The CLI handles the deterministic, narrow steps
(audio extraction, transcription); a Claude Code skill drives the creative
work (storyboard planning, per-card HTML, GSAP timeline), and hyperframes
renders the final MP4.
The mainline product is local-first and skill-driven:
- vtake npm CLI: extraction, transcription, and a doctor health check
- skills/vtake/SKILL.md: agent workflow that designs cards, writes HTML, and calls hyperframes to render
- web/: static landing page + thin /api/transcribe proxy on Cloudflare Workers
Install
npm install -g vtake
From this repository:
npm install
npm run build
npm run dev:cli -- --help
CLI
vtake doctor # check local prerequisites
vtake extract <video> --out-dir <dir> # → metadata.json + audio.mp3
vtake transcribe <audio> --out-dir <dir> # → transcript.json
    [--asr elevenlabs]
Rendering and storyboard authoring are not in the CLI by design; the
skill writes per-card HTML and an index.html composition in chat, then
calls npx hyperframes render <dir> to produce the MP4. Run vtake doctor
before starting to confirm ffmpeg, ffprobe, the staged fonts, and the
gsap.min.js vendor file are present.
Environment
ASR (vtake transcribe):
- ELEVEN_API_KEY: direct ElevenLabs Scribe (no rate limit). When unset, the CLI falls back to https://vtake.app/api/transcribe, rate-limited to 3 requests per minute per IP.
- VTAKE_TRANSCRIBE_ENDPOINT: override the proxy URL (e.g. for local Wrangler dev: http://localhost:8787/api/transcribe).
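The fallback described above can be sketched as a small resolver. This is an illustration only: the function name `resolveTranscribeTarget`, the `TranscribeTarget` type, and the assumption that a set `ELEVEN_API_KEY` takes precedence over `VTAKE_TRANSCRIBE_ENDPOINT` are hypothetical and not taken from VTake's source.

```typescript
// Hypothetical sketch of choosing the ASR target from the environment.
// Names and precedence are assumptions, not VTake's actual implementation.
type TranscribeTarget =
  | { mode: "direct"; apiKey: string }
  | { mode: "proxy"; url: string };

// Default proxy URL as documented in the README.
const DEFAULT_PROXY_URL = "https://vtake.app/api/transcribe";

function resolveTranscribeTarget(
  env: Record<string, string | undefined>,
): TranscribeTarget {
  const apiKey = env.ELEVEN_API_KEY?.trim();
  if (apiKey) {
    // A key is present: talk to ElevenLabs Scribe directly (no proxy rate limit).
    return { mode: "direct", apiKey };
  }
  // No key: fall back to the proxy, honoring an override such as a local Wrangler URL.
  const override = env.VTAKE_TRANSCRIBE_ENDPOINT?.trim();
  return { mode: "proxy", url: override || DEFAULT_PROXY_URL };
}
```

In a Node process this would typically be called as `resolveTranscribeTarget(process.env)`; passing the environment as a plain object keeps the function easy to test.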
Rendering (hyperframes render, run by the skill):
- PRODUCER_BROWSER_GPU_MODE=hardware: strongly recommended on macOS.
The CLI loads .env from the repository root when running from this project.
The mainline does not include Supabase, Redis, BullMQ, R2, auth, credits, upload queues, or rendering workers.
Skill Workflow
Use skills/vtake/SKILL.md when you want an agent to drive the card-based
composition:
- vtake extract: audio + metadata
- vtake transcribe: word-level transcript
- Agent corrects ASR errors in chat
- Agent drafts a lightweight storyboard.json outline in chat (no CLI)
- Agent writes one HTML fragment per card
- Agent assembles public/index.html with the GSAP master timeline
- npx hyperframes render public -o output.mp4
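As a rough illustration of the command-line portion of the workflow above, the three shell invocations can be listed in order by a small helper. The helper name `workflowCommands`, the `WorkflowPaths` type, and the assumption that `audio.mp3` lands directly inside the `--out-dir` directory are hypothetical; the steps the agent performs in chat are not represented.

```typescript
// Hypothetical helper: returns the CLI invocations of the skill workflow in order.
// Only the commands and flags come from the README; everything else is illustrative.
interface WorkflowPaths {
  video: string;  // path to the local source video
  outDir: string; // directory that receives metadata.json, audio.mp3, transcript.json
}

function workflowCommands({ video, outDir }: WorkflowPaths): string[][] {
  return [
    // Extract audio and metadata from the video.
    ["vtake", "extract", video, "--out-dir", outDir],
    // Transcribe the extracted audio (assumes audio.mp3 sits directly in outDir).
    ["vtake", "transcribe", `${outDir}/audio.mp3`, "--out-dir", outDir],
    // The agent corrects the transcript, drafts the storyboard, and writes HTML
    // in chat between these commands; then the composition is rendered.
    ["npx", "hyperframes", "render", "public", "-o", "output.mp4"],
  ];
}
```

Returning argument arrays rather than joined strings avoids shell-quoting problems if the commands are later passed to a process spawner.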
The storyboard / cards / composition all live in the user's working directory — there is no central server, no upload, no queue.
Cloudflare Workers Deployment
The hosted surface is intentionally tiny: web/out is served via Workers
Static Assets and web/worker.ts exposes:
- GET /api, GET /api/health, GET /api/version, GET /api/capabilities
- POST /api/transcribe: authenticated thin relay to ElevenLabs Scribe with per-IP (3/min) and global (60/min) rate limiting
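To make the two limits concrete, here is a minimal fixed-window limiter sketch. The README does not say how the Worker counts requests, so the fixed-window strategy, the in-memory `Map`, and the names `FixedWindowLimiter` and `createTranscribeLimiter` are all assumptions chosen for illustration.

```typescript
// Minimal fixed-window rate limiter sketch (assumed strategy, in-memory state).
class FixedWindowLimiter {
  private readonly limit: number;
  private readonly windowMs: number;
  private readonly counts = new Map<string, { windowStart: number; count: number }>();

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true and records the request if the key is under its limit.
  allow(key: string, nowMs: number): boolean {
    const entry = this.counts.get(key);
    if (!entry || nowMs - entry.windowStart >= this.windowMs) {
      // First request for this key, or the previous window has expired.
      this.counts.set(key, { windowStart: nowMs, count: 1 });
      return true;
    }
    if (entry.count >= this.limit) return false;
    entry.count += 1;
    return true;
  }
}

// Combines a per-IP limit with a global limit; defaults mirror the README (3/min, 60/min).
function createTranscribeLimiter(perIpLimit = 3, globalLimit = 60, windowMs = 60_000) {
  const perIp = new FixedWindowLimiter(perIpLimit, windowMs);
  const overall = new FixedWindowLimiter(globalLimit, windowMs);
  // The per-IP check runs first, so an IP over its limit never consumes global budget.
  return (ip: string, nowMs: number): boolean =>
    perIp.allow(ip, nowMs) && overall.allow("*", nowMs);
}
```

A deployed Worker would need shared state across isolates (for example a Durable Object or a platform rate-limiting binding) rather than a per-instance `Map`; the sketch only shows the counting logic.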
Use Node.js 22+ for Wrangler 4.
cd web
npm install
npm run build
npm run dev:worker
npm run deploy
web/wrangler.jsonc maps the Worker to the Custom Domain vtake.app.
See docs/cloudflare-workers.md for the deployment plan.
Development
npm run build
npm test
cd web
npm install
npm run typecheck
npm run build