@guanghechen/kit-video

v0.3.4

Published

5 days ago


lemonclown

video ffmpeg slides transition ai tts image

AI-powered video generation from scenario files. This package provides a complete pipeline for generating presentation videos from text scenarios.

Installation

pnpm add @guanghechen/kit-video

Usage

Full Pipeline (autogen)

# Generate complete video from scenario
kit-video autogen -s /path/to/scenario -o /path/to/output --video

# With PDF and PPTX export
kit-video autogen -s /path/to/scenario -o /path/to/output --video --pdf --pptx

# Dry run (show what would be executed)
kit-video autogen -s /path/to/scenario --dry-run

Note: The preference, outline, and transcript stages currently only validate files that already exist in the workspace. They do not generate these files automatically.
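Because those three stages only validate, it can help to check for the files before starting a long run. The following is a minimal sketch, assuming the file names listed in the Pipeline Stages table (preference.json, outline.json, transcript.md); `missing_inputs` is a hypothetical helper name and not part of kit-video.

```shell
# Hypothetical helper (not part of kit-video): list which of the three
# validated files are missing from a directory.
# Prints one file name per line; prints nothing when all are present.
missing_inputs() {
  local dir="$1" name
  for name in preference.json outline.json transcript.md; do
    [ -f "$dir/$name" ] || printf '%s\n' "$name"
  done
}
```

For example, `missing_inputs /path/to/workspace` prints nothing when all three files are in place.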

Individual Commands

# Prepare workspace
kit-video prepare -s /path/to/scenario -w /path/to/output

# Validate preference
kit-video preference -w /path/to/workspace

# Validate outline
kit-video outline -w /path/to/workspace

# Validate transcript
kit-video transcript -w /path/to/workspace

# Generate images
kit-video image -w /path/to/workspace

# Run OCR
kit-video ocr -w /path/to/workspace

# Generate PDF (full edition)
kit-video pdf -w /path/to/workspace

# Generate PPTX (full edition)
kit-video pptx -w /path/to/workspace

# Generate TTS audio
kit-video tts -w /path/to/workspace

# Align TTS with slides
kit-video tts-align -w /path/to/workspace

# Compose video
kit-video video -w /path/to/workspace --srt --transition fade
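When running stages one at a time, a small wrapper can stop at the first failure. This is only a sketch: `run_stages` and the `KIT_VIDEO_BIN` variable are hypothetical names introduced here, and the sketch assumes each listed stage needs nothing beyond `-w`, as in the commands above. `prepare` is left out because it also takes `-s`.

```shell
# Hypothetical wrapper (not part of kit-video): run the given stages in order
# against one workspace and stop at the first stage that exits non-zero.
# KIT_VIDEO_BIN is an assumed override for the executable name.
run_stages() {
  local ws="$1" stage
  local bin="${KIT_VIDEO_BIN:-kit-video}"
  shift
  for stage in "$@"; do
    echo ">> $stage" >&2
    if ! "$bin" "$stage" -w "$ws"; then
      echo "stage failed: $stage" >&2
      return 1
    fi
  done
}
```

A call such as `run_stages /path/to/workspace image tts tts-align video` would then run those four stages in sequence.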

Pipeline Stages

The video generation pipeline consists of the following stages:

| Stage      | Description                                | Output                |
| ---------- | ------------------------------------------ | --------------------- |
| prepare    | Prepare workspace from scenario directory  | workspace structure   |
| preference | Validate existing presentation preferences | preference.json       |
| outline    | Validate existing slide outline            | outline.json          |
| transcript | Validate existing transcript for slides    | transcript.md         |
| image      | Generate slide images                      | img/slide_N_V.png     |
| ocr        | Extract text from slide images             | ocr_data.json         |
| pdf        | Generate PDF from images (full edition)    | presentation.pdf      |
| pptx       | Generate PPTX from images (full edition)   | presentation.pptx     |
| tts        | Generate TTS audio from transcript         | voice.mp3             |
| tts-align  | Align TTS audio with slides                | slide_durations.json  |
| video      | Compose final video from images and audio  | video.mp4, video.done |

Scenario Directory Structure

scenario/
├── query.md           # Topic/query for generation
├── material/          # (Optional) Reference materials
│   ├── doc1.pdf
│   └── ...
├── preference.json    # (Optional) Pre-defined preferences
├── outline.json       # (Optional) Pre-generated outline
└── transcript.md      # (Optional) Pre-generated transcript
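A scenario directory with only the query file can be scaffolded by hand. The sketch below assumes nothing beyond the layout shown above; `init_scenario` is a hypothetical helper name, and the optional preference.json, outline.json, and transcript.md files are left for you to add.

```shell
# Hypothetical helper (not part of kit-video): create a minimal scenario
# directory containing query.md and an empty material/ folder.
init_scenario() {
  local dir="$1" topic="$2"
  mkdir -p "$dir/material"
  printf '%s\n' "$topic" > "$dir/query.md"
}
```

For example, `init_scenario ./scenario "Introduction to video transitions"` creates ./scenario/query.md and ./scenario/material/.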

Workspace Structure

After running the pipeline, the workspace will contain:

workspace/
├── material.md            # (Optional) Copied material file
├── material/              # (Optional) Copied material directory
├── query.md               # (Optional) Copied query
├── preference.json        # Presentation preferences
├── outline.json           # Slide outline
├── transcript.md          # Narration transcript
├── img/                   # Generated images
│   ├── slide_0_1.png
│   ├── slide_1_1.png
│   └── ...
├── ocr_data.json          # OCR results
├── voice.mp3              # TTS audio
├── slide_durations.json   # Timing information
├── subtitles.srt          # (Optional) Subtitles
├── presentation.pdf       # (Optional) PDF export
├── presentation.pptx      # (Optional) PPTX export
├── video.done             # Completion marker
└── video.mp4              # Final video
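The layout above makes it easy to inspect a workspace from a script. The following sketch treats the presence of both video.done and video.mp4 as "complete", which is an assumption based on the "Completion marker" comment; `workspace_status` and `count_slide_images` are hypothetical helper names.

```shell
# Hypothetical helpers (not part of kit-video).

# Print "complete" when both the marker file and the final video exist,
# otherwise print "incomplete".
workspace_status() {
  local dir="$1"
  if [ -f "$dir/video.done" ] && [ -f "$dir/video.mp4" ]; then
    echo complete
  else
    echo incomplete
  fi
}

# Count generated slide images matching img/slide_*.png.
count_slide_images() {
  local dir="$1" n=0 f
  for f in "$dir"/img/slide_*.png; do
    # An unmatched glob stays literal, so test that the file really exists.
    if [ -e "$f" ]; then
      n=$((n + 1))
    fi
  done
  echo "$n"
}
```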

Autogen Options

| Option                 | Type    | Default       | Description                             |
| ---------------------- | ------- | ------------- | --------------------------------------- |
| -s, --scenario         | string  | (required)    | Path to scenario directory              |
| -o, --output           | string  | auto          | Output directory                        |
| --pdf                  | boolean | false         | Enable PDF generation                   |
| --pptx                 | boolean | false         | Enable PPTX generation                  |
| --tts                  | boolean | false         | Enable TTS voice generation             |
| --video                | boolean | false         | Enable video generation (implies --tts) |
| --ocr                  | boolean | false         | Enable OCR text detection               |
| --dry-run              | boolean | false         | Show what stages would run              |
| --debug                | boolean | false         | Enable debug output                     |
| --image-source         | string  | gpt-image-1-5 | Image source                            |
| --image-quality        | string  | high          | Image quality (low/medium/high)         |
| --tts-source           | string  | speech        | TTS source (speech/llmapi)              |
| --speech-voice         | string  | -             | Azure Speech voice name                 |
| --transition           | string  | wipeleft      | Transition effect                       |
| --transition-duration  | string  | 0.8           | Transition duration in seconds          |
| --stage-image-parallel | string  | 1             | Image stage parallelism                 |
| --stage-ocr-parallel   | string  | 2             | OCR stage parallelism                   |
| --srt                  | boolean | false         | Generate and burn subtitles             |
| --stt                  | boolean | false         | Enable STT for precise timing           |
| --query                | string  | -             | Inline query text                       |
| --query-path           | string  | -             | Query file path                         |
| --force-<stage>        | boolean | false         | Force re-run specific stage             |
| --only-<stage>         | boolean | false         | Only run specific stage                 |

Video Options

| Option                | Type    | Default  | Description                    |
| --------------------- | ------- | -------- | ------------------------------ |
| -w, --workspace-dir   | string  | required | Workspace directory            |
| --srt                 | boolean | false    | Burn subtitles into video      |
| --transition          | string  | wipeleft | Transition type between slides |
| --transition-duration | string  | 0.8      | Transition duration in seconds |
| --force               | boolean | false    | Force regenerate video         |

Transition Types

Supported transition types:

none - No transition
fade, fadeblack, fadewhite
wipeleft, wiperight, wipeup, wipedown
slideleft, slideright, slideup, slidedown
circleopen, circleclose
dissolve, pixelize, radial
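A wrapper script can reject unsupported names before calling kit-video. The sketch below hard-codes the names from the list above; `is_supported_transition` is a hypothetical helper name, and how kit-video itself reacts to an unknown name is not covered here.

```shell
# Hypothetical helper (not part of kit-video): succeed (exit status 0) only
# for the transition names listed in this README.
is_supported_transition() {
  case "$1" in
    none|fade|fadeblack|fadewhite) return 0 ;;
    wipeleft|wiperight|wipeup|wipedown) return 0 ;;
    slideleft|slideright|slideup|slidedown) return 0 ;;
    circleopen|circleclose) return 0 ;;
    dissolve|pixelize|radial) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, `is_supported_transition fade && kit-video video -w /path/to/workspace --transition fade` only runs the video stage when the name is on the list.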

Requirements

Node.js >= 24.0.0
ffmpeg and ffprobe (must be installed and available in PATH)
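These requirements can be checked from a script before the first run. The following is a minimal sketch; `node_version_ok` and `missing_tools` are hypothetical helper names, and the version check assumes a version string of the form printed by `node --version` (for example v24.1.0).

```shell
# Hypothetical helpers (not part of kit-video).

# Succeed when a version string such as "v24.1.0" has major version >= 24.
node_version_ok() {
  local v="${1#v}"          # strip the leading "v"
  local major="${v%%.*}"    # keep everything before the first dot
  [ "$major" -ge 24 ]
}

# Print each named tool that cannot be found in PATH, one per line.
missing_tools() {
  local t
  for t in "$@"; do
    command -v "$t" >/dev/null 2>&1 || printf '%s\n' "$t"
  done
}
```

Typical use would be `node_version_ok "$(node --version)"` together with `missing_tools ffmpeg ffprobe`, which prints nothing when both tools are available.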
