@optima-chat/video-translate-tools
v1.0.8
Path-B styled subtitle rendering + BGM ducking for HeyGen-translated videos.
Pairs with the gen video-translate CLI from @optima-chat/optima-gen-cli.

    gen video-translate --video-url $URL --lang $LANG -o ./out
    curl <caption_url> -o ./out/caption.srt
    video-translate render-ass --srt ./out/caption.srt --lang $TAG --out ./out/subs.ass
    video-translate mux --raw ./out/translate_*.mp4 --ass ./out/subs.ass [--bgm ./bgm.wav] -o ./final.mp4

Install (in optima-ai-shell Dockerfile)

    RUN apt-get install -y fonts-noto-core && \
        npm install -g @optima-chat/video-translate-tools@latest && \
        cp -r $(video-translate fonts-dir)/. /usr/share/fonts/video-translate/ && \
        fc-cache -f

CLI Reference
render-ass
Convert an SRT file into path-B styled ASS subtitles.
| Flag | Required | Description |
|---|---|---|
| --srt <path> | ✅ | HeyGen SRT file (BOM auto-stripped) |
| --lang <en\|ms\|vi\|th> | ✅ | Target lang. Determines font + max chars per line |
| --out <path> | ✅ | Output ASS path |
| --translations <path> | ⬜ | Path to the intermediate translations JSON. If the file exists, it is read and used as-is; otherwise it is saved there. You can hand-edit it and wrap a word as **word** to give it the pink keyword (KW) highlight |
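
The hand-edit described for `--translations` can also be scripted. The sketch below is one possible way to do it, not part of the package: `mark_keyword` is a hypothetical helper, and because the layout of the JSON file is not documented here, it treats the file as plain text and wraps every occurrence of the word, wherever it appears.

```shell
# Hypothetical helper (not shipped with video-translate-tools):
# wrap every occurrence of a literal word in ** markers.
# Assumes the word contains no "/" and no sed regex metacharacters.
mark_keyword() {
  word=$1
  file=$2
  # Write to a temp file first so the original is replaced only on success.
  tmp=$(mktemp) || return 1
  sed "s/${word}/**${word}**/g" "$file" > "$tmp" && mv "$tmp" "$file"
}

# Example (assumed file name):
#   mark_keyword "SALE" ./out/translations.json
```

Running it twice on the same word would nest the markers, so review the file before passing it back to `render-ass`.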
mux
Burn ASS onto HeyGen mp4. Optional BGM ducking.
| Flag | Required | Description |
|---|---|---|
| --raw <path> | ✅ | HeyGen-output mp4 (has lip-synced voice) |
| --ass <path> | ✅ | ASS subtitle file |
| --out <path> | ✅ | Final mp4 |
| --bgm <path> | ⬜ | If set, mix BGM under voice via asplit sidechain ducking |
| --fonts-dir <path> | ⬜ | Override fonts dir (defaults to bundled packages/video-translate-tools/fonts/) |
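
To give a rough idea of what "asplit sidechain ducking" can look like in ffmpeg terms, the sketch below prints a filter graph that burns the ASS file onto the video and compresses the BGM whenever the voice is present. The function name, stream labels, and compressor settings are assumptions for illustration, not the values `mux` actually uses. It also assumes paths that need no escaping inside filter arguments.

```shell
# Sketch only: print an ffmpeg -filter_complex graph.
# Input 0 is the HeyGen mp4 (video + voice), input 1 is the BGM.
build_filtergraph() {
  ass=$1     # path to the ASS subtitle file
  fonts=$2   # directory holding the subtitle fonts
  # Burn subtitles onto the video stream.
  printf '%s;' "[0:v]ass=${ass}:fontsdir=${fonts}[v]"
  # Split the voice: one copy for the mix, one as the sidechain key.
  printf '%s;' "[0:a]asplit=2[voice][key]"
  # Lower the BGM while the key (voice) is above the threshold.
  printf '%s;' "[1:a][key]sidechaincompress=threshold=0.05:ratio=8:attack=20:release=300[ducked]"
  # Mix voice and ducked BGM; stop when the voice track ends.
  printf '%s\n' "[voice][ducked]amix=inputs=2:duration=first[a]"
}

# Example invocation (assumed file names):
#   ffmpeg -i raw.mp4 -i bgm.wav \
#     -filter_complex "$(build_filtergraph subs.ass ./fonts)" \
#     -map '[v]' -map '[a]' final.mp4
```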
fonts-dir
Print the absolute path of the bundled fonts directory. The Dockerfile uses it to register the fonts.
Languages
| Tag | Font | Max chars/line | Source |
|---|---|---|---|
| en | Bangers | 32 | npm bundle (this pkg) |
| ms | Bangers | 32 | npm bundle (this pkg) |
| vi | Noto Sans | 30 | apt-get install fonts-noto-core |
| th | Sarabun | 42 | npm bundle (this pkg) |
Thai uses 42 chars/line because Thai has no spaces between words — smartSplit would otherwise cut mid-word.
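
The effect of the per-language limit can be sketched with standard tools. The limits below are copied from the table above. `wrap_line` is a hypothetical stand-in that uses `fold -s`, since the real smartSplit algorithm is not documented here. It breaks at the last space before the limit and hard-cuts when there is no space, which is the mid-word cut the higher Thai limit is meant to avoid. Some `fold` implementations count bytes rather than characters, so treat this as an ASCII-only demonstration.

```shell
# Max chars per line for each supported tag (values from the Languages table).
max_chars() {
  case $1 in
    en|ms) echo 32 ;;
    vi)    echo 30 ;;
    th)    echo 42 ;;
    *)     echo "unknown lang: $1" >&2; return 1 ;;
  esac
}

# Hypothetical stand-in for smartSplit: reads text on stdin and wraps it
# at the language's limit, preferring to break at a space.
wrap_line() {
  limit=$(max_chars "$1") || return 1
  fold -s -w "$limit"
}

# Example:
#   printf '%s\n' 'one two three four five six seven eight' | wrap_line vi
```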
System deps
- Node ≥ 20
- ffmpeg with libass (`apt-get install ffmpeg` on Ubuntu)
- For Vietnamese subs: `fonts-noto-core` apt pkg (standalone `Noto Sans` family)
- For Thai subs: bundled in this npm package (Sarabun-Regular.ttf)
Note: fonts-noto-cjk does NOT provide standalone Noto Sans — it ships only Noto Sans CJK * families. Use fonts-noto-core.
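
The dependencies above can be checked before running the tools. The helpers below are a minimal sketch with hypothetical names, and none of them ship with the package. Each one reads the output of a standard command, so it can be tried without the tool installed. The font check requires an exact family match, so `Noto Sans CJK JP` does not count as `Noto Sans`.

```shell
# Node >= 20: pass the output of `node --version`, e.g. "v20.11.0".
node_ok() {
  major=${1#v}        # drop the leading "v"
  major=${major%%.*}  # keep the part before the first dot
  [ "$major" -ge 20 ]
}

# libass: reads `ffmpeg -hide_banner -filters` on stdin and succeeds
# if a filter named exactly "ass" appears in the second column.
has_ass_filter() {
  awk '$2 == "ass" { found = 1 } END { exit !found }'
}

# Fonts: reads `fc-list : family` on stdin and succeeds only when
# the given family name matches a listed family exactly.
has_family() {
  tr ',' '\n' | grep -Fxq "$1"
}

# Example:
#   node_ok "$(node --version)" || echo "need Node >= 20"
#   ffmpeg -hide_banner -filters | has_ass_filter || echo "ffmpeg lacks libass"
#   fc-list : family | has_family "Noto Sans" || echo "install fonts-noto-core"
```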
Scope
v1 ships subtitle rendering + BGM-ducked muxing only. Voice override (custom voice selection from HeyGen library) is deferred to v2 — default flow uses HeyGen auto-clone of original speaker. See SPEC v3.1 §Scope.
License
MIT
