@mulmocast/slide v0.6.3 — Convert presentations (Keynote, PowerPoint, PDF, Marp) and videos to MulmoScript format.
# MulmoCast-Slides
Convert presentations (PPTX, PDF, Markdown, Keynote) and videos into narrated movies or interactive web viewer bundles.
| Input | Output | Command |
|-------|--------|---------|
| PPTX / PDF / Markdown / Keynote | Narrated video (.mp4) | `mulmo-slide movie <file> -g -l ja` |
| PPTX / PDF / Markdown / Keynote | Web viewer bundle | `mulmo-slide bundle <file> -g -l ja` |
| Video (.mp4, .webm, etc.) | Transcribed & translated bundle | `mulmo-slide transcribe <file>` |
```bash
# Generate a narrated video from slides
mulmo-slide movie presentation.pptx -g -l ja

# Generate a web viewer bundle and preview in browser
mulmo-slide bundle presentation.pptx -g -l ja
mulmo-slide preview
```

## Overview
MulmoCast-Slides provides converters that extract slides and speaker notes from various presentation formats (Keynote, PowerPoint, PDF, etc.) and generate MulmoScript JSON files. Each slide is exported as an image paired with its speaker notes.
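Conceptually, every converter reduces to the same step: pair each slide image with its speaker notes and wrap the result in a MulmoScript envelope. A minimal illustrative sketch of that pairing (not the actual converter code — the real converters also handle rendering and text extraction):

```javascript
// Illustrative sketch: how a converter pairs slide images with speaker notes.
// The real converters do more work; this only shows the output shape.
function toMulmoScript(slides) {
  return {
    $mulmocast: { version: "1.1", credit: "closing" },
    beats: slides.map(({ imagePath, notes }) => ({
      text: notes ?? "",
      image: { type: "image", source: { kind: "path", path: imagePath } },
    })),
  };
}

const script = toMulmoScript([
  { imagePath: "/tmp/deck/slide-0.png", notes: "Welcome to the talk." },
  { imagePath: "/tmp/deck/slide-1.png" }, // slide without speaker notes
]);
console.log(script.beats.length); // 2
```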
## System Requirements

### Node.js

- Node.js 22 or later
- yarn or npm

### macOS

```bash
# Required for PDF and PPTX conversion
brew install imagemagick ghostscript

# Required for PPTX conversion (converts PPTX to PDF)
brew install --cask libreoffice

# Required for Keynote conversion:
# the Keynote app must be installed from the App Store
```

### Linux (Ubuntu/Debian)

```bash
# Required for PDF and PPTX conversion
sudo apt-get update
sudo apt-get install -y imagemagick ghostscript

# Required for PPTX conversion
sudo apt-get install -y libreoffice
```

### Environment Variables

```bash
# Required for LLM narration generation (-g option)
export OPENAI_API_KEY=your-openai-api-key

# Optional: set the default language (en, ja, fr, de)
export MULMO_LANG=ja
```

### Feature Requirements Summary
| Feature | macOS | Linux | Required Tools |
|---------|-------|-------|----------------|
| Marp (.md) | Yes | Yes | Node.js only |
| PPTX (.pptx) | Yes | Yes | LibreOffice, ImageMagick, Ghostscript |
| PDF (.pdf) | Yes | Yes | ImageMagick, Ghostscript |
| Keynote (.key) | Yes | No | Keynote app, Python 3 |
| LLM Narration (`-g`) | Yes | Yes | `OPENAI_API_KEY` |
## Installation

### npm (Global Installation)

```bash
npm install -g @mulmocast/slide
```

After installation, use the `mulmo-slide` command:

```bash
mulmo-slide marp presentation.md
mulmo-slide pptx presentation.pptx
mulmo-slide pdf presentation.pdf
mulmo-slide movie presentation.pptx
```

### npx (No Installation)

```bash
npx @mulmocast/slide marp presentation.md
npx @mulmocast/slide pptx presentation.pptx -g -l ja
```

### Development Setup

```bash
git clone https://github.com/receptron/MulmoCast-Slides.git
cd MulmoCast-Slides
yarn install
yarn build   # Build TypeScript to lib/
```

### Running Sample Files
The samples/ directory contains example files for testing:
```bash
# Marp markdown
yarn marp samples/sample.md
yarn marp samples/custom_theme_demo.md --theme samples/custom-ocean.css

# PowerPoint
yarn pptx samples/omochikaeri.pptx
yarn pptx samples/omochikaeri.pptx -g -l ja   # with LLM narration

# PDF
yarn pdf samples/20251008_2.pdf
yarn pdf samples/20251008_2.pdf -g -l ja      # with LLM narration

# Keynote (macOS only)
yarn keynote samples/GraphAI.key

# Generate movie from sample
yarn movie samples/omochikaeri.pptx -g -l ja

# Generate bundle from sample
yarn bundle samples/sample.md -g -l ja
```

Note: when re-running commands, the existing MulmoScript JSON is reused. To regenerate, either:

- Delete the existing JSON file: `rm scripts/<basename>/<basename>.json`
- Or use the `-f` (force) flag with `movie`/`bundle`: `yarn movie samples/sample.pptx -f -g`
## Unified CLI

All commands are available through the unified `mulmo-slide` CLI:

```bash
mulmo-slide <command> [options]
```

Commands:

```
mulmo-slide convert <file>             Convert any presentation (auto-detect format)
mulmo-slide marp <file>                Convert Marp markdown to MulmoScript
mulmo-slide pptx <file>                Convert PowerPoint to MulmoScript
mulmo-slide pdf <file>                 Convert PDF to MulmoScript
mulmo-slide keynote <file>             Convert Keynote to MulmoScript (macOS only)
mulmo-slide transcribe <file>          Transcribe video to MulmoScript with translations and TTS
mulmo-slide movie <file>               Generate movie from presentation
mulmo-slide bundle <file>              Generate MulmoViewer bundle from presentation
mulmo-slide narrate <file>             Generate narrated ExtendedMulmoScript (full pipeline)
mulmo-slide extend init                Install Claude Code skills (/narrate, /extend)
mulmo-slide extend validate            Validate ExtendedMulmoScript JSON against schema
mulmo-slide extend scaffold            Create ExtendedMulmoScript skeleton from MulmoScript
mulmo-slide parse-md <file>            Parse markdown structure for LLM presentation planning
mulmo-slide assemble-extended <file>   Assemble ExtendedMulmoScript from presentation plan
```

The `convert` command auto-detects the file format by extension (.pptx, .md, .key, .pdf, .mp4, .mov, .mkv, .webm, .avi).
For development, you can also use yarn commands:

```bash
yarn cli marp presentation.md
yarn marp presentation.md   # shorthand
```

## MulmoScript Format
MulmoScript is a JSON-based format that combines images with text for multimedia presentations. It supports multiple image formats:
### PNG Image Format

```json
{
"$mulmocast": {
"version": "1.1",
"credit": "closing"
},
"beats": [
{
"text": "Speaker notes or narration text",
"image": {
"type": "image",
"source": {
"kind": "path",
"path": "/absolute/path/to/slide.png"
}
}
}
]
}
```

### Markdown Format

```json
{
"$mulmocast": {
"version": "1.1",
"credit": "closing"
},
"beats": [
{
"text": "Speaker notes or narration text",
"image": {
"type": "markdown",
"markdown": [
"# Slide Title",
"- Bullet point 1",
"- Bullet point 2"
]
}
}
]
}
```

## Available Tools
### Keynote Extractor

Extracts slides and speaker notes from Apple Keynote presentations.

Usage:

```bash
# CLI
mulmo-slide keynote path/to/presentation.key

# yarn (development)
yarn keynote path/to/presentation.key

# Test with sample
yarn test:keynote
```

Requirements:

- macOS
- Keynote installed
- Python 3

Output:

- `scripts/<basename>/images/` — PNG images of each slide
- `scripts/<basename>/<basename>.json` — MulmoScript JSON file
### Marp Extractor

Extracts slides and speaker notes from Marp markdown presentations, generating both PNG and Markdown formats.

Usage:

```bash
# CLI
mulmo-slide marp path/to/presentation.md
mulmo-slide marp path/to/presentation.md -g -l en   # with LLM narration

# yarn (development)
yarn marp path/to/presentation.md
yarn marp path/to/presentation.md -g -l en

# Test with sample
yarn test:marp
```

Options:

- `-l, --lang` — Language for the MulmoScript (en, ja, fr, de)
- `-g, --generate-text` — Generate narration text using OpenAI LLM
- `--theme` — Path to a custom theme CSS file
- `--allow-local-files` — Allow local file access in Marp

Requirements:

- Node.js
- @marp-team/marp-cli
- Puppeteer (installed automatically)
- OpenAI API key (for the `-g` option)

Output:

- `scripts/<basename>/images/` — PNG images of each slide
- `scripts/<basename>/<basename>.json` — MulmoScript JSON file (PNG format)
- `scripts/<basename>/<basename>-markdown.json` — MulmoScript JSON file (Markdown format)

Features:

- Extracts speaker notes from HTML comments (`<!-- note text -->`)
- Generates both PNG images and structured Markdown output
- Preserves slide formatting and structure
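The notes-from-comments idea can be sketched with a regex. This is illustrative only; the real extractor goes through Marp's own parsing:

```javascript
// Illustrative sketch: pulling speaker notes out of HTML comments in a slide.
// The real extractor uses Marp's parser; this regex version only shows the idea.
function extractNotes(slideMarkdown) {
  const matches = [...slideMarkdown.matchAll(/<!--([\s\S]*?)-->/g)];
  return matches.map((m) => m[1].trim()).join("\n");
}

const slide = "# Title\n- point\n<!-- Remember to mention the demo -->";
console.log(extractNotes(slide)); // "Remember to mention the demo"
```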
### Markdown Converter

Converts plain Markdown files to MulmoScript format with flexible separator options and plugin support.

Usage:

```bash
# CLI
mulmo-slide markdown path/to/document.md
mulmo-slide markdown path/to/document.md -g -l ja               # with LLM narration
mulmo-slide markdown path/to/document.md -s heading-2           # split by ## headings
mulmo-slide markdown path/to/document.md --mermaid --directive  # with plugins

# yarn (development)
yarn markdown path/to/document.md
yarn markdown path/to/document.md -s heading --mermaid --style corporate-blue
```

Options:

- `-l, --lang` — Language for the MulmoScript (en, ja, fr, de)
- `-g, --generate-text` — Generate narration text using LLM
- `-s, --separator` — Slide separator mode (see below)
- `--mermaid` — Convert mermaid code blocks to the mermaid beat type
- `--directive` — Remove Marp-style directives
- `--layout` — Auto-detect layout based on content (see below)
- `--style` — Markdown slide style (e.g., corporate-blue, finance-green)
Separator Modes:

| Mode | Description | Example |
|------|-------------|---------|
| `horizontal-rule` | Split by `---` (default) | Marp, reveal.js style |
| `heading` | Split by any heading | `#`, `##`, `###` |
| `heading-1` | Split by `#` only | Top-level sections |
| `heading-2` | Split by `##` only | Second-level sections |
| `heading-3` | Split by `###` only | Third-level sections |
| `blank-lines` | Split by 3+ blank lines | Simple documents |
| `comment` | Split by `<!-- slide -->` | HTML-compatible |
| `page-break` | Split by `<!-- pagebreak -->` | Print-style documents |
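Two of the modes above can be sketched with simple regex splits. This is a simplification (the real splitter also handles front matter, code fences, and the remaining modes):

```javascript
// Illustrative sketch of two separator modes; simplified on purpose.
function splitSlides(md, mode) {
  if (mode === "horizontal-rule") {
    return md.split(/^---\s*$/m).map((s) => s.trim()).filter(Boolean);
  }
  if (mode === "heading-2") {
    // Lookahead split keeps each ## line with the section that follows it.
    return md.split(/^(?=## )/m).map((s) => s.trim()).filter(Boolean);
  }
  throw new Error(`unsupported mode: ${mode}`);
}

const doc = "intro\n---\n## A\ntext\n## B\nmore";
console.log(splitSlides(doc, "horizontal-rule").length); // 2
```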
Layout Auto-Detection (`--layout`):

When enabled, the converter automatically detects the best layout based on content.
Detection rules are evaluated in order (first match wins).

Phase 1: Header Detection (H1)

If the markdown contains an H1 heading (`# Title`), it becomes the header and the remaining content is analyzed:

| Content Pattern | Layout | Conditions |
|-----------------|--------|------------|
| H1 only | (no layout) | Only H1, no other content → default markdown (array) |
| H1 + unstructured content | header+content | H1 + text without H2/H3 → `{ header, content: [...] }` |
| H1 + structured content | header+row-2 or header+2x2 | H1 + content matching row-2/2x2 → `{ header, "row-2": [...] }` |

Phase 2: Content Layout Rules (no H1, or applied to content after H1)
| # | Content Pattern | Layout | Conditions |
|---|-----------------|--------|------------|
| 1 | Single code block + text | row-2 | Exactly 1 code block, text content > 20 chars → [text, code] |
| 2 | Single image + text | row-2 | Exactly 1 image (`![]()`), text content > 20 chars → [text, image] |
| 3 | 4+ H3 sections (short) | 2x2 | 4 or more `###` headings, avg content < 200 chars → first 4 sections |
| 4 | 4+ H2 sections (short) | 2x2 | 4 or more `##` headings, avg content < 200 chars → first 4 sections |
| 5 | 4+ H2 sections (long) | row-2 | 4 or more `##` headings, avg content >= 200 chars → first 2 sections |
| 6 | 2+ H2 sections | row-2 | 2 or more `##` headings → first 2 sections |
| 7 | Otherwise | default | No layout applied |

Notes:
- "Meaningful text" = text without headings > 20 characters
- Multiple code blocks or images → no layout detected
- H3 has no fallback (only 2x2 if short, otherwise no layout)
- H1 always becomes header; remaining content is analyzed for structure
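The H2-counting rules above can be sketched in a few lines. This covers only the H2 rules (4–6) under simplified assumptions; the real detector also handles code blocks, images, H1 headers, and H3 sections:

```javascript
// Illustrative sketch of the H2 layout rules (4-6) only; simplified.
function detectH2Layout(md) {
  const sections = md.split(/^(?=## )/m).filter((s) => s.startsWith("## "));
  if (sections.length < 2) return null; // rule 7: no layout
  const avg =
    sections.reduce((sum, s) => sum + s.replace(/^## .*$/m, "").trim().length, 0) /
    sections.length;
  if (sections.length >= 4 && avg < 200) return "2x2"; // rule 4: four short sections
  return "row-2";                                      // rules 5-6
}

const four = "## A\nx\n## B\nx\n## C\nx\n## D\nx";
console.log(detectH2Layout(four)); // "2x2"
```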
Example:

```bash
# Auto-detect layout for better visual presentation
mulmo-slide markdown document.md --layout --style corporate-blue
```

Output:

- `scripts/<basename>/<basename>.json` — MulmoScript JSON file (Markdown format)
### PowerPoint (PPTX) Converter

Converts PowerPoint presentations to MulmoScript format with high-quality PNG exports.

Usage:

```bash
# CLI
mulmo-slide pptx path/to/presentation.pptx
mulmo-slide pptx path/to/presentation.pptx -g -l ja   # with LLM narration

# yarn (development)
yarn pptx path/to/presentation.pptx
yarn pptx path/to/presentation.pptx -g -l ja
```

Options:

- `-l, --lang` — Language for the MulmoScript (en, ja, fr, de)
- `-g, --generate-text` — Generate narration text using OpenAI LLM

Requirements:

- Node.js
- LibreOffice (used for PPTX to PDF conversion)
- ImageMagick (for high-quality PNG export with antialiasing)
- OpenAI API key (for the `-g` option)

Output:

- `scripts/<basename>/` — Directory named after the input file
- `scripts/<basename>/images/<basename>-0.png, -1.png, ...` — PNG images of each slide
- `scripts/<basename>/<basename>.json` — MulmoScript JSON file
### PDF Converter

Converts PDF files to MulmoScript format with high-quality PNG exports.

Usage:

```bash
# CLI
mulmo-slide pdf path/to/presentation.pdf
mulmo-slide pdf path/to/presentation.pdf -g -l ja   # with LLM narration

# yarn (development)
yarn pdf path/to/presentation.pdf
yarn pdf path/to/presentation.pdf -g -l ja
```

Options:

- `-l, --lang` — Language for the MulmoScript (en, ja, fr, de)
- `-g, --generate-text` — Generate narration text using OpenAI LLM

Requirements:

- Node.js
- ImageMagick (for high-quality PNG export with antialiasing)
- OpenAI API key (for the `-g` option)

Output:

- `scripts/<basename>/` — Directory named after the input file
- `scripts/<basename>/images/<basename>-0.png, -1.png, ...` — PNG images of each page
- `scripts/<basename>/<basename>.json` — MulmoScript JSON file
### Movie Generation

Generate a movie directly from any supported presentation format.

Usage:

```bash
# CLI
mulmo-slide movie path/to/presentation.pptx
mulmo-slide movie path/to/presentation.pdf
mulmo-slide movie path/to/presentation.md
mulmo-slide movie path/to/presentation.key                # macOS only
mulmo-slide movie path/to/presentation.pptx -f -g -l ja   # force regenerate with LLM in Japanese

# yarn (development)
yarn movie path/to/presentation.pptx
yarn movie path/to/presentation.pptx -f -g -l ja
```

Options:

- `-l, --lang` — Language for the MulmoScript (en, ja, fr, de)
- `-f, --force` — Force regenerate the MulmoScript (default: use existing if available)
- `-g, --generate-text` — Generate narration text using OpenAI LLM (only when generating)

This command:

- Converts the presentation to MulmoScript format (or uses the existing one)
- Generates audio and images using mulmocast
- Creates the final movie

Output:

- `output/<basename>/` — Movie and related files
### Bundle Generation

Generate a MulmoViewer bundle directly from any supported presentation format.

Usage:

```bash
# CLI
mulmo-slide bundle path/to/presentation.pptx
mulmo-slide bundle path/to/presentation.pdf
mulmo-slide bundle path/to/presentation.md
mulmo-slide bundle path/to/presentation.key                # macOS only
mulmo-slide bundle path/to/presentation.pptx -f -g -l ja   # force regenerate with LLM in Japanese

# yarn (development)
yarn bundle path/to/presentation.pptx
yarn bundle path/to/presentation.pptx -f -g -l ja
```

Options:

- `-l, --lang` — Language for the MulmoScript (en, ja, fr, de)
- `-f, --force` — Force regenerate the MulmoScript (default: use existing if available)
- `-g, --generate-text` — Generate narration text using OpenAI LLM (only when generating)

This command:

- Converts the presentation to MulmoScript format (or uses the existing one)
- Translates content to multiple languages (ja, en)
- Generates audio and images
- Creates a bundle for MulmoViewer (skipZip mode)

Output:

- `output/<basename>/` — Bundle files for MulmoViewer
### Previewing Bundles (Web Viewer)

Preview generated bundles in the browser using the built-in Vue 3 web viewer.

```bash
# Production preview (npm global install)
mulmo-slide preview
mulmo-slide preview 8080   # custom port

# Development mode (hot reload)
yarn dev
```

Opens http://localhost:3000 and automatically discovers bundles from the `output/` directory.

Features:

- Slide display with narration playback
- Audio / text language switching (multilingual support)
- Recording mode: record via microphone → Whisper transcription → edit text → save
- AI Q&A chat: ask questions about the presentation content (GPT-4o-mini, requires `VITE_OPENAI_API_KEY` in `.env`)
## Language Setting
All converters support setting the language for the generated MulmoScript.
Supported languages: en (English), ja (Japanese), fr (French), de (German)
Priority: CLI option > Environment variable > Default (en)
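That precedence can be sketched in a few lines. The fallback to "en" for unsupported values is an assumption of this sketch, not documented behavior:

```javascript
// Illustrative sketch of the documented precedence:
// CLI option > MULMO_LANG environment variable > default "en".
const SUPPORTED = ["en", "ja", "fr", "de"];

function resolveLang(cliLang, env = process.env) {
  const candidate = cliLang ?? env.MULMO_LANG ?? "en";
  // Assumption: fall back to "en" when the value is not a supported language.
  return SUPPORTED.includes(candidate) ? candidate : "en";
}

console.log(resolveLang("ja", {}));                        // "ja" (CLI wins)
console.log(resolveLang(undefined, { MULMO_LANG: "fr" })); // "fr"
console.log(resolveLang(undefined, {}));                   // "en" (default)
```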
CLI option:

```bash
mulmo-slide pptx presentation.pptx -l ja
mulmo-slide marp presentation.md --lang fr
mulmo-slide keynote presentation.key -l de
```

Environment variable:

```bash
export MULMO_LANG=ja
yarn pptx presentation.pptx
```

## LLM Text Generation
Generate narration text for each slide using OpenAI's GPT-4o model.
Usage:

```bash
# PPTX: uses slide images with the Vision API
mulmo-slide pptx presentation.pptx -g -l ja

# PDF: uses page images with the Vision API
mulmo-slide pdf presentation.pdf -g -l ja

# Marp: uses the markdown content
mulmo-slide marp presentation.md -g -l en

# Bundle/Movie: use with -f to regenerate
mulmo-slide bundle presentation.pptx -f -g
```

Requirements:

- The `OPENAI_API_KEY` environment variable must be set

How it works:

- For PPTX/PDF: converts slides/pages to images and uses the OpenAI Vision API to understand the content
- For Marp: uses the markdown content directly
- The LLM considers the overall presentation structure to generate contextual narration
- Output is in the specified language (`-l` option)
## Narrate CLI
Generate a narrated ExtendedMulmoScript from any supported source file in one command. This automates the full pipeline: conversion to MulmoScript, LLM-based narration and metadata generation, and validation.
Usage:

```bash
# Full pipeline (requires OPENAI_API_KEY)
mulmo-slide narrate paper.pdf
mulmo-slide narrate slides.pptx -l ja
mulmo-slide narrate document.md --mermaid -s heading

# Scaffold only (no LLM, for Claude Code handoff)
mulmo-slide narrate paper.pdf --scaffold-only

# yarn (development)
yarn narrate samples/sample.pdf -l ja
yarn narrate samples/sample.pdf --scaffold-only
```

Options:

- `-l, --lang` — Language for the narration (en, ja, fr, de)
- `--scaffold-only` — Only create the ExtendedMulmoScript skeleton (no LLM). Useful as preparation for Claude Code `/narrate` analysis
- `-f, --force` — Force regenerate the MulmoScript even if it exists
- `-s, --separator` — Slide separator mode (for Markdown files)
- `--mermaid` — Convert mermaid code blocks (for Markdown files)

Output: `scripts/{basename}/extended_script.json`
## Extend Scaffold

Create an ExtendedMulmoScript skeleton from an existing MulmoScript without any LLM calls. This adds beat IDs, empty metadata fields, and imports the extracted texts as notes.

```bash
mulmo-slide extend scaffold scripts/<basename>/<basename>.json

# yarn (development)
yarn cli extend scaffold scripts/<basename>/<basename>.json
```

## Markdown to ExtendedMulmoScript (LLM-assisted)
Convert a structured markdown document into an ExtendedMulmoScript with intelligent beat allocation, narration, and metadata. This is a multi-step process using the `/md-to-mulmo` Claude Code skill.

### Pipeline Overview

1. Parse (`parse-md`): Extract the document structure and generate JSON Schemas
2. Plan (LLM via the `/md-to-mulmo` skill): Create a presentation plan with beat allocation
3. Assemble (`assemble-extended`): Convert the plan to an ExtendedMulmoScript with variants

### Usage

```bash
# Step 1: Parse markdown and generate schemas
mulmo-slide parse-md path/to/document.md

# Step 2: Use the /md-to-mulmo skill in Claude Code (creates presentation_plan.json)

# Step 3: Assemble the ExtendedMulmoScript from the plan
mulmo-slide assemble-extended scripts/{basename}/presentation_plan.json

# Step 4: Generate a MulmoScript from the ExtendedMulmoScript
npx mulmocast-preprocessor scripts/{basename}/extended_script.json -o scripts/{basename}/{basename}.json
```

Output of `parse-md`:

- `scripts/{basename}/parsed_structure.json` — structured markdown sections
- `scripts/{basename}/extended-script.schema.json` — ExtendedMulmoScript JSON Schema
- `scripts/{basename}/presentation-plan.schema.json` — intermediate plan schema

Output of `assemble-extended`:

- `scripts/{basename}/extended_script.json` — validated ExtendedMulmoScript with output profiles

### Setup

Install the Claude Code skill:

```bash
mulmo-slide extend init
```

Then use `/md-to-mulmo path/to/document.md` in Claude Code.
## Narrate: Source File to Narrated Video (Claude Code Skill)

Tip: if you don't need interactive analysis and want a fully automated pipeline, use the Narrate CLI instead (`mulmo-slide narrate <file>`). It runs the full pipeline with OpenAI GPT-4o without requiring Claude Code.

The `/narrate` skill converts any supported source file into a narrated ExtendedMulmoScript in one step. It handles the full pipeline: conversion, narration generation, metadata, and validation.

### Setup

Install the Claude Code skills into your project:

```bash
# If installed globally
mulmo-slide extend init

# With npx
npx @mulmocast/slide extend init

# Development
yarn cli extend init
```

This copies the skill files to `.claude/skills/` in your project directory.

### Usage

In Claude Code, use the `/narrate` command with any supported source file:

```
/narrate your-paper.pdf
/narrate your-slides.pptx
/narrate your-slides.md
/narrate your-slides.key
```

The skill automatically:

- Converts the source file to MulmoScript (slide images + text extraction)
- Generates AI narration for each slide
- Adds metadata (keywords, sections, context, FAQ)
- Validates the output
- Shows you the next steps

Output: `scripts/{basename}/extended_script.json`
### After /narrate: Next Steps

```bash
# Query the content interactively
npx mulmocast-preprocessor query scripts/{basename}/extended_script.json -i

# Generate a summary
npx mulmocast-preprocessor summarize scripts/{basename}/extended_script.json

# Generate a narrated video
npx mulmocast-preprocessor scripts/{basename}/extended_script.json -o scripts/{basename}/{basename}.json
npx mulmo movie scripts/{basename}/{basename}.json
```

### Validating ExtendedMulmoScript

Validate an ExtendedMulmoScript JSON file against the schema:

```bash
mulmo-slide extend validate scripts/simple_text/extended_script.json

# Development
yarn cli extend validate scripts/simple_text/extended_script.json
```

Outputs the beat count, scriptMeta presence, meta coverage percentage, and the sections found.
### Low-level: /extend

If you already have a MulmoScript and just want to add metadata:

```
/extend scripts/simple_text/simple_text.json
```
## Output Structure
This tool generates files in two directories: `scripts/` for intermediate data (MulmoScript, metadata) and `output/` for final deliverables (movies, bundles). The `<basename>` is derived from the input filename without its extension (e.g., presentation.pptx → presentation).
### `scripts/<basename>/` — Intermediate Files

Generated by conversion commands (`convert`, `marp`, `pptx`, `pdf`, `keynote`, `markdown`, `transcribe`) and narration/metadata commands (`narrate`, `extend`, `parse-md`, `assemble-extended`).
#### Conversion Output

| File | Generated by | Description |
|------|-------------|-------------|
| `<basename>.json` | All converters | MulmoScript — the core JSON format pairing slides with narration text |
| `<basename>-markdown.json` | marp only | MulmoScript using Markdown content instead of PNG images |
| `images/` | marp, pptx, pdf, keynote | Slide images as PNG files |
| `images/<basename>-0.png, -1.png, ...` | pptx, pdf, keynote | One PNG per slide/page (zero-indexed) |
| `images/images.001.png, .002.png, ...` | marp | One PNG per slide (one-indexed, Marp naming convention) |
| `<basename>.pdf` | pptx | Intermediate PDF created by LibreOffice during PPTX→PNG conversion |
| `extracted_texts.json` | pdf | Array of raw text extracted from each PDF page (used by /extend and narrate for metadata) |
#### Video Transcription Output (`transcribe` / `convert` with video files)

The video transcription pipeline splits a video by silence detection, transcribes each segment with OpenAI Whisper, and optionally generates a multi-language bundle.

| File | Description |
|------|-------------|
| `<basename>.json` | MulmoScript with `type: "movie"` beats referencing video segments |
| `1.mp4, 2.mp4, ...` | Split video segments (one per detected segment) |
| `1.mp3, 2.mp3, ...` | Extracted audio from each video segment (used for Whisper transcription) |
| `1.jpg, 2.jpg, ...` | Thumbnail images from each video segment (first frame) |
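The split-by-silence step can be pictured as turning silence intervals into speech segments. The sketch below assumes ffmpeg's `silencedetect` filter log format; whether the real pipeline uses exactly this filter and format is an assumption:

```javascript
// Illustrative sketch: deriving speech segments from ffmpeg `silencedetect`
// log lines. Only demonstrates the split-by-silence idea, not the real pipeline.
function segmentsFromSilence(log, totalDuration) {
  const starts = [...log.matchAll(/silence_start: ([\d.]+)/g)].map((m) => +m[1]);
  const ends = [...log.matchAll(/silence_end: ([\d.]+)/g)].map((m) => +m[1]);
  const segments = [];
  let cursor = 0;
  for (let i = 0; i < starts.length; i++) {
    if (starts[i] > cursor) segments.push({ start: cursor, end: starts[i] });
    cursor = ends[i] ?? totalDuration; // resume after the silence
  }
  if (cursor < totalDuration) segments.push({ start: cursor, end: totalDuration });
  return segments;
}

const log = "silence_start: 4.2\nsilence_end: 5.0\nsilence_start: 9.8\nsilence_end: 10.5";
console.log(segmentsFromSilence(log, 15)); // three speech segments
```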
When bundle generation is enabled (default), additional files are created in `output/<basename>/<basename>/`:

| File | Description |
|------|-------------|
| `mulmo_view.json` | Viewer data with multi-language text and audio references |
| `1.mp4, 2.mp4, ...` | Video segments (copied from scripts) |
| `1.mp3, 2.mp3, ...` | Source-language audio (copied from scripts) |
| `1.jpg, 2.jpg, ...` | Thumbnails (copied from scripts) |
| `<N>_<lang>.mp3` | TTS audio for translated text (e.g., 1_ja.mp3, 2_ja.mp3) |
#### Narration & Metadata Output

| File | Generated by | Description |
|------|-------------|-------------|
| `extended_script.json` | narrate, extend scaffold, assemble-extended | ExtendedMulmoScript — MulmoScript enriched with scriptMeta, beats[].meta, output profiles, and variants |
| `analysis.json` | narrate (LLM mode) | LLM analysis of slide content used during narration generation |
#### Markdown-to-ExtendedMulmoScript Pipeline (`parse-md` → `/md-to-mulmo` → `assemble-extended`)

| File | Generated by | Description |
|------|-------------|-------------|
| `parsed_structure.json` | parse-md | Structured representation of the markdown document (sections, elements, hierarchy) |
| `extended-script.schema.json` | parse-md | JSON Schema for ExtendedMulmoScript (generated from Zod, for LLM reference) |
| `presentation-plan.schema.json` | parse-md | JSON Schema for the intermediate presentation plan format |
| `presentation_plan.json` | /md-to-mulmo skill (LLM) | Presentation plan with beat allocation, narration, and core/optional flags |
### `output/<basename>/` — Final Deliverables

Generated by the `movie`, `bundle`, and `publish` commands.

#### Top-Level Files (generated by the mulmocast library)

| File | Generated by | Description |
|------|-------------|-------------|
| `<basename>_<lang>.mp4` | movie | Final narrated video (e.g., sample_ja.mp4) |
| `<basename>_<lang>.mp3` | movie, bundle | Concatenated audio for the entire presentation in one language |
| `<basename>_studio.json` | movie, bundle | Studio data (timing, beat metadata) used internally by mulmocast |
| `<basename>_lang.json` | bundle | Multi-language translation data for the viewer |
#### `output/<basename>/<basename>/` — Bundle Directory

Contains all files needed for MulmoViewer. Generated by `bundle` (and used by `upload`/`publish`).

| File / Directory | Description |
|-----------------|-------------|
| `mulmo_view.json` | Viewer data — beats with audio/image references, metadata, output profiles. This is the main file the viewer reads |
| `images.001.png, .002.png, ...` | Slide images for the viewer |
| `beat-1.png, beat-2.png, ...` | Rendered slide images (for markdown-based presentations) |
| `mulmo_credit.png` | MulmoCast credit image (auto-generated) |
| `<basename>_<hash>_<lang>.mp3` | Per-beat audio files (the hash identifies the text content; lang is ja/en/etc.) |
| `silent300.mp3` | Short silent audio clip used for padding between beats |
#### `output/<basename>/audio/<basename>/` and `output/<basename>/images/<basename>/`

Working directories used by mulmocast during generation. Their contents are the same audio and image files that end up in the bundle directory.
#### `extend merge` — Updates an Existing Bundle

The `extend merge` command reads `scripts/<basename>/extended_script.json` and merges its metadata (beat IDs, meta, scriptMeta, outputProfiles, variants) into the existing `output/<basename>/<basename>/mulmo_view.json`. No new files are created.
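The merge direction can be sketched as layering metadata onto the existing viewer data while leaving its other fields untouched. A simplified illustration (the real command's matching and field set may differ):

```javascript
// Illustrative sketch of the merge direction: metadata from extended_script.json
// is layered onto the existing mulmo_view.json. Simplified: beats are matched
// by position here; the real command may match differently.
function mergeMetadata(view, extended) {
  return {
    ...view,
    scriptMeta: extended.scriptMeta ?? view.scriptMeta,
    outputProfiles: extended.outputProfiles ?? view.outputProfiles,
    beats: view.beats.map((beat, i) => ({
      ...beat,
      id: extended.beats[i]?.id ?? beat.id,
      meta: extended.beats[i]?.meta ?? beat.meta,
    })),
  };
}

const view = { title: "Demo", beats: [{ text: "hi", audio: "1.mp3" }] };
const extended = {
  scriptMeta: { keywords: ["demo"] },
  beats: [{ id: "b1", meta: { section: "intro" } }],
};
console.log(mergeMetadata(view, extended).beats[0].id); // "b1"
```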
### Example: Presentation Pipeline

```bash
# 1. Convert PPTX → MulmoScript
mulmo-slide pptx presentation.pptx -g -l ja
# Files created:
#   scripts/presentation/presentation.json   (MulmoScript)
#   scripts/presentation/images/             (slide PNGs)
#   scripts/presentation/presentation.pdf    (intermediate)

# 2. Generate ExtendedMulmoScript with narration
mulmo-slide narrate presentation.pptx --scaffold-only
# Files created:
#   scripts/presentation/extended_script.json   (ExtendedMulmoScript)

# 3. Generate bundle for MulmoViewer
mulmo-slide bundle presentation.pptx -g -l ja
# Files created:
#   output/presentation/presentation/mulmo_view.json
#   output/presentation/presentation/*.mp3     (per-beat audio)
#   output/presentation/presentation/*.png     (slide images)
#   output/presentation/presentation_ja.mp3    (concatenated audio)
#   output/presentation/presentation_studio.json
```

### Example: Video Transcription Pipeline
```bash
# Transcribe video with Japanese translation
mulmo-slide transcribe talk.mp4 --target-langs=ja

# Files created in scripts/talk/:
#   talk.json               (MulmoScript with movie beats)
#   1.mp4, 2.mp4, ...       (split video segments)
#   1.mp3, 2.mp3, ...       (extracted audio per segment)
#   1.jpg, 2.jpg, ...       (thumbnail per segment)
#
# Files created in output/talk/talk/ (bundle):
#   mulmo_view.json         (viewer data with ja/en text + audio)
#   1.mp4, 2.mp4, ...       (video segments)
#   1_ja.mp3, 2_ja.mp3, ... (Japanese TTS audio)
```

## License

MIT
