@reaatech/media-pipeline-mcp-video-gen

v0.3.0

Video generation operations — text-to-video, image-to-video (provider delegation), frame extraction, audio extraction (ffmpeg)

Status: Pre-1.0 — APIs may change in minor versions. Pin to a specific version in production.

Video generation and processing — text-to-video and image-to-video via provider delegation, plus local ffmpeg-based frame extraction, audio extraction, subtitle generation with burn-in, loudness normalization, and video cropping.

Installation

npm install @reaatech/media-pipeline-mcp-video-gen
# or
pnpm add @reaatech/media-pipeline-mcp-video-gen

Requirements

ffmpeg must be installed for local video processing operations (extractFrames, extractAudio, FfmpegWrapper, and subtitle burn-in):

# macOS
brew install ffmpeg

# Ubuntu/Debian
sudo apt-get install ffmpeg

# Windows (Chocolatey)
choco install ffmpeg
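
The package exposes FfmpegWrapper.isAvailable() for this check. As an independent illustration, the sketch below shows one way to fail fast at startup using only Node's child_process module. The helper name hasFfmpeg is hypothetical and not part of this package, and the check assumes ffmpeg is expected on PATH.

```typescript
import { spawnSync } from "node:child_process";

// Hypothetical helper (not part of this package): returns true when the
// given binary can be spawned and `-version` exits with status 0.
function hasFfmpeg(binary: string = "ffmpeg"): boolean {
  const result = spawnSync(binary, ["-version"], { stdio: "ignore" });
  // `error` is set when the process could not be spawned (for example ENOENT).
  return result.error === undefined && result.status === 0;
}

if (!hasFfmpeg()) {
  console.warn("ffmpeg was not found on PATH; local video operations will fail");
}
```

Running a check like this before registering providers surfaces a missing binary early, rather than on the first extractFrames or extractAudio call.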

Feature Overview

  • Text-to-video — generate videos from text prompts via provider delegation (Kling, etc.)
  • Image-to-video — animate still images into videos via provider delegation
  • Frame extraction — extract frames at configurable intervals or specific timestamps via ffmpeg
  • Audio extraction — extract audio tracks from video files as AAC via ffmpeg
  • Subtitle pipeline — end-to-end subtitle generation (STT → segment processing → encoding → optional burn-in) with SRT/VTT/ASS format support
  • Subtitle burn-in — render ASS subtitles into video with configurable fonts, colors, and positioning
  • Subtitle translation — translate generated subtitles to a target language via LLM
  • Loudness measurement & normalization — measure audio loudness (EBU R128) and normalize to a target loudness level
  • Video cropping — crop videos to specified dimensions via ffmpeg
  • Multi-provider routing — operation-based lookup with preferred provider selection

Quick Start

import { createVideoGenOperations } from "@reaatech/media-pipeline-mcp-video-gen";
import { ReplicateProvider } from "@reaatech/media-pipeline-mcp-replicate";

// artifactRegistry and storage are assumed to be created elsewhere in your app
const ops = createVideoGenOperations(artifactRegistry, storage);

// Register a provider for video generation
ops.registerProvider("replicate", new ReplicateProvider({
  apiKey: process.env.REPLICATE_API_KEY!,
}));

// Local operations (ffmpeg-based, no provider needed)

// Extract every 60th frame (about one frame per second for a 60 fps source)
const frames = await ops.extractFrames({
  artifactId: "video-123",
  interval: 60,
});

// Extract audio track
const audio = await ops.extractAudio({
  artifactId: "video-123",
});

// Provider-delegated operations

// Generate video from text prompt
const video = await ops.generate({
  prompt: "A drone flythrough of a canyon at golden hour",
  duration: 5,
  aspectRatio: "16:9",
  style: "cinematic",
});

// Animate an image into a video
const animated = await ops.imageToVideo({
  artifactId: "img-123",
  motionPrompt: "Gentle camera pan and zoom",
  duration: 5,
});

API Reference

createVideoGenOperations(artifactRegistry, storage)

Factory function that creates a VideoGenOperations instance.

function createVideoGenOperations(
  artifactRegistry: ArtifactRegistry,
  storage: ArtifactStore,
): VideoGenOperations;

VideoGenOperations

Main class providing video generation and local processing capabilities.

class VideoGenOperations {
  constructor(artifactRegistry: ArtifactRegistry, storage: ArtifactStore);

  registerProvider(name: string, provider: MediaProvider): void;

  generate(config: VideoGenerateConfig): Promise<Artifact>;
  imageToVideo(config: ImageToVideoConfig): Promise<Artifact>;
  extractFrames(config: ExtractFramesConfig): Promise<Artifact[]>;
  extractAudio(config: ExtractAudioConfig): Promise<Artifact>;
}

Operation Configs

VideoGenerateConfig

interface VideoGenerateConfig {
  prompt: string;                          // Text description of the video
  duration?: number;                       // Duration in seconds (default: 5)
  aspectRatio?: "16:9" | "9:16" | "1:1" | "4:3";  // Aspect ratio (default: "16:9")
  style?: string;                          // Style descriptor (e.g., "cinematic")
  provider?: string;                       // Force specific provider
}

ImageToVideoConfig

interface ImageToVideoConfig {
  artifactId: string;                      // ID of the source image
  motionPrompt?: string;                   // Description of motion to apply
  duration?: number;                       // Duration in seconds (default: 5)
  provider?: string;                       // Force specific provider
}

ExtractFramesConfig

interface ExtractFramesConfig {
  artifactId: string;                      // ID of the video
  interval?: number;                       // Extract every Nth frame (default: fps, i.e., 1 per sec)
  timestamps?: number[];                   // Specific timestamps in seconds to extract at
}

ExtractAudioConfig

interface ExtractAudioConfig {
  artifactId: string;                      // ID of the video
}

FfmpegWrapper

Low-level ffmpeg wrapper providing direct access to ffmpeg capabilities. All methods are static.

class FfmpegWrapper {
  static isAvailable(): Promise<boolean>;
  static exec(args: string[], options?: { timeout?: number }): Promise<{ stdout: string; stderr: string }>;
  static extractAudio(inputPath: string, outputPath: string, format?: string): Promise<void>;
  static burnSubtitles(inputPath: string, subtitlePath: string, outputPath: string, options?: BurnInOptions): Promise<void>;
  static measureLoudness(inputPath: string): Promise<LoudnessMeasurement>;
  static normalizeLoudness(inputPath: string, target: LoudnessTarget, measured: LoudnessMeasurement, outputPath: string): Promise<void>;
  static cropVideo(inputPath: string, outputPath: string, width: number, height: number, x?: number, y?: number): Promise<void>;
}

BurnInOptions

interface BurnInOptions {
  font?: string;                           // Font name (default: "Arial")
  fontSize?: number;                       // Font size in px (default: 24)
  fontColor?: string;                      // Font color hex (default: "FFFFFF")
  outline?: { color: string; widthPx: number };  // Text outline
  position?: "top" | "middle" | "bottom";  // Screen position (default: "bottom")
  marginPx?: number;                       // Margin from edge in px (default: 10)
  background?: { color: string; opacity: number };  // Subtitle background box
}
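
ASS subtitle styles store colors as &HAABBGGRR (alpha first, then blue, green, red, with alpha 00 meaning fully opaque), so hex colors and opacity values like the ones above need converting before they reach an ASS style line. The sketch below shows that conversion. hexToAssColor is a hypothetical helper, and whether the package converts colors this way internally is an assumption.

```typescript
// Hypothetical helper: converts "#RRGGBB" or "RRGGBB" plus an opacity in
// [0, 1] into an ASS colour string of the form &HAABBGGRR.
// In ASS, alpha 00 is fully opaque and FF is fully transparent.
function hexToAssColor(hex: string, opacity: number = 1): string {
  const match = /^#?([0-9a-fA-F]{2})([0-9a-fA-F]{2})([0-9a-fA-F]{2})$/.exec(hex);
  if (!match) throw new Error(`Invalid hex color: ${hex}`);
  const [, rr, gg, bb] = match;
  const clamped = Math.min(1, Math.max(0, opacity));
  const alpha = Math.round((1 - clamped) * 255)
    .toString(16)
    .padStart(2, "0");
  // ASS orders the channels blue, green, red (reverse of HTML hex).
  return `&H${alpha}${bb}${gg}${rr}`.toUpperCase();
}

console.log(hexToAssColor("#FFFFFF"));      // &H00FFFFFF
console.log(hexToAssColor("#000000", 0.4)); // &H99000000
console.log(hexToAssColor("FF8000"));       // &H000080FF
```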

LoudnessMeasurement

interface LoudnessMeasurement {
  iLufs: number;                           // Integrated loudness in LUFS
  lra: number;                             // Loudness range
  tpDb: number;                            // True peak in dB
}

LoudnessTarget

interface LoudnessTarget {
  iLufs: number;                           // Target integrated loudness (LUFS)
  lra: number;                             // Target loudness range
  tpDb: number;                            // Target true peak (dB)
}
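
To make the relationship between a measurement and a target concrete, the sketch below computes a single static gain: the difference in integrated loudness, capped by the true-peak headroom. This is a simplified illustration only. It does not claim to reproduce what normalizeLoudness does (ffmpeg-based normalization may be dynamic), and the names Loudness and staticGainDb are hypothetical.

```typescript
// Local stand-in for the LoudnessMeasurement / LoudnessTarget shapes above.
interface Loudness {
  iLufs: number; // integrated loudness (LUFS)
  lra: number;   // loudness range
  tpDb: number;  // true peak (dB)
}

// Hypothetical helper: gain in dB that moves the measured integrated loudness
// to the target, capped so the resulting true peak stays at or below the
// target ceiling.
function staticGainDb(measured: Loudness, target: Loudness): number {
  const wanted = target.iLufs - measured.iLufs; // gain needed to hit the target
  const headroom = target.tpDb - measured.tpDb; // gain available before the peak ceiling
  return Math.min(wanted, headroom);
}

const target: Loudness = { iLufs: -23, lra: 7, tpDb: -2 };
console.log(staticGainDb({ iLufs: -30, lra: 9, tpDb: -12 }, target)); // 7
console.log(staticGainDb({ iLufs: -30, lra: 9, tpDb: -5 }, target));  // 3
```

In the second call the peak ceiling limits the gain, so the output would land below the loudness target; closing that gap is the kind of case where dynamic processing is needed.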

SubtitlePipeline

End-to-end subtitle generation pipeline that extracts audio, runs STT, post-processes segments, optionally translates, and burns subtitles into the video.

class SubtitlePipeline {
  constructor(providers: Map<string, MediaProvider>, storage: ArtifactStore);

  generate(config: SubtitleConfig): Promise<SubtitleOutput>;
}

SubtitleConfig

interface SubtitleConfig {
  artifactId: string;                      // ID of the video artifact
  language?: string;                       // Language code (default: "en")
  format?: "srt" | "vtt" | "ass";          // Subtitle format (default: "srt")
  sttProvider?: string;                    // STT provider override
  sttModel?: string;                       // STT model override
  burnIn?: BurnInOptions;                  // Burn subtitles into video
  diarize?: boolean;                       // Enable speaker diarization (default: false)
  translateTo?: string;                    // Target language for translation
}

SubtitleOutput

interface SubtitleOutput {
  subtitleArtifactId: string;              // Artifact ID of the generated subtitle text
  burnedArtifactId?: string;               // Artifact ID of the video with burned-in subtitles
  language: string;                        // Language used
  segments: SubtitleSegment[];             // All parsed subtitle segments
  totalCostUsd: number;                    // Total cost of the operation
}

SubtitleSegment

interface SubtitleSegment {
  index: number;                           // Subtitle index (1-based)
  startMs: number;                         // Start time in milliseconds
  endMs: number;                           // End time in milliseconds
  text: string;                            // Subtitle text (may be multi-line)
  speaker?: string;                        // Identified speaker (if diarized)
  confidence?: number;                     // Segment confidence score
}
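
Because segments carry millisecond offsets, turning them into SRT text is mostly a matter of timestamp formatting. The sketch below shows one way to do it for segments shaped like SubtitleSegment. The helpers toSrtTime and toSrt are hypothetical and are not the package's own encoder.

```typescript
// Local stand-in for the required fields of SubtitleSegment.
interface Segment {
  index: number;
  startMs: number;
  endMs: number;
  text: string;
}

// Formats milliseconds as an SRT timestamp: HH:MM:SS,mmm
function toSrtTime(ms: number): string {
  const total = Math.max(0, Math.round(ms));
  const pad = (n: number, width: number): string => String(n).padStart(width, "0");
  const hours = Math.floor(total / 3_600_000);
  const minutes = Math.floor((total % 3_600_000) / 60_000);
  const seconds = Math.floor((total % 60_000) / 1000);
  return `${pad(hours, 2)}:${pad(minutes, 2)}:${pad(seconds, 2)},${pad(total % 1000, 3)}`;
}

// Joins segments into SRT text: index line, time range, text, blank separator.
function toSrt(segments: Segment[]): string {
  return segments
    .map((s) => `${s.index}\n${toSrtTime(s.startMs)} --> ${toSrtTime(s.endMs)}\n${s.text}\n`)
    .join("\n");
}

console.log(toSrtTime(3_723_004)); // 01:02:03,004
console.log(toSrt([
  { index: 1, startMs: 0, endMs: 1500, text: "Hello" },
  { index: 2, startMs: 1500, endMs: 3000, text: "World" },
]));
```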

createSubtitlePipeline(providers, storage)

function createSubtitlePipeline(
  providers: Map<string, MediaProvider>,
  storage: ArtifactStore,
): SubtitlePipeline;

Usage Patterns

Frame Extraction by Interval

// Extract frames at ~1 frame per second (defaults to interval = fps)
const frames = await ops.extractFrames({
  artifactId: "video-123",
});

// Extract every 5 seconds (at 30fps: interval = 150)
const every5s = await ops.extractFrames({
  artifactId: "video-123",
  interval: 150,
});

console.log(frames.length);  // number of frames extracted
for (const frame of frames) {
  console.log(frame.metadata.timestamp);   // seconds
  console.log(frame.metadata.frameIndex);  // 0-based
  console.log(frame.metadata.width, frame.metadata.height);
}
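
Since interval is expressed in frames rather than seconds, the value depends on the source frame rate. The sketch below shows the arithmetic behind the "150 at 30fps" example. Both helpers are hypothetical, and selectedTimestamps assumes that extraction starts at frame 0 and takes every Nth frame, which the source does not confirm.

```typescript
// Hypothetical helper: number of frames between extracted frames for a
// desired spacing in seconds at a given frame rate.
function intervalForSeconds(seconds: number, fps: number): number {
  if (seconds <= 0 || fps <= 0) {
    throw new RangeError("seconds and fps must be positive");
  }
  return Math.max(1, Math.round(seconds * fps));
}

// Hypothetical helper: timestamps (in seconds) of the frames an
// every-Nth-frame pass would select, assuming it starts at frame 0.
function selectedTimestamps(totalFrames: number, interval: number, fps: number): number[] {
  if (interval < 1 || fps <= 0) {
    throw new RangeError("interval must be >= 1 and fps must be positive");
  }
  const out: number[] = [];
  for (let i = 0; i < totalFrames; i += interval) out.push(i / fps);
  return out;
}

console.log(intervalForSeconds(5, 30));       // 150
console.log(intervalForSeconds(1, 60));       // 60
console.log(selectedTimestamps(300, 150, 30)); // [ 0, 5 ]
```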

Frame Extraction at Specific Timestamps

const frames = await ops.extractFrames({
  artifactId: "video-123",
  timestamps: [3.5, 12.0, 27.8, 45.2],
});

// Returns exactly 4 frames at the specified times
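
When a fixed number of thumbnails is needed rather than a fixed spacing, the timestamps array can be computed from the video duration. The sketch below picks the midpoint of each equal slice, which avoids the very first and last frames. evenTimestamps is a hypothetical helper, and the duration is assumed to be known from elsewhere.

```typescript
// Hypothetical helper: returns `count` timestamps (seconds), one at the
// midpoint of each equal-length slice of the video.
function evenTimestamps(durationSec: number, count: number): number[] {
  if (durationSec <= 0 || !Number.isInteger(count) || count < 1) {
    throw new RangeError("durationSec must be positive and count a positive integer");
  }
  const step = durationSec / count;
  // Round to milliseconds to keep the values readable.
  return Array.from({ length: count }, (_, i) => Number(((i + 0.5) * step).toFixed(3)));
}

console.log(evenTimestamps(60, 4)); // [ 7.5, 22.5, 37.5, 52.5 ]
```

The resulting array can be passed as the timestamps field of ExtractFramesConfig.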

Extract Audio Track

const audio = await ops.extractAudio({
  artifactId: "video-123",
});

console.log(audio.mimeType);            // "audio/aac"
console.log(audio.metadata.sampleRate); // 48000
console.log(audio.metadata.channels);   // 2
console.log(audio.metadata.codec);      // "aac"
console.log(audio.metadata.duration);   // seconds from source video
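
For readers who want to see what an AAC extraction looks like at the ffmpeg level, the sketch below builds an argument list using standard ffmpeg flags (-vn drops video, -c:a selects the audio encoder, -ar and -ac set sample rate and channel count). It illustrates the general technique and is not necessarily the command this package runs; buildAacExtractArgs and AacOptions are hypothetical names.

```typescript
// Hypothetical options: leave a field undefined to keep the source value.
interface AacOptions {
  sampleRate?: number; // e.g. 48000
  channels?: number;   // e.g. 2
}

// Hypothetical helper: builds ffmpeg arguments that drop the video stream
// and encode the audio stream as AAC.
function buildAacExtractArgs(
  inputPath: string,
  outputPath: string,
  options: AacOptions = {},
): string[] {
  const args = ["-i", inputPath, "-vn", "-c:a", "aac"];
  if (options.sampleRate !== undefined) args.push("-ar", String(options.sampleRate));
  if (options.channels !== undefined) args.push("-ac", String(options.channels));
  // -y overwrites the output file if it already exists.
  args.push("-y", outputPath);
  return args;
}

console.log(buildAacExtractArgs("/tmp/input.mp4", "/tmp/audio.aac", { sampleRate: 48000, channels: 2 }).join(" "));
// -i /tmp/input.mp4 -vn -c:a aac -ar 48000 -ac 2 -y /tmp/audio.aac
```

An argument array in this shape matches what FfmpegWrapper.exec accepts.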

Subtitle Generation Pipeline

import { createSubtitlePipeline } from "@reaatech/media-pipeline-mcp-video-gen";

// providerMap is assumed to be a Map<string, MediaProvider> built elsewhere
const pipeline = createSubtitlePipeline(providerMap, storage);

// Generate SRT subtitles
const result = await pipeline.generate({
  artifactId: "video-123",
  language: "en",
  format: "srt",
  diarize: true,
});

console.log(result.subtitleArtifactId);   // artifact ID for subtitle text
console.log(result.segments.length);      // number of subtitle segments
for (const seg of result.segments.slice(0, 3)) {
  console.log(`[${seg.startMs}ms → ${seg.endMs}ms] ${seg.speaker ?? "Narrator"}: ${seg.text}`);
}

Subtitle Burn-in with Custom Styling

const result = await pipeline.generate({
  artifactId: "video-123",
  format: "ass",
  language: "en",
  burnIn: {
    font: "Helvetica",
    fontSize: 28,
    fontColor: "#FFFFFF",
    outline: { color: "#000000", widthPx: 3 },
    position: "bottom",
    marginPx: 20,
    background: { color: "#000000", opacity: 0.4 },
  },
});

console.log(result.burnedArtifactId);  // artifact ID for video with burned subtitles

Subtitle Translation

const result = await pipeline.generate({
  artifactId: "video-123",
  language: "en",
  format: "srt",
  translateTo: "es",  // Translate to Spanish
});

console.log(result.language);  // "es"
// Segments contain the translated text

Provider Delegation for Video Generation

import { createVideoGenOperations } from "@reaatech/media-pipeline-mcp-video-gen";
import { ReplicateProvider } from "@reaatech/media-pipeline-mcp-replicate";
import { FalProvider } from "@reaatech/media-pipeline-mcp-fal";

const ops = createVideoGenOperations(artifactRegistry, storage);
ops.registerProvider("replicate", new ReplicateProvider({ apiKey: process.env.REPLICATE_API_KEY! }));
ops.registerProvider("fal", new FalProvider({ apiKey: process.env.FAL_API_KEY! }));

// Text-to-video — routes to first provider supporting "video.generate"
const video = await ops.generate({
  prompt: "Timelapse of a flower blooming in a sunlit garden",
  duration: 10,
  aspectRatio: "16:9",
  style: "cinematic",
});

console.log(video.metadata.provider);     // provider name
console.log(video.metadata.costUsd);      // cost in USD
console.log(video.metadata.duration);     // seconds
console.log(video.metadata.fps);          // 30
console.log(video.metadata.codec);        // "h264"

// Image-to-video
const animated = await ops.imageToVideo({
  artifactId: "img-123",
  motionPrompt: "Gentle camera drift left to right",
  duration: 8,
  provider: "fal",  // force specific provider
});
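
The routing behavior described above (first registered provider that supports the operation, unless a provider is forced) can be sketched in a few lines. This is an illustration of the idea, not the package's implementation: ProviderRouter, RoutableProvider, and the supportedOperations field are all hypothetical, and only the "video.generate" operation name comes from this README.

```typescript
// Hypothetical provider shape: each provider lists the operations it supports.
interface RoutableProvider {
  supportedOperations: string[];
}

class ProviderRouter<P extends RoutableProvider> {
  // Map preserves insertion order, so "first registered" is well defined.
  private providers = new Map<string, P>();

  register(name: string, provider: P): void {
    this.providers.set(name, provider);
  }

  // Returns the preferred provider when given, otherwise the first
  // registered provider that supports the operation.
  resolve(operation: string, preferred?: string): { name: string; provider: P } {
    if (preferred !== undefined) {
      const provider = this.providers.get(preferred);
      if (!provider) throw new Error(`Unknown provider: ${preferred}`);
      if (!provider.supportedOperations.includes(operation)) {
        throw new Error(`Provider ${preferred} does not support ${operation}`);
      }
      return { name: preferred, provider };
    }
    for (const [name, provider] of this.providers) {
      if (provider.supportedOperations.includes(operation)) return { name, provider };
    }
    throw new Error(`No provider supports ${operation}`);
  }
}

const router = new ProviderRouter<RoutableProvider>();
router.register("replicate", { supportedOperations: ["video.generate"] });
router.register("fal", { supportedOperations: ["video.generate"] });
console.log(router.resolve("video.generate").name);        // replicate
console.log(router.resolve("video.generate", "fal").name); // fal
```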

Direct ffmpeg Usage

import { FfmpegWrapper, type LoudnessTarget } from "@reaatech/media-pipeline-mcp-video-gen";

// Check availability
const available = await FfmpegWrapper.isAvailable();
if (!available) throw new Error("ffmpeg is required");

// Execute arbitrary ffmpeg commands
await FfmpegWrapper.exec([
  "-i", "/tmp/input.mp4",
  "-vf", "scale=1280:720",
  "-y", "/tmp/output.mp4",
]);

// Extract audio in specific format
await FfmpegWrapper.extractAudio("/tmp/input.mp4", "/tmp/audio.aac", "aac");
await FfmpegWrapper.extractAudio("/tmp/input.mp4", "/tmp/audio.mp3", "mp3");

// Measure loudness (EBU R128)
const loudness = await FfmpegWrapper.measureLoudness("/tmp/audio.wav");
console.log(loudness.iLufs, loudness.lra, loudness.tpDb);

// Normalize to broadcast standard
const target: LoudnessTarget = { iLufs: -23, lra: 7, tpDb: -2 };
await FfmpegWrapper.normalizeLoudness("/tmp/input.wav", target, loudness, "/tmp/normalized.wav");

// Crop a video
await FfmpegWrapper.cropVideo("/tmp/input.mp4", "/tmp/cropped.mp4", 1280, 720, 100, 50);
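
The width, height, x, and y arguments to cropVideo usually come from a target aspect ratio rather than being hard-coded. The sketch below computes the largest centered rectangle of a given ratio and formats it as an ffmpeg crop filter string (crop=w:h:x:y). centeredCrop and cropFilter are hypothetical helpers, and rounding dimensions down to even numbers is an assumption made because common 4:2:0 encoders require it.

```typescript
interface CropRect {
  width: number;
  height: number;
  x: number;
  y: number;
}

// Hypothetical helper: largest centered rectangle with aspect ratio
// ratioW:ratioH that fits inside a srcW x srcH frame.
function centeredCrop(srcW: number, srcH: number, ratioW: number, ratioH: number): CropRect {
  let width = srcW;
  let height = Math.floor((srcW * ratioH) / ratioW);
  if (height > srcH) {
    // Full width is too tall for the source, so constrain by height instead.
    height = srcH;
    width = Math.floor((srcH * ratioW) / ratioH);
  }
  // Round down to even dimensions (assumption: 4:2:0 chroma subsampling).
  width -= width % 2;
  height -= height % 2;
  return {
    width,
    height,
    x: Math.floor((srcW - width) / 2),
    y: Math.floor((srcH - height) / 2),
  };
}

// Formats a rectangle as an ffmpeg crop filter expression.
function cropFilter(r: CropRect): string {
  return `crop=${r.width}:${r.height}:${r.x}:${r.y}`;
}

console.log(cropFilter(centeredCrop(1920, 1080, 9, 16))); // crop=606:1080:657:0
console.log(cropFilter(centeredCrop(1920, 1080, 1, 1)));  // crop=1080:1080:420:0
```

The fields of the returned rectangle line up with the width, height, x, and y parameters of FfmpegWrapper.cropVideo.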

License

MIT