remotion-captioneer

v0.9.0

Drop-in animated captions for Remotion. Audio to word-level synced subtitle components. Supports OpenAI, Groq, Deepgram, AssemblyAI.

Downloads

970

🎬 remotion-captioneer

Drop-in animated captions for Remotion.

Feed it audio. Get word-level synced, beautifully animated captions. Fourteen styles. Zero hassle.

🌐 Live Demo →


🤝 Works With @remotion/captions

Our types are fully compatible with the official @remotion/captions package. You can convert freely between them:

import { createTikTokStyleCaptions } from "@remotion/captions";
import { toCaptionArray, fromCaptionArray } from "remotion-captioneer";

// Convert our CaptionData → flat Caption[] for @remotion/captions
const flatCaptions = toCaptionArray(myCaptionData);
const { pages } = createTikTokStyleCaptions({
  captions: flatCaptions,
  combineTokensWithinMilliseconds: 1200,
});

// Or go the other way: Caption[] → CaptionData
const captionData = fromCaptionArray(flatCaptions);

| | @remotion/captions (official) | remotion-captioneer (this) |
|---|---|---|
| Caption types | ✅ Caption type | ✅ Compatible + CaptionData with segments |
| Page segmentation | ✅ createTikTokStyleCaptions() | ❌ Use official package |
| Animated components | ❌ Build yourself | ✅ 14 ready-to-use styles |
| STT/transcription | ❌ Separate package | ✅ 5 providers built-in |
| CLI tool | ❌ | ✅ npx captioneer process |
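
To make the conversion concrete, here is a minimal sketch of what flattening segment-based caption data into a flat caption list could look like. It is not the package's `toCaptionArray` implementation. The `FlatCaption` shape is modeled on the `Caption` type from @remotion/captions, and the leading-space convention and the choice of `timestampMs` are assumptions made for illustration.

```typescript
// Shapes copied from the "Caption Data Format" section of this README.
interface CaptionWord { word: string; startMs: number; endMs: number; confidence: number }
interface CaptionSegment { text: string; startMs: number; endMs: number; words: CaptionWord[] }
interface CaptionData { segments: CaptionSegment[]; language: string; durationMs: number }

// Assumption: modeled on the Caption type exported by @remotion/captions.
interface FlatCaption {
  text: string;
  startMs: number;
  endMs: number;
  timestampMs: number | null;
  confidence: number | null;
}

// Hypothetical helper: walk every segment and emit one flat caption per word.
function flattenCaptions(data: CaptionData): FlatCaption[] {
  const flat: FlatCaption[] = [];
  for (const segment of data.segments) {
    for (const w of segment.words) {
      flat.push({
        // Assumption: every token after the first carries a leading space.
        text: flat.length === 0 ? w.word : " " + w.word,
        startMs: w.startMs,
        endMs: w.endMs,
        timestampMs: w.startMs, // placeholder choice, not confirmed by the source
        confidence: w.confidence,
      });
    }
  }
  return flat;
}
```

In a real project, prefer the package's own `toCaptionArray` and `fromCaptionArray`; the sketch only shows the general idea of the mapping.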


🎥 Caption Styles Preview

Word Highlight

Each word lights up as it's spoken with a scale animation.

"Hello world this is"
  dim  dim  GOLD  dim

Karaoke

Progressive color fill — left-to-right like karaoke.

"Hello world this is"
 RED   red  ░░░░  ░░░

Typewriter

Character-by-character reveal with blinking cursor.

┌─────────────────────┐
│ Hello world th|      │
└─────────────────────┘

Bounce

Active word bounces up with spring physics.

"Hello  world  this  is"
  ↓     ↑      ↓     ↓
       bounce!

👉 See them animated live at the demo page.


✨ Features

  • 🎙️ 5 STT Providers — Local Whisper, OpenAI, Groq, Deepgram, AssemblyAI
  • 🎨 14 Caption Styles — Word Highlight, Karaoke, Typewriter, Bounce, Wave, Glow, Erase, Pill, Flicker, Highlighter, Blur, Rainbow, Scale, Spotlight
  • 🎭 24 Presets — TikTok, Instagram, YouTube, Podcast, Cinematic, Music, Tutorial, Minimal, Gaming, News, Education, Fun
  • 🎵 Audio-Video Sync — Beat detection, volume-reactive animations, timeline keyframes
  • 📦 Template System — Data-driven video generation from JSON config
  • 🧱 Layout Primitives — Stack, Row, Columns, Grid, Center, FadeIn, SlideUp
  • 📤 6 Export Formats — SRT, VTT, ASS, TXT, word-level SRT & VTT
  • Drop-in Components — <AnimatedCaptions> works out of the box
  • 🔧 CLI Tool — process, batch, export, presets, providers, styles
  • 📐 Zero Config — Works with sensible defaults, customizable everything
  • 🔷 TypeScript — Full type definitions included
  • 🐳 Docker Ready — Deploy rendering at scale

🚀 Quick Start

Option 1: Scaffold a Project

npx captioneer init my-video
cd my-video
npm install
npm start

This creates a ready-to-use Remotion project with captions.

Option 2: Add to Existing Project

1. Install

npm install remotion-captioneer

2. Generate Captions from Audio

npx captioneer process my-audio.mp4

This creates my-audio-captions.json with word-level timestamps.

3. Use in Your Remotion Project

import { AbsoluteFill } from "remotion";
import { AnimatedCaptions } from "remotion-captioneer";
import captions from "./my-audio-captions.json";

export const MyVideo = () => {
  return (
    <AbsoluteFill style={{ backgroundColor: "#0a0a0a" }}>
      <AnimatedCaptions
        captions={captions}
        style="word-highlight"
        position="bottom"
        highlightColor="#FFD700"
      />
    </AbsoluteFill>
  );
};

That's it. Render with npx remotion render as usual.


🎨 Caption Styles

14 animated styles, each with a unique visual feel:

| Style | Effect | Best For |
|-------|--------|----------|
| word-highlight | Each word lights up with scale animation | Podcasts, interviews |
| karaoke | Progressive left-to-right color fill | Music, singing |
| typewriter | Character-by-character reveal + cursor | Tutorials, code demos |
| bounce | Active word bounces with spring physics | Social media, reels |
| wave | Words animate in a wave pattern | Music, rhythmic content |
| glow | Neon glow pulsing on active word | Cinematic, dramatic |
| typewriter-erase | Types then erases word-by-word | Transitions, reveals |
| pill | Active word in a colored pill/badge | Clean, modern look |
| flicker | Flickers in like a neon sign | Retro, neon aesthetic |
| highlighter | Yellow highlighter behind active word | Study, educational |
| blur | Future words blur, active word sharpens | Dramatic reveals |
| rainbow | Cycling rainbow colors on active word | Fun, playful content |
| scale | Words grow from small to full size | Energetic, bold |
| spotlight | Radial spotlight effect behind active word | Theatrical, stage |

<AnimatedCaptions captions={captions} style="word-highlight" />
<AnimatedCaptions captions={captions} style="karaoke" />
<AnimatedCaptions captions={captions} style="typewriter" />
<AnimatedCaptions captions={captions} style="bounce" />
<AnimatedCaptions captions={captions} style="wave" />
<AnimatedCaptions captions={captions} style="glow" />
<AnimatedCaptions captions={captions} style="typewriter-erase" />
<AnimatedCaptions captions={captions} style="pill" />
<AnimatedCaptions captions={captions} style="flicker" />
<AnimatedCaptions captions={captions} style="highlighter" />
<AnimatedCaptions captions={captions} style="blur" />
<AnimatedCaptions captions={captions} style="rainbow" />
<AnimatedCaptions captions={captions} style="scale" />
<AnimatedCaptions captions={captions} style="spotlight" />

📡 STT Providers

Choose your speech-to-text backend. Supports 5 providers out of the box:

| Provider | Env Variable | Speed | Offline | Best For |
|----------|-------------|-------|---------|----------|
| Local Whisper | — | ⭐⭐ | ✅ | Privacy, no API costs |
| OpenAI | OPENAI_API_KEY | ⭐⭐⭐ | ❌ | Best accuracy |
| Groq | GROQ_API_KEY | ⭐⭐⭐⭐⭐ | ❌ | Ultra-fast inference |
| Deepgram | DEEPGRAM_API_KEY | ⭐⭐⭐⭐ | ❌ | Real-time capable |
| AssemblyAI | ASSEMBLYAI_API_KEY | ⭐⭐⭐ | ❌ | Rich features |


🎭 Caption Presets

Apply a professional look instantly with one of the built-in presets (listed under Available Presets):

import { AnimatedCaptions, applyPreset } from "remotion-captioneer";

// Use a preset
<AnimatedCaptions
  captions={captions}
  {...applyPreset("tiktok")}
/>

// Or store the preset props first, then spread them
const cinematicStyle = applyPreset("cinematic-gold");
<AnimatedCaptions captions={captions} {...cinematicStyle} />

Available Presets

| Category | Presets |
|----------|---------|
| Social Media | tiktok, instagram-reels, youtube-shorts, twitter-clips |
| Podcast | podcast-clean, podcast-bold |
| Cinematic | cinematic-gold, cinematic-white, cinematic-neon |
| Music | music-karaoke, music-wave |
| Tutorial | tutorial-typewriter, tutorial-erase |
| Minimal | minimal-white, minimal-subtle |
| Gaming | gaming-neon, gaming-bold |
| News & Documentary | news-ticker, documentary |
| Education | education-highlighter, education-scale |
| Fun & Creative | fun-rainbow, retro-flicker |

# List presets from CLI
npx captioneer presets

📤 Export Formats

Export captions to standard subtitle formats:

import {
  toSRT,
  toVTT,
  toASS,
  toPlainText,
  toWordLevelSRT,
  toWordLevelVTT,
} from "remotion-captioneer";

const srt = toSRT(captionData);       // SubRip (.srt)
const vtt = toVTT(captionData);       // WebVTT (.vtt)
const ass = toASS(captionData);       // SubStation Alpha (.ass)
const txt = toPlainText(captionData); // Plain text

// Word-level exports (for custom timing)
const srtWords = toWordLevelSRT(captionData);
const vttWords = toWordLevelVTT(captionData);

# Export from CLI
npx captioneer export captions.json --format srt
npx captioneer export captions.json --format vtt --output subtitles.vtt
npx captioneer export captions.json --format ass
npx captioneer export captions.json --format srt-words

Formats: srt, vtt, ass, txt, srt-words, vtt-words
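
For readers unfamiliar with the SubRip layout, the sketch below shows how millisecond timings map onto SRT cues. It is an illustration only, not the package's `toSRT` implementation; `formatSrtTimestamp` and `segmentsToSrt` are hypothetical helper names.

```typescript
// Minimal segment shape, taken from the caption data format in this README.
interface TimedSegment { text: string; startMs: number; endMs: number }

// Left-pad a non-negative integer with zeros to the given width.
function pad(n: number, width: number): string {
  let s = String(n);
  while (s.length < width) s = "0" + s;
  return s;
}

// Hypothetical helper: milliseconds -> "HH:MM:SS,mmm" as used by SubRip.
function formatSrtTimestamp(ms: number): string {
  const total = Math.max(0, Math.round(ms));
  const hours = Math.floor(total / 3600000);
  const minutes = Math.floor((total % 3600000) / 60000);
  const seconds = Math.floor((total % 60000) / 1000);
  const millis = total % 1000;
  return pad(hours, 2) + ":" + pad(minutes, 2) + ":" + pad(seconds, 2) + "," + pad(millis, 3);
}

// Hypothetical helper: one numbered cue per segment, separated by blank lines.
function segmentsToSrt(segments: TimedSegment[]): string {
  const cues = segments.map(
    (s, i) =>
      String(i + 1) + "\n" +
      formatSrtTimestamp(s.startMs) + " --> " + formatSrtTimestamp(s.endMs) + "\n" +
      s.text
  );
  return cues.join("\n\n") + "\n";
}
```

WebVTT timestamps look similar but use a period instead of a comma before the milliseconds.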

Auto-Detection

The CLI auto-detects available providers from environment variables:

# Groq is fastest — set this first if you have a key
export GROQ_API_KEY="gsk_..."

# Or OpenAI
export OPENAI_API_KEY="sk-..."

# Then just run — it picks the best available
npx captioneer process audio.mp4
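
The detection idea can be sketched as a priority list of environment variables. The variable names come from the provider table in this README, but the ranking below and the `pickProvider` helper are assumptions for illustration; the real CLI may order providers differently.

```typescript
type ProviderName = "groq" | "openai" | "deepgram" | "assemblyai" | "local";

// Assumed priority order (fastest first); not confirmed by the source.
const PROVIDER_ENV_VARS: Array<[ProviderName, string]> = [
  ["groq", "GROQ_API_KEY"],
  ["openai", "OPENAI_API_KEY"],
  ["deepgram", "DEEPGRAM_API_KEY"],
  ["assemblyai", "ASSEMBLYAI_API_KEY"],
];

// Hypothetical helper: return the first provider whose key is set and non-empty.
function pickProvider(env: Record<string, string | undefined>): ProviderName {
  for (const [name, envVar] of PROVIDER_ENV_VARS) {
    const value = env[envVar];
    if (value !== undefined && value.trim() !== "") {
      return name;
    }
  }
  // No API key found: fall back to the offline provider.
  return "local";
}
```

In Node.js you would call it as `pickProvider(process.env)`. To see which providers the actual CLI considers ready, run `npx captioneer providers`.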

Explicit Provider

npx captioneer process audio.mp4 --provider groq
npx captioneer process audio.mp4 --provider openai --model whisper-1
npx captioneer process audio.mp4 --provider deepgram --model nova-2
npx captioneer process audio.mp4 --provider assemblyai
npx captioneer process audio.mp4 --provider local --model base

Check Provider Status

npx captioneer providers

📡 Available STT Providers:

  local           ✅ ready
                  models: tiny, base, small, medium, large

  groq            ✅ ready
                  models: whisper-large-v3, whisper-large-v3-turbo, distil-whisper-large-v3-en

  openai          ⚪ not configured
                  models: whisper-1

Programmatic Usage

import {
  GroqProvider,
  OpenAIProvider,
  detectProvider,
} from "remotion-captioneer";

// Groq — ultra-fast
const groq = new GroqProvider("gsk_...");
const groqCaptions = await groq.transcribe("audio.mp4", {
  model: "whisper-large-v3-turbo",
  language: "en",
});

// OpenAI
const openai = new OpenAIProvider("sk-...");
const openaiCaptions = await openai.transcribe("audio.mp4");

// Auto-detect from env
const detected = detectProvider();
if (detected) {
  const captions = await detected.provider.transcribe("audio.mp4");
}

🎵 Audio-Video Sync

Frame-perfect animations synchronized to audio. No more manually timing keyframes.

Pre-analyze Audio

import { analyzeAudio } from "remotion-captioneer";

const analysis = await analyzeAudio("my-audio.mp4");
// Returns: beats, volumeFrames, bpm, energy levels

Beat-Reactive Hooks

import {
  AudioSyncProvider,
  useBeatPulse,
  useVolume,
  useEnergy,
} from "remotion-captioneer";

// Wrap your composition
const MyVideo = () => (
  <AudioSyncProvider analysis={audioAnalysis}>
    <BeatReactiveContent />
  </AudioSyncProvider>
);

// Use in any child component
const BeatReactiveContent = () => {
  const pulse = useBeatPulse();       // 0→1 spring on each beat
  const volume = useVolume();          // Current volume 0-1
  const energy = useEnergy();          // Smoothed energy 0-1

  return (
    <div style={{
      transform: `scale(${1 + pulse * 0.2})`,
      opacity: 0.5 + volume * 0.5,
    }}>
      🎵 Synced to the beat!
    </div>
  );
};

Timeline Keyframes

import { useTimelineValue, fadeInOut } from "remotion-captioneer";

// Map animation to audio timestamps (in ms)
const opacity = useTimelineValue({
  keyframes: [
    { timeMs: 0, value: 0 },
    { timeMs: 1000, value: 1, easing: "easeOut" },
    { timeMs: 5000, value: 1 },
    { timeMs: 6000, value: 0, easing: "easeIn" },
  ],
  defaultValue: 0,
});

// Or use the helper
const fadeOpacity = useTimelineValue(
  fadeInOut(0, 1000, 5000, 6000)
);
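
As a rough mental model of what a keyframe timeline computes, the sketch below interpolates linearly between keyframes and converts a frame number to milliseconds. It ignores the `easing` field and is not the package's `useTimelineValue` implementation; `valueAtTime` and `valueAtFrame` are hypothetical names.

```typescript
interface Keyframe { timeMs: number; value: number }

// Hypothetical helper: linear interpolation between the two surrounding keyframes.
function valueAtTime(keyframes: Keyframe[], timeMs: number, defaultValue = 0): number {
  if (keyframes.length === 0) return defaultValue;
  const sorted = keyframes.slice().sort((a, b) => a.timeMs - b.timeMs);
  const first = sorted[0];
  const last = sorted[sorted.length - 1];
  // Clamp to the first and last keyframe outside the timeline.
  if (timeMs <= first.timeMs) return first.value;
  if (timeMs >= last.timeMs) return last.value;
  for (let i = 0; i < sorted.length - 1; i++) {
    const a = sorted[i];
    const b = sorted[i + 1];
    if (timeMs >= a.timeMs && timeMs <= b.timeMs) {
      const span = b.timeMs - a.timeMs;
      const t = span === 0 ? 1 : (timeMs - a.timeMs) / span;
      return a.value + (b.value - a.value) * t;
    }
  }
  return defaultValue;
}

// Convert a frame number to milliseconds, then sample the timeline.
function valueAtFrame(keyframes: Keyframe[], frame: number, fps: number): number {
  return valueAtTime(keyframes, (frame * 1000) / fps);
}
```

With the keyframes from the example above at 30 fps, frame 15 (500 ms) is halfway through the fade-in, so the sampled opacity is 0.5.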

Available Hooks

| Hook | Returns | Use For |
|------|---------|---------|
| useVolume() | number (0-1) | Opacity, scale, size |
| useBeat() | BeatInfo \| null | Flash effects, pulses |
| useBeatPulse() | number (0-1 spring) | Bounce, scale on beat |
| useEnergy() | number (0-1) | Background intensity |
| useIsOnBeat() | boolean | Conditional rendering |
| useTimelineValue() | number | Keyframe animations |
| useTimelineProgress() | number (0-1) | Progress bars |


📦 Template System

Build videos from JSON config. No code needed for simple videos.

Quick Template

import { buildTemplate, TemplateComposition } from "remotion-captioneer";

const template = buildTemplate({
  name: "My Captioned Video",
  intro: {
    title: "Episode 1",
    subtitle: "Getting Started",
    logo: "/logo.png",
  },
  captions: [
    { captions: myCaptions, captionStyle: "word-highlight" },
  ],
  outro: {
    heading: "Thanks for watching!",
    cta: "Subscribe for more",
    logo: "/logo.png",
  },
});

// Use as Remotion composition
<TemplateComposition template={template} />

Preset Scenes

import {
  createIntroScene,
  createCaptionScene,
  createOutroScene,
  createDividerScene,
} from "remotion-captioneer";

const intro = createIntroScene({
  title: "My Video",
  subtitle: "A demo",
  durationSec: 3,
});

const content = createCaptionScene({
  captions: myCaptions,
  captionStyle: "karaoke",
  highlightColor: "#FF6B6B",
});

const outro = createOutroScene({
  heading: "The End",
  cta: "Like & Subscribe",
  logo: "/logo.png",
});

Design Tokens

Customize the entire look with a single config:

const template = buildTemplate({
  name: "Brand Video",
  tokens: {
    colors: {
      primary: "#6366F1",
      accent: "#FFD700",
      background: "#0a0a0a",
      text: "#FFFFFF",
    },
    typography: {
      headingFont: "Poppins, sans-serif",
      bodyFont: "Inter, sans-serif",
    },
  },
  // ...
});

🧱 Layout Primitives

Composable layout building blocks for any Remotion video:

import {
  Stack, Row, Columns, Grid,
  Center, FadeIn, SlideUp,
  GradientBg, Overlay, Positioned,
} from "remotion-captioneer";

// Vertical stack
<Stack gap={24}>
  <FadeIn delayMs={0}>Title</FadeIn>
  <FadeIn delayMs={200}>Subtitle</FadeIn>
</Stack>

// Horizontal columns
<Columns ratios={[2, 1]} gap={32}>
  <div>Main content</div>
  <div>Sidebar</div>
</Columns>

// Grid layout
<Grid columns={3} gap={16}>
  {items.map(item => <Card key={item.id} />)}
</Grid>

// Animated entrance
<SlideUp delayMs={500} durationMs={800}>
  <div>Slides up with delay</div>
</SlideUp>

// Gradient background
<GradientBg from="#0a0a0a" to="#1a1a2e">
  <Center>Content here</Center>
</GradientBg>

🎙️ CLI Reference

Process Audio

# Basic usage (auto-detects provider from env vars)
npx captioneer process audio.mp4

# Specify provider
npx captioneer process audio.mp4 --provider groq
npx captioneer process audio.mp4 --provider openai --model whisper-1

# With options
npx captioneer process audio.mp4 --provider groq --language en --output captions.json
npx captioneer process audio.mp4 --provider local --model base

# Pass API key directly
npx captioneer process audio.mp4 --provider groq --api-key gsk_...

Options:

  • -p, --provider <provider> — STT provider: local, openai, groq, deepgram, assemblyai
  • -m, --model <model> — Model name (provider-specific)
  • -k, --api-key <key> — API key (or use env vars)
  • -l, --language <lang> — Language code: en, es, fr, de, etc.
  • -o, --output <path> — Output JSON path
  • -v, --verbose — Verbose output

Other Commands

# Scaffold a new project
npx captioneer init my-video

# List available providers and their status
npx captioneer providers

# List available caption styles
npx captioneer styles

# List available presets
npx captioneer presets

# Export captions to SRT/VTT/ASS
npx captioneer export captions.json --format srt
npx captioneer export captions.json --format vtt --output subs.vtt

# Batch process a directory of audio files
npx captioneer batch ./audio-files/
npx captioneer batch ./audio-files/ --provider groq --output-dir ./captions/

# Start real-time preview server
npx captioneer preview

# Open Remotion Studio with demos
npx captioneer demo

📖 Caption Data Format

The generated JSON follows this structure:

interface CaptionData {
  segments: Array<{
    text: string;           // Full segment text
    startMs: number;        // Segment start time (ms)
    endMs: number;          // Segment end time (ms)
    words: Array<{
      word: string;         // Word text
      startMs: number;      // Word start time (ms)
      endMs: number;        // Word end time (ms)
      confidence: number;   // Whisper confidence (0-1)
    }>;
  }>;
  language: string;         // Detected language
  durationMs: number;       // Total duration (ms)
}

You can also create caption data manually or from other sources — just match this format.
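
If your word timings come from another source, one way to assemble matching data is sketched below. The interfaces mirror the format above; `buildCaptionData` and its `maxGapMs` pause threshold are hypothetical and chosen only for illustration.

```typescript
interface CaptionWord { word: string; startMs: number; endMs: number; confidence: number }
interface CaptionSegment { text: string; startMs: number; endMs: number; words: CaptionWord[] }
interface CaptionData { segments: CaptionSegment[]; language: string; durationMs: number }

// Hypothetical helper: group time-ordered words into segments, starting a
// new segment whenever the pause between two words exceeds maxGapMs.
function buildCaptionData(words: CaptionWord[], language: string, maxGapMs = 600): CaptionData {
  const segments: CaptionSegment[] = [];
  for (const w of words) {
    const current = segments.length > 0 ? segments[segments.length - 1] : undefined;
    if (current !== undefined && w.startMs - current.endMs <= maxGapMs) {
      // Short pause: extend the current segment.
      current.words.push(w);
      current.text += " " + w.word;
      current.endMs = w.endMs;
    } else {
      // First word or long pause: open a new segment.
      segments.push({ text: w.word, startMs: w.startMs, endMs: w.endMs, words: [w] });
    }
  }
  const durationMs = segments.length > 0 ? segments[segments.length - 1].endMs : 0;
  return { segments, language, durationMs };
}
```

Here `durationMs` is taken from the end of the last word. If your audio has trailing silence, set it from the actual audio length instead.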


⚙️ Configuration

Create a .captioneerrc file in your project root:

{
  "whisperPath": "./whisper.cpp",
  "modelPath": "./whisper.cpp/models/ggml-base.bin",
  "defaultModel": "base",
  "defaultLanguage": "en",
  "defaultStyle": "word-highlight"
}

Or add to your package.json:

{
  "captioneer": {
    "defaultModel": "base",
    "defaultLanguage": "en"
  }
}

🎬 Full Example

import {
  AbsoluteFill,
  Audio,
  Composition,
  staticFile,
} from "remotion";
import { AnimatedCaptions } from "remotion-captioneer";
import captions from "./captions.json";

export const CaptionedVideo = () => (
  <AbsoluteFill
    style={{
      background: "linear-gradient(135deg, #0a0a0a 0%, #1a1a2e 100%)",
    }}
  >
    <Audio src={staticFile("my-audio.mp4")} />
    <AnimatedCaptions
      captions={captions}
      style="karaoke"
      position="bottom"
      highlightColor="#FF6B6B"
      fontSize={64}
      fontFamily="Inter, sans-serif"
    />
  </AbsoluteFill>
);

export const RemotionRoot = () => (
  <Composition
    id="CaptionedVideo"
    component={CaptionedVideo}
    durationInFrames={900} // 30s at 30fps
    fps={30}
    width={1920}
    height={1080}
  />
);
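
The hard-coded `durationInFrames={900}` above assumes a 30-second clip. A small helper can derive it from the caption data's `durationMs` instead; `durationInFramesFor` is a hypothetical name, and rounding up is an assumption made so the last word is never cut off.

```typescript
// Hypothetical helper: convert a duration in milliseconds to a whole number
// of frames, rounding up and never returning fewer than one frame.
function durationInFramesFor(durationMs: number, fps: number): number {
  return Math.max(1, Math.ceil((durationMs * fps) / 1000));
}
```

For example, `durationInFramesFor(captions.durationMs, 30)` could be passed to the `durationInFrames` prop of `<Composition>`.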

🐳 Docker

FROM node:20-slim

# Install whisper.cpp dependencies
RUN apt-get update && apt-get install -y git cmake build-essential

WORKDIR /app
COPY . .
RUN npm install

# The CLI will auto-install whisper.cpp on first run
ENTRYPOINT ["npx", "captioneer"]

🛠️ Component Props

<AnimatedCaptions>

| Prop | Type | Default | Description |
|------|------|---------|-------------|
| captions | CaptionData | required | Caption data object |
| style | CaptionStyle | "word-highlight" | Caption animation style |
| fontFamily | string | "Inter, sans-serif" | Font family |
| fontSize | number | 56 | Font size in px |
| fontColor | string | "rgba(255,255,255,0.5)" | Inactive text color |
| highlightColor | string | "#FFD700" | Active/highlight color |
| position | "top" \| "center" \| "bottom" | "bottom" | Vertical position |


📚 Examples

See the examples/ directory for complete working examples:

| File | What it shows |
|------|---------------|
| 01-basic.tsx | Simplest captioned video |
| 02-presets.tsx | Using presets (TikTok, Cinematic, Gaming) |
| 03-audio-sync.tsx | Beat-reactive animations |
| 04-template.tsx | Multi-scene template (intro → content → outro) |
| 05-layouts.tsx | Custom layouts with primitives |
| 06-export.ts | Export to SRT, VTT, ASS formats |
| 07-emoji.tsx | Emoji reactions at word timestamps |


🗺️ Roadmap

✅ Completed

  • [x] 14 caption styles (word-highlight, karaoke, typewriter, bounce, wave, glow, typewriter-erase, pill, flicker, highlighter, blur, rainbow, scale, spotlight)
  • [x] Multi-line auto-wrapping with smart breaks (smartWrap())
  • [x] Word-level emoji reactions (EmojiReactions + autoGenerateReactions())
  • [x] Real-time preview server (npx captioneer preview)
  • [x] Batch processing mode (npx captioneer batch ./audio/)
  • [x] Multi-provider STT (OpenAI, Groq, Deepgram, AssemblyAI)
  • [x] @remotion/captions compatibility layer
  • [x] Audio-video sync (beat detection, volume hooks, timeline)
  • [x] Template system for data-driven videos
  • [x] Layout primitives (Stack, Row, Columns, Grid, etc.)
  • [x] 16 caption presets (TikTok, Instagram, Podcast, Cinematic, etc.)
  • [x] Export formats (SRT, VTT, ASS, TXT, word-level)

🔮 Future

  • [ ] Caption style marketplace (community-contributed styles)
  • [ ] AI-powered auto-emoji (LLM suggests emojis from context)
  • [ ] Multi-language caption support with RTL
  • [ ] Caption editor with visual timeline
  • [ ] Integration with video hosting APIs (YouTube, Vimeo)
  • [ ] Real-time caption rendering in browser (WebCodecs)
  • [ ] Caption translation utilities
  • [ ] Speaker diarization (multi-speaker support)

🤝 Contributing

Contributions welcome! Please open an issue first to discuss what you'd like to change.

  1. Fork the repo
  2. Create your feature branch (git checkout -b feature/amazing)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing)
  5. Open a Pull Request

📄 License

MIT © Shuvo Roy


💡 Why This Exists

Everyone using Remotion for captioned videos ends up rebuilding the same thing:

Get audio → run Whisper → parse output → sync to frames → animate words

This package handles steps 2-5 so you can focus on your content, not plumbing.
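
For a sense of the plumbing involved, the "sync to frames" step can be sketched as a lookup from the current frame to the word being spoken. This is an illustration under simple assumptions (time-ordered, non-overlapping words), not the package's internals; `msToFrame` and `activeWordIndex` are hypothetical names.

```typescript
interface TimedWord { word: string; startMs: number; endMs: number }

// Convert a millisecond timestamp to the nearest frame number.
function msToFrame(ms: number, fps: number): number {
  return Math.round((ms * fps) / 1000);
}

// Hypothetical helper: index of the word spoken at the given frame,
// or -1 when the frame falls in a pause or outside the audio.
function activeWordIndex(words: TimedWord[], frame: number, fps: number): number {
  const timeMs = (frame * 1000) / fps;
  for (let i = 0; i < words.length; i++) {
    if (timeMs >= words[i].startMs && timeMs < words[i].endMs) {
      return i;
    }
  }
  return -1;
}
```

In a Remotion component, the frame and fps would typically come from `useCurrentFrame()` and `useVideoConfig()`.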

⭐ Star this repo if it helps you!