playwright-recast

v0.19.0

Published

24 days ago

Fluent pipeline library for processing Playwright traces into polished demo videos — TTS voiceover, subtitles, speed control, and zoom.


thepatriczek

playwright trace video demo tts voiceover subtitles srt bdd automation

playwright-recast

Transform Playwright traces into stunning demo videos — automatically.

Website · Documentation

Your Playwright tests already capture everything — traces, screenshots, network activity, cursor positions. playwright-recast turns those artifacts into polished, narrated product videos with a single fluent pipeline.

https://github.com/user-attachments/assets/418d996d-2e18-4ae8-9ccc-3e5161dc7af8

Why?

Recording product demos is painful. Every UI change means re-recording. Manual voiceover and subtitling take hours. Timing is always off.

playwright-recast flips this: your Playwright tests become your video source. Write tests once, regenerate polished videos on every deploy.

import { Recast, ElevenLabsProvider } from 'playwright-recast'

await Recast
  .from('./test-results/trace.zip')
  .parse()
  .speedUp({ duringIdle: 3.0, duringUserAction: 1.0 })
  .subtitlesFromSrt('./narration.srt')
  .voiceover(ElevenLabsProvider({ voice: 'daniel' }), { normalize: true })
  .render({ format: 'mp4', resolution: '1080p' })
  .toFile('demo.mp4')

That's it. Trace in, polished video out.

Features

Fluent pipeline API — Chainable, immutable, lazy-evaluated. Build complex pipelines that read like English.
Trace-based processing — Parses Playwright trace.zip (actions, screenshots, network, cursor positions). No manual recording needed.
Smart speed control — Automatically speeds up idle time, network waits, and navigation while keeping user actions at normal speed.
TTS voiceover — Generate narration with OpenAI TTS, ElevenLabs, or Amazon Polly. Properly timed with silence padding.
Subtitle generation — SRT, WebVTT, and ASS output. Import external SRT or generate from trace BDD step titles.
Styled subtitle burn-in — Configurable font, size, color, background box with opacity, padding, position. Smart punctuation-based chunking for single-line display.
playwright-bdd support — First-class integration with playwright-bdd Gherkin steps. Doc strings become voiceover narration.
Click highlighting — Animated ripple effect at click positions with optional click sound. Configurable color, opacity, radius, duration.
Cursor overlay — Animated cursor appears before each click, moves to the click position with ease-out animation, then disappears. Bundled arrow cursor or custom image.
Animated zoom with easing — Auto-zoom uses customizable easing functions (ease-in-out, ease-out, cubic-bezier, or custom JS functions) with smooth zoom-to-zoom panning.
Frame interpolation — Smooth out choppy browser recordings with ffmpeg minterpolate. Blend, duplicate, or motion-compensated modes with multi-pass support.
Step helpers — narrate(), highlight(), zoom(), pace(), click(), markClick(), waitForNarration() — importable helpers for Playwright step definitions. narrate/highlight/zoom/click write marker steps directly into the trace zip, so the pipeline picks them up automatically via subtitlesFromTrace() (no report.json or extra pipeline calls needed).
Polished click markers — click() / markClick() mark a click in the trace; the renderer prefers these over auto-detected clicks and plays a deliberate, held cursor approach over the painted target (configurable via cursorOverlay({ approachMs })) — no more "the mouse moves before there's anything to click on."
Voiceover-driven freezes — When a TTS narration is longer than its visual window, the renderer holds the current frame until the audio finishes; overlays freeze with it, click sounds shift to match. waitForNarration() marks an explicit beat to hold on until a line is fully spoken — so with TTS you can skip autoWait entirely and run the test at full speed while the rendered video stays in sync.
Soft (embedded) subtitle track — render({ embedSubtitles: true }) muxes a toggleable subtitle track into the container (mov_text for mp4, webvtt for webm).
Background music — Add background music with auto-ducking during voiceover, looping, and fade-out. Covers intro/outro.
Intro/outro — Prepend/append branded video clips with smooth crossfade transitions. Audio preserved.
MCP server — AI-assisted video creation via Model Context Protocol. Record, analyze, and render through any MCP-compatible client (Claude Code, etc.).
recast-studio — Record browser sessions via Playwright Codegen, then generate videos with a Claude Code skill. No code required.
CLI included — npx playwright-recast -i trace.zip -o demo.mp4 — no code needed.
Zero lock-in — Every stage is optional. Use just the trace parser, just the subtitle generator, or the full pipeline.

Quick Start

Install

npm install playwright-recast
# or
bun add playwright-recast

System requirement: ffmpeg and ffprobe must be on your PATH.

# macOS
brew install ffmpeg

# Ubuntu
sudo apt install ffmpeg

Trace inputs. Two layouts work:

Test-results directory (preferred): pass a folder or trace.zip whose sibling is a .webm recorded by Playwright's recordVideo (use: { video: 'on' }). Native frame rate, best quality.
Standalone trace.zip: pass the zip alone — the pipeline assembles the source video from screencast JPEG frames stored inside the trace. Variable cadence (~capture rate), but works out of the box on traces from the Playwright trace viewer or test runs without recordVideo.

CLI Usage

# Basic — trace to video
npx playwright-recast -i ./test-results/trace.zip -o demo.mp4

# With speed processing
npx playwright-recast -i ./traces --speed-idle 4.0 --speed-action 1.0

# With external SRT subtitles
npx playwright-recast -i ./traces --srt narration.srt --burn-subs

# With TTS voiceover (OpenAI)
npx playwright-recast -i ./traces --srt narration.srt --provider openai --voice nova

# With TTS voiceover (ElevenLabs)
npx playwright-recast -i ./traces --srt narration.srt --provider elevenlabs --voice onwK4e9ZLuTAKqWW03F9

Programmatic API

import { Recast, OpenAIProvider } from 'playwright-recast'

// Minimal — just trace to video
await Recast.from('./traces').parse().render().toFile('output.mp4')

// Full pipeline
await Recast
  .from('./test-results/')
  .parse()
  .hideSteps(s => s.keyword === 'Given' && s.text?.includes('logged in'))
  .speedUp({
    duringIdle: 4.0,
    duringUserAction: 1.0,
    duringNetworkWait: 2.0,
    minSegmentDuration: 500,
  })
  .subtitlesFromSrt('./narration.srt')
  .voiceover(OpenAIProvider({
    voice: 'nova',
    speed: 1.2,
    instructions: 'Professional product demo narration.',
  }))
  .render({
    format: 'mp4',
    resolution: '1080p',
    fps: 60,
    burnSubtitles: true,
    subtitleStyle: {
      fontSize: 48,
      primaryColor: '#1a1a1a',
      backgroundColor: '#FFFFFF',
      backgroundOpacity: 0.75,
      padding: 20,
      bold: true,
      chunkOptions: { maxCharsPerLine: 55 },
    },
  })
  .toFile('demo.mp4')

playwright-bdd Integration

Use narrate(), highlight(), zoom(), pace(), click(), and waitForNarration() in your BDD step definitions:

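// steps/fixtures.ts
import { test, createBdd } from 'playwright-bdd'
import { setupRecast, narrate, highlight, zoom, pace, click, waitForNarration } from 'playwright-recast'

// Connect the helpers to Playwright's test object
setupRecast(test)
// Optional global defaults:
// setupRecast(test, { narrateAutoWait: true, clickSettleMs: 200 })

// Step-definition functions imported by steps/my-steps.ts
export const { Given, When, Then } = createBdd(test)
export { narrate, highlight, zoom, pace, click, waitForNarration }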

// steps/my-steps.ts
import { Given, When, Then } from './fixtures'
import { narrate, highlight, zoom, pace, click, waitForNarration } from 'playwright-recast'

Given('the user opens the dashboard', async ({ page }, docString?: string) => {
  await narrate(docString)
  await page.goto('/dashboard')
  await pace(page, 4000)
})

When('the user highlights revenue', async ({ page }, docString?: string) => {
  await narrate(docString, { autoWait: true })  // pad test with estimated speak time
  await highlight(page.locator('h2'), { text: 'Revenue' })
  await zoom(page.locator('.kpi-card'), 1.3)
})

Then('the user opens the report', async ({ page }, docString?: string) => {
  await narrate(docString)
  await click(page.getByRole('link', { name: 'Reports' }))  // held cursor approach
  await waitForNarration()                                  // hold until the line finishes
})

Each helper writes a marker-prefixed test.step() into the trace zip after setupRecast(test) has connected the helpers to Playwright — subtitlesFromTrace() picks them all up and the renderer applies overlays, zoom, clicks, and per-narration timing automatically. click() markers render with a held cursor approach when the pipeline includes cursorOverlay() and/or clickEffect().

Feature: Dashboard demo

  Scenario: View analytics
    Given the user opens the dashboard
      """
      Let's open the analytics dashboard to see real-time metrics.
      """
    When the user highlights revenue
      """
      The revenue figure is highlighted so it stands out.
      """

Pipeline Stages

Every stage is optional and composable:

| Stage | Description |
|-------|-------------|
| .parse() | Parse Playwright trace.zip into structured data (actions, frames, network, cursor) |
| .injectActions(actions) | Inject synthetic actions into a parsed trace (e.g., DOM-tracked actions from page.pause() recordings) |
| .hideSteps(predicate) | Remove steps from the output (e.g., login, setup) |
| .speedUp(config) | Adjust video speed based on activity (idle, action, network) |
| .subtitles(textFn) | Generate subtitles from trace actions |
| .subtitlesFromSrt(path) | Load subtitles from an external SRT file |
| .subtitlesFromTrace() | Auto-generate subtitles from BDD step titles in trace |
| .textProcessing(config) | Sanitize subtitle text before TTS (strip quotes, normalize dashes, custom rules) |
| .autoZoom(config) | Auto-zoom to user actions with customizable easing transitions |
| .enrichZoomFromReport(steps) | Apply zoom coordinates from external report data |
| .cursorOverlay(config) | Animated cursor at click positions (appears, moves, disappears) |
| .clickEffect(config) | Add visual ripple + optional click sound at click positions |
| .textHighlight(config) | Animated marker overlay on text (swipe-in reveal, auto-positioned from report) |
| .backgroundMusic({ path, volume?, ... }) | Add background music with auto-ducking, loop, fade-out |
| .intro({ path, fadeDuration? }) | Prepend intro video with crossfade transition |
| .outro({ path, fadeDuration? }) | Append outro video with crossfade transition |
| .interpolate(config) | Frame interpolation for smoother video (ffmpeg minterpolate) |
| .voiceover(provider, options?) | Generate TTS audio from subtitle text; { normalize: true } level-matches segments (EBU R128) |
| .render(config) | Render final video (format, resolution, fps, styled subtitle burn-in) |
| .toFile(path) | Execute pipeline and write output |

Subtitle Styling

Burn styled subtitles into the video with full control over appearance:

.render({
  burnSubtitles: true,
  subtitleStyle: {
    fontFamily: 'Arial',          // Any system font
    fontSize: 48,                 // Pixels (relative to 1080p)
    primaryColor: '#1a1a1a',      // Text color (hex)
    backgroundColor: '#FFFFFF',   // Box background (hex)
    backgroundOpacity: 0.75,      // 0.0 transparent — 1.0 opaque
    padding: 20,                  // Box padding in px
    bold: true,
    position: 'bottom',           // 'bottom' or 'top'
    marginVertical: 50,           // Distance from edge
    marginHorizontal: 100,        // Side margins (text wraps within)
    wrapStyle: 'smart',           // 'smart', 'endOfLine', 'none'
    chunkOptions: {               // Split long text into single-line chunks
      maxCharsPerLine: 55,        // Split at punctuation when text exceeds this
      minCharsPerChunk: 15,       // Merge tiny fragments
    },
  },
})

Punctuation-based chunking splits long subtitle text into shorter single-line entries. Time is distributed proportionally by character count. Splits at sentence boundaries (. ! ?) first, then clause boundaries (, ; :) if still too long.
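The proportional timing rule can be sketched as follows. This is an illustrative re-implementation under stated assumptions, not the library's code: `Cue`, `splitAtSentences`, and `chunkCue` are hypothetical names, and the sketch covers only the sentence-boundary split (clause-level splitting and `minCharsPerChunk` merging are left out).

```typescript
// Hypothetical cue shape for this sketch; not a playwright-recast type.
interface Cue {
  startMs: number
  endMs: number
  text: string
}

// Split after '.', '!' or '?' when followed by whitespace.
function splitAtSentences(text: string): string[] {
  return text.split(/(?<=[.!?])\s+/).filter((part) => part.length > 0)
}

// Give each chunk a share of the cue's duration proportional to its length.
function chunkCue(cue: Cue): Cue[] {
  const chunks = splitAtSentences(cue.text)
  if (chunks.length <= 1) return [cue]

  const totalChars = chunks.reduce((sum, chunk) => sum + chunk.length, 0)
  const durationMs = cue.endMs - cue.startMs
  let consumedChars = 0

  return chunks.map((text) => {
    const startMs = cue.startMs + Math.round((consumedChars / totalChars) * durationMs)
    consumedChars += text.length
    const endMs = cue.startMs + Math.round((consumedChars / totalChars) * durationMs)
    return { startMs, endMs, text }
  })
}

const chunked = chunkCue({
  startMs: 0,
  endMs: 4000,
  text: 'Open menu. Then select the revenue chart.',
})
console.log(chunked)
```

In this example the first sentence has 10 characters and the second has 30, so a 4-second cue is split into a 1-second chunk followed by a 3-second chunk.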

Without subtitleStyle, burnSubtitles: true falls back to default ffmpeg SRT rendering.

Text Processing

Clean subtitle text before sending to TTS providers. Removes typographic characters that cause artifacts in voice synthesis while keeping the original text for visual subtitles.

// Built-in sanitization (strips smart quotes, normalizes dashes, etc.)
.textProcessing({ builtins: true })

// Custom regex rules
.textProcessing({
  builtins: true,
  rules: [
    { pattern: '\\bNSS\\b', flags: 'g', replacement: 'Nejvyšší správní soud' },
  ],
})

// Programmatic transform
.textProcessing({
  transform: (text) => text.replace(/\[.*?\]/g, ''),
})

Built-in rules (when builtins: true):

Remove double quotes: „ " " " « » "
Remove single quotes: ' ' ‚ ‛ ‹ ›
Dashes → comma: – — → ,
Ellipsis: … → ...
Normalize: NBSP → space, collapse whitespace, trim

Text processing writes to ttsText — the voiceover uses cleaned text while burnt-in subtitles and SRT/VTT output keep the original text.
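As a rough illustration of what the listed rules amount to, here is a minimal sketch. `sanitizeForTts` is a hypothetical function, not part of playwright-recast, and the exact character set is an assumption based on the list above: it strips typographic quotes and the straight double quote but leaves the ASCII apostrophe alone so contractions survive.

```typescript
// Hypothetical stand-in for the documented built-in rules; not the library's source.
function sanitizeForTts(text: string): string {
  return text
    // Double quotes: low-9, left/right curly, straight, and guillemets
    .replace(/[\u201E\u201C\u201D"\u00AB\u00BB]/g, '')
    // Single typographic quotes (the ASCII apostrophe is kept; assumption)
    .replace(/[\u2018\u2019\u201A\u201B\u2039\u203A]/g, '')
    // En dash and em dash, with surrounding spaces, become a comma
    .replace(/\s*[\u2013\u2014]\s*/g, ', ')
    // Ellipsis character becomes three dots
    .replace(/\u2026/g, '...')
    // NBSP to space, collapse whitespace, trim
    .replace(/\u00A0/g, ' ')
    .replace(/\s+/g, ' ')
    .trim()
}

const cleaned = sanitizeForTts('\u201EHello\u201C \u2013 world\u2026\u00A0 ok ')
console.log(cleaned)
```

The original string would still be used for the visual subtitle; only the cleaned copy would go to the TTS provider.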

CLI:

npx playwright-recast -i ./traces --text-processing --provider openai
npx playwright-recast -i ./traces --text-processing-config ./rules.json --provider elevenlabs

TTS Providers

OpenAI TTS

import { OpenAIProvider } from 'playwright-recast/providers/openai'

OpenAIProvider({
  voice: 'nova',          // alloy, echo, fable, onyx, nova, shimmer
  model: 'gpt-4o-mini-tts',
  speed: 1.2,
  instructions: 'Calm, professional demo narration.',
  cacheDir: './.recast-cache/openai',  // optional: disk cache to skip re-synthesis
})

Requires OPENAI_API_KEY environment variable or apiKey option.

ElevenLabs

import { ElevenLabsProvider } from 'playwright-recast/providers/elevenlabs'

ElevenLabsProvider({
  voice: 'onwK4e9ZLuTAKqWW03F9',  // Daniel
  model: 'eleven_multilingual_v2',
  languageCode: 'cs',              // Force Czech (ISO 639-1)
  voiceSettings: {
    stability: 0.75,               // Higher = more consistent delivery (less drift)
    similarityBoost: 0.75,
  },
  cacheDir: './.recast-cache/elevenlabs',  // optional: disk cache to skip re-synthesis
})

Requires ELEVENLABS_API_KEY environment variable or apiKey option.

Amazon Polly

import { PollyProvider } from 'playwright-recast/providers/polly'

PollyProvider({
  region: 'us-east-1',
  voice: 'Joanna',          // Matthew, Ruth, Stephen, Ivy, Joey, …
  engine: 'neural',         // standard | neural | long-form | generative
  sampleRate: '24000',
  cacheDir: './.recast-cache/polly',  // optional: disk cache to skip re-synthesis
  // Credentials are optional — the AWS SDK default chain is used
  // (env vars, ~/.aws/credentials, IAM role on EC2/ECS/Lambda, SSO).
})

Install the SDK alongside this package:

npm install @aws-sdk/client-polly

Resolves credentials from AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY (and optional AWS_SESSION_TOKEN), shared config, or — preferred on AWS — an attached IAM role.

Qwen3-TTS (local, GPU)

Local TTS backed by Alibaba's Qwen3-TTS models. Two modes:

Clone — synthesize text in the voice of a reference WAV/MP3.
Design — synthesize text in a voice described by a prompt.

import { QwenTtsProvider } from 'playwright-recast/providers/qwen'

// Clone an existing voice
.voiceover(QwenTtsProvider({
  mode: 'clone',
  voiceSample: './my-voice.wav',
  refText: 'Welcome! In this screencast we will walk through the key concepts.',
  language: 'English',
  cacheAudio: true,
}))

// Design a voice from a description
.voiceover(QwenTtsProvider({
  mode: 'design',
  voiceDescription: 'A clear, steady male voice with a calm and even tone.',
  refText: 'Welcome! In this screencast we will walk through the key concepts.',
  language: 'English',
  cacheAudio: true,
  cacheVoiceDesign: true,
}))

The provider spawns a Python sidecar (PyTorch + qwen-tts + flash-attn) once per pipeline run. The deps are heavy (~5–8 GB on disk for PyTorch + flash-attn), so the recommended setup is one shared venv reused across projects, pointed at via pythonBin:

python3 -m venv ~/.venvs/playwright-recast
~/.venvs/playwright-recast/bin/pip install -r node_modules/playwright-recast/dist/voiceover/providers/qwen-sidecar/requirements.txt

.voiceover(QwenTtsProvider({
  mode: 'clone',
  voiceSample: './my-voice.wav',
  refText: 'Welcome! In this screencast we will walk through the key concepts.',
  pythonBin: `${process.env.HOME}/.venvs/playwright-recast/bin/python3`,
}))

Absolute pythonBin means no shell activation required — works in CI, cron jobs, and IDE runners alike.

Requires a CUDA-capable GPU (~4–8 GB VRAM depending on model). Set HF_TOKEN in the environment if the chosen weights are gated.

Caching — both flags default off:

| Flag | What it caches | Hash includes |
|---|---|---|
| cacheAudio | Per-segment MP3s at cacheDir/audio/<hash>.mp3 | target text, refText, ref-audio fingerprint, language, model, dtype |
| cacheVoiceDesign (design mode only) | The design WAV at cacheDir/design/<hash>.wav | description, refText, language, designModel, dtype |

cacheDir defaults to ./.recast-cache/voice/. The cache grows unbounded; manage it yourself.

Defaults: cloneModel: 'Qwen/Qwen3-TTS-12Hz-0.6B-Base', designModel: 'Qwen/Qwen3-TTS-12Hz-1.7B-VoiceDesign', device: 'cuda:0', dtype: 'bfloat16', language: 'English'.

Loudness normalization

TTS providers (especially ElevenLabs on eleven_multilingual_v2 + non-English languages) can deliver segments at wildly different levels — one line at −16 LUFS, the next at −32 LUFS. Enable opt-in per-segment normalization to fix it:

.voiceover(ElevenLabsProvider({ voice: 'daniel' }), { normalize: true })

This runs each synthesized segment through two-pass EBU R128 loudnorm before concat. Defaults: −16 LUFS integrated, −1 dBFS true peak, 11 LU range, linear mode (preserves dynamics). Override any of them:

.voiceover(provider, {
  normalize: {
    targetLufs: -18,
    truePeakDb: -1.5,
    lra: 11,
    linear: true,
  },
})

Also exported standalone for external use:

import { normalizeLoudness } from 'playwright-recast'
await normalizeLoudness('input.mp3', 'output.mp3', { targetLufs: -16 })
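For readers unfamiliar with two-pass loudnorm: ffmpeg's `loudnorm` filter first measures the input (printing JSON when `print_format=json` is set), then a second run receives those measurements back as `measured_*` options. The sketch below only builds the two filter strings. `LoudnormTarget`, `LoudnormMeasurement`, `firstPassFilter`, and `secondPassFilter` are hypothetical names, and how playwright-recast invokes ffmpeg internally is not shown in this README, so treat this as an assumption about the general technique rather than the library's implementation.

```typescript
// Option names mirror the normalize options documented above.
interface LoudnormTarget {
  targetLufs: number
  truePeakDb: number
  lra: number
  linear: boolean
}

// Fields ffmpeg prints in the first-pass JSON report (values are strings).
interface LoudnormMeasurement {
  input_i: string
  input_tp: string
  input_lra: string
  input_thresh: string
  target_offset: string
}

// Pass 1: measure only. Run with something like: ffmpeg -i in.mp3 -af <filter> -f null -
function firstPassFilter(t: LoudnormTarget): string {
  return `loudnorm=I=${t.targetLufs}:TP=${t.truePeakDb}:LRA=${t.lra}:print_format=json`
}

// Pass 2: apply normalization using the measured values from pass 1.
function secondPassFilter(t: LoudnormTarget, m: LoudnormMeasurement): string {
  return [
    `loudnorm=I=${t.targetLufs}`,
    `TP=${t.truePeakDb}`,
    `LRA=${t.lra}`,
    `measured_I=${m.input_i}`,
    `measured_TP=${m.input_tp}`,
    `measured_LRA=${m.input_lra}`,
    `measured_thresh=${m.input_thresh}`,
    `offset=${m.target_offset}`,
    `linear=${t.linear}`,
  ].join(':')
}

const target: LoudnormTarget = { targetLufs: -16, truePeakDb: -1, lra: 11, linear: true }
// Example measurement values are made up for illustration.
const measured: LoudnormMeasurement = {
  input_i: '-27.2',
  input_tp: '-4.1',
  input_lra: '5.3',
  input_thresh: '-37.6',
  target_offset: '0.4',
}
console.log(firstPassFilter(target))
console.log(secondPassFilter(target, measured))
```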

Zoom

Zoom into specific areas of the video during steps — focus the viewer's attention on the relevant UI element.

Auto-zoom from trace

Automatically zoom into input elements (fill/type actions) detected from the Playwright trace. Zoom window follows the actual action duration — zooms in when the user starts typing, zooms out when they move on. Smooth fade transitions between zoom states.

await Recast
  .from('./traces')
  .parse()
  .subtitlesFromSrt('./narration.srt')
  .autoZoom({
    inputLevel: 1.4,    // zoom level for fill/type actions
    clickLevel: 1.0,    // 1.0 = no zoom on clicks (default)
    centerBias: 0.3,    // blend coordinates toward center (0–1)
  })
  .render({ format: 'mp4' })
  .toFile('demo.mp4')

autoZoom() finds click/fill/type actions in the trace, extracts their cursor coordinates, and applies crop-and-scale zoom during the matching subtitle's time window.

Zoom from report data

Apply zoom coordinates from an external source (e.g., a demo report with per-step zoom data):

const reportSteps = [
  { zoom: null },                            // Step 1: no zoom
  { zoom: { x: 0.5, y: 0.8, level: 1.4 } }, // Step 2: zoom to input area
  { zoom: null },                            // Step 3: no zoom
  { zoom: { x: 0.78, y: 0.45, level: 1.3 }}, // Step 4: zoom to sidebar
]

await Recast
  .from('./traces')
  .parse()
  .subtitlesFromSrt('./narration.srt')
  .enrichZoomFromReport(reportSteps)
  .render({ format: 'mp4' })
  .toFile('demo.mp4')

Zoom from step helpers

Capture zoom coordinates during Playwright test execution using the zoom() helper:

import { zoom } from 'playwright-recast'

When('the user opens the sidebar', async ({ page }) => {
  const sidebar = page.locator('.sidebar-panel')
  await zoom(sidebar, 1.3) // Record zoom target for this step
  await sidebar.click()
})

The helper captures the element's bounding box as a Playwright annotation. Use enrichZoomFromReport() to apply these coordinates during video generation.

Zoom coordinates

All zoom coordinates use viewport-relative fractions (0.0–1.0):

| Field | Description | Default |
|-------|-------------|---------|
| x | Center X (0 = left, 1 = right) | 0.5 |
| y | Center Y (0 = top, 1 = bottom) | 0.5 |
| level | Zoom level (1.0 = no zoom, 2.0 = 2x) | 1.0 |

The renderer applies zoom by cropping the video to (width/level × height/level) centered at (x, y), then scaling back to the output resolution.
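The crop arithmetic can be worked through in a few lines. This is a sketch of the geometry described above, with hypothetical names (`ZoomTarget`, `CropRect`, `cropRect`). Clamping the rectangle so it stays inside the frame is an assumption of this sketch; the README does not say how the renderer handles targets near an edge.

```typescript
interface ZoomTarget {
  x: number     // center X as a viewport fraction, 0..1
  y: number     // center Y as a viewport fraction, 0..1
  level: number // 1.0 = no zoom
}

interface CropRect {
  left: number
  top: number
  width: number
  height: number
}

function clamp(value: number, min: number, max: number): number {
  return Math.min(Math.max(value, min), max)
}

// Crop size is frame size divided by the zoom level, centered at (x, y).
function cropRect(zoom: ZoomTarget, frameWidth: number, frameHeight: number): CropRect {
  const width = frameWidth / zoom.level
  const height = frameHeight / zoom.level
  // Assumption: keep the crop window inside the frame near the edges.
  const left = clamp(zoom.x * frameWidth - width / 2, 0, frameWidth - width)
  const top = clamp(zoom.y * frameHeight - height / 2, 0, frameHeight - height)
  return { left, top, width, height }
}

const centered = cropRect({ x: 0.5, y: 0.5, level: 2 }, 1920, 1080)
const corner = cropRect({ x: 1, y: 1, level: 2 }, 1920, 1080)
console.log(centered, corner)
```

At level 2 on a 1920×1080 frame, the crop is 960×540. Centered, it starts at (480, 270); aimed at the bottom-right corner, the clamp pins it to (960, 540). The cropped region is then scaled back up to the output resolution.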

Click Effect

Highlight click actions with animated ripple effects and optional click sounds.

await Recast
  .from('./traces')
  .parse()
  .clickEffect({
    color: '#3B82F6',    // Ripple color (hex, default: blue)
    opacity: 0.5,        // Ripple opacity 0.0–1.0
    radius: 30,          // Max radius in px (relative to 1080p)
    duration: 400,       // Animation duration in ms
    sound: true,         // true = bundled default, or path to custom audio
    soundVolume: 0.8,    // Sound volume 0.0–1.0
  })
  .render({ format: 'mp4' })
  .toFile('demo.mp4')

The click effect stage automatically detects click and selectOption actions from the Playwright trace. Timestamps are remapped through speed processing so ripples appear at the correct video time.
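To see what remapping a timestamp through speed processing means, consider this minimal sketch. `SpeedSegment` and `remapTimestamp` are hypothetical names, and the segment model (contiguous, sorted, one constant speed each) is an assumption rather than the library's internal representation.

```typescript
// One stretch of the original trace played back at a constant speed.
interface SpeedSegment {
  startMs: number
  endMs: number
  speed: number // 4 means the stretch plays four times faster
}

// Convert a trace-time timestamp into output-video time.
// Assumes segments are sorted and contiguous, starting at trace time 0.
function remapTimestamp(traceMs: number, segments: SpeedSegment[]): number {
  let videoMs = 0
  for (const segment of segments) {
    if (traceMs >= segment.endMs) {
      // The whole segment lies before the timestamp.
      videoMs += (segment.endMs - segment.startMs) / segment.speed
    } else {
      // The timestamp falls inside (or before) this segment.
      if (traceMs > segment.startMs) {
        videoMs += (traceMs - segment.startMs) / segment.speed
      }
      break
    }
  }
  return videoMs
}

const segments: SpeedSegment[] = [
  { startMs: 0, endMs: 4000, speed: 4 },    // idle, compressed to 1 s
  { startMs: 4000, endMs: 6000, speed: 1 }, // user action, real time
]
const clickVideoMs = remapTimestamp(5000, segments)
console.log(clickVideoMs)
```

A click at 5000 ms in the trace lands at 2000 ms in the video: the 4-second idle stretch shrinks to 1 second, and the click happens 1 second into the real-time segment.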

To emphasise specific clicks, mark them in your test with the click() / markClick() helpers. A marker suppresses the matching auto-detected click (no duplicate ripple; nearest marker wins if several compete) and — with cursorOverlay() — gives it a held cursor approach over the painted target (cursorOverlay({ approachMs }), default 500ms).

Filtering clicks:

.clickEffect({
  filter: (action) => action.method === 'click', // Only clicks, not selectOption
})

CLI:

npx playwright-recast -i ./traces --click-effect
npx playwright-recast -i ./traces --click-effect --click-sound click.mp3
npx playwright-recast -i ./traces --click-effect-config config.json

Frame Interpolation

Generate smooth intermediate frames from choppy browser recordings using ffmpeg's minterpolate filter.

await Recast
  .from('./traces')
  .parse()
  .interpolate({
    fps: 60,              // Target FPS (default: 60)
    mode: 'blend',        // 'dup' | 'blend' | 'mci' (default: 'mci')
    quality: 'balanced',  // 'fast' | 'balanced' | 'quality' (default: 'balanced')
    passes: 1,            // Multi-pass for smoother results (default: 1)
  })
  .render({ format: 'mp4' })
  .toFile('demo.mp4')

Modes

| Mode | Speed | Quality | Description |
|------|-------|---------|-------------|
| dup | Instant | None | Duplicate frames to reach target FPS |
| blend | Fast | Good | Linear crossfade between frames |
| mci | Slow | Best | Motion-compensated interpolation (CPU-intensive, especially at 4K) |

Multi-pass

With passes: 2, FPS is distributed geometrically across passes (e.g., 25fps -> 39fps -> 60fps). Each pass interpolates already-smoothed frames for a cleaner result.
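The 25 -> 39 -> 60 example follows from multiplying by the same ratio on every pass. A small sketch of that arithmetic, where `passTargets` is a hypothetical helper and rounding to whole frames per second is an assumption:

```typescript
// Per-pass FPS targets when each pass multiplies the frame rate by the same ratio.
function passTargets(sourceFps: number, targetFps: number, passes: number): number[] {
  const ratio = Math.pow(targetFps / sourceFps, 1 / passes)
  return Array.from({ length: passes }, (_, i) =>
    Math.round(sourceFps * Math.pow(ratio, i + 1)),
  )
}

const twoPass = passTargets(25, 60, 2)
console.log(twoPass)
```

With two passes from 25 fps to 60 fps, the ratio is the square root of 2.4 (about 1.549), so the intermediate target is 25 × 1.549 ≈ 39 fps.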

CLI:

npx playwright-recast -i ./traces --interpolate
npx playwright-recast -i ./traces --interpolate --interpolate-fps 30
npx playwright-recast -i ./traces --interpolate --interpolate-mode blend --interpolate-passes 2

Speed Processing

The speed processor classifies every moment of the trace:

| Activity | Default Speed | Description |
|----------|---------------|-------------|
| User Action | 1.0x | Clicks, fills, keyboard input (real-time) |
| Navigation | 2.0x | Page loads, redirects (slightly faster) |
| Network Wait | 2.0x | API calls in flight (compress wait time) |
| Idle | 4.0x | Nothing happening (skip quickly) |

.speedUp({
  duringIdle: 4.0,
  duringUserAction: 1.0,
  duringNetworkWait: 2.0,
  duringNavigation: 2.0,
  minSegmentDuration: 500,  // Avoid jarring speed changes
  maxSpeed: 8.0,            // Safety cap
})
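To get a feel for how these multipliers shorten a recording, the sketch below sums the output duration of already classified segments. `Activity`, `ClassifiedSegment`, and `outputDurationMs` are hypothetical names. The classification itself and the `minSegmentDuration` smoothing are not modeled, so this is a back-of-the-envelope estimate rather than the library's algorithm.

```typescript
type Activity = 'userAction' | 'navigation' | 'networkWait' | 'idle'

interface ClassifiedSegment {
  activity: Activity
  durationMs: number
}

// Default multipliers from the table above.
const defaultSpeeds: Record<Activity, number> = {
  userAction: 1.0,
  navigation: 2.0,
  networkWait: 2.0,
  idle: 4.0,
}

// Each segment's output length is its original length divided by its (capped) speed.
function outputDurationMs(
  segments: ClassifiedSegment[],
  speeds: Record<Activity, number> = defaultSpeeds,
  maxSpeed: number = 8.0,
): number {
  return segments.reduce(
    (total, segment) => total + segment.durationMs / Math.min(speeds[segment.activity], maxSpeed),
    0,
  )
}

const recording: ClassifiedSegment[] = [
  { activity: 'userAction', durationMs: 2000 },
  { activity: 'idle', durationMs: 8000 },
  { activity: 'networkWait', durationMs: 3000 },
]
const estimatedMs = outputDurationMs(recording)
console.log(estimatedMs)
```

Here a 13-second recording shrinks to 5.5 seconds: 2000 ms of action stays as is, 8000 ms of idle becomes 2000 ms, and 3000 ms of network wait becomes 1500 ms.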

Architecture

Trace.zip → ParsedTrace → FilteredTrace → SpeedMappedTrace → SubtitledTrace → VoiceoveredTrace → MP4
               ↑               ↑                ↑                  ↑       ↑          ↑              ↑
            parse()       hideSteps()        speedUp()         subtitles() textProcessing() voiceover()  render()

The pipeline is lazy — calling chain methods builds a pipeline description. Nothing executes until .toFile() or .toBuffer() is called.

Each pipeline instance is immutable — every method returns a new pipeline, so you can branch:

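import { Recast, OpenAIProvider } from 'playwright-recast'

// Provider instance shared by any branch that needs voiceover
const openai = OpenAIProvider({ voice: 'nova' })

const base = Recast.from('./traces').parse().speedUp({ duringIdle: 3.0 })

// Branch A: with voiceover
await base.subtitlesFromSrt('./en.srt').voiceover(openai).render().toFile('demo-en.mp4')

// Branch B: subtitles only
await base.subtitlesFromSrt('./cs.srt').render({ burnSubtitles: true }).toFile('demo-cs.mp4')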
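The lazy, immutable builder pattern behind this kind of branching can be illustrated with a toy pipeline. This is a generic sketch of the pattern, not playwright-recast's implementation: `SketchPipeline` and `sketchPipeline` are hypothetical, and the terminal call only returns the planned stage list instead of rendering anything.

```typescript
interface SketchPipeline {
  stage(name: string): SketchPipeline
  toFile(path: string): string[]
}

function sketchPipeline(stages: readonly string[] = []): SketchPipeline {
  return {
    // Immutable: adding a stage returns a new pipeline and leaves this one untouched.
    stage: (name) => sketchPipeline([...stages, name]),
    // Lazy: nothing happens until the terminal call, which here just reports the plan.
    toFile: (path) => [...stages, `write:${path}`],
  }
}

const shared = sketchPipeline().stage('parse').stage('speedUp')
const planEn = shared.stage('voiceover').toFile('demo-en.mp4')
const planCs = shared.stage('burnSubtitles').toFile('demo-cs.mp4')
console.log(planEn, planCs)
```

Because `shared` is never mutated, the two branches cannot leak stages into each other, which is what makes it safe to reuse one base pipeline for several outputs.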

MCP Server

playwright-recast includes an MCP (Model Context Protocol) server for AI-assisted video creation. Any MCP-compatible client (Claude Code, Cursor, etc.) can record browser sessions, analyze traces, and render polished videos through a conversational workflow.

Typical workflow: record_session --> analyze_trace --> (edit voiceover text) --> render_video

Available Tools

| Tool | Description |
|------|-------------|
| record_session | Opens a browser at a URL for interactive recording. Returns trace metadata. |
| analyze_trace | Parses a trace and returns structured steps with timing and auto-detected hidden steps. |
| list_recordings | Lists available trace recordings in a directory. |
| render_video | Renders a polished video from steps with voiceover text, hidden flags, and full pipeline configuration. |

Configuration

Add to your project's .mcp.json:

{
  "mcpServers": {
    "recast": {
      "command": "npx",
      "args": [
        "-y",
        "-p", "playwright-recast",
        "-p", "@playwright/test",
        "-p", "openai",
        "-p", "@elevenlabs/elevenlabs-js",
        "recast-mcp"
      ],
      "env": {
        "OPENAI_API_KEY": "sk-...",
        "RECAST_RESOLUTION": "1080p",
        "RECAST_WORK_DIR": "."
      }
    }
  }
}

Environment variables:

| Variable | Default | Description |
|----------|---------|-------------|
| OPENAI_API_KEY | (none) | OpenAI API key (enables OpenAI TTS) |
| ELEVENLABS_API_KEY | (none) | ElevenLabs API key (enables ElevenLabs TTS) |
| AWS_REGION / AWS_DEFAULT_REGION | us-east-1 | AWS region for Amazon Polly |
| AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY | (none) | AWS credentials (also enables Polly auto-detect; IAM role on EC2/ECS/Lambda works without these) |
| RECAST_POLLY_ENGINE | neural | Polly engine: standard, neural, long-form, generative |
| RECAST_TTS_PROVIDER | auto-detected | Force openai, elevenlabs, polly, or none |
| RECAST_TTS_VOICE | nova / 3HdFueVb2f3yUQzeEpyz / Joanna | Default voice ID (provider-specific) |
| RECAST_RESOLUTION | 4k | Output resolution: 720p, 1080p, 1440p, 4k |
| RECAST_FPS | 120 | Output FPS |
| RECAST_WORK_DIR | . | Working directory for recordings |
| RECAST_INTRO_PATH | (none) | Default intro video path |
| RECAST_OUTRO_PATH | (none) | Default outro video path |
| RECAST_BACKGROUND_MUSIC | (none) | Default background music path |

TTS provider is auto-detected from available API keys when RECAST_TTS_PROVIDER is not set.
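A plausible shape for that auto-detection is sketched below. `detectTtsProvider` is a hypothetical function, and the precedence order (OpenAI, then ElevenLabs, then Polly) is purely an assumption; the README does not state which provider wins when several keys are present.

```typescript
type TtsProvider = 'openai' | 'elevenlabs' | 'polly' | 'none'

// Pick a provider from environment variables.
// Assumption: an explicit RECAST_TTS_PROVIDER always wins; otherwise the first available key decides.
function detectTtsProvider(env: Record<string, string | undefined>): TtsProvider {
  const forced = env['RECAST_TTS_PROVIDER']
  if (forced === 'openai' || forced === 'elevenlabs' || forced === 'polly' || forced === 'none') {
    return forced
  }
  if (env['OPENAI_API_KEY']) return 'openai'
  if (env['ELEVENLABS_API_KEY']) return 'elevenlabs'
  if (env['AWS_ACCESS_KEY_ID'] && env['AWS_SECRET_ACCESS_KEY']) return 'polly'
  return 'none'
}

const detected = detectTtsProvider({ ELEVENLABS_API_KEY: 'example-key' })
console.log(detected)
```

Setting RECAST_TTS_PROVIDER explicitly avoids depending on any detection order when more than one key is configured.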

Contributing

Contributions welcome! Please check the issues for open tasks.

git clone https://github.com/ThePatriczek/playwright-recast.git
cd playwright-recast
npm install
npm test

License

MIT
