clipify-web-transcoder

A TypeScript library for streaming video transcoding with ONNX tensor processing. Built on top of mediabunny for high-performance WebCodecs-based video processing.

Features

  • Streaming transcoder: Decode → process → encode loop with constant memory usage
  • ONNX tensor integration: Input frames are converted to [1, 3, H, W] RGB tensors for direct use with ONNX models
  • Transparent video output: Outputs WebM with VP9 codec and alpha channel support
  • Memory efficient: Online processing keeps memory usage constant regardless of video length
  • TypeScript first: Full type definitions included

Installation

npm install clipify-web-transcoder

Quick Start

import { VideoTranscoder } from 'clipify-web-transcoder';

// Open a video file
const video = await VideoTranscoder.open(videoFile, { dtype: 'float32' });

// Access video metadata
const { width, height, frameRate, frameCount } = video.metadata;
console.log(`Video: ${width}x${height} @ ${frameRate}fps, ${frameCount} frames`);

// Set up frame processing
video.processFrame(async (frame) => {
    // frame.tensor: [1, 3, H, W] RGB tensor (normalized 0-1)
    // frame.index: current frame number (0-based)
    // frame.time: presentation time in seconds
    // frame.width, frame.height: frame dimensions
    
    // Your processing here - e.g., run through an ONNX model
    // (`session` is assumed to be a pre-loaded onnxruntime-web InferenceSession)
    const results = await session.run({ src: frame.tensor });
    
    // Return foreground and alpha tensors
    return { foreground: results.fgr, alpha: results.pha };
});

// Optional: track progress
video.onProgress((progress, stage) => {
    console.log(`${stage}: ${Math.round(progress * 100)}%`);
});

// Run the transcoding pipeline
const outputBuffer = await video.run();

// Create downloadable blob
const blob = new Blob([outputBuffer], { type: 'video/webm' });

// Release resources
video.close();
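
To save the result, a standard object-URL download works; the sketch below uses plain DOM APIs, and the filename is illustrative:

// Trigger a download of the transcoded WebM
const url = URL.createObjectURL(blob);
const a = document.createElement('a');
a.href = url;
a.download = 'output.webm'; // illustrative filename
a.click();
URL.revokeObjectURL(url);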

Why WebM with Alpha?

This library outputs WebM with the VP9 codec because it is currently the only widely supported web video format with alpha-channel (transparency) support. This enables use cases like:

  • Background removal with AI models (like RobustVideoMatting)
  • Video compositing in web applications
  • Green screen replacement without pre-keying
  • Overlays for video editing
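
Browsers that can decode VP9 alpha honor the transparency natively in a <video> element, so the output blob can be layered over any page content. A minimal sketch (the container element is illustrative):

// Play the transparent WebM over other page content
const video = document.createElement('video');
video.src = URL.createObjectURL(blob); // `blob` from the Quick Start example
video.muted = true;    // required for autoplay in most browsers
video.autoplay = true;
video.loop = true;
document.querySelector('#stage').appendChild(video); // '#stage' is illustrative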

API Reference

VideoTranscoder.open(source, options?)

Static factory method to create a transcoder instance.

Parameters:

| Parameter | Type | Description |
|-----------|------|-------------|
| source | VideoSourceData | Video source (File, Blob, ArrayBuffer, or URL string) |
| options | TranscoderOptions | Optional configuration |
| options.dtype | TensorDataType | Tensor data type: 'float32' (default) or 'float16' |
| options.outputBitrate | number | Output video bitrate in bps (default: 10000000) |

Returns: Promise<VideoTranscoder>

// From file input
const video = await VideoTranscoder.open(fileInput.files[0]);

// From URL with options
const video = await VideoTranscoder.open('video.mp4', { 
    dtype: 'float32',
    outputBitrate: 20_000_000 
});

VideoTranscoder Instance

metadata

Read-only property containing video metadata. Available immediately after opening.

Type: VideoMetadata

const { width, height, frameRate, duration, frameCount, hasAudio } = video.metadata;

processFrame(handler)

Set the frame processing callback. Must be called before run().

Parameters:

| Parameter | Type | Description |
|-----------|------|-------------|
| handler | FrameHandler | Function called for each frame |

Returns: VideoTranscoder (for chaining)

video.processFrame(async (frame) => {
    // Process the frame...
    return { foreground: fgrTensor, alpha: alphaTensor };
});

onProgress(callback)

Set the callback for progress updates during processing.

Parameters:

| Parameter | Type | Description |
|-----------|------|-------------|
| callback | ProgressCallback | Function called with progress updates |

Returns: VideoTranscoder (for chaining)

video.onProgress((progress, stage) => {
    // progress: 0-1
    // stage: 'decoding' or 'encoding'
    console.log(`${stage}: ${Math.round(progress * 100)}%`);
});

run(signal?)

Start the transcode pipeline and process all frames.

Parameters:

| Parameter | Type | Description |
|-----------|------|-------------|
| signal | AbortSignal | Optional signal for cancellation |

Returns: Promise<ArrayBuffer> - The processed video (WebM format with VP9 codec)

// Basic usage
const output = await video.run();

// With abort support
const controller = new AbortController();
setTimeout(() => controller.abort(), 30000); // 30 second timeout
const output = await video.run(controller.signal);

abort()

Abort the current processing operation.

video.abort();

close()

Release all resources. Call this when done with the transcoder.

video.close();

Types

TensorDataType

type TensorDataType = 'float32' | 'float16';

| Type | Description |
|------|-------------|
| 'float32' | 32-bit float tensors (wider compatibility) |
| 'float16' | 16-bit float tensors (requires browser support, better performance) |
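
Since 'float16' depends on Float16Array availability, a reasonable pattern is to probe with isFloat16Supported() (see Utility Functions below) and fall back to 'float32':

import { VideoTranscoder, isFloat16Supported } from 'clipify-web-transcoder';

// Prefer float16 where available; fall back to the widely supported float32
const dtype = isFloat16Supported() ? 'float16' : 'float32';
const video = await VideoTranscoder.open(videoFile, { dtype });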

VideoFrame

The object passed to the frame handler.

interface VideoFrame {
    /** Frame index (0-based) */
    index: number;
    /** Presentation time in seconds */
    time: number;
    /** Input tensor with shape [1, 3, height, width] (RGB, normalized 0-1) */
    tensor: Tensor;
    /** Frame width in pixels */
    width: number;
    /** Frame height in pixels */
    height: number;
}

FrameResult

The object returned from the frame handler.

interface FrameResult {
    /** Foreground tensor with shape [1, 3, height, width] (RGB, 0-1) */
    foreground: Tensor;
    /** Alpha tensor with shape [1, 1, height, width] (0-1) */
    alpha: Tensor;
}

FrameHandler

type FrameHandler = (frame: VideoFrame) => FrameResult | Promise<FrameResult>;

ProgressCallback

type ProgressCallback = (progress: number, stage: 'decoding' | 'encoding') => void;

VideoMetadata

interface VideoMetadata {
    readonly width: number;
    readonly height: number;
    readonly frameRate: number;
    readonly duration: number;
    readonly hasAudio: boolean;
    readonly frameCount: number;
}

TranscoderOptions

interface TranscoderOptions {
    /** Tensor data type (default: 'float32') */
    dtype?: TensorDataType;
    /** Output video bitrate in bps (default: 10000000) */
    outputBitrate?: number;
}

VideoSourceData

type VideoSourceData = string | ArrayBuffer | Blob | File;

Utility Functions

These are exported for advanced use cases:

rgbaToTensor(rgba, width, height, dataType, reuseBuffer?)

Convert RGBA Uint8ClampedArray to an ONNX tensor with shape [1, 3, H, W].

import { rgbaToTensor } from 'clipify-web-transcoder';

// Basic usage
const tensor = rgbaToTensor(rgbaData, 1920, 1080, 'float32');

// With buffer reuse for better performance
const buffer = new Float32Array(3 * 1920 * 1080);
const tensor = rgbaToTensor(rgbaData, 1920, 1080, 'float32', buffer);

tensorsToRgba(foreground, alpha, width, height, reuseBuffer?)

Convert foreground [1, 3, H, W] and alpha [1, 1, H, W] tensors to RGBA Uint8ClampedArray.

import { tensorsToRgba } from 'clipify-web-transcoder';

// Basic usage
const rgba = tensorsToRgba(fgrTensor, alphaTensor, 1920, 1080);

// With buffer reuse for better performance
const buffer = new Uint8ClampedArray(1920 * 1080 * 4);
const rgba = tensorsToRgba(fgrTensor, alphaTensor, 1920, 1080, buffer);
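
Because both helpers use RGBA Uint8ClampedArray, they pair naturally with canvas ImageData. A sketch, assuming an existing canvas and model-produced fgrTensor/alphaTensor:

import { rgbaToTensor, tensorsToRgba } from 'clipify-web-transcoder';

const ctx = canvas.getContext('2d'); // `canvas` is assumed to exist
const { data, width, height } = ctx.getImageData(0, 0, canvas.width, canvas.height);

// ImageData.data is already RGBA, so it feeds rgbaToTensor directly
const tensor = rgbaToTensor(data, width, height, 'float32');

// ... run a model on `tensor` to produce fgrTensor and alphaTensor ...

// Convert the tensors back to pixels and draw them on the canvas
const rgba = tensorsToRgba(fgrTensor, alphaTensor, width, height);
ctx.putImageData(new ImageData(rgba, width, height), 0, 0);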

isFloat16Supported()

Check if Float16Array is supported in the current environment.

import { isFloat16Supported } from 'clipify-web-transcoder';
if (isFloat16Supported()) {
    // Safe to use float16 tensors
}

VideoProcessor Class

For advanced usage, you can use the VideoProcessor class directly:

import { VideoProcessor } from 'clipify-web-transcoder';

const processor = new VideoProcessor('float32');
await processor.load(videoBlob);

const metadata = processor.getMetadata();

processor.setFrameHandler(async (frame) => {
    const results = await session.run({ src: frame.tensor });
    return { foreground: results.fgr, alpha: results.pha };
});

const output = await processor.run();
processor.close();

Browser Support

This library requires browser support for:

  • WebCodecs (VideoDecoder and VideoEncoder)
  • VP9 encode and decode in WebM, including the alpha channel
  • Float16Array (only if the 'float16' tensor type is used)
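
A quick capability probe might look like the sketch below; this uses standard WebCodecs APIs rather than anything exported by this library, and the codec string and dimensions are illustrative:

// Detect WebCodecs and VP9-with-alpha encode support
const hasWebCodecs =
    typeof VideoDecoder !== 'undefined' && typeof VideoEncoder !== 'undefined';

if (hasWebCodecs) {
    const { supported } = await VideoEncoder.isConfigSupported({
        codec: 'vp09.00.10.08', // VP9 profile 0, level 1.0, 8-bit
        width: 1920,
        height: 1080,
        alpha: 'keep', // request alpha-channel encoding
    });
    console.log('VP9 alpha encoding supported:', supported);
}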

Examples

Background Removal with RobustVideoMatting

import { VideoTranscoder } from 'clipify-web-transcoder';
import * as ort from 'onnxruntime-web';

// Load the RVM model
const session = await ort.InferenceSession.create('rvm_mobilenetv3_fp32.onnx');

// Initialize recurrent states (r1-r4) as zeros
let r1 = new ort.Tensor('float32', new Float32Array(1 * 16 * 1 * 1), [1, 16, 1, 1]);
let r2 = new ort.Tensor('float32', new Float32Array(1 * 20 * 1 * 1), [1, 20, 1, 1]);
let r3 = new ort.Tensor('float32', new Float32Array(1 * 40 * 1 * 1), [1, 40, 1, 1]);
let r4 = new ort.Tensor('float32', new Float32Array(1 * 64 * 1 * 1), [1, 64, 1, 1]);

const video = await VideoTranscoder.open(videoFile, { dtype: 'float32' });

video.processFrame(async (frame) => {
    // Run the model with recurrent states
    const results = await session.run({
        src: frame.tensor,
        r1i: r1, r2i: r2, r3i: r3, r4i: r4,
        downsample_ratio: new ort.Tensor('float32', [0.25], [1])
    });
    
    // Update recurrent states for next frame
    r1 = results.r1o;
    r2 = results.r2o;
    r3 = results.r3o;
    r4 = results.r4o;
    
    return { foreground: results.fgr, alpha: results.pha };
});

const outputBuffer = await video.run();
const blob = new Blob([outputBuffer], { type: 'video/webm' });
// Download or use the video with transparent background...

video.close();

Simple Pass-through (No Model)

import { VideoTranscoder } from 'clipify-web-transcoder';
import { Tensor } from 'onnxruntime-web';

const video = await VideoTranscoder.open(videoFile, { dtype: 'float32' });
const { width, height } = video.metadata;

// Pre-allocate alpha buffer ONCE (optimization)
const alphaData = new Float32Array(height * width).fill(1.0);
const alpha = new Tensor('float32', alphaData, [1, 1, height, width]);

video.processFrame((frame) => {
    // Pass through the input as foreground, reuse pre-built alpha
    return { foreground: frame.tensor, alpha };
});

const output = await video.run();
video.close();

Gradient Alpha Mask

import { VideoTranscoder } from 'clipify-web-transcoder';
import { Tensor } from 'onnxruntime-web';

const video = await VideoTranscoder.open(videoFile, { dtype: 'float32' });
const { width, height } = video.metadata;

// Pre-compute gradient alpha ONCE outside the callback (optimization)
const alphaData = new Float32Array(height * width);
for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
        alphaData[y * width + x] = x / width; // 0 on left, 1 on right
    }
}
const alpha = new Tensor('float32', alphaData, [1, 1, height, width]);

video.processFrame((frame) => {
    return { foreground: frame.tensor, alpha };
});

const output = await video.run();
video.close();

With Cancellation Support

import { VideoTranscoder } from 'clipify-web-transcoder';
import { Tensor } from 'onnxruntime-web';

const video = await VideoTranscoder.open(videoFile, { dtype: 'float32' });
const { width, height } = video.metadata;

// Build a solid (fully opaque) alpha tensor once, as in the pass-through example
const alphaData = new Float32Array(height * width).fill(1.0);
const solidAlpha = new Tensor('float32', alphaData, [1, 1, height, width]);

video.processFrame((frame) => {
    return { foreground: frame.tensor, alpha: solidAlpha };
});

// Cancel after 10 seconds
const controller = new AbortController();
setTimeout(() => controller.abort(), 10000);

try {
    const output = await video.run(controller.signal);
    // Processing completed
} catch (err) {
    if (err.name === 'AbortError') {
        console.log('Processing was cancelled');
    }
} finally {
    video.close();
}

Performance Tips

For optimal performance, especially with long videos:

1. Pre-allocate buffers outside the callback

// ❌ Bad: Allocates new arrays every frame
video.processFrame((frame) => {
    const alphaData = new Float32Array(height * width); // Allocates ~8MB per frame!
    // ...
    return { foreground, alpha };
});

// ✅ Good: Pre-allocate once, reuse every frame
const alphaData = new Float32Array(height * width);
const alpha = new Tensor('float32', alphaData, [1, 1, height, width]);

video.processFrame((frame) => {
    // Reuse pre-built tensor
    return { foreground: frame.tensor, alpha };
});

2. Reuse input tensor when possible

// If your foreground output equals the input (pass-through), don't copy it:
video.processFrame((frame) => {
    return { foreground: frame.tensor, alpha }; // Zero-copy reuse
});

3. Use TypedArray methods

// ❌ Slow: Nested loops
for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
        alphaData[y * width + x] = 1.0;
    }
}

// ✅ Fast: Use fill()
alphaData.fill(1.0);

Memory Usage

The library uses a streaming architecture that processes and encodes frames immediately, keeping memory usage constant regardless of video length:

| Video Length | Memory Usage |
|--------------|--------------|
| 20 seconds | ~50 MB |
| 5 minutes | ~50 MB |
| 1 hour | ~50 MB |

This is achieved by:

  • Online processing: Frames are decoded, processed, and encoded in a streaming loop
  • Buffer pooling: Internal buffers are reused across frames
  • Immediate cleanup: Input samples are closed after processing
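
Conceptually, the loop looks like the pseudocode below. This is an illustrative sketch, not the library's actual internals; decodeFrames and encodeFrame are hypothetical stand-ins for the WebCodecs plumbing:

// Each frame is fully handled before the next is decoded, so peak memory
// is bounded by one frame plus the pooled buffers.
for await (const sample of decodeFrames(source)) {   // hypothetical decoder iterator
    const tensor = rgbaToTensor(sample.rgba, width, height, 'float32', pooledInput);
    const result = await handler({ tensor, index: sample.index, time: sample.time, width, height });
    const rgba = tensorsToRgba(result.foreground, result.alpha, width, height, pooledOutput);
    encodeFrame(rgba, sample.time);                  // hypothetical encoder call
    sample.close();                                  // immediate cleanup of the input sample
}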

License

MIT