@ainative/ai-kit-video

v0.1.1

AI Kit - Video processing utilities including recording and transcription

Video processing utilities for AI Kit, including screen recording, camera recording, audio processing, and AI-powered transcription using OpenAI's Whisper API.

Features

Recording

  • Screen Recording - Capture screen with customizable settings (quality, frame rate, audio)
  • Camera Recording - Record from webcam with MediaStream API
  • Audio Recording - Record audio with noise cancellation and processing
  • Picture-in-Picture - Composite camera feed over screen recording

Processing

  • Audio Transcription - AI-powered transcription using OpenAI Whisper
  • Text Formatting - Clean and format transcribed text
  • Highlight Detection - Detect significant moments in video
  • Noise Processing - Reduce background noise in audio

Observability

  • Instrumented Recording - Built-in performance metrics and event logging
  • Correlation IDs - Track recordings across your application
  • Performance Metrics - Monitor bitrate, file size, and duration

Installation

npm install @ainative/ai-kit-video

Usage

Screen Recording

import { ScreenRecorder } from '@ainative/ai-kit-video/recording'

const recorder = new ScreenRecorder({
  mimeType: 'video/webm;codecs=vp9',
  videoBitsPerSecond: 2500000,
  audioBitsPerSecond: 128000
})

// Start recording
const stream = await recorder.getStream()
await recorder.start()

// Stop and get recording
await recorder.stop()
const blob = recorder.getRecordingBlob()
const url = recorder.getRecordingURL()

// Download the recording
const link = document.createElement('a')
link.href = url
link.download = 'screen-recording.webm'
link.click()

// Clean up
recorder.revokeURL(url)
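
The example above hard-codes 'video/webm;codecs=vp9', which not every browser can record. Below is a minimal sketch of choosing a supported MIME type before constructing the recorder. The helper name, the candidate list, and its ordering are assumptions for illustration and are not part of @ainative/ai-kit-video; the support check is injected so the function can be exercised outside a browser.

```typescript
// Hypothetical helper (not part of @ainative/ai-kit-video): returns the first
// candidate MIME type accepted by the supplied predicate, or undefined.
function pickSupportedMimeType(
  candidates: readonly string[],
  isSupported: (mimeType: string) => boolean
): string | undefined {
  return candidates.find((type) => isSupported(type))
}

// Preference order is an assumption; adjust it for your target browsers.
const CANDIDATE_MIME_TYPES: readonly string[] = [
  'video/webm;codecs=vp9',
  'video/webm;codecs=vp8',
  'video/webm',
  'video/mp4'
]

// In a browser, pass the real check:
//   pickSupportedMimeType(CANDIDATE_MIME_TYPES, (t) => MediaRecorder.isTypeSupported(t))
```

The result can then be passed as the mimeType option shown above, falling back to the browser default when it is undefined.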

Camera Recording

import { CameraRecorder } from '@ainative/ai-kit-video/recording'

const recorder = new CameraRecorder({
  video: {
    width: { ideal: 1920 },
    height: { ideal: 1080 },
    frameRate: { ideal: 30 }
  },
  audio: true
})

const stream = await recorder.getStream()
await recorder.start()

// Display preview
const videoElement = document.querySelector('video')
if (videoElement) {
  // querySelector returns null when the page has no <video> element
  videoElement.srcObject = stream
}

// Stop and save
await recorder.stop()
const blob = recorder.getRecordingBlob()

Audio Transcription with Whisper

import { transcribeAudio } from '@ainative/ai-kit-video/processing'

// Transcribe audio file
const result = await transcribeAudio(audioFile, {
  apiKey: process.env.OPENAI_API_KEY,
  language: 'en',
  response_format: 'verbose_json',
  timestamp_granularities: ['segment', 'word']
})

console.log(result.text)
console.log(result.segments) // Segment-level timestamps
console.log(result.words)    // Word-level timestamps

Text Formatting

import { TextFormatter } from '@ainative/ai-kit-video/processing'

const formatter = new TextFormatter({
  removeFillerWords: true,
  correctPunctuation: true,
  capitalizeFirstWord: true
})

const formatted = formatter.format(transcribedText)
console.log(formatted)

Noise Cancellation

import { NoiseProcessor } from '@ainative/ai-kit-video/recording'

const processor = new NoiseProcessor({
  noiseReduction: 0.8,
  autoGain: true
})

// Process audio stream
const cleanStream = await processor.process(audioStream)

Instrumented Recording with Observability

import { InstrumentedScreenRecorder } from '@ainative/ai-kit-video/recording'

const recorder = new InstrumentedScreenRecorder({
  correlationId: 'user-session-123',
  logger: customLogger,
  enablePerformanceMetrics: true
})

// All events are logged with correlation ID
recorder.on('recording_started', (event) => {
  console.log(`Recording started: ${event.recordingId}`)
  console.log(`Correlation: ${event.correlationId}`)
})

recorder.on('performance_metrics', (metrics) => {
  console.log(`Bitrate: ${metrics.avgBitrate}`)
  console.log(`File size: ${metrics.fileSize}`)
})

await recorder.start()
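
How the package derives metrics.avgBitrate is not documented here. As a rough illustration, an average bitrate can be computed from the recorded size and duration as in the sketch below; the function name and the choice of units (bytes in, bits per second out) are assumptions.

```typescript
// Hypothetical helper: average bitrate in bits per second, given the total
// recorded size in bytes and the recording duration in seconds.
function averageBitrate(fileSizeBytes: number, durationSeconds: number): number {
  // Guard against division by zero for empty or instantaneous recordings.
  if (durationSeconds <= 0) return 0
  return (fileSizeBytes * 8) / durationSeconds
}
```

For example, 1,000,000 bytes recorded over 8 seconds averages 1,000,000 bits per second.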

Picture-in-Picture Composite

import { PiPCompositor } from '@ainative/ai-kit-video/recording'

const compositor = new PiPCompositor({
  position: 'bottom-right',
  size: { width: 320, height: 180 },
  borderRadius: 8,
  border: '2px solid white'
})

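// Combine screen and camera
// (screenRecorder and cameraRecorder are ScreenRecorder and CameraRecorder
// instances created as in the earlier examples)
const screenStream = await screenRecorder.getStream()
const cameraStream = await cameraRecorder.getStream()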

const composite = compositor.composite(screenStream, cameraStream)
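
To make the position and size options concrete, here is a sketch of the geometry a compositor might use to place the camera overlay. It is not taken from PiPCompositor: the corner names other than 'bottom-right', the 16-pixel default margin, and the function name are all assumptions.

```typescript
type PiPCorner = 'top-left' | 'top-right' | 'bottom-left' | 'bottom-right'

interface Size {
  width: number
  height: number
}

// Hypothetical geometry helper: top-left pixel coordinates of the overlay
// inside the canvas, inset from the chosen corner by `margin` pixels.
function overlayOrigin(
  canvas: Size,
  overlay: Size,
  corner: PiPCorner,
  margin: number = 16
): { x: number; y: number } {
  const x = corner.endsWith('left') ? margin : canvas.width - overlay.width - margin
  const y = corner.startsWith('top') ? margin : canvas.height - overlay.height - margin
  return { x, y }
}
```

With a 1920x1080 screen and the 320x180 overlay from the example, 'bottom-right' places the overlay's top-left corner at (1584, 884).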

API Reference

Recording Classes

ScreenRecorder

  • getStream(): Promise<MediaStream> - Get display media stream
  • start(): Promise<void> - Start recording
  • stop(): Promise<void> - Stop recording
  • pause(): void - Pause recording
  • resume(): void - Resume recording
  • getRecordingBlob(): Blob - Get recorded video as Blob
  • getRecordingURL(): string - Get Blob URL for download
  • revokeURL(url: string): void - Revoke Blob URL to free memory

CameraRecorder

  • Same API as ScreenRecorder
  • Automatically cleans up MediaStream on page unload
  • Prevents memory leaks with proper resource disposal

AudioRecorder

  • Record audio only with customizable settings
  • Built-in noise reduction
  • Auto-gain control

InstrumentedScreenRecorder

  • Extends ScreenRecorder with observability
  • Emits performance metrics and lifecycle events
  • Supports correlation IDs for distributed tracing

Processing Functions

transcribeAudio(file, options)

Transcribe audio using OpenAI Whisper API.

Options:

  • apiKey (required) - OpenAI API key
  • language - ISO-639-1 language code (e.g., 'en', 'es')
  • prompt - Guide the model's style or terminology
  • response_format - 'json' | 'text' | 'srt' | 'verbose_json' | 'vtt'
  • temperature - Sampling temperature (0-1)
  • timestamp_granularities - ['word', 'segment'] (the Whisper API requires response_format: 'verbose_json' for this option)

Returns: TranscriptionResult

  • text - Full transcription
  • language - Detected language (verbose_json only)
  • duration - Audio duration in seconds (verbose_json only)
  • segments - Timestamped segments (verbose_json only)
  • words - Word-level timestamps (verbose_json with 'word' granularity only)

formatSegments(segments)

Format transcription segments into readable text with timestamps.
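
The exact output format of formatSegments is not specified here. The sketch below shows one plausible rendering, assuming segments carry start and end times in seconds plus a text field, as in Whisper's verbose_json output. The names SegmentSketch and formatSegmentsSketch are hypothetical and the package's own types may differ.

```typescript
// Assumed segment shape, modeled on Whisper's verbose_json output.
interface SegmentSketch {
  start: number // seconds
  end: number // seconds
  text: string
}

// Formats whole seconds as mm:ss (minutes are not capped at 59).
function toTimestamp(seconds: number): string {
  const total = Math.floor(seconds)
  const pad = (n: number): string => (n < 10 ? '0' + n : String(n))
  return pad(Math.floor(total / 60)) + ':' + pad(total % 60)
}

// One line per segment: "[start - end] text".
function formatSegmentsSketch(segments: SegmentSketch[]): string {
  return segments
    .map((s) => `[${toTimestamp(s.start)} - ${toTimestamp(s.end)}] ${s.text.trim()}`)
    .join('\n')
}
```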

extractSpeakers(text)

Extract speaker-labeled text from transcription (requires speaker hints in prompt).
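
The label format extractSpeakers expects is not documented here. As an illustration only, the sketch below assumes the prompt steered the model toward lines of the form "Name: text" and groups unlabeled lines with the previous speaker. All names are hypothetical.

```typescript
interface SpeakerTurn {
  speaker: string
  text: string
}

// Hypothetical parser for transcripts whose lines look like "Name: text".
function extractSpeakerTurns(transcript: string): SpeakerTurn[] {
  const turns: SpeakerTurn[] = []
  for (const rawLine of transcript.split('\n')) {
    const line = rawLine.trim()
    if (line === '') continue
    // A short label (up to 40 characters, no colon) followed by a colon.
    const match = /^([^:]{1,40}):\s*(.+)$/.exec(line)
    if (match) {
      turns.push({ speaker: (match[1] ?? '').trim(), text: (match[2] ?? '').trim() })
      continue
    }
    // Unlabeled lines continue the previous speaker's turn.
    const last = turns[turns.length - 1]
    if (last) last.text += ' ' + line
  }
  return turns
}
```

A heuristic like this misreads any sentence that happens to contain a colon near its start (for example a time such as 10:30), so real transcripts need a stricter label pattern.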

estimateTranscriptionCost(durationSeconds)

Calculate estimated cost for Whisper transcription ($0.006/minute).
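
The arithmetic behind that figure is simple enough to sketch. The function below is a stand-in, not the package's implementation, and it ignores any rounding the billing side may apply.

```typescript
// Rate quoted in this README, in US dollars per minute of audio.
const WHISPER_USD_PER_MINUTE = 0.006

// Hypothetical stand-in for estimateTranscriptionCost.
function estimateCostUSD(durationSeconds: number): number {
  return (durationSeconds / 60) * WHISPER_USD_PER_MINUTE
}
```

A 10-minute recording (600 seconds) comes to about $0.06, and a one-hour recording to about $0.36.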

Utility Classes

TextFormatter

Clean and format transcribed text.

Options:

  • removeFillerWords - Remove 'um', 'uh', 'like', etc.
  • correctPunctuation - Add proper punctuation
  • capitalizeFirstWord - Capitalize first word of sentences
  • removeExtraSpaces - Normalize whitespace
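
To show roughly what these options imply, here is a regex-based sketch covering filler removal, whitespace normalization, and sentence capitalization. It is not TextFormatter's implementation; the filler list is limited to the three words named above and the function name is hypothetical.

```typescript
// Assumed filler list: only the words this README names explicitly.
const FILLER_WORDS = ['um', 'uh', 'like']

// Hypothetical cleanup pass: strips fillers, normalizes spacing, and
// capitalizes the first letter of each sentence.
function cleanTranscript(text: string): string {
  // Match each filler as a whole word, plus a trailing comma if present.
  const fillers = new RegExp(`\\b(?:${FILLER_WORDS.join('|')})\\b,?`, 'gi')
  const normalized = text
    .replace(fillers, '')
    .replace(/\s+/g, ' ') // collapse runs of whitespace
    .replace(/\s+([.,!?])/g, '$1') // no space before punctuation
    .trim()
  return normalized.replace(
    /(^|[.!?]\s+)([a-z])/g,
    (_match, lead: string, ch: string) => lead + ch.toUpperCase()
  )
}
```

Blindly removing 'like' also deletes legitimate uses ("I like it" becomes "I it"), which is why production formatters usually need context-aware rules rather than a word list.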

NoiseProcessor

Process audio streams to reduce background noise.

Options:

  • noiseReduction - Noise reduction strength (0-1)
  • autoGain - Enable automatic gain control
  • echoCancellation - Enable echo cancellation

TypeScript Support

This package is written in TypeScript and includes complete type definitions.

import type {
  RecordingConfig,
  TranscriptionOptions,
  TranscriptionResult,
  InstrumentationConfig
} from '@ainative/ai-kit-video'

Browser Compatibility

  • Chrome/Edge 87+
  • Firefox 94+
  • Safari 15.4+
  • Requires HTTPS for MediaStream APIs
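
Version numbers are a coarse guide, so it can help to check for the underlying browser features at runtime. The sketch below is an assumption-laden illustration rather than something the package exports: it takes an object shaped like the browser's window so it can be tested outside a browser, and reports which capabilities are present.

```typescript
// Minimal structural view of the browser globals this check reads.
interface RecordingEnv {
  isSecureContext?: boolean
  navigator?: {
    mediaDevices?: { getUserMedia?: unknown; getDisplayMedia?: unknown }
  }
  MediaRecorder?: unknown
}

interface RecordingSupport {
  secureContext: boolean
  camera: boolean
  screen: boolean
  recorder: boolean
}

// Hypothetical feature check; in a browser, pass `window` as the argument.
function detectRecordingSupport(env: RecordingEnv): RecordingSupport {
  const devices = env.navigator?.mediaDevices
  return {
    secureContext: env.isSecureContext === true,
    camera: typeof devices?.getUserMedia === 'function',
    screen: typeof devices?.getDisplayMedia === 'function',
    recorder: typeof env.MediaRecorder === 'function'
  }
}
```

Browsers treat http://localhost as a secure context, so local development normally works without HTTPS.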

Memory Management

Recorders release their MediaStreams automatically, but Blob URLs must be revoked explicitly:

  • Blob URLs stay allocated until you pass them to revokeURL()
  • MediaStreams are stopped on page unload
  • Releasing both avoids memory leaks from unreleased resources

// Manual cleanup
await recorder.stop()
const url = recorder.getRecordingURL()
// ... use the URL
recorder.revokeURL(url) // Free memory
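
If the code that uses the URL can throw, the revoke call in the manual cleanup above may never run. One way to guard against that is a small try/finally wrapper, sketched below. The helper and the URLSource interface are hypothetical; the interface only mirrors the getRecordingURL() and revokeURL() methods documented in the API reference.

```typescript
// Structural subset of the recorder API documented in this README.
interface URLSource {
  getRecordingURL(): string
  revokeURL(url: string): void
}

// Hypothetical helper: runs `use` with the Blob URL and always revokes the
// URL afterwards, even when `use` throws or rejects.
async function withRecordingURL<T>(
  source: URLSource,
  use: (url: string) => T | Promise<T>
): Promise<T> {
  const url = source.getRecordingURL()
  try {
    return await use(url)
  } finally {
    source.revokeURL(url)
  }
}
```

Usage would look like `await withRecordingURL(recorder, (url) => { link.href = url; link.click() })`, assuming recorder is any of the recorder instances shown earlier.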

Performance

  • Screen Recording: 2.5 Mbps video, 128 kbps audio (default)
  • Camera Recording: 1920x1080 @ 30fps (configurable)
  • Whisper Transcription: ~$0.006 per minute of audio
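
These defaults translate directly into an approximate file size. The sketch below works through the arithmetic; the function name is hypothetical, and because MediaRecorder treats the configured bitrates as targets, real recordings can come out larger or smaller.

```typescript
// Hypothetical estimate of recording size in bytes from the configured
// bitrates (defaults match the figures listed above).
function estimateRecordingBytes(
  durationSeconds: number,
  videoBitsPerSecond: number = 2500000,
  audioBitsPerSecond: number = 128000
): number {
  // Total bits per second, times seconds, divided by 8 bits per byte.
  return ((videoBitsPerSecond + audioBitsPerSecond) * durationSeconds) / 8
}
```

At the defaults, one minute of screen recording is roughly 19.7 MB (19,710,000 bytes), so a 30-minute session would be close to 600 MB.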

Examples

See the examples directory for complete working examples:

  • Screen recording with download
  • Camera recording with preview
  • Audio transcription with Whisper
  • PiP composite recording
  • Instrumented recording with metrics

Contributing

Contributions are welcome! Please see CONTRIBUTING.md for guidelines.

License

MIT - See LICENSE for details.

Built by AINative Studio