deepgram-media-transcriber

v1.0.8

A package to transcribe media files using Deepgram with speaker-labeled subtitles (SRT/VTT).

Deepgram Media Transcriber

A robust TypeScript package for transcribing audio/video files using Deepgram's API, with speaker diarization and subtitle generation (SRT/VTT) capabilities.

Features

  • 🎙️ Audio/video transcription with speaker identification
  • 📄 Multiple output formats: SRT, VTT, Plain Text
  • ⏱️ Automatic splitting of long utterances (>15s)
  • 🔊 Supports various audio formats (automatic conversion to MP3)
  • 🎯 Accurate word-level timing information
  • 🗣️ Speaker confidence scores (when available)
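The splitting logic itself is not shown in this README. As a rough sketch of how an utterance longer than 15 seconds could be split using word-level timings, the following cuts only between words so each chunk keeps accurate start and end times. The `Word` and `Utterance` shapes and the `splitLongUtterance` name are hypothetical and are not part of the package's documented API.

```typescript
// Hypothetical shapes: the package's real internal types are not documented here.
interface Word {
  text: string;
  start: number; // seconds
  end: number; // seconds
}

interface Utterance {
  speaker: number;
  start: number;
  end: number;
  words: Word[];
}

// Split an utterance into chunks no longer than maxSeconds, cutting only
// between words. A single word longer than maxSeconds stays in its own chunk.
function splitLongUtterance(utterance: Utterance, maxSeconds = 15): Utterance[] {
  const chunks: Utterance[] = [];
  let current: Word[] = [];

  const flush = (): void => {
    if (current.length === 0) return;
    chunks.push({
      speaker: utterance.speaker,
      start: current[0].start,
      end: current[current.length - 1].end,
      words: current,
    });
    current = [];
  };

  for (const word of utterance.words) {
    // Start a new chunk if adding this word would exceed the limit.
    if (current.length > 0 && word.end - current[0].start > maxSeconds) {
      flush();
    }
    current.push(word);
  }
  flush();
  return chunks;
}
```

Each returned chunk can then be rendered as its own subtitle cue.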

Installation

npm install deepgram-media-transcriber

Quick Start

```typescript
import { transcribeMedia } from 'deepgram-media-transcriber';

async function main() {
  const results = await transcribeMedia(
    '/path/to/your/media/file.mp4',
    'your-deepgram-api-key'
  );

  console.log('Formatted Text:', results.formattedText);
  console.log('SRT Subtitles:', results.srt);
  console.log('VTT Subtitles:', results.vtt);
}

main().catch(console.error);
```

API Documentation

transcribeMedia(filePath: string, deepgramApiKey: string, keepAudioFile?: boolean)

Parameters

  • filePath : Path to media file (supports MP3, WAV, MP4, MOV, etc.)
  • deepgramApiKey : Your Deepgram API key
  • keepAudioFile : Keep converted audio file (default: false)

Returns

{
  transcript: any;         // Raw Deepgram response
  formattedText: string;   // Speaker-formatted plain text
  srt: string;             // SRT formatted subtitles
  vtt: string;             // VTT formatted subtitles
  audioFilePath?: string   // Path to converted audio (if kept)
}
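
The return shape above can be mirrored in a local type when you want stricter typing at the call site. This is only a sketch: `TranscriptionResult`, `OutputFormat` and `selectOutput` are hypothetical names, and the package may or may not export an equivalent type. `transcript` is typed as `unknown` here rather than `any` so that callers have to narrow the raw Deepgram response before using it.

```typescript
// Local mirror of the documented return value (hypothetical name).
interface TranscriptionResult {
  transcript: unknown; // raw Deepgram response
  formattedText: string; // speaker-formatted plain text
  srt: string; // SRT subtitles
  vtt: string; // VTT subtitles
  audioFilePath?: string; // only present when the converted audio is kept
}

type OutputFormat = 'txt' | 'srt' | 'vtt';

// Pick one of the three rendered outputs by format name.
function selectOutput(result: TranscriptionResult, format: OutputFormat): string {
  switch (format) {
    case 'srt':
      return result.srt;
    case 'vtt':
      return result.vtt;
    case 'txt':
      return result.formattedText;
  }
}
```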

Output Formats

SRT Format Example

1
00:00:00,000 --> 00:00:04,120
Speaker 0: Let's start with the main agenda items...

2
00:00:04,240 --> 00:00:07,800
Speaker 1: I agree, we should prioritize...

VTT Format Example

WEBVTT

1
00:00:00.000 --> 00:00:04.120
Speaker 0: Let's start with the main agenda items...

2
00:00:04.240 --> 00:00:07.800
Speaker 1: I agree, we should prioritize...
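
The two subtitle examples differ mainly in the millisecond separator (a comma in SRT, a period in VTT) and in the `WEBVTT` header line. A minimal sketch of producing both from the same cue list is shown here. The `Cue` shape and the function names are hypothetical, and this is not necessarily how the package renders its output.

```typescript
// Hypothetical cue shape used only for this illustration.
interface Cue {
  speaker: number;
  start: number; // seconds
  end: number; // seconds
  text: string;
}

// Format seconds as HH:MM:SS<sep>mmm. SRT uses ',' and VTT uses '.'.
function formatTimestamp(seconds: number, msSeparator: ',' | '.'): string {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const totalSeconds = Math.floor(totalMs / 1000);
  const s = totalSeconds % 60;
  const m = Math.floor(totalSeconds / 60) % 60;
  const h = Math.floor(totalSeconds / 3600);
  const pad = (n: number, width: number): string => String(n).padStart(width, '0');
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)}${msSeparator}${pad(ms, 3)}`;
}

// Render numbered cues separated by blank lines.
function renderCues(cues: Cue[], msSeparator: ',' | '.'): string {
  return cues
    .map((cue, index) => {
      const range = `${formatTimestamp(cue.start, msSeparator)} --> ${formatTimestamp(cue.end, msSeparator)}`;
      return `${index + 1}\n${range}\nSpeaker ${cue.speaker}: ${cue.text}`;
    })
    .join('\n\n');
}

function toSrt(cues: Cue[]): string {
  return renderCues(cues, ',') + '\n';
}

function toVtt(cues: Cue[]): string {
  return 'WEBVTT\n\n' + renderCues(cues, '.') + '\n';
}
```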

Text Format Example

Speaker 0: Let's start with the main agenda items...

Speaker 1: I agree, we should prioritize...
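
For the plain-text format, one plausible approach is to merge consecutive segments from the same speaker into a single paragraph and separate speaker turns with a blank line, as in the example above. Whether the package merges turns this way is an assumption, and `Segment` and `toPlainText` are hypothetical names.

```typescript
// Hypothetical segment shape used only for this illustration.
interface Segment {
  speaker: number;
  text: string;
}

// Merge back-to-back segments from the same speaker, then print one
// "Speaker N: ..." paragraph per turn, separated by blank lines.
function toPlainText(segments: Segment[]): string {
  const turns: Segment[] = [];
  for (const segment of segments) {
    const last = turns[turns.length - 1];
    if (last !== undefined && last.speaker === segment.speaker) {
      last.text += ' ' + segment.text;
    } else {
      turns.push({ ...segment }); // copy so the input is not mutated
    }
  }
  return turns.map((turn) => `Speaker ${turn.speaker}: ${turn.text}`).join('\n\n');
}
```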

Configuration

Deepgram API Setup

  1. Get an API key from the Deepgram Console
  2. Enable the following features in your Deepgram project:
    • Speaker Diarization
    • Punctuation
    • Utterance Detection
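
In Deepgram's pre-recorded REST API, the three features above correspond to the `diarize`, `punctuate` and `utterances` query parameters on the `/v1/listen` endpoint. This README does not say how the package sets them, so the following is only a sketch of what such a request might look like when made directly. `ListenOptions`, `buildListenUrl` and `requestTranscript` are hypothetical names, not part of this package.

```typescript
// Feature flags named in the setup list above, expressed as request options.
interface ListenOptions {
  diarize: boolean; // speaker diarization
  punctuate: boolean; // punctuation
  utterances: boolean; // utterance detection
}

// Build the request URL for Deepgram's pre-recorded transcription endpoint.
function buildListenUrl(options: ListenOptions): string {
  const url = new URL('https://api.deepgram.com/v1/listen');
  url.searchParams.set('diarize', String(options.diarize));
  url.searchParams.set('punctuate', String(options.punctuate));
  url.searchParams.set('utterances', String(options.utterances));
  return url.toString();
}

// Hypothetical helper: send MP3 audio and return the parsed JSON response.
async function requestTranscript(audio: Blob, apiKey: string): Promise<unknown> {
  const response = await fetch(
    buildListenUrl({ diarize: true, punctuate: true, utterances: true }),
    {
      method: 'POST',
      headers: {
        Authorization: `Token ${apiKey}`,
        'Content-Type': 'audio/mpeg',
      },
      body: audio,
    }
  );
  if (!response.ok) {
    throw new Error(`Deepgram request failed with status ${response.status}`);
  }
  return response.json();
}
```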

Browser Usage

This package also supports browser environments using ffmpeg.wasm:

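import { transcribeMediaBrowser } from 'deepgram-media-transcriber/browser';

async function processMedia() {
  // Assumes the page contains <input type="file" id="fileInput">
  const fileInput = document.getElementById('fileInput') as HTMLInputElement | null;
  const file = fileInput?.files?.[0];
  if (!file) {
    console.error('No file selected');
    return;
  }

  try {
    const { formattedText, srt, vtt } = await transcribeMediaBrowser(
      file,
      'your-deepgram-api-key'
    );

    console.log('Formatted Text:', formattedText);
    console.log('SRT Subtitles:', srt);
    console.log('VTT Subtitles:', vtt);
  } catch (error) {
    // Caught values are not guaranteed to be Error instances
    const message = error instanceof Error ? error.message : String(error);
    console.error('Processing failed:', message);
  }
}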

Browser Considerations

  • The browser version uses ffmpeg.wasm which requires loading WebAssembly modules
  • Cross-Origin Resource Sharing (CORS) must be properly configured when loading the WebAssembly modules
  • The browser version doesn't support the keepAudioFile option as files are processed in memory

Audio Conversion

The package automatically:

  • Converts non-MP3 files to high-quality MP3
  • Keeps a 44.1 kHz sample rate to preserve audio quality
  • Handles stereo-to-mono conversion when needed
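
The exact FFmpeg command the package runs is not documented here. As an illustration of the conversion described above, this sketch builds an argument list from standard FFmpeg options: `-vn` drops any video stream, `-codec:a libmp3lame` selects the MP3 encoder, `-q:a 2` requests high-quality VBR, `-ar` sets the sample rate and `-ac 1` downmixes to mono. `Mp3Options` and `buildFfmpegArgs` are hypothetical names.

```typescript
// Hypothetical options for this illustration only.
interface Mp3Options {
  sampleRate?: number; // Hz, defaults to 44100
  mono?: boolean; // downmix to a single channel when true
}

// Build FFmpeg arguments that convert any input media file to MP3.
function buildFfmpegArgs(
  inputPath: string,
  outputPath: string,
  options: Mp3Options = {}
): string[] {
  const { sampleRate = 44100, mono = false } = options;
  const args = [
    '-y', // overwrite the output file if it exists
    '-i', inputPath,
    '-vn', // ignore video streams
    '-codec:a', 'libmp3lame',
    '-q:a', '2', // VBR quality level (lower is better)
    '-ar', String(sampleRate),
  ];
  if (mono) {
    args.push('-ac', '1');
  }
  args.push(outputPath);
  return args;
}
```

In Node.js, an array like this could be passed to `spawn` from `child_process`, with the binary path taken from the default export of `ffmpeg-static`.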

Error Handling

The package throws specific errors for:

  • Invalid file paths
  • Deepgram API errors
  • FFmpeg conversion failures
  • Invalid audio formats

Example Usage

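import { writeFileSync } from 'fs';
import { transcribeMedia } from 'deepgram-media-transcriber';

async function processMedia() {
  try {
    // Fail fast if the API key is missing from the environment
    const apiKey = process.env.DEEPGRAM_KEY;
    if (!apiKey) {
      throw new Error('DEEPGRAM_KEY environment variable is not set');
    }

    const { formattedText, srt, vtt } = await transcribeMedia(
      'interview.mp4',
      apiKey,
      true // keepAudioFile: keep the converted audio on disk
    );

    writeFileSync('interview.txt', formattedText);
    writeFileSync('interview.srt', srt);
    writeFileSync('interview.vtt', vtt);

    console.log('Processing complete!');
  } catch (error) {
    // Caught values are not guaranteed to be Error instances
    const message = error instanceof Error ? error.message : String(error);
    console.error('Processing failed:', message);
  }
}

processMedia();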

Development

Build

npm run build

Contribution

  1. Clone repository
  2. Install dependencies: npm install
  3. Implement features/fixes
  4. Write tests (coming soon)
  5. Submit PR

License

MIT © Ernesto Voltaggio

Note: This package requires FFmpeg for audio conversion. The ffmpeg-static dependency is included automatically.