@reaatech/media-pipeline-mcp-deepgram

v0.4.0

Deepgram provider — Nova-2 speech-to-text transcription and speaker diarization

Downloads

787

Status: Pre-1.0 — APIs may change in minor versions. Pin to a specific version in production.

Deepgram provider for the media pipeline framework. Provides speech-to-text transcription with smart formatting and speaker diarization using the Nova-2 model. Supports native streaming via WebSocket frames and HMAC-signed webhook callbacks for async batch operations.

Installation

npm install @reaatech/media-pipeline-mcp-deepgram
# or
pnpm add @reaatech/media-pipeline-mcp-deepgram

Feature Overview

  • Speech-to-text transcription with Nova-2 (word-level timestamps, confidence scores)
  • Speaker diarization with labeled utterances and segment metadata
  • Smart formatting: auto-capitalization, punctuation, number/date normalization
  • Language detection and multi-language support
  • Streaming support for both operations (supportsStreaming)
  • Webhook support for async callbacks (supportsWebhooks)
  • SHA-256 hashing of raw audio in cache keys to avoid storing multi-megabyte buffers
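The README mentions HMAC-signed webhook callbacks but does not document the signing scheme. The sketch below shows how a receiver might verify such a signature, assuming HMAC-SHA256 over the raw request body with a hex-encoded signature. The function name, the algorithm, and the encoding are all assumptions for illustration, not the package's confirmed behavior.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical verifier. Assumptions: HMAC-SHA256 over the raw body,
// signature delivered as a hex string, shared secret known to both sides.
function verifyWebhookSignature(
  rawBody: Buffer,
  signatureHex: string,
  secret: string,
): boolean {
  const expected = createHmac("sha256", secret).update(rawBody).digest();
  const received = Buffer.from(signatureHex, "hex");
  // timingSafeEqual throws when lengths differ, so compare lengths first.
  return received.length === expected.length && timingSafeEqual(received, expected);
}

// Example: sign a payload locally, then verify it.
const secret = "example-secret";
const body = Buffer.from(JSON.stringify({ status: "completed" }));
const signature = createHmac("sha256", secret).update(body).digest("hex");
console.log(verifyWebhookSignature(body, signature, secret)); // prints true
```

Comparing with `timingSafeEqual` rather than `===` avoids leaking how many leading bytes of the signature matched.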

Quick Start

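import { readFile } from "node:fs/promises";
import { DeepgramProvider } from "@reaatech/media-pipeline-mcp-deepgram";

const provider = new DeepgramProvider({ apiKey: process.env.DEEPGRAM_API_KEY! });

// Load the audio to process (placeholder paths; substitute your own files)
const audioBuffer = await readFile("./speech.wav");
const meetingAudioBuffer = await readFile("./meeting.wav");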

// Transcribe audio to text
const result = await provider.execute({
  operation: "audio.stt",
  params: { audio_data: audioBuffer, language: "en", diarize: true },
  config: {},
});
console.log(JSON.parse(result.data.toString()).transcript);

// Diarize speakers in an audio recording
const speakers = await provider.execute({
  operation: "audio.diarize",
  params: { audio_data: meetingAudioBuffer, language: "en" },
  config: {},
});
const output = JSON.parse(speakers.data.toString());
console.log(`Found ${output.speakers} speakers across ${output.segments.length} segments`);

Supported Operations

| Operation | Default Model | Description | Output Format |
|-----------|---------------|-------------|---------------|
| audio.stt | nova-2 | Speech-to-text with smart formatting, timestamps, and optional diarization | JSON with transcript, confidence, segments |
| audio.diarize | nova-2 | Speaker identification with labeled utterances, start/end times, and confidence | JSON with speakers count and per-speaker segments |

Configuration Parameters

audio.stt

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| audio_data | Buffer | required | Raw audio data buffer |
| language | string | "en" | BCP-47 language code |
| model | string | "nova-2" | Model ID (nova-2, whisper) |
| diarize | boolean | false | Enable speaker diarization in STT output |

audio.diarize

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| audio_data | Buffer | required | Raw audio data buffer |
| language | string | "en" | BCP-47 language code |
| model | string | "nova-2" | Model ID |
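To make the defaults in the parameter tables concrete, here is a minimal sketch of how a caller might fill them in for audio.stt before invoking the provider. The `SttParams` interface and `withSttDefaults` helper are hypothetical names introduced for illustration. The check that rejects an empty buffer is also an assumption, since the README only says `audio_data` is required.

```typescript
// Hypothetical type mirroring the audio.stt parameter table.
interface SttParams {
  audio_data: Buffer;
  language?: string;
  model?: string;
  diarize?: boolean;
}

// Hypothetical helper: apply the documented defaults for audio.stt.
function withSttDefaults(params: SttParams): Required<SttParams> {
  // Assumption: "required" is treated here as "a non-empty Buffer".
  if (!Buffer.isBuffer(params.audio_data) || params.audio_data.length === 0) {
    throw new TypeError("audio_data must be a non-empty Buffer");
  }
  return {
    audio_data: params.audio_data,
    language: params.language ?? "en",
    model: params.model ?? "nova-2",
    diarize: params.diarize ?? false,
  };
}

const resolved = withSttDefaults({ audio_data: Buffer.from([1, 2, 3]), model: "whisper" });
console.log(resolved.language, resolved.model, resolved.diarize); // prints: en whisper false
```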

API Reference

DeepgramProvider

class DeepgramProvider extends MediaProvider {
  constructor(config: DeepgramProviderConfig)

  healthCheck(): Promise<ProviderHealth>
  estimateCost(input: ProviderInput): Promise<CostEstimate>
  execute(input: ProviderInput): Promise<ProviderOutput>
}

DeepgramProviderConfig

interface DeepgramProviderConfig {
  apiKey: string;
  models?: {
    stt?: string;      // Default: "nova-2"
    diarize?: string;  // Default: "nova-2"
  };
  timeout?: number;    // Request timeout in ms
}

Factory Function

import { defineDeepgramProvider } from "@reaatech/media-pipeline-mcp-deepgram";

const provider = defineDeepgramProvider({ apiKey: process.env.DEEPGRAM_API_KEY! });

Key Methods

| Method | Returns | Description |
|--------|---------|-------------|
| healthCheck() | Promise<ProviderHealth> | Validates the API key by fetching project info from the Deepgram API |
| estimateCost(input) | Promise<CostEstimate> | Estimates cost from audio size (bytes / 960 KB per minute) and the model's per-minute rate |
| execute(input) | Promise<ProviderOutput> | Runs STT or diarization and returns JSON output with transcript and segment metadata |

Non-Retryable Errors

The provider classifies these errors as non-retryable: authentication failed, invalid API key, permission denied, insufficient credits, unsupported model, invalid audio format.
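The README does not say how the provider detects these errors. The sketch below assumes a case-insensitive substring match on the error message and shows how a retry loop could use it to stop early. `isRetryable` and `withRetry` are hypothetical helpers, not part of the package's API.

```typescript
// Phrases taken from the list of non-retryable errors above.
const NON_RETRYABLE_PHRASES = [
  "authentication failed",
  "invalid api key",
  "permission denied",
  "insufficient credits",
  "unsupported model",
  "invalid audio format",
];

// Assumption: classification is a case-insensitive substring match on the message.
function isRetryable(error: unknown): boolean {
  const message = (error instanceof Error ? error.message : String(error)).toLowerCase();
  return !NON_RETRYABLE_PHRASES.some((phrase) => message.includes(phrase));
}

// Hypothetical retry wrapper: retries up to maxAttempts times,
// but gives up immediately on a non-retryable error.
async function withRetry<T>(fn: () => Promise<T>, maxAttempts = 3): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (!isRetryable(err)) break;
    }
  }
  throw lastError;
}
```

Wrapping `provider.execute(...)` in such a loop avoids spending retries on requests that can never succeed, such as calls with a bad API key.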

Cost Estimation

Per-Minute Pricing

| Model | Operation | Cost / Minute |
|-------|-----------|---------------|
| nova-2 | audio.stt | $0.0059 |
| nova-2 | audio.diarize | $0.0079 |
| whisper | audio.stt | $0.0040 |

Cost is estimated by converting the audio buffer size to minutes (using 960KB/min as an approximation), then multiplying by the per-minute rate.
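The arithmetic can be sketched as follows. The rates come from the pricing table above. The helper name is hypothetical, and the sketch assumes 1 KB = 1024 bytes, since the README does not say whether "960KB" means 960,000 or 983,040 bytes.

```typescript
// Per-minute rates in USD, copied from the pricing table.
const RATES_PER_MINUTE: Record<string, Record<string, number>> = {
  "nova-2": { "audio.stt": 0.0059, "audio.diarize": 0.0079 },
  whisper: { "audio.stt": 0.004 },
};

// Assumption: 1 KB = 1024 bytes.
const BYTES_PER_MINUTE = 960 * 1024;

// Hypothetical helper: estimated cost in USD for a buffer of the given size.
function estimateCostUsd(byteLength: number, model: string, operation: string): number {
  const rate = RATES_PER_MINUTE[model]?.[operation];
  if (rate === undefined) {
    throw new Error(`No rate for ${model} / ${operation}`);
  }
  const minutes = byteLength / BYTES_PER_MINUTE;
  return minutes * rate;
}

// A buffer ten times the per-minute size counts as ten minutes of audio.
const tenMinutes = 10 * BYTES_PER_MINUTE;
console.log(estimateCostUsd(tenMinutes, "nova-2", "audio.stt").toFixed(4)); // prints "0.0590"
```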

Cache Configuration

The provider exposes static cacheConfig with deterministic and non-deterministic parameters.

Deterministic parameters: audio_data (SHA-256 hashed), audio_url, model, language, diarize, punctuate, smart_format, utterances, detect_topics, detect_entities, redact

Non-deterministic parameters: request_id

Raw audio bytes are hashed with SHA-256 during normalization so cache keys remain compact. All boolean-style feature flags are coerced to booleans for consistent matching.
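A minimal sketch of this normalization, assuming the non-deterministic `request_id` is left out of the key and that the final key is itself a SHA-256 digest of the normalized parameters. The helper names and the set of flags treated as boolean are assumptions for illustration. The package's actual `cacheConfig` handling may differ.

```typescript
import { createHash } from "node:crypto";

// Assumption: these are the "boolean-style" flags among the deterministic parameters.
const BOOLEAN_FLAGS = new Set([
  "diarize", "punctuate", "smart_format", "utterances", "detect_topics", "detect_entities",
]);

// Treat true and the string "true" as true; everything else as false.
function coerceFlag(value: unknown): boolean {
  return value === true || value === "true";
}

// Hypothetical normalizer: returns a compact, deterministic cache key.
function buildCacheKey(params: Record<string, unknown>): string {
  const normalized: Record<string, unknown> = {};
  // Sort keys so that property order does not affect the key.
  for (const key of Object.keys(params).sort()) {
    const value = params[key];
    if (key === "request_id") continue; // non-deterministic, excluded from the key
    if (key === "audio_data" && Buffer.isBuffer(value)) {
      // Replace the raw bytes with their SHA-256 digest.
      normalized[key] = createHash("sha256").update(value).digest("hex");
    } else if (BOOLEAN_FLAGS.has(key)) {
      normalized[key] = coerceFlag(value);
    } else {
      normalized[key] = value;
    }
  }
  return createHash("sha256").update(JSON.stringify(normalized)).digest("hex");
}
```

With this approach a multi-megabyte recording contributes only a 64-character digest to the key, and two requests that differ only in `request_id` map to the same cache entry.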

Health Check

The health check sends a GET request to https://api.deepgram.com/v1/projects using the configured API key. It returns { healthy: true, latency: <ms> } if the API responds with a 2xx status, or { healthy: false, error: "<message>" } on failure.
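The behavior described above can be sketched as a standalone function. This is an illustration under stated assumptions, not the package's implementation: the function name, the injected `fetchFn` parameter, and the "HTTP <status>" error text are hypothetical. The `Authorization: Token <key>` header follows Deepgram's documented API key scheme.

```typescript
// Minimal shape of the fetch function this sketch needs, so a stub can be injected.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string> },
) => Promise<{ ok: boolean; status: number }>;

type HealthResult =
  | { healthy: true; latency: number }
  | { healthy: false; error: string };

// Hypothetical health check: GET /v1/projects and report latency or the failure reason.
async function checkDeepgramHealth(apiKey: string, fetchFn: FetchLike): Promise<HealthResult> {
  const started = Date.now();
  try {
    const res = await fetchFn("https://api.deepgram.com/v1/projects", {
      method: "GET",
      headers: { Authorization: `Token ${apiKey}` },
    });
    if (!res.ok) {
      // Assumption: a non-2xx status is reported as "HTTP <status>".
      return { healthy: false, error: `HTTP ${res.status}` };
    }
    return { healthy: true, latency: Date.now() - started };
  } catch (err) {
    // Network failures and similar errors surface as the error message.
    return { healthy: false, error: err instanceof Error ? err.message : String(err) };
  }
}
```

On Node 18 or later the global `fetch` can be passed as `fetchFn`. Injecting it keeps the function easy to exercise with a stub.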

License

MIT