
voice-router-dev

v0.8.0

Universal speech-to-text router for Gladia, AssemblyAI, Deepgram, Azure, OpenAI Whisper, Speechmatics, and Soniox

Voice Router SDK

Universal speech-to-text router for seven transcription providers with a single, unified API.


Why Voice Router?

Switch between speech-to-text providers without changing your code. One API for Gladia, AssemblyAI, Deepgram, Azure, OpenAI Whisper, Speechmatics, and Soniox.

import { VoiceRouter } from 'voice-router-dev';

const router = new VoiceRouter({
  providers: {
    gladia: { apiKey: process.env.GLADIA_KEY },
    deepgram: { apiKey: process.env.DEEPGRAM_KEY }
  }
});

// Same code works with ANY provider
const result = await router.transcribe(audio, {
  provider: 'gladia'  // Switch to 'deepgram' anytime
});

Features

  • Provider-Agnostic - Switch providers with one line
  • Unified API - Same interface for all providers
  • Webhook Normalization - Auto-detect and parse webhooks
  • Real-time Streaming - WebSocket support (Gladia, AssemblyAI, Deepgram, Soniox, OpenAI Realtime)
  • Advanced Features - Diarization, sentiment, summarization, chapters, entities
  • Type-Safe - Full TypeScript support with OpenAPI-generated types
  • Typed Extended Data - Access provider-specific features with full autocomplete
  • Provider Fallback - Automatic failover strategies
  • Zero Config - Works out of the box

Supported Providers

| Provider | Batch | Streaming | Webhooks | Special Features |
|----------|-------|-----------|----------|------------------|
| Gladia | Yes | WebSocket | Yes | Multi-language, code-switching, translation |
| AssemblyAI | Yes | Real-time | HMAC | Chapters, entities, content moderation |
| Deepgram | Sync | WebSocket | Yes | PII redaction, keyword boosting |
| Azure STT | Async | No | HMAC | Custom models, language ID |
| OpenAI | Sync | Realtime | No | gpt-4o, diarization, Realtime API |
| Speechmatics | Async | No | Query params | High accuracy, summarization |
| Soniox | Yes | WebSocket | No | 60+ languages, translation, regions |

Installation

npm install voice-router-dev
# or
pnpm add voice-router-dev
# or
yarn add voice-router-dev

Quick Start

Basic Transcription

import { VoiceRouter, GladiaAdapter } from 'voice-router-dev';

// Initialize router
const router = new VoiceRouter({
  providers: {
    gladia: { apiKey: 'YOUR_GLADIA_KEY' }
  },
  defaultProvider: 'gladia'
});

// Register adapter
router.registerAdapter(new GladiaAdapter());

// Transcribe from URL
const result = await router.transcribe({
  type: 'url',
  url: 'https://example.com/audio.mp3'
}, {
  language: 'en',
  diarization: true
});

if (result.success) {
  console.log('Transcript:', result.data.text);
  console.log('Speakers:', result.data.speakers);
}

Multi-Provider with Fallback

import {
  VoiceRouter,
  GladiaAdapter,
  AssemblyAIAdapter,
  DeepgramAdapter
} from 'voice-router-dev';

const router = new VoiceRouter({
  providers: {
    gladia: { apiKey: process.env.GLADIA_KEY },
    assemblyai: { apiKey: process.env.ASSEMBLYAI_KEY },
    deepgram: { apiKey: process.env.DEEPGRAM_KEY }
  },
  selectionStrategy: 'round-robin'  // Auto load-balance
});

// Register all providers
router.registerAdapter(new GladiaAdapter());
router.registerAdapter(new AssemblyAIAdapter());
router.registerAdapter(new DeepgramAdapter());

// Automatically rotates between providers
await router.transcribe(audio1);  // Uses Gladia
await router.transcribe(audio2);  // Uses AssemblyAI
await router.transcribe(audio3);  // Uses Deepgram
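The round-robin strategy can be pictured as an index that cycles through the configured providers and wraps around after the last one. A minimal standalone sketch of the idea (this is an illustration, not the SDK's actual internals):

```typescript
// Illustrative round-robin selector; not VoiceRouter's real implementation.
function makeRoundRobin(providers: string[]): () => string {
  let index = 0;
  return () => {
    const provider = providers[index];
    index = (index + 1) % providers.length; // wrap around after the last provider
    return provider;
  };
}

const nextProvider = makeRoundRobin(['gladia', 'assemblyai', 'deepgram']);
nextProvider(); // 'gladia'
nextProvider(); // 'assemblyai'
nextProvider(); // 'deepgram'
nextProvider(); // 'gladia' again
```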

Real-time Streaming

import { VoiceRouter, DeepgramAdapter } from 'voice-router-dev';

const router = new VoiceRouter({
  providers: {
    deepgram: { apiKey: process.env.DEEPGRAM_KEY }
  }
});

router.registerAdapter(new DeepgramAdapter());

// Start streaming session
const session = await router.transcribeStream({
  provider: 'deepgram',
  encoding: 'linear16',
  sampleRate: 16000,
  language: 'en',
  interimResults: true
}, {
  onTranscript: (event) => {
    if (event.isFinal) {
      console.log('Final:', event.text);
    } else {
      console.log('Interim:', event.text);
    }
  },
  onError: (error) => console.error(error)
});

// Send audio chunks
const audioStream = getMicrophoneStream();
for await (const chunk of audioStream) {
  await session.sendAudio({ data: chunk });
}

await session.close();

Webhook Normalization

import express from 'express';
// Webhooks use node:crypto - import from separate entry point
import { WebhookRouter } from 'voice-router-dev/webhooks';

const app = express();
const webhookRouter = new WebhookRouter();

// Single endpoint handles ALL providers
app.post('/webhooks/transcription', express.json(), (req, res) => {
  // Auto-detect provider from payload
  const result = webhookRouter.route(req.body, {
    queryParams: req.query,
    userAgent: req.headers['user-agent'],
    verification: {
      signature: req.headers['x-signature'],
      secret: process.env.WEBHOOK_SECRET
    }
  });

  if (!result.success) {
    return res.status(400).json({ error: result.error });
  }

  // Unified format across all providers
  console.log('Provider:', result.provider);  // 'gladia' | 'assemblyai' | etc
  console.log('Event:', result.event?.eventType);  // 'transcription.completed'
  console.log('ID:', result.event?.data?.id);
  console.log('Text:', result.event?.data?.text);

  res.json({ received: true });
});
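Auto-detection generally works by checking the payload for fields that are distinctive to each provider. The real heuristics are internal to WebhookRouter; a simplified, self-contained sketch of the approach (the field names below are illustrative assumptions, not WebhookRouter's actual rules):

```typescript
// Sketch only: detect a provider from distinctive payload fields.
// Field names are assumptions for illustration.
type DetectedProvider = 'gladia' | 'assemblyai' | 'deepgram' | 'unknown';

function detectProviderFromPayload(payload: Record<string, unknown>): DetectedProvider {
  if (typeof payload.transcript_id === 'string' && 'status' in payload) return 'assemblyai';
  if ('event' in payload && 'session_id' in payload) return 'gladia';
  if ('request_id' in payload && 'channels' in payload) return 'deepgram';
  return 'unknown';
}
```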

Advanced Usage

Provider-Specific Features with Type Safety

Use typed provider options for full autocomplete and compile-time safety:

// Gladia - Full type-safe options
const result = await router.transcribe(audio, {
  provider: 'gladia',
  gladia: {
    translation: true,
    translation_config: { target_languages: ['fr', 'es'] },
    moderation: true,
    named_entity_recognition: true,
    sentiment_analysis: true,
    chapterization: true,
    audio_to_llm: true,
    audio_to_llm_config: [{ prompt: 'Summarize key points' }],
    custom_metadata: { session_id: 'abc123' }
  }
});

// Access typed extended data
if (result.extended) {
  const translations = result.extended.translation?.results;
  const chapters = result.extended.chapters?.results;
  const entities = result.extended.entities?.results;
  console.log('Custom metadata:', result.extended.customMetadata);
}

// AssemblyAI - Typed options with extended data
const assemblyResult = await router.transcribe(audio, {
  provider: 'assemblyai',
  assemblyai: {
    auto_chapters: true,
    entity_detection: true,
    sentiment_analysis: true,
    auto_highlights: true,
    content_safety: true,
    iab_categories: true
  }
});

if (assemblyResult.extended) {
  assemblyResult.extended.chapters?.forEach(ch => {
    console.log(`${ch.headline}: ${ch.summary}`);
  });
  assemblyResult.extended.entities?.forEach(e => {
    console.log(`${e.entity_type}: ${e.text}`);
  });
}

// Deepgram - Typed options with metadata tracking
const deepgramResult = await router.transcribe(audio, {
  provider: 'deepgram',
  deepgram: {
    model: 'nova-3',
    smart_format: true,
    paragraphs: true,
    detect_topics: true,
    tag: ['meeting', 'sales'],
    extra: { user_id: '12345' }
  }
});

if (deepgramResult.extended) {
  console.log('Request ID:', deepgramResult.extended.requestId);
  console.log('Audio SHA256:', deepgramResult.extended.sha256);
  console.log('Tags:', deepgramResult.extended.tags);
}

// OpenAI Whisper - Typed options
const whisperResult = await router.transcribe(audio, {
  provider: 'openai-whisper',
  diarization: true,
  openai: {
    temperature: 0.2,
    prompt: 'Technical discussion about APIs'
  }
});

// Speechmatics - Enhanced accuracy with summarization
const speechmaticsResult = await router.transcribe(audio, {
  provider: 'speechmatics',
  model: 'enhanced',
  summarization: true,
  diarization: true
});

// All providers include request tracking
console.log('Request ID:', result.tracking?.requestId);

Error Handling

const result = await router.transcribe(audio, {
  provider: 'gladia',
  language: 'en'
});

if (!result.success) {
  console.error('Provider:', result.provider);
  console.error('Error:', result.error);
  console.error('Details:', result.data);

  // Implement fallback strategy
  const fallbackResult = await router.transcribe(audio, {
    provider: 'assemblyai'  // Try different provider
  });
}
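For a reusable fallback, the per-provider attempts can be wrapped in a small helper that walks an ordered provider list until one succeeds. A generic sketch, independent of the SDK (`attempt` is any function returning a `{ success }`-shaped result, such as a closure over `router.transcribe`):

```typescript
// Generic fallback helper: try each provider in order until one succeeds.
// Standalone sketch; assumes nothing about the SDK's internals.
interface Result<T> { success: boolean; data?: T; error?: unknown; }

async function withFallback<T>(
  providers: string[],
  attempt: (provider: string) => Promise<Result<T>>
): Promise<Result<T> & { provider?: string }> {
  let last: Result<T> = { success: false, error: 'no providers configured' };
  for (const provider of providers) {
    const result = await attempt(provider);
    if (result.success) return { ...result, provider };
    last = result; // remember the most recent failure
  }
  return last;
}
```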

Custom Provider Selection

// Explicit provider selection
const router = new VoiceRouter({
  providers: {
    gladia: { apiKey: '...' },
    deepgram: { apiKey: '...' }
  },
  selectionStrategy: 'explicit'  // Must specify provider
});

// Round-robin load balancing
const router = new VoiceRouter({
  providers: { /* ... */ },
  selectionStrategy: 'round-robin'
});

// Default fallback
const router = new VoiceRouter({
  providers: { /* ... */ },
  defaultProvider: 'gladia',
  selectionStrategy: 'default'
});

API Reference

VoiceRouter

Main class for provider-agnostic transcription.

Constructor:

new VoiceRouter(config: VoiceRouterConfig)

Methods:

  • registerAdapter(adapter: TranscriptionAdapter) - Register a provider adapter
  • transcribe(audio: AudioInput, options?: TranscribeOptions) - Transcribe audio
  • transcribeStream(options: StreamingOptions, callbacks: StreamingCallbacks) - Stream audio
  • getTranscript(id: string, provider: string) - Get transcript by ID
  • getProviderCapabilities(provider: string) - Get provider features

WebhookRouter

Automatic webhook detection and normalization.

Methods:

  • route(payload: unknown, options?: WebhookRouterOptions) - Parse webhook
  • detectProvider(payload: unknown) - Detect provider from payload
  • validate(payload: unknown) - Validate webhook structure
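Providers that sign webhooks with HMAC (AssemblyAI and Azure in the table above) typically send a digest of the raw request body computed with a shared secret. The verification step can be sketched with node:crypto; the header name, hash algorithm, and digest encoding vary by provider, so treat those details as assumptions and check the provider's docs:

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto';

// Sketch of HMAC-SHA256 webhook verification. Algorithm and hex encoding
// are assumptions for illustration; providers differ on both.
function verifyHmacSignature(rawBody: string, signature: string, secret: string): boolean {
  const expected = createHmac('sha256', secret).update(rawBody).digest('hex');
  if (expected.length !== signature.length) return false; // timingSafeEqual throws on length mismatch
  return timingSafeEqual(Buffer.from(expected), Buffer.from(signature));
}
```

Comparing with `timingSafeEqual` rather than `===` avoids leaking information through comparison timing.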

Adapters

Provider-specific implementations:

  • GladiaAdapter - Gladia transcription
  • AssemblyAIAdapter - AssemblyAI transcription
  • DeepgramAdapter - Deepgram transcription
  • AzureSTTAdapter - Azure Speech-to-Text
  • OpenAIWhisperAdapter - OpenAI Whisper + Realtime API
  • SpeechmaticsAdapter - Speechmatics transcription
  • SonioxAdapter - Soniox transcription (batch + streaming)

TypeScript Support

Full type definitions included with provider-specific type safety:

import type {
  VoiceRouter,
  VoiceRouterConfig,
  AudioInput,
  TranscribeOptions,
  UnifiedTranscriptResponse,
  StreamingSession,
  StreamingOptions,
  UnifiedWebhookEvent,
  TranscriptionProvider,
  // Normalized data types
  TranscriptData,
  Word,
  Utterance,
  Speaker
} from 'voice-router-dev';

Normalized Result Structure

All providers return UnifiedTranscriptResponse with consistent data structure:

interface UnifiedTranscriptResponse<P extends TranscriptionProvider> {
  success: boolean;
  provider: P;

  // Normalized data - same structure for ALL providers
  data?: {
    id: string;              // Transcript ID
    text: string;            // Full transcript text
    status: TranscriptionStatus;
    confidence?: number;     // 0-1 confidence score
    duration?: number;       // Audio duration in seconds
    language?: string;       // Detected/specified language

    // Normalized arrays - consistent across providers
    words?: Word[];          // { word, start, end, confidence, speaker }
    utterances?: Utterance[]; // { text, start, end, speaker, words }
    speakers?: Speaker[];    // { id, label, confidence }
    summary?: string;
    metadata?: TranscriptMetadata;
  };

  // Provider-specific rich data (typed per provider)
  extended?: ProviderExtendedData;

  // Request tracking
  tracking?: { requestId, audioHash, processingTimeMs };

  // Error info (on failure)
  error?: { code, message, details, statusCode };

  // Raw provider response (fully typed per provider)
  raw?: ProviderRawResponse;
}

Use cases:

  • Display transcripts - Use data.text, data.words, data.utterances
  • Re-normalize stored responses - Store raw, reconstruct via adapter
  • Access provider features - Use extended for chapters, entities, etc.

See docs/NORMALIZED_RESULTS.md for detailed documentation.

Provider-Specific Type Safety

The SDK provides full type safety for provider-specific responses:

// Generic response - raw and extended fields are unknown
const result: UnifiedTranscriptResponse = await router.transcribe(audio);

// Provider-specific response - raw and extended are properly typed!
const deepgramResult: UnifiedTranscriptResponse<'deepgram'> = await router.transcribe(audio, {
  provider: 'deepgram'
});

// TypeScript knows raw is ListenV1Response
const metadata = deepgramResult.raw?.metadata;

// TypeScript knows extended is DeepgramExtendedData
const requestId = deepgramResult.extended?.requestId;
const sha256 = deepgramResult.extended?.sha256;

Provider-specific raw response types:

  • gladia - PreRecordedResponse
  • deepgram - ListenV1Response
  • openai-whisper - CreateTranscription200One
  • assemblyai - AssemblyAITranscript
  • azure-stt - AzureTranscription

Provider-specific extended data types:

  • gladia - GladiaExtendedData (translation, moderation, entities, sentiment, chapters, audioToLlm, customMetadata)
  • assemblyai - AssemblyAIExtendedData (chapters, entities, sentimentResults, highlights, contentSafety, topics)
  • deepgram - DeepgramExtendedData (metadata, requestId, sha256, modelInfo, tags)

Typed Extended Data

Access rich provider-specific data beyond basic transcription:

import type {
  GladiaExtendedData,
  AssemblyAIExtendedData,
  DeepgramExtendedData,
  // Individual types for fine-grained access
  GladiaTranslation,
  GladiaChapters,
  AssemblyAIChapter,
  AssemblyAIEntity,
  DeepgramMetadata
} from 'voice-router-dev';

// Gladia extended data
const gladiaResult = await router.transcribe(audio, { provider: 'gladia', gladia: { translation: true } });
const translation: GladiaTranslation | undefined = gladiaResult.extended?.translation;

// AssemblyAI extended data
const assemblyResult = await router.transcribe(audio, { provider: 'assemblyai', assemblyai: { auto_chapters: true } });
const chapters: AssemblyAIChapter[] | undefined = assemblyResult.extended?.chapters;

// All responses include tracking info
console.log('Request ID:', gladiaResult.tracking?.requestId);

Exported Parameter Enums

Import and use provider-specific enums for type-safe configuration:

import {
  // Deepgram enums
  ListenV1EncodingParameter,
  ListenV1ModelParameter,
  SpeakV1EncodingParameter,

  // Gladia enums
  StreamingSupportedEncodingEnum,
  StreamingSupportedSampleRateEnum,

  // OpenAI types
  AudioResponseFormat
} from 'voice-router-dev';

// Type-safe Deepgram encoding
const session = await router.transcribeStream({
  provider: 'deepgram',
  encoding: ListenV1EncodingParameter.linear16,
  model: ListenV1ModelParameter['nova-2'],
  sampleRate: 16000
});

// Type-safe Gladia encoding - use unified format
const gladiaSession = await router.transcribeStream({
  provider: 'gladia',
  encoding: 'linear16', // Unified format - mapped to Gladia's 'wav/pcm'
  sampleRate: 16000
});

Type-Safe Streaming Options

Streaming options are fully typed based on provider OpenAPI specifications:

// Deepgram streaming - all options are type-safe
const deepgramSession = await router.transcribeStream({
  provider: 'deepgram',
  encoding: 'linear16',
  model: 'nova-3',
  language: 'en-US',
  diarization: true
}, callbacks);

// Gladia streaming - with typed gladiaStreaming options
const gladiaSession = await router.transcribeStream({
  provider: 'gladia',
  encoding: 'linear16', // Unified format - mapped to Gladia's 'wav/pcm'
  sampleRate: 16000,
  gladiaStreaming: {
    realtime_processing: { words_accurate_timestamps: true },
    messages_config: { receive_partial_transcripts: true }
  }
}, callbacks);

// AssemblyAI streaming
const assemblySession = await router.transcribeStream({
  provider: 'assemblyai',
  sampleRate: 16000,
  wordTimestamps: true
}, callbacks);

Benefits:

  • Full IntelliSense - Autocomplete for all provider-specific options
  • Compile-time Safety - Invalid options caught before runtime
  • Provider Discrimination - Type system knows which provider you're using
  • OpenAPI-Generated - Types come directly from provider specifications

Typed Field Configs

Build type-safe UI field overrides with compile-time validation:

import {
  GladiaStreamingFieldName,
  GladiaStreamingConfig,
  FieldOverrides,
  GladiaStreamingSchema,
  FieldConfig
} from 'voice-router-dev/field-configs'

// Type-safe field overrides - typos caught at compile time!
const overrides: Partial<Record<GladiaStreamingFieldName, FieldConfig | null>> = {
  encoding: { name: 'encoding', type: 'select', required: false },
  language_config: null, // Hide this field
  // typo_field: null, // ✗ TypeScript error!
}

// Fully typed config values - option values validated too!
const config: Partial<GladiaStreamingConfig> = {
  encoding: 'wav/pcm', // ✓ Only valid options allowed
  sample_rate: 16000,
}

// Extract specific field's valid options
type EncodingOptions = GladiaStreamingConfig['encoding']
// = 'wav/pcm' | 'wav/alaw' | 'wav/ulaw'

Available for all 7 providers:

  • GladiaStreamingFieldName, DeepgramTranscriptionFieldName, AssemblyAIStreamingFieldName, etc.
  • GladiaStreamingConfig, DeepgramTranscriptionConfig, AzureTranscriptionConfig, etc.
  • GladiaStreamingSchema, DeepgramTranscriptionSchema, etc. (Zod schemas for advanced extraction)

Lightweight Field Metadata (Performance-Optimized)

For UI form generation without heavy Zod schema types (156KB vs 2.8MB):

// Lightweight import - 156KB types instead of 2.8MB
import {
  GLADIA_STREAMING_FIELDS,
  GladiaStreamingFieldName,
  PROVIDER_FIELDS,
  FieldMetadata
} from 'voice-router-dev/field-metadata'

// Pre-computed field metadata - no Zod at runtime
GLADIA_STREAMING_FIELDS.forEach(field => {
  if (field.type === 'select' && field.options) {
    renderDropdown(field.name, field.options)
  }
})

When to use which:

| Use Case | Import | Types Size |
|----------|--------|------------|
| UI form generation (no validation) | field-metadata | 156 KB |
| Runtime Zod validation needed | field-configs | 2.8 MB |

Requirements

  • Node.js: 20.0.0 or higher
  • TypeScript: 5.0+ (optional)
  • Package Managers: npm, pnpm, or yarn

Documentation

API Reference (Auto-Generated)

Comprehensive API documentation is auto-generated with TypeDoc from TypeScript source code:

docs/generated/ - Complete API reference

Main Documentation Sets:

  1. router/ - Core SDK API

    • voice-router.md - VoiceRouter class (main entry point)
    • types.md - Unified types (UnifiedTranscriptResponse, StreamingOptions, etc.)
    • adapters/base-adapter.md - BaseAdapter interface
  2. webhooks/ - Webhook handling

    • webhook-router.md - WebhookRouter class (auto-detect providers)
    • types.md - Webhook event types
    • {provider}-webhook.md - Provider-specific webhook handlers
  3. Provider-Specific Adapters:

Most Important Files:

  • docs/generated/router/router/voice-router.md - Main router class
  • docs/generated/router/router/types.md - Core types
  • docs/generated/webhooks/webhook-router.md - Webhook handling

Developer Documentation

Provider Setup Guides

Gladia

import { VoiceRouter, GladiaAdapter } from 'voice-router-dev';

const router = new VoiceRouter({
  providers: { gladia: { apiKey: 'YOUR_KEY' } }
});
router.registerAdapter(new GladiaAdapter());

Get your API key: https://gladia.io

AssemblyAI

import { VoiceRouter, AssemblyAIAdapter } from 'voice-router-dev';

const router = new VoiceRouter({
  providers: { assemblyai: { apiKey: 'YOUR_KEY' } }
});
router.registerAdapter(new AssemblyAIAdapter());

Get your API key: https://assemblyai.com

Deepgram

import { VoiceRouter, DeepgramAdapter } from 'voice-router-dev';

const router = new VoiceRouter({
  providers: { deepgram: { apiKey: 'YOUR_KEY' } }
});
router.registerAdapter(new DeepgramAdapter());

Get your API key: https://deepgram.com

Azure Speech-to-Text

import { VoiceRouter, AzureSTTAdapter } from 'voice-router-dev';

const router = new VoiceRouter({
  providers: {
    'azure-stt': {
      apiKey: 'YOUR_KEY',
      region: 'eastus'  // Required
    }
  }
});
router.registerAdapter(new AzureSTTAdapter());

Get your credentials: https://azure.microsoft.com/en-us/services/cognitive-services/speech-to-text/

OpenAI Whisper

import { VoiceRouter, OpenAIWhisperAdapter } from 'voice-router-dev';

const router = new VoiceRouter({
  providers: { 'openai-whisper': { apiKey: 'YOUR_KEY' } }
});
router.registerAdapter(new OpenAIWhisperAdapter());

Get your API key: https://platform.openai.com

Speechmatics

import { VoiceRouter, SpeechmaticsAdapter } from 'voice-router-dev';

const router = new VoiceRouter({
  providers: { speechmatics: { apiKey: 'YOUR_KEY' } }
});
router.registerAdapter(new SpeechmaticsAdapter());

Get your API key: https://speechmatics.com

Soniox

import { VoiceRouter, SonioxAdapter, SonioxRegion } from 'voice-router-dev';

const router = new VoiceRouter({
  providers: {
    soniox: {
      apiKey: 'YOUR_KEY',
      region: SonioxRegion.us  // or 'eu', 'jp'
    }
  }
});
router.registerAdapter(new SonioxAdapter());

Get your API key: https://soniox.com

Contributing

Contributions welcome! Please read our Contributing Guide.

License

MIT © Lazare Zemliak

Support


Note: This is a development version (voice-router-dev). The stable release will be published as voice-router.