@deepdub/node

v2.0.0

Deepdub API SDK

Deepdub Node.js SDK

Install and use the Deepdub Node.js SDK for text-to-speech generation with streaming support.

Installation

npm install --save @deepdub/node
# or
yarn add @deepdub/node

Requirements: Node.js 18+ (uses native fetch)

Initialization

const { DeepdubClient } = require('@deepdub/node');

// Option 1: Pass API key directly
const deepdub = new DeepdubClient('dd-your-api-key');

// Option 2: Use the DEEPDUB_API_KEY environment variable
// (dotenv is a separate package; it loads variables from a local .env file)
require('dotenv').config();
const deepdubFromEnv = new DeepdubClient();

// HTTP protocol supports voiceReference and sampleRate with all formats
const deepdubHttp = new DeepdubClient('dd-your-api-key', { protocol: 'http' });

Constructor parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| apiKey | string | process.env.DEEPDUB_API_KEY | Your Deepdub API key. Falls back to DEEPDUB_API_KEY if not provided. |
| options.protocol | 'websocket' \| 'http' | 'websocket' | Transport protocol: "websocket" for real-time streaming, or "http" for REST API. |
| options.baseUrl | string | US/EU REST URL | Base URL for the REST API. Falls back to DEEPDUB_BASE_URL. |
| options.baseWebsocketUrl | string | US/EU WebSocket URL | Base URL for the WebSocket API. Falls back to DEEPDUB_BASE_WEBSOCKET_URL. |
| options.baseWebsocketStreamingUrl | string | wss://wss.deepdub.ai/ws | Base URL for the WebSocket streaming API. Falls back to DEEPDUB_BASE_WEBSOCKET_STREAMING_URL. |
| options.eu | boolean | false | Use EU region endpoints. Falls back to DD_EU=1. |

Protocol comparison

| Feature | WebSocket | HTTP |
| --- | --- | --- |
| Streaming chunks (onChunk) | Yes | No |
| sampleRate option | mp3 only | All formats |
| voiceReference option | No | Yes |
| Concurrent generations | Yes | Yes |

Use WebSocket (default) for real-time streaming and low-latency playback. Use HTTP when you need voiceReference for instant voice cloning or sampleRate with non-mp3 formats.
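The rule of thumb above can be written down as a small decision function. This is only a sketch: chooseProtocol is a hypothetical helper, not part of @deepdub/node, and it treats a missing format as non-mp3 because WebSocket generation defaults to WAV.

```javascript
// Hypothetical helper (not an SDK function): picks the `protocol` option
// from the generation parameters, following the comparison table above.
function chooseProtocol(params = {}) {
  // voiceReference is only available over HTTP.
  const needsVoiceReference = params.voiceReference !== undefined;

  // Over WebSocket, sampleRate is only honoured for mp3. A missing format is
  // treated as non-mp3 here, since WebSocket generation defaults to WAV.
  const needsSampleRateOutsideMp3 =
    params.sampleRate !== undefined && params.format !== 'mp3';

  return needsVoiceReference || needsSampleRateOutsideMp3 ? 'http' : 'websocket';
}

console.log(chooseProtocol({ voicePromptId: 'your-voice-id' }));
console.log(chooseProtocol({ voiceReference: './reference_audio.mp3' }));
```

The result can be passed straight to the constructor, for example new DeepdubClient(apiKey, { protocol: chooseProtocol(params) }).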

Region endpoints

| Region | REST API | WebSocket API |
| --- | --- | --- |
| US (default) | https://restapi.deepdub.ai/api/v1 | wss://wsapi.deepdub.ai/open |
| EU | https://restapi.eu.deepdub.ai/api/v1 | wss://wsapi.eu.deepdub.ai/open |

Enable EU with { eu: true } or DD_EU=1.
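To see which endpoints a given configuration should end up with, the lookup can be sketched as below. resolveEndpoints is a hypothetical helper for illustration, and the precedence it uses (explicit option, then environment variable, then region default) is an assumption read off the constructor table, not a copy of the SDK's internals.

```javascript
// Region defaults copied from the table above.
const REGION_DEFAULTS = {
  us: {
    baseUrl: 'https://restapi.deepdub.ai/api/v1',
    baseWebsocketUrl: 'wss://wsapi.deepdub.ai/open',
  },
  eu: {
    baseUrl: 'https://restapi.eu.deepdub.ai/api/v1',
    baseWebsocketUrl: 'wss://wsapi.eu.deepdub.ai/open',
  },
};

// Hypothetical helper (not an SDK function). Assumed precedence:
// explicit option > environment variable > region default.
function resolveEndpoints(options = {}, env = process.env) {
  const eu = options.eu ?? env.DD_EU === '1';
  const defaults = eu ? REGION_DEFAULTS.eu : REGION_DEFAULTS.us;
  return {
    region: eu ? 'eu' : 'us',
    baseUrl: options.baseUrl ?? env.DEEPDUB_BASE_URL ?? defaults.baseUrl,
    baseWebsocketUrl:
      options.baseWebsocketUrl ??
      env.DEEPDUB_BASE_WEBSOCKET_URL ??
      defaults.baseWebsocketUrl,
  };
}

console.log(resolveEndpoints({ eu: true }, {}));
```

Logging the resolved values at startup is a cheap way to confirm that an EU deployment is not silently talking to the US endpoints.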

Connection

For WebSocket protocol, call connect() before using generateToBuffer(), generateToFile(), or generateTo(). For HTTP protocol and REST methods, no connection step is needed.

connect() — Open a WebSocket connection

const deepdub = new DeepdubClient('dd-your-api-key');
await deepdub.connect();

Returns: Promise<WebSocket> — the opened WebSocket connection.

Parameters

No parameters.

asyncConnect() — Alias for connect()

await deepdub.asyncConnect();

Returns: Promise<WebSocket> — the opened WebSocket connection.

Parameters

No parameters.

disconnect() — Close the WebSocket connection

await deepdub.connect();

// ...generate audio...

deepdub.disconnect();

Call disconnect() when you are done with the WebSocket. If you skip it, the open connection keeps the Node.js process alive.

Returns: void

Parameters

No parameters.
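Because a forgotten disconnect() keeps the process alive, it can help to pair connect() and disconnect() in one place. The sketch below assumes only the two methods documented above; withConnection is a hypothetical wrapper, not something the SDK exports.

```javascript
// Hypothetical wrapper (not an SDK function). `client` is any object that
// exposes connect() and disconnect() as documented above.
async function withConnection(client, task) {
  await client.connect();
  try {
    // Run the caller's work while the WebSocket is open.
    return await task(client);
  } finally {
    // Always close the socket, even if the task throws.
    client.disconnect();
  }
}

// Usage sketch with a real client (commented out so this block runs without the SDK):
// const audio = await withConnection(deepdub, (client) =>
//   client.generateToBuffer('Hello!', { locale: 'en-US', voicePromptId: 'your-voice-id' })
// );
```

The try/finally means a failed generation still releases the connection, so a script exits instead of hanging.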

Text-to-Speech

tts() — Synchronous generation

Generate speech and receive the complete audio as a Buffer.

const fs = require('fs');

const audio = await deepdub.tts('Hello, welcome to Deepdub!', {
  voicePromptId: 'your-voice-id',
  model: 'dd-etts-3.2',
  locale: 'en-US',
  format: 'mp3',
});

fs.writeFileSync('output.mp3', audio);

Returns: Promise<Buffer> — binary audio data in the specified format.

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| text | string | Required | Text to convert to speech. |
| params.voicePromptId | string | — | Voice prompt ID to use. Either this or voiceReference must be provided. |
| params.voiceReference | string \| Buffer | — | Audio reference for instant voice cloning. Accepts a file path, raw Buffer, or base64-encoded string. Either this or voicePromptId must be provided. |
| params.model | string | dd-etts-3.2 | Model ID. |
| params.locale | string | en-US | Language locale code (e.g. en-US, fr-FR). |
| params.format | string | mp3 | Audio output format. REST supports mp3, opus, and mulaw. |
| params.temperature | number | — | Generation temperature (0.0-1.0). Higher values produce more varied output. |
| params.variance | number | — | Voice variation level (0.0-1.0). |
| params.duration | number | — | Target audio duration in seconds. Mutually exclusive with tempo. |
| params.tempo | number | — | Playback speed multiplier. Mutually exclusive with duration. |
| params.seed | number | — | Random seed for deterministic generation. |
| params.promptBoost | boolean | — | Enhance voice prompt characteristics. |
| params.sampleRate | number | — | Output sample rate in Hz. Supported: 8000, 16000, 22050, 24000, 44100, 48000. |
| params.accentBaseLocale | string | — | Base accent locale. Must be provided together with accentLocale and accentRatio. |
| params.accentLocale | string | — | Target accent locale. Must be provided together with accentBaseLocale and accentRatio. |
| params.accentRatio | number | — | Accent blend ratio (0.0-1.0). Must be provided together with accentBaseLocale and accentLocale. |
| params.accentControl | object | — | Accent blending object: { accentBaseLocale, accentLocale, accentRatio }. |
| params.targetGender | string | — | Target gender for the output voice. |
| params.generationId | string | Auto-generated UUID | Optional UUID for request tracking. |
| params.superStretch | boolean | — | Enable super stretch for longer audio. |
| params.realtime | boolean | — | Enable real-time priority processing. |
| params.cleanAudio | boolean | — | Request audio cleanup when supported by the API. |
| params.autoGain | boolean | — | Request automatic gain control when supported by the API. |
| params.publish | boolean | — | Publish the generated asset when supported by the API. |
| params.performanceReferencePromptId | string | — | Voice prompt ID to use as a performance reference. |
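Several constraints in the table (mutually exclusive fields, fields that must travel together, value ranges) can be checked before a request is sent. The following is a minimal sketch of such a pre-flight check; validateTtsParams is a hypothetical helper, and the SDK or the API may enforce these rules differently.

```javascript
// Sample rates listed in the parameter table above.
const SUPPORTED_SAMPLE_RATES = [8000, 16000, 22050, 24000, 44100, 48000];

// Hypothetical pre-flight check (not an SDK function).
// Returns a list of human-readable problems; an empty list means no issue was found.
function validateTtsParams(params = {}) {
  const problems = [];

  if (params.voicePromptId === undefined && params.voiceReference === undefined) {
    problems.push('Provide either voicePromptId or voiceReference.');
  }
  if (params.duration !== undefined && params.tempo !== undefined) {
    problems.push('duration and tempo are mutually exclusive.');
  }

  // The three accent fields must be provided together.
  const accentKeys = ['accentBaseLocale', 'accentLocale', 'accentRatio'];
  const accentSet = accentKeys.filter((key) => params[key] !== undefined);
  if (accentSet.length > 0 && accentSet.length < accentKeys.length) {
    problems.push('accentBaseLocale, accentLocale, and accentRatio must be provided together.');
  }

  // Fields documented with a 0.0-1.0 range.
  for (const key of ['temperature', 'variance', 'accentRatio']) {
    const value = params[key];
    if (value !== undefined && (typeof value !== 'number' || value < 0 || value > 1)) {
      problems.push(`${key} must be a number between 0.0 and 1.0.`);
    }
  }

  if (params.sampleRate !== undefined && !SUPPORTED_SAMPLE_RATES.includes(params.sampleRate)) {
    problems.push(`sampleRate must be one of ${SUPPORTED_SAMPLE_RATES.join(', ')}.`);
  }
  return problems;
}

console.log(validateTtsParams({ voicePromptId: 'your-voice-id', duration: 3, tempo: 1.1 }));
```

Returning a list rather than throwing on the first problem lets a caller report every issue at once.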

Full example with all common TTS parameters

const audio = await deepdub.tts('This demonstrates common TTS parameters.', {
  voicePromptId: 'your-voice-id',
  model: 'dd-etts-3.2',
  locale: 'en-US',
  format: 'mp3',
  temperature: 0.7,
  variance: 0.6,
  tempo: 1.1,
  seed: 42,
  promptBoost: true,
  sampleRate: 44100,
  accentBaseLocale: 'en-US',
  accentLocale: 'fr-FR',
  accentRatio: 0.3,
});

require('fs').writeFileSync('output.mp3', audio);

Voice cloning from audio reference

const audio = await deepdub.tts('Cloning a voice from an audio sample.', {
  voiceReference: './reference_audio.mp3',
  model: 'dd-etts-3.2',
  locale: 'en-US',
});

require('fs').writeFileSync('cloned_output.mp3', audio);

ttsRetro() — Retroactive generation

Submit a TTS request and receive a URL for later retrieval.

const response = await deepdub.ttsRetro('Generate this audio for later retrieval.', {
  voicePromptId: 'your-voice-id',
  model: 'dd-etts-3.2',
  locale: 'en-US',
});

const audioUrl = response.url;
console.log(`Audio available at: ${audioUrl}`);

Fetch the audio later with the same x-api-key header.

Returns: Promise<{ url: string }> — an object with a url key pointing to the generated audio.

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| text | string | Required | Text to convert to speech. |
| params.voicePromptId | string | — | Voice prompt ID to use. Either this or voiceReference must be provided. |
| params.voiceReference | string \| Buffer | — | Audio reference for instant voice cloning. Accepts a file path, raw Buffer, or base64-encoded string. Either this or voicePromptId must be provided. |
| params.model | string | dd-etts-3.2 | Model ID. |
| params.locale | string | en-US | Language locale code. |
| params | TtsParams | — | Supports all tts() parameter fields. Retroactive generation is most commonly used with voicePromptId, model, and locale. |
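Retrieving the audio later, with the x-api-key header mentioned above, might look like the sketch below. fetchRetroAudio is a hypothetical helper, and it assumes the URL answers a plain GET request; the README only states that the same x-api-key header is required.

```javascript
// Hypothetical helper (not an SDK function): downloads audio from a ttsRetro() URL.
// Assumes a plain GET with the x-api-key header is sufficient.
// fetchImpl defaults to the global fetch available in Node.js 18+.
async function fetchRetroAudio(url, apiKey, fetchImpl = fetch) {
  const response = await fetchImpl(url, { headers: { 'x-api-key': apiKey } });
  if (!response.ok) {
    // The audio may not be ready yet, or the key may be wrong.
    throw new Error(`Audio download failed with HTTP ${response.status}`);
  }
  return Buffer.from(await response.arrayBuffer());
}

// Usage sketch (commented out so this block runs without network access):
// const audio = await fetchRetroAudio(response.url, process.env.DEEPDUB_API_KEY);
// require('fs').writeFileSync('retro_output.mp3', audio);
```

The optional fetchImpl argument exists only to make the helper easy to test with a stub.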

WebSocket TTS

generateToBuffer() — Generate to buffer

Generate audio and receive a Buffer of audio data. WebSocket generation returns WAV by default.

const deepdub = new DeepdubClient('dd-your-api-key');
await deepdub.connect();

const buffer = await deepdub.generateToBuffer('Hello, welcome to Deepdub!', {
  locale: 'en-US',
  voicePromptId: 'your-voice-id',
});

console.log(`Generated ${buffer.length} bytes of audio`);
deepdub.disconnect();

To use voiceReference, or sampleRate with a non-mp3 format, create the client with { protocol: 'http' }; generateToBuffer() then runs over HTTP instead of WebSocket.

Returns: Promise<Buffer> — generated audio data.

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| text | string | Required | Text to convert to speech. |
| params | TtsParams | { format: 'wav' } over WebSocket | Supports voicePromptId, voiceReference (HTTP only), model, locale, format, temperature, variance, duration, tempo, seed, promptBoost, sampleRate, accent options, targetGender, generationId, superStretch, realtime, cleanAudio, autoGain, publish, and performanceReferencePromptId. |
| params.onChunk | (chunk: Buffer) => void | — | Callback receiving each audio chunk as a Buffer. WebSocket protocol only. |
| params.headerless | boolean | false | When true, chunks passed to onChunk have WAV headers stripped. WebSocket protocol only. |

generateToFile() — Generate to file

Generate audio and save directly to a file.

const deepdub = new DeepdubClient('dd-your-api-key');
await deepdub.connect();

await deepdub.generateToFile('./output.wav', 'Hello, welcome to Deepdub!', {
  locale: 'en-US',
  voicePromptId: 'your-voice-id',
});

deepdub.disconnect();

Returns: Promise<Buffer | void> — generated audio data for HTTP protocol, or resolves when the WebSocket file write completes.

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| filePath | string | Required | Output file path. |
| text | string | Required | Text to convert to speech. |
| params | TtsParams | { format: 'wav' } over WebSocket | Same generation parameters as generateToBuffer(), including onChunk and headerless. |

generateTo() — Low-level generation helper

Generate audio to a selected output type. Most applications should use generateToBuffer() or generateToFile() instead.

await deepdub.connect();

const buffer = await deepdub.generateTo('buffer', 'Hello from the low-level API.', {
  locale: 'en-US',
  voicePromptId: 'your-voice-id',
});

deepdub.disconnect();

Returns: Promise<Buffer | void> — generated audio for buffer output, or resolves when file output completes.

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| outputType | 'buffer' \| 'file' | Required | Output destination type. |
| text | string | Required | Text to convert to speech. |
| params | TtsParams | {} | Same generation parameters as generateToBuffer(), including onChunk and headerless. |
| filePath | string \| null | null | Output file path when outputType is 'file'. |

Streaming chunks

Receive audio data incrementally for real-time playback:

const buffer = await deepdub.generateToBuffer('Streaming audio in real time!', {
  locale: 'en-US',
  voicePromptId: 'your-voice-id',
  model: 'dd-etts-3.2',
  onChunk: (chunk) => {
    console.log(`Received ${chunk.length} bytes`);
    // Stream to an audio player, network response, etc.
  },
});

Headerless chunks

Strip WAV headers from each chunk for raw PCM data:

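// audioPlayer is a placeholder for any writable stream that accepts raw PCM;
// it is not provided by the SDK.
await deepdub.generateToBuffer('Raw PCM streaming.', {
  locale: 'en-US',
  voicePromptId: 'your-voice-id',
  headerless: true,
  onChunk: (chunk) => {
    audioPlayer.write(chunk);
  },
});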
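If headerless chunks need to be saved as a playable file afterwards, a WAV header has to be put back in front of the concatenated PCM. The sketch below builds a standard 44-byte PCM WAV header. wrapPcmInWav is a hypothetical helper, and the sample rate, channel count, and bit depth must match what the API actually produced; the README does not state those values, so the numbers in the example are assumptions.

```javascript
// Hypothetical helper (not an SDK function): prepends a canonical 44-byte
// PCM WAV header to raw PCM data. The caller must supply the real audio
// properties; the defaults below (mono, 16-bit) are assumptions.
function wrapPcmInWav(pcm, { sampleRate, channels = 1, bitsPerSample = 16 }) {
  const blockAlign = channels * (bitsPerSample / 8); // bytes per sample frame
  const byteRate = sampleRate * blockAlign;          // bytes per second

  const header = Buffer.alloc(44);
  header.write('RIFF', 0, 'ascii');
  header.writeUInt32LE(36 + pcm.length, 4); // file size minus the first 8 bytes
  header.write('WAVE', 8, 'ascii');
  header.write('fmt ', 12, 'ascii');
  header.writeUInt32LE(16, 16);             // fmt chunk size for PCM
  header.writeUInt16LE(1, 20);              // audio format 1 = uncompressed PCM
  header.writeUInt16LE(channels, 22);
  header.writeUInt32LE(sampleRate, 24);
  header.writeUInt32LE(byteRate, 28);
  header.writeUInt16LE(blockAlign, 32);
  header.writeUInt16LE(bitsPerSample, 34);
  header.write('data', 36, 'ascii');
  header.writeUInt32LE(pcm.length, 40);     // size of the PCM payload

  return Buffer.concat([header, pcm]);
}

// Example with one second of silence at an assumed 16 kHz, mono, 16-bit.
const silence = Buffer.alloc(16000 * 2);
const wav = wrapPcmInWav(silence, { sampleRate: 16000 });
console.log(`WAV size: ${wav.length} bytes`);
```

If the resulting file plays at the wrong speed or pitch, the assumed sample rate does not match the stream.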

asyncTts() — Streaming generation

Stream audio chunks over WebSocket for low-latency playback. If no WebSocket is connected, asyncTts() opens one automatically; call disconnect() when finished.

const deepdub = new DeepdubClient('dd-your-api-key');

const audioChunks = [];
for await (const chunk of deepdub.asyncTts('Streaming audio in real time!', {
  voicePromptId: 'your-voice-id',
  model: 'dd-etts-3.2',
  locale: 'en-US',
  format: 'wav',
})) {
  audioChunks.push(chunk);
  console.log(`Received chunk: ${chunk.length} bytes`);
}

require('fs').writeFileSync('streamed.wav', Buffer.concat(audioChunks));
deepdub.disconnect();

Yields: Buffer — audio chunks as they are generated.

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| text | string | Required | Text to convert to speech. |
| params | TtsParams | { format: 'wav' } over WebSocket | Supports voicePromptId, model, locale, format, temperature, variance, duration, tempo, seed, promptBoost, accent options, sampleRate with mp3, targetGender, and generationId. |
| params.generationId | string | Auto-generated UUID | Optional UUID for request tracking. |
| params.targetGender | string | — | Target gender for the output voice. |
| params.onChunk | (chunk: Buffer) => void | — | Internal chunk callback used by the iterator. You usually do not need to pass this directly. |
| params.headerless | boolean | false | When true, chunks have WAV headers stripped. |
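Since asyncTts() yields Buffer chunks, the collection loop shown earlier can be factored into a reusable function that works with any async iterable. collectChunks is a hypothetical helper, not part of the SDK; the stand-in generator below exists only so the sketch runs without an API key.

```javascript
// Hypothetical helper (not an SDK function): drains any async iterable of
// Buffers into one Buffer and reports the running byte count.
async function collectChunks(chunks, onProgress = () => {}) {
  const parts = [];
  let total = 0;
  for await (const chunk of chunks) {
    parts.push(chunk);
    total += chunk.length;
    onProgress(total);
  }
  return Buffer.concat(parts, total);
}

// Stand-in for deepdub.asyncTts(...), used here only for demonstration.
async function* fakeStream() {
  yield Buffer.from([0x01, 0x02]);
  yield Buffer.from([0x03]);
}

collectChunks(fakeStream(), (bytes) => console.log(`Received ${bytes} bytes so far`))
  .then((audio) => console.log(`Total: ${audio.length} bytes`));
```

With a real client the call would be collectChunks(deepdub.asyncTts(text, params)), followed by disconnect().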

Concurrent generations

Run multiple generations in parallel on the same WebSocket connection:

const deepdub = new DeepdubClient('dd-your-api-key');
await deepdub.connect();

const sentences = [
  'First sentence to generate.',
  'Second sentence in parallel.',
  'Third sentence simultaneously.',
];

await Promise.all(
  sentences.map((text, index) =>
    deepdub.generateToFile(`./output_${index}.wav`, text, {
      locale: 'en-US',
      voicePromptId: 'your-voice-id',
      model: 'dd-etts-3.2',
    })
  )
);

deepdub.disconnect();
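Promise.all starts every generation at once. For long lists it may be worth capping how many run at a time, since the server can reject requests with rate-limit errors. The sketch below is a generic limiter; mapWithConcurrency is a hypothetical helper, and the README does not say what limit is appropriate.

```javascript
// Hypothetical helper (not an SDK function): runs `worker` over `items`
// with at most `limit` calls in flight, preserving result order.
async function mapWithConcurrency(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0;

  // Each runner pulls the next unclaimed index until none are left.
  async function run() {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index], index);
    }
  }

  const runnerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: runnerCount }, () => run()));
  return results;
}

// Usage sketch with a real client (commented out so this block runs without the SDK):
// await mapWithConcurrency(sentences, 2, (text, index) =>
//   deepdub.generateToFile(`./output_${index}.wav`, text, {
//     locale: 'en-US',
//     voicePromptId: 'your-voice-id',
//   })
// );
```

If any worker rejects, the returned promise rejects as well, which matches the behaviour of the Promise.all example above.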

Voice Management

listVoices() — List all voice prompts

const voices = await deepdub.listVoices();

for (const voice of voices.voicePrompts ?? []) {
  console.log(`${voice.id}: ${voice.name ?? voice.title ?? 'Untitled'}`);
}

Returns: Promise<{ voicePrompts: VoicePrompt[] }> — an object with a voicePrompts key containing voice prompt objects.

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| limit | number | — | Optional maximum number of voices to return. |
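A common next step is turning a display name into the voicePromptId that the generation methods expect. The sketch below works on the response shape shown above ({ voicePrompts: [...] } with id and name or title fields). findVoiceIdByName is a hypothetical helper, and the case-insensitive match is a design choice, not SDK behaviour.

```javascript
// Hypothetical helper (not an SDK function): looks up a voice prompt ID by
// display name in a listVoices() response. Returns null when nothing matches.
function findVoiceIdByName(listResponse, name) {
  const wanted = name.trim().toLowerCase();
  const match = (listResponse?.voicePrompts ?? []).find((voice) => {
    // Same name/title fallback as the listing example above.
    const label = voice.name ?? voice.title ?? '';
    return label.trim().toLowerCase() === wanted;
  });
  return match ? match.id : null;
}

// Demonstration with made-up data in the documented shape.
const sample = {
  voicePrompts: [
    { id: 'voice-1', name: 'Professional Narrator' },
    { id: 'voice-2', title: 'Casual Host' },
  ],
};
console.log(findVoiceIdByName(sample, 'casual host'));
```

With a real client the first argument would be the result of await deepdub.listVoices().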

addVoice() — Upload a voice sample

const response = await deepdub.addVoice({
  data: './voice_sample.wav',
  name: 'Professional Narrator',
  gender: 'female',
  locale: 'en-US',
  publish: false,
  speakingStyle: 'Neutral',
  age: 30,
});

console.log(`Created voice: ${JSON.stringify(response)}`);

Returns: Promise<object> — created voice prompt information.

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| data | string \| Buffer | Required | Audio data: file path, raw Buffer, or base64-encoded string. |
| name | string | Required | Display name for the voice prompt. |
| gender | string | Required | Speaker gender: "male" or "female". Sent to the API as uppercase. |
| locale | string | Required | Language locale code (e.g. en-US). |
| publish | boolean | false | Whether to make the voice publicly available. |
| speakingStyle | string | 'Neutral' | Speaking style descriptor. |
| age | number | 0 | Age of the speaker. |
| filename | string | Derived from data | File name to send with the voice sample. |
| text | string | — | Transcript or text associated with the voice sample. |
| speakerId | string | — | Speaker ID to associate with the voice prompt. |

CLI Reference

# List available voices
deepdub list-voices

# Upload a new voice
deepdub add-voice \
  --file path/to/audio.mp3 \
  --name "My Voice" \
  --gender male \
  --locale en-US

# Generate text-to-speech
deepdub tts \
  --text "Hello from the CLI!" \
  --voice-prompt-id your-voice-id \
  --out output.mp3

# Set API key via flag or environment
deepdub --api-key dd-your-key tts --text "Hello!" --voice-prompt-id your-id
export DEEPDUB_API_KEY=dd-your-key

Environment Variables

| Variable | Description | Default |
| --- | --- | --- |
| DEEPDUB_API_KEY | API key for authentication | — |
| DEEPDUB_BASE_URL | REST API base URL | US/EU production URL |
| DEEPDUB_BASE_WEBSOCKET_URL | WebSocket API base URL | US/EU production URL |
| DEEPDUB_BASE_WEBSOCKET_STREAMING_URL | Streaming WebSocket URL | wss://wss.deepdub.ai/ws |
| DD_EU | Use EU endpoints ("1" to enable) | "0" |

Error Handling

try {
  const audio = await deepdub.tts('Hello!', {
    voicePromptId: 'your-voice-id',
  });
} catch (error) {
  if (error.status === 401) {
    console.error('Invalid API key');
  } else if (error.status === 400) {
    console.error('Invalid request parameters:', error.message);
  } else {
    console.error('API error:', error.message);
  }
}

For WebSocket operations, server errors are thrown as Error with the server message, such as rate limits or insufficient credits.
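Because REST errors carry a status while WebSocket errors arrive as plain Error objects, one place to normalise both can keep calling code simpler. The sketch below relies only on the error.status and error.message fields used in the example above. summarizeError and its kind labels are hypothetical and not part of the SDK.

```javascript
// Hypothetical helper (not an SDK function): reduces a thrown value to a
// small summary object. The `kind` labels are made up for this sketch.
function summarizeError(error) {
  const message = error?.message ?? String(error);

  if (typeof error?.status !== 'number') {
    // No HTTP status, for example a WebSocket server error thrown as a plain Error.
    return { kind: 'no-status', status: null, message };
  }
  if (error.status === 401) return { kind: 'auth', status: 401, message };
  if (error.status === 400) return { kind: 'bad-request', status: 400, message };
  return { kind: 'api', status: error.status, message };
}

// Demonstration with locally constructed errors.
const unauthorized = Object.assign(new Error('Unauthorized'), { status: 401 });
console.log(summarizeError(unauthorized));
console.log(summarizeError(new Error('insufficient credits')));
```

A caller can then branch on kind once, regardless of which protocol produced the error.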

Available Models

| Model ID | Description |
| --- | --- |
| dd-etts-3.2 | Latest model (default) |
| dd-etts-3.0 | High-quality production model |
| dd-etts-2.5 | Stable production model |

Tests

Live API tests require DEEPDUB_API_KEY in .env:

npm test

Set DEEPDUB_VOICE_REFERENCE_FILE to a real reference audio file to enable the optional voice reference test.

Individual suites: node test/test-tts.js, node test/test-async-tts.js, node test/test-eu-region.js, etc.