@travisvn/edge-tts

v1.0.2

Published

9 months ago

Use Microsoft Edge's online text-to-speech service in Node.js without needing Microsoft Edge, Windows, or an API key.

travisvn

text-to-speech tts edge-tts

edge-tts (Node.js)

This is a TypeScript port of the Python edge-tts library for Node.js. It lets you use Microsoft Edge's online text-to-speech service from within your Node.js applications.

The package stays faithful to the original, replicating the specific headers and WebSocket communication needed to interact with the service.

Installation

npm install @travisvn/edge-tts
# or
yarn add @travisvn/edge-tts

Quick Start

Simple API (Recommended for most use cases)

import { EdgeTTS } from '@travisvn/edge-tts';
import fs from 'fs/promises';

// Simple one-shot synthesis
const tts = new EdgeTTS('Hello, world!', 'en-US-EmmaMultilingualNeural');
const result = await tts.synthesize();

// Save audio file
const audioBuffer = Buffer.from(await result.audio.arrayBuffer());
await fs.writeFile('output.mp3', audioBuffer);

Advanced Streaming API (For real-time processing)

import { Communicate } from '@travisvn/edge-tts';
import fs from 'fs/promises';

const communicate = new Communicate('Hello, world!', {
  voice: 'en-US-EmmaMultilingualNeural',
});

const buffers: Buffer[] = [];
for await (const chunk of communicate.stream()) {
  if (chunk.type === 'audio' && chunk.data) {
    buffers.push(chunk.data);
  }
}

await fs.writeFile('output.mp3', Buffer.concat(buffers));

Usage Examples

Here's how to use the simple, promise-based API for quick synthesis:

// examples/simple-api.ts
import { EdgeTTS, createVTT, createSRT } from '@travisvn/edge-tts';
import { promises as fs } from 'fs';
import path from 'path';

const TEXT = 'Hello, world! This is a test of the simple edge-tts API.';
const VOICE = 'en-US-EmmaMultilingualNeural';
const OUTPUT_FILE = path.join(__dirname, 'simple-test.mp3');

async function main() {
  // Create TTS instance with prosody options
  const tts = new EdgeTTS(TEXT, VOICE, {
    rate: '+10%',
    volume: '+0%',
    pitch: '+0Hz',
  });

  try {
    // Synthesize speech (one-shot)
    const result = await tts.synthesize();

    // Save audio file
    const audioBuffer = Buffer.from(await result.audio.arrayBuffer());
    await fs.writeFile(OUTPUT_FILE, audioBuffer);

    // Generate subtitle files
    const vttContent = createVTT(result.subtitle);
    const srtContent = createSRT(result.subtitle);

    await fs.writeFile('subtitles.vtt', vttContent);
    await fs.writeFile('subtitles.srt', srtContent);

    console.log(`Audio saved to ${OUTPUT_FILE}`);
    console.log(`Generated ${result.subtitle.length} word boundaries`);
  } catch (error) {
    console.error('Synthesis failed:', error);
  }
}

main().catch(console.error);
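
The prosody options in the example above are plain strings, so a typo such as '10%' instead of '+10%' is easy to make. The following is a minimal sketch of a pre-flight check, assuming rate and volume take a signed percentage and pitch takes a signed Hz value, as the example suggests. `validateProsody` and `ProsodyOptions` are hypothetical names for illustration and are not exports of this package; the library may apply its own, different validation.

```typescript
// Hypothetical helper, not part of @travisvn/edge-tts.
// Assumed formats: rate/volume like "+10%" or "-5%", pitch like "+0Hz".
interface ProsodyOptions {
  rate?: string;
  volume?: string;
  pitch?: string;
}

const SIGNED_PERCENT = /^[+-]\d+%$/;
const SIGNED_HERTZ = /^[+-]\d+Hz$/;

// Returns a list of human-readable problems; an empty list means
// every provided option matches the assumed format.
function validateProsody(options: ProsodyOptions): string[] {
  const problems: string[] = [];
  if (options.rate !== undefined && !SIGNED_PERCENT.test(options.rate)) {
    problems.push(`rate "${options.rate}" should look like "+10%"`);
  }
  if (options.volume !== undefined && !SIGNED_PERCENT.test(options.volume)) {
    problems.push(`volume "${options.volume}" should look like "+0%"`);
  }
  if (options.pitch !== undefined && !SIGNED_HERTZ.test(options.pitch)) {
    problems.push(`pitch "${options.pitch}" should look like "+0Hz"`);
  }
  return problems;
}
```

Calling `validateProsody({ rate: '+10%', volume: '+0%', pitch: '+0Hz' })` returns an empty array, while `validateProsody({ rate: '10%' })` reports one problem because the sign is missing.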

Here is an example using the advanced streaming API for real-time processing:

// examples/streaming.ts
import { Communicate } from '@travisvn/edge-tts';
import { promises as fs } from 'fs';
import path from 'path';

const TEXT =
  'Hello, world! This is a test of the new edge-tts Node.js library.';
const VOICE = 'en-US-EmmaMultilingualNeural';
const OUTPUT_FILE = path.join(__dirname, 'test.mp3');

async function main() {
  const communicate = new Communicate(TEXT, { voice: VOICE });

  const buffers: Buffer[] = [];
  for await (const chunk of communicate.stream()) {
    if (chunk.type === 'audio' && chunk.data) {
      buffers.push(chunk.data);
    }
  }

  const finalBuffer = Buffer.concat(buffers);
  await fs.writeFile(OUTPUT_FILE, finalBuffer);

  console.log(`Audio saved to ${OUTPUT_FILE}`);
}

main().catch(console.error);
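
The collection loop in the streaming example can be factored into a reusable helper. The sketch below assumes only the chunk fields used above (`type` and an optional `data`); `StreamChunk`, `collectAudio`, and `fakeStream` are hypothetical names, and the fake generator stands in for `communicate.stream()` so the code runs without a network connection. It uses `Uint8Array` rather than `Buffer` to stay independent of Node typings; in Node.js a `Buffer` is itself a `Uint8Array`.

```typescript
// Hypothetical chunk shape, modelled on the fields used in the examples.
type StreamChunk =
  | { type: 'audio'; data?: Uint8Array }
  | { type: 'WordBoundary'; text?: string };

// Drains an async stream of chunks and returns all audio bytes joined
// in arrival order. Non-audio chunks are skipped.
async function collectAudio(
  stream: AsyncIterable<StreamChunk>
): Promise<Uint8Array> {
  const parts: Uint8Array[] = [];
  let total = 0;
  for await (const chunk of stream) {
    if (chunk.type === 'audio' && chunk.data) {
      parts.push(chunk.data);
      total += chunk.data.length;
    }
  }

  // Copy every part into one contiguous array.
  const joined = new Uint8Array(total);
  let offset = 0;
  for (const part of parts) {
    joined.set(part, offset);
    offset += part.length;
  }
  return joined;
}

// Stand-in for communicate.stream(), so the sketch runs offline.
async function* fakeStream(): AsyncGenerator<StreamChunk> {
  yield { type: 'audio', data: new Uint8Array([1, 2]) };
  yield { type: 'WordBoundary', text: 'Hello' };
  yield { type: 'audio', data: new Uint8Array([3]) };
}
```

With the fake stream, `await collectAudio(fakeStream())` yields the three bytes 1, 2, 3. Buffering everything in memory is fine for short utterances; for long texts, writing each chunk to its destination as it arrives keeps memory use flat.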

You can list all available voices and filter them by attributes such as language, gender, and locale.

// examples/listVoices.ts
import { VoicesManager } from '@travisvn/edge-tts';

async function main() {
  const voicesManager = await VoicesManager.create();

  // Find all English voices
  const voices = voicesManager.find({ Language: 'en' });
  console.log(
    'English voices:',
    voices.map((v) => v.ShortName)
  );

  // Find female US voices
  const femaleUsVoices = voicesManager.find({
    Gender: 'Female',
    Locale: 'en-US',
  });
  console.log(
    'Female US voices:',
    femaleUsVoices.map((v) => v.ShortName)
  );
}

main().catch(console.error);
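
To make the filtering semantics concrete, here is a rough sketch of attribute matching in which every key in the criteria must equal the corresponding field on the voice. This is an assumption about how a `find`-style filter could behave, not the package's actual implementation; `VoiceInfo`, `findVoices`, and the sample records are hypothetical and include only the fields used in the example above.

```typescript
// Hypothetical voice record with only the fields used in the example.
interface VoiceInfo {
  ShortName: string;
  Gender: string;
  Locale: string;
  Language: string;
}

// Keeps the voices whose fields equal every key/value pair in criteria.
function findVoices(
  voices: VoiceInfo[],
  criteria: Partial<VoiceInfo>
): VoiceInfo[] {
  const keys = Object.keys(criteria) as (keyof VoiceInfo)[];
  return voices.filter((voice) =>
    keys.every((key) => voice[key] === criteria[key])
  );
}

// Sample records for illustration only.
const sampleVoices: VoiceInfo[] = [
  {
    ShortName: 'en-US-EmmaMultilingualNeural',
    Gender: 'Female',
    Locale: 'en-US',
    Language: 'en',
  },
  {
    ShortName: 'en-GB-SoniaNeural',
    Gender: 'Female',
    Locale: 'en-GB',
    Language: 'en',
  },
  {
    ShortName: 'en-US-GuyNeural',
    Gender: 'Male',
    Locale: 'en-US',
    Language: 'en',
  },
];

const femaleUs = findVoices(sampleVoices, { Gender: 'Female', Locale: 'en-US' });
console.log(femaleUs.map((v) => v.ShortName));
```

On the sample data this logs a single entry, `en-US-EmmaMultilingualNeural`, because it is the only record that satisfies both criteria.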

The stream() method also yields WordBoundary chunks, which can be fed to SubMaker to generate subtitles.

// examples/streaming.ts
import { Communicate, SubMaker } from '@travisvn/edge-tts';

const TEXT = 'This is a test of the streaming functionality, with subtitles.';
const VOICE = 'en-GB-SoniaNeural';

async function main() {
  const communicate = new Communicate(TEXT, { voice: VOICE });
  const subMaker = new SubMaker();

  for await (const chunk of communicate.stream()) {
    if (chunk.type === 'audio' && chunk.data) {
      // Do something with the audio data, e.g., stream it to a client.
      console.log(`Received audio chunk of size: ${chunk.data.length}`);
    } else if (chunk.type === 'WordBoundary') {
      subMaker.feed(chunk);
    }
  }

  // Get the subtitles in SRT format.
  const srt = subMaker.getSrt();
  console.log('\nGenerated Subtitles (SRT):\n', srt);
}

main().catch(console.error);
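
For readers curious how word timings become SRT cues, the sketch below formats one cue per word. It assumes each boundary carries an `offset`, a `duration`, and a `text` field, with times in 100-nanosecond ticks as in the Python edge-tts library; check the package's type definitions before relying on these names or units. `WordCue`, `formatSrtTime`, and `cuesToSrt` are hypothetical, and the package's own SubMaker may group or format cues differently.

```typescript
// Hypothetical cue shape; offset and duration are assumed to be
// expressed in 100-nanosecond ticks (10,000 ticks per millisecond).
interface WordCue {
  offset: number;
  duration: number;
  text: string;
}

const TICKS_PER_MS = 10000;

// Left-pads a non-negative integer with zeros to the given width.
function pad(value: number, width: number): string {
  let s = String(value);
  while (s.length < width) {
    s = '0' + s;
  }
  return s;
}

// Converts ticks to the SRT timestamp form HH:MM:SS,mmm.
function formatSrtTime(ticks: number): string {
  const totalMs = Math.floor(ticks / TICKS_PER_MS);
  const ms = totalMs % 1000;
  const totalSeconds = Math.floor(totalMs / 1000);
  const seconds = totalSeconds % 60;
  const minutes = Math.floor(totalSeconds / 60) % 60;
  const hours = Math.floor(totalSeconds / 3600);
  return `${pad(hours, 2)}:${pad(minutes, 2)}:${pad(seconds, 2)},${pad(ms, 3)}`;
}

// Emits one numbered SRT cue per word, separated by blank lines.
function cuesToSrt(cues: WordCue[]): string {
  return cues
    .map((cue, index) => {
      const start = formatSrtTime(cue.offset);
      const end = formatSrtTime(cue.offset + cue.duration);
      return `${index + 1}\n${start} --> ${end}\n${cue.text}\n`;
    })
    .join('\n');
}
```

Under these assumptions, `formatSrtTime(15000000)` returns `00:00:01,500`, and a single cue with offset 1000000 and duration 5000000 spans `00:00:00,100 --> 00:00:00,600`.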

API Reference

📖 Complete API Documentation →

The main exports of the package are:

Simple API:

EdgeTTS - Simple, promise-based TTS class for one-shot synthesis
createVTT / createSRT - Utility functions for subtitle generation

Advanced API:

Communicate - Advanced streaming TTS class for real-time processing
VoicesManager - A class to find and filter voices
listVoices - A function to get all available voices
SubMaker - A utility to generate SRT subtitles from WordBoundary events

Common:

Exception classes - NoAudioReceived, WebSocketError, etc.
TypeScript types - Complete type definitions for voices, options, and stream chunks

Both APIs share the same underlying infrastructure, including DRM security handling, error recovery, proxy support, and all Microsoft Edge authentication features.

For detailed documentation, examples, and advanced usage patterns, see the comprehensive API guide.
