@andresaya/edge-tts

v1.7.2

Edge TTS is a package that allows access to the online text-to-speech service used by Microsoft Edge without the need for Microsoft Edge, Windows, or an API key.

Edge TTS

Edge TTS is a powerful text-to-speech (TTS) package that leverages Microsoft Edge's speech synthesis capabilities. It lets you synthesize speech from text and manage voice options, both programmatically and through a command-line interface (CLI).

Features

  • Text-to-Speech: Convert text into natural-sounding speech using Microsoft Edge's TTS capabilities.
  • TypeScript Support: Full TypeScript support with comprehensive type definitions included.
  • Multiple Audio Formats: Support for 36+ audio formats (MP3, WebM, OGG, WAV, PCM, and more).
  • Multiple Voices: Access a variety of voices to suit your project's needs.
  • Voice Filtering: Filter voices by language and gender for better selection.
  • Audio Information: Get detailed information about generated audio (size, duration, format).
  • Audio Export Options: Export synthesized audio in different formats (raw, base64, or directly to a file).
  • Streaming Support: Stream audio data in real-time for better performance.
  • Word Boundaries Metadata: Get word boundary information with precise timestamps.
  • Command-Line Interface: Use a simple CLI for easy access to functionality.
  • Easy Integration: Modular structure allows for easy inclusion in existing projects.
  • Custom SSML: 🥳 Edge TTS accepts raw SSML with the full feature set of Azure AI Speech.
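
For a quick taste of how these pieces fit together, here is a minimal end-to-end sketch based on the API documented in the sections below (synthesize a short phrase and save it to a file):

import { EdgeTTS } from '@andresaya/edge-tts';

// Minimal quickstart: synthesize a phrase and save the audio to disk.
const tts = new EdgeTTS();
await tts.synthesize("Hello from Edge TTS!", 'en-US-AriaNeural');
await tts.toFile('./hello'); // the file extension is added automatically based on the output format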

Installation

You can install Edge TTS via npm or bun:

bun add @andresaya/edge-tts
npm install @andresaya/edge-tts

TypeScript Support

Edge TTS is written in TypeScript and includes full type definitions. No additional @types packages are needed.

Available Types

import { 
    EdgeTTS, 
    Constants,
    Voice,
    SynthesisOptions,
    WordBoundary 
} from '@andresaya/edge-tts';

// Voice interface
interface Voice {
    Name: string;
    ShortName: string;
    Gender: 'Male' | 'Female';
    Locale: string;
    FriendlyName: string;
    LocalName: string;
}

// Synthesis options
interface SynthesisOptions {
    pitch?: string | number;       // e.g., '+20Hz' or 20
    rate?: string | number;        // e.g., '50%' or 50
    volume?: string | number;      // e.g., '90%' or 90
    inputType?: 'auto' | 'ssml' | 'text';  // Default: 'auto'
    outputFormat?: string;         // e.g., Constants.OUTPUT_FORMAT.AUDIO_24KHZ_96KBITRATE_MONO_MP3
}

// Word boundary metadata
interface WordBoundary {
    type: "WordBoundary";
    offset: number;
    duration: number;
    text: string;
}

Type-Safe Usage Example

import { EdgeTTS, SynthesisOptions, Constants } from '@andresaya/edge-tts';

const tts = new EdgeTTS();

const options: SynthesisOptions = {
    pitch: '+10Hz',
    rate: '100%',
    volume: '90%',
    outputFormat: Constants.OUTPUT_FORMAT.WEBM_24KHZ_16BIT_MONO_OPUS
};

await tts.synthesize("TypeScript example", 'en-US-AriaNeural', options);

const info = tts.getAudioInfo(); // Returns: { size: number; format: string; estimatedDuration: number }
const boundaries = tts.getWordBoundaries(); // Returns: WordBoundary[]

Usage

Command-Line Interface

Install globally to use the CLI:

npm install -g @andresaya/edge-tts

To synthesize speech from text:

edge-tts synthesize -t "Hello, world!" -o hello_world_audio

To synthesize from an SSML file:

edge-tts synthesize -f ssml.txt --ssml -o salida

To list available voices:

edge-tts voice-list

Integration into Your Project

import { EdgeTTS } from '@andresaya/edge-tts';

// Initialize the EdgeTTS service
const tts = new EdgeTTS();

API Reference

Voice Management

Get All Voices

const voices = await tts.getVoices();
console.log(`Found ${voices.length} voices`);

Filter Voices by Language

// Get all English voices
const englishVoices = await tts.getVoicesByLanguage('en');

// Get specific locale voices
const usEnglishVoices = await tts.getVoicesByLanguage('en-US');

Filter Voices by Gender

// Get all female voices
const femaleVoices = await tts.getVoicesByGender('Female');

// Get all male voices
const maleVoices = await tts.getVoicesByGender('Male');

Text Synthesis

Custom SSML (Advanced)

Edge TTS accepts raw SSML so you can control prosody, styles, pauses, pronunciations, and more. You can pass SSML from code or the CLI. By default the library auto-detects if your input is SSML; you can also force the mode.

For more information, see the Azure AI Speech SSML documentation.

SSML-builder

@andresaya/ssml-builder is a powerful, type-safe TypeScript library for building Speech Synthesis Markup Language (SSML) documents, so you can create expressive text-to-speech applications with Azure Speech Service and other SSML-compliant engines.

What the library does for you

  • Auto-detect / force mode: options.inputType may be 'auto' | 'ssml' | 'text' (default: auto).
  • Validation: Throws helpful errors if the SSML is malformed (e.g., missing <speak>, missing <voice>, or missing the synthesis namespace).
  • Voice injection: If your SSML lacks a <voice> element, it injects one using the voice you passed.
  • Text wrapping: If you pass plain text (or set inputType: 'text'), it wraps the text in a valid SSML envelope using your rate, pitch, and volume (see the plain-text sketch after the SSML example below).

import { EdgeTTS } from '@andresaya/edge-tts';

const tts = new EdgeTTS();

const ssml = `
<speak version="1.0"
       xmlns="http://www.w3.org/2001/10/synthesis"
       xmlns:mstts="https://www.w3.org/2001/mstts"
       xml:lang="es-CO">
  <voice name="es-CO-GonzaloNeural">
    <mstts:express-as style="narration-professional">
      <prosody rate="+5%" pitch="+10Hz" volume="+0%">
        Hola, este es un ejemplo de <emphasis>SSML</emphasis>.
        <break time="400ms" />
        El número es <say-as interpret-as="cardinal">2025</say-as>.
        La palabra se pronuncia
        <phoneme alphabet="ipa" ph="ˈxola">hola</phoneme>.
      </prosody>
    </mstts:express-as>
  </voice>
</speak>`.trim();

// Auto-detects SSML, or force it with inputType: 'ssml'
await tts.synthesize(ssml, 'es-CO-GonzaloNeural', { inputType: 'ssml' });
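
For comparison, here is a short sketch of the plain-text path: when you pass ordinary text (or force inputType: 'text'), the library wraps it in an SSML envelope for you, applying the rate, pitch, and volume you provide. The exact envelope it generates is an internal detail; this example only shows the call.

import { EdgeTTS } from '@andresaya/edge-tts';

const tts = new EdgeTTS();

// Plain text: the library wraps this in a valid SSML envelope,
// applying the prosody options below.
await tts.synthesize("Just plain text, no markup needed.", 'en-US-AriaNeural', {
    inputType: 'text',   // optional; 'auto' (the default) also treats non-SSML input as text
    rate: '+5%',
    pitch: '+10Hz',
    volume: '90%'
});
await tts.toFile('./plain_text_example');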

Basic Synthesis

// Simple synthesis with default voice
await tts.synthesize("Hello, world!");

// Synthesis with specific voice
await tts.synthesize("Hello, world!", 'en-US-AriaNeural');

Advanced Synthesis with Options

await tts.synthesize("Hello, world!", 'en-US-AriaNeural', {
    rate: '50%',           // Speech rate: -100% to +200% (or number)
    volume: '90%',         // Speech volume: -100% to +100% (or number)
    pitch: '+20Hz',        // Voice pitch: -100Hz to +100Hz (or number)
    outputFormat: 'audio-24khz-96kbitrate-mono-mp3'  // Audio output format
});

Audio Output Formats

Edge TTS supports multiple audio formats. You can specify the format using the outputFormat option:

import { EdgeTTS, Constants } from '@andresaya/edge-tts';

const tts = new EdgeTTS();

// High quality MP3
await tts.synthesize("Hello!", 'en-US-AriaNeural', {
    outputFormat: Constants.OUTPUT_FORMAT.AUDIO_24KHZ_96KBITRATE_MONO_MP3
});
await tts.toFile('./output/audio'); // Automatically saved as .mp3

// WebM/Opus for web
await tts.synthesize("Hello!", 'en-US-AriaNeural', {
    outputFormat: Constants.OUTPUT_FORMAT.WEBM_24KHZ_16BIT_MONO_OPUS
});
await tts.toFile('./output/audio'); // Automatically saved as .webm

// Lossless WAV
await tts.synthesize("Hello!", 'en-US-AriaNeural', {
    outputFormat: Constants.OUTPUT_FORMAT.RIFF_24KHZ_16BIT_MONO_PCM
});
await tts.toFile('./output/audio'); // Automatically saved as .wav

Available formats (all 36 tested and compatible):

  • MP3 Formats (Streaming): 16kHz, 24kHz, 48kHz with various bitrates (32-192 kbps)
  • Opus Formats (Streaming): Audio, WebM, and OGG containers
  • WAV/PCM Formats (Non-streaming): RIFF (8-48 kHz) and RAW variants
  • Specialized Codecs: AMR-WB, G.722, TrueSilk, A-law, μ-law

See Constants.OUTPUT_FORMAT for the complete list. The file extension is automatically detected based on the format.

Format recommendations:

  • 🌐 Web streaming: WEBM_24KHZ_16BIT_MONO_OPUS or AUDIO_24KHZ_96KBITRATE_MONO_MP3
  • 📱 Mobile apps: AUDIO_24KHZ_48KBITRATE_MONO_MP3
  • 💾 High quality: AUDIO_48KHZ_192KBITRATE_MONO_MP3 or RIFF_48KHZ_16BIT_MONO_PCM
  • Low bandwidth: AUDIO_16KHZ_32KBITRATE_MONO_MP3
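
If you want to encode those recommendations in code, a small hypothetical helper (pickFormat is not part of the library, just an illustration) might look like this:

import { Constants } from '@andresaya/edge-tts';

// Hypothetical helper: map a use case to one of the recommended output formats above.
type UseCase = 'web' | 'mobile' | 'high-quality' | 'low-bandwidth';

function pickFormat(useCase: UseCase): string {
    switch (useCase) {
        case 'web':           return Constants.OUTPUT_FORMAT.WEBM_24KHZ_16BIT_MONO_OPUS;
        case 'mobile':        return Constants.OUTPUT_FORMAT.AUDIO_24KHZ_48KBITRATE_MONO_MP3;
        case 'high-quality':  return Constants.OUTPUT_FORMAT.AUDIO_48KHZ_192KBITRATE_MONO_MP3;
        case 'low-bandwidth': return Constants.OUTPUT_FORMAT.AUDIO_16KHZ_32KBITRATE_MONO_MP3;
    }
}

// Usage: await tts.synthesize("Hello!", 'en-US-AriaNeural', { outputFormat: pickFormat('web') });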

Streaming Synthesis

// Stream audio data in real-time
for await (const chunk of tts.synthesizeStream("Long text to stream...", 'en-US-AriaNeural')) {
    // Process each audio chunk as it arrives
    console.log(`Received chunk: ${chunk.length} bytes`);
}

Audio Information

Get Audio Details

await tts.synthesize("Hello, world!");

const audioInfo = tts.getAudioInfo();
console.log(`Size: ${audioInfo.size} bytes`);
console.log(`Format: ${audioInfo.format}`);
console.log(`Duration: ${audioInfo.estimatedDuration} seconds`);

Get Duration Only

const duration = tts.getDuration();
console.log(`Audio duration: ${duration} seconds`);

Export Options

Export as Base64

await tts.synthesize("Hello, world!");
const base64Audio = tts.toBase64();
console.log(`Base64 length: ${base64Audio.length}`);

Export as Raw Buffer

const rawAudio = tts.toRaw(); // Alias for toBase64()
const buffer = tts.toBuffer(); // Get as Buffer object

Export to File

const filePath = await tts.toFile("output_audio");
console.log(`Audio saved to: ${filePath}`);
// Creates: output_audio.mp3

Word Boundaries Metadata

// Get word boundaries with timestamps
const boundaries = tts.getWordBoundaries();

// Save metadata to file
tts.saveMetadata('metadata.json');
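
As a rough illustration, the boundaries can be used to line the text up with the audio. Note that the units of offset and duration are whatever the library reports in the WordBoundary objects (see the type definition above); this sketch just prints them as-is.

import { EdgeTTS } from '@andresaya/edge-tts';

const tts = new EdgeTTS();
await tts.synthesize("Each word comes with timing metadata.", 'en-US-AriaNeural');

// Print each word together with its boundary metadata.
for (const boundary of tts.getWordBoundaries()) {
    console.log(`${boundary.text}: offset=${boundary.offset}, duration=${boundary.duration}`);
}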

Examples

Complete Example with Voice Selection

import { EdgeTTS } from '@andresaya/edge-tts';

async function textToSpeechExample() {
    const tts = new EdgeTTS();
    
    // Get available English voices
    const englishVoices = await tts.getVoicesByLanguage('en-US');
    console.log(`Available English voices: ${englishVoices.length}`);
    
    // Use the first available voice
    const voice = englishVoices[0];
    console.log(`Using voice: ${voice.FriendlyName}`);
    
    // Synthesize with custom options
    await tts.synthesize(
        "This is a test of the Edge TTS system with custom voice parameters.",
        voice.ShortName,
        {
            pitch: '+10Hz',
            rate: '-10%',
            volume: '90%'
        }
    );
    
    // Get audio information
    const info = tts.getAudioInfo();
    console.log(`Generated audio: ${info.size} bytes, ${info.estimatedDuration.toFixed(2)}s`);
    
    // Save to file
    const outputPath = await tts.toFile('./output/speech');
    console.log(`Audio saved to: ${outputPath}`);
}

textToSpeechExample().catch(console.error);

Streaming Example

import { EdgeTTS } from '@andresaya/edge-tts';
import { createWriteStream } from 'fs';

async function streamingExample() {
    const tts = new EdgeTTS();
    const writeStream = createWriteStream('streaming_output.mp3');
    
    const longText = "This is a very long text that will be streamed...";
    
    for await (const chunk of tts.synthesizeStream(longText, 'en-US-AriaNeural')) {
        writeStream.write(chunk);
        console.log(`Streamed ${chunk.length} bytes`);
    }
    
    writeStream.end();
    console.log('Streaming completed!');
}

streamingExample().catch(console.error);

Voice Exploration Example

import { EdgeTTS } from '@andresaya/edge-tts';

async function exploreVoices() {
    const tts = new EdgeTTS();
    
    // Get all voices
    const allVoices = await tts.getVoices();
    console.log(`Total voices available: ${allVoices.length}`);
    
    // Group by language
    const languages = [...new Set(allVoices.map(v => v.Locale.split('-')[0]))];
    console.log(`Languages available: ${languages.join(', ')}`);
    
    // Get Spanish voices
    const spanishVoices = await tts.getVoicesByLanguage('es');
    console.log(`Spanish voices: ${spanishVoices.length}`);
    
    // Get female voices
    const femaleVoices = await tts.getVoicesByGender('Female');
    console.log(`Female voices: ${femaleVoices.length}`);
    
    // Test different voices
    const testText = "Hola, este es un ejemplo de síntesis de voz.";
    
    for (const voice of spanishVoices.slice(0, 3)) {
        console.log(`Testing voice: ${voice.FriendlyName}`);
        
        await tts.synthesize(testText, voice.ShortName);
        const filePath = await tts.toFile(`./voices/${voice.ShortName}`);
        
        console.log(`Saved: ${filePath}`);
    }
}

exploreVoices().catch(console.error);

Browser Support

This library can be used directly in web browsers via CDN or ES modules.

⚠️ Important: Currently, this library only works reliably in the Microsoft Edge browser. We are working to extend support to other browsers. Community contributions and suggestions are welcome!

CDN Usage (UMD)

<!-- Load from CDN -->
<script src="https://unpkg.com/@andresaya/edge-tts@latest/dist/browser/edge-tts.umd.min.js"></script>

<script>
  const tts = new EdgeTTS();
  
  // Get available voices
  tts.getVoices().then(voices => {
    console.log('Available voices:', voices.length);
  });
  
  // Synthesize speech
  async function speak() {
    await tts.synthesize("Hello from the browser!", 'en-US-AriaNeural');
    const audioData = tts.getAudioData();
    
    // Play audio
    const audioBlob = new Blob([audioData], { type: 'audio/mp3' });
    const audioUrl = URL.createObjectURL(audioBlob);
    const audio = new Audio(audioUrl);
    audio.play();
  }
</script>

ES Module Import

<script type="module">
  import { EdgeTTS } from 'https://unpkg.com/@andresaya/edge-tts@latest/dist/browser/edge-tts.esm.min.js';
  
  const tts = new EdgeTTS();
  
  // Use the library
  const voices = await tts.getVoices();
  console.log(voices);
</script>

Custom SSML Support in Browser

The browser version supports custom SSML (Speech Synthesis Markup Language) for advanced speech control:

<script src="https://unpkg.com/@andresaya/edge-tts@latest/dist/browser/edge-tts.umd.min.js"></script>

<script>
  const tts = new EdgeTTS();
  
  // Custom SSML with emphasis, breaks, and expression
  const ssml = `
    <speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" 
           xmlns:mstts="https://www.w3.org/2001/mstts" xml:lang="en-US">
      <voice name="en-US-AriaNeural">
        <mstts:express-as style="cheerful">
          <prosody rate="+10%" pitch="+5Hz">
            Hello! This is <emphasis>custom SSML</emphasis>.
            <break time="500ms"/>
            You have full control over speech synthesis!
          </prosody>
        </mstts:express-as>
      </voice>
    </speak>
  `;
  
  // Synthesize with SSML
  async function speakSSML() {
    await tts.synthesize(ssml, '', { inputType: 'ssml' });
    const audioData = tts.getAudioData();
    
    const audioBlob = new Blob([audioData], { type: 'audio/mp3' });
    const audioUrl = URL.createObjectURL(audioBlob);
    const audio = new Audio(audioUrl);
    audio.play();
  }
</script>

Streaming Support in Browser

<script type="module">
  import { EdgeTTS } from 'https://unpkg.com/@andresaya/edge-tts@latest/dist/browser/edge-tts.esm.min.js';
  
  const tts = new EdgeTTS();
  const chunks = [];
  
  // Stream audio chunks in real-time
  for await (const chunk of tts.synthesizeStream("Long text to stream...", 'en-US-AriaNeural')) {
    chunks.push(chunk);
    console.log(`Received chunk: ${chunk.length} bytes`);
  }
  
  // Combine and play all chunks
  const totalLength = chunks.reduce((acc, chunk) => acc + chunk.length, 0);
  const audioData = new Uint8Array(totalLength);
  let offset = 0;
  for (const chunk of chunks) {
    audioData.set(chunk, offset);
    offset += chunk.length;
  }
  
  const blob = new Blob([audioData], { type: 'audio/mp3' });
  const audio = new Audio(URL.createObjectURL(blob));
  audio.play();
</script>

Complete Browser Example

For a full working example with voice selection and synthesis, see examples/browser-standalone.html.

For advanced SSML examples, see examples/browser-ssml-demo.html.
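
If you prefer something inline, the following sketch combines the pieces shown above (voice listing, synthesis, and playback). The #voices and #speak elements are hypothetical placeholders for your own markup; see the example files above for complete pages.

<script type="module">
  import { EdgeTTS } from 'https://unpkg.com/@andresaya/edge-tts@latest/dist/browser/edge-tts.esm.min.js';

  const tts = new EdgeTTS();

  // Populate a <select id="voices"> element (assumed to exist in your page).
  const voices = await tts.getVoices();
  const select = document.querySelector('#voices');
  for (const voice of voices) {
    const option = document.createElement('option');
    option.value = voice.ShortName;
    option.textContent = voice.FriendlyName;
    select.append(option);
  }

  // Synthesize and play the selected voice when a <button id="speak"> is clicked.
  document.querySelector('#speak').addEventListener('click', async () => {
    await tts.synthesize("Hello from the browser!", select.value);
    const audioData = tts.getAudioData();
    const blob = new Blob([audioData], { type: 'audio/mp3' });
    new Audio(URL.createObjectURL(blob)).play();
  });
</script>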

Voice Options

Synthesis Parameters

| Parameter | Type | Range | Description |
|-----------|------|-------|-------------|
| pitch | string \| number | -100Hz to +100Hz | Voice pitch adjustment |
| rate | string \| number | -100% to +200% | Speech rate adjustment |
| volume | string \| number | -100% to +100% | Volume adjustment |
| inputType | string | 'auto', 'ssml', or 'text' | Whether the input is treated as SSML or plain text (default: 'auto') |

Parameter Examples


// Using numbers (recommended)
{ pitch: 20, rate: -10, volume: 90 }

// Using strings
{ pitch: '+20Hz', rate: '-10%', volume: '90%' }

// Mixed usage
{ pitch: 15, rate: '25%', volume: 85 }

// send SSML 
{ pitch: 15, rate: '25%', volume: 85, inputType: 'ssml' }

Error Handling

import { EdgeTTS } from '@andresaya/edge-tts';

async function handleErrors() {
    const tts = new EdgeTTS();
    
    try {
        await tts.synthesize("Test text", 'invalid-voice-name');
    } catch (error) {
        console.error('Synthesis failed:', error.message);
    }
    
    try {
        // This will throw an error - no audio data
        const duration = tts.getDuration();
    } catch (error) {
        console.error('No audio data available:', error.message);
    }
    
    try {
        // Invalid volume range
        await tts.synthesize("Test", 'en-US-AriaNeural', { volume: -150 });
    } catch (error) {
        console.error('Invalid parameter:', error.message);
    }
}

PHP Version

If you want to use Edge TTS with PHP, you can check out the PHP version of this package: Edge TTS PHP

License

This project is licensed under the GNU General Public License v3 (GPLv3).

Acknowledgments

We would like to extend our gratitude to the developers and contributors of the following projects for their inspiration and groundwork:

  • https://github.com/rany2/edge-tts/tree/master/examples
  • https://github.com/rany2/edge-tts/blob/master/src/edge_tts/util.py
  • https://github.com/hasscc/hass-edge-tts/blob/main/custom_components/edge_tts/tts.py