# @twn39/edgetts-js

v1.1.0
TypeScript/JavaScript port of the Python edge-tts library, designed to work in browser environments using native WebSocket and Fetch APIs.
This library allows you to use Microsoft Edge's online text-to-speech service without needing Windows or the Edge browser.
## Features
- 🌐 Browser-compatible - Uses native WebSocket and Fetch APIs
- 🎯 TypeScript support - Full type definitions included
- 🎤 Multiple voices - Access to all Microsoft Edge TTS voices
- 📝 Subtitle support - Generate SRT subtitles with WordBoundary/SentenceBoundary events
- 🔄 Streaming - Stream audio and metadata in real-time
- 🎛️ Configurable - Adjust rate, volume, pitch, and more
## Installation

```shell
npm install @twn39/edgetts-js
```

## Quick Start
```typescript
import { Communicate } from '@twn39/edgetts-js';

const communicate = new Communicate('Hello, world!', {
  voice: 'en-US-EmmaMultilingualNeural',
  rate: '+0%',
  volume: '+0%',
  pitch: '+0Hz',
  boundary: 'SentenceBoundary'
});

for await (const chunk of communicate.stream()) {
  if (chunk.type === 'audio') {
    // Handle audio data (Uint8Array)
    console.log('Received audio chunk:', chunk.data.length, 'bytes');
  } else if (chunk.type === 'WordBoundary' || chunk.type === 'SentenceBoundary') {
    // Handle metadata
    console.log('Word:', chunk.text, 'at', chunk.offset);
  }
}
```

## Browser Usage
```html
<!DOCTYPE html>
<html>
<head>
  <script type="module">
    import { Communicate } from './dist/index.js';

    const communicate = new Communicate('Hello, world!');
    const audioChunks = [];
    for await (const chunk of communicate.stream()) {
      if (chunk.type === 'audio') {
        audioChunks.push(chunk.data);
      }
    }

    const audioBlob = new Blob(audioChunks, { type: 'audio/mpeg' });
    const audioUrl = URL.createObjectURL(audioBlob);
    const audio = new Audio(audioUrl);
    audio.play();
  </script>
</head>
</html>
```

## API Reference
### Communicate
Main class for streaming audio and metadata from the Edge TTS service.
#### Constructor

```typescript
new Communicate(text: string, options?: CommunicateOptions)
```

Parameters:

- `text` (string): The text to convert to speech
- `options` (CommunicateOptions, optional): Configuration options
`CommunicateOptions`:

- `voice` (string): Voice name (default: `'en-US-EmmaMultilingualNeural'`)
- `rate` (string): Speech rate, e.g., `'+0%'`, `'+10%'`, `'-20%'` (default: `'+0%'`)
- `volume` (string): Volume, e.g., `'+0%'`, `'+50%'`, `'-10%'` (default: `'+0%'`)
- `pitch` (string): Pitch, e.g., `'+0Hz'`, `'+10Hz'`, `'-5Hz'` (default: `'+0Hz'`)
- `boundary` (`'WordBoundary' | 'SentenceBoundary'`): Metadata boundary type (default: `'SentenceBoundary'`)
- `proxy` (string): Proxy URL (not supported in browsers)
- `connectTimeout` (number): Connection timeout in seconds (default: 10)
- `receiveTimeout` (number): Receive timeout in seconds (default: 60)
#### Methods

##### stream()

```typescript
async *stream(): AsyncGenerator<TTSChunk, void, unknown>
```

Streams audio and metadata from the service.

Yields: `TTSChunk` objects

`TTSChunk` types:

- `TTSChunkAudio`: `{ type: 'audio', data: Uint8Array }`
- `TTSChunkMetadata`: `{ type: 'WordBoundary' | 'SentenceBoundary', offset: number, duration: number, text: string }`
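Since `TTSChunk` is a discriminated union on `type`, a small type guard keeps chunk handling type-safe. A minimal sketch, with local copies of the documented shapes rather than imports from the library:

```typescript
// Local copies of the documented TTSChunk shapes, for illustration only.
type TTSChunkAudio = { type: 'audio'; data: Uint8Array };
type TTSChunkMetadata = {
  type: 'WordBoundary' | 'SentenceBoundary';
  offset: number;
  duration: number;
  text: string;
};
type TTSChunk = TTSChunkAudio | TTSChunkMetadata;

// Narrowing on the `type` discriminant lets TypeScript infer the exact shape.
function isAudio(chunk: TTSChunk): chunk is TTSChunkAudio {
  return chunk.type === 'audio';
}

const chunks: TTSChunk[] = [
  { type: 'audio', data: new Uint8Array([1, 2, 3]) },
  { type: 'SentenceBoundary', offset: 0, duration: 5_000_000, text: 'Hello.' },
];

// Total size of the audio payload across all chunks.
const audioBytes = chunks.filter(isAudio).reduce((sum, c) => sum + c.data.length, 0);
console.log(audioBytes); // 3
```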
##### save()

```typescript
async save(audioData: Uint8Array[], metadataData?: TTSChunk[]): Promise<void>
```

Save audio and metadata to the specified arrays.
### SubMaker
Class for generating SRT subtitles from WordBoundary and SentenceBoundary events.
#### Constructor

```typescript
new SubMaker()
```

#### Methods

##### feed()

```typescript
feed(msg: TTSChunk): void
```

Feed a WordBoundary or SentenceBoundary message to the SubMaker.

##### getSrt()

```typescript
getSrt(): string
```

Get the SRT-formatted subtitles.
Example:

```typescript
import { Communicate, SubMaker } from '@twn39/edgetts-js';

const communicate = new Communicate('Hello world!', { boundary: 'SentenceBoundary' });
const submaker = new SubMaker();

for await (const chunk of communicate.stream()) {
  if (chunk.type === 'SentenceBoundary') {
    submaker.feed(chunk);
  }
}

console.log(submaker.getSrt());
```

### listVoices()

```typescript
async function listVoices(proxy?: string): Promise<Voice[]>
```

List all available voices and their attributes.
Returns: Array of `Voice` objects

`Voice` object:

- `Name`: Full voice name
- `ShortName`: Short voice name (e.g., `'en-US-EmmaMultilingualNeural'`)
- `Gender`: `'Female'` or `'Male'`
- `Locale`: Locale code (e.g., `'en-US'`)
- `SuggestedCodec`: Suggested codec
- `FriendlyName`: Friendly name
- `Status`: `'Deprecated'`, `'GA'`, or `'Preview'`
- `VoiceTag`: Additional voice tags
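A `listVoices()` result is a flat array, so client code often regroups it. A sketch of grouping by `Locale` (the `Voice` interface below is a partial local copy of the documented fields, with `VoiceTag` omitted, and the sample record is illustrative):

```typescript
// Partial Voice shape, written out from the documented fields.
interface Voice {
  Name: string;
  ShortName: string;
  Gender: 'Female' | 'Male';
  Locale: string;
  SuggestedCodec: string;
  FriendlyName: string;
  Status: 'Deprecated' | 'GA' | 'Preview';
}

// Group voices by locale, e.g. to build a locale picker from a listVoices() result.
function groupByLocale(voices: Voice[]): Map<string, Voice[]> {
  const groups = new Map<string, Voice[]>();
  for (const voice of voices) {
    const bucket = groups.get(voice.Locale) ?? [];
    bucket.push(voice);
    groups.set(voice.Locale, bucket);
  }
  return groups;
}

// Illustrative sample record; real values come from listVoices().
const sample: Voice[] = [
  {
    Name: 'Example Voice (en-US, EmmaMultilingualNeural)',
    ShortName: 'en-US-EmmaMultilingualNeural',
    Gender: 'Female',
    Locale: 'en-US',
    SuggestedCodec: 'audio-24khz-48kbitrate-mono-mp3',
    FriendlyName: 'Emma - English (United States)',
    Status: 'GA',
  },
];
console.log(groupByLocale(sample).get('en-US')?.length); // 1
```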
### VoicesManager
Class for finding voices based on their attributes.
#### Static Methods

##### create()

```typescript
static async create(customVoices?: Voice[]): Promise<VoicesManager>
```

Creates a VoicesManager object and populates it with all available voices.
#### Instance Methods

##### find()

```typescript
find(criteria: VoicesManagerFind): VoicesManagerVoice[]
```

Find all matching voices based on the provided attributes.

`VoicesManagerFind`:

- `Gender?`: `'Female' | 'Male'`
- `Locale?`: string
- `Language?`: string
#### Other Methods

- `getAllVoices()`: Get all voices
- `getLocales()`: Get all unique locales
- `getLanguages()`: Get all unique languages
- `findByLocale(locale)`: Find voices by locale
- `findByLanguage(language)`: Find voices by language
- `findByGender(gender)`: Find voices by gender
Example:

```typescript
import { VoicesManager } from '@twn39/edgetts-js';

const manager = await VoicesManager.create();

// Find all English female voices
const englishFemaleVoices = manager.find({
  Language: 'en',
  Gender: 'Female'
});

// Find voices by locale
const usVoices = manager.findByLocale('en-US');

console.log('Available locales:', manager.getLocales());
```

## Demo
Open `demo.html` in a browser to try an interactive demo:

```shell
# Build the library, then start a local server
pnpm build
python3 -m http.server 8080
# Open http://localhost:8080/demo.html
```

The demo showcases:
- 🎙️ Text-to-speech synthesis with adjustable rate/pitch
- 🔍 Voice search and filtering (400+ voices)
- 📝 Real-time SRT subtitle generation
- 🔊 Audio playback
## Building

```shell
# Install dependencies
pnpm install

# Build the library
pnpm build

# Type check
pnpm type-check

# Watch mode for development
pnpm dev
```

## Testing
This library includes comprehensive unit and integration tests using Vitest:
```shell
# Run all tests
pnpm test

# Run tests in watch mode
pnpm test:watch
```

Test coverage:
- ✅ Utils (XML escaping, text splitting, SSML generation)
- ✅ DRM (token generation, MUID, clock skew)
- ✅ Exceptions (error hierarchy)
- ✅ SRT Composer (timestamp formatting, subtitle sorting)
- ✅ SubMaker (subtitle generation)
- ✅ VoicesManager (voice filtering - integration tests with real API)
- ✅ Communicate (parameter validation)
## Browser Compatibility
This library uses modern browser APIs:
- `WebSocket` - For streaming audio
- `fetch` - For HTTP requests
- `crypto.subtle` - For DRM token generation
- `AsyncGenerator` - For streaming data
Minimum browser versions:
- Chrome 63+
- Firefox 57+
- Safari 11+
- Edge 79+
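In environments near these minimums, a quick runtime check can surface missing APIs before any synthesis is attempted. A hypothetical sketch (`missingApis` is not part of this library):

```typescript
// Report which of the required runtime APIs are absent in this environment.
// Async generators are a syntax feature: engines too old to parse them
// fail at load time, so they are not probed here.
function missingApis(): string[] {
  const g = globalThis as Record<string, unknown>;
  const missing: string[] = [];
  if (typeof g.WebSocket === 'undefined') missing.push('WebSocket');
  if (typeof g.fetch === 'undefined') missing.push('fetch');
  const cryptoObj = g.crypto as { subtle?: unknown } | undefined;
  if (cryptoObj?.subtle === undefined) missing.push('crypto.subtle');
  return missing;
}

const missing = missingApis();
if (missing.length > 0) {
  console.warn(`Missing required APIs: ${missing.join(', ')}`);
}
```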
## Limitations
- Custom Headers: Browser WebSocket API doesn't support custom request headers. Authentication is handled via URL parameters.
- Proxy: Proxy configuration is not supported in browser environments.
- CORS: The service must allow CORS requests from your domain.
## License
MIT License - See LICENSE file for details.
## Acknowledgments
This is a TypeScript/JavaScript port of the Python edge-tts library by rany.
## Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
