@ziplayer/tts
v1.0.0
Convert text to speech via Google Translate TTS (URLs) and Microsoft Edge TTS (audio buffer). Written in TypeScript.
A TypeScript package for text-to-speech with two providers:
| Provider | Returns | Notes |
| ------------------------ | ----------------------------- | ------------------------------------- |
| Google Translate TTS | string[] (audio URLs) | No auth required; max 200 chars/chunk |
| Microsoft Edge TTS | Promise<Buffer> (raw audio) | 400+ neural voices; WebSocket-based |
Installation
npm install @ziplayer/tts
Requires Node.js ≥ 18 (uses native fetch + ws).
Google TTS — getTTSUrls
Returns an array of Google Translate audio URLs. Long texts are automatically split into ≤ 200-character chunks.
import { getTTSUrls } from "@ziplayer/tts";
const urls = getTTSUrls("Hello! This is a test.", { lang: "en" });
console.log(urls);
// ["https://translate.google.com/translate_tts?..."]
Options — GoogleTTSOptions
| Property | Type | Default | Description |
| -------- | --------- | ------- | ------------------------------------------------ |
| lang | string | "en" | BCP-47 language code ("vi", "ja", "ko", …) |
| slow | boolean | false | Speak at a slower pace |
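The chunking behaviour described above can be pictured with a small sketch. This is not the package's implementation: `splitIntoChunks` is a hypothetical helper, and breaking at the last space before the limit is an assumption about one reasonable strategy.

```typescript
// Hypothetical helper, not part of @ziplayer/tts.
// Splits text into chunks of at most `maxLen` characters,
// preferring to break at a space so words stay intact.
function splitIntoChunks(text: string, maxLen = 200): string[] {
  const chunks: string[] = [];
  let rest = text.trim();
  while (rest.length > maxLen) {
    // Find the last space at or before the limit.
    let cut = rest.lastIndexOf(" ", maxLen);
    // No usable space (one very long word): fall back to a hard cut.
    if (cut <= 0) cut = maxLen;
    chunks.push(rest.slice(0, cut).trim());
    rest = rest.slice(cut).trim();
  }
  if (rest.length > 0) chunks.push(rest);
  return chunks;
}

const sampleChunks = splitIntoChunks("word ".repeat(100));
console.log(sampleChunks.map((c) => c.length));
```

Each chunk would then correspond to one entry in the array returned by getTTSUrls, which is why a long input yields several URLs that need to be played or downloaded in order.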
Edge TTS — getEdgeTTS
Synthesises text via the Microsoft Edge Read Aloud API and returns a Promise that resolves to a raw audio Buffer. Supports 400+ neural voices across many languages.
import { getEdgeTTS } from "@ziplayer/tts";
import { writeFileSync } from "fs";
const audio = await getEdgeTTS("こんにちは!", {
voice: "ja-JP-NanamiNeural",
rate: "+10%",
pitch: "+0Hz",
});
writeFileSync("output.mp3", audio);
Options — EdgeTTSOptions
| Property | Type | Default | Description |
| -------------- | ------------------ | ----------------------------------- | ---------------------------------------- |
| voice | string | "en-US-AriaNeural" | Voice short name (see voices list below) |
| rate | string | "+0%" | Speaking rate, e.g. "+20%", "-10%" |
| pitch | string | "+0Hz" | Pitch, e.g. "+5Hz", "-10Hz" |
| volume | string | "+0%" | Volume, e.g. "+10%", "-20%" |
| outputFormat | EdgeOutputFormat | "audio-24khz-48kbitrate-mono-mp3" | See formats below |
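Since rate, pitch, and volume are signed strings, it can be convenient to build them from numbers, and to derive a file extension from the chosen output format. The sketch below assumes two hypothetical helpers, `signed` and `extensionFor`; neither is exported by the package, and the extension mapping is a guess based on the format names listed in this README.

```typescript
// Hypothetical helpers, not part of @ziplayer/tts.

// Format a number as a signed string with a unit, e.g. signed(20, "%") gives "+20%".
function signed(value: number, unit: "%" | "Hz"): string {
  const rounded = Math.round(value);
  const sign = rounded < 0 ? "-" : "+";
  return `${sign}${Math.abs(rounded)}${unit}`;
}

// Guess a file extension from an output format name (assumed mapping).
function extensionFor(outputFormat: string): string {
  if (outputFormat.endsWith("-mp3")) return ".mp3";
  if (outputFormat.startsWith("webm-")) return ".webm";
  if (outputFormat.startsWith("ogg-")) return ".ogg";
  if (outputFormat.endsWith("-pcm")) return ".pcm";
  return ".bin"; // unknown format: fall back to a generic extension
}

// An options object shaped like the EdgeTTSOptions table above.
const edgeOptions = {
  voice: "en-US-AriaNeural",
  rate: signed(20, "%"),
  pitch: signed(-10, "Hz"),
  volume: signed(0, "%"),
  outputFormat: "audio-24khz-48kbitrate-mono-mp3",
};
const fileName = `output${extensionFor(edgeOptions.outputFormat)}`;
console.log(edgeOptions.rate, edgeOptions.pitch, edgeOptions.volume, fileName);
```

An object like `edgeOptions` could be passed as the second argument to getEdgeTTS, with `fileName` used when writing the resulting buffer to disk.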
Output Formats
audio-16khz-32kbitrate-mono-mp3 audio-16khz-64kbitrate-mono-mp3
audio-16khz-128kbitrate-mono-mp3 audio-24khz-48kbitrate-mono-mp3 ← default
audio-24khz-96kbitrate-mono-mp3 audio-24khz-160kbitrate-mono-mp3
audio-48khz-96kbitrate-mono-mp3 audio-48khz-192kbitrate-mono-mp3
webm-24khz-16bit-mono-opus ogg-24khz-16bit-mono-opus
raw-24khz-16bit-mono-pcm ...
Edge TTS — getEdgeTTSVoices
Fetches the full list of available voices from Microsoft.
import { getEdgeTTSVoices } from "@ziplayer/tts";
const voices = await getEdgeTTSVoices();
// Filter by locale
const japanese = voices.filter((v) => v.Locale.startsWith("ja-"));
// Filter by gender
const female = voices.filter((v) => v.Gender === "Female");
console.log(voices[0]);
// {
// Name: "Microsoft Server Speech Text to Speech Voice (en-US, AriaNeural)",
// ShortName: "en-US-AriaNeural",
// Gender: "Female",
// Locale: "en-US",
// FriendlyName: "Microsoft Aria Online (Natural) - English (United States)",
// ...
// }
Popular Voices
| Language | Voice | Gender |
| ------------------ | ---------------------- | ------ |
| English (US) | en-US-AriaNeural | Female |
| English (US) | en-US-GuyNeural | Male |
| English (GB) | en-GB-SoniaNeural | Female |
| Vietnamese | vi-VN-HoaiMyNeural | Female |
| Japanese | ja-JP-NanamiNeural | Female |
| Korean | ko-KR-SunHiNeural | Female |
| Chinese (Mandarin) | zh-CN-XiaoxiaoNeural | Female |
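One way to choose a voice programmatically is to search a voice list by locale and gender and fall back to the default when nothing matches. The sketch below is illustrative only: `VoiceLike` and `pickVoice` are hypothetical names, the sample data is copied from the table above, and the real EdgeTTSVoice type may carry more fields.

```typescript
// Local shape mirroring the fields shown in the sample voice output;
// an assumption, not the package's exported EdgeTTSVoice type.
interface VoiceLike {
  ShortName: string;
  Gender: string;
  Locale: string;
}

// Sample data taken from the Popular Voices table.
const sampleVoices: VoiceLike[] = [
  { ShortName: "en-US-AriaNeural", Gender: "Female", Locale: "en-US" },
  { ShortName: "en-US-GuyNeural", Gender: "Male", Locale: "en-US" },
  { ShortName: "en-GB-SoniaNeural", Gender: "Female", Locale: "en-GB" },
  { ShortName: "ja-JP-NanamiNeural", Gender: "Female", Locale: "ja-JP" },
];

// Hypothetical helper: return the short name of the first voice whose locale
// starts with `localePrefix` (and matches `gender`, if given), else a fallback.
function pickVoice(
  voices: VoiceLike[],
  localePrefix: string,
  gender?: string,
  fallback = "en-US-AriaNeural",
): string {
  const prefix = localePrefix.toLowerCase();
  const match = voices.find(
    (v) =>
      v.Locale.toLowerCase().startsWith(prefix) &&
      (gender === undefined || v.Gender === gender),
  );
  return match ? match.ShortName : fallback;
}

console.log(pickVoice(sampleVoices, "en-US", "Male"));
console.log(pickVoice(sampleVoices, "ja"));
```

In real use the first argument would be the array resolved by getEdgeTTSVoices, and the returned short name would go into the voice option of getEdgeTTS.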
TypeScript Types
All types are exported:
import type { GoogleTTSOptions, EdgeTTSOptions, EdgeTTSVoice, EdgeOutputFormat } from "@ziplayer/tts";
License
MIT © Ziji
