@duyquangnvx/tts-client
v0.1.0-alpha.2
Multi-provider TTS wrapper (cloud + local) built on the Vercel AI SDK where possible, with first-class custom providers for backends the AI SDK does not ship — Edge TTS today, with Kokoro and sherpa-onnx planned.
Quickstart
```sh
pnpm i @duyquangnvx/tts-client msedge-tts
```

```ts
import { writeFile } from "node:fs/promises";
import { createTTSClient } from "@duyquangnvx/tts-client";
import { edge } from "@duyquangnvx/tts-client/edge";

const tts = createTTSClient({ providers: [edge()] });
const r = await tts.speak({ text: "Hello world", voice: "edge:en-US-AriaNeural" });
await writeFile("out.mp3", r.audio);
```

Supported providers (v0.1)
| Provider | Subpath | Runtime | Streaming | Voices | Install | Notes |
|---|---|---|---|---|---|---|
| Edge TTS | @duyquangnvx/tts-client/edge | Node | native (WS) | ~400 dynamic | pnpm i msedge-tts | No API key. Default for the Quickstart. |
| OpenAI | @duyquangnvx/tts-client/openai | universal | sentence-chunk fallback | 9 (tts-1 / tts-1-hd) · 13 (gpt-4o-mini-tts) | pnpm i ai @ai-sdk/openai | Requires OPENAI_API_KEY. |
More providers on the way: ElevenLabs (v0.2), Kokoro local model (v0.2), sherpa-onnx-node (v1).
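The "sentence-chunk fallback" in the table means a provider without native streaming can synthesize one sentence at a time and emit each result as a chunk. A minimal sketch of the splitting step — an illustration of the idea, not the package's actual implementation:

```typescript
// Illustrative sentence splitter, as a non-streaming provider might use to
// fake incremental delivery. NOT the library's real code, just the concept.
export function splitSentences(text: string): string[] {
  // Split on sentence-ending punctuation, keeping the punctuation attached.
  const matches = text.match(/[^.!?]+[.!?]*/g) ?? [];
  return matches.map((s) => s.trim()).filter((s) => s.length > 0);
}
```

Each returned sentence would then be sent through the provider's one-shot synthesis call and yielded as its own audio chunk.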
Subpath imports
```ts
import { createTTSClient, TTSError } from "@duyquangnvx/tts-client";
import { edge } from "@duyquangnvx/tts-client/edge";
import { openai } from "@duyquangnvx/tts-client/openai";
import { toSpeechModel, edgeSpeechModel } from "@duyquangnvx/tts-client/ai-sdk";
```

AI SDK bridge
Plug our backends into AI SDK's experimental_generateSpeech:
```ts
import { experimental_generateSpeech as generateSpeech } from "ai";
import { edgeSpeechModel } from "@duyquangnvx/tts-client/ai-sdk";

const r = await generateSpeech({
  model: edgeSpeechModel("en-US-AriaNeural"),
  text: "Hello world",
});
```

Or wrap any TTSProvider (yours or one of ours) with toSpeechModel(provider, voice).
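The shape of that bridge is a small adapter: fix the voice up front and expose a single generation entry point. A standalone sketch with hypothetical interfaces (the real TTSProvider and AI SDK speech-model types live in the package and in "ai"):

```typescript
// Hypothetical provider shape for illustration only -- NOT the package's
// actual TTSProvider interface.
interface FakeTTSProvider {
  speak(req: { text: string; voice: string }): Promise<{ audio: Uint8Array }>;
}

// Sketch of what a toSpeechModel-style bridge does: bind the voice once,
// then forward every generate(text) call to the underlying provider.
function toSpeechModelSketch(provider: FakeTTSProvider, voice: string) {
  return {
    generate: (text: string) => provider.speak({ text, voice }),
  };
}
```

This is why toSpeechModel takes the voice as a second argument: AI SDK speech models are voice-agnostic at call time, so the adapter has to capture it at construction.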
Streaming
```ts
for await (const chunk of tts.stream({
  text: "Streaming audio.",
  voice: "edge:en-US-AriaNeural",
})) {
  if (chunk.audio.byteLength > 0) audioElement.appendBuffer(chunk.audio);
  if (chunk.isLast) console.log("done");
}
```

AudioChunk.audio may be empty when a chunk carries only word-boundary metadata. See DATA-MODEL.md for the full shape.
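If you want the whole utterance as one buffer instead of playing it live, you can skip metadata-only chunks and concatenate the rest. A self-contained sketch using a minimal chunk shape with just the two fields shown above:

```typescript
// Minimal chunk shape matching the fields used in the streaming example.
interface Chunk {
  audio: Uint8Array;
  isLast: boolean;
}

// Collect a chunk stream into one buffer, skipping metadata-only (empty) chunks.
async function collectAudio(stream: AsyncIterable<Chunk>): Promise<Uint8Array> {
  const parts: Uint8Array[] = [];
  let total = 0;
  for await (const chunk of stream) {
    if (chunk.audio.byteLength > 0) {
      parts.push(chunk.audio);
      total += chunk.audio.byteLength;
    }
  }
  // Copy the parts into one contiguous buffer.
  const out = new Uint8Array(total);
  let offset = 0;
  for (const p of parts) {
    out.set(p, offset);
    offset += p.byteLength;
  }
  return out;
}
```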
Fallback
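Conceptually, the client moves to the next configured provider only when the current one fails with a retryable error; anything else surfaces immediately. A standalone sketch of that loop, with hypothetical shapes that are not the package's internals:

```typescript
// Hypothetical provider shape for illustration -- not the package's real types.
interface Provider {
  name: string;
  speak(text: string): Promise<Uint8Array>;
}

// Stand-in for a retryable failure (e.g. provider_unavailable); only these
// trigger the fallback chain in this sketch.
class RetryableError extends Error {}

async function speakWithFallback(providers: Provider[], text: string): Promise<Uint8Array> {
  let lastErr: unknown;
  for (const p of providers) {
    try {
      return await p.speak(text);
    } catch (err) {
      if (!(err instanceof RetryableError)) throw err; // non-retryable: surface immediately
      lastErr = err; // retryable: remember it and try the next provider
    }
  }
  throw lastErr; // every provider failed
}
```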
```ts
const tts = createTTSClient({
  providers: [edge(), openai({ apiKey: process.env.OPENAI_API_KEY })],
  fallback: ["openai"], // tries openai when edge throws retryable provider_unavailable
});
```

Examples
Runnable scripts live in examples/. Run any with:
```sh
pnpm tsx examples/quickstart.ts
```

Docs
- Architecture — design + two-layer adapter pattern.
- Data model — every public type.
- Pipeline — request flow, fallback, voice resolution.
- Conventions — code style + things-we-don't-do.
- Roadmap — milestones.
License
MIT.
