@tamnt-work/tts-stream
v1.0.0
Framework-agnostic text-to-speech library with streaming support and multilingual capabilities
🎙️ TTS Stream
The ultimate AI-powered real-time text-to-speech library for streaming LLM responses with near-zero latency
Perfect for OpenAI, Claude, Gemini, and any streaming AI assistant
🎯 Client-Side Library: TTS Stream runs in the browser only. Speech synthesis requires Web Audio API. Your server streams text responses, and the browser speaks them in real-time.
Quick Start • Examples • API Reference • Demo
🚀 Why Choose TTS Stream for AI Applications?
⚡ Zero-Latency AI Speech
Transform your AI assistants with instant voice feedback. No more waiting for complete responses - users hear speech as your AI generates text, creating natural conversational experiences.
🎯 Built for Modern AI Workflows
- ✅ OpenAI GPT-4 streaming responses
- ✅ Claude real-time conversations
- ✅ Gemini voice interactions
- ✅ Custom LLM streaming outputs
- ✅ Chatbot voice integration
✨ Core Features
- 🤖 AI-First Design: Specifically optimized for streaming LLM responses
- 📡 Real-time Streaming: Speak text as it arrives - perfect for AI chat applications
- 🌍 Multilingual AI: Auto voice selection for 30+ languages with AI context switching
- ⚡ Smart Queueing: Prevents overlapping speech with intelligent buffer management
- 🧹 Auto Cleanup: Automatic cleanup on page unload/reload
- 🛡️ Error Handling: Robust error handling with AI response fallback strategies
- ⚛️ Framework Ready: Ready-to-use hooks/composables for React, Vue, Svelte
- 🎛️ AI-Tuned: Optimal speech settings for conversational AI (rate, pitch, timing)
- 📦 Zero Dependencies: Lightweight and fast - no external API calls needed
- 🔄 Event-driven: Subscribe to speech events for better UX integration
📦 Installation
Choose your preferred package manager:
npm install @tamnt-work/tts-stream
yarn add @tamnt-work/tts-stream
pnpm add @tamnt-work/tts-stream
bun add @tamnt-work/tts-stream
🚀 Quick Start
🤖 Client-Side AI Streaming (Real Architecture)
// 🎯 CLIENT-SIDE ONLY - Browser speech synthesis
import { TextToSpeechStream } from '@tamnt-work/tts-stream';
// Initialize TTS in browser
const tts = new TextToSpeechStream({
defaultLanguage: 'en-US',
rate: 1.1, // Optimal for conversation
pitch: 1.0,
volume: 1.0,
bufferTimeout: 800 // Reduced for faster AI response
});
// ✨ Fetch streaming AI response from your API
async function streamAIResponse(userMessage: string) {
try {
// Call your backend API that proxies to OpenAI/Claude
const response = await fetch('/api/ai/chat', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
},
body: JSON.stringify({
message: userMessage,
stream: true
})
});
if (!response.body) throw new Error('No response body');
const reader = response.body.getReader();
const decoder = new TextDecoder();
    // 🔥 Read and speak each chunk immediately!
    let finished = false;
    while (!finished) {
      const { done, value } = await reader.read();
      if (done) break;
      // { stream: true } keeps multi-byte characters intact across chunks
      const chunk = decoder.decode(value, { stream: true });
      // Parse SSE or JSON chunks from your API
      const lines = chunk.split('\n');
      for (const line of lines) {
        if (!line.startsWith('data: ')) continue;
        const data = line.slice(6);
        if (data === '[DONE]') {
          finished = true; // a plain `break` here would only exit the inner loop
          break;
        }
        try {
          const parsed = JSON.parse(data);
          if (parsed.content) {
            // 🎤 Speak immediately as text arrives
            tts.speakStream(parsed.content, "en-US");
          }
        } catch (e) {
          // Handle text chunks directly
          if (data.trim()) {
            tts.speakStream(data, "en-US");
          }
        }
      }
    }
} catch (error) {
console.error('Streaming error:', error);
tts.speak("Sorry, there was an error with the AI response.", "en-US");
}
}
// Usage - runs in browser only
streamAIResponse("Tell me about space exploration");
// User immediately hears: "Space exploration is fascinating..."
🌐 Universal Client Pattern
// Works with any streaming AI API endpoint
async function streamFromAnyAI(endpoint: string, prompt: string) {
const tts = new TextToSpeechStream({
defaultLanguage: 'en-US',
rate: 1.1,
bufferTimeout: 500
});
const response = await fetch(endpoint, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
prompt,
stream: true,
// Add your API specific parameters
})
});
const reader = response.body?.getReader();
if (!reader) return;
  const decoder = new TextDecoder();
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    // Reuse one decoder with { stream: true } so multi-byte
    // characters split across chunks are decoded correctly
    const text = decoder.decode(value, { stream: true });
// 🚀 Speak any text chunk immediately
if (text.trim()) {
tts.speakStream(text, "en-US");
}
}
}
// Examples of different API endpoints
streamFromAnyAI('/api/openai/stream', 'What is machine learning?');
streamFromAnyAI('/api/claude/stream', 'Explain quantum computing');
streamFromAnyAI('/api/gemini/stream', 'Tell me about space');
📱 Basic Usage
import { TextToSpeechStream } from '@tamnt-work/tts-stream';
const tts = new TextToSpeechStream({
defaultLanguage: 'en-US',
rate: 1.2,
pitch: 1.0,
volume: 1.0
});
// Basic speech
await tts.speak("Hello world!", "en-US");
// Streaming speech (perfect for any realtime text)
tts.speakStream("Hello ", "en-US");
tts.speakStream("from ", "en-US");
tts.speakStream("streaming!", "en-US");
// Event listeners for AI integration
tts.on('start', () => console.log('🎤 AI is speaking'));
tts.on('end', () => console.log('✅ AI finished speaking'));
tts.on('error', (error) => console.error('❌ Speech error:', error));
React Hook
import React from 'react';
import { useTextToSpeech } from '@tamnt-work/tts-stream/react';
function MyComponent() {
const { isSpeaking, speak, speakStream, stop, voices } = useTextToSpeech({
defaultLanguage: 'en-US',
rate: 1.1
});
const handleSpeak = async () => {
await speak("Hello from React! 🚀", "en-US");
};
const handleStream = () => {
speakStream("Streaming ", "en-US");
speakStream("text ", "en-US");
speakStream("works great! ⚡", "en-US");
};
return (
<div className="flex gap-4 p-6">
<button
onClick={handleSpeak}
className="px-4 py-2 bg-blue-500 text-white rounded"
>
Speak
</button>
<button
onClick={handleStream}
className="px-4 py-2 bg-green-500 text-white rounded"
>
Stream
</button>
<button
onClick={stop}
disabled={!isSpeaking}
className="px-4 py-2 bg-red-500 text-white rounded disabled:opacity-50"
>
Stop
</button>
<div className="flex items-center gap-2">
<span className={`w-2 h-2 rounded-full ${isSpeaking ? 'bg-green-400 animate-pulse' : 'bg-gray-400'}`} />
<p>{isSpeaking ? 'Speaking...' : 'Ready'}</p>
<p className="text-sm text-gray-500">({voices.length} voices available)</p>
</div>
</div>
);
}
Vue Composable
<template>
<div class="flex gap-4 p-6">
<button @click="handleSpeak" class="px-4 py-2 bg-blue-500 text-white rounded">
Speak
</button>
<button @click="handleStream" class="px-4 py-2 bg-green-500 text-white rounded">
Stream
</button>
<button @click="stop" :disabled="!isSpeaking" class="px-4 py-2 bg-red-500 text-white rounded disabled:opacity-50">
Stop
</button>
<div class="flex items-center gap-2">
<span :class="`w-2 h-2 rounded-full ${isSpeaking ? 'bg-green-400 animate-pulse' : 'bg-gray-400'}`" />
<p>{{ isSpeaking ? 'Speaking...' : 'Ready' }}</p>
<p class="text-sm text-gray-500">({{ voices.length }} voices available)</p>
</div>
</div>
</template>
<script setup>
import { useTextToSpeech } from '@tamnt-work/tts-stream/vue';
const { isSpeaking, speak, speakStream, stop, voices } = useTextToSpeech({
defaultLanguage: 'en-US',
rate: 1.1
});
const handleSpeak = async () => {
await speak("Hello from Vue! 🌟", "en-US");
};
const handleStream = () => {
speakStream("Streaming ", "en-US");
speakStream("text ", "en-US");
speakStream("in Vue! 🔥", "en-US");
};
</script>
Svelte Store
<script>
import { createTextToSpeechStore } from '@tamnt-work/tts-stream/svelte';
import { onDestroy } from 'svelte';
const ttsStore = createTextToSpeechStore({
defaultLanguage: 'en-US',
rate: 1.1
});
const handleSpeak = async () => {
await ttsStore.speak("Hello from Svelte! ⚡", "en-US");
};
const handleStream = () => {
ttsStore.speakStream("Streaming ", "en-US");
ttsStore.speakStream("text ", "en-US");
ttsStore.speakStream("in Svelte! 🎯", "en-US");
};
// Cleanup on destroy
onDestroy(() => {
ttsStore.destroy();
});
</script>
<div class="flex gap-4 p-6">
<button on:click={handleSpeak} class="px-4 py-2 bg-blue-500 text-white rounded">
Speak
</button>
<button on:click={handleStream} class="px-4 py-2 bg-green-500 text-white rounded">
Stream
</button>
<button
on:click={() => ttsStore.stop()}
disabled={!$ttsStore.isSpeaking}
class="px-4 py-2 bg-red-500 text-white rounded disabled:opacity-50"
>
Stop
</button>
<div class="flex items-center gap-2">
<span class="w-2 h-2 rounded-full {$ttsStore.isSpeaking ? 'bg-green-400 animate-pulse' : 'bg-gray-400'}" />
<p>{$ttsStore.isSpeaking ? 'Speaking...' : 'Ready'}</p>
<p class="text-sm text-gray-500">({$ttsStore.voices.length} voices available)</p>
</div>
</div>
💡 AI Integration Examples
🚀 Complete Client-Side AI Chatbot
// 🎯 CLIENT-SIDE ONLY - All speech happens in browser
import { TextToSpeechStream } from '@tamnt-work/tts-stream';
class VoiceAIChatbot {
private tts: TextToSpeechStream;
private apiEndpoint: string;
constructor(apiEndpoint: string = '/api/ai/chat') {
this.apiEndpoint = apiEndpoint;
// Optimized settings for AI conversation
this.tts = new TextToSpeechStream({
defaultLanguage: 'en-US',
rate: 1.15, // Natural conversation speed
pitch: 1.0,
volume: 1.0,
bufferTimeout: 600, // Quick response time
maxBufferLength: 50 // Shorter buffers for immediate speech
});
// Set up AI speech events
this.tts.on('start', () => this.onAISpeechStart());
this.tts.on('end', () => this.onAISpeechEnd());
}
async chat(userMessage: string): Promise<void> {
console.log('🎤 User:', userMessage);
try {
// Call your backend API (which handles OpenAI/Claude/etc)
const response = await fetch(this.apiEndpoint, {
method: 'POST',
headers: {
'Content-Type': 'application/json',
},
body: JSON.stringify({
message: userMessage,
stream: true,
model: "gpt-4" // Your backend decides which AI to use
})
});
if (!response.body) throw new Error('No response stream');
const reader = response.body.getReader();
const decoder = new TextDecoder();
// 🔥 Real-time speech generation in browser
while (true) {
const { done, value } = await reader.read();
if (done) break;
        const chunk = decoder.decode(value, { stream: true }); // keep multi-byte chars intact
// Handle different streaming formats
if (chunk.includes('data: ')) {
const lines = chunk.split('\n');
for (const line of lines) {
if (line.startsWith('data: ')) {
const data = line.slice(6).trim();
if (data && data !== '[DONE]') {
try {
const parsed = JSON.parse(data);
if (parsed.content || parsed.text || parsed.delta) {
const text = parsed.content || parsed.text || parsed.delta;
this.tts.speakStream(text, "en-US");
}
} catch (e) {
// Handle plain text chunks
if (data.trim()) {
this.tts.speakStream(data, "en-US");
}
}
}
}
}
} else if (chunk.trim()) {
// Direct text streaming
this.tts.speakStream(chunk, "en-US");
}
}
} catch (error) {
console.error('Chat error:', error);
this.tts.speak("Sorry, I encountered an error. Please try again.", "en-US");
}
}
private onAISpeechStart() {
console.log('🤖 AI started speaking...');
// Show speaking indicator in UI
document.querySelector('.ai-speaking')?.classList.add('active');
}
private onAISpeechEnd() {
console.log('✅ AI finished speaking');
// Hide speaking indicator, enable user input
document.querySelector('.ai-speaking')?.classList.remove('active');
document.querySelector('input')?.removeAttribute('disabled');
}
stopSpeaking() {
this.tts.stop();
}
destroy() {
this.tts.destroy();
}
}
// Usage in browser
const chatbot = new VoiceAIChatbot('/api/openai/stream');
chatbot.chat("What's the weather like today?");
// User immediately hears AI response as it streams from your API!
🌐 Multi-Model Client-Side Assistant
// 🎯 CLIENT-SIDE ONLY - Speech synthesis in browser
import { TextToSpeechStream } from '@tamnt-work/tts-stream';
class MultiAIVoiceAssistant {
private tts: TextToSpeechStream;
constructor() {
this.tts = new TextToSpeechStream({
defaultLanguage: 'en-US',
rate: 1.1,
bufferTimeout: 500 // Ultra-fast response
});
}
// OpenAI GPT-4 with voice (via your API)
async askGPT(question: string) {
const response = await fetch('/api/openai/stream', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
message: question,
model: 'gpt-4',
stream: true
})
});
await this.processStreamResponse(response);
}
// Claude with voice (via your API)
async askClaude(question: string) {
const response = await fetch('/api/claude/stream', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
message: question,
model: 'claude-3-sonnet-20240229',
stream: true
})
});
await this.processStreamResponse(response);
}
// Gemini with voice (via your API)
async askGemini(question: string) {
const response = await fetch('/api/gemini/stream', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
message: question,
model: 'gemini-pro',
stream: true
})
});
await this.processStreamResponse(response);
}
// Universal streaming processor
private async processStreamResponse(response: Response) {
if (!response.body) return;
const reader = response.body.getReader();
const decoder = new TextDecoder();
while (true) {
const { done, value } = await reader.read();
if (done) break;
      const chunk = decoder.decode(value, { stream: true }); // keep multi-byte chars intact
// Handle SSE format
if (chunk.includes('data: ')) {
const lines = chunk.split('\n');
for (const line of lines) {
if (line.startsWith('data: ')) {
const data = line.slice(6);
if (data.trim() && data !== '[DONE]') {
try {
const parsed = JSON.parse(data);
const text = parsed.content || parsed.text || parsed.delta || parsed.message;
if (text) {
this.tts.speakStream(text, "en-US");
}
} catch (e) {
// Handle plain text
if (data.trim()) {
this.tts.speakStream(data, "en-US");
}
}
}
}
}
} else {
// Direct text streaming
if (chunk.trim()) {
this.tts.speakStream(chunk, "en-US");
}
}
}
}
stop() {
this.tts.stop();
}
destroy() {
this.tts.destroy();
}
}
// Usage in browser
const assistant = new MultiAIVoiceAssistant();
assistant.askGPT("Explain machine learning");
assistant.askClaude("What is quantum computing?");
assistant.askGemini("Tell me about space exploration");
🎯 React AI Chat Component
import React, { useState, useCallback } from 'react';
import { useTextToSpeech } from '@tamnt-work/tts-stream/react';
function AIChatInterface() {
  const [messages, setMessages] = useState<Array<{ role: 'user' | 'ai'; content: string }>>([]);
const [isAISpeaking, setIsAISpeaking] = useState(false);
const [userInput, setUserInput] = useState('');
const { tts, isSpeaking } = useTextToSpeech({
defaultLanguage: 'en-US',
rate: 1.1,
bufferTimeout: 500
});
const streamAIResponse = useCallback(async (userMessage: string) => {
setMessages(prev => [...prev, { role: 'user', content: userMessage }]);
setIsAISpeaking(true);
try {
const response = await fetch('/api/ai/stream', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ message: userMessage })
});
const reader = response.body?.getReader();
if (!reader) return;
      let aiResponse = '';
      const decoder = new TextDecoder();
      while (true) {
        const { done, value } = await reader.read();
        if (done) break;
        // Reuse one decoder with { stream: true } for correct multi-byte decoding
        const chunk = decoder.decode(value, { stream: true });
aiResponse += chunk;
// 🎯 Speak immediately as text arrives
tts?.speakStream(chunk, "en-US");
}
setMessages(prev => [...prev, { role: 'ai', content: aiResponse }]);
} finally {
setIsAISpeaking(false);
}
}, [tts]);
return (
<div className="ai-chat-interface">
<div className="messages">
{messages.map((msg, idx) => (
<div key={idx} className={`message ${msg.role}`}>
{msg.content}
</div>
))}
</div>
<div className="input-area">
<input
value={userInput}
onChange={(e) => setUserInput(e.target.value)}
placeholder="Ask AI anything..."
disabled={isAISpeaking}
/>
<button
onClick={() => {
streamAIResponse(userInput);
setUserInput('');
}}
disabled={isAISpeaking}
>
{isAISpeaking ? 'AI Speaking...' : 'Send'}
</button>
<button onClick={() => tts?.stop()}>Stop Speech</button>
</div>
{isSpeaking && (
<div className="speaking-indicator">
🎤 AI is speaking...
</div>
)}
</div>
);
}
🏗️ Client-Server Architecture
┌─────────────────┐ HTTP Request ┌──────────────────┐ API Call
│ │ ──────────────────► │ │ ──────────────► OpenAI/Claude
│ Browser │ │ Your Server │ Gemini/etc
│ (TTS Stream) │ ◄────────────────── │ (API Proxy) │ ◄──────────────
│ │ Streaming Text │ │ Streaming Text
└─────────────────┘ └──────────────────┘
🎤 Speech synthesis happens ONLY in browser
📡 Your server just proxies/streams AI responses
🔐 API keys stay secure on your server
💡 Any Streaming AI Integration
// 🎯 CLIENT-SIDE ONLY - Universal pattern for any AI
async function integrateAnyStreamingAI(apiEndpoint: string, prompt: string) {
const tts = new TextToSpeechStream({
defaultLanguage: 'en-US',
rate: 1.0,
bufferTimeout: 400
});
try {
// Your server handles the AI API calls
const response = await fetch(apiEndpoint, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
prompt: prompt,
stream: true,
// Add any AI-specific parameters
})
});
if (!response.body) return;
const reader = response.body.getReader();
const decoder = new TextDecoder();
while (true) {
const { done, value } = await reader.read();
if (done) break;
      const chunk = decoder.decode(value, { stream: true }); // keep multi-byte chars intact
// 🚀 Speak immediately - works with any streaming format
if (chunk.trim()) {
// Handle JSON streaming
if (chunk.startsWith('{')) {
try {
const data = JSON.parse(chunk);
const text = data.response || data.content || data.text || data.message;
if (text) tts.speakStream(text, "en-US");
} catch (e) {
tts.speakStream(chunk, "en-US");
}
} else {
// Handle plain text streaming
tts.speakStream(chunk, "en-US");
}
}
}
} catch (error) {
console.error('AI streaming error:', error);
tts.speak("Sorry, there was an error connecting to the AI.", "en-US");
}
}
// Examples - all run in browser, call your server APIs
integrateAnyStreamingAI('/api/openai/stream', 'What is AI?');
integrateAnyStreamingAI('/api/claude/stream', 'Explain quantum physics');
integrateAnyStreamingAI('/api/ollama/stream', 'Tell me a story'); // Local LLM
integrateAnyStreamingAI('/api/custom-llm/stream', 'Help me code'); // Your custom AI
🔧 Server-Side Example (Node.js)
// Example server endpoint that your client calls
// This runs on your server and proxies to AI APIs
app.post('/api/openai/stream', async (req, res) => {
try {
const { message } = req.body;
// Server calls OpenAI (API key stays secure)
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
const stream = await openai.chat.completions.create({
model: "gpt-4",
messages: [{ role: "user", content: message }],
stream: true,
});
res.writeHead(200, {
'Content-Type': 'text/plain',
'Cache-Control': 'no-cache',
'Connection': 'keep-alive',
});
// Stream AI response back to browser
for await (const chunk of stream) {
if (chunk.choices[0]?.delta?.content) {
res.write(chunk.choices[0].delta.content);
}
}
res.end();
} catch (error) {
res.status(500).json({ error: 'AI request failed' });
}
});
// Browser receives text chunks and speaks them immediately!
Multi-language Support
const tts = new TextToSpeechStream();
// Mix different languages seamlessly
const multilingualGreeting = [
{ text: "Hello! ", language: "en-US" },
{ text: "Bonjour! ", language: "fr-FR" },
{ text: "¡Hola! ", language: "es-ES" },
{ text: "こんにちは!", language: "ja-JP" },
{ text: "Xin chào!", language: "vi-VN" }
];
multilingualGreeting.forEach(({ text, language }) => {
tts.speakStream(text, language);
});
Custom Voice Selection
const tts = new TextToSpeechStream();
// Wait for voices to load
tts.on('voicesloaded', (voices) => {
console.log('📢 Available voices:', voices.length);
// Find specific voices
const femaleVoices = voices.filter(v => v.name.toLowerCase().includes('female'));
const maleVoices = voices.filter(v => v.name.toLowerCase().includes('male'));
console.log('👩 Female voices:', femaleVoices.length);
console.log('👨 Male voices:', maleVoices.length);
// Group by language
const voicesByLang = voices.reduce((acc, voice) => {
const lang = voice.lang.split('-')[0];
if (!acc[lang]) acc[lang] = [];
acc[lang].push(voice);
return acc;
}, {});
console.log('🌍 Languages available:', Object.keys(voicesByLang));
});
Advanced Error Handling
const tts = new TextToSpeechStream({
defaultLanguage: 'en-US',
rate: 1.0
});
tts.on('error', (error) => {
console.error('🚨 Speech error:', error.error);
switch (error.error) {
case 'voice-unavailable':
console.log('🔄 Voice not available, trying fallback...');
tts.speak(error.utterance?.text || '', 'en-US');
break;
case 'synthesis-unavailable':
console.log('❌ Speech synthesis not supported in this browser');
// Show visual fallback (text display, etc.)
break;
case 'synthesis-failed':
console.log('🔄 Synthesis failed, retrying...');
setTimeout(() => {
tts.speak(error.utterance?.text || '', 'en-US');
}, 1000);
break;
default:
console.log('❓ Unknown speech error:', error.error);
}
});
// Graceful degradation
if (!window.speechSynthesis) {
console.log('⚠️ Speech synthesis not supported, using visual fallback');
// Implement visual text display or other fallback
}
📋 API Reference
TextToSpeechStream Class
Constructor Options
interface TextToSpeechOptions {
defaultLanguage?: string; // Default: 'vi-VN'
rate?: number; // Speech rate (0.1-10), Default: 1.1
pitch?: number; // Speech pitch (0-2), Default: 1.0
volume?: number; // Speech volume (0-1), Default: 1.0
bufferTimeout?: number; // Stream buffer timeout (ms), Default: 1000
maxBufferLength?: number; // Max buffer length, Default: 100
}
Methods
| Method | Description | Returns |
|--------|-------------|---------|
| speak(text, language?) | Speak text immediately | Promise<void> |
| speakStream(text, language?) | Add text to streaming buffer | void |
| stop() | Stop current speech and clear queue | void |
| pause() | Pause current speech | void |
| resume() | Resume paused speech | void |
| getVoices() | Get available voices | SpeechSynthesisVoice[] |
| isCurrentlySpeaking() | Check if currently speaking | boolean |
| isPaused() | Check if speech is paused | boolean |
| on(event, listener) | Add event listener | void |
| off(event, listener) | Remove event listener | void |
| destroy() | Cleanup and remove all listeners | void |
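One detail worth noting about `on`/`off`: like most emitters, removal presumably works by function identity, so keep a named reference to any handler you plan to remove. This sketch uses a plain `Set` as a stand-in for the emitter's listener list, purely to illustrate the pitfall (not the library's internals):

```typescript
// Removal by function identity: a Set stands in for the emitter's
// internal listener list here.
const listeners = new Set<() => void>();

const onEnd = () => { /* hide the speaking indicator, etc. */ };
listeners.add(onEnd);
listeners.delete(onEnd);      // removed: same reference

listeners.add(() => {});      // anonymous handler…
listeners.delete(() => {});   // …NOT removed: a new, different function
```

With the real API that means pairing `tts.on('end', onEnd)` with `tts.off('end', onEnd)`, never `tts.off('end', () => {})`.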
Events
| Event | Payload | Description |
|-------|---------|-------------|
| start | void | Speech started |
| end | void | Speech ended |
| pause | void | Speech paused |
| resume | void | Speech resumed |
| error | SpeechSynthesisErrorEvent | Speech error occurred |
| voicesloaded | SpeechSynthesisVoice[] | Voices loaded/changed |
| boundary | SpeechSynthesisEvent | Word/sentence boundary |
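The `boundary` payload carries a `charIndex` into the utterance text (a standard `SpeechSynthesisEvent` field). A small helper, sketched here and not part of the library, can turn that index into the word currently being spoken, e.g. for live captions:

```typescript
// Boundary events fire at word starts, so slicing at charIndex and
// taking the leading non-whitespace run yields the spoken word.
function wordAt(text: string, charIndex: number): string {
  const match = text.slice(charIndex).match(/^\S+/);
  return match ? match[0] : '';
}

// Hypothetical usage (highlight() is your own UI code):
// tts.on('boundary', (e) => highlight(wordAt(fullText, e.charIndex)));
```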
React Hook
interface UseTextToSpeechReturn {
isSpeaking: boolean;
isPaused: boolean;
voices: SpeechSynthesisVoice[];
speak: (text: string, language?: string) => Promise<void>;
speakStream: (text: string, language?: string) => void;
stop: () => void;
pause: () => void;
resume: () => void;
tts: TextToSpeechStream | null;
}
const useTextToSpeech = (options?: TextToSpeechOptions): UseTextToSpeechReturn
Vue Composable
interface UseTextToSpeechReturn {
isSpeaking: Ref<boolean>;
isPaused: Ref<boolean>;
voices: Ref<SpeechSynthesisVoice[]>;
speak: (text: string, language?: string) => Promise<void>;
speakStream: (text: string, language?: string) => void;
stop: () => void;
pause: () => void;
resume: () => void;
tts: Ref<TextToSpeechStream | null>;
}
const useTextToSpeech = (options?: TextToSpeechOptions): UseTextToSpeechReturn
Svelte Store
interface TextToSpeechStore {
isSpeaking: boolean;
isPaused: boolean;
voices: SpeechSynthesisVoice[];
tts: TextToSpeechStream | null;
}
interface TextToSpeechStoreReturn {
subscribe: (fn: (value: TextToSpeechStore) => void) => () => void;
speak: (text: string, language?: string) => Promise<void>;
speakStream: (text: string, language?: string) => void;
stop: () => void;
pause: () => void;
resume: () => void;
destroy: () => void;
}
const createTextToSpeechStore = (options?: TextToSpeechOptions): TextToSpeechStoreReturn
🌍 Supported Languages
The library automatically selects appropriate voices for 30+ languages:
| Language | Code | Language | Code |
|----------|------|----------|------|
| English (US) | en-US | Japanese | ja-JP |
| English (UK) | en-GB | Korean | ko-KR |
| Spanish | es-ES | Chinese (Simplified) | zh-CN |
| French | fr-FR | Chinese (Traditional) | zh-TW |
| German | de-DE | Vietnamese | vi-VN |
| Italian | it-IT | Thai | th-TH |
| Portuguese | pt-PT | Russian | ru-RU |
| Dutch | nl-NL | Arabic | ar-SA |
| Polish | pl-PL | Hindi | hi-IN |
| Swedish | sv-SE | Indonesian | id-ID |
And many more! The exact voices available depend on your operating system and browser.
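Automatic selection can be sketched like this (a plausible strategy, not necessarily the library's exact logic): prefer an exact BCP 47 match, fall back to the base language, then to the platform default voice.

```typescript
// Minimal shape shared with SpeechSynthesisVoice, so the helper is
// testable outside the browser.
interface VoiceLike { lang: string; name: string; default?: boolean }

// Exact tag ('en-US') → base language ('en') → platform default.
function pickVoice(voices: VoiceLike[], lang: string): VoiceLike | undefined {
  const base = lang.split('-')[0];
  return (
    voices.find(v => v.lang === lang) ??
    voices.find(v => v.lang.split('-')[0] === base) ??
    voices.find(v => v.default)
  );
}
```

In the browser you would feed it `speechSynthesis.getVoices()`.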
🌐 Browser Support
Uses the Web Speech API, supported in:
| Browser | Support | Notes |
|---------|---------|-------|
| ✅ Chrome | 33+ | Full support |
| ✅ Safari | 7+ | Full support |
| ✅ Edge | 14+ | Full support |
| ⚠️ Firefox | Limited | Basic support |
| ❌ IE | None | Not supported |
Note: On mobile devices, speech synthesis may require user interaction to start.
🎯 AI-Specific Benefits
⚡ Instant User Feedback
- Traditional TTS: Wait for the complete AI response (often several seconds), then speak
- TTS Stream: Speech begins within a fraction of a second, as soon as the first chunks arrive
- Result: A dramatically faster perceived response time
🧠 Enhanced AI Conversations
- Natural Flow: Speech starts immediately, feels like talking to a human
- Reduced Latency: No awkward pauses waiting for complete responses
- Better UX: Users can interrupt or respond while AI is still thinking
- Engagement: Voice keeps users engaged during long AI responses
💰 Cost & Performance Benefits
- Bandwidth Efficient: Stream text immediately, no waiting for full response
- Memory Friendly: Process text chunks instead of storing full responses
- Battery Optimized: Continuous small operations vs large batch processing
- Scalable: Works with any streaming AI service without modifications
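One caveat behind the chunk-at-a-time points above: fetch chunks are arbitrary slices of the stream, so an SSE `data:` line can arrive split across two chunks. A small line buffer (a hypothetical helper, not part of the library) makes the parsing loops in the earlier examples robust:

```typescript
// Accumulate chunks, emit only complete `data:` lines, and hold the
// unfinished tail until the next chunk arrives.
function createSSELineBuffer(): (chunk: string) => string[] {
  let pending = '';
  return (chunk) => {
    pending += chunk;
    const lines = pending.split('\n');
    pending = lines.pop() ?? ''; // unfinished tail stays buffered
    return lines
      .filter(l => l.startsWith('data: '))
      .map(l => l.slice(6));
  };
}
```

Inside a read loop: `for (const data of push(chunk)) tts.speakStream(data, "en-US");`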
🔧 Perfect For These AI Use Cases
1. 🤖 Conversational AI Assistants
OpenAI GPT, Claude, Gemini voice interfaces with zero-delay response.
2. 📞 AI Customer Support
Real-time voice responses for support bots and virtual agents.
3. 🎓 AI Tutors & Education
Interactive learning with immediate voice feedback from AI teachers.
4. 🏥 Healthcare AI
Medical assistant AI with natural voice interactions for patient care.
5. 🏠 Smart Home Voice
IoT devices with streaming AI responses for home automation.
6. 🎮 Gaming AI NPCs
Game characters with real-time voice generation from AI dialogue systems.
7. 📰 AI News Readers
Streaming news summaries with immediate voice narration.
8. 🌐 Multilingual AI
AI translators with instant voice output in multiple languages.
🎯 Best Practices
Performance Tips
// ✅ Good: Reuse TTS instance
const tts = new TextToSpeechStream();
// ❌ Avoid: Creating new instances repeatedly
function speakText(text) {
const tts = new TextToSpeechStream(); // Don't do this
tts.speak(text);
}
// ✅ Good: Clean up when done
useEffect(() => {
return () => {
tts.destroy();
};
}, []);
Streaming Best Practices
// ✅ Good: Small chunks for natural flow
tts.speakStream("Hello ");
tts.speakStream("world ");
tts.speakStream("today!");
// ❌ Avoid: Very small fragments
tts.speakStream("H");
tts.speakStream("e");
tts.speakStream("l");
// ✅ Good: Handle language switches gracefully
const sentences = [
{ text: "Hello there! ", lang: "en-US" },
{ text: "Comment allez-vous? ", lang: "fr-FR" },
{ text: "¿Cómo está usted?", lang: "es-ES" }
];
sentences.forEach(({text, lang}) => {
tts.speakStream(text, lang);
});
🤝 Contributing
We welcome contributions! Please see our Contributing Guide for details.
Development Setup
# Clone the repository
git clone https://github.com/tamnt-work/tts-stream.git
cd tts-stream
# Install dependencies (choose your preferred manager)
npm install
# or
yarn install
# or
pnpm install
# or
bun install
# Build the project
npm run build
# Run tests
npm test
📄 License
MIT License - feel free to use this project commercially.
🙏 Acknowledgments
- Built on the Web Speech API
- Inspired by the need for better AI voice interfaces
- Thanks to all contributors and users!
Made with ❤️ by tamnt-work
⭐ Star this project if you find it useful!
