@loonylabs/tts-middleware

v0.8.0

Provider-agnostic Text-to-Speech middleware for Azure, OpenAI, ElevenLabs, Google Cloud, Deepgram, Fish Audio, and Inworld AI

TTS Middleware

Provider-agnostic Text-to-Speech middleware with GDPR compliance support. Currently supports Azure Speech Services, EdenAI, Google Cloud TTS, Fish Audio, and Inworld AI. Features EU data residency via Azure and Google Cloud, pluggable logging, character-based billing, and comprehensive error handling.


Features

  • Multi-Provider Architecture: Unified API for all TTS providers
    • Azure Speech Services (MVP): Neural voices with emotion/style, EU regions
    • EdenAI: Aggregator with access to Google, OpenAI, Amazon, IBM, ElevenLabs
    • Google Cloud TTS: Neural2, WaveNet, Studio voices with EU data residency
    • Fish Audio: S1 model with 13 languages & 64+ emotions (test/admin only)
    • Inworld AI: TTS 1.5 Max/Mini with 15 languages & voice cloning (test/admin only)
    • Ready for: OpenAI, ElevenLabs, Deepgram (interfaces prepared)
  • GDPR/DSGVO Compliance: Built-in EU region support for Azure and Google Cloud
  • SSML Abstraction: Auto-generates provider-specific SSML from simple JSON options
  • Character Billing: Accurate character counting for cost calculation
  • Pluggable Logger: Bring your own logger (Winston, Pino, etc.) or use the built-in console logger
  • TypeScript First: Full type safety with comprehensive interfaces
  • Retry with Backoff: Automatic retry for transient errors (429, 5xx, timeouts) with exponential backoff and jitter
  • Error Handling: Typed error classes (InvalidConfig, QuotaExceeded, SynthesisFailed, etc.)
  • Zero Lock-in: Switch providers without changing your application code

Quick Start

Installation

Install from npm:

npm install @loonylabs/tts-middleware

Or install directly from GitHub:

npm install github:loonylabs-dev/tts-middleware

Basic Usage

import { ttsService, TTSProvider } from '@loonylabs/tts-middleware';
import fs from 'fs';

const response = await ttsService.synthesize({
  text: 'Hallo Welt! Dies ist ein Test.',
  voice: { id: 'de-DE-KatjaNeural' },
  audio: { format: 'mp3', speed: 1.0 },
});

fs.writeFileSync('output.mp3', response.audio);
console.log('Characters billed:', response.billing.characters);
console.log('Audio length:', response.metadata.audioDuration, 'ms');

// Azure with emotion
const azure = await ttsService.synthesize({
  text: 'Great news!',
  provider: TTSProvider.AZURE,
  voice: { id: 'en-US-JennyNeural' },
  providerOptions: { emotion: 'cheerful', style: 'chat' },
});

// Google Cloud TTS (EU-compliant)
const google = await ttsService.synthesize({
  text: 'Hallo aus Frankfurt!',
  provider: TTSProvider.GOOGLE,
  voice: { id: 'de-DE-Neural2-C' },
  providerOptions: { region: 'europe-west3' },
});

// EdenAI (OpenAI voices via aggregator)
const edenai = await ttsService.synthesize({
  text: 'Hello World',
  provider: TTSProvider.EDENAI,
  voice: { id: 'en-US' },
  providerOptions: { provider: 'openai', settings: { openai: 'en_nova' } },
});

// EdenAI (ElevenLabs with specific voice)
const elevenlabs = await ttsService.synthesize({
  text: 'Hallo, willkommen!',
  provider: TTSProvider.EDENAI,
  voice: { id: 'de' },
  providerOptions: { provider: 'elevenlabs', voice_id: 'Aria' },
});

// Fish Audio (test/admin only)
const fish = await ttsService.synthesize({
  text: '(excited) Das ist fantastisch!',
  provider: TTSProvider.FISH_AUDIO,
  voice: { id: '90042f762dbf49baa2e7776d011eee6b' },
  providerOptions: { model: 's1' },
});

// Inworld AI (test/admin only)
const inworld = await ttsService.synthesize({
  text: 'Hello from Inworld AI!',
  provider: TTSProvider.INWORLD,
  voice: { id: 'Ashley' },
  providerOptions: { modelId: 'inworld-tts-1.5-max', temperature: 1.1 },
});

// German with OpenAI "nova" voice (female)
const response = await ttsService.synthesize({
  text: 'Hallo Welt! Das ist ein Test.',
  provider: TTSProvider.EDENAI,
  voice: { id: 'de' },
  providerOptions: {
    provider: 'openai',
    settings: { openai: 'de_nova' },
  },
});

Available OpenAI Voices:

| Voice | Character |
|-------|-----------|
| alloy | Neutral |
| echo | Male |
| fable | Expressive |
| onyx | Male, deep |
| nova | Female |
| shimmer | Female, warm |

Format: {language}_{voice} (e.g., de_nova, en_alloy, fr_shimmer)
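
The naming rule can be wrapped in a small helper. This is only a sketch: `openAIVoiceSetting` and `OPENAI_VOICES` are hypothetical names that the package does not export, and the two-letter language check is an assumption based on the examples (`de`, `en`, `fr`).

```typescript
// Hypothetical helper, not part of @loonylabs/tts-middleware.
const OPENAI_VOICES = ['alloy', 'echo', 'fable', 'onyx', 'nova', 'shimmer'] as const;
type OpenAIVoice = (typeof OPENAI_VOICES)[number];

// Builds the "{language}_{voice}" string passed as settings.openai.
function openAIVoiceSetting(language: string, voice: OpenAIVoice): string {
  const lang = language.trim().toLowerCase();
  // Assumption: language codes are two lowercase letters, as in the examples.
  if (!/^[a-z]{2}$/.test(lang)) {
    throw new Error(`Expected a two-letter language code, got "${language}"`);
  }
  return `${lang}_${voice}`;
}

// Shape matches the providerOptions used in the EdenAI examples.
const providerOptions = {
  provider: 'openai',
  settings: { openai: openAIVoiceSetting('de', 'nova') },
};
```

Typing the voice as a union means a misspelled voice name fails at compile time rather than at the API call.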

// With Frankfurt endpoint for maximum DSGVO compliance
const response = await ttsService.synthesize({
  text: 'Guten Tag, wie geht es Ihnen?',
  provider: TTSProvider.GOOGLE,
  voice: { id: 'de-DE-Neural2-G' },
  audio: { format: 'mp3' },
  providerOptions: {
    region: 'europe-west3',
    effectsProfileId: ['headphone-class-device'],
  },
});

Available German Voices:

| Type | Female | Male | Quality |
|------|--------|------|---------|
| Neural2 | de-DE-Neural2-G | de-DE-Neural2-H | Best value |
| WaveNet | de-DE-Wavenet-G | de-DE-Wavenet-H | Good |
| Studio | de-DE-Studio-C | de-DE-Studio-B | Premium |
| Chirp3-HD | Aoede, Kore, ... | Fenrir, Puck, ... | Newest |

Prerequisites

  • Node.js 18+
  • TypeScript 5.3+
  • Provider credentials (API keys / service accounts)

Configuration

Create a .env file in your project root:

# Default provider
TTS_DEFAULT_PROVIDER=azure

# Azure Speech Services (EU-compliant)
AZURE_SPEECH_KEY=your-azure-speech-key
AZURE_SPEECH_REGION=germanywestcentral

# EdenAI (multi-provider aggregator)
EDENAI_API_KEY=your-edenai-api-key

# Google Cloud TTS (EU-compliant)
GOOGLE_APPLICATION_CREDENTIALS=./service-account.json
GOOGLE_CLOUD_PROJECT=your-project-id
GOOGLE_TTS_REGION=eu

# Fish Audio (test/admin only – no EU data residency)
FISH_AUDIO_API_KEY=your-fish-audio-api-key

# Inworld AI (test/admin only – no EU data residency)
INWORLD_API_KEY=your-inworld-api-key

# Logging
TTS_DEBUG=false
LOG_LEVEL=info
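
A startup check can catch missing credentials before the first synthesis call. The sketch below is an assumption-heavy illustration: `missingVars` is a hypothetical helper, the variable names come from the example file above, and every provider key except `azure` (`edenai`, `google`, `fish_audio`, `inworld`) is a guess at the accepted `TTS_DEFAULT_PROVIDER` values.

```typescript
type Env = Record<string, string | undefined>;

// Variable names are from the .env example; the grouping per provider is an assumption.
const REQUIRED_VARS: Record<string, string[]> = {
  azure: ['AZURE_SPEECH_KEY', 'AZURE_SPEECH_REGION'],
  edenai: ['EDENAI_API_KEY'],
  google: ['GOOGLE_APPLICATION_CREDENTIALS', 'GOOGLE_CLOUD_PROJECT'],
  fish_audio: ['FISH_AUDIO_API_KEY'],
  inworld: ['INWORLD_API_KEY'],
};

// Returns the names of variables that are unset or empty for the default provider.
function missingVars(env: Env): string[] {
  const provider = env.TTS_DEFAULT_PROVIDER;
  if (!provider) return ['TTS_DEFAULT_PROVIDER'];
  const required = REQUIRED_VARS[provider.toLowerCase()];
  if (!required) return [`TTS_DEFAULT_PROVIDER (unknown value "${provider}")`];
  return required.filter((name) => !env[name]);
}
```

In a Node.js application this could be called as `missingVars(process.env)` once the `.env` file has been loaded.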

Providers & Models

Azure Speech Services (MVP)

| Feature | Details |
|---------|---------|
| Voices | 180+ neural voices |
| Languages | 100+ locales |
| Emotions | cheerful, sad, angry, friendly, etc. |
| Styles | chat, newscast, customerservice, etc. |
| Audio | MP3, WAV, Opus |
| EU Region | germanywestcentral (Frankfurt) |
| Pricing | ~$16/1M characters |

Google Cloud TTS

| Feature | Details |
|---------|---------|
| Voices | Neural2, WaveNet, Standard, Studio, Chirp3-HD |
| Languages | 40+ languages |
| Audio | MP3, WAV, Opus |
| EU Regions | eu, europe-west1 through europe-west9 |
| Pricing | ~$16/1M characters |

EdenAI (Aggregator)

| Feature | Details |
|---------|---------|
| Providers | Google, OpenAI, Amazon, IBM, Microsoft, ElevenLabs |
| Voices | Depends on underlying provider |
| OpenAI Voices | alloy, echo, fable, onyx, nova, shimmer (57 languages) |
| ElevenLabs Voices | Aria, Roger, Sarah, Laura, Charlie, George (via voice_id) |

Fish Audio (Test/Admin Only)

| Feature | Details |
|---------|---------|
| Models | S1 (flagship, 4B params), speech-1.6, speech-1.5 |
| Languages | 13 with auto-detection (EN, DE, FR, ES, JA, ZH, KO, AR, RU, NL, IT, PL, PT) |
| Emotions | 64+ expressions via text markers: (excited), (sad), (whispering) |
| Voices | Community library + custom voice cloning |
| Audio | MP3, WAV, PCM, Opus |
| Pricing | $15/1M UTF-8 bytes |
| EU Compliance | No data residency guarantees |

Inworld AI (Test/Admin Only)

| Feature | Details |
|---------|---------|
| Models | TTS 1.5 Max (~200ms latency), TTS 1.5 Mini (~120ms latency) |
| Languages | 15 languages |
| Voices | Instant voice cloning + professional voice cloning |
| Audio | MP3, LINEAR16, OGG_OPUS, ALAW, MULAW, FLAC |
| Controls | temperature, speakingRate, timestamps, text normalization |
| Pricing | $10/1M chars (Max), $5/1M chars (Mini) |
| EU Compliance | No data residency guarantees |

GDPR / Compliance

Provider Compliance Overview

| Provider | DPA | GDPR | EU Data Residency | Notes |
|----------|-----|------|-------------------|-------|
| Azure | Yes | Yes | Yes (Frankfurt) | Recommended for EU |
| Google Cloud | Yes | Yes | Yes (EU multi-region) | Full EU endpoint support |
| EdenAI | Yes | Depends* | Depends* | Depends on underlying provider |
| Fish Audio | No | No | No | Test/admin only |
| Inworld AI | No | No | No | Test/admin only |

*EdenAI is an aggregator - compliance depends on the underlying provider.
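
An application that must keep data in the EU could encode the overview as a guard that runs before a provider is selected. This is a sketch under stated assumptions: `assertEuResident`, the `ProviderName` string keys, and the `allowDepends` flag are all hypothetical and not part of the package; only the residency values are taken from the overview.

```typescript
// Hypothetical provider keys; the package's own enum values may differ.
type ProviderName = 'azure' | 'google' | 'edenai' | 'fish_audio' | 'inworld';
type Residency = 'yes' | 'depends' | 'no';

// Residency status per provider, as listed in the compliance overview.
const EU_DATA_RESIDENCY: Record<ProviderName, Residency> = {
  azure: 'yes',
  google: 'yes',
  edenai: 'depends',
  fish_audio: 'no',
  inworld: 'no',
};

// Throws unless the provider offers EU data residency.
// "depends" is rejected by default because it varies with the underlying provider.
function assertEuResident(provider: ProviderName, allowDepends = false): void {
  const status = EU_DATA_RESIDENCY[provider];
  if (status === 'yes' || (status === 'depends' && allowDepends)) return;
  throw new Error(
    `Provider "${provider}" does not guarantee EU data residency (status: ${status})`,
  );
}
```

Rejecting `depends` by default keeps the aggregator case explicit: a caller has to opt in after checking the underlying provider.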

API Reference

TTSService

class TTSService {
  synthesize(request: TTSSynthesizeRequest): Promise<TTSResponse>;
  getProvider(provider: TTSProvider): BaseTTSProvider;
  setDefaultProvider(provider: TTSProvider): void;
  getAvailableProviders(): TTSProvider[];
  isProviderAvailable(provider: TTSProvider): boolean;
}

TTSSynthesizeRequest

interface TTSSynthesizeRequest {
  text: string;
  provider?: TTSProvider;
  voice: { id: string };
  audio?: {
    format?: 'mp3' | 'wav' | 'opus' | 'aac' | 'flac';
    speed?: number;        // 0.5 - 2.0
    pitch?: number;        // -20 to 20
    volumeGainDb?: number; // -96 to 16
    sampleRate?: number;
  };
  providerOptions?: Record<string, unknown>;
  retry?: boolean | RetryConfig; // default: true
}
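
The range comments on `speed`, `pitch`, and `volumeGainDb` can be checked before a request is sent. The following is a minimal sketch, not the package's own validation (the README does not say whether the middleware validates these values itself); `audioRangeErrors` and `AudioOptionsSketch` are hypothetical names.

```typescript
// Local stand-in for the numeric fields of TTSSynthesizeRequest['audio'].
interface AudioOptionsSketch {
  speed?: number;
  pitch?: number;
  volumeGainDb?: number;
}

// Ranges copied from the interface comments.
const AUDIO_RANGES = {
  speed: [0.5, 2.0],
  pitch: [-20, 20],
  volumeGainDb: [-96, 16],
} as const;

// Returns one message per out-of-range field; an empty array means all set fields are valid.
function audioRangeErrors(audio: AudioOptionsSketch): string[] {
  const errors: string[] = [];
  for (const key of Object.keys(AUDIO_RANGES) as (keyof typeof AUDIO_RANGES)[]) {
    const value = audio[key];
    if (value === undefined) continue; // Unset fields fall back to provider defaults.
    const [min, max] = AUDIO_RANGES[key];
    if (!Number.isFinite(value) || value < min || value > max) {
      errors.push(`${key} must be between ${min} and ${max}, got ${value}`);
    }
  }
  return errors;
}
```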

TTSResponse

interface TTSResponse {
  audio: Buffer;
  metadata: {
    provider: string;
    voice: string;
    duration: number;        // Synthesis time (API call duration) in ms
    audioDuration?: number;  // Actual audio length in ms (MP3 only)
    audioFormat: string;
    sampleRate: number;
  };
  billing: {
    characters: number;
    tokensUsed?: number;
  };
}

Advanced Features

Replace the default console logger with your own:

import { setLogger, silentLogger, setLogLevel } from '@loonylabs/tts-middleware';

// Use Winston, Pino, or any custom logger
setLogger({
  info: (msg, meta) => winston.info(msg, meta),
  warn: (msg, meta) => winston.warn(msg, meta),
  error: (msg, meta) => winston.error(msg, meta),
  debug: (msg, meta) => winston.debug(msg, meta),
});

// Disable all logging
setLogger(silentLogger);

// Control log level
setLogLevel('warn');

All provider calls are automatically retried on transient errors (429 rate limit, 5xx server errors, timeouts). Non-retryable errors (401, 403, 400) are thrown immediately.

// Default: retry enabled (3 retries, 1s initial delay, 2x multiplier)
const withDefaults = await ttsService.synthesize({
  text: 'Hello World',
  voice: { id: 'en-US-JennyNeural' },
});

// Disable retry
const withoutRetry = await ttsService.synthesize({
  text: 'Hello World',
  voice: { id: 'en-US-JennyNeural' },
  retry: false,
});

// Custom retry config
const withCustomRetry = await ttsService.synthesize({
  text: 'Hello World',
  voice: { id: 'en-US-JennyNeural' },
  retry: {
    maxRetries: 5,
    initialDelayMs: 500,
    multiplier: 2,
    maxDelayMs: 10000,
  },
});

| Error Type | Retried? | Examples |
|------------|----------|----------|
| Rate limit | Yes | 429 Too Many Requests |
| Server error | Yes | 500, 502, 503, 504 |
| Timeout | Yes | Request timeout, ECONNREFUSED, ECONNRESET |
| Auth error | No | 401, 403 |
| Bad request | No | 400, invalid voice |
| Unknown | No | SynthesisFailedError |
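
To make the retry settings concrete, here is a sketch of how a delay could be derived from them. It is not the middleware's implementation: the README only states "exponential backoff and jitter", so the "full jitter" strategy, the `RetrySettings` type, and the 30-second default cap are assumptions. The field names and the other defaults match the examples above.

```typescript
// Local stand-in for the retry options; field names follow the custom retry example.
interface RetrySettings {
  maxRetries: number;
  initialDelayMs: number;
  multiplier: number;
  maxDelayMs: number;
}

// 3 retries, 1s initial delay, 2x multiplier are documented; maxDelayMs is an assumed value.
const DEFAULT_RETRY: RetrySettings = {
  maxRetries: 3,
  initialDelayMs: 1000,
  multiplier: 2,
  maxDelayMs: 30000,
};

// Delay before retry number `attempt` (0-based), using full jitter:
// a uniform random value between 0 and the capped exponential delay.
function backoffDelayMs(
  attempt: number,
  cfg: RetrySettings = DEFAULT_RETRY,
  random: () => number = Math.random,
): number {
  const exponential = cfg.initialDelayMs * Math.pow(cfg.multiplier, attempt);
  const capped = Math.min(exponential, cfg.maxDelayMs);
  return Math.floor(random() * capped);
}

// HTTP statuses the README lists as retried: 429 and 5xx.
function isRetryableStatus(status: number): boolean {
  return status === 429 || (status >= 500 && status <= 599);
}
```

Passing `random` as a parameter keeps the function deterministic in tests; with the defaults, the upper bound of the delay grows as 1s, 2s, 4s across the three retries.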

Typed error classes for precise error handling:

import {
  TTSError,
  InvalidConfigError,
  InvalidVoiceError,
  QuotaExceededError,
  ProviderUnavailableError,
  SynthesisFailedError,
  NetworkError,
} from '@loonylabs/tts-middleware';

try {
  const result = await ttsService.synthesize({ text: 'test', voice: { id: 'en-US' } });
} catch (error) {
  if (error instanceof QuotaExceededError) {
    console.log('Rate limit hit, try again later');
  } else if (error instanceof InvalidVoiceError) {
    console.log('Voice not found');
  } else if (error instanceof TTSError) {
    console.log(`TTS Error [${error.code}]: ${error.message}`);
  }
}

The middleware returns character counts for cost calculation:

const PROVIDER_RATES = {
  [TTSProvider.AZURE]: 16 / 1_000_000,
  [TTSProvider.GOOGLE]: 16 / 1_000_000,
  [TTSProvider.FISH_AUDIO]: 15 / 1_000_000, // Billed per UTF-8 byte, so a character count is approximate
  [TTSProvider.INWORLD]: 10 / 1_000_000, // Max model; Mini: $5/1M
};

const response = await ttsService.synthesize({ /* ... */ });
const costUSD = response.billing.characters * PROVIDER_RATES[TTSProvider.AZURE];
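
Because Fish Audio is priced per UTF-8 byte rather than per character, multiplying a character count by its rate undercounts text with umlauts, CJK characters, or emoji. The sketch below shows one way to account for that. `estimateCostUSD`, `utf8ByteLength`, and the `PricedProvider` keys are hypothetical names, and the rates are the approximate list prices quoted in this README.

```typescript
// Hypothetical keys; rates in USD per one million billing units, as quoted in this README.
type PricedProvider = 'azure' | 'google' | 'fish_audio' | 'inworld_max' | 'inworld_mini';

const USD_PER_MILLION: Record<PricedProvider, number> = {
  azure: 16,
  google: 16,
  fish_audio: 15, // per UTF-8 byte, not per character
  inworld_max: 10,
  inworld_mini: 5,
};

// Counts the bytes a string occupies when encoded as UTF-8.
function utf8ByteLength(text: string): number {
  let bytes = 0;
  for (const ch of text) {
    const codePoint = ch.codePointAt(0) ?? 0;
    bytes += codePoint < 0x80 ? 1 : codePoint < 0x800 ? 2 : codePoint < 0x10000 ? 3 : 4;
  }
  return bytes;
}

// billedCharacters would come from response.billing.characters.
function estimateCostUSD(
  provider: PricedProvider,
  text: string,
  billedCharacters: number,
): number {
  const units = provider === 'fish_audio' ? utf8ByteLength(text) : billedCharacters;
  return (units * USD_PER_MILLION[provider]) / 1_000_000;
}
```

For example, the five-character string 'Grüße' occupies seven bytes in UTF-8, so it is billed as seven units under byte-based pricing.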

Architecture

graph TD
    App[Your Application] -->|synthesize()| Service[TTSService]
    Service -->|getProvider()| Registry{Provider Registry}

    Registry -->|Select| Azure[AzureProvider]
    Registry -->|Select| GCloud[GoogleCloudTTSProvider]
    Registry -->|Select| Eden[EdenAIProvider]
    Registry -->|Select| Fish[FishAudioProvider]
    Registry -->|Select| Inworld[InworldProvider]

    Azure -->|SSML/SDK| AzureAPI[Azure Speech API]
    GCloud -->|gRPC/SDK| GoogleAPI[Google Cloud TTS API]
    Eden -->|REST| EdenAPI[EdenAI API]
    Fish -->|REST| FishAPI[Fish Audio API]
    Inworld -->|REST| InworldAPI[Inworld AI API]

    GoogleAPI -->|EU Endpoint| EU[eu-texttospeech.googleapis.com]
    EdenAPI -.-> OpenAI[OpenAI TTS]
    EdenAPI -.-> Amazon[Amazon Polly]
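
The service-to-registry-to-provider flow in the diagram is a common registry pattern. The following is an illustrative sketch of that pattern only, not the package's source: `SketchProvider`, `ProviderRegistry`, and `SketchService` are hypothetical, and `synthesize` is synchronous and returns a string purely to keep the example short.

```typescript
// Minimal provider contract for the sketch; the real providers return audio asynchronously.
interface SketchProvider {
  readonly name: string;
  synthesize(text: string): string;
}

// Maps provider names to implementations.
class ProviderRegistry {
  private readonly providers = new Map<string, SketchProvider>();

  register(provider: SketchProvider): void {
    this.providers.set(provider.name, provider);
  }

  get(name: string): SketchProvider {
    const provider = this.providers.get(name);
    if (!provider) throw new Error(`Provider "${name}" is not registered`);
    return provider;
  }

  available(): string[] {
    return Array.from(this.providers.keys());
  }
}

// Callers talk to the service only; switching providers is a change of name, not of code.
class SketchService {
  private readonly registry: ProviderRegistry;
  private defaultProvider: string;

  constructor(registry: ProviderRegistry, defaultProvider: string) {
    this.registry = registry;
    this.defaultProvider = defaultProvider;
  }

  synthesize(text: string, provider: string = this.defaultProvider): string {
    return this.registry.get(provider).synthesize(text);
  }
}
```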

Testing

# Run all tests (555 tests, >90% coverage)
npm test

# Unit tests only
npm run test:unit

# Integration tests
npm run test:integration

# Coverage report
npm run test:coverage

# Manual test scripts
npx ts-node scripts/manual-test-edenai.ts
npx ts-node scripts/manual-test-google-cloud-tts.ts
npx ts-node scripts/manual-test-fish-audio.ts [en] [de]

# List available Google Cloud voices
npx ts-node scripts/list-google-voices.ts de-DE

Contributing

We welcome contributions! Please ensure:

  • Tests: Add tests for new features
  • Linting: Run npm run lint before committing
  • Conventions: Follow the existing project structure

To submit a change:

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add some amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Made with care by the LoonyLabs Team