@silyze/async-audio-tts

v1.1.0

Published

4 months ago

Async audio text-to-speech interface


simeonmarkoski

mojsoski

Async Audio TTS

Minimal TypeScript interface for Text-to-Speech engines that output audio as an asynchronous stream. Built to interoperate with @silyze/async-audio-stream.

Install

npm install @silyze/async-audio-tts

This package depends on @silyze/async-audio-stream and re-exports its types (e.g., AudioOutputStream, AudioFormat).

Quick Start

@silyze/async-audio-tts exports a single interface describing a TTS engine that produces audio you can read asynchronously.

import TextToSpeachModel from "@silyze/async-audio-tts";
import { AudioFormat } from "@silyze/async-audio-stream";

class YourEngine implements TextToSpeachModel {
  // AudioOutputStream requirement
  get format(): AudioFormat {
    // Return the format of emitted audio (e.g., PCM 16kHz mono)
    throw new Error("Not implemented");
  }

  // TTS-specific lifecycle
  get ready(): Promise<void> {
    // Resolve when the engine is initialized and ready
    return Promise.resolve();
  }

  async speak(text: string): Promise<void> {
    // Enqueue synthesized audio for the provided text into the output stream
    throw new Error("Not implemented");
  }

  async close(): Promise<void> {
    // Release internal resources and stop emitting audio
    throw new Error("Not implemented");
  }

  // Provide async iteration of audio chunks (Buffer)
  async *[Symbol.asyncIterator](): AsyncIterator<Buffer> {
    // Yield audio buffers as they become available
    // yield chunk
  }
}

Consume audio from any TextToSpeachModel implementation:

const tts = new YourEngine();
await tts.ready;
await tts.speak("Hello world");

for await (const chunk of tts) {
  // Handle PCM/encoded audio Buffer chunks (e.g., write to a file or stream to a player)
}

await tts.close();
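
When the whole utterance is needed at once (for example, to write a single file), the chunks from the loop above can be concatenated. The helper below is a minimal sketch, not part of this package: `collectAudio` is a hypothetical name, and it is typed against `AsyncIterable<Uint8Array>` so that it compiles without Node typings (in Node, `Buffer` is a `Uint8Array` subclass, so a TTS stream should fit this signature).

```typescript
// Hypothetical helper (not exported by @silyze/async-audio-tts):
// concatenates every chunk from an async audio source into one byte array.
async function collectAudio(
  source: AsyncIterable<Uint8Array>
): Promise<Uint8Array> {
  const chunks: Uint8Array[] = [];
  let total = 0;

  // Read until the source signals the end of iteration.
  for await (const chunk of source) {
    chunks.push(chunk);
    total += chunk.length;
  }

  // Copy each chunk into a single contiguous array.
  const out = new Uint8Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    out.set(chunk, offset);
    offset += chunk.length;
  }
  return out;
}
```

The helper only returns once the stream ends. If a given engine ends its stream only after `close()`, `close()` has to be called from somewhere other than the code awaiting `collectAudio`.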

API

Default export TextToSpeachModel extends AudioOutputStream and adds:
- ready: Promise<void> - resolves when the engine is initialized and ready to synthesize.
- speak(text: string): Promise<void> - synthesizes and enqueues audio for text onto the output stream.
- close(): Promise<void> - stops synthesis and releases any resources used by the engine.

From @silyze/async-audio-stream:
- AudioOutputStream - an async readable stream of Buffer chunks with a format: AudioFormat getter.
- AudioFormat - describes the encoding of the emitted audio (e.g., PCM sample rate, encoding name).

Type Definition

import { AudioOutputStream } from "@silyze/async-audio-stream";

export default interface TextToSpeachModel extends AudioOutputStream {
  get ready(): Promise<void>;
  speak(text: string): Promise<void>;
  close(): Promise<void>;
}
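
To illustrate how an implementation might satisfy this shape, here is a toy queue-backed engine. It is a sketch under several assumptions: `TtsShape` is a local stand-in for `TextToSpeachModel` so the example compiles on its own, the `format` getter is omitted because the fields of `AudioFormat` are not shown in this README, and `EchoTts` emits character codes rather than real audio. It also assumes a single consumer.

```typescript
// Hypothetical local stand-in for the TextToSpeachModel shape (format omitted).
interface TtsShape extends AsyncIterable<Uint8Array> {
  readonly ready: Promise<void>;
  speak(text: string): Promise<void>;
  close(): Promise<void>;
}

// Toy engine: "synthesizes" text by emitting its character codes as bytes.
class EchoTts implements TtsShape {
  private queue: Uint8Array[] = [];
  private waiters: Array<() => void> = [];
  private closed = false;

  readonly ready: Promise<void> = Promise.resolve();

  async speak(text: string): Promise<void> {
    if (this.closed) throw new Error("engine is closed");
    // Codes above 255 wrap around; acceptable for a toy, not for real audio.
    this.queue.push(Uint8Array.from(text, (c) => c.charCodeAt(0)));
    this.wake();
  }

  async close(): Promise<void> {
    this.closed = true;
    this.wake();
  }

  // Release every pending reader so it can re-check the queue.
  private wake(): void {
    for (const resolve of this.waiters.splice(0)) resolve();
  }

  async *[Symbol.asyncIterator](): AsyncGenerator<Uint8Array> {
    while (true) {
      // Drain whatever is queued, then stop if closed, otherwise wait.
      while (this.queue.length > 0) yield this.queue.shift()!;
      if (this.closed) return;
      await new Promise<void>((resolve) => this.waiters.push(resolve));
    }
  }
}
```

In this sketch, chunks queued before `close()` are still delivered and iteration ends afterwards. A real engine may choose different semantics, such as discarding pending audio on close.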

Notes

- Multiple speak() calls may enqueue additional audio onto the same output stream.
- Consumers should continuously read from the stream to avoid backpressure.
- Call tts.close() when finished to free resources and stop audio production.
- The emitted audio format is discoverable via tts.format.
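
One way to follow these notes together is to start reading before the first `speak()` call and to close the engine once all text has been queued. The sketch below is illustrative only: `SpeakableStream` and `speakAndDrain` are hypothetical names, the interface is a local stand-in for `TextToSpeachModel`, and the code assumes that `close()` ends the async iteration, which this README does not state explicitly.

```typescript
// Hypothetical structural stand-in for TextToSpeachModel (format omitted).
interface SpeakableStream extends AsyncIterable<Uint8Array> {
  readonly ready: Promise<void>;
  speak(text: string): Promise<void>;
  close(): Promise<void>;
}

// Hypothetical helper: queue several texts while draining audio concurrently.
async function speakAndDrain(
  tts: SpeakableStream,
  texts: string[],
  onChunk: (chunk: Uint8Array) => void
): Promise<void> {
  await tts.ready;

  // Reader: starts before the first speak() so chunks are consumed as they arrive.
  const reader = (async () => {
    for await (const chunk of tts) onChunk(chunk);
  })();

  // Writer: queues each text in order, then closes even if speak() throws.
  const writer = (async () => {
    try {
      for (const text of texts) await tts.speak(text);
    } finally {
      // Assumption: close() ends the async iteration so the reader can finish.
      await tts.close();
    }
  })();

  await Promise.all([reader, writer]);
}
```

If a particular engine discards queued audio on `close()`, the writer would need to wait for playback or synthesis to finish before closing. Check the engine's own documentation for that behavior.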
