@polytts/browser

v0.1.2, published 2 months ago

Browser entrypoint for polytts text-to-speech runtimes.

Vulnerabilities: 0 high, 0 medium, 0 low
Maintainers: dengqing
Keywords: browser, onnx, polytts, text-to-speech, tts, web

@polytts/browser

Explicit browser entrypoint for polytts.

Use this package when you want the structured scoped package layout for browser apps, PWAs, or Electron renderers.

Install

npm install @polytts/browser

Usage

import { createBrowserTTS } from "@polytts/browser";

const tts = createBrowserTTS({
  initialModelId: "browser-speech",
});

await tts.ready();
await tts.speak("Hello from the browser runtime.");

For progressive audio:

for await (const chunk of tts.synthesizeStream("Hello from a streaming browser model.")) {
  console.log(chunk.sampleRate, chunk.channels[0]?.length ?? 0);
}
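
If you need the whole utterance as one buffer, for example to measure its duration, the streamed chunks can be merged once the loop finishes. The following is a minimal sketch, not part of the package: `AudioChunkLike` and `mergeFirstChannel` are hypothetical names, and the `Float32Array` channel type is an assumption (only `sampleRate` and `channels[0].length` are shown above).

```typescript
// Hypothetical chunk shape; the Float32Array element type is an assumption.
interface AudioChunkLike {
  sampleRate: number;
  channels: Float32Array[];
}

// Concatenate the first channel of every chunk into one contiguous buffer.
function mergeFirstChannel(chunks: AudioChunkLike[]): { samples: Float32Array; seconds: number } {
  const total = chunks.reduce((sum, c) => sum + (c.channels[0]?.length ?? 0), 0);
  const samples = new Float32Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    const first = chunk.channels[0];
    if (!first) continue; // skip chunks that carry no audio data
    samples.set(first, offset);
    offset += first.length;
  }
  // Assumes every chunk uses the same sample rate as the first one.
  const rate = chunks[0]?.sampleRate ?? 0;
  return { samples, seconds: rate > 0 ? total / rate : 0 };
}
```

To use it, push each `chunk` into an array inside the `for await` loop and call `mergeFirstChannel(chunks)` afterwards.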

The high-level controller also exposes getInstallState(modelId) and isInstalled(modelId) for quick download checks without reading raw runtime state.

If you do not want IndexedDB, pass a custom assetStore:

import { MemoryAssetStore } from "@polytts/core";
import { createBrowserTTS } from "@polytts/browser";

const tts = createBrowserTTS({
  assetStore: new MemoryAssetStore(),
});

@polytts/browser also exports LocalStorageAssetStore, but it is only suitable for tiny assets, demos, and tests. localStorage is usually capped around 5 MB and base64 storage adds roughly 33% overhead, so do not use it for real ONNX bundles.
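
The arithmetic behind that warning can be sketched as a rough pre-check. The helper names and the 5 * 1024 * 1024 character budget below are assumptions for illustration; real quotas differ between browsers, and nothing here is part of the package.

```typescript
// Base64 encodes every 3 input bytes as 4 output characters, padding the final group.
function base64EncodedLength(byteLength: number): number {
  return 4 * Math.ceil(byteLength / 3);
}

// Assumed budget: roughly 5 MB per origin, counted here as characters.
// Treat this as an estimate only; actual limits vary by browser.
const ASSUMED_LOCAL_STORAGE_BUDGET = 5 * 1024 * 1024;

// True when the base64-encoded asset would stay within the assumed budget.
function likelyFitsInLocalStorage(
  assetBytes: number,
  budget: number = ASSUMED_LOCAL_STORAGE_BUDGET,
): boolean {
  return base64EncodedLength(assetBytes) <= budget;
}
```

Under these assumptions a 3 MiB asset still fits after encoding, while a 4 MiB asset does not, which is far below the size of a typical ONNX bundle.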

Lifecycle

The browser controller keeps three separate concepts:

selected model and voice
downloaded assets
loaded runtime instance

Important behavior:

initialModelId and initialVoiceId only set the starting selection
download(modelId) caches assets, but does not prepare the runtime instance
ready(), selectModel(), selectFamily(), and selectVoice() prepare the selected model
speak(), synthesize(), and synthesizeStream() prepare on demand if needed

If your UI needs explicit install vs load states, use getInstallState(), isInstalled(), status, phase, and phaseProgress together instead of assuming they mean the same thing.
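
One way to keep install and load state apart in UI code is to reduce them to two independent flags first. This is a sketch under assumptions: the README does not show the return values of `getInstallState()`, `status`, or `phase`, so the example takes plain booleans that your app would derive from them. `ModelFlags`, `ModelUiState`, and `toUiState` are hypothetical names.

```typescript
type ModelUiState = "not-installed" | "installed-not-loaded" | "ready";

interface ModelFlags {
  installed: boolean; // e.g. derived from isInstalled(modelId)
  loaded: boolean; // e.g. derived from status/phase; exact values are not documented here
}

// Map the two independent flags to a single label for the UI.
function toUiState(flags: ModelFlags): ModelUiState {
  // Check "loaded" first: a model with no download step can be ready
  // regardless of what the install check reports.
  if (flags.loaded) return "ready";
  return flags.installed ? "installed-not-loaded" : "not-installed";
}
```

Checking `loaded` before `installed` means a model such as browser-speech, which has no download step, is never shown as waiting for an install.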

Voices

Catalog voice metadata and runtime-resolved voices are not always identical.

Models such as Piper usually expose stable voices from catalog metadata.
Models such as Kokoro may populate their final voice list only after the model is prepared.

If your UI renders a voice picker, expect some models to show no resolved voices until after ready() or selectModel() completes.
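
A voice picker can handle that gap by falling back to catalog metadata and marking the list as provisional until the model is prepared. The sketch below assumes a simple `{ id, name }` voice shape and that your app tracks a `prepared` flag itself; `VoiceLike`, `VoicePickerModel`, and `voicesForPicker` are hypothetical names, not package exports.

```typescript
// Hypothetical voice shape for illustration only.
interface VoiceLike {
  id: string;
  name: string;
}

interface VoicePickerModel {
  voices: VoiceLike[];
  provisional: boolean; // true while the final list may still change
}

// Prefer runtime-resolved voices; otherwise fall back to catalog metadata.
function voicesForPicker(
  catalog: VoiceLike[],
  resolved: VoiceLike[],
  prepared: boolean,
): VoicePickerModel {
  if (resolved.length > 0) return { voices: resolved, provisional: false };
  // Nothing resolved yet: show catalog voices, flagged until the model is prepared.
  return { voices: catalog, provisional: !prepared };
}
```

When `provisional` is true and `voices` is empty, the picker can render a loading placeholder instead of an empty list.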

Platform notes

browser-speech has no download step and depends entirely on the host browser engine.
ONNX-backed models may download large bundles and should usually expose progress UI.
Piper uses a safer main-thread ONNX path on iOS. That path is broadly compatible, but a stop or cancel request only takes effect between inference calls: it is observed before a call starts or after it returns, never while it is running.
Worker-backed models stop more aggressively because the worker can be recycled on abort.
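
The difference comes from how cancellation works when inference blocks the calling thread. The sketch below illustrates the general cooperative pattern only; it is not the package's implementation. `StopSignal` and `runSegments` are hypothetical names, and `infer` stands in for any blocking inference call.

```typescript
// Minimal stand-in for AbortSignal; a real AbortSignal satisfies this shape.
interface StopSignal {
  readonly aborted: boolean;
}

// Run blocking inference calls one segment at a time,
// checking for a stop request only between calls.
function runSegments<T>(
  segments: string[],
  infer: (text: string) => T,
  signal: StopSignal,
): T[] {
  const results: T[] = [];
  for (const segment of segments) {
    if (signal.aborted) break; // a stop is noticed here, never mid-inference
    results.push(infer(segment));
  }
  return results;
}
```

A stop requested while `infer` is running still lets that call finish, so the worst-case stop latency is one full inference call. A worker-backed design avoids this because the worker can be terminated from outside.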

SSR

@polytts/browser is a browser entrypoint. The React providers can render on the server, but createBrowserTTS() and createBrowserTTSRuntime() should still be called only in browser/client code.
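
One common way to respect that boundary in code shared between server and client is a lazy, client-only getter. This is a generic sketch: `lazyClientOnly` is a hypothetical helper, not a package export, and the `document` check is just one assumed way to detect a browser environment.

```typescript
// Returns a getter that builds the instance on first use in the browser
// and yields null on the server, so the factory never runs during SSR.
function lazyClientOnly<T>(
  factory: () => T,
  isBrowser: () => boolean = () =>
    typeof globalThis !== "undefined" && "document" in globalThis,
): () => T | null {
  let instance: T | null = null;
  return () => {
    if (!isBrowser()) return null; // server render: skip browser-only setup
    if (instance === null) instance = factory();
    return instance;
  };
}
```

In an app, the factory would be something like `() => createBrowserTTS({ initialModelId: "browser-speech" })`, and callers would handle the `null` case during server rendering.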

Exports

createBrowserTTS()
createBrowserTTSRuntime()
browser audio and storage helpers
official browser adapters
official catalogs from @polytts/presets
