@wcstack/speech
v1.14.0
Published
Declarative Web Speech components for Web Components. Framework-agnostic SpeechSynthesis (TTS) and SpeechRecognition (STT) primitives via wc-bindable-protocol.
Maintainers
Readme
@wcstack/speech
@wcstack/speech is a headless Web Speech component pair for the wcstack ecosystem.
These are not visual UI widgets. They are async primitive nodes that turn the browser's Web Speech APIs into reactive state — the same way @wcstack/fetch turns a network request into reactive state and @wcstack/geolocation turns the device's location into reactive state.
The package ships two complementary tags, the two halves of the same protocol:
| Tag | API | Direction | Protocol role |
|---|---|---|---|
| <wcs-speak> | SpeechSynthesis (TTS) | state → speech | command-token (state drives speech) |
| <wcs-listen> | SpeechRecognition (STT) | speech → state | event-token (recognition flows to state) |
Their coexistence in one package is the point: <wcs-speak> is a perfect showcase of command-driven output, <wcs-listen> of event-driven input. Wire them together for a speak ⇄ listen loop.
Both follow the CSBC (Core / Shell / Binding Contract) architecture:
- Core (
SpeakCore/ListenCore) wraps the native API, normalizes data, manages lifecycle/permission, and never throws (failures surface througherror). - Shell (
<wcs-speak>/<wcs-listen>) connects that state to DOM attributes, lifecycle, and declarative commands. - Binding Contract (
static wcBindable) declares observableproperties, writableinputs, and callablecommands.
Install
npm install @wcstack/speechOr buildless via CDN (registers both tags):
<script type="module" src="https://esm.run/@wcstack/speech/auto"></script><wcs-speak> — text to speech
Two ways to speak
<wcs-speak> exposes the same action through two surfaces that differ in when they fire:
<!-- 1. Reactive: speaks whenever `status` changes (same value is NOT re-spoken). -->
<wcs-speak data-wcs="say: status"></wcs-speak>
<!-- 2. Imperative: speaks on demand, even the same text again, via the command token. -->
<wcs-speak data-wcs="command.speak: $command.announce"></wcs-speak>// state
export default {
$commandTokens: ["announce"],
status: "Ready.",
onClick() {
this.$command.announce.emit("Button clicked again."); // imperative — re-speaks same text
},
};| Surface | Fires when | Same value re-speaks? | Use for |
|---|---|---|---|
| say (reactive input) | the bound value changes | no (guarded) | status / a11y announcements |
| speak (imperative command) | the command is invoked | yes | "speak this on click", "say it again" |
Tip: wire
saythrough a\|debouncefilter when binding to a rapidly-changing source (e.g. an<input>value), or it will speak on every keystroke. Set themanualattribute to mute thesaypath entirely (also the hook for muting speech while listening — see the echo example).
Word-boundary highlighting
charIndex / spokenWord update as each word is spoken — bind them to highlight the currently-spoken word (karaoke-style).
Attributes / Inputs
| Attribute | Input | Type | Default | Meaning |
|---|---|---|---|---|
| — | say | string | — | reactive: writing a new value speaks it |
| rate | rate | number | 1 | speech rate (0.1–10) |
| pitch | pitch | number | 1 | pitch (0–2) |
| volume | volume | number | 1 | volume (0–1) |
| voice | voice | string | — | voice selected by name |
| lang | lang | string | — | BCP-47 language tag |
| manual | manual | boolean | false | mute the say path |
Observable Properties (outputs)
| Property | Type | Meaning |
|---|---|---|
| voices | SpeechVoiceInfo[] | available voices (populated asynchronously) |
| speaking | boolean | an utterance is being spoken |
| paused | boolean | speech is paused |
| pending | boolean | utterances are queued |
| charIndex | number | null | offset of the word being spoken |
| spokenWord | string | null | the word being spoken |
| error | WcsSpeakErrorDetail | null | last failure |
| unsupported | boolean | SpeechSynthesis is unavailable |
Commands
| Command | Meaning |
|---|---|
| speak(text) | queue an utterance (uses current rate/pitch/… attributes) |
| cancel() | clear the queue and stop |
| pause() / resume() | suspend / resume |
Optional DOM triggering
With autoTrigger on (default), clicking an element carrying data-speaktarget="<id>" speaks its data-speaktext (or its text content) through the <wcs-speak id="<id>">.
<wcs-speak id="tts"></wcs-speak>
<button data-speaktarget="tts" data-speaktext="Hello!">Speak</button><wcs-listen> — speech to text
<!-- Auto-start on connect; bind the transcript to state -->
<wcs-listen lang="en-US" interim data-wcs="finalTranscript: transcript; interimTranscript: draft"></wcs-listen>
<!-- Manual, continuous, command-driven -->
<wcs-listen manual continuous max-restarts="5"
data-wcs="command.start: $command.listen; finalTranscript: transcript; listening: isListening"></wcs-listen>Like <wcs-geo>, it has two phases: a one-shot recognition (default) and a continuous session (continuous attribute). The browser still ends a session on silence; auto-restart bridges that, but is opt-in via max-restarts — continuous alone (with the default max-restarts="0") does not restart on silence. Set max-restarts="5" to bridge up to 5 silences. This bound is deliberate: unbounded restart is an infinite-loop / quota-exhaustion risk.
Microphone auto-start. Without
manual,<wcs-listen>callsstart()on connect — placing the tag in the DOM begins recognition (a permission prompt, then continuous capture). Addmanualto require an explicitstart()/ DOM-trigger /triggerwrite instead. Mirrors<wcs-geo>'smanualconvention, but mind that microphone capture is more privacy-sensitive.
Attributes / Inputs
| Attribute | Input | Type | Default | Meaning |
|---|---|---|---|---|
| lang | lang | string | — | BCP-47 language tag |
| continuous | continuous | boolean | false | keep the session open & auto-restart on end |
| interim | interim | boolean | false | emit live interim transcripts |
| max-restarts | maxRestarts | number | 0 | cap on automatic restarts (continuous) |
| manual | manual | boolean | false | do not auto-start on connect |
| — | trigger | boolean | — | momentary: false→true starts a session |
Observable Properties (outputs)
| Property | Type | Meaning |
|---|---|---|
| interimTranscript | string | live, not-yet-final text |
| finalTranscript | string | accumulated final text |
| result | WcsListenResultDetail | null | latest result (transcript / confidence / alternatives / isFinal) |
| listening | boolean | a session is active |
| permission | "prompt"\|"granted"\|"denied"\|"unsupported" | microphone permission |
| error | WcsListenErrorDetail | null | last failure |
| unsupported | boolean | SpeechRecognition is unavailable |
Commands
| Command | Meaning |
|---|---|
| start() | begin a session (resets transcripts) |
| stop() | stop gracefully (no auto-restart) |
| abort() | stop immediately |
Optional DOM triggering
Clicking an element with data-listentarget="<id>" toggles start() / stop() on the target <wcs-listen>.
Notes & limitations
- Secure context required. Both APIs need HTTPS or
localhost;<wcs-listen>additionally needs microphone permission. - Browser support. SpeechSynthesis is broad; SpeechRecognition is Chrome-only (vendor-prefixed
webkitSpeechRecognition) —<wcs-listen>reportsunsupportedelsewhere. - SpeechSynthesis is a global singleton.
<wcs-speak>does notcancel()on disconnect (that would stop other instances); callcancel()explicitly to stop audio. A disconnected element stops tracking but any in-flight utterance finishes. - Echo loop. When wiring
<wcs-listen>→ state →<wcs-speak>, mute speaking while listening (e.g. bindmanual) so the synthesized audio is not re-recognized. See the echo example.
Headless usage (SpeakCore / ListenCore)
Both Cores are framework-agnostic and usable without the custom elements, via bind() from @wc-bindable/core:
import { SpeakCore } from "@wcstack/speech";
const core = new SpeakCore();
core.speak("Hello, world.");License
MIT
