voice-flow-x
v0.0.0
An environment-agnostic voice stream processing framework
🎙️ Environment-agnostic voice stream processing — a small, testable API for consuming audio chunks → async ASR text streams, merging segments, debouncing deltas, and matching commands.
- 🌊 Streaming-first: Voice consumes incremental recognition as AsyncIterable / ReadableStream, and handles segment switches and candidate merging
- 🎯 UI-friendly: dedupes snapshots, short-window throttling, and a recent-output window to tame ASR flicker
- 🧩 Pluggable commands: match on delta / final with strings, predicates, or regex; stop, clear, and custom handler
- 📘 TypeScript-native: generic Chunk<T> and VoiceOptions<T> for any audio payload type
📌 Note: This library is only a stream + text state machine. It does not ship recording, WebSocket, or a specific ASR SDK; wire stream to your service.
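For local development without a real service, a stand-in text stream is often enough to exercise the callbacks. The sketch below assumes only that stream may return an AsyncIterable<string>, as the usage example states. The local Chunk shape, the fakeAsrStream and collect names, and the choice to emit growing snapshots rather than increments are illustrative assumptions, not part of the library.

```typescript
// Hypothetical local shape mirroring the { data, id } object used in the usage example.
interface Chunk<T> {
  data: T
  id: string
}

// Stand-in ASR: yields progressively longer partial transcripts for one segment.
async function* fakeAsrStream(
  chunk: Chunk<Uint8Array>,
  words: string[],
  delayMs = 5,
): AsyncGenerator<string> {
  // No audio, no text.
  if (chunk.data.length === 0) return
  let snapshot = ''
  for (const word of words) {
    // Simulate recognition latency between partial results.
    await new Promise<void>((resolve) => setTimeout(resolve, delayMs))
    snapshot = snapshot ? `${snapshot} ${word}` : word
    yield snapshot
  }
}

// Drain any AsyncIterable<string> into an array, e.g. to inspect it in a test.
async function collect(stream: AsyncIterable<string>): Promise<string[]> {
  const out: string[] = []
  for await (const text of stream) out.push(text)
  return out
}
```

A generator like this could be returned from the stream option until a real service is wired in, for example stream: ({ data, id }) => fakeAsrStream({ data, id }, ['hello', 'world']).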
📦 Install
pnpm add voice-flow-x
npm install voice-flow-x

🚀 Usage
Minimal example
Use createVoice (or new Voice) and provide stream: for each Chunk, return the async text stream for that audio segment.
import { createVoice } from 'voice-flow-x'

const voice = createVoice({
  stream: async ({ data, id }) => {
    // Your ASR: return AsyncIterable<string> or ReadableStream<string>
    return yourAsrStream(data, id)
  },
  onDelta: (text) => {
    // Debounced full snapshot (good for the current recognition line)
  },
  onFinal: (text) => {
    // Fires when `done()` completes (e.g. silence timeout with `finalIdleMs`, or manual `done()`)
  },
  deltaIdleMs: 50, // debounce for onDelta, default 50
  finalIdleMs: 2000, // optional: silence window before auto-finalize when no new chunk
  debug: false,
})
// After you get audio from the mic or elsewhere:
voice.feed({ data: audioPayload, id: segmentId })

Commands
Register match-and-run rules on streaming text — handy for keyword interrupts or clearing the UI.
voice.addCommand({
  match: ['stop', 'cancel'],
  stage: 'delta',
  stop: true, // skip further onDelta/onFinal handling when matched
  clear: true, // pass empty string to callbacks to clear the view
  handler: (text) => { /* side effects */ },
})

match may be a string[] (substring match via includes), a predicate (text: string) => boolean, or a RegExp.
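To make the three match forms concrete, here is a minimal sketch of how they might be evaluated against a piece of recognized text. The Matcher type and matchesCommand function are hypothetical names for illustration; the library's actual matching logic is not shown in this README and may differ, for instance in case handling.

```typescript
// The three matcher forms named above.
type Matcher = string[] | ((text: string) => boolean) | RegExp

function matchesCommand(match: Matcher, text: string): boolean {
  if (Array.isArray(match)) {
    // string[]: true when the text contains any keyword (case-sensitive here, by assumption).
    return match.some((keyword) => text.includes(keyword))
  }
  if (match instanceof RegExp) {
    // Reset lastIndex so global or sticky regexes give the same answer on every call.
    match.lastIndex = 0
    return match.test(text)
  }
  // Predicate: defer entirely to the caller's function.
  return match(text)
}

// Example checks against a partial transcript.
const partial = 'please stop now'
const hitsKeyword = matchesCommand(['stop', 'cancel'], partial)
const hitsRegex = matchesCommand(/\bstop\b/i, partial)
const hitsPredicate = matchesCommand((text: string) => text.endsWith('now'), partial)
```

Resetting lastIndex matters when the same RegExp object carries the g or y flag and is reused across many streaming updates; without it, alternating calls can report false negatives.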
State & concurrency
- feed(chunk): queued processing; continues with the next chunk when the current run finishes
- lock(ms) / unlock(): pause processing; auto-unlock after timeout; unlock clears internal text state
- finalize(): when finalIdleMs is set, schedules done() after silence (no new audio)
- clear(): resets prefix, merged text, and dedupe window (often via internal or onClear paths)
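The queued behaviour described for feed(chunk) can be pictured as a promise chain in which each chunk waits for the previous run to settle. The SerialQueue class below is a hypothetical illustration of that idea, not the library's internals; locking, idle timers, and text state are deliberately left out.

```typescript
class SerialQueue<T> {
  // Tail of the chain; every new job is appended behind it.
  private tail: Promise<void> = Promise.resolve()
  private readonly worker: (item: T) => Promise<void>

  constructor(worker: (item: T) => Promise<void>) {
    this.worker = worker
  }

  // Enqueue an item; the returned promise settles when that item's run finishes.
  push(item: T): Promise<void> {
    const run = this.tail.then(() => this.worker(item))
    // Swallow failures on the chain itself so one bad chunk does not block later ones.
    this.tail = run.catch(() => undefined)
    return run
  }
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms))

// Demo: the first chunk is the slowest, yet completion order follows feed order.
async function demo(): Promise<string[]> {
  const finished: string[] = []
  const queue = new SerialQueue<{ id: string; ms: number }>(async ({ id, ms }) => {
    await sleep(ms)
    finished.push(id)
  })
  await Promise.all([
    queue.push({ id: 'a', ms: 15 }),
    queue.push({ id: 'b', ms: 1 }),
    queue.push({ id: 'c', ms: 5 }),
  ])
  return finished
}
```

Chaining on a single tail promise keeps the ordering guarantee without an explicit array of pending items; a real implementation would also need a way to pause the chain to model lock(ms) and unlock().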
Types
Exports from voice-flow-x: Voice, createVoice, Chunk, Command, VoiceOptions, AsyncIterableStream, and more; see the JSDoc comments for details.
🔧 For package maintainers
If you use npm Trusted Publisher, run pnpm publish once locally to create the package and link the GitHub repo on npm; later, pnpm run release can drive releases via CI. See npm docs and scripts in this repo.
