vad-recorder
v0.1.3
vad-recorder is a browser-focused TypeScript library that combines voice activity detection (VAD) with automatic audio segment recording.
It uses Silero VAD via @huggingface/transformers (Transformers.js) and the onnx-community/silero-vad model on Hugging Face under the hood.
Install
npm install vad-recorder
Quick start
import { VadRecorder } from "vad-recorder";

// Check whether the model is already cached and how large the download is.
const info = await VadRecorder.info();
console.log(info.isCached, info.downloadSize);

const recorder = new VadRecorder({
  threshold: 0.55,
  minSpeechDuration: 250,
  minSilenceDuration: 900,
  prependSilence: 120,
  appendSilence: 300,
});

// Register one callback per event.
recorder.onReady(() => console.log("Listening..."));
recorder.onSpeechStart(() => console.log("Speech start"));
recorder.onSpeechEnd(() => console.log("Speech end"));
recorder.onRecord((blob) => console.log("Recorded blob", blob));
recorder.onError((err) => console.error(err));

// Load the VAD model, reporting download progress along the way.
await recorder.initialize((event) => {
  if (event.status === "downloading") {
    console.log(`Model download: ${Math.round(event.progress * 100)}%`);
  }
});

// Request the microphone and start processing frames.
await recorder.start();
React hook
For React projects, use the hook export:
import { useVadRecorder } from "vad-recorder/react";

function App() {
  const {
    status,
    progress,
    recordings,
    error,
    initialize,
    start,
    stop,
    pause,
    resume,
    clearRecordings,
  } = useVadRecorder({
    threshold: 0.55,
    minSpeechDuration: 250,
    minSilenceDuration: 900,
    prependSilence: 120,
    appendSilence: 300,
  });

  return (
    <div>
      <p>Status: {status}</p>
      <p>Download: {Math.round(progress * 100)}%</p>
      <button onClick={() => void initialize()}>Initialize</button>
      <button onClick={() => void start()}>Start</button>
      <button onClick={pause}>Pause</button>
      <button onClick={resume}>Resume</button>
      <button onClick={stop}>Stop</button>
      <button onClick={clearRecordings}>Clear</button>
      <p>Recordings: {recordings.length}</p>
      {error ? <pre>{error.message}</pre> : null}
    </div>
  );
}
useVadRecorder(options?) returns:
- status, progress, volumeDb, speechProbability
- recordings, error, recorder
- initialize, start, stop, pause, resume, destroy, clearRecordings, info
API
VadRecorder.info(): Promise<{ isCached: boolean; downloadSize: number }>
Returns model cache/download metadata.
- isCached: whether required model files are cached.
- downloadSize: sum of all model file sizes (bytes).
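Because downloadSize is reported in bytes, a UI will usually want to convert it before showing it to a user. The following is a minimal sketch of that conversion; formatBytes is a hypothetical helper written for this example and is not part of vad-recorder.

```typescript
// Hypothetical helper (not part of vad-recorder): formats a byte count
// such as info.downloadSize for display, using 1024-based units.
function formatBytes(bytes: number): string {
  if (!Number.isFinite(bytes) || bytes < 0) return "unknown";
  const units = ["B", "KB", "MB", "GB"];
  let value = bytes;
  let unit = 0;
  // Step up one unit at a time until the value fits or units run out.
  while (value >= 1024 && unit < units.length - 1) {
    value /= 1024;
    unit += 1;
  }
  // Whole numbers for plain bytes, one decimal place for larger units.
  return unit === 0
    ? `${value} ${units[unit]}`
    : `${value.toFixed(1)} ${units[unit]}`;
}
```

For instance, formatBytes(info.downloadSize) could label a download button when info.isCached is false.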
new VadRecorder(options?)
All options are optional:
- threshold (default 0.5)
  - Speech probability cutoff (0-1).
  - Higher = stricter detection (fewer false positives, can miss quiet speech).
  - Lower = more sensitive (captures quiet speech, can trigger on noise).
- minSpeechDuration ms (default 250)
  - Minimum continuous speech before a segment officially starts.
  - Helps filter clicks, breaths, and very short noises.
- minSilenceDuration ms (default 1000)
  - Required silence before a segment is considered finished.
  - Increase to avoid splitting natural pauses mid-sentence.
- prependSilence ms (default 100)
  - Audio prepended before detected speech to avoid clipping first phonemes.
  - Internally combined with minSpeechDuration in the rolling pre-buffer.
- appendSilence ms (default 300)
  - Extra audio kept after speech end is detected.
  - Helps avoid cutting off trailing words/syllables.
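To make the interaction between threshold, minSpeechDuration, and minSilenceDuration concrete, here is a simplified sketch of a segmenter driven by per-frame speech probabilities. It is an illustration under stated assumptions, not vad-recorder's actual implementation: the segment function, the SegmenterOptions and SegmentEvent types, and the fixed frameMs parameter are all hypothetical, and prependSilence/appendSilence are left out.

```typescript
// Illustrative only: shows how threshold, minSpeechDuration and
// minSilenceDuration could interact. NOT vad-recorder's real code;
// every name in this block is hypothetical.
interface SegmenterOptions {
  threshold: number;          // speech probability cutoff (0-1)
  minSpeechDuration: number;  // ms of continuous speech before a segment starts
  minSilenceDuration: number; // ms of continuous silence before a segment ends
}

type SegmentEvent = { type: "start" | "end"; atMs: number };

function segment(
  probabilities: number[],
  frameMs: number,
  opts: SegmenterOptions,
): SegmentEvent[] {
  const events: SegmentEvent[] = [];
  let inSpeech = false;
  let speechMs = 0;  // continuous speech seen while waiting to start
  let silenceMs = 0; // continuous silence seen while inside a segment

  probabilities.forEach((p, i) => {
    const isSpeech = p >= opts.threshold;
    const frameEndMs = (i + 1) * frameMs;
    if (!inSpeech) {
      // Any non-speech frame resets the run, which filters short noises.
      speechMs = isSpeech ? speechMs + frameMs : 0;
      if (speechMs >= opts.minSpeechDuration) {
        inSpeech = true;
        silenceMs = 0;
        events.push({ type: "start", atMs: frameEndMs });
      }
    } else {
      // Any speech frame resets the run, so short pauses do not split.
      silenceMs = isSpeech ? 0 : silenceMs + frameMs;
      if (silenceMs >= opts.minSilenceDuration) {
        inSpeech = false;
        speechMs = 0;
        events.push({ type: "end", atMs: frameEndMs });
      }
    }
  });
  return events;
}
```

In this sketch, a single loud frame followed by silence never opens a segment, and a one-frame pause in the middle of speech never closes one, which is the behavior the option descriptions suggest.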
Lifecycle
- initialize(onProgress?): loads VAD model, safe to call multiple times.
- start(): requests mic and starts frame processing.
- pause(): pauses VAD processing.
- resume(): resumes VAD processing.
- stop(): stops mic + processing, keeps model loaded.
- destroy(): full cleanup (mic + model + listeners).
Events (single-listener setters)
- onRecord((blob) => void)
- onSpeechStart(() => void)
- onSpeechEnd(() => void)
- onReady(() => void)
- onError((error) => void)
- onVolumeChange((db) => void)
- onSpeechProbability((p) => void)
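The heading describes these as single-listener setters. One plausible reading, stated here as an assumption rather than documented behavior, is that each setter stores a single callback and a later call replaces the earlier one. The SingleListener class below is a hypothetical stand-in that sketches that pattern; it is not taken from the library.

```typescript
// Hypothetical stand-in for one event slot; not vad-recorder's code.
// Assumption: setting a listener replaces any previously set listener.
class SingleListener<T> {
  private listener?: (value: T) => void;

  // Store the callback, discarding whichever one was set before.
  set(listener: (value: T) => void): void {
    this.listener = listener;
  }

  // Invoke the current callback, if any.
  emit(value: T): void {
    this.listener?.(value);
  }
}
```

If the library does behave this way, code that needs several consumers for one event would have to fan out inside a single callback.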
Progress callback
initialize(onProgress) currently emits download progress from progress_total events only.
- Rounded to 2 decimals (0.00 to 100.00)
- Emitted only when the rounded value changes
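The "round, then emit only on change" behavior described above can be sketched as a small wrapper around a callback. This is an illustrative sketch only: createProgressEmitter is a hypothetical name, and the library's internal code may differ.

```typescript
// Illustrative sketch of "round to 2 decimals, emit only on change".
// Hypothetical helper; not taken from vad-recorder.
function createProgressEmitter(
  onProgress: (value: number) => void,
): (raw: number) => void {
  let last: number | undefined;
  return (raw: number): void => {
    const rounded = Math.round(raw * 100) / 100; // keep 2 decimals
    if (rounded !== last) {
      last = rounded;
      onProgress(rounded);
    }
  };
}
```

Deduplicating after rounding keeps a UI from re-rendering for changes too small to be visible.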
Development
npm install
npm run dev
Build for publish:
npm run build
Type-check:
npm run typecheck
Example apps
A minimal vanilla demo is included at examples/simple.
cd examples/simple
npm install
npm run dev
A React demo is included at examples/react.
cd examples/react
npm install
npm run dev
Notes
- Designed for browser environments.
- Sample rate is fixed at 16000 (Silero VAD requirement).
- Channel count is fixed at mono (1).
- Current recording output is WAV blobs (audio/wav) for deterministic PCM assembly.
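To show what assembling a WAV file from PCM involves, here is a minimal sketch that wraps mono samples in a standard 44-byte RIFF/WAVE header. It is not vad-recorder's implementation: encodeWav is a hypothetical function, and the 16-bit sample depth is an assumption, since the notes above only fix the sample rate and channel count.

```typescript
// Illustrative sketch, not vad-recorder's implementation: wraps mono
// Float32 samples in a 16-bit PCM WAV container. The 16-bit depth is an
// assumption; sample rate and channel count follow the notes above.
const SAMPLE_RATE = 16000;
const CHANNELS = 1;

function encodeWav(samples: Float32Array): ArrayBuffer {
  const bytesPerSample = 2;
  const dataSize = samples.length * bytesPerSample;
  const buffer = new ArrayBuffer(44 + dataSize);
  const view = new DataView(buffer);
  const writeAscii = (offset: number, text: string): void => {
    for (let i = 0; i < text.length; i++) {
      view.setUint8(offset + i, text.charCodeAt(i));
    }
  };

  writeAscii(0, "RIFF");
  view.setUint32(4, 36 + dataSize, true); // file size minus 8 bytes
  writeAscii(8, "WAVE");
  writeAscii(12, "fmt ");
  view.setUint32(16, 16, true);           // fmt chunk size
  view.setUint16(20, 1, true);            // format 1 = PCM
  view.setUint16(22, CHANNELS, true);
  view.setUint32(24, SAMPLE_RATE, true);
  view.setUint32(28, SAMPLE_RATE * CHANNELS * bytesPerSample, true); // byte rate
  view.setUint16(32, CHANNELS * bytesPerSample, true);               // block align
  view.setUint16(34, 8 * bytesPerSample, true);                      // bits per sample
  writeAscii(36, "data");
  view.setUint32(40, dataSize, true);

  samples.forEach((sample, i) => {
    // Clamp to [-1, 1], then scale to the signed 16-bit range.
    const clamped = Math.max(-1, Math.min(1, sample));
    const scaled = clamped < 0 ? clamped * 0x8000 : clamped * 0x7fff;
    view.setInt16(44 + i * bytesPerSample, Math.round(scaled), true);
  });
  return buffer;
}
```

In a browser, the result could be turned into a blob with new Blob([encodeWav(samples)], { type: "audio/wav" }).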
