vad-recorder

v0.1.3

Published

2 months ago

TypeScript library

Downloads

114


nico_martin

vad-recorder is a browser-focused TypeScript library that combines voice activity detection (VAD) with automatic audio segment recording.

Under the hood, it runs Silero VAD through @huggingface/transformers (Transformers.js), using the onnx-community/silero-vad model from Hugging Face.

Install

npm install vad-recorder

Quick start

import { VadRecorder } from "vad-recorder";

// Check whether the model is already cached and how large the download is.
const info = await VadRecorder.info();
console.log(info.isCached, info.downloadSize);

// All options are optional; durations are in milliseconds.
const recorder = new VadRecorder({
  threshold: 0.55,
  minSpeechDuration: 250,
  minSilenceDuration: 900,
  prependSilence: 120,
  appendSilence: 300,
});

// Register listeners before starting; onRecord receives each finished segment.
recorder.onReady(() => console.log("Listening..."));
recorder.onSpeechStart(() => console.log("Speech start"));
recorder.onSpeechEnd(() => console.log("Speech end"));
recorder.onRecord((blob) => console.log("Recorded blob", blob));
recorder.onError((err) => console.error(err));

// Load the VAD model, reporting download progress along the way.
await recorder.initialize((event) => {
  if (event.status === "downloading") {
    console.log(`Model download: ${Math.round(event.progress * 100)}%`);
  }
});

// Request the microphone and begin processing frames.
await recorder.start();

React hook

For React projects, use the hook export:

import { useVadRecorder } from "vad-recorder/react";

function App() {
  const {
    status,
    progress,
    recordings,
    error,
    initialize,
    start,
    stop,
    pause,
    resume,
    clearRecordings,
  } = useVadRecorder({
    threshold: 0.55,
    minSpeechDuration: 250,
    minSilenceDuration: 900,
    prependSilence: 120,
    appendSilence: 300,
  });

  return (
    <div>
      <p>Status: {status}</p>
      <p>Download: {Math.round(progress * 100)}%</p>
      <button onClick={() => void initialize()}>Initialize</button>
      <button onClick={() => void start()}>Start</button>
      <button onClick={pause}>Pause</button>
      <button onClick={resume}>Resume</button>
      <button onClick={stop}>Stop</button>
      <button onClick={clearRecordings}>Clear</button>
      <p>Recordings: {recordings.length}</p>
      {error ? <pre>{error.message}</pre> : null}
    </div>
  );
}

useVadRecorder(options?) returns:

status, progress, volumeDb, speechProbability
recordings, error, recorder
initialize, start, stop, pause, resume, destroy, clearRecordings, info

API

`VadRecorder.info(): Promise<{ isCached: boolean; downloadSize: number }>`

Returns metadata about the model cache and download size:

isCached: whether required model files are cached.
downloadSize: sum of all model file sizes (bytes).
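
One way to use this metadata is to tell users what to expect before a first-time download. The sketch below is illustrative only: `ModelInfo` mirrors the documented return shape, while `formatBytes` and `describeDownload` are hypothetical helpers that are not part of vad-recorder.

```typescript
// Shape documented for VadRecorder.info(); redeclared so the sketch is self-contained.
interface ModelInfo {
  isCached: boolean;
  downloadSize: number; // bytes
}

// Hypothetical helper: format a byte count using binary units.
function formatBytes(bytes: number): string {
  const units = ["B", "KiB", "MiB", "GiB"];
  let value = bytes;
  let unit = 0;
  while (value >= 1024 && unit < units.length - 1) {
    value /= 1024;
    unit += 1;
  }
  // Whole bytes need no decimals; larger units get one.
  return `${unit === 0 ? value : value.toFixed(1)} ${units[unit]}`;
}

// Hypothetical helper: build a message to show before calling initialize().
function describeDownload(info: ModelInfo): string {
  return info.isCached
    ? "Model is cached; no download needed."
    : `Model will be downloaded (${formatBytes(info.downloadSize)}).`;
}
```

In an app, the result of `await VadRecorder.info()` could be passed straight to `describeDownload`.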

`new VadRecorder(options?)`

All options are optional:

threshold (default 0.5)
- Speech probability cutoff (0-1).
- Higher = stricter detection (fewer false positives, can miss quiet speech).
- Lower = more sensitive (captures quiet speech, can trigger on noise).
minSpeechDuration (ms, default 250)
- Minimum continuous speech before a segment officially starts.
- Helps filter clicks, breaths, and very short noises.
minSilenceDuration (ms, default 1000)
- Required silence before a segment is considered finished.
- Increase to avoid splitting natural pauses mid-sentence.
prependSilence (ms, default 100)
- Audio kept from before the detected speech onset, so the first phonemes are not clipped.
- Internally combined with minSpeechDuration in the rolling pre-buffer.
appendSilence (ms, default 300)
- Extra audio kept after speech end is detected.
- Helps avoid cutting off trailing words/syllables.
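
To make the interplay of threshold, minSpeechDuration, and minSilenceDuration concrete, here is a simplified sketch of a segmenter that walks per-frame speech probabilities. It is not the library's implementation: the `segmentFrames` function, the fixed frame duration, the `>=` comparison, and the event shape are all assumptions for illustration, and the prepend/append padding is left out.

```typescript
interface SegmenterOptions {
  threshold: number;          // speech probability cutoff (0-1)
  minSpeechDuration: number;  // ms of continuous speech before a segment starts
  minSilenceDuration: number; // ms of continuous silence before a segment ends
}

type SegmentEvent = { type: "speechStart" | "speechEnd"; timeMs: number };

// Hypothetical helper: report where segments start and end.
// frameMs is the assumed duration of one VAD frame.
function segmentFrames(
  probabilities: number[],
  frameMs: number,
  options: SegmenterOptions,
): SegmentEvent[] {
  const events: SegmentEvent[] = [];
  let inSpeech = false;
  let speechRunMs = 0;  // continuous speech seen while waiting for a start
  let silenceRunMs = 0; // continuous silence seen while inside a segment

  probabilities.forEach((p, index) => {
    const isSpeech = p >= options.threshold;
    const frameEndMs = (index + 1) * frameMs;

    if (!inSpeech) {
      // A non-speech frame resets the run, so short clicks never start a segment.
      speechRunMs = isSpeech ? speechRunMs + frameMs : 0;
      if (speechRunMs >= options.minSpeechDuration) {
        inSpeech = true;
        silenceRunMs = 0;
        events.push({ type: "speechStart", timeMs: frameEndMs - speechRunMs });
      }
    } else {
      // A speech frame resets the run, so short pauses do not split the segment.
      silenceRunMs = isSpeech ? 0 : silenceRunMs + frameMs;
      if (silenceRunMs >= options.minSilenceDuration) {
        inSpeech = false;
        speechRunMs = 0;
        events.push({ type: "speechEnd", timeMs: frameEndMs - silenceRunMs });
      }
    }
  });

  return events;
}
```

In this model, raising minSilenceDuration delays the end event, which is why longer values keep natural pauses inside one segment.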

Lifecycle

initialize(onProgress?): loads the VAD model; safe to call multiple times.
start(): requests microphone access and starts frame processing.
pause(): pauses VAD processing.
resume(): resumes VAD processing.
stop(): stops the microphone and processing, but keeps the model loaded.
destroy(): performs full cleanup (microphone, model, and listeners).
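
As an illustration of how these calls fit together, the sketch below records for a fixed time and always cleans up afterwards. `LifecycleLike` and `recordFor` are hypothetical names introduced here. The return types are assumptions: initialize() and start() are awaited in the Quick start, while stop() and destroy() are typed loosely because the README does not say whether they are asynchronous.

```typescript
// Structural stand-in for the lifecycle methods listed above (assumed signatures).
interface LifecycleLike {
  initialize(): Promise<unknown>;
  start(): Promise<unknown>;
  stop(): unknown;
  destroy(): unknown;
}

// Hypothetical helper: record for a fixed time, then always release mic and model.
async function recordFor(recorder: LifecycleLike, durationMs: number): Promise<void> {
  try {
    await recorder.initialize(); // load the model (documented as safe to repeat)
    await recorder.start();      // request the mic and begin processing
    await new Promise<void>((resolve) => setTimeout(resolve, durationMs));
    await recorder.stop();       // stop the mic, keep the model loaded
  } finally {
    await recorder.destroy();    // full cleanup, even if a step above threw
  }
}
```

If the recorder should be reused later, the `destroy()` call can be dropped in favor of `stop()` alone, since stop() keeps the model loaded.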

Events (single-listener setters)

onRecord((blob) => void)
onSpeechStart(() => void)
onSpeechEnd(() => void)
onReady(() => void)
onError((error) => void)
onVolumeChange((db) => void)
onSpeechProbability((p) => void)
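
The heading calls these single-listener setters. Assuming that means each call stores one callback and replaces any earlier one (the README does not spell this out), the pattern can be sketched as follows. `SingleListenerEmitter` and `combineListeners` are hypothetical and are not the library's own code.

```typescript
type SpeechListener = () => void;

// Hypothetical stand-in showing the single-listener setter pattern.
class SingleListenerEmitter {
  private speechStartListener: SpeechListener | null = null;

  // Setter-style registration: the latest callback replaces the previous one.
  onSpeechStart(listener: SpeechListener): void {
    this.speechStartListener = listener;
  }

  // Called when speech is detected; invokes at most one listener.
  emitSpeechStart(): void {
    this.speechStartListener?.();
  }
}

// To fan out to several handlers under this assumption, register one dispatcher.
function combineListeners(...listeners: SpeechListener[]): SpeechListener {
  return () => listeners.forEach((listener) => listener());
}
```

Under this assumption, code that needs several handlers for one event would pass a single combined function, for example `recorder.onSpeechStart(combineListeners(updateUi, logEvent))` with handler names of your own.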

Progress callback

initialize(onProgress) currently emits download progress from progress_total events only. Each progress value is:

- Rounded to 2 decimals (0.00 to 100.00)
- Emitted only when the rounded value changes
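
The rounding and de-duplication described above can be sketched as a small wrapper. This is an illustration under assumptions, not the library's code: `createProgressReporter` is a hypothetical helper, and it is deliberately agnostic about the scale of the raw value it receives.

```typescript
// Hypothetical helper: wrap a callback so it only fires when the value,
// rounded to 2 decimals, differs from the last value it reported.
function createProgressReporter(emit: (value: number) => void): (raw: number) => void {
  let lastReported: number | null = null;
  return (raw: number): void => {
    const rounded = Math.round(raw * 100) / 100;
    if (rounded !== lastReported) {
      lastReported = rounded;
      emit(rounded);
    }
  };
}
```

Throttling this way keeps UI updates cheap when the underlying download fires many events with nearly identical values.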

Development

npm install
npm run dev

Build for publish:

npm run build

Type-check:

npm run typecheck

Example apps

A minimal vanilla demo is included at examples/simple.

cd examples/simple
npm install
npm run dev

A React demo is included at examples/react.

cd examples/react
npm install
npm run dev

Notes

Designed for browser environments.
Sample rate is fixed at 16000 (Silero VAD requirement).
Channel count is fixed at mono (1).
Recordings are currently emitted as WAV blobs (audio/wav), which allows deterministic PCM assembly.
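
For reference, assembling a WAV file from mono PCM at 16 kHz needs only a 44-byte header followed by the samples. The sketch below shows one common way to do this with 16-bit samples. `encodeWav16` is a hypothetical function and the 16-bit depth is an assumption, since the README does not say how the library encodes its blobs.

```typescript
const WAV_SAMPLE_RATE = 16000; // fixed rate from the notes above
const WAV_CHANNELS = 1;        // mono

// Hypothetical helper: encode float samples in [-1, 1] as a 16-bit PCM WAV file.
function encodeWav16(samples: Float32Array): ArrayBuffer {
  const bytesPerSample = 2;
  const dataSize = samples.length * bytesPerSample;
  const buffer = new ArrayBuffer(44 + dataSize);
  const view = new DataView(buffer);
  const writeAscii = (offset: number, text: string): void => {
    for (let i = 0; i < text.length; i++) view.setUint8(offset + i, text.charCodeAt(i));
  };

  writeAscii(0, "RIFF");
  view.setUint32(4, 36 + dataSize, true); // file size minus the first 8 bytes
  writeAscii(8, "WAVE");
  writeAscii(12, "fmt ");
  view.setUint32(16, 16, true);           // fmt chunk size for PCM
  view.setUint16(20, 1, true);            // audio format 1 = linear PCM
  view.setUint16(22, WAV_CHANNELS, true);
  view.setUint32(24, WAV_SAMPLE_RATE, true);
  view.setUint32(28, WAV_SAMPLE_RATE * WAV_CHANNELS * bytesPerSample, true); // byte rate
  view.setUint16(32, WAV_CHANNELS * bytesPerSample, true);                   // block align
  view.setUint16(34, 8 * bytesPerSample, true);                              // bits per sample
  writeAscii(36, "data");
  view.setUint32(40, dataSize, true);

  samples.forEach((sample, i) => {
    const clamped = Math.max(-1, Math.min(1, sample));
    // Scale asymmetrically so -1 maps to -32768 and 1 maps to 32767.
    view.setInt16(44 + i * 2, clamped < 0 ? clamped * 0x8000 : clamped * 0x7fff, true);
  });
  return buffer;
}
```

In a browser, the result can be wrapped as `new Blob([encodeWav16(samples)], { type: "audio/wav" })`.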
