vadscore

v1.0.0

Published

8 months ago

A high-accuracy Voice Activity Detection (VAD) library for React with real-time scoring and speech detection

0High
0Medium
0Low

husnainizhar

vad voice-activity-detection speech-detection audio react typescript real-time

VADScore 🎤

A high-accuracy Voice Activity Detection (VAD) library for React with real-time scoring and speech detection.

Features

✨ High Accuracy: Advanced algorithm combining multiple audio features (energy, zero-crossing rate, spectral centroid)
🎯 Real-time: Low-latency processing for live audio streams
⚛️ React Integration: Easy-to-use React hook with TypeScript support
🔧 Configurable: Customizable thresholds and parameters for different use cases
📊 Audio Metrics: Detailed audio analysis data for visualization
🎛️ Hysteresis: Prevents rapid state changes for stable detection
🔇 Noise Handling: Built-in noise suppression and echo cancellation support

Installation

npm install vadscore

Quick Start

Basic React Hook Usage

import React from 'react';
import { useVAD } from 'vadscore';

function VoiceDetector() {
  const {
    vadScore,
    isSpeaking,
    isListening,
    startListening,
    stopListening,
    error
  } = useVAD({
    onVADScore: (result) => {
      console.log('VAD Score:', result.score);
    },
    onSpeakingChange: (speaking) => {
      console.log('Speaking:', speaking);
    }
  });

  return (
    <div>
      <h2>Voice Activity Detection</h2>
      
      <div>
        <button onClick={startListening} disabled={isListening}>
          Start Listening
        </button>
        <button onClick={stopListening} disabled={!isListening}>
          Stop Listening
        </button>
      </div>

      <div>
        <p>VAD Score: {vadScore.toFixed(3)}</p>
        <p>Speaking: {isSpeaking ? 'Yes' : 'No'}</p>
        <p>Status: {isListening ? 'Listening' : 'Stopped'}</p>
        {error && <p style={{color: 'red'}}>Error: {error}</p>}
      </div>
    </div>
  );
}

export default VoiceDetector;

Advanced Configuration

import { useVAD } from 'vadscore';

function AdvancedVAD() {
  const { vadScore, isSpeaking, audioLevels } = useVAD({
    // VAD algorithm configuration
    energyThreshold: 0.02,
    zcrThreshold: 0.25,
    spectralCentroidThreshold: 1200,
    smoothingFactor: 0.85,
    minSpeechFrames: 5,
    minSilenceFrames: 8,
    
    // Audio settings
    sampleRate: 16000,
    frameSize: 512,
    
    // Browser audio constraints
    audioConstraints: {
      echoCancellation: true,
      noiseSuppression: true,
      autoGainControl: false,
      sampleRate: 16000
    },
    
    // Auto-start detection
    autoStart: true,
    
    // Callbacks
    onVADScore: (result) => {
      // Handle real-time VAD results
      console.log('Detailed VAD result:', result);
    },
    onSpeakingChange: (speaking) => {
      // Handle speaking state changes
      if (speaking) {
        console.log('Speech started');
      } else {
        console.log('Speech stopped');
      }
    }
  });

  return (
    <div>
      <h3>Advanced VAD with Audio Visualization</h3>
      
      {/* VAD Score Meter */}
      <div style={{ width: '100%', background: '#eee', height: '20px' }}>
        <div 
          style={{ 
            width: `${vadScore * 100}%`, 
            background: isSpeaking ? '#4CAF50' : '#2196F3',
            height: '100%',
            transition: 'all 0.1s'
          }} 
        />
      </div>
      
      {/* Audio Levels */}
      <div>
        <p>Energy: {audioLevels.energy.toFixed(4)}</p>
        <p>Zero Crossing Rate: {audioLevels.zcr.toFixed(4)}</p>
        <p>Spectral Centroid: {audioLevels.spectralCentroid.toFixed(2)} Hz</p>
      </div>
    </div>
  );
}

Standalone VAD Processing

import { calculateVADScore, processAudioBuffer, VADCore } from 'vadscore';

// Simple one-time VAD score calculation
function processAudioFrame(audioData: Float32Array) {
  const score = calculateVADScore(audioData, {
    energyThreshold: 0.015,
    zcrThreshold: 0.3
  });
  console.log('VAD Score:', score);
}

// Batch processing of audio buffer
function processEntireAudio(audioBuffer: Float32Array) {
  const scores = processAudioBuffer(audioBuffer, 512, {
    sampleRate: 16000,
    energyThreshold: 0.01
  });
  console.log('VAD Scores:', scores);
}

// Advanced usage with VADCore class
function advancedProcessing() {
  const vadCore = new VADCore({
    sampleRate: 16000,
    frameSize: 512,
    energyThreshold: 0.02
  });

  // Process multiple frames maintaining state
  const audioFrames: Float32Array[] = []; // Your audio frames
  
  audioFrames.forEach(frame => {
    const result = vadCore.processFrame(frame);
    console.log('VAD Result:', {
      score: result.score,
      isSpeaking: result.isSpeaking,
      energy: result.energy,
      zcr: result.zcr,
      spectralCentroid: result.spectralCentroid
    });
  });
}

API Reference

`useVAD(options?)` Hook

The main React hook for voice activity detection.

Parameters

| Parameter | Type | Default | Description | |-----------|------|---------|-------------| | autoStart | boolean | false | Start VAD automatically when component mounts | | onVADScore | (result: VADResult) => void | - | Callback fired on each VAD score update | | onSpeakingChange | (isSpeaking: boolean) => void | - | Callback fired when speaking state changes | | energyThreshold | number | 0.01 | Energy threshold for voice detection | | zcrThreshold | number | 0.3 | Zero crossing rate threshold | | spectralCentroidThreshold | number | 1000 | Spectral centroid threshold (Hz) | | smoothingFactor | number | 0.8 | Temporal smoothing factor (0-1) | | minSpeechFrames | number | 3 | Minimum frames for speech detection | | minSilenceFrames | number | 5 | Minimum frames for silence detection | | sampleRate | number | 16000 | Audio sample rate | | frameSize | number | 512 | Audio frame size in samples | | audioConstraints | MediaTrackConstraints | {echoCancellation: true, noiseSuppression: true} | Browser audio constraints |

Returns

| Property | Type | Description | |----------|------|-------------| | vadScore | number | Current VAD score (0-1) | | isSpeaking | boolean | Whether speech is currently detected | | isListening | boolean | Whether VAD is currently active | | startListening | () => Promise<void> | Start voice activity detection | | stopListening | () => void | Stop voice activity detection | | stream | MediaStream \| null | Current audio stream | | error | string \| null | Error message if any | | audioLevels | {energy: number, zcr: number, spectralCentroid: number} | Current audio analysis metrics |

`VADCore` Class

Low-level VAD processing class for advanced usage.

const vadCore = new VADCore({
  sampleRate: 16000,
  frameSize: 512,
  energyThreshold: 0.01
});

const result = vadCore.processFrame(audioData);

Utility Functions

calculateVADScore(audioData, config?): Calculate VAD score for a single frame
processAudioBuffer(audioBuffer, frameSize?, config?): Process entire audio buffer
resampleAudio(audioData, originalRate, targetRate): Resample audio data
preEmphasis(audioData, alpha?): Apply pre-emphasis filter
highPassFilter(audioData, cutoff?): Apply high-pass filter

Algorithm Details

VADScore uses a sophisticated multi-feature approach:

Energy Analysis: RMS energy calculation with adaptive thresholding
Zero Crossing Rate: Speech vs. noise discrimination
Spectral Centroid: Frequency domain analysis for voice characteristics
Temporal Smoothing: Reduces false positives from transient noise
Hysteresis: Prevents rapid state changes for stable detection

The algorithm combines these features using weighted scoring and applies temporal consistency checks to achieve high accuracy across various acoustic conditions.

Browser Compatibility

Chrome 66+
Firefox 60+
Safari 11.1+
Edge 79+

Requires MediaDevices.getUserMedia() and Web Audio API support.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

Changelog

v1.0.0

Initial release
React hook implementation
Multi-feature VAD algorithm
TypeScript support
Comprehensive documentation

Published

Vulnerabilities

Links

Maintainers

Keywords

Readme

VADScore 🎤

Features

Installation

Quick Start

Basic React Hook Usage

Advanced Configuration

Standalone VAD Processing

API Reference

useVAD(options?) Hook

Parameters

Returns

VADCore Class

Utility Functions

Algorithm Details

Browser Compatibility

Contributing

License

Changelog

v1.0.0

`useVAD(options?)` Hook

`VADCore` Class