ten-vad-lib
v1.0.0
A JavaScript library for Ten VAD (Voice Activity Detection) based on WebAssembly
Ten VAD Library
A JavaScript library for Ten VAD (Voice Activity Detection) based on WebAssembly.
Installation
npm install ten-vad-lib
Usage
Basic Usage
import { NonRealTimeTenVAD } from 'ten-vad-lib';

async function processAudio() {
  const vad = await NonRealTimeTenVAD.new({
    hopSize: 256,
    voiceThreshold: 0.5,
    minSpeechDuration: 100,
    wasmPath: '/path/to/ten_vad.wasm',
    jsPath: '/path/to/ten_vad.js'
  });
  const audioData = new Float32Array(/* your audio data */);
  const sampleRate = 16000;
  const result = await vad.process(audioData, sampleRate);
  console.log('Speech segments:', result.speechSegments);
  console.log('Statistics:', result.statistics);
}
Streaming Usage
import { NonRealTimeTenVAD } from 'ten-vad-lib';

async function streamProcess() {
  const vad = await NonRealTimeTenVAD.new();
  const audioData = new Float32Array(/* your audio data */);
  for await (const segment of vad.run(audioData, 16000)) {
    console.log('Speech segment:', segment);
  }
}
WASM File Handling
The library includes WASM files that need to be accessible at runtime. Here are the supported scenarios:
1. NPM Package Usage (Recommended)
When using as an npm package, the WASM files are automatically included:
import { NonRealTimeTenVAD } from 'ten-vad-lib';
const vad = await NonRealTimeTenVAD.new();
The library will automatically detect the correct paths for WASM files.
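Where automatic detection does not apply (the CDN and local development scenarios below), both `wasmPath` and `jsPath` have to be passed and must point to the same directory. A minimal sketch of a helper that derives both from one base URL follows. `wasmPathsFrom` is a hypothetical name and is not part of ten-vad-lib; the file names `ten_vad.wasm` and `ten_vad.js` are taken from the examples in this README.

```javascript
// Hypothetical helper (not part of ten-vad-lib): build both path options
// from a single base URL so they cannot drift apart.
function wasmPathsFrom(baseUrl) {
  // Drop any trailing slashes so the joined paths contain exactly one.
  const base = baseUrl.replace(/\/+$/, '');
  return {
    wasmPath: `${base}/ten_vad.wasm`,
    jsPath: `${base}/ten_vad.js`,
  };
}
```

The result can be spread into the options object, for example `NonRealTimeTenVAD.new({ ...wasmPathsFrom('/wasm') })`.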
2. CDN Usage
If you're serving the library from a CDN, specify the WASM paths:
const vad = await NonRealTimeTenVAD.new({
  wasmPath: 'https://your-cdn.com/ten-vad-lib/wasm/ten_vad.wasm',
  jsPath: 'https://your-cdn.com/ten-vad-lib/wasm/ten_vad.js'
});
3. Local Development
For local development, place the WASM files in your public directory:
const vad = await NonRealTimeTenVAD.new({
  wasmPath: '/wasm/ten_vad.wasm',
  jsPath: '/wasm/ten_vad.js'
});
4. Custom Build
If you're building a custom version, copy the WASM files to your build output:
cp node_modules/ten-vad-lib/wasm/* public/wasm/
API Reference
NonRealTimeTenVAD
Constructor Options
interface TenVADOptions {
  hopSize?: number; // Default: 256
  voiceThreshold?: number; // Default: 0.5
  wasmPath?: string; // Path to WASM file
  jsPath?: string; // Path to JS file
  minSpeechDuration?: number; // Default: 100ms
  maxSilenceDuration?: number; // Default: 500ms
}
Methods
static new(options?: TenVADOptions): Promise<NonRealTimeTenVAD>
process(audio: Float32Array, sampleRate: number): Promise<TenVADResult>
run(audio: Float32Array, sampleRate: number): AsyncGenerator<TenVADSpeechData>
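Both `process` and `run` hand back speech segments, so a small post-processing step is often useful. The sketch below totals the speech time across segments, assuming each segment carries `start` and `end` in milliseconds as described under `TenVADSpeechData`. `summarizeSegments` is a hypothetical helper, not something the library provides.

```javascript
// Hypothetical helper (not part of ten-vad-lib): summarize segments that
// expose `start` and `end` timestamps in milliseconds.
function summarizeSegments(segments) {
  let totalSpeechMs = 0;
  let longestMs = 0;
  for (const { start, end } of segments) {
    const duration = end - start;
    totalSpeechMs += duration;
    longestMs = Math.max(longestMs, duration);
  }
  return { count: segments.length, totalSpeechMs, longestMs };
}
```

It would typically be called as `summarizeSegments(result.speechSegments)` on the value returned by `process`.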
TenVADResult
interface TenVADResult {
  speechSegments: TenVADSpeechData[];
  statistics: {
    totalFrames: number;
    voiceFrames: number;
    voicePercentage: number;
    processingTime: number;
    realTimeFactor: number;
  };
}
TenVADSpeechData
interface TenVADSpeechData {
  audio: Float32Array;
  start: number; // Start time in milliseconds
  end: number; // End time in milliseconds
  probability: number; // Voice probability (0-1)
}
Browser Support
- Chrome 57+
- Firefox 52+
- Safari 11+
- Edge 79+
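Because the library is built on WebAssembly, an application may want to check for it before creating a VAD instance. The following is a minimal feature-detection sketch. It assumes the version list above reflects WebAssembly availability, which this README does not state explicitly, and `supportsWebAssembly` is a hypothetical name rather than a library export.

```javascript
// Hypothetical check (not part of ten-vad-lib). The optional argument
// exists only so the function can be exercised with a stand-in object.
function supportsWebAssembly(
  wasm = typeof WebAssembly === 'undefined' ? undefined : WebAssembly
) {
  return (
    typeof wasm === 'object' &&
    wasm !== null &&
    typeof wasm.instantiate === 'function'
  );
}
```

Calling `supportsWebAssembly()` with no argument inspects the current environment; an application could show a fallback message when it returns `false`.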
License
MIT
