@jaehyun-ko/speaker-verification

v5.0.0

Real-time speaker verification in the browser using NeXt-TDNN models

NeXt-TDNN Speaker Verification for Web

Real-time speaker verification in the browser using NeXt-TDNN models. Compare two audio samples to determine if they're from the same speaker.

🎯 Live Demo

Try it now: https://jaehyun-ko.github.io/node-speaker-verification/

Simple and intuitive speaker verification:

  • 🎤 Record audio directly from microphone
  • 📁 Upload audio files
  • 🔍 Get similarity score instantly

🚀 Quick Start (Simple API)

API Methods

  • initialize(model, options?) - Initialize with a model
  • compareAudio(audio1, audio2) - Compare two audio samples
  • getEmbedding(audio) - Extract speaker embedding from audio
  • compareEmbeddings(embedding1, embedding2) - Compare pre-computed embeddings
  • cleanup() - Release resources
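
None of the examples in this README call cleanup(). One way to make sure resources are released even when a comparison throws is to wrap the session in try/finally. The helper below is a sketch, not part of the library: `withVerifier` is a hypothetical name, and it relies only on the `initialize` and `cleanup` methods listed above. It awaits `cleanup()`, so it should behave the same whether that method is synchronous or returns a promise.

```javascript
// Hypothetical helper (not part of the library): runs `task` with an
// initialized verifier and calls cleanup() afterwards, even if `task` throws.
// If initialize() itself fails, cleanup() is not called.
async function withVerifier(verifier, model, task) {
    await verifier.initialize(model);
    try {
        return await task(verifier);
    } finally {
        // Awaiting is harmless if cleanup() turns out to be synchronous
        await verifier.cleanup();
    }
}

// Usage (inside an async function or an ES module):
// const result = await withVerifier(new SpeakerVerification(), 'standard-256',
//     (v) => v.compareAudio(audio1, audio2));
```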

CDN Usage (Simplest - Just 3 Lines!)

<!DOCTYPE html>
<html>
<head>
    <!-- IMPORTANT: Load ONNX Runtime first -->
    <script src="https://cdn.jsdelivr.net/npm/onnxruntime-web/dist/ort.min.js"></script>
    <script src="https://cdn.jsdelivr.net/npm/@jaehyun-ko/speaker-verification/dist/speaker-verification.js"></script>
</head>
<body>
    <input type="file" id="audio1" accept="audio/*">
    <input type="file" id="audio2" accept="audio/*">
    <button onclick="compareSpeakers()">Compare</button>
    
    <script>
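    // Create verifier instance
    const verifier = new SpeakerVerification();
    let ready = null; // Promise for the one-time model load
    
    async function compareSpeakers() {
        // 1. Initialize (runs once; later clicks reuse the same promise)
        ready = ready || verifier.initialize('standard-256');
        await ready;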
        
        // 2. Get audio files
        const file1 = document.getElementById('audio1').files[0];
        const file2 = document.getElementById('audio2').files[0];
        
        // 3. Compare! That's it!
        const result = await verifier.compareAudio(file1, file2);
        
        console.log('Similarity:', (result.similarity * 100).toFixed(1) + '%');
        console.log('Same speaker?', result.similarity > 0.5); // You decide the threshold!
    }
    </script>
</body>
</html>

NPM Installation

# Install both ONNX Runtime and the speaker verification library
npm install onnxruntime-web @jaehyun-ko/speaker-verification

import * as ort from 'onnxruntime-web';
import { SpeakerVerification } from '@jaehyun-ko/speaker-verification';

// Optional: Configure ONNX Runtime WASM paths if needed
// (pin the URL to the onnxruntime-web version you installed)
// ort.env.wasm.wasmPaths = 'https://cdn.jsdelivr.net/npm/onnxruntime-web/dist/';

// Create instance
const verifier = new SpeakerVerification();

// Initialize with model (auto-downloads from Hugging Face)
await verifier.initialize('standard-256'); // or 'mobile-128' for smaller/faster

// Compare any audio format (File, Blob, ArrayBuffer, Float32Array)
const result = await verifier.compareAudio(audio1, audio2);

console.log(result);
// {
//   similarity: 0.92,      // 0.0 to 1.0 (higher = more similar)
//   processingTime: 523    // milliseconds
// }

// You decide what threshold to use
const isSameSpeaker = result.similarity > 0.5;  // Common threshold: 0.5

Available Models

// Standard models (best accuracy)
'standard-256'  // 28MB - Recommended
'standard-128'  // 7.5MB - Faster
'standard-192'  // 16MB
'standard-384'  // 32MB - Highest accuracy

// Mobile models (optimized for size/speed)
'mobile-128'    // 5MB - Smallest
'mobile-256'    // 20MB - Best mobile balance

📱 Microphone Recording

// With the simple API, just pass the recorded blob
const verifier = new SpeakerVerification();
await verifier.initialize('standard-256');

// Record audio using browser API
const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
const mediaRecorder = new MediaRecorder(stream);
const chunks = [];

mediaRecorder.ondataavailable = (e) => chunks.push(e.data);
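mediaRecorder.onstop = async () => {
    // Use the recorder's actual MIME type; not every browser records WebM
    const audioBlob = new Blob(chunks, { type: mediaRecorder.mimeType });
    stream.getTracks().forEach((track) => track.stop()); // Release the microphone
    
    // Compare with another audio sample (`anotherAudio` is supplied by your app)
    const result = await verifier.compareAudio(audioBlob, anotherAudio);
    console.log('Similarity:', result.similarity);
};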

mediaRecorder.start();
setTimeout(() => mediaRecorder.stop(), 3000); // Record for 3 seconds
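
The callback style above can also be wrapped in a promise, so a recording can be awaited like any other step. This is a sketch under stated assumptions: `recordFor` is a hypothetical helper, not part of the library, and it expects a standard `MediaRecorder` (or any object with the same `start`, `stop`, `state` and `mimeType` members).

```javascript
// Hypothetical helper: records from an existing MediaRecorder for
// `durationMs` milliseconds and resolves with the recording as a Blob.
function recordFor(mediaRecorder, durationMs) {
    return new Promise((resolve, reject) => {
        const chunks = [];
        mediaRecorder.ondataavailable = (e) => chunks.push(e.data);
        mediaRecorder.onerror = (e) => reject(e.error || new Error('Recording failed'));
        mediaRecorder.onstop = () => {
            resolve(new Blob(chunks, { type: mediaRecorder.mimeType }));
        };
        mediaRecorder.start();
        setTimeout(() => {
            // Skip stop() if the recorder already ended (for example after an error)
            if (mediaRecorder.state === 'recording') mediaRecorder.stop();
        }, durationMs);
    });
}

// Usage (inside an async function or an ES module):
// const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
// const audioBlob = await recordFor(new MediaRecorder(stream), 3000);
// const result = await verifier.compareAudio(audioBlob, anotherAudio);
```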

🎛️ Available Models

All models are hosted on Hugging Face.

Simple API Model Keys

| Key | Size | Channels | Description |
|-----|------|----------|-------------|
| standard-256 | 28MB | 256 | Recommended - Best balance |
| standard-128 | 7.5MB | 128 | Compact, faster processing |
| standard-192 | 16MB | 192 | Medium size and accuracy |
| standard-384 | 32MB | 384 | Highest accuracy |
| mobile-128 | 5MB | 128 | Smallest, mobile-optimized |
| mobile-256 | 20MB | 256 | Best mobile balance |

Full Model Names (for advanced usage)

| Model | Size | Description |
|-------|------|-------------|
| NeXt_TDNN_C256_B3_K65_7_cosine | 28MB | Standard 256-channel |
| NeXt_TDNN_C128_B3_K65_7_cosine | 7.5MB | Compact 128-channel |
| NeXt_TDNN_C192_B1_K65_7_cosine | 16MB | Medium 192-channel |
| NeXt_TDNN_C384_B1_K65_7_cosine | 32MB | Large 384-channel |
| NeXt_TDNN_light_C128_B3_K65_7_cosine | 5MB | Mobile 128-channel |
| NeXt_TDNN_light_C256_B3_K65_7_cosine | 20MB | Mobile 256-channel |
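
The two tables can be read together as a lookup from simple key to full model name. The sketch below pairs them by matching sizes and channel counts; that pairing is inferred from the tables rather than stated by them. It also adds a hypothetical `pickModelKey` helper that returns the largest listed model fitting a download budget. Treating a larger model as the better choice is an assumption: the tables only say that `standard-384` has the highest accuracy.

```javascript
// Sizes and names copied from the tables above; the key-to-name pairing is
// inferred from matching sizes and channel counts.
const MODELS = [
    { key: 'standard-256', sizeMB: 28,  fullName: 'NeXt_TDNN_C256_B3_K65_7_cosine' },
    { key: 'standard-128', sizeMB: 7.5, fullName: 'NeXt_TDNN_C128_B3_K65_7_cosine' },
    { key: 'standard-192', sizeMB: 16,  fullName: 'NeXt_TDNN_C192_B1_K65_7_cosine' },
    { key: 'standard-384', sizeMB: 32,  fullName: 'NeXt_TDNN_C384_B1_K65_7_cosine' },
    { key: 'mobile-128',   sizeMB: 5,   fullName: 'NeXt_TDNN_light_C128_B3_K65_7_cosine' },
    { key: 'mobile-256',   sizeMB: 20,  fullName: 'NeXt_TDNN_light_C256_B3_K65_7_cosine' },
];

// Hypothetical helper: largest model whose download fits the budget,
// optionally restricted to keys starting with `family` ('standard' or 'mobile').
// Returns null when nothing fits.
function pickModelKey(maxSizeMB, family = '') {
    const candidates = MODELS
        .filter((m) => m.key.startsWith(family) && m.sizeMB <= maxSizeMB)
        .sort((a, b) => b.sizeMB - a.sizeMB);
    return candidates.length > 0 ? candidates[0].key : null;
}

console.log(pickModelKey(10));           // standard-128
console.log(pickModelKey(25, 'mobile')); // mobile-256
console.log(pickModelKey(4));            // null
```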

📊 Understanding Results

  • Similarity Score: 0.0 to 1.0 (higher = more similar)
  • Recommended Threshold: 0.5
  • Adjust the threshold based on your needs:
    • Higher threshold (0.7 and above) = Stricter, fewer false positives
    • Lower threshold (0.3 and below) = More permissive, fewer false negatives
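
To make the trade-off concrete, the sketch below counts false accepts and false rejects for a handful of scored pairs at three thresholds. The scores and labels are made-up illustration data, not measurements from any model, and `errorCounts` is a hypothetical helper.

```javascript
// Made-up example scores; `same` is the ground truth for each pair.
const scoredPairs = [
    { similarity: 0.91, same: true },
    { similarity: 0.62, same: true },
    { similarity: 0.41, same: true },
    { similarity: 0.55, same: false },
    { similarity: 0.28, same: false },
    { similarity: 0.12, same: false },
];

// Hypothetical helper: counts both error types at a given threshold,
// using the same `similarity > threshold` rule as the examples above.
function errorCounts(pairs, threshold) {
    let falseAccepts = 0; // different speakers judged "same"
    let falseRejects = 0; // same speaker judged "different"
    for (const { similarity, same } of pairs) {
        const accepted = similarity > threshold;
        if (accepted && !same) falseAccepts++;
        if (!accepted && same) falseRejects++;
    }
    return { falseAccepts, falseRejects };
}

for (const t of [0.3, 0.5, 0.7]) {
    console.log(t, errorCounts(scoredPairs, t));
}
// 0.3 { falseAccepts: 1, falseRejects: 0 }
// 0.5 { falseAccepts: 1, falseRejects: 1 }
// 0.7 { falseAccepts: 0, falseRejects: 2 }
```

Running a count like this over your own labeled recordings is one way to choose a threshold instead of relying on the 0.5 default.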

🛠️ Advanced Usage

Custom Model Loading with Simple API

// Load custom model from ArrayBuffer
const modelData = await fetch('path/to/custom-model.onnx').then(r => r.arrayBuffer());
const verifier = new SpeakerVerification();
await verifier.initialize('standard-256', { modelData });

// Or disable caching for development
await verifier.initialize('standard-256', { cacheModel: false });
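
The `fetch` call above does not check the HTTP status, so an error page could be handed to `initialize` as if it were model bytes. A minimal sketch of a stricter loader follows. `loadModelBytes` is a hypothetical helper, and its `fetchImpl` parameter exists only so the function can be exercised without a network.

```javascript
// Hypothetical helper: downloads a model file and fails loudly on HTTP errors
// or an empty body instead of passing bad bytes along.
async function loadModelBytes(url, fetchImpl = fetch) {
    const response = await fetchImpl(url);
    if (!response.ok) {
        throw new Error(`Model download failed: ${response.status} for ${url}`);
    }
    const bytes = await response.arrayBuffer();
    if (bytes.byteLength === 0) {
        throw new Error(`Model download returned an empty body for ${url}`);
    }
    return bytes;
}

// Usage with the `modelData` option shown above:
// const modelData = await loadModelBytes('path/to/custom-model.onnx');
// await verifier.initialize('standard-256', { modelData });
```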

Batch Processing

const verifier = new SpeakerVerification();
await verifier.initialize('standard-256');

// Compare multiple audio pairs
const results = [];
for (let i = 0; i < audioFiles.length - 1; i++) {
    const result = await verifier.compareAudio(audioFiles[i], audioFiles[i + 1]);
    results.push(result);
}

// Get average similarity
const avgSimilarity = results.reduce((sum, r) => sum + r.similarity, 0) / results.length;
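
The loop above compares each file only with its neighbour. If every pair matters, for example to check that a set of recordings all belong to one speaker, the pairs can be enumerated explicitly. This is a sketch: `compareAllPairs` is a hypothetical helper that accepts any `compare(a, b)` function returning an object with a `similarity` field, such as `(a, b) => verifier.compareAudio(a, b)`.

```javascript
// Hypothetical helper: runs `compare` on every unordered pair (i < j) and
// summarizes the similarity scores. n items produce n * (n - 1) / 2 pairs.
async function compareAllPairs(items, compare) {
    const pairs = [];
    for (let i = 0; i < items.length; i++) {
        for (let j = i + 1; j < items.length; j++) {
            const { similarity } = await compare(items[i], items[j]);
            pairs.push({ i, j, similarity });
        }
    }
    if (pairs.length === 0) {
        return { pairs, min: null, mean: null }; // Fewer than two items
    }
    const scores = pairs.map((p) => p.similarity);
    const mean = scores.reduce((sum, s) => sum + s, 0) / scores.length;
    return { pairs, min: Math.min(...scores), mean };
}

// Usage (inside an async function or an ES module):
// const summary = await compareAllPairs(audioFiles,
//     (a, b) => verifier.compareAudio(a, b));
```

The minimum is often more informative than the mean here, since a single low score is enough to flag a recording that may come from a different speaker.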

Working with Embeddings

You can extract and compare speaker embeddings directly:

const verifier = new SpeakerVerification();
await verifier.initialize('standard-256');

// Extract embeddings from audio
const embedding1 = await verifier.getEmbedding(audio1);
const embedding2 = await verifier.getEmbedding(audio2);

console.log('Embedding 1:', embedding1);
// {
//   embedding: Float32Array(192),  // Normalized speaker vector
//   processingTime: 245            // milliseconds
// }

// Compare pre-computed embeddings
const similarity = verifier.compareEmbeddings(embedding1.embedding, embedding2.embedding);
console.log('Similarity:', similarity); // 0.0 to 1.0

// Store embeddings for later use
const embeddingData = Array.from(embedding1.embedding); // Convert to regular array for storage
localStorage.setItem('speaker1', JSON.stringify(embeddingData));

// Load and use stored embeddings
const storedData = JSON.parse(localStorage.getItem('speaker1'));
const storedEmbedding = new Float32Array(storedData);
const similarity2 = verifier.compareEmbeddings(storedEmbedding, embedding2.embedding);

This is useful for:

  • Building speaker databases
  • Caching embeddings for performance
  • Analyzing speaker characteristics
  • Custom similarity metrics
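
As one illustration of the first and last items, the sketch below keeps enrolled embeddings in a `Map` and returns the best-scoring speaker for a query embedding. Everything in it is an assumption made for illustration: `cosineSimilarity` and `identify` are hypothetical helpers, and plain cosine similarity ranges from -1 to 1, so its scores may not match what `compareEmbeddings` returns. To score with the library instead, you could pass `(a, b) => verifier.compareEmbeddings(a, b)` as the `score` argument.

```javascript
// Plain cosine similarity between two equal-length vectors (range -1 to 1).
function cosineSimilarity(a, b) {
    if (a.length !== b.length) {
        throw new Error(`Dimension mismatch: ${a.length} vs ${b.length}`);
    }
    let dot = 0, normA = 0, normB = 0;
    for (let i = 0; i < a.length; i++) {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical helper: finds the enrolled speaker with the highest score.
// `database` is a Map of name -> Float32Array. Returns null if no speaker
// scores above `threshold`.
function identify(database, query, threshold = 0.5, score = cosineSimilarity) {
    let best = null;
    for (const [name, embedding] of database) {
        const similarity = score(embedding, query);
        if (similarity > threshold && (best === null || similarity > best.similarity)) {
            best = { name, similarity };
        }
    }
    return best;
}

// Toy 3-dimensional vectors; real embeddings come from getEmbedding()
const database = new Map([
    ['alice', new Float32Array([1, 0, 0])],
    ['bob', new Float32Array([0, 1, 0])],
]);
console.log(identify(database, new Float32Array([0.9, 0.1, 0])).name); // alice
console.log(identify(database, new Float32Array([0, 0, 1])));          // null
```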

📝 License

Apache License 2.0

🤝 Credits

Based on NeXt-TDNN architecture for speaker verification.

📚 Citation

If you use this library in your research, please cite:

@INPROCEEDINGS{10447037,
  author={Heo, Hyun-Jun and Shin, Ui-Hyeop and Lee, Ran and Cheon, YoungJu and Park, Hyung-Min},
  booktitle={ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)}, 
  title={NeXt-TDNN: Modernizing Multi-Scale Temporal Convolution Backbone for Speaker Verification}, 
  year={2024},
  volume={},
  number={},
  pages={11186-11190},
  keywords={Convolution;Speech recognition;Transformers;Acoustics;Task analysis;Speech processing;speaker recognition;speaker verification;TDNN;ConvNeXt;multi-scale},
  doi={10.1109/ICASSP48485.2024.10447037}}