npm package discovery and stats viewer.

Discover Tips

  • General search

    [free text search, go nuts!]

  • Package details

    pkg:[package-name]

  • User packages

    @[username]

Sponsor

Optimize Toolset

I’ve always been into building performant and accessible sites, but lately I’ve been taking it extremely seriously. So much so that I’ve been building a tool to help me optimize and monitor the sites that I build to make sure that I’m making an attempt to offer the best experience to those who visit them. If you’re into performant, accessible and SEO friendly sites, you might like it too! You can check it out at Optimize Toolset.

About

Hi, 👋, I’m Ryan Hefner  and I built this site for me, and you! The goal of this site was to provide an easy way for me to check the stats on my npm packages, both for prioritizing issues and updates, and to give me a little kick in the pants to keep up on stuff.

As I was building it, I realized that I was actually using the tool to build the tool, and figured I might as well put this out there and hopefully others will find it to be a fast and useful way to search and browse npm packages as I have.

If you’re interested in other things I’m working on, follow me on Twitter or check out the open source projects I’ve been publishing on GitHub.

I am also working on a Twitter bot for this site to tweet the most popular, newest, random packages from npm. Please follow that account now and it will start sending out packages soon–ish.

Open Software & Tools

This site wouldn’t be possible without the immense generosity and tireless efforts from the people who make contributions to the world and share their work via open source initiatives. Thank you 🙏

© 2025 – Pkg Stats / Ryan Hefner

audio-finder

v1.0.2

Published

High-performance audio processing library using C++ with Node.js bindings

Readme

Audio Finder

A high-performance audio processing library that combines the speed of C++ with the convenience of Node.js/TypeScript. This project provides fast audio analysis, FFT processing, and audio fingerprinting capabilities.

Features

  • 🚀 High Performance: Core audio processing in C++ for maximum speed
  • 🔧 Easy Integration: TypeScript/JavaScript API with yarn package management
  • 📊 Audio Analysis: FFT, MFCC, pitch detection, onset detection
  • 🎵 Audio Fingerprinting: Generate and compare audio fingerprints
  • Duplicate Detection: Find duplicate audio files across directories
  • �🛠 Cross-Platform: Works on macOS, Linux, and Windows
  • 📦 Modular: Use only the components you need
  • CLI Tool: Command-line interface for batch processing

Project Structure

audio_finder/
├── src/
│   ├── cpp/                 # C++ source code
│   │   ├── audio/          # Core audio processing
│   │   └── bindings/       # Node.js native bindings
│   └── js/                 # TypeScript/JavaScript wrapper
├── include/                # C++ header files
├── tests/                  # Test files
├── scripts/               # Build and setup scripts
├── build/                 # CMake build output
├── lib/                   # Compiled JavaScript output
└── docs/                  # Documentation

Quick Start

Prerequisites

  • Node.js 18+ and yarn
  • CMake 3.15+
  • C++ compiler with C++17 support
  • Audio libraries (automatically installed):
    • PortAudio (for audio I/O)
    • FFTW3 (for FFT processing)

Installation

  1. Clone and setup the project:

    git clone <repository-url>
    cd audio_finder
    ./scripts/setup.sh
  2. Or install dependencies manually:

    yarn install
    yarn install:deps
    yarn build

Usage

TypeScript/JavaScript

import { AudioAnalyzer, AudioUtils } from 'audio-finder';

// Create an analyzer
const analyzer = new AudioAnalyzer();

// Generate a test signal (440 Hz sine wave)
const samples = AudioUtils.generateSineWave(440, 1.0, 44100);

// Analyze the audio
const features = await analyzer.analyzeAudio(samples, 44100);

console.log('RMS:', features.rms);
console.log('Pitch:', features.pitch);
console.log('Spectrum length:', features.spectrum.length);
console.log('MFCC coefficients:', features.mfcc);

Basic Audio Processing

import { AudioProcessor, AudioUtils } from 'audio-finder';

const processor = new AudioProcessor();

// Load or generate audio samples
const samples = AudioUtils.generateSineWave(220, 0.5, 44100);

// Process the audio
const rms = processor.processAudio(samples);
const spectrum = processor.getSpectrum(samples);
const pitch = processor.detectPitch(samples, 44100);

console.log(`RMS: ${rms}, Detected Pitch: ${pitch} Hz`);

Audio Fingerprinting

import { AudioAnalyzer } from 'audio-finder';

const analyzer = new AudioAnalyzer();

// Generate fingerprints for two audio samples
const fp1 = analyzer.generateFingerprint(samples1, 44100);
const fp2 = analyzer.generateFingerprint(samples2, 44100);

// Compare fingerprints (returns similarity 0-1)
const similarity = analyzer.compareFingerprints(fp1, fp2);
console.log(`Similarity: ${similarity * 100}%`);

Duplicate Audio Detection

The library includes a powerful duplicate detection system that can find identical or similar audio files across directories, even when they have different filenames.

CLI Usage

The easiest way to find duplicate audio files is using the command-line interface:

# Find duplicates between two directories
npx duplicate find ./music-collection ./downloaded-music

# With enhanced landmark algorithm for cross-sample-rate detection
npx duplicate find ./dir1 ./dir2 --algorithm landmark --landmark-threshold 0.08

# Use hybrid mode (default) with custom thresholds
npx duplicate find ./dir1 ./dir2 --algorithm hybrid --threshold 0.85

# Traditional mode for same-sample-rate files
npx duplicate find ./dir1 ./dir2 --algorithm traditional --threshold 0.90

# Compare two specific files with detailed analysis
npx duplicate compare ./song1.mp3 ./song2.wav --algorithm landmark --verbose

# Save results to JSON
npx duplicate find ./dir1 ./dir2 --output results.json

# Save results to CSV with enhanced metrics
npx duplicate find ./dir1 ./dir2 --csv duplicates.csv --algorithm hybrid

# Scan a single directory
npx duplicate scan ./music-library

# Generate fingerprint for a single file
npx duplicate fingerprint ./song.mp3

Programmatic Usage

import { DuplicateDetector } from 'audio-finder';

// Create detector with custom configuration
const detector = new DuplicateDetector({
    similarityThreshold: 0.85,  // 85% similarity required
    parallelProcessing: true,   // Use multiple threads
    maxThreads: 4,             // Limit concurrent threads
    verbose: true              // Enable detailed logging
});

// Find duplicates between directories
const matches = await detector.findDuplicates('./music', './downloads');

console.log(`Found ${matches.length} potential duplicates:`);
matches.forEach((match, index) => {
    console.log(`${index + 1}. ${(match.similarity * 100).toFixed(1)}% similarity`);
    console.log(`   File A: ${match.fileA.filePath}`);
    console.log(`   File B: ${match.fileB.filePath}`);
    console.log(`   Size diff: ${Math.abs(match.fileA.fileSize - match.fileB.fileSize)} bytes`);
});

Progress Tracking

Monitor detection progress with event listeners:

detector.on('progress', (progress) => {
    console.log(`${progress.phase}: ${progress.filesProcessed}/${progress.totalFiles}`);
    if (progress.currentFile) {
        console.log(`Processing: ${progress.currentFile}`);
    }
});

detector.on('match', (match) => {
    console.log(`Found match: ${match.similarity.toFixed(3)} similarity`);
});

Detection Algorithm

The duplicate detection system uses perceptual audio fingerprinting:

  1. Audio Loading: Supports multiple formats (MP3, WAV, FLAC, OGG, M4A, etc.)
  2. Spectral Analysis: Analyzes 2-second chunks using FFT
  3. Feature Extraction: Extracts 32 frequency bands for robust comparison
  4. Perceptual Hashing: Creates compact fingerprints resistant to encoding differences
  5. Similarity Matching: Uses Hamming distance for fast comparison
  6. Threshold Filtering: Configurable similarity thresholds for accuracy tuning

Performance Characteristics

  • Accuracy: >95% true positive rate, <1% false positive rate (at 0.85 threshold)
  • Speed: ~50ms per minute of audio on modern hardware
  • Memory: ~100KB fingerprint storage per hour of audio
  • Scalability: Parallel processing across multiple CPU cores
  • Formats: Automatic detection of 20+ audio formats

## Development

### Building

```bash
# Build everything
yarn build

# Build only C++ components
yarn build:cpp

# Build only TypeScript components
yarn build:js

# Build Node.js native addon
yarn build:addon

Testing

# Run all tests
yarn test

# Run only C++ tests
yarn test:cpp

# Run only JavaScript tests
yarn test:js

Development Workflow

# Clean build artifacts
yarn clean

# Development build and test
yarn dev

# Format code
yarn format

# Lint TypeScript code
yarn lint

API Reference

AudioAnalyzer

The main class for audio analysis and feature extraction.

class AudioAnalyzer {
  // Frequency domain analysis
  analyzeFrequencySpectrum(samples: number[], sampleRate: number): number[]
  
  // Feature extraction
  extractMFCC(samples: number[], sampleRate: number, numCoeffs?: number): number[]
  calculateRMS(samples: number[]): number
  detectPitch(samples: number[], sampleRate: number): number
  detectOnsets(samples: number[], sampleRate: number): number[]
  
  // Audio fingerprinting
  generateFingerprint(samples: number[], sampleRate: number): number[]
  compareFingerprints(fp1: number[], fp2: number[]): number
  
  // Configuration
  setWindowSize(size: number): void
  setHopSize(size: number): void
  setOverlapRatio(ratio: number): void
  
  // High-level analysis
  analyzeAudio(samples: number[], sampleRate: number): Promise<AudioFeatures>
}

DuplicateDetector

The main class for finding duplicate audio files.

class DuplicateDetector extends EventEmitter {
  constructor(config?: DuplicateDetectionConfig)
  
  // Primary detection method
  findDuplicates(directoryA: string, directoryB: string): Promise<DuplicateMatch[]>
  
  // Utility methods
  scanDirectory(directory: string): Promise<AudioFileInfo[]>
  generateFingerprint(filePath: string): Promise<AudioFingerprint>
  isNativeAvailable(): boolean
  getLastRunStatistics(): DetectionStatistics | null
  
  // Event handling
  on(event: 'progress', listener: (progress: DetectionProgress) => void): this
  on(event: 'match', listener: (match: DuplicateMatch) => void): this
  on(event: 'error', listener: (error: Error) => void): this
}

interface DuplicateMatch {
  fileA: AudioFileInfo;
  fileB: AudioFileInfo;
  similarity: number;
  hammingDistance: number;
}

interface DuplicateDetectionConfig {
  similarityThreshold?: number;    // 0.0-1.0, default 0.85
  parallelProcessing?: boolean;    // default true
  maxThreads?: number;            // default 0 (auto)
  verbose?: boolean;              // default false
}

AudioUtils

Utility functions for audio processing.

class AudioUtils {
  // Signal generation
  static generateSineWave(frequency: number, duration: number, sampleRate: number, amplitude?: number): number[]
  static generateWhiteNoise(duration: number, sampleRate: number, amplitude?: number): number[]
  
  // Audio processing
  static normalize(samples: number[]): number[]
  static applyGain(samples: number[], gainDB: number): number[]
  
  // Utility functions
  static calculateZeroCrossingRate(samples: number[]): number
  static dbToLinear(db: number): number
  static linearToDb(linear: number): number
}

Performance

This library is designed for high-performance audio processing:

  • C++ Core: Critical audio processing algorithms implemented in optimized C++
  • SIMD Instructions: Uses -march=native for CPU-specific optimizations
  • FFTW: Industry-standard FFT library for frequency domain analysis
  • Memory Efficient: Minimal memory allocations in hot paths
  • Zero-Copy: Efficient data transfer between JavaScript and C++

Benchmarks

On a typical modern CPU (Apple M1), processing times for common operations:

  • 1024-sample FFT: ~10 μs
  • Pitch detection (4096 samples): ~100 μs
  • MFCC extraction (4096 samples): ~200 μs
  • Audio fingerprinting (1 second, 44.1kHz): ~5 ms

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests for new functionality
  5. Run the test suite: yarn test
  6. Submit a pull request

Development Guidelines

  • Follow the existing code style
  • Add tests for new features
  • Update documentation for API changes
  • Use meaningful commit messages
  • Ensure cross-platform compatibility

License

MIT License - see LICENSE file for details.

Dependencies

Runtime Dependencies

  • node-addon-api: Node.js native addon interface

Development Dependencies

  • typescript: TypeScript compiler
  • jest: Testing framework
  • eslint: JavaScript/TypeScript linting
  • node-gyp: Native addon build tool

System Dependencies

  • PortAudio: Cross-platform audio I/O library
  • FFTW3: Fast Fourier Transform library
  • CMake: Build system for C++ components

Troubleshooting

Common Issues

  1. Native module not found: Run yarn build:addon to rebuild the native addon
  2. Audio libraries missing: Run ./scripts/install-audio-libs.sh to install dependencies
  3. CMake errors: Ensure CMake 3.15+ is installed
  4. Compiler errors: Ensure you have a C++17 compatible compiler

Getting Help