@playground-sessions/pitch-detection-analysis
v0.1.1
Polyphonic pitch detection using CREPE (Convolutional Representation for Pitch Estimation) with spectral harmonic analysis. Detects multiple simultaneous pitches from audio input by combining deep learning-based fundamental frequency estimation with non-negative matrix factorization (NMF) for source separation.
Installation
npm install @playground-sessions/pitch-detection-analysis
Usage
Basic Example
import { PitchDetector } from '@playground-sessions/pitch-detection-analysis';
// Create detector
const detector = new PitchDetector({ maxPolyphony: 4 });
await detector.initialize();
// Process audio buffer
const pitches = await detector.detectFromAudioBuffer(audioBuffer);
console.log(pitches);
// [{ frequency: 440, midi: 69, note: "A4", confidence: 0.95, ... }]
Real-time Microphone Input
import { PitchDetector } from '@playground-sessions/pitch-detection-analysis';
const detector = new PitchDetector({
  confidenceThreshold: 0.8,
  useCrepe: true
});
await detector.initialize();
const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
for await (const pitches of detector.detectFromStream(stream)) {
  console.log('Detected pitches:', pitches);
}
Using AudioWorklet for Better Performance
AudioWorklet provides low-latency processing in a separate thread:
import { PitchDetector } from '@playground-sessions/pitch-detection-analysis';
const detector = new PitchDetector({
  useWorklet: true, // Enable AudioWorklet
  workletPath: '/node_modules/@playground-sessions/pitch-detection-analysis/src/pitch-worklet.js'
});
await detector.initialize();
const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
for await (const pitches of detector.detectFromStream(stream)) {
  console.log('Detected pitches:', pitches);
}
API
PitchDetector
Main class for pitch detection.
Constructor Options
- sampleRate?: number - Audio sample rate (default: 44100)
- frameSize?: number - Analysis frame size (default: 2048)
- hopSize?: number - Frame hop size (default: 1024)
- maxPolyphony?: number - Maximum simultaneous pitches (default: 4, range: 1-6)
- confidenceThreshold?: number - Minimum confidence for detection (default: 0.7)
- useNMF?: boolean - Enable NMF source separation (default: true)
- useCrepe?: boolean - Enable CREPE neural network (default: true)
- useWorklet?: boolean - Use AudioWorklet for low-latency processing (default: false)
- modelPath?: string - Path to CREPE model weights
- workletPath?: string - Path to worklet processor file
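To illustrate how frameSize and hopSize relate, the sketch below splits a mono signal into overlapping frames. The helper name frameSignal is hypothetical and is not part of this package; the defaults only mirror the documented values (2048 and 1024), and how the detector frames audio internally is not specified here.

```typescript
// Hypothetical helper (not part of the package): split a mono signal into
// overlapping analysis frames, zero-padding the final partial frames.
function frameSignal(
  samples: Float32Array,
  frameSize: number = 2048,
  hopSize: number = 1024
): Float32Array[] {
  const frames: Float32Array[] = [];
  for (let start = 0; start < samples.length; start += hopSize) {
    const frame = new Float32Array(frameSize); // zero-filled on creation
    const end = Math.min(start + frameSize, samples.length);
    frame.set(samples.subarray(start, end));
    frames.push(frame);
  }
  return frames;
}
```

With the default hop size of 1024 at 44100 Hz, consecutive frames start roughly 23 ms apart, and each frame overlaps the previous one by 50%.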
Methods
- async initialize(): Promise<void> - Initialize the detector (must be called before use)
- async detectFromAudioBuffer(buffer: AudioBuffer): Promise<DetectedPitch[]> - Analyze audio buffer
- async *detectFromStream(stream: MediaStream): AsyncGenerator<DetectedPitch[]> - Real-time analysis
- async processFrame(samples: Float32Array): Promise<DetectedPitch[]> - Process single frame
- dispose(): void - Clean up resources
DetectedPitch
interface DetectedPitch {
  frequency: number; // Frequency in Hz
  midi: number; // MIDI note number
  note: string; // Note name (e.g., "A4")
  confidence: number; // Detection confidence (0-1)
  clarity: number; // Harmonic clarity (0-1)
  timestamp: number; // Time in ms
}
Utility Functions
Pitch Conversion:
- frequencyToMidi(frequency: number): number - Convert Hz to MIDI note
- midiToFrequency(midi: number): number - Convert MIDI note to Hz
- midiToNoteName(midi: number): string - Convert MIDI note to name
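The conversions above follow the usual equal-temperament relationship between frequency and MIDI note number. The sketch below is a standalone illustration, assuming A4 = 440 Hz = MIDI 69 (consistent with the sample output in the basic example) and sharp-only note names; the function names are hypothetical, and whether the package rounds MIDI values is not stated here.

```typescript
const NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"];

// Hz to (fractional) MIDI note number, assuming A4 = 440 Hz = MIDI 69.
function hzToMidi(frequency: number): number {
  return 69 + 12 * Math.log2(frequency / 440);
}

// MIDI note number to Hz under the same tuning assumption.
function midiToHz(midi: number): number {
  return 440 * Math.pow(2, (midi - 69) / 12);
}

// Nearest note name with octave, using the convention that MIDI 60 is C4.
function midiToName(midi: number): string {
  const rounded = Math.round(midi);
  const pitchClass = ((rounded % 12) + 12) % 12;
  const octave = Math.floor(rounded / 12) - 1;
  return NOTE_NAMES[pitchClass] + String(octave);
}
```

For example, hzToMidi(440) gives 69, and midiToName(69) gives "A4".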
Audio Processing:
- mixToMono(left: Float32Array, right: Float32Array): Float32Array - Mix stereo to mono
- calculateRMS(samples: Float32Array): number - Calculate RMS energy
- normalizeAudio(samples: Float32Array): Float32Array - Normalize to [-1, 1]
- applyPreEmphasis(samples: Float32Array, coefficient?: number): Float32Array - Pre-emphasis filter
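As a rough guide to what these helpers compute, the following standalone sketch shows textbook versions of each operation. The function names are hypothetical, the 0.97 pre-emphasis default is a common choice and an assumption here, and the package's own implementations may differ (for example in how they normalize).

```typescript
// Average two channels sample by sample.
function stereoToMono(left: Float32Array, right: Float32Array): Float32Array {
  const length = Math.min(left.length, right.length);
  const mono = new Float32Array(length);
  for (let i = 0; i < length; i++) mono[i] = 0.5 * (left[i] + right[i]);
  return mono;
}

// Root-mean-square level of a block of samples.
function rms(samples: Float32Array): number {
  if (samples.length === 0) return 0;
  let sumSquares = 0;
  for (let i = 0; i < samples.length; i++) sumSquares += samples[i] * samples[i];
  return Math.sqrt(sumSquares / samples.length);
}

// Peak normalization: scale so the largest absolute sample becomes 1.
function peakNormalize(samples: Float32Array): Float32Array {
  let peak = 0;
  for (let i = 0; i < samples.length; i++) peak = Math.max(peak, Math.abs(samples[i]));
  const out = new Float32Array(samples.length);
  if (peak === 0) return out; // silence stays silent
  for (let i = 0; i < samples.length; i++) out[i] = samples[i] / peak;
  return out;
}

// First-order pre-emphasis: y[n] = x[n] - coefficient * x[n - 1].
function preEmphasis(samples: Float32Array, coefficient: number = 0.97): Float32Array {
  const out = new Float32Array(samples.length);
  if (samples.length === 0) return out;
  out[0] = samples[0];
  for (let i = 1; i < samples.length; i++) {
    out[i] = samples[i] - coefficient * samples[i - 1];
  }
  return out;
}
```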
DSP (Digital Signal Processing):
- FFT - Fast Fourier Transform class with forward transform, magnitude, phase, and power spectrum
- WindowFunction - Window functions (Hann, Hamming, Blackman, Bartlett, Rectangular)
- SpectralAnalysis - Peak finding, spectral centroid, flux, HPS, autocorrelation
- nextPowerOfTwo(n: number): number - Get next power of 2
- isPowerOfTwo(n: number): boolean - Check if number is power of 2
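Two of the simpler pieces listed above, the power-of-two helpers and the Hann window, can be sketched in a few lines. This is a standalone illustration with hypothetical names; it uses the symmetric Hann definition, which is an assumption, since the package's WindowFunction class may use the periodic variant instead.

```typescript
// True when n is a positive integer power of two (valid for n below 2^31).
function isPow2(n: number): boolean {
  return Number.isInteger(n) && n > 0 && (n & (n - 1)) === 0;
}

// Smallest power of two that is greater than or equal to n.
function nextPow2(n: number): number {
  let p = 1;
  while (p < n) p *= 2;
  return p;
}

// Symmetric Hann window: w[i] = 0.5 * (1 - cos(2 * pi * i / (size - 1))).
function hannWindow(size: number): Float32Array {
  const w = new Float32Array(size);
  if (size === 1) {
    w[0] = 1;
    return w;
  }
  for (let i = 0; i < size; i++) {
    w[i] = 0.5 * (1 - Math.cos((2 * Math.PI * i) / (size - 1)));
  }
  return w;
}
```

A frame is typically multiplied element-wise by the window before the FFT to reduce spectral leakage, and FFT sizes are usually chosen as powers of two.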
Peak Detection:
- PeakDetector - Find fundamental frequencies and spectral peaks
- findFundamental(signal, sampleRate, method) - Quick utility to find fundamental frequency
- Methods: HPS (Harmonic Product Spectrum), ACF (Autocorrelation)
- Voiced/unvoiced detection and harmonic series generation
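For readers unfamiliar with the Harmonic Product Spectrum, the idea is to multiply the magnitude spectrum by downsampled copies of itself so that the bin where the harmonics line up stands out. The sketch below is a minimal standalone version operating on a precomputed magnitude spectrum; the function name, parameters, and the default of three harmonics are assumptions, not the package's PeakDetector API.

```typescript
// Hypothetical HPS sketch: returns the estimated fundamental in Hz.
// `magnitudes` holds bins 0..N/2 of an FFT of size `fftSize`.
function hpsFundamental(
  magnitudes: Float32Array,
  sampleRate: number,
  fftSize: number,
  numHarmonics: number = 3
): number {
  // Only bins whose highest considered harmonic is still inside the spectrum.
  const limit = Math.floor(magnitudes.length / numHarmonics);
  let bestBin = 0;
  let bestProduct = -Infinity;
  for (let bin = 1; bin < limit; bin++) {
    let product = 1;
    for (let h = 1; h <= numHarmonics; h++) {
      product *= magnitudes[bin * h];
    }
    if (product > bestProduct) {
      bestProduct = product;
      bestBin = bin;
    }
  }
  return (bestBin * sampleRate) / fftSize;
}
```

Because the product rewards bins whose multiples also carry energy, HPS can still pick the true fundamental when the second harmonic is stronger than the first, which is a common source of octave errors in plain peak picking.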
TensorFlow.js Integration:
- TFJSModelManager - Load and manage TensorFlow.js models for neural network-based pitch detection
- TensorUtils - Tensor preprocessing, normalization, and utility functions
- createModelManager(url, options) - Quick helper to create and initialize a model
- Backend support: WebGL (GPU), CPU, and WebAssembly
- Automatic tensor memory management with tf.tidy()
CREPE (Neural Network Pitch Detection):
- CREPEModel - CREPE neural network for high-accuracy monophonic pitch detection
- detectPitchCREPE(audio, sampleRate) - Quick utility for CREPE pitch detection
- CREPEUtils - Frequency range, MIDI conversion, and CREPE-specific utilities
- Supports multiple model sizes (tiny, small, medium, large, full)
- Automatic resampling to 16 kHz (CREPE's expected sample rate)
- Sub-bin accuracy with parabolic interpolation
- Optional Viterbi smoothing for temporal coherence
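Parabolic interpolation, mentioned above for sub-bin accuracy, fits a parabola through a peak and its two neighbours and reads off the vertex. The sketch below is a generic standalone version with a hypothetical function name; how CREPEModel applies it to the network's output bins is not described in this README.

```typescript
// Refine a discrete peak at `index` using its two neighbours.
// Returns the fractional peak position and the interpolated peak height.
function parabolicPeak(
  values: ArrayLike<number>,
  index: number
): { position: number; value: number } {
  // No refinement is possible at the edges of the array.
  if (index <= 0 || index >= values.length - 1) {
    return { position: index, value: values[index] };
  }
  const a = values[index - 1];
  const b = values[index];
  const c = values[index + 1];
  const denominator = a - 2 * b + c;
  if (denominator === 0) {
    return { position: index, value: b }; // the three points are collinear
  }
  const offset = (0.5 * (a - c)) / denominator; // lies in [-0.5, 0.5] at a true peak
  return { position: index + offset, value: b - 0.25 * (a - c) * offset };
}
```

The same refinement applies to spectral peaks (fractional bin) and autocorrelation peaks (fractional lag).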
Autocorrelation Fallback
For real-time applications or when CREPE is not available, the package includes a fast autocorrelation-based pitch detector:
import { AutocorrelationDetector, detectPitchAutocorrelation } from '@playground-sessions/pitch-detection-analysis';
// Quick utility function
const frequency = detectPitchAutocorrelation(audioBuffer, 44100);
console.log(`Detected pitch: ${frequency} Hz`);
// Advanced usage with configuration
const detector = new AutocorrelationDetector({
  sampleRate: 44100,
  minFrequency: 50,
  maxFrequency: 2000,
  threshold: 0.3,
  usePreEmphasis: true,
  useCenterClipping: false,
});
const result = detector.detectPitch(audioBuffer);
if (result) {
  console.log(`Frequency: ${result.frequency} Hz`);
  console.log(`Confidence: ${result.confidence}`);
  console.log(`Lag: ${result.lag} samples`);
}
Autocorrelation Features
- Fast Performance: Optimized for real-time applications
- Pre-emphasis Filtering: Enhances high-frequency content
- Center Clipping: Reduces noise and improves accuracy
- Octave Error Correction: Handles harmonic relationships
- Sub-sample Accuracy: Parabolic interpolation for precise results
- Batch Processing: Efficient multiple signal analysis
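To make the underlying idea concrete, here is a minimal standalone sketch of autocorrelation pitch detection: correlate the signal with delayed copies of itself and pick the lag with the strongest normalized correlation. The function name is hypothetical and the parameter defaults simply echo the configuration example above. It omits pre-emphasis, center clipping, octave correction, and sub-sample interpolation, so it should not be read as the AutocorrelationDetector implementation.

```typescript
interface AcfResult {
  frequency: number; // Hz
  confidence: number; // normalized correlation at the chosen lag
  lag: number; // samples
}

// Hypothetical sketch: brute-force O(N * lags) autocorrelation search.
function autocorrelationPitch(
  samples: Float32Array,
  sampleRate: number,
  minFrequency: number = 50,
  maxFrequency: number = 2000,
  threshold: number = 0.3
): AcfResult | null {
  const minLag = Math.max(1, Math.floor(sampleRate / maxFrequency));
  const maxLag = Math.min(samples.length - 1, Math.ceil(sampleRate / minFrequency));

  // Zero-lag energy, used to normalize correlations to roughly [-1, 1].
  let energy = 0;
  for (let i = 0; i < samples.length; i++) energy += samples[i] * samples[i];
  if (energy === 0) return null; // silence

  let bestLag = -1;
  let bestValue = threshold; // anything at or below the threshold counts as unvoiced
  for (let lag = minLag; lag <= maxLag; lag++) {
    let sum = 0;
    for (let i = 0; i + lag < samples.length; i++) {
      sum += samples[i] * samples[i + lag];
    }
    const normalized = sum / energy;
    if (normalized > bestValue) {
      bestValue = normalized;
      bestLag = lag;
    }
  }
  if (bestLag < 0) return null;
  return { frequency: sampleRate / bestLag, confidence: bestValue, lag: bestLag };
}
```

Because fewer samples overlap at larger lags, this unwindowed sum slightly favors the first period over its multiples, which helps against picking a sub-octave.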
Harmonic Analysis
For complex tones with multiple harmonics, the package includes advanced harmonic analysis:
import { HarmonicAnalyzer, analyzeHarmonics, HarmonicMatching } from '@playground-sessions/pitch-detection-analysis';
// Quick utility function
const result = analyzeHarmonics(audioBuffer, 44100);
if (result) {
  console.log(`Fundamental: ${result.fundamental} Hz`);
  console.log(`Harmonicity: ${result.harmonicity}`);
  console.log(`Harmonics found: ${result.harmonics.length}`);
}
// Advanced usage with configuration
const analyzer = new HarmonicAnalyzer({
  sampleRate: 44100,
  fftSize: 2048,
  maxHarmonics: 8,
  useHPS: true,
  harmonicTolerance: 0.05,
});
const analysis = analyzer.analyzeHarmonics(audioBuffer);
if (analysis) {
  console.log(`Fundamental: ${analysis.fundamental} Hz`);
  console.log(`Confidence: ${analysis.confidence}`);
  console.log(`Spectral Centroid: ${analysis.spectralCentroid} Hz`);
  console.log(`Spectral Rolloff: ${analysis.spectralRolloff} Hz`);
  // Access individual harmonics
  analysis.harmonics.forEach(harmonic => {
    console.log(`Harmonic ${harmonic.harmonicNumber}: ${harmonic.frequency} Hz (strength: ${harmonic.strength})`);
  });
}
Harmonic Analysis Features
- Harmonic Product Spectrum (HPS): Enhanced fundamental frequency detection
- Harmonic Matching: Identifies harmonic relationships in complex tones
- Spectral Analysis: Advanced spectral feature extraction
- Octave Error Correction: Prevents octave doubling errors
- Harmonicity Calculation: Measures tonal quality
- Spectral Features: Centroid, rolloff, and irregularity analysis
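Two of the spectral features named above have simple definitions that are easy to sketch. The standalone functions below use hypothetical names and work on a precomputed magnitude spectrum; the 85% rolloff fraction and the use of magnitudes rather than power are assumptions, as the README does not state which convention HarmonicAnalyzer follows.

```typescript
// Magnitude-weighted mean frequency of the spectrum, in Hz.
function spectralCentroid(
  magnitudes: Float32Array,
  sampleRate: number,
  fftSize: number
): number {
  let weighted = 0;
  let total = 0;
  for (let bin = 0; bin < magnitudes.length; bin++) {
    weighted += ((bin * sampleRate) / fftSize) * magnitudes[bin];
    total += magnitudes[bin];
  }
  return total > 0 ? weighted / total : 0;
}

// Frequency below which `fraction` of the total magnitude is contained.
function spectralRolloff(
  magnitudes: Float32Array,
  sampleRate: number,
  fftSize: number,
  fraction: number = 0.85
): number {
  let total = 0;
  for (let bin = 0; bin < magnitudes.length; bin++) total += magnitudes[bin];
  if (total === 0) return 0;
  let cumulative = 0;
  for (let bin = 0; bin < magnitudes.length; bin++) {
    cumulative += magnitudes[bin];
    if (cumulative >= fraction * total) return (bin * sampleRate) / fftSize;
  }
  return ((magnitudes.length - 1) * sampleRate) / fftSize;
}
```

A higher centroid generally corresponds to a brighter sound, while the rolloff indicates where most of the spectral content ends.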
Pitch Tracking
For stable and smooth pitch detection over time, the package includes advanced pitch tracking:
import { PitchTracker, PitchTrackingUtils, trackPitch } from '@playground-sessions/pitch-detection-analysis';
// Quick utility function
const trackedPitch = trackPitch({
  frequency: 440,
  confidence: 0.8,
  clarity: 0.9,
  timestamp: Date.now()
});
// Advanced usage with configuration
const tracker = new PitchTracker({
  sampleRate: 44100,
  smoothingWindow: 5,
  medianFilterSize: 3,
  outlierThreshold: 0.3,
  minConfidence: 0.5,
  maxPitchJump: 0.5,
  useViterbi: true,
});
// Process pitch detections over time
const result = tracker.processPitch({
  frequency: 441,
  confidence: 0.85,
  clarity: 0.88,
  timestamp: Date.now()
});
if (result) {
  console.log(`Tracked Frequency: ${result.frequency} Hz`);
  console.log(`Confidence: ${result.confidence}`);
  console.log(`Stability: ${result.isStable}`);
  console.log(`Velocity: ${result.velocity} Hz/frame`);
  console.log(`Acceleration: ${result.acceleration} Hz/frame²`);
}
// Analyze pitch trajectory
const pitches = [/* array of TrackedPitch objects */];
const stability = PitchTrackingUtils.calculateStability(pitches, 5);
const jumps = PitchTrackingUtils.detectPitchJumps(pitches, 0.5);
const smoothed = PitchTrackingUtils.smoothTrajectory(pitches, 3);
const stats = PitchTrackingUtils.calculateStatistics(pitches);
Pitch Tracking Features
- Temporal Smoothing: Reduces jitter and improves stability
- Median Filtering: Removes outliers and noise
- Outlier Detection: Filters out spurious pitch detections
- Viterbi Algorithm: Advanced tracking with state transitions
- Pitch Velocity: Tracks pitch changes over time
- Pitch Acceleration: Measures rate of pitch change
- Stability Analysis: Determines pitch stability
- Trajectory Smoothing: Smooths pitch trajectories
- Statistical Analysis: Comprehensive pitch statistics
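Median filtering and jump detection are the simplest of the techniques listed above, and a small standalone sketch may help show what they do to a pitch trajectory. The function names are hypothetical, the input is a plain array of frequencies rather than TrackedPitch objects, and the jump threshold is interpreted in semitones, which is an assumption: the README does not state the unit of maxPitchJump.

```typescript
// Sliding-window median; the window shrinks at the edges of the array.
function medianFilter(values: number[], windowSize: number = 3): number[] {
  const half = Math.floor(windowSize / 2);
  return values.map((_, i) => {
    const window = values
      .slice(Math.max(0, i - half), Math.min(values.length, i + half + 1))
      .sort((a, b) => a - b);
    const mid = Math.floor(window.length / 2);
    return window.length % 2 === 1 ? window[mid] : (window[mid - 1] + window[mid]) / 2;
  });
}

// Indices where the pitch moves by more than `maxJumpSemitones`
// relative to the previous frame (unit assumed, see the note above).
function detectJumps(frequencies: number[], maxJumpSemitones: number = 0.5): number[] {
  const jumps: number[] = [];
  for (let i = 1; i < frequencies.length; i++) {
    const semitones = Math.abs(12 * Math.log2(frequencies[i] / frequencies[i - 1]));
    if (semitones > maxJumpSemitones) jumps.push(i);
  }
  return jumps;
}
```

In the trajectory [440, 441, 880, 442, 443], a single octave error at index 2 is flagged by detectJumps at indices 2 and 3 (the jump up and the jump back), and a window-3 median filter replaces the 880 with 442.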
NMF (Non-negative Matrix Factorization)
For polyphonic pitch detection, the package includes advanced NMF algorithms for source separation:
import { NMFAlgorithm, SpectralDictionary, NMFUtils, decomposeNMF } from '@playground-sessions/pitch-detection-analysis';
// Quick utility function
const spectrogram = [/* array of Float32Array magnitude spectra */];
const result = decomposeNMF(spectrogram, {
  rank: 4,
  maxIterations: 100,
  tolerance: 1e-6,
  sparsity: 0.1,
  smoothness: 0.1,
});
// Advanced usage with configuration
const nmf = new NMFAlgorithm({
  rank: 4,
  maxIterations: 100,
  tolerance: 1e-6,
  sparsity: 0.1,
  smoothness: 0.1,
  useMultiplicative: true,
  useAlternating: false,
  useKullbackLeibler: false,
  useEuclidean: true,
  randomSeed: 42,
});
// Perform NMF decomposition
const nmfResult = nmf.decompose(spectrogram);
// Extract pitch components
const pitchComponents = NMFUtils.extractPitchComponents(nmfResult);
// Group components by pitch
const pitchGroups = NMFUtils.groupComponentsByPitch(pitchComponents);
// Learn spectral dictionary from training data
const trainingData = [/* array of spectrogram samples */];
const dictionary = SpectralDictionary.learnDictionary(trainingData, 50);
// Separate sources using dictionary
const separatedSources = NMFUtils.separateSources(spectrogram, dictionary);
NMF Features
- Source Separation: Separates multiple simultaneous pitches
- Spectral Dictionary Learning: Learns spectral patterns from training data
- Component Analysis: Extracts individual pitch components with metadata
- Pitch Grouping: Groups components by pitch class
- Similarity Analysis: Calculates component similarity
- Configurable Algorithms: Multiple NMF algorithms (multiplicative, alternating least squares)
- Sparsity Control: Regularization for better separation
- Smoothness Control: Temporal smoothness regularization
- Convergence Control: Configurable tolerance and iteration limits
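For background, the multiplicative-update variant with a Euclidean cost can be sketched in plain TypeScript. The code below is a minimal standalone illustration of the standard Lee-Seung updates that factor a non-negative matrix V (frequency bins by frames) into W (spectral templates) and H (activations). All names are hypothetical, the sparsity and smoothness penalties are left out, and nothing here should be taken as the NMFAlgorithm implementation.

```typescript
type Matrix = number[][];

function matMul(a: Matrix, b: Matrix): Matrix {
  const out: Matrix = a.map(() => b[0].map(() => 0));
  for (let i = 0; i < a.length; i++)
    for (let k = 0; k < b.length; k++)
      for (let j = 0; j < b[0].length; j++) out[i][j] += a[i][k] * b[k][j];
  return out;
}

function transpose(m: Matrix): Matrix {
  return m[0].map((_, j) => m.map((row) => row[j]));
}

// Squared Frobenius distance between two equally sized matrices.
function squaredError(a: Matrix, b: Matrix): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++)
    for (let j = 0; j < a[0].length; j++) {
      const d = a[i][j] - b[i][j];
      sum += d * d;
    }
  return sum;
}

// Linear congruential generator so a given seed reproduces the same start.
function seededRandom(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state * 1664525 + 1013904223) % 4294967296;
    return state / 4294967296;
  };
}

function factorize(
  v: Matrix,
  rank: number,
  maxIterations: number = 100,
  tolerance: number = 1e-6,
  seed: number = 42
): { w: Matrix; h: Matrix; error: number } {
  const rand = seededRandom(seed);
  const eps = 1e-12; // guards against division by zero
  let w: Matrix = v.map(() => Array.from({ length: rank }, () => rand() + 0.1));
  let h: Matrix = Array.from({ length: rank }, () => v[0].map(() => rand() + 0.1));
  let previousError = Infinity;
  let error = Infinity;
  for (let iter = 0; iter < maxIterations; iter++) {
    // H <- H * (W^T V) / (W^T W H), element-wise
    const wt = transpose(w);
    const numH = matMul(wt, v);
    const denH = matMul(matMul(wt, w), h);
    h = h.map((row, i) => row.map((x, j) => (x * numH[i][j]) / (denH[i][j] + eps)));
    // W <- W * (V H^T) / (W H H^T), element-wise
    const ht = transpose(h);
    const numW = matMul(v, ht);
    const denW = matMul(w, matMul(h, ht));
    w = w.map((row, i) => row.map((x, j) => (x * numW[i][j]) / (denW[i][j] + eps)));
    error = squaredError(v, matMul(w, h));
    if (Math.abs(previousError - error) < tolerance) break; // converged
    previousError = error;
  }
  return { w, h, error };
}
```

Because every update multiplies by a non-negative ratio, W and H stay non-negative throughout, and the reconstruction error does not increase from one iteration to the next. The result depends on the random start, which is why a fixed seed is useful for reproducible runs.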
Development
# Install dependencies
npm install
# Run tests
npm test
# Build
npm run build
# Dev server
npm run dev
License
MIT
