dance-ai
v0.1.0
Real-time musical bar-phase estimation from live audio streams (ONNX, browser & node)
Real-time musical bar-phase estimation from live audio streams. Runs the dance models via ONNX in the browser (onnxruntime-web) or node (onnxruntime-node).
Per audio frame (≈16.7 ms) the model estimates:
- phase: position inside the current bar, in [0, 1) (0 = bar start)
- barDurationS: predicted bar duration in seconds (i.e. tempo)
- expectedPhaseError: calibrated uncertainty, in bars
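As a rough illustration of how the three outputs relate, the sketch below converts one estimate into more familiar units. The helper name `describeEstimate`, the `FrameEstimate` interface, and the default of 4 beats per bar are assumptions made for this example; they are not part of the package.

```typescript
// Hypothetical shape of one per-frame estimate (field names taken from the list above).
interface FrameEstimate {
  phase: number              // position in the bar, in [0, 1)
  barDurationS: number       // bar duration in seconds
  expectedPhaseError: number // uncertainty, in bars
}

// Hypothetical helper: derive tempo, timing uncertainty and beat index.
// beatsPerBar = 4 is an assumption; the model itself only reports bar-level values.
function describeEstimate(e: FrameEstimate, beatsPerBar = 4) {
  return {
    // beats per minute = beats per bar * bars per minute
    bpm: (60 * beatsPerBar) / e.barDurationS,
    // uncertainty in bars times seconds per bar gives uncertainty in seconds
    timingErrorS: e.expectedPhaseError * e.barDurationS,
    // zero-based beat within the bar
    beatIndex: Math.floor(e.phase * beatsPerBar),
  }
}
```

For a 2-second bar this gives 120 BPM in 4/4, and an expectedPhaseError of 0.01 bars corresponds to about 20 ms.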
There is deliberately no anticipation input: extrapolate phase forward
yourself under constant tempo using barDurationS (or use the included
PhaseExtrapolator).
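A minimal sketch of constant-tempo extrapolation follows. It is not the included PhaseExtrapolator (which also takes a smoothing option); the `PhaseAnchor` interface and `extrapolatePhase` function are hypothetical names used only to show the arithmetic.

```typescript
// Hypothetical record of the most recent estimate and the time it refers to.
interface PhaseAnchor {
  phase: number        // estimated phase in [0, 1) at time tS
  barDurationS: number // estimated bar duration in seconds
  tS: number           // timestamp of the estimate, in seconds
}

// Advance the phase linearly at one bar per barDurationS seconds, then wrap.
function extrapolatePhase(anchor: PhaseAnchor, nowS: number): number {
  const advanced = anchor.phase + (nowS - anchor.tS) / anchor.barDurationS
  // Double modulo keeps the result in [0, 1) even when nowS < tS.
  return ((advanced % 1) + 1) % 1
}
```

With a phase of 0.25 estimated at t = 10 s and a 2-second bar, the phase half a second later is 0.25 + 0.5 / 2 = 0.5.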
Why this package lives in the model repo
Inference must match training exactly: same mel frontend (exact-size DFT —
no power-of-2 padding!), same causal audio context between calls, same level
normalization. The TypeScript implementation mirrors ../streaming.py (the
Python reference) and is verified against golden vectors generated from the
training code (npm test).
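To illustrate what "exact-size" means here, the sketch below computes a power spectrum with a naive DFT whose length equals the frame length, with no zero padding. It is illustrative only: the function name `exactSizePowerSpectrum` is hypothetical, and the real frontend's window function, frame size and mel filterbank are not shown.

```typescript
// Naive O(N^2) DFT over exactly frame.length samples (no padding to a power of 2).
// Returns the power of the non-negative-frequency bins 0..floor(N/2).
function exactSizePowerSpectrum(frame: Float32Array): Float64Array {
  const n = frame.length
  const bins = Math.floor(n / 2) + 1
  const power = new Float64Array(bins)
  for (let k = 0; k < bins; k++) {
    let re = 0
    let im = 0
    for (let t = 0; t < n; t++) {
      const angle = (-2 * Math.PI * k * t) / n
      re += frame[t] * Math.cos(angle)
      im += frame[t] * Math.sin(angle)
    }
    power[k] = re * re + im * im
  }
  return power
}
```

Bin k of an N-point DFT sits at k * samplerate / N, so padding a frame to the next power of 2 changes N and therefore moves every bin centre. A mel filterbank designed for the exact size would then be applied to the wrong frequencies.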
Getting a model
# in the dance repo
venv/bin/python export_onnx.py checkpoints/<experiment>/<tag>/<epoch>.pt
# produces <epoch>.onnx + <epoch>.meta.json; serve both with your app
Usage (browser)
import * as ort from 'onnxruntime-web'
import {
  DancePhaseEstimator,
  PhaseExtrapolator,
  createAudioProcessorUrl,
  AUDIO_PROCESSOR_NAME,
} from 'dance-ai'
const estimator = await DancePhaseEstimator.create({
  ort,
  model: '/model.onnx',
  meta: '/model.meta.json', // hyperparameters travel with the model
})
const clock = new PhaseExtrapolator({ smoothing: 0.5 })
// capture mono audio at estimator.meta.samplerate (24 kHz), e.g. via the
// included AudioWorklet helper:
const audioContext = new AudioContext({ sampleRate: estimator.meta.samplerate })
await audioContext.audioWorklet.addModule(createAudioProcessorUrl())
const node = new AudioWorkletNode(audioContext, AUDIO_PROCESSOR_NAME, {
  processorOptions: { frameSize: estimator.meta.frame_size },
})
node.port.onmessage = async (event) => {
  const estimates = await estimator.feed(event.data as Float32Array)
  for (const e of estimates) {
    clock.update(e.phase, e.barDurationS)
  }
}
// render loop
function render() {
  const phase = clock.phaseAt() // smooth, extrapolated, in [0, 1)
  // ...
  requestAnimationFrame(render)
}
requestAnimationFrame(render) // start the loop
Call estimator.reset() whenever the audio source (re)starts or switches.
It clears the recurrent state, the mel audio context, the level normalizer
and the sample buffer.
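One way to avoid forgetting the reset is to route source changes through a small guard. This is a sketch under assumptions: `SourceSwitchGuard` and the `Resettable` interface are hypothetical and not part of the package, and the only thing assumed about the estimator is the reset() method described above.

```typescript
// Anything with a reset() method, e.g. the estimator.
interface Resettable {
  reset(): void
}

// Hypothetical helper: resets the estimator whenever the active source id changes.
class SourceSwitchGuard {
  private readonly estimator: Resettable
  private currentId: string | null = null

  constructor(estimator: Resettable) {
    this.estimator = estimator
  }

  // Call before feeding audio from a source; returns true if a reset happened.
  use(sourceId: string): boolean {
    if (sourceId === this.currentId) return false
    this.estimator.reset()
    this.currentId = sourceId
    return true
  }
}
```

The guard only detects a change of identifier. When the same source restarts (for example after a pause or a seek), reset() still has to be called explicitly.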
Development
npm install
npm run build
# regenerate golden vectors after changing the Python frontend:
(cd .. && venv/bin/python dance-ai/scripts/gen_test_vectors.py)
npm test