dance-ai

v0.1.0

Published

4 days ago

Real-time musical bar-phase estimation from live audio streams (ONNX, browser & node)


felix-niemeyer

music beat-tracking bar-phase tempo onnx realtime audio

dance-ai

Real-time musical bar-phase estimation from live audio streams. Runs the dance models via ONNX in the browser (onnxruntime-web) or node (onnxruntime-node).

Per audio frame (≈16.7 ms) the model estimates:

phase: position inside the current bar, in [0, 1), where 0 is the bar start
barDurationS: predicted bar duration in seconds (the tempo, expressed as seconds per bar)
expectedPhaseError: calibrated estimate of the phase error, in bars (i.e. as a fraction of one bar)
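As a rough illustration of how these three values might be consumed, the sketch below converts one estimate into a BPM figure, a beat index, and a trust flag. The `Estimate` shape follows the field names listed above; `describeEstimate`, `beatsPerBar`, and `maxPhaseError` are hypothetical names, and the 4-beats-per-bar default and the 0.05 threshold are assumptions, since the model reports bar phase and not a time signature.

```typescript
// Field names as listed above; the interface itself is declared here only for the sketch.
interface Estimate {
  phase: number              // position in the bar, in [0, 1)
  barDurationS: number       // seconds per bar
  expectedPhaseError: number // estimated phase error, in bars
}

// Hypothetical helper: beatsPerBar and maxPhaseError are assumptions, not package options.
export function describeEstimate(
  e: Estimate,
  beatsPerBar = 4,
  maxPhaseError = 0.05,
): { bpm: number; beatInBar: number; trusted: boolean } {
  return {
    // beats per minute = beats per bar / (minutes per bar)
    bpm: (60 * beatsPerBar) / e.barDurationS,
    // zero-based beat index inside the current bar
    beatInBar: Math.floor(e.phase * beatsPerBar),
    // simple gate on the reported uncertainty
    trusted: e.expectedPhaseError <= maxPhaseError,
  }
}
```

For example, a 2-second bar in an assumed 4/4 meter corresponds to 120 BPM, and a phase of 0.6 falls in the third beat (index 2).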

There is deliberately no anticipation input: to get the phase at a later time, extrapolate forward yourself, assuming constant tempo, using barDurationS (or use the included PhaseExtrapolator).
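Under the constant-tempo assumption, the extrapolation is a single line of arithmetic: advance the phase by the elapsed time divided by the bar duration, then wrap into [0, 1). The function below is a minimal sketch of that idea; `extrapolatePhase` and `elapsedS` are hypothetical names, and this is not the implementation of the included PhaseExtrapolator, which also applies smoothing.

```typescript
// Hypothetical helper illustrating constant-tempo extrapolation.
export function extrapolatePhase(
  phase: number,        // last estimated phase, in [0, 1)
  barDurationS: number, // last estimated bar duration, in seconds
  elapsedS: number,     // time since that estimate, in seconds (may be negative)
): number {
  const advanced = phase + elapsedS / barDurationS
  // Wrap into [0, 1); subtracting the floor also handles negative values.
  return advanced - Math.floor(advanced)
}
```

With a phase of 0.9, a 2-second bar, and 0.5 s elapsed, the phase advances by a quarter of a bar and wraps to roughly 0.15.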

Why this package lives in the model repo

Inference must match training exactly: the same mel frontend (an exact-size DFT, with no power-of-2 padding), the same causal audio context carried between calls, and the same level normalization. The TypeScript implementation mirrors ../streaming.py (the Python reference) and is verified by npm test against golden vectors generated from the training code.
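To make the "exact-size" point concrete: the bin frequencies of an N-point DFT depend on N, so padding a frame to the next power of two changes which frequencies the bins represent and therefore changes the mel features. The naive O(N²) sketch below computes a power spectrum at exactly the frame length. It is an illustration only, not the package's frontend; `powerSpectrum` is a hypothetical name, and windowing and the mel filterbank are omitted.

```typescript
// Naive DFT power spectrum at exactly frame.length points (no zero padding).
// Hypothetical illustration; the real frontend lives in the package and in streaming.py.
export function powerSpectrum(frame: Float32Array): Float64Array {
  const n = frame.length
  const bins = Math.floor(n / 2) + 1 // non-negative frequencies only
  const out = new Float64Array(bins)
  for (let k = 0; k < bins; k++) {
    let re = 0
    let im = 0
    for (let t = 0; t < n; t++) {
      const angle = (-2 * Math.PI * k * t) / n
      re += frame[t] * Math.cos(angle)
      im += frame[t] * Math.sin(angle)
    }
    out[k] = re * re + im * im // squared magnitude of bin k
  }
  return out
}
```

A production frontend would use a fast transform that supports arbitrary lengths; the point of the sketch is only that the transform length equals the frame length.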

Getting a model

# in the dance repo
venv/bin/python export_onnx.py checkpoints/<experiment>/<tag>/<epoch>.pt
# produces <epoch>.onnx + <epoch>.meta.json — serve both with your app

Usage (browser)

import * as ort from 'onnxruntime-web'
import {
  DancePhaseEstimator,
  PhaseExtrapolator,
  createAudioProcessorUrl,
  AUDIO_PROCESSOR_NAME,
} from 'dance-ai'

const estimator = await DancePhaseEstimator.create({
  ort,
  model: '/model.onnx',
  meta: '/model.meta.json',   // hyperparameters travel with the model
})
const clock = new PhaseExtrapolator({ smoothing: 0.5 })

// capture mono audio at estimator.meta.samplerate (24 kHz), e.g. via the
// included AudioWorklet helper:
const audioContext = new AudioContext({ sampleRate: estimator.meta.samplerate })
await audioContext.audioWorklet.addModule(createAudioProcessorUrl())
const node = new AudioWorkletNode(audioContext, AUDIO_PROCESSOR_NAME, {
  processorOptions: { frameSize: estimator.meta.frame_size },
})
node.port.onmessage = async (event) => {
  const estimates = await estimator.feed(event.data as Float32Array)
  for (const e of estimates) {
    clock.update(e.phase, e.barDurationS)
  }
}

// render loop
function render() {
  const phase = clock.phaseAt()   // smooth, extrapolated, in [0, 1)
  // ...
  requestAnimationFrame(render)
}
requestAnimationFrame(render)     // start the loop
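The usage snippet creates the worklet node but does not show an audio source feeding it. A minimal sketch of one way to do that, assuming microphone input: `connectMicrophone` and the three `...Like` interfaces are hypothetical names introduced here, while `getUserMedia`, `createMediaStreamSource`, and `connect` are the standard browser calls. Turning off the browser's speech-oriented processing is an assumption about what suits music input, not something this README prescribes.

```typescript
// Structural stand-ins for the browser types, so the sketch also runs outside a browser.
interface SourceLike { connect(destination: unknown): unknown }
interface AudioContextLike { createMediaStreamSource(stream: unknown): SourceLike }
interface MediaDevicesLike { getUserMedia(constraints: unknown): Promise<unknown> }

// Hypothetical helper: requests a mono microphone stream and routes it into the worklet node.
export async function connectMicrophone(
  audioContext: AudioContextLike,
  workletNode: unknown,
  mediaDevices: MediaDevicesLike,
): Promise<SourceLike> {
  const stream = await mediaDevices.getUserMedia({
    audio: {
      channelCount: 1,
      // Assumption: speech-oriented processing is unwanted for music analysis.
      echoCancellation: false,
      noiseSuppression: false,
      autoGainControl: false,
    },
  })
  const source = audioContext.createMediaStreamSource(stream)
  source.connect(workletNode)
  return source
}
```

In a browser the arguments would be the `audioContext` and `node` from the snippet above plus `navigator.mediaDevices`. `getUserMedia` needs a secure context and user permission, and browsers generally keep an AudioContext suspended until a user gesture, so a call to `audioContext.resume()` from a click handler may be needed.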

Call estimator.reset() whenever the audio source (re)starts or switches. It clears the recurrent state, the mel audio context, the level normalizer, and the sample buffer, so nothing from the previous stream carries over into the new one.

Development

npm install
npm run build
# regenerate golden vectors after changing the Python frontend:
(cd .. && venv/bin/python dance-ai/scripts/gen_test_vectors.py)
npm test
