
chatdio

v1.2.4

Published

Web audio library for conversational AI with mic input, device management, WebSocket streaming, and real-time visualization

Downloads

79

Readme

chatdio

A modern Web Audio library for building conversational AI interfaces. Handles microphone capture, audio playback, device management, WebSocket streaming, and real-time visualization — all with cross-browser support (Chrome, Firefox, Safari).

Features

  • 🎙️ Microphone Capture with echo cancellation, noise suppression, and auto gain control
  • 🔊 Audio Playback with buffering, volume control, and seamless queuing
  • 📱 Device Management with hot-plug detection and automatic fallback
  • 🌐 WebSocket Streaming with auto-reconnection and binary/JSON modes
  • 📊 Real-time Visualization data for level meters and waveforms
  • 🎚️ Sample Rate & Bit Depth conversion (8/16/24/32-bit, 8kHz-48kHz)
  • 🔇 Barge-in Support for interrupting AI responses

Installation

npm install chatdio

Quick Start

import { Chatdio } from 'chatdio';

// Create instance with configuration
const audio = new Chatdio({
  microphone: {
    sampleRate: 16000,
    echoCancellation: true,
    noiseSuppression: true,
  },
  playback: {
    sampleRate: 24000,
    bitDepth: 16,
  },
  websocket: {
    url: 'wss://your-ai-server.com/audio',
    autoReconnect: true,
  },
});

// Initialize (must be called from a user gesture)
document.querySelector('#startBtn')?.addEventListener('click', async () => {
  await audio.initialize();
  
  // Start full-duplex conversation
  await audio.startConversation();
});

// Handle events
audio.on('mic:activity', (data) => {
  console.log('Mic level:', data.volume, 'Speaking:', data.isSpeaking);
});

audio.on('playback:activity', (data) => {
  console.log('Playback level:', data.volume);
});

audio.on('ws:connected', () => {
  console.log('Connected to AI server');
});

audio.on('ws:message', (message) => {
  console.log('Received message:', message);
});

Core Components

Chatdio

The main orchestrator that ties everything together.

const audio = new Chatdio({
  microphone: { /* MicrophoneConfig */ },
  playback: { /* PlaybackConfig */ },
  websocket: { /* WebSocketConfig */ },
  deviceManager: { /* DeviceManagerConfig */ },
  activityAnalyzer: { /* ActivityAnalyzerConfig */ },
});

// Lifecycle
await audio.initialize();      // Initialize (from user gesture)
await audio.startConversation(); // Start mic + websocket
audio.stopConversation();      // Stop mic + playback
audio.dispose();               // Cleanup resources

// Turn management (barge-in / interruption)
const turnId = audio.startTurn();           // Start new turn, interrupt any playing audio
audio.interruptTurn();                       // Interrupt current turn, start new one
audio.interruptTurn(false);                  // Interrupt without starting new turn
audio.getCurrentTurnId();                    // Get current turn ID
audio.clearTurnBuffer(turnId);               // Clear buffered audio for a turn
await audio.playAudioForTurn(data, turnId);  // Play only if turn is current

// Device selection
audio.getInputDevices();       // List microphones
audio.getOutputDevices();      // List speakers
await audio.setInputDevice(deviceId);
await audio.setOutputDevice(deviceId);

// Volume control
audio.setVolume(0.8);
audio.getVolume();

// Mute
audio.setMicrophoneMuted(true);
audio.isMicrophoneMuted();

MicrophoneCapture

Standalone microphone capture with resampling and format conversion.

import { MicrophoneCapture } from 'chatdio';

const mic = new MicrophoneCapture({
  sampleRate: 16000,          // Output sample rate
  echoCancellation: true,
  noiseSuppression: true,
  autoGainControl: true,
  bufferSize: 2048,           // Processing buffer size
});

mic.on('data', (pcmData: ArrayBuffer) => {
  // 16-bit PCM audio data ready to send
  websocket.send(pcmData);
});

mic.on('level', (level: number) => {
  updateMeter(level);
});

await mic.start();
// ...
mic.stop();

AudioPlayback

Buffered audio playback with queue management.

import { AudioPlayback } from 'chatdio';

const playback = new AudioPlayback({
  sampleRate: 24000,
  bitDepth: 16,
  channels: 1,
  bufferAhead: 0.1,  // Buffer ahead time in seconds
});

await playback.initialize();

// Queue audio chunks as they arrive
playback.on('buffer-low', () => {
  console.log('Buffer running low');
});

playback.on('ended', () => {
  console.log('Finished playing all audio');
});

// Queue PCM data
await playback.queueAudio(pcmArrayBuffer);

// Control playback
playback.pause();
await playback.resume();
playback.stop();
playback.setVolume(0.8);

AudioDeviceManager

Device enumeration with change detection.

import { AudioDeviceManager } from 'chatdio';

const deviceManager = new AudioDeviceManager({
  autoFallback: true,    // Auto-switch on device disconnect
  pollInterval: 1000,    // Fallback polling interval
});

await deviceManager.initialize();

// List devices
deviceManager.getInputDevices();
deviceManager.getOutputDevices();

// Select devices
await deviceManager.setInputDevice(deviceId);
await deviceManager.setOutputDevice(deviceId);

// Listen for changes
deviceManager.on('devices-changed', (devices) => {
  updateDeviceList(devices);
});

deviceManager.on('device-disconnected', (device) => {
  console.log('Device disconnected:', device.label);
});

// Check Safari compatibility
if (!deviceManager.isOutputSelectionSupported()) {
  console.log('Output selection not supported (Safari)');
}

WebSocketBridge

WebSocket connection with auto-reconnection.

import { WebSocketBridge } from 'chatdio';

const ws = new WebSocketBridge({
  url: 'wss://ai-server.com/audio',
  autoReconnect: true,
  maxReconnectAttempts: 5,
  reconnectDelay: 1000,
  binaryMode: true,
  
  // Custom message wrapping
  wrapOutgoingAudio: (data) => {
    return JSON.stringify({
      type: 'audio',
      data: btoa(String.fromCharCode(...new Uint8Array(data))),
    });
  },
  
  // Custom message parsing
  parseIncomingAudio: (event) => {
    const msg = JSON.parse(event.data);
    if (msg.type === 'audio') {
      return base64ToArrayBuffer(msg.data);
    }
    return null;
  },
});

ws.on('connected', () => console.log('Connected'));
ws.on('disconnected', (code, reason) => console.log('Disconnected:', reason));
ws.on('reconnecting', (attempt) => console.log('Reconnecting...', attempt));
ws.on('audio', (data) => playback.queueAudio(data));
ws.on('message', (msg) => console.log('Message:', msg));

await ws.connect();
ws.sendAudio(pcmData);
ws.sendMessage({ type: 'transcript', text: 'Hello' });
ws.disconnect();
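The `parseIncomingAudio` example above calls `base64ToArrayBuffer`, a helper that chatdio does not export; you are expected to supply your own. A minimal sketch using the standard `atob()` (global in browsers and in recent Node.js):

```javascript
// Minimal base64 -> ArrayBuffer helper for use in parseIncomingAudio.
// Fine for short audio chunks; very large payloads may warrant a streaming decoder.
function base64ToArrayBuffer(base64) {
  const binary = atob(base64);                // decode base64 to a byte string
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);          // copy each byte value
  }
  return bytes.buffer;
}
```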

ActivityAnalyzer

Real-time audio analysis for visualizations.

import { ActivityAnalyzer, VisualizationUtils } from 'chatdio';

const analyzer = new ActivityAnalyzer({
  fftSize: 256,
  smoothingTimeConstant: 0.8,
  updateInterval: 50,  // ms
});

// Connect to an audio node
analyzer.connect(micCapture.getAnalyzerNode());
analyzer.start();

// Listen for activity updates
analyzer.on('activity', (data) => {
  // data.volume - RMS volume (0-1)
  // data.peak - Peak level with decay (0-1)
  // data.frequencyData - Uint8Array for spectrum
  // data.timeDomainData - Uint8Array for waveform
  // data.isSpeaking - Voice activity detection
  
  drawWaveform(data.timeDomainData);
  drawSpectrum(data.frequencyData);
});

analyzer.on('speaking-start', () => console.log('Started speaking'));
analyzer.on('speaking-stop', () => console.log('Stopped speaking'));

// Utility functions for visualization
const bands = analyzer.getFrequencyBands(8);  // Get 8 frequency bands
const waveformPath = VisualizationUtils.createWaveformPath(data.timeDomainData, 200, 50);
const barHeights = VisualizationUtils.createBarHeights(data.frequencyData, 16, 100);

Events

Chatdio Events

| Event | Payload | Description |
|-------|---------|-------------|
| mic:start | - | Microphone started |
| mic:stop | - | Microphone stopped |
| mic:data | ArrayBuffer | PCM audio data |
| mic:activity | AudioActivityData | Mic visualization data |
| mic:error | Error | Microphone error |
| playback:start | - | Playback started |
| playback:stop | - | Playback stopped |
| playback:ended | - | All queued audio finished |
| playback:activity | AudioActivityData | Playback visualization data |
| playback:error | Error | Playback error |
| ws:connected | - | WebSocket connected |
| ws:disconnected | code, reason | WebSocket disconnected |
| ws:reconnecting | attempt | Reconnection attempt |
| ws:audio | ArrayBuffer | Audio received from server |
| ws:message | unknown | Non-audio message received |
| ws:error | Error | WebSocket error |
| device:changed | AudioDevice[] | Device list changed |
| device:input-changed | AudioDevice \| null | Input device changed |
| device:output-changed | AudioDevice \| null | Output device changed |
| device:disconnected | AudioDevice | Device disconnected |
| turn:started | turnId, previousTurnId | New turn started |
| turn:interrupted | turnId | Turn was interrupted (barge-in) |
| turn:ended | turnId | Turn ended normally |

Turn Management (Barge-in)

Turn management allows you to handle conversation interruptions cleanly. When the user speaks while the AI is responding (barge-in), you can:

  1. Stop current playback immediately
  2. Clear any buffered audio
  3. Ignore any late-arriving audio from the interrupted turn

// Start a conversation turn when AI begins responding
const turnId = audio.startTurn();
console.log('Started turn:', turnId);

// When user interrupts (detected via voice activity or button)
audio.on('mic:activity', (data) => {
  if (data.isSpeaking && audio.isPlaybackActive()) {
    // User is speaking while AI is talking - barge-in!
    const { interruptedTurnId, newTurnId } = audio.interruptTurn();
    console.log('Interrupted turn:', interruptedTurnId);
    console.log('New turn:', newTurnId);
  }
});

// Server sends audio with turn ID
audio.on('ws:message', async (message) => {
  if (message.type === 'audio') {
    // Only play if turn matches - old audio is automatically ignored
    const played = await audio.playAudioForTurn(message.data, message.turnId);
    if (!played) {
      console.log('Ignored audio from old turn:', message.turnId);
    }
  }
});

// Listen for turn events
audio.on('turn:started', (turnId, previousTurnId) => {
  console.log('Turn started:', turnId, 'Previous:', previousTurnId);
});

audio.on('turn:interrupted', (turnId) => {
  console.log('Turn interrupted:', turnId);
  // Notify server to stop generating audio for this turn
  audio.sendMessage({ type: 'interrupt', turnId });
});

audio.on('turn:ended', (turnId) => {
  console.log('Turn ended naturally:', turnId);
});

Server-Side Turn ID Support

When your server sends audio, include a turnId in JSON messages:

{
  "type": "audio",
  "data": "base64_encoded_audio...",
  "turnId": "turn_123456789_1"
}
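On a Node.js server, a message in this shape can be assembled from raw PCM bytes with `Buffer` (a minimal sketch; only the `type`/`data`/`turnId` field names come from this README, everything else is illustrative):

```javascript
// Sketch: build the JSON audio message shown above from raw PCM bytes.
function makeAudioMessage(pcm, turnId) {
  return JSON.stringify({
    type: 'audio',
    data: Buffer.from(pcm).toString('base64'), // base64-encode the PCM payload
    turnId,
  });
}

const msg = makeAudioMessage(new Uint8Array([1, 2, 3]), 'turn_123456789_1');
```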

Or use a custom parser to extract the turn ID:

const audio = new Chatdio({
  websocket: {
    url: 'wss://your-server.com/audio',
    parseIncomingAudio: (event) => {
      const msg = JSON.parse(event.data);
      if (msg.type === 'audio') {
        return {
          data: base64ToArrayBuffer(msg.audio),
          turnId: msg.turn_id,  // Your server's turn ID field
        };
      }
      return null;
    },
  },
});

Type Definitions

interface AudioFormat {
  sampleRate: 8000 | 16000 | 22050 | 24000 | 44100 | 48000;
  bitDepth: 8 | 16 | 24 | 32;
  channels: 1 | 2;
}

interface AudioDevice {
  deviceId: string;
  label: string;
  kind: 'audioinput' | 'audiooutput';
  isDefault: boolean;
}

interface AudioActivityData {
  volume: number;
  peak: number;
  frequencyData: Uint8Array;
  timeDomainData: Uint8Array;
  isSpeaking: boolean;
}

type ConnectionState = 'disconnected' | 'connecting' | 'connected' | 'reconnecting' | 'error';

Browser Compatibility

| Feature | Chrome | Firefox | Safari |
|---------|--------|---------|--------|
| Mic Capture | ✅ | ✅ | ✅ |
| Echo Cancellation | ✅ | ✅ | ✅ |
| Audio Playback | ✅ | ✅ | ✅ |
| Output Device Selection | ✅ | ✅ | ❌ |
| Device Change Detection | ✅ | ✅ | Via polling |

Notes

  • User Gesture Required: initialize() and startMicrophone() must be called from a user interaction (click, touch) in Safari and Firefox
  • Safari Output: Output device selection (setSinkId) is not supported in Safari; audio plays through the default device
  • Echo Cancellation: Browser implementations vary; Chrome generally has the best echo cancellation
  • Sample Rates: Native sample rate depends on the audio device; resampling is done in JavaScript when needed
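To illustrate the resampling note above: linear interpolation is the simplest approach a JavaScript resampler can take. This sketch is not chatdio's actual internal resampler (production code would typically low-pass filter before downsampling to avoid aliasing):

```javascript
// Hypothetical sketch: resample Float32 PCM by linear interpolation.
function resampleLinear(input, fromRate, toRate) {
  const ratio = fromRate / toRate;                  // input samples per output sample
  const outLength = Math.floor(input.length / ratio);
  const output = new Float32Array(outLength);
  for (let i = 0; i < outLength; i++) {
    const pos = i * ratio;                          // fractional position in input
    const i0 = Math.floor(pos);
    const i1 = Math.min(i0 + 1, input.length - 1);
    const frac = pos - i0;
    output[i] = input[i0] * (1 - frac) + input[i1] * frac; // interpolate neighbors
  }
  return output;
}

// Downsampling 48 kHz -> 16 kHz keeps one sample of every three.
const input = new Float32Array([0, 0.3, 0.6, 0.9, 0.6, 0.3]);
const out = resampleLinear(input, 48000, 16000);
```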

iOS Compatibility

iOS Safari has strict requirements for audio playback. To ensure audio works on iPhone/iPad:

  1. Call unlockAudio() from a user gesture (click/touch handler):

// IMPORTANT: Call this directly from a button click or touch event
startButton.addEventListener('click', async () => {
  await audio.initialize();
  await audio.unlockAudio();  // Unlocks iOS audio
  await audio.startConversation();
});
  2. Why this is needed: iOS Safari requires audio to be "unlocked" by playing audio directly in response to a user gesture. The unlockAudio() method plays a tiny silent buffer, which enables subsequent programmatic audio playback.

  3. Common pitfall: If you initialize audio on page load or from a non-user-gesture context (like a setTimeout or Promise resolution), audio playback will fail silently on iOS.

  4. The unlockAudio() method:

    • Resumes the AudioContext if suspended
    • Plays a silent buffer to unlock iOS audio
    • Starts the audio element if using output device selection
    • Should be called once per session, from a user gesture

License

MIT