browser-voice

v0.1.0

Browser-first voice capture, playback, and WebSocket transport for realtime apps.

This package is a small TypeScript library for:

  • capturing microphone audio in the browser
  • sending audio over plain WebSocket in configurable formats
  • receiving remote audio over plain WebSocket in configurable formats
  • playing remote audio with Web Audio controls
  • exchanging custom JSON/control events alongside audio

It is intentionally simpler than a full realtime media SDK:

  • no room / participant / publication model
  • no WebRTC signaling
  • no SFU assumptions
  • no vendor-specific backend contract

Status

Experimental, but usable.

The API is designed to stay small and explicit. Expect additive evolution while the transport and browser-compatibility edges continue to harden.

Inspiration

This library is inspired by, and honestly vibe-coded from, LiveKit's WebSocket / voice implementation patterns, especially the browser audio, transport, and recovery ideas in livekit-client.

It is not a port of LiveKit, and it does not depend on LiveKit at runtime.

Attribution and license notes

This package is licensed under Apache-2.0.

Because the implementation is heavily inspired by, and in a few places adapted from, upstream open-source work, the package also includes:

  • NOTICE
  • CREDITS.md
  • THIRD_PARTY_NOTICES.md

These files document upstream attributions and third-party notices, including:

  • LiveKit client SDK inspiration / Apache-2.0 notice context
  • ts-debounce MIT attribution for the adapted debounce helper

Why this package exists

Use browser-voice when you want browser voice behavior inspired by larger realtime SDKs without adopting a full room/signaling model.

Typical use cases:

  • browser client -> custom .NET / Node / Python voice backend over WebSocket
  • raw PCM16 or JSON audio payloads instead of WebRTC
  • custom control events mixed with audio on the same socket
  • Azure / OpenAI / bespoke realtime backends

Features

  • microphone capture defaults tuned for voice
  • autoplay-safe AudioContext handling
  • explicit startAudio() playback unlock flow
  • optional pre-connect audio buffering
  • automatic capture recovery for ended / missing microphone tracks
  • debounced media-device observation
  • configurable incoming and outgoing audio formats
  • reconnect backoff for plain WebSocket sessions
  • backpressure-aware frame queueing
  • optional capture-side noise suppression processor
  • voice activity tracking
  • remote playback analyser for visualizers
  • remote playback gain / EQ / limiter hooks
  • custom JSON event send / receive support
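
To illustrate the reconnect backoff item above, here is a minimal sketch of capped exponential backoff with jitter. It is not the library's `BackoffStrategy`; the `backoffDelay` helper, its option names, and the default values are all hypothetical and chosen only for illustration.

```typescript
// Hypothetical helper for illustration; not part of browser-voice.
interface BackoffOptions {
  baseMs: number; // delay before the first retry
  maxMs: number; // upper bound for any single delay
  jitter: number; // 0..1 fraction of the delay that may be randomized away
}

function backoffDelay(
  attempt: number,
  options: BackoffOptions = { baseMs: 250, maxMs: 10_000, jitter: 0.2 },
  random: () => number = Math.random,
): number {
  // Double the delay on every failed attempt, then cap it.
  const exponential = options.baseMs * 2 ** Math.max(0, attempt);
  const capped = Math.min(options.maxMs, exponential);
  // Shave off up to `jitter` of the delay so many clients do not retry in lockstep.
  return Math.round(capped * (1 - options.jitter * random()));
}
```

Passing `random` as a parameter keeps the function deterministic in tests while defaulting to `Math.random` in real use.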

Install

npm install browser-voice tslib

or

pnpm add browser-voice tslib

Runtime requirements

  • modern browser with:
    • MediaDevices.getUserMedia
    • WebSocket
    • AudioContext
  • Node.js >= 20.19.0 for package development, docs generation, and demo tooling

This is a browser-first library. It is not intended to capture or play audio directly in Node.js.

Quick start

import {
  NoiseSuppressionProcessor,
  PcmAudioPlayer,
  VoiceCapture,
  VoiceWebSocket,
} from 'browser-voice';

const capture = new VoiceCapture({
  autoRecover: true,
  preConnectBufferMs: 1500,
  processor: new NoiseSuppressionProcessor(),
  targetSampleRate: 24000,
  targetChannelCount: 1,
});

const player = new PcmAudioPlayer({
  initialBufferMs: 120,
});

const voiceSocket = new VoiceWebSocket({
  url: 'wss://example.com/voice',
  capture,
  player,
  autoReconnect: true,
  incomingAudioFormat: 'raw-pcm16',
  incomingAudioFormatOptions: {
    sampleRate: 24000,
    channels: 1,
  },
  outgoingAudioFormat: 'raw-pcm16',
  onJsonEvent: ({ parsed }) => {
    console.log('control event', parsed);
  },
});

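// Open the socket, then unlock playback and start the microphone.
// Browsers generally require startAudio() to run from a user gesture such as a click.
await voiceSocket.connect();
await player.startAudio();
await capture.start();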

voiceSocket.sendJsonEvent({
  type: 'ping',
  timestamp: Date.now(),
});

Transport formats

Incoming audio formats

VoiceWebSocket supports:

  • framed-pcm16
  • raw-pcm16
  • json
  • auto

Outgoing audio formats

VoiceWebSocket supports:

  • framed-pcm16
  • raw-pcm16
  • json

Default framed PCM layout

The default framed-pcm16 binary format is:

  • 4 bytes: sample rate (uint32, little-endian)
  • 2 bytes: channel count (uint16, little-endian)
  • 2 bytes: reserved flags (uint16, little-endian)
  • remaining bytes: signed PCM16 payload
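
For backend authors, the layout above can be sketched with a `DataView`. This is an independent illustration rather than the library's own encoder: the function names are hypothetical, and it assumes the PCM16 payload is little-endian like the header fields, which the layout description does not state explicitly.

```typescript
const HEADER_BYTES = 8;

interface FramedPcm16 {
  sampleRate: number;
  channels: number;
  flags: number;
  samples: Int16Array;
}

// Hypothetical encoder: 4-byte sample rate, 2-byte channel count,
// 2-byte reserved flags, then the PCM16 samples.
function encodeFrame(sampleRate: number, channels: number, samples: Int16Array, flags = 0): ArrayBuffer {
  const buffer = new ArrayBuffer(HEADER_BYTES + samples.length * 2);
  const view = new DataView(buffer);
  view.setUint32(0, sampleRate, true);
  view.setUint16(4, channels, true);
  view.setUint16(6, flags, true);
  for (let i = 0; i < samples.length; i++) {
    // Assumption: payload samples are little-endian.
    view.setInt16(HEADER_BYTES + i * 2, samples[i], true);
  }
  return buffer;
}

// Hypothetical decoder: rejects buffers that cannot hold a header plus whole samples.
function decodeFrame(buffer: ArrayBuffer): FramedPcm16 {
  const payloadBytes = buffer.byteLength - HEADER_BYTES;
  if (payloadBytes < 0 || payloadBytes % 2 !== 0) {
    throw new Error('Buffer is not a valid framed PCM16 message');
  }
  const view = new DataView(buffer);
  const samples = new Int16Array(payloadBytes / 2);
  for (let i = 0; i < samples.length; i++) {
    samples[i] = view.getInt16(HEADER_BYTES + i * 2, true);
  }
  return {
    sampleRate: view.getUint32(0, true),
    channels: view.getUint16(4, true),
    flags: view.getUint16(6, true),
    samples,
  };
}
```

Reading through a `DataView` with an explicit endianness flag avoids depending on the host byte order, which a direct `Int16Array` view over the buffer would do.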

raw-pcm16

Use raw-pcm16 when the backend expects plain binary PCM16 with no custom library header.

Important:

  • outgoing browser audio is sent as raw PCM16 bytes only
  • incoming binary audio is interpreted using incomingAudioFormatOptions
  • text / JSON messages are treated as non-audio and routed to onJsonEvent
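
Because raw PCM16 carries no header, both sides have to agree on sample rate and channel count out of band. The sketch below shows one common way to turn Web Audio float samples into PCM16 bytes and how the configured options determine the duration of a received chunk. The helper names are hypothetical, and little-endian byte order is an assumption.

```typescript
interface RawPcm16Options {
  sampleRate: number;
  channels: number;
}

// Hypothetical helper: convert [-1, 1] float samples to little-endian PCM16 bytes.
function floatToPcm16Bytes(input: Float32Array): ArrayBuffer {
  const out = new ArrayBuffer(input.length * 2);
  const view = new DataView(out);
  for (let i = 0; i < input.length; i++) {
    const clamped = Math.max(-1, Math.min(1, input[i]));
    // Negative values scale to -32768, positive values to 32767.
    const scaled = clamped < 0 ? clamped * 0x8000 : clamped * 0x7fff;
    view.setInt16(i * 2, Math.round(scaled), true);
  }
  return out;
}

// Hypothetical helper: without a header, duration can only be derived from the options.
function describeRawPcm16(bytes: ArrayBuffer, options: RawPcm16Options) {
  const totalSamples = Math.floor(bytes.byteLength / 2);
  const framesPerChannel = Math.floor(totalSamples / options.channels);
  return {
    framesPerChannel,
    durationMs: (framesPerChannel * 1000) / options.sampleRate,
  };
}
```

If the options do not match what the backend actually sends, the audio still decodes but plays at the wrong speed or with channels interleaved incorrectly, so it is worth validating them early.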

json

Expected audio JSON examples:

{
  "sampleRate": 24000,
  "channels": 1,
  "pcm16": [100, -200, 300]
}

or

{
  "sampleRate": 24000,
  "channels": 1,
  "pcm16Base64": "..."
}

Non-audio JSON is routed to onJsonEvent.
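
As a rough illustration of how a receiver might handle the two JSON shapes above, the following sketch parses a text message and returns the samples, or `null` for anything that is not audio. The `decodeJsonAudio` name is hypothetical, and it assumes `pcm16Base64` holds little-endian PCM16 bytes, which the examples above do not specify.

```typescript
interface JsonAudio {
  sampleRate: number;
  channels: number;
  samples: Int16Array;
}

// Hypothetical helper: returns null for invalid JSON or non-audio messages.
function decodeJsonAudio(text: string): JsonAudio | null {
  let message: unknown;
  try {
    message = JSON.parse(text);
  } catch {
    return null;
  }
  if (typeof message !== 'object' || message === null) return null;

  const { sampleRate, channels, pcm16, pcm16Base64 } = message as Record<string, unknown>;
  if (typeof sampleRate !== 'number' || typeof channels !== 'number') return null;

  // Shape 1: samples as a plain array of integers.
  if (Array.isArray(pcm16) && pcm16.every((value) => Number.isInteger(value))) {
    return { sampleRate, channels, samples: Int16Array.from(pcm16 as number[]) };
  }

  // Shape 2: samples as base64-encoded bytes (assumed little-endian).
  if (typeof pcm16Base64 === 'string') {
    let binary: string;
    try {
      binary = atob(pcm16Base64);
    } catch {
      return null;
    }
    const bytes = new Uint8Array(binary.length);
    for (let i = 0; i < binary.length; i++) bytes[i] = binary.charCodeAt(i);
    const view = new DataView(bytes.buffer);
    const samples = new Int16Array(Math.floor(bytes.length / 2));
    for (let i = 0; i < samples.length; i++) samples[i] = view.getInt16(i * 2, true);
    return { sampleRate, channels, samples };
  }

  return null;
}
```

The base64 form is usually the better choice for larger chunks, since a numeric array costs several characters per sample once serialized.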

auto

auto is for mixed transports. It will:

  • try framed PCM first for binary audio
  • accept JSON-wrapped audio
  • ignore or route non-audio JSON to onJsonEvent
  • fall back to raw PCM16 when incomingAudioFormatOptions are configured
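
The order described above could look roughly like the classifier below. This is a sketch under stated assumptions, not the library's detection logic: the `classifyIncoming` name, the `text` and `unknown` results, and especially the header plausibility check (sample rate between 8000 and 192000, 1 to 8 channels) are invented for illustration.

```typescript
type Classified = 'framed-pcm16' | 'raw-pcm16' | 'json-audio' | 'json-event' | 'text' | 'unknown';

// Hypothetical classifier mirroring the documented order for `auto`.
function classifyIncoming(data: string | ArrayBuffer, hasRawFallbackOptions: boolean): Classified {
  if (typeof data === 'string') {
    try {
      const parsed: unknown = JSON.parse(data);
      const isAudio =
        typeof parsed === 'object' &&
        parsed !== null &&
        ('pcm16' in parsed || 'pcm16Base64' in parsed);
      return isAudio ? 'json-audio' : 'json-event';
    } catch {
      return 'text'; // not JSON at all
    }
  }

  // Binary: try the 8-byte framed header first.
  if (data.byteLength >= 8) {
    const view = new DataView(data);
    const sampleRate = view.getUint32(0, true);
    const channels = view.getUint16(4, true);
    // Assumption: these bounds are an illustrative plausibility check only.
    const plausible =
      sampleRate >= 8000 &&
      sampleRate <= 192000 &&
      channels >= 1 &&
      channels <= 8 &&
      (data.byteLength - 8) % 2 === 0;
    if (plausible) return 'framed-pcm16';
  }

  // Otherwise fall back to raw PCM16 only when format options were supplied.
  return hasRawFallbackOptions ? 'raw-pcm16' : 'unknown';
}
```

Any heuristic of this kind can misfire, because raw PCM16 bytes may happen to resemble a valid header. When the backend format is known in advance, an explicit format setting avoids that ambiguity.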

Custom control / data events

You can send custom JSON to the backend independently of the audio format:

voiceSocket.sendJsonEvent({
  type: 'assistant.reset',
  correlationId: '123',
});

You can also send plain text:

voiceSocket.sendTextMessage('ping');

On inbound messages, use:

const socket = new VoiceWebSocket({
  // ...
  onJsonEvent: ({ format, parsed, rawText }) => {
    console.log(format, parsed, rawText);
  },
});
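
Since `parsed` arrives as untyped JSON, it can help to narrow it before acting on it. The sketch below is one possible pattern, not something the library provides: the `ControlEvent` union, the `pong` event, and both helper functions are hypothetical, with `assistant.reset` borrowed from the send example above.

```typescript
// Hypothetical application-level event types.
type ControlEvent =
  | { type: 'pong'; timestamp: number }
  | { type: 'assistant.reset'; correlationId: string };

// Narrow an untyped payload into a known event, or return null.
function toControlEvent(parsed: unknown): ControlEvent | null {
  if (typeof parsed !== 'object' || parsed === null) return null;
  const { type, timestamp, correlationId } = parsed as Record<string, unknown>;
  if (type === 'pong' && typeof timestamp === 'number') {
    return { type, timestamp };
  }
  if (type === 'assistant.reset' && typeof correlationId === 'string') {
    return { type, correlationId };
  }
  return null;
}

// The discriminated union lets the compiler check that every event is handled.
function describeEvent(event: ControlEvent): string {
  switch (event.type) {
    case 'pong':
      return `pong at ${event.timestamp}`;
    case 'assistant.reset':
      return `reset ${event.correlationId}`;
  }
}
```

Inside `onJsonEvent`, a handler could call `toControlEvent(parsed)` and ignore or log anything that comes back as `null`.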

Audio behavior notes

Silence and server VAD

If your backend depends on server-side VAD, do not drop silent outgoing frames unless you are also manually committing turns.

skipSilentFrames can reduce bandwidth, but it can also prevent backends like Azure server VAD from detecting end-of-speech.
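
To make the trade-off concrete, here is a minimal sketch of what a silence check on a PCM16 frame can look like. It is not how `skipSilentFrames` is implemented; the RMS approach, the helper names, and the `0.01` threshold are assumptions for illustration.

```typescript
// Hypothetical helper: root-mean-square level of a PCM16 frame, normalized to 0..1.
function frameRms(samples: Int16Array): number {
  if (samples.length === 0) return 0;
  let sumOfSquares = 0;
  for (let i = 0; i < samples.length; i++) {
    const normalized = samples[i] / 32768;
    sumOfSquares += normalized * normalized;
  }
  return Math.sqrt(sumOfSquares / samples.length);
}

// Hypothetical helper: the threshold is an arbitrary illustrative value.
function isSilentFrame(samples: Int16Array, threshold = 0.01): boolean {
  return frameRms(samples) < threshold;
}
```

A client that drops every frame for which such a check returns `true` never sends the trailing silence after an utterance, and that trailing silence is typically what a server-side VAD uses to decide that speech has ended.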

Noise suppression

The library uses two layers:

  • browser-native constraints such as echoCancellation, noiseSuppression, autoGainControl, and voiceIsolation
  • an optional NoiseSuppressionProcessor that applies lightweight browser-side filtering / gating / compression

The built-in processor is not a full acoustic echo canceller. It complements browser voice processing; it does not replace it.
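
Support for the browser-native constraints named above varies between browsers, so one option is to request only those the browser reports as supported. The helper below is a hypothetical sketch of that idea and is independent of how `VoiceCapture` builds its own constraints.

```typescript
type VoiceConstraintName =
  | 'echoCancellation'
  | 'noiseSuppression'
  | 'autoGainControl'
  | 'voiceIsolation';

type VoiceConstraintFlags = { [K in VoiceConstraintName]?: boolean };

// Hypothetical helper: enable each voice constraint only if it is reported as supported.
function buildVoiceConstraints(supported: VoiceConstraintFlags): VoiceConstraintFlags {
  const wanted: VoiceConstraintName[] = [
    'echoCancellation',
    'noiseSuppression',
    'autoGainControl',
    'voiceIsolation',
  ];
  const constraints: VoiceConstraintFlags = {};
  for (const name of wanted) {
    if (supported[name]) constraints[name] = true;
  }
  return constraints;
}
```

In a browser, the `supported` argument would typically come from `navigator.mediaDevices.getSupportedConstraints()`, and the result would be passed as the `audio` value to `navigator.mediaDevices.getUserMedia()`.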

Playback effects

PcmAudioPlayer supports:

  • setGain() / setVolume()
  • setEqualizer() for low / mid / high EQ shelves
  • setLimiter() for a built-in limiter / dynamics-compressor setup
  • setProcessorChain() for custom AudioNode[]
  • getAnalyser() for visualizers

Example:

const player = new PcmAudioPlayer();

player.setEqualizer({
  lowDb: 2,
  midDb: -1,
  highDb: 1,
});

player.setLimiter({
  enabled: true,
  threshold: -8,
  ratio: 12,
});

const analyser = player.getAnalyser();
const volume = analyser.getVolume();
const bars = analyser.getFrequencyBands(16);

Demo

Run the local demo server:

pnpm demo

Then open the printed URL in your browser.

The demo includes:

  • transport format selection
  • sample rate / channel configuration
  • mic capture
  • playback unlock
  • playback visualizer
  • gain / EQ / limiter controls
  • custom JSON event sender
  • mixed text + binary relay through the demo server

API overview

Main exports:

  • VoiceCapture
  • VoiceWebSocket
  • PcmAudioPlayer
  • AudioPlaybackManager
  • PlaybackAnalyser
  • NoiseSuppressionProcessor
  • VoiceActivityDetector
  • observeMediaDevices
  • BackoffStrategy
  • debounce

Development

pnpm build
pnpm run docs:api
pnpm lint
pnpm test
pnpm coverage

Generated API docs are written to docs/api/.

Coverage reports are written to coverage/.

If demo behavior changes, also run:

node --check packages/browser-voice/demo/server.mjs
node --check packages/browser-voice/demo/public/app.js

License

Apache-2.0