@pinecall/voice-core

v0.2.6

Framework-agnostic WebRTC voice session client for Pinecall agents


Install

npm install @pinecall/voice-core

Zero runtime dependencies. Browser-only (requires RTCPeerConnection, getUserMedia).
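
Because the package is browser-only, it can help to check for the two required APIs before constructing a session. The following is a minimal sketch, not part of the package: `BrowserLike` and `supportsVoice` are hypothetical names, and the function takes the global object as an argument so it can also be exercised outside a browser.

```typescript
// Minimal structural view of the globals the package is documented to require.
interface BrowserLike {
  RTCPeerConnection?: unknown;
  navigator?: { mediaDevices?: { getUserMedia?: unknown } };
}

// Hypothetical helper: true only when both WebRTC and microphone capture exist.
function supportsVoice(env: BrowserLike): boolean {
  return (
    typeof env.RTCPeerConnection === "function" &&
    typeof env.navigator?.mediaDevices?.getUserMedia === "function"
  );
}

// In a browser you would typically pass the global object:
//   if (supportsVoice(window)) { /* create the VoiceSession */ }
```

Note that `getUserMedia` is only exposed in secure contexts (HTTPS or localhost), so this check can fail on a plain HTTP page even in a modern browser.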


Quick Start

import { VoiceSession } from "@pinecall/voice-core";

const session = new VoiceSession({ agent: "mara" });

// React-style: subscribe to all state changes
session.subscribe(() => {
  const { status, phase, messages } = session.getState();
  console.log(status, phase, messages);
});

// Or event-style: listen to specific events
session.addEventListener("message", (e) => {
  console.log("New message:", e.detail.message);
});

session.addEventListener("event", (e) => {
  // Raw DataChannel event from the server
  console.log("Raw event:", e.detail);
});

await session.connect();

// Later...
session.disconnect();

API Reference

new VoiceSession(options)

Creates a new voice session instance. Does not connect automatically.

interface VoiceSessionOptions {
  /** Agent ID to connect to */
  agent: string;
  /**
   * Pinecall API base URL for token exchange.
   * Default: "https://voice.pinecall.io"
   * Only override for self-hosted deployments.
   */
  server?: string;
  /**
   * Session config overrides sent in the WebRTC offer body.
   * Use this for per-session voice, STT, language, turn detection, and greeting.
   * Format follows Pinecall's shortcut syntax (resolved server-side).
   *
   * @example
   * ```ts
   * config: {
   *   voice: "elevenlabs:EXAVITQu4vr4xnSDxMaL",
   *   stt: { provider: "deepgram", model: "nova-3", language: "es" },
   *   language: "es",
   *   turnDetection: "smart_turn",
   *   greeting: "¡Hola! ¿En qué puedo ayudarte?",
   * }
   * ```
   */
  config?: Record<string, unknown>;
  /** Metadata passed to the agent (available in call.metadata server-side) */
  metadata?: Record<string, unknown>;
}

Methods

session.connect(): Promise<void>

Initiates the WebRTC connection. Full flow:

  1. Fetches a short-lived token from GET /webrtc/token?agent_id=<agent>
  2. Fetches ICE servers from GET /webrtc/ice-servers (falls back to Google STUN)
  3. Requests microphone access (getUserMedia)
  4. Creates RTCPeerConnection, adds mic track, creates DataChannel
  5. Generates SDP offer, gathers ICE candidates
  6. Sends offer to POST /webrtc/offer with token
  7. Sets remote SDP answer → connection established

State transitions: idle → connecting → connected (or error).

await session.connect();
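
Steps 1 and 2 of the flow above can be sketched as two small pure functions. This is an illustration under stated assumptions, not the package's internals: `tokenUrl`, `pickIceServers`, and `IceServerLike` are hypothetical names, the shape of the ICE response is assumed to be an array of `{ urls }` objects, and the specific Google STUN address is an assumption (the README only says "Google STUN").

```typescript
interface IceServerLike {
  urls: string | string[];
}

// Assumed fallback: a public Google STUN server.
const FALLBACK_ICE: IceServerLike[] = [{ urls: "stun:stun.l.google.com:19302" }];

// Step 1: build the URL for the short-lived token request.
function tokenUrl(server: string, agent: string): string {
  const base = server.replace(/\/+$/, ""); // tolerate a trailing slash
  return `${base}/webrtc/token?agent_id=${encodeURIComponent(agent)}`;
}

// Step 2: use the server's ICE list when it is a non-empty array, else fall back.
function pickIceServers(fromServer: unknown): IceServerLike[] {
  if (Array.isArray(fromServer) && fromServer.length > 0) {
    return fromServer as IceServerLike[];
  }
  return FALLBACK_ICE;
}
```

The result of `pickIceServers` would be passed as `iceServers` when constructing the `RTCPeerConnection` in step 4.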

session.disconnect(): void

Closes the WebRTC connection, stops the microphone, clears timers. State returns to idle. Messages are preserved.

session.disconnect();

session.toggleMute(): void

Toggles the microphone. When muted, the audio track is disabled and a { action: "mute" } message is sent to the server via DataChannel so it stops processing audio server-side too.

session.toggleMute();

session.setMuted(muted: boolean): void

Explicit mute/unmute control.

session.setMuted(true);  // mute
session.setMuted(false); // unmute

session.getState(): Readonly<VoiceSessionState>

Returns the current state snapshot. The reference is stable — it only changes when state mutates (safe for useSyncExternalStore).

const { status, phase, messages, isMuted, duration } = session.getState();

session.subscribe(listener): () => void

Subscribes to all state changes. Returns an unsubscribe function. Designed for React's useSyncExternalStore.

const unsub = session.subscribe(() => {
  console.log(session.getState());
});

// Later:
unsub();
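
To see why the stable snapshot reference matters, here is a tiny stand-alone store with the same `getState` / `subscribe` shape. It is a sketch of the general pattern, not the package's implementation; `SnapshotStore` and `update` are hypothetical names.

```typescript
// Illustrative external store: the snapshot object is replaced, never mutated.
class SnapshotStore<T extends object> {
  private state: T;
  private listeners = new Set<() => void>();

  constructor(initial: T) {
    this.state = initial;
  }

  // Returns the same reference until update() runs.
  getState(): Readonly<T> {
    return this.state;
  }

  // Registers a listener and returns an unsubscribe function.
  subscribe(listener: () => void): () => void {
    this.listeners.add(listener);
    return () => {
      this.listeners.delete(listener);
    };
  }

  // Creates a new snapshot object, then notifies every listener.
  update(patch: Partial<T>): void {
    this.state = { ...this.state, ...patch };
    this.listeners.forEach((listener) => listener());
  }
}
```

`useSyncExternalStore` compares snapshots by reference, so a `getState()` that returned a fresh object on every call would cause React to re-render in a loop. Replacing the object only on mutation avoids that.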

session.destroy(): void

Disconnects, clears all subscribers, and makes the instance unusable. Call this on component unmount.

session.destroy();

session.configure(config): void

Sends a mid-call configuration update via DataChannel. Use this for live language/voice/STT switching during an active call. The server will hot-swap providers without disconnecting.

// Switch to Spanish mid-call
session.configure({
  voice: "elevenlabs:h2cd3gvcqTp3m65Dysk7",
  stt: { provider: "deepgram", model: "nova-3", language: "es" },
  language: "es",
});

session.updateOptions(patch): void

Updates session options before the next connect() call. Has no effect on an already-connected session — use configure() for that.

// Pre-connect: user selected Spanish in the UI
session.updateOptions({
  config: {
    voice: "elevenlabs:h2cd3gvcqTp3m65Dysk7",
    language: "es",
    greeting: "¡Hola!",
  },
});

await session.connect(); // will use the Spanish config
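
Since `configure()` and `updateOptions()` apply at different times, a UI that lets users change settings at any moment may want one entry point that picks the right method. The sketch below is one possible approach; `applyConfig` and `ConfigurableSession` are hypothetical names, and the interface is a structural stand-in for the documented methods rather than an import from the package.

```typescript
// Structural stand-in for the parts of VoiceSession used here.
interface ConfigurableSession {
  getState(): { status: string };
  configure(config: Record<string, unknown>): void;
  updateOptions(patch: { config?: Record<string, unknown> }): void;
}

// Hypothetical helper: route a config change to the method that will take effect.
function applyConfig(
  session: ConfigurableSession,
  config: Record<string, unknown>,
): "live" | "next-connect" {
  if (session.getState().status === "connected") {
    session.configure(config); // hot-swap over the DataChannel
    return "live";
  }
  session.updateOptions({ config }); // picked up by the next connect()
  return "next-connect";
}
```

If the same setting should also survive a reconnect, you could call `updateOptions()` in the connected branch as well. Whether `updateOptions()` merges or replaces an existing `config` is not stated here, so check that before relying on it.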

State

interface VoiceSessionState {
  /** Connection status */
  status: "idle" | "connecting" | "connected" | "error";
  /** Error message (when status is "error") */
  error: string | null;
  /** Whether the microphone is muted */
  isMuted: boolean;
  /** Current call phase — what the conversation is doing right now */
  phase: "idle" | "listening" | "speaking" | "pause" | "thinking";
  /** Whether the user is currently speaking (VAD/STT active) */
  userSpeaking: boolean;
  /** Whether the agent is currently speaking (TTS playing) */
  agentSpeaking: boolean;
  /** Call duration in seconds (updates every second) */
  duration: number;
  /** Full conversation transcript — user + bot messages */
  messages: TranscriptMessage[];
  /** Idle warning countdown — seconds until timeout (null if no warning active) */
  idleWarning: number | null;
}
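
`duration` is a plain number of seconds, so a call timer needs a small formatter. A minimal sketch follows; `formatDuration` is a hypothetical helper, not a package export.

```typescript
// Formats a duration in seconds as m:ss (for example 75 -> "1:15").
function formatDuration(totalSeconds: number): string {
  const whole = Math.max(0, Math.floor(totalSeconds)); // guard against negatives and fractions
  const minutes = Math.floor(whole / 60);
  const seconds = whole % 60;
  return `${minutes}:${seconds < 10 ? "0" : ""}${seconds}`;
}
```

Usage: `timerEl.textContent = formatDuration(session.getState().duration);` inside a `subscribe` callback.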

Call Phases

| Phase | Meaning | Triggered by |
|-------|---------|-------------|
| idle | Not in a call | Initial state, after disconnect |
| listening | Mic is hot, waiting for speech | Connection established, after bot finishes, after turn.resumed |
| speaking | Agent is speaking (TTS playing) | First bot.word event |
| thinking | Processing user input, waiting for LLM | user.message (STT final), turn.end |
| pause | Turn detection pause — user may still be talking | turn.pause (brief silence detected) |
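
The table above can be read as a small transition function over server events. The sketch below is one reading of it, not the package's state machine: `nextPhase` and `PhaseName` are hypothetical names, "after bot finishes" is assumed to mean the `bot.finished` event, and connect/disconnect are left out because they are not DataChannel events.

```typescript
type PhaseName = "idle" | "listening" | "speaking" | "pause" | "thinking";

// Maps a server event name to the phase the table associates with it.
function nextPhase(current: PhaseName, event: string): PhaseName {
  switch (event) {
    case "bot.word":
      return "speaking";
    case "user.message":
    case "turn.end":
      return "thinking";
    case "turn.pause":
      return "pause";
    case "turn.resumed":
    case "bot.finished": // assumption: this is what "after bot finishes" refers to
      return "listening";
    default:
      return current; // events outside the table leave the phase unchanged
  }
}
```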

Transcript Messages

interface TranscriptMessage {
  /** Unique ID (timestamp-based) */
  id: number;
  /** Who said it */
  role: "user" | "bot";
  /** The text content */
  text: string;
  /** User only: STT is still processing (partial result) */
  isInterim?: boolean;
  /** Bot only: TTS is currently playing this message */
  speaking?: boolean;
  /** Bot only: user interrupted before the message finished */
  interrupted?: boolean;
  /** Bot only: server-assigned ID for word-by-word tracking */
  messageId?: string;
}

Message lifecycle — User:

  1. user.speaking → creates message with isInterim: true, text updates as STT refines
  2. user.message → sets isInterim: false with final text

Message lifecycle — Bot:

  1. bot.speaking → creates empty message with speaking: true
  2. bot.word (×N) → text builds word-by-word as TTS plays each word
  3. bot.finished → sets speaking: false, optionally replaces text with final version
  4. bot.interrupted → sets speaking: false, interrupted: true (user barged in)
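
The two lifecycles above can be sketched as a pure reducer over the transcript. This is an illustration of the described behavior, not the package's code: `reduceMessages`, `Msg`, and `ServerEvent` are hypothetical names, words are assumed to be joined with a single space, and `now` stands in for the timestamp-based ID.

```typescript
interface Msg {
  id: number;
  role: "user" | "bot";
  text: string;
  isInterim?: boolean;
  speaking?: boolean;
  interrupted?: boolean;
  messageId?: string;
}

interface ServerEvent {
  event: string;
  text?: string;
  word?: string;
  message_id?: string;
}

function reduceMessages(messages: Msg[], ev: ServerEvent, now: number): Msg[] {
  const last = messages[messages.length - 1];
  const openUser = last && last.role === "user" && last.isInterim ? last : null;
  // Applies a patch to the bot message that matches the event's message_id.
  const patchBot = (patch: Partial<Msg>): Msg[] =>
    messages.map((m) =>
      m.role === "bot" && m.messageId === ev.message_id ? { ...m, ...patch } : m,
    );

  switch (ev.event) {
    case "user.speaking": // partial STT: update the open interim message or start one
      return openUser
        ? [...messages.slice(0, -1), { ...openUser, text: ev.text ?? "" }]
        : [...messages, { id: now, role: "user", text: ev.text ?? "", isInterim: true }];
    case "user.message": // final STT: lock the text
      return openUser
        ? [...messages.slice(0, -1), { ...openUser, text: ev.text ?? openUser.text, isInterim: false }]
        : [...messages, { id: now, role: "user", text: ev.text ?? "", isInterim: false }];
    case "bot.speaking": // start empty; the text is built from bot.word events
      return [...messages, { id: now, role: "bot", text: "", speaking: true, messageId: ev.message_id }];
    case "bot.word":
      return messages.map((m) =>
        m.role === "bot" && m.messageId === ev.message_id
          ? { ...m, text: m.text ? `${m.text} ${ev.word ?? ""}` : (ev.word ?? "") }
          : m,
      );
    case "bot.finished":
      return patchBot(ev.text ? { speaking: false, text: ev.text } : { speaking: false });
    case "bot.interrupted":
      return patchBot({ speaking: false, interrupted: true });
    default:
      return messages;
  }
}
```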

Events (EventTarget)

VoiceSession extends EventTarget. You can listen to typed custom events:

"status" — Connection status changed

session.addEventListener("status", (e: CustomEvent) => {
  console.log(e.detail.status); // "idle" | "connecting" | "connected" | "error"
});

"phase" — Call phase changed

session.addEventListener("phase", (e: CustomEvent) => {
  console.log(e.detail.phase); // "idle" | "listening" | "speaking" | "pause" | "thinking"
});

"message" — Transcript message added or updated

Fires when a new message is added or an existing one is updated (partial STT, word-by-word bot text).

session.addEventListener("message", (e: CustomEvent) => {
  const msg = e.detail.message; // TranscriptMessage
  if (msg.role === "user" && !msg.isInterim) {
    console.log("User said:", msg.text);
  }
  if (msg.role === "bot" && !msg.speaking) {
    console.log("Bot finished saying:", msg.text);
  }
});

"error" — An error occurred

session.addEventListener("error", (e: CustomEvent) => {
  console.error("Voice error:", e.detail.error);
});

"change" — Any state change

Fires on every state mutation. The full state is in e.detail.state.

session.addEventListener("change", (e: CustomEvent) => {
  const state = e.detail.state; // VoiceSessionState
});

"event" — Raw DataChannel event

This is the power-user event. Every JSON message from the server's DataChannel is forwarded as-is. Use this to access events that the state machine doesn't expose — like tool calls, function results, audio metrics, or custom events your agent emits.

session.addEventListener("event", (e: CustomEvent) => {
  const raw = e.detail; // any — the raw JSON from the server
  console.log(raw.event, raw);
});

DataChannel Protocol

The WebRTC DataChannel ("events", ordered) carries JSON messages between client and server. The client sends pings and mute/unmute commands. The server sends the following events:

Speech Detection (STT)

| Event | Fields | Description |
|-------|--------|-------------|
| speech.started | — | User started physically speaking (VAD detected voice) |
| speech.ended | — | User stopped speaking (VAD silence) |
| user.speaking | text | STT partial/interim result — text may change |
| user.message | text | STT final result — text is locked, turn is over |

Turn Detection

| Event | Fields | Description |
|-------|--------|-------------|
| turn.pause | — | Brief silence detected — user might still be talking |
| turn.end | — | Silence confirmed — user's turn is over, LLM starts |
| turn.resumed | — | User started speaking again during the pause |

Bot Speech (TTS)

| Event | Fields | Description |
|-------|--------|-------------|
| bot.speaking | message_id, text | TTS generation started. text has the full response but the widget intentionally starts empty and builds word-by-word. |
| bot.word | message_id, word, word_index | A single word was spoken by TTS. Arrives in real-time as audio plays. |
| bot.finished | message_id, text | TTS completed normally. text has the final complete response. |
| bot.interrupted | message_id | User barged in — TTS was cut short. |

Audio Metrics

| Event | Fields | Description |
|-------|--------|-------------|
| audio.metrics | source, is_speech, level | Server-side audio analysis. source is "user" or "bot". |

LLM / Tool Events (via "event" listener)

These events are not processed by the state machine but are forwarded through the "event" listener. They come from the Pinecall pipeline's LLM handler:

| Event | Fields | Description |
|-------|--------|-------------|
| llm.thinking | — | LLM started generating a response |
| llm.tool_call | tool_name, arguments, call_id | LLM requested a tool/function call |
| llm.tool_result | call_id, result | Tool execution result sent back to LLM |
| llm.response | text, finish_reason | LLM finished generating (text may be empty if tool-only) |
| llm.error | error | LLM error occurred |

Session Limits

| Event | Fields | Description |
|-------|--------|-------------|
| session.idle_warning | remaining_seconds | User hasn't spoken — call will time out in remaining_seconds. Triggers idleWarning state. |
| session.timeout | reason | Session timed out ("idle_timeout" or "max_duration"). Client auto-disconnects. |
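
The `idleWarning` state driven by `session.idle_warning` can feed a simple countdown banner. A minimal sketch, assuming you render the text yourself; `idleWarningText` is a hypothetical helper and the wording is only an example.

```typescript
// Returns banner text while a warning is active, or null when there is none.
function idleWarningText(idleWarning: number | null): string | null {
  if (idleWarning === null) return null;
  const seconds = Math.max(0, Math.ceil(idleWarning));
  return `Still there? The call will end in ${seconds} second${seconds === 1 ? "" : "s"}.`;
}
```

Usage: call `idleWarningText(session.getState().idleWarning)` inside a `subscribe` callback and hide the banner when it returns `null`.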

Example — Monitoring tool calls:

session.addEventListener("event", (e) => {
  const { event, tool_name, arguments: args, result } = e.detail;

  if (event === "llm.tool_call") {
    console.log(`Agent calling ${tool_name}(${JSON.stringify(args)})`);
  }
  if (event === "llm.tool_result") {
    console.log(`Tool result:`, result);
  }
});
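
Because `llm.tool_call` and `llm.tool_result` share a `call_id`, the two events can be paired to report each completed call once. The sketch below is one way to do that; `ToolCallTracker` and `RawEvent` are hypothetical names, and the field types are assumptions based on the table above.

```typescript
interface RawEvent {
  event: string;
  call_id?: string;
  tool_name?: string;
  arguments?: unknown;
  result?: unknown;
}

interface CompletedCall {
  toolName: string;
  args: unknown;
  result: unknown;
}

class ToolCallTracker {
  private pending = new Map<string, { toolName: string; args: unknown }>();

  // Feed every raw event; returns a completed call when a result matches a pending call.
  handle(raw: RawEvent): CompletedCall | null {
    if (raw.event === "llm.tool_call" && raw.call_id) {
      this.pending.set(raw.call_id, {
        toolName: raw.tool_name ?? "unknown",
        args: raw.arguments,
      });
      return null;
    }
    if (raw.event === "llm.tool_result" && raw.call_id) {
      const call = this.pending.get(raw.call_id);
      if (!call) return null; // result without a known call: ignore
      this.pending.delete(raw.call_id);
      return { ...call, result: raw.result };
    }
    return null;
  }
}
```

Usage: create one tracker per session and call `tracker.handle(e.detail)` from the `"event"` listener; a non-null return value is a finished tool call.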

Client → Server Messages

The client sends these through the DataChannel:

| Message | Format | Description |
|---------|--------|-------------|
| Ping | "ping" (string) | Keepalive, sent every 1s |
| Mute | { "action": "mute" } | Stop processing user audio server-side |
| Unmute | { "action": "unmute" } | Resume processing user audio |
| Configure | { "action": "configure", ...config } | Hot-swap voice, STT, language, or turn detection mid-call |
| Inject Text | { "action": "inject_text", "text": "..." } | Send text as if the user spoke it (for tool UI interactions) |
| Set Context | { "action": "set_context", "key": "...", "value": "..." } | Inject/update keyed context in the LLM prompt |
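
The wire formats in the table can be captured as a typed union plus a serializer. This is a sketch of the message shapes only: `ClientMessage`, `PING`, and `encodeClientMessage` are hypothetical names, and this README does not document a public method for sending `inject_text` or `set_context`, so how the string reaches the DataChannel is left open.

```typescript
type ClientMessage =
  | { action: "mute" }
  | { action: "unmute" }
  | { action: "configure"; [key: string]: unknown }
  | { action: "inject_text"; text: string }
  | { action: "set_context"; key: string; value: string };

// The keepalive is a bare string, not JSON.
const PING = "ping";

// Serializes a client message to the string form sent over the DataChannel.
function encodeClientMessage(msg: ClientMessage | typeof PING): string {
  return typeof msg === "string" ? msg : JSON.stringify(msg);
}
```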


Usage Patterns

Vanilla JavaScript

import { VoiceSession } from "@pinecall/voice-core";

const session = new VoiceSession({ agent: "florencia" });

// UI binding
const btn = document.getElementById("call-btn");
const transcript = document.getElementById("transcript");

btn.onclick = async () => {
  if (session.getState().status === "connected") {
    session.disconnect();
    btn.textContent = "Start Call";
  } else {
    await session.connect();
    btn.textContent = "End Call";
  }
};

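// "message" fires for new and updated messages, so reuse one element per message ID.
session.addEventListener("message", (e) => {
  const msg = e.detail.message;
  const elId = `msg-${msg.id}`;
  let div = document.getElementById(elId);
  if (!div) {
    div = document.createElement("div");
    div.id = elId;
    div.className = msg.role;
    transcript.appendChild(div);
  }
  div.textContent = `${msg.role}: ${msg.text}`;
});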

session.addEventListener("phase", (e) => {
  document.body.dataset.phase = e.detail.phase;
});

React with useSyncExternalStore

import { useSyncExternalStore, useCallback, useState, useEffect } from "react";
import { VoiceSession } from "@pinecall/voice-core";

function useVoiceSession(agent: string) {
  const [session] = useState(() => new VoiceSession({ agent }));

  const state = useSyncExternalStore(
    useCallback((cb) => session.subscribe(cb), [session]),
    () => session.getState(),
  );

  useEffect(() => () => session.destroy(), [session]);

  return { ...state, session };
}

Vue 3 Composable

import { shallowRef, onUnmounted } from "vue";
import { VoiceSession } from "@pinecall/voice-core";

export function useVoiceSession(agent: string) {
  const session = new VoiceSession({ agent });
  // shallowRef: each snapshot is replaced wholesale, so deep reactivity is unnecessary
  const state = shallowRef(session.getState());

  session.subscribe(() => {
    state.value = session.getState();
  });

  onUnmounted(() => session.destroy());

  return { state, session };
}

Svelte Store

import { readable } from "svelte/store";
import { VoiceSession } from "@pinecall/voice-core";

export function createVoiceSession(agent: string) {
  const session = new VoiceSession({ agent });

  const state = readable(session.getState(), (set) => {
    return session.subscribe(() => set(session.getState()));
  });

  return { state, session };
}

WebRTC Connection Flow


TypeScript Types

All types are exported from the package:

import type {
  VoiceSessionOptions,
  VoiceSessionState,
  SessionStatus,    // "idle" | "connecting" | "connected" | "error"
  CallPhase,        // "idle" | "listening" | "speaking" | "pause" | "thinking"
  TranscriptMessage,
} from "@pinecall/voice-core";