
@babelbeez/sdk v0.4.1

Babelbeez Headless SDK (@babelbeez/sdk)

Build your own UI for the Babelbeez AI voice agent – custom buttons, call controls, and chat views – while we handle realtime audio, OpenAI Realtime, and connection lifecycle.

The Babelbeez Headless SDK gives you low‑level, event‑driven control of a Babelbeez Voice Agent from the browser.

Note: Using this SDK requires a Babelbeez account and a configured Voice Agent. Sign up at https://www.babelbeez.com.

Use this SDK when you want to:

  • Replace the default widget with your own button or call UI
  • Show live transcripts in your app
  • Combine voice + text input in a single experience
  • Orchestrate human handoffs (email / WhatsApp) from your own components
  • Let Babelbeez attach remote MCP tools to the live session during backend session initialization when your Voice Agent is configured with MCP plug-ins

If you just want a drop‑in chat button, use the standard Babelbeez embed instead. This SDK is for developers who want full control over the UX.


Installation

npm install @babelbeez/sdk

Requirements

  • A Babelbeez account and at least one configured Voice Agent
  • Modern browser with WebRTC and microphone support
  • Page served over HTTPS (or localhost) so the browser will allow mic access
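The browser-side requirements above can be checked before rendering a call button at all. A minimal sketch; the helper and its names are our own, not part of the SDK:

```typescript
// Hypothetical pre-flight helper (not part of @babelbeez/sdk): report which
// environment requirements are missing before showing a call button.
interface EnvFlags {
  hasMediaDevices: boolean;  // in the browser: !!navigator.mediaDevices
  isSecureContext: boolean;  // in the browser: window.isSecureContext (true on HTTPS and localhost)
}

function listMissingRequirements(env: EnvFlags): string[] {
  const missing: string[] = [];
  if (!env.hasMediaDevices) missing.push('microphone API (navigator.mediaDevices)');
  if (!env.isSecureContext) missing.push('secure context (HTTPS or localhost)');
  return missing;
}
```

In the browser you would call it with the real flags and hide or disable the call UI when the returned array is non-empty.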

Getting your publicChatbotId

In the Babelbeez Dashboard, open your Voice Agent and go to Settings → Embed. Copy the Public Chatbot ID – you’ll pass this into the SDK.


Quick Start: custom start/stop button

import { BabelbeezClient } from '@babelbeez/sdk';

// 1. Create the client
const client = new BabelbeezClient({
  publicChatbotId: 'YOUR_PUBLIC_CHATBOT_ID',
});

let currentState:
  | 'idle'
  | 'loading'
  | 'active'
  | 'speaking'
  | 'finalizing'
  | 'error'
  | 'rag-retrieval' = 'idle';

// 2. Listen for state changes to drive your UI
client.on('buttonState', (state) => {
  // state: 'idle' | 'loading' | 'active' | 'speaking' | 'finalizing' | 'error' | 'rag-retrieval'
  console.log('Agent state:', state);
  currentState = state;
  updateMyButton(state); // implement this in your own UI
});

// 3. Listen for live transcripts (user + agent)
client.on('transcript', ({ role, text, isFinal }) => {
  // role: 'user' | 'agent'
  console.log(`${role}:`, text, isFinal ? '(final)' : '(partial)');
  appendMessageToChat(role, text, isFinal); // your own renderer
});

// 4. Wire up your button to connect / disconnect
const startStopButton = document.getElementById('voice-button')!;

startStopButton.addEventListener('click', async () => {
  if (currentState === 'active' || currentState === 'speaking' || currentState === 'rag-retrieval') {
    // Immediate explicit stop for UI end-call actions
    await client.hardDisconnect('user_button_click');
  } else if (currentState === 'idle' || currentState === 'error') {
    try {
      await client.connect(); // Browser will request microphone access
    } catch (err) {
      console.error('Failed to connect:', err);
    }
  }
});

You’re responsible for implementing updateMyButton and appendMessageToChat in your own DOM or framework components.
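For reference, one possible shape for the state-to-label part of updateMyButton. The labels are illustrative choices of ours, not SDK requirements:

```typescript
type ButtonState =
  | 'idle' | 'loading' | 'active' | 'speaking'
  | 'finalizing' | 'error' | 'rag-retrieval';

// Map each agent state to a button label (wording is our own).
function buttonLabelFor(state: ButtonState): string {
  switch (state) {
    case 'idle': return 'Start call';
    case 'loading': return 'Connecting…';
    case 'active':
    case 'speaking': return 'End call';
    case 'rag-retrieval': return 'Thinking…';
    case 'finalizing': return 'Ending…';
    case 'error': return 'Retry';
  }
}
```

updateMyButton can then set textContent on your button element from this map, and for example disable the button while state is 'loading' or 'finalizing'.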


Example: building a transcript view

The SDK emits streaming transcripts for both the user and the agent, including partial and final messages. You can use this to build a chat‑like view.

const transcriptContainer = document.getElementById('messages');
let currentLine: HTMLDivElement | null = null;

client.on('transcript', ({ role, text, isFinal }) => {
  // role is 'user' or 'agent'

  // 1. Start a new line when the role changes or the previous line was final
  if (!currentLine || currentLine.dataset.role !== role || currentLine.dataset.final === 'true') {
    currentLine = document.createElement('div');
    currentLine.dataset.role = role;
    currentLine.className = role === 'user' ? 'message-user' : 'message-agent';
    transcriptContainer!.appendChild(currentLine);
  }

  // 2. Update the text content with the latest transcript
  currentLine.textContent = text;

  // 3. Mark final utterances
  currentLine.dataset.final = String(isFinal);

  // 4. Auto-scroll
  transcriptContainer!.scrollTop = transcriptContainer!.scrollHeight;
});

Style .message-user and .message-agent in CSS to match your design system.


Example: hybrid text + voice

You can send text input into the same live voice session. The agent will respond via audio, and you’ll still get transcript events.

// e.g. on form submit
async function handleTextSubmit(message: string) {
  try {
    await client.sendUserText(message);
  } catch (err) {
    console.error('Failed to send text message:', err);
  }
}

Note: The voice session must be active (after connect()) for sendUserText to take effect.


Example: handling human handoff

If your agent is configured for human handoff, the SDK will emit events so you can present your own email/WhatsApp UI.

client.on('handoff:show', ({ summaryText, waLink }) => {
  // summaryText: short description of the conversation / request
  // waLink: WhatsApp deeplink if configured, otherwise null
  openHandoffModal({ summaryText, waLink });
});

client.on('handoff:hide', ({ outcome }) => {
  // outcome: 'email_submitted' | 'whatsapp_submitted' | 'cancelled'
  closeHandoffModal(outcome);
});

// When the user submits your handoff form
async function submitHandoff(email: string, consent: boolean) {
  await client.handleHandoffSubmit({ email, consent });
}

// When the user cancels or chooses WhatsApp instead
async function cancelHandoff(viaWhatsapp: boolean) {
  await client.handleHandoffCancel({ viaWhatsapp });
}

The agent behavior, wording, and when handoff is triggered are all configured in the Babelbeez Dashboard.


API Reference

new BabelbeezClient(config)

Create a new client instance.

import { BabelbeezClient } from '@babelbeez/sdk';

const client = new BabelbeezClient({
  publicChatbotId: 'YOUR_PUBLIC_CHATBOT_ID',
});

Config

interface BabelbeezClientConfig {
  publicChatbotId: string;
  proxyInitializeUrl?: string;
  apiBaseUrl?: string;
}

  • publicChatbotId (string, required) – The public ID of the Voice Agent from the Babelbeez Dashboard.
  • proxyInitializeUrl (string, optional) – Override the Babelbeez /initialize-chat endpoint, useful for local development or internal/self-hosted routing.
  • apiBaseUrl (string, optional) – Override the base URL used for related SDK API calls such as RAG search and human handoff submission. Defaults to the origin derived from proxyInitializeUrl.
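As a sketch of the documented fallback, apiBaseUrl can be thought of as resolving like this. The helper is illustrative only, not an exported SDK function:

```typescript
// Illustrative only: mirrors the documented default, where apiBaseUrl falls
// back to the origin of proxyInitializeUrl when not set explicitly.
function resolveApiBaseUrl(config: {
  proxyInitializeUrl?: string;
  apiBaseUrl?: string;
}): string | undefined {
  if (config.apiBaseUrl) return config.apiBaseUrl;
  if (config.proxyInitializeUrl) return new URL(config.proxyInitializeUrl).origin;
  return undefined; // neither override set: the SDK uses its built-in defaults
}
```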

Methods

connect(): Promise<void>

Initializes the session via Babelbeez, requests microphone access from the user, and opens a realtime connection to the OpenAI Realtime API.

  • Emits buttonState: 'loading' while connecting.
  • If the Voice Agent has MCP plug-ins configured, Babelbeez encodes the hosted MCP tool definitions during backend session initialization.
  • The /initialize-chat response may also include browser-safe MCP metadata for SDK awareness/diagnostics, but the SDK does not treat that metadata as the source of truth for browser-side MCP tool registration.
  • During the live session, the SDK observes MCP lifecycle events and preserves backend-minted MCP tool state when the runtime performs browser-originated session updates.
  • On success, emits buttonState: 'active' and session:start.
  • On failure (e.g. mic denied), emits an error event and buttonState: 'error'.

await client.connect();

disconnect(reason?: string): Promise<void>

Ends the current session and sends a final usage + transcript summary to Babelbeez. This method remains available as a compatibility wrapper; for new code, prefer the explicit hardDisconnect() or gracefulDisconnect() variants below.

await client.disconnect();
// or
await client.disconnect('user_button_click');

  • reason (optional) – String reason used for analytics and backend handling.
    • For explicit UI end-call actions, prefer hardDisconnect('user_button_click').

hardDisconnect(reason?: string): Promise<void>

Immediately ends the current session without waiting for assistant audio to finish.

await client.hardDisconnect('user_button_click');

Use this for explicit end-call buttons where the user expects an immediate stop.


gracefulDisconnect(reason?: string): Promise<void>

Triggers the configured AI goodbye turn and then ends the session after that goodbye audio finishes.

await client.gracefulDisconnect('ai_end_conversation_tool');

This is primarily intended for SDK-managed end-of-call flows.


initializeAudio(): void

Optional helper to unlock the browser AudioContext in response to a user gesture (click/tap), which can help avoid autoplay restrictions in some environments.

// e.g. on a user click before connecting
client.initializeAudio();
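A common pattern is to call initializeAudio() synchronously inside the same click handler that starts the call, so the AudioContext is unlocked while the user gesture is still active. A sketch against a structural type, so it is not tied to the concrete client class:

```typescript
// Minimal structural view of the client, for illustration.
interface VoiceClientLike {
  initializeAudio(): void;
  connect(): Promise<void>;
}

// Unlock audio first (synchronously, inside the gesture), then connect.
// Returns false instead of throwing, e.g. when microphone access is denied.
async function startCall(client: VoiceClientLike): Promise<boolean> {
  client.initializeAudio();
  try {
    await client.connect();
    return true;
  } catch {
    return false;
  }
}
```

You would invoke startCall(client) from your button's click listener.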

sendUserText(text: string): Promise<void>

Sends a user text message into the active voice session – useful for hybrid chat + voice interfaces.

await client.sendUserText('Hello, do you have pricing for teams?');

  • If the agent is currently speaking, the SDK will attempt to interrupt the response before sending the new message.

handleHandoffSubmit(payload): Promise<void>

Notify Babelbeez when the user submits your human handoff form.

await client.handleHandoffSubmit({
  email: '[email protected]',
  consent: true,
});

  • email (string) – The user’s email address.
  • consent (boolean) – Whether the user consented to be contacted.

handleHandoffCancel(options?): Promise<void>

Notify Babelbeez when the user cancels the handoff form or switches to WhatsApp.

await client.handleHandoffCancel({ viaWhatsapp: true });

  • viaWhatsapp (boolean, optional) – Pass true if the user opted to continue via WhatsApp (using the provided waLink). In that case, the SDK will end the voice session after a goodbye.

Events

The client extends a simple EventEmitter interface. Subscribe with client.on(event, listener) and unsubscribe with client.off(event, listener).

Core events

client.on('buttonState', (state) => { /* ... */ });
client.on('transcript', (event) => { /* ... */ });
client.on('error', (event) => { /* ... */ });
client.on('session:start', (event) => { /* ... */ });
client.on('session:end', (event) => { /* ... */ });
client.on('handoff:show', (event) => { /* ... */ });
client.on('handoff:hide', (event) => { /* ... */ });

buttonState

export type BabelbeezButtonState =
  | 'idle'
  | 'loading'
  | 'active'
  | 'speaking'
  | 'finalizing'
  | 'error'
  | 'rag-retrieval';

Use this to drive your call control UI (start/stop button, spinners, etc.).

transcript

export interface BabelbeezTranscriptEvent {
  role: 'user' | 'agent';
  text: string;
  isFinal: boolean;
}

  • Multiple events are emitted per utterance.
  • isFinal: true marks the end of a user or agent turn.
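Because several partial events arrive per utterance, consumers typically keep only the latest partial text and append a message on isFinal. A small sketch of that fold; the helper is our own, not part of the SDK:

```typescript
interface TranscriptEvent {
  role: 'user' | 'agent';
  text: string;
  isFinal: boolean;
}

// Collapse a stream of transcript events into finished turns plus the
// current in-progress (partial) line, if any.
function reduceTranscript(events: TranscriptEvent[]): {
  finals: string[];
  partial: string | null;
} {
  const finals: string[] = [];
  let partial: string | null = null;
  for (const e of events) {
    if (e.isFinal) {
      finals.push(`${e.role}: ${e.text}`);
      partial = null;
    } else {
      partial = `${e.role}: ${e.text}`; // later partials supersede earlier ones
    }
  }
  return { finals, partial };
}
```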

session:start

export interface BabelbeezSessionStartEvent {
  chatbotId: string;
  config: unknown; // snapshot of chatbot configuration
}

Fired when the WebRTC session is fully established and active.

session:end

export interface BabelbeezSessionEndEvent {
  reason: string;
}

Fired when the session terminates (user disconnect, timeout, error, agent‑initiated close, etc.).

error

export type BabelbeezErrorSeverity = 'info' | 'warning' | 'error';

export interface BabelbeezErrorEvent {
  code: string;
  message: string;
  severity: BabelbeezErrorSeverity;
  fatal?: boolean;
}

  • When fatal === true, the session has been terminated.
  • Use severity to decide how aggressively to update your UI or prompt the user.
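One way to act on these fields is to map severity and fatal to a UI treatment. The routing below is our own convention, not prescribed by the SDK:

```typescript
type BabelbeezErrorSeverity = 'info' | 'warning' | 'error';

interface BabelbeezErrorEvent {
  code: string;
  message: string;
  severity: BabelbeezErrorSeverity;
  fatal?: boolean;
}

// Fatal errors mean the session is gone: reset the call UI. Otherwise show
// a banner for warnings/errors and a quiet toast for info.
function uiActionFor(e: BabelbeezErrorEvent): 'reset-call-ui' | 'banner' | 'toast' {
  if (e.fatal) return 'reset-call-ui';
  return e.severity === 'info' ? 'toast' : 'banner';
}
```

You would hook this up inside your client.on('error', ...) listener.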

handoff:show

export interface BabelbeezHandoffShowEvent {
  summaryText: string;
  waLink: string | null; // WhatsApp deeplink if configured
}

Fired when the AI decides a human is needed. Use this to show your own form/modal.

handoff:hide

export type BabelbeezHandoffHideOutcome =
  | 'email_submitted'
  | 'whatsapp_submitted'
  | 'cancelled';

export interface BabelbeezHandoffHideEvent {
  outcome: BabelbeezHandoffHideOutcome;
}

Fired when the handoff flow is completed or cancelled.
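An exhaustive switch over the outcome keeps your UI in sync, and TypeScript will flag any variant you forget to handle. The messages here are illustrative wording of our own:

```typescript
type BabelbeezHandoffHideOutcome =
  | 'email_submitted'
  | 'whatsapp_submitted'
  | 'cancelled';

// Map each outcome to a confirmation message (wording is our own).
function handoffMessageFor(outcome: BabelbeezHandoffHideOutcome): string {
  switch (outcome) {
    case 'email_submitted':
      return 'Thanks! We will follow up by email.';
    case 'whatsapp_submitted':
      return 'Continuing the conversation on WhatsApp.';
    case 'cancelled':
      return 'Handoff dismissed.';
  }
}
```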


Usage notes

  • The SDK is browser‑first and assumes access to navigator.mediaDevices for microphone input.
  • Always provide clear UX affordances (e.g. "Start call" / "End call") that map to connect() and disconnect().
  • For best results, prompt the user before accessing the microphone and explain what the agent will do.
  • When a Voice Agent is configured with a greeting, the SDK triggers this greeting immediately after connect() completes using an out-of-band Realtime response.create turn.
  • When MCP plug-ins are configured for the Voice Agent, hosted MCP tool definitions are prepared by Babelbeez during backend session initialization and encoded into the Realtime session setup.
  • Browser-safe MCP descriptors returned by /initialize-chat are intended for SDK awareness/diagnostics and future runtime alignment, not as the browser-side source of truth for hosted MCP tool registration.
  • During remote MCP calls, the SDK reuses the rag-retrieval button state so hosts can present a consistent thinking/loading UI for both RAG and hosted MCP activity.

Further reading

For a full walkthrough of building a custom button and UI, see the guide:

Headless embed: use your own chat button
https://www.babelbeez.com/resources/help/for-developers/headless-embed-custom-button.html


License

MIT


Changelog

See CHANGELOG.md for release notes.