
@babelbeez/sdk v0.4.1

Babelbeez Headless SDK (@babelbeez/sdk)

Build your own UI for the Babelbeez AI voice agent – custom buttons, call controls, and chat views – while we handle realtime audio, OpenAI Realtime, and connection lifecycle.

The Babelbeez Headless SDK gives you low‑level, event‑driven control of a Babelbeez Voice Agent from the browser.

Note: Using this SDK requires a Babelbeez account and a configured Voice Agent. Sign up at https://www.babelbeez.com.

Use this SDK when you want to:

  • Replace the default widget with your own button or call UI
  • Show live transcripts in your app
  • Combine voice + text input in a single experience
  • Orchestrate human handoffs (email / WhatsApp) from your own components
  • Let Babelbeez attach remote MCP tools to the live session during backend session initialization when your Voice Agent is configured with MCP plug-ins

If you just want a drop‑in chat button, use the standard Babelbeez embed instead. This SDK is for developers who want full control over the UX.


Installation

npm install @babelbeez/sdk

Requirements

  • A Babelbeez account and at least one configured Voice Agent
  • Modern browser with WebRTC and microphone support
  • Page served over HTTPS (or localhost) so the browser will allow mic access
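The browser-side requirements above can be checked before rendering a call button at all. A minimal sketch; the helper and its names are our own, not part of the SDK:

```typescript
// Hypothetical pre-flight helper (not part of @babelbeez/sdk): report which
// environment requirements are missing before showing a call button.
interface EnvFlags {
  hasMediaDevices: boolean;  // in the browser: !!navigator.mediaDevices
  isSecureContext: boolean;  // in the browser: window.isSecureContext (true on HTTPS and localhost)
}

function listMissingRequirements(env: EnvFlags): string[] {
  const missing: string[] = [];
  if (!env.hasMediaDevices) missing.push('microphone API (navigator.mediaDevices)');
  if (!env.isSecureContext) missing.push('secure context (HTTPS or localhost)');
  return missing;
}
```

In the browser you would call it with the real flags and hide or disable the call UI when the returned array is non-empty.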

Getting your publicChatbotId

In the Babelbeez Dashboard, open your Voice Agent and go to Settings → Embed. Copy the Public Chatbot ID – you’ll pass this into the SDK.


Quick Start: custom start/stop button

import { BabelbeezClient } from '@babelbeez/sdk';

// 1. Create the client
const client = new BabelbeezClient({
  publicChatbotId: 'YOUR_PUBLIC_CHATBOT_ID',
});

let currentState:
  | 'idle'
  | 'loading'
  | 'active'
  | 'speaking'
  | 'finalizing'
  | 'error'
  | 'rag-retrieval' = 'idle';

// 2. Listen for state changes to drive your UI
client.on('buttonState', (state) => {
  // state: 'idle' | 'loading' | 'active' | 'speaking' | 'finalizing' | 'error' | 'rag-retrieval'
  console.log('Agent state:', state);
  currentState = state;
  updateMyButton(state); // implement this in your own UI
});

// 3. Listen for live transcripts (user + agent)
client.on('transcript', ({ role, text, isFinal }) => {
  // role: 'user' | 'agent'
  console.log(`${role}:`, text, isFinal ? '(final)' : '(partial)');
  appendMessageToChat(role, text, isFinal); // your own renderer
});

// 4. Wire up your button to connect / disconnect
const startStopButton = document.getElementById('voice-button')!;

startStopButton.addEventListener('click', async () => {
  if (currentState === 'active' || currentState === 'speaking' || currentState === 'rag-retrieval') {
    // Immediate explicit stop for UI end-call actions
    await client.hardDisconnect('user_button_click');
  } else if (currentState === 'idle' || currentState === 'error') {
    try {
      await client.connect(); // Browser will request microphone access
    } catch (err) {
      console.error('Failed to connect:', err);
    }
  }
});

You’re responsible for implementing updateMyButton and appendMessageToChat in your own DOM or framework components.
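For reference, one possible shape for the state-to-label part of updateMyButton. The labels are illustrative choices of ours, not SDK requirements:

```typescript
type ButtonState =
  | 'idle' | 'loading' | 'active' | 'speaking'
  | 'finalizing' | 'error' | 'rag-retrieval';

// Map each agent state to a button label (wording is our own).
function buttonLabelFor(state: ButtonState): string {
  switch (state) {
    case 'idle': return 'Start call';
    case 'loading': return 'Connecting…';
    case 'active':
    case 'speaking': return 'End call';
    case 'rag-retrieval': return 'Thinking…';
    case 'finalizing': return 'Ending…';
    case 'error': return 'Retry';
  }
}
```

updateMyButton can then set textContent on your button element from this map, and for example disable the button while state is 'loading' or 'finalizing'.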


Example: building a transcript view

The SDK emits streaming transcripts for both the user and the agent, including partial and final messages. You can use this to build a chat‑like view.

const transcriptContainer = document.getElementById('messages');
let currentLine: HTMLDivElement | null = null;

client.on('transcript', ({ role, text, isFinal }) => {
  // role is 'user' or 'agent'

  // 1. Start a new line when the role changes or the previous line was final
  if (!currentLine || currentLine.dataset.role !== role || currentLine.dataset.final === 'true') {
    currentLine = document.createElement('div');
    currentLine.dataset.role = role;
    currentLine.className = role === 'user' ? 'message-user' : 'message-agent';
    transcriptContainer!.appendChild(currentLine);
  }

  // 2. Update the text content with the latest transcript
  currentLine.textContent = text;

  // 3. Mark final utterances
  currentLine.dataset.final = String(isFinal);

  // 4. Auto-scroll
  transcriptContainer!.scrollTop = transcriptContainer!.scrollHeight;
});

Style .message-user and .message-agent in CSS to match your design system.


Example: hybrid text + voice

You can send text input into the same live voice session. The agent will respond via audio, and you’ll still get transcript events.

// e.g. on form submit
async function handleTextSubmit(message: string) {
  try {
    await client.sendUserText(message);
  } catch (err) {
    console.error('Failed to send text message:', err);
  }
}

Note: The voice session must be active (after connect()) for sendUserText to take effect.


Example: handling human handoff

If your agent is configured for human handoff, the SDK will emit events so you can present your own email/WhatsApp UI.

client.on('handoff:show', ({ summaryText, waLink }) => {
  // summaryText: short description of the conversation / request
  // waLink: WhatsApp deeplink if configured, otherwise null
  openHandoffModal({ summaryText, waLink });
});

client.on('handoff:hide', ({ outcome }) => {
  // outcome: 'email_submitted' | 'whatsapp_submitted' | 'cancelled'
  closeHandoffModal(outcome);
});

// When the user submits your handoff form
async function submitHandoff(email: string, consent: boolean) {
  await client.handleHandoffSubmit({ email, consent });
}

// When the user cancels or chooses WhatsApp instead
async function cancelHandoff(viaWhatsapp: boolean) {
  await client.handleHandoffCancel({ viaWhatsapp });
}

The agent behavior, wording, and when handoff is triggered are all configured in the Babelbeez Dashboard.


API Reference

new BabelbeezClient(config)

Create a new client instance.

import { BabelbeezClient } from '@babelbeez/sdk';

const client = new BabelbeezClient({
  publicChatbotId: 'YOUR_PUBLIC_CHATBOT_ID',
});

Config

interface BabelbeezClientConfig {
  publicChatbotId: string;
  proxyInitializeUrl?: string;
  apiBaseUrl?: string;
}

  • publicChatbotId (string, required) – The public ID of the Voice Agent from the Babelbeez Dashboard.
  • proxyInitializeUrl (string, optional) – Override the Babelbeez /initialize-chat endpoint, useful for local development or internal/self-hosted routing.
  • apiBaseUrl (string, optional) – Override the base URL used for related SDK API calls such as RAG search and human handoff submission. Defaults to the origin derived from proxyInitializeUrl.
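As a sketch of the documented fallback, apiBaseUrl can be thought of as resolving like this. The helper is illustrative only, not an exported SDK function:

```typescript
// Illustrative only: mirrors the documented default, where apiBaseUrl falls
// back to the origin of proxyInitializeUrl when not set explicitly.
function resolveApiBaseUrl(config: {
  proxyInitializeUrl?: string;
  apiBaseUrl?: string;
}): string | undefined {
  if (config.apiBaseUrl) return config.apiBaseUrl;
  if (config.proxyInitializeUrl) return new URL(config.proxyInitializeUrl).origin;
  return undefined; // neither override set: the SDK uses its built-in defaults
}
```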

Methods

connect(): Promise<void>

Initializes the session via Babelbeez, requests microphone access from the user, and opens a realtime connection to the OpenAI Realtime API.

  • Emits buttonState: 'loading' while connecting.
  • If the Voice Agent has MCP plug-ins configured, Babelbeez encodes the hosted MCP tool definitions during backend session initialization.
  • The /initialize-chat response may also include browser-safe MCP metadata for SDK awareness/diagnostics, but the SDK does not treat that metadata as the source of truth for browser-side MCP tool registration.
  • During the live session, the SDK observes MCP lifecycle events and preserves backend-minted MCP tool state when the runtime performs browser-originated session updates.
  • On success, emits buttonState: 'active' and session:start.
  • On failure (e.g. mic denied), emits an error event and buttonState: 'error'.

await client.connect();

disconnect(reason?: string): Promise<void>

Ends the current session and sends a final usage + transcript summary to Babelbeez. This method remains available as a compatibility wrapper; for new code, prefer the explicit hardDisconnect() or gracefulDisconnect() variants below.

await client.disconnect();
// or
await client.disconnect('user_button_click');

  • reason (optional) – String reason used for analytics and backend handling.
    • For explicit UI end-call actions, prefer hardDisconnect('user_button_click').

hardDisconnect(reason?: string): Promise<void>

Immediately ends the current session without waiting for assistant audio to finish.

await client.hardDisconnect('user_button_click');

Use this for explicit end-call buttons where the user expects an immediate stop.


gracefulDisconnect(reason?: string): Promise<void>

Triggers the configured AI goodbye turn and then ends the session after that goodbye audio finishes.

await client.gracefulDisconnect('ai_end_conversation_tool');

This is primarily intended for SDK-managed end-of-call flows.


initializeAudio(): void

Optional helper to unlock the browser AudioContext in response to a user gesture (click/tap), which can help avoid autoplay restrictions in some environments.

// e.g. on a user click before connecting
client.initializeAudio();
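A common pattern is to call initializeAudio() synchronously inside the same click handler that starts the call, so the AudioContext is unlocked while the user gesture is still active. A sketch against a structural type, so it is not tied to the concrete client class:

```typescript
// Minimal structural view of the client, for illustration.
interface VoiceClientLike {
  initializeAudio(): void;
  connect(): Promise<void>;
}

// Unlock audio first (synchronously, inside the gesture), then connect.
// Returns false instead of throwing, e.g. when microphone access is denied.
async function startCall(client: VoiceClientLike): Promise<boolean> {
  client.initializeAudio();
  try {
    await client.connect();
    return true;
  } catch {
    return false;
  }
}
```

You would invoke startCall(client) from your button's click listener.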

sendUserText(text: string): Promise<void>

Sends a user text message into the active voice session – useful for hybrid chat + voice interfaces.

await client.sendUserText('Hello, do you have pricing for teams?');

  • If the agent is currently speaking, the SDK will attempt to interrupt the response before sending the new message.

handleHandoffSubmit(payload): Promise<void>

Notify Babelbeez when the user submits your human handoff form.

await client.handleHandoffSubmit({
  email: '[email protected]',
  consent: true,
});

  • email (string) – The user’s email address.
  • consent (boolean) – Whether the user consented to be contacted.

handleHandoffCancel(options?): Promise<void>

Notify Babelbeez when the user cancels the handoff form or switches to WhatsApp.

await client.handleHandoffCancel({ viaWhatsapp: true });

  • viaWhatsapp (boolean, optional) – Pass true if the user opted to continue via WhatsApp (using the provided waLink). In that case, the SDK will end the voice session after a goodbye.

Events

The client extends a simple EventEmitter interface. Subscribe with client.on(event, listener) and unsubscribe with client.off(event, listener).

Core events

client.on('buttonState', (state) => { /* ... */ });
client.on('transcript', (event) => { /* ... */ });
client.on('error', (event) => { /* ... */ });
client.on('session:start', (event) => { /* ... */ });
client.on('session:end', (event) => { /* ... */ });
client.on('handoff:show', (event) => { /* ... */ });
client.on('handoff:hide', (event) => { /* ... */ });

buttonState

export type BabelbeezButtonState =
  | 'idle'
  | 'loading'
  | 'active'
  | 'speaking'
  | 'finalizing'
  | 'error'
  | 'rag-retrieval';

Use this to drive your call control UI (start/stop button, spinners, etc.).

transcript

export interface BabelbeezTranscriptEvent {
  role: 'user' | 'agent';
  text: string;
  isFinal: boolean;
}

  • Multiple events are emitted per utterance.
  • isFinal: true marks the end of a user or agent turn.
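Because several partial events arrive per utterance, consumers typically keep only the latest partial text and append a message on isFinal. A small sketch of that fold; the helper is our own, not part of the SDK:

```typescript
interface TranscriptEvent {
  role: 'user' | 'agent';
  text: string;
  isFinal: boolean;
}

// Collapse a stream of transcript events into finished turns plus the
// current in-progress (partial) line, if any.
function reduceTranscript(events: TranscriptEvent[]): {
  finals: string[];
  partial: string | null;
} {
  const finals: string[] = [];
  let partial: string | null = null;
  for (const e of events) {
    if (e.isFinal) {
      finals.push(`${e.role}: ${e.text}`);
      partial = null;
    } else {
      partial = `${e.role}: ${e.text}`; // later partials supersede earlier ones
    }
  }
  return { finals, partial };
}
```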

session:start

export interface BabelbeezSessionStartEvent {
  chatbotId: string;
  config: unknown; // snapshot of chatbot configuration
}

Fired when the WebRTC session is fully established and active.

session:end

export interface BabelbeezSessionEndEvent {
  reason: string;
}

Fired when the session terminates (user disconnect, timeout, error, agent‑initiated close, etc.).

error

export type BabelbeezErrorSeverity = 'info' | 'warning' | 'error';

export interface BabelbeezErrorEvent {
  code: string;
  message: string;
  severity: BabelbeezErrorSeverity;
  fatal?: boolean;
}

  • When fatal === true, the session has been terminated.
  • Use severity to decide how aggressively to update your UI or prompt the user.
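One way to act on these fields is to map severity and fatal to a UI treatment. The routing below is our own convention, not prescribed by the SDK:

```typescript
type BabelbeezErrorSeverity = 'info' | 'warning' | 'error';

interface BabelbeezErrorEvent {
  code: string;
  message: string;
  severity: BabelbeezErrorSeverity;
  fatal?: boolean;
}

// Fatal errors mean the session is gone: reset the call UI. Otherwise show
// a banner for warnings/errors and a quiet toast for info.
function uiActionFor(e: BabelbeezErrorEvent): 'reset-call-ui' | 'banner' | 'toast' {
  if (e.fatal) return 'reset-call-ui';
  return e.severity === 'info' ? 'toast' : 'banner';
}
```

You would hook this up inside your client.on('error', ...) listener.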

handoff:show

export interface BabelbeezHandoffShowEvent {
  summaryText: string;
  waLink: string | null; // WhatsApp deeplink if configured
}

Fired when the AI decides a human is needed. Use this to show your own form/modal.

handoff:hide

export type BabelbeezHandoffHideOutcome =
  | 'email_submitted'
  | 'whatsapp_submitted'
  | 'cancelled';

export interface BabelbeezHandoffHideEvent {
  outcome: BabelbeezHandoffHideOutcome;
}

Fired when the handoff flow is completed or cancelled.
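An exhaustive switch over the outcome keeps your UI in sync, and TypeScript will flag any variant you forget to handle. The messages here are illustrative wording of our own:

```typescript
type BabelbeezHandoffHideOutcome =
  | 'email_submitted'
  | 'whatsapp_submitted'
  | 'cancelled';

// Map each outcome to a confirmation message (wording is our own).
function handoffMessageFor(outcome: BabelbeezHandoffHideOutcome): string {
  switch (outcome) {
    case 'email_submitted':
      return 'Thanks! We will follow up by email.';
    case 'whatsapp_submitted':
      return 'Continuing the conversation on WhatsApp.';
    case 'cancelled':
      return 'Handoff dismissed.';
  }
}
```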


Usage notes

  • The SDK is browser‑first and assumes access to navigator.mediaDevices for microphone input.
  • Always provide clear UX affordances (e.g. "Start call" / "End call") that map to connect() and disconnect().
  • For best results, prompt the user before accessing the microphone and explain what the agent will do.
  • When a Voice Agent is configured with a greeting, the SDK triggers this greeting immediately after connect() completes using an out-of-band Realtime response.create turn.
  • When MCP plug-ins are configured for the Voice Agent, hosted MCP tool definitions are prepared by Babelbeez during backend session initialization and encoded into the Realtime session setup.
  • Browser-safe MCP descriptors returned by /initialize-chat are intended for SDK awareness/diagnostics and future runtime alignment, not as the browser-side source of truth for hosted MCP tool registration.
  • During remote MCP calls, the SDK reuses the rag-retrieval button state so hosts can present a consistent thinking/loading UI for both RAG and hosted MCP activity.

Further reading

For a full walkthrough of building a custom button and UI, see the guide:

Headless embed: use your own chat button
https://www.babelbeez.com/resources/help/for-developers/headless-embed-custom-button.html


License

MIT


Changelog

See CHANGELOG.md for release notes.