Babelbeez Headless SDK (@babelbeez/sdk)
Build your own UI for the Babelbeez AI voice agent – custom buttons, call controls, and chat views – while we handle realtime audio, OpenAI Realtime, and connection lifecycle.
The Babelbeez Headless SDK gives you low‑level, event‑driven control of a Babelbeez Voice Agent from the browser.
Note: Using this SDK requires a Babelbeez account and a configured Voice Agent. Sign up at https://www.babelbeez.com.
Use this SDK when you want to:
- Replace the default widget with your own button or call UI
- Show live transcripts in your app
- Combine voice + text input in a single experience
- Orchestrate human handoffs (email / WhatsApp) from your own components
If you just want a drop‑in chat button, use the standard Babelbeez embed instead. This SDK is for developers who want full control over the UX.
Installation
npm install @babelbeez/sdk

Requirements
- A Babelbeez account and at least one configured Voice Agent
- Modern browser with WebRTC and microphone support
- Page served over HTTPS (or localhost) so the browser will allow mic access
Getting your publicChatbotId
In the Babelbeez Dashboard, open your Voice Agent and go to Settings → Embed. Copy the Public Chatbot ID – you’ll pass this into the SDK.
Quick Start: custom start/stop button
import { BabelbeezClient } from '@babelbeez/sdk';

// 1. Create the client
const client = new BabelbeezClient({
  publicChatbotId: 'YOUR_PUBLIC_CHATBOT_ID',
});

let currentState: 'idle' | 'loading' | 'active' | 'speaking' | 'error' | 'rag-retrieval' = 'idle';

// 2. Listen for state changes to drive your UI
client.on('buttonState', (state) => {
  // state: 'idle' | 'loading' | 'active' | 'speaking' | 'error' | 'rag-retrieval'
  console.log('Agent state:', state);
  currentState = state;
  updateMyButton(state); // implement this in your own UI
});

// 3. Listen for live transcripts (user + agent)
client.on('transcript', ({ role, text, isFinal }) => {
  // role: 'user' | 'agent'
  console.log(`${role}:`, text, isFinal ? '(final)' : '(partial)');
  appendMessageToChat(role, text, isFinal); // your own renderer
});

// 4. Wire up your button to connect / disconnect
const startStopButton = document.getElementById('voice-button')!;
startStopButton.addEventListener('click', async () => {
  if (currentState === 'active' || currentState === 'speaking' || currentState === 'rag-retrieval') {
    // Play nice goodbye UX (triggers configured farewell in Babelbeez)
    await client.disconnect('user_button_click');
  } else if (currentState === 'idle' || currentState === 'error') {
    try {
      await client.connect(); // Browser will request microphone access
    } catch (err) {
      console.error('Failed to connect:', err);
    }
  }
});

You’re responsible for implementing updateMyButton and appendMessageToChat in your own DOM or framework components.
Example: building a transcript view
The SDK emits streaming transcripts for both the user and the agent, including partial and final messages. You can use this to build a chat‑like view.
const transcriptContainer = document.getElementById('messages');
let currentLine: HTMLDivElement | null = null;

client.on('transcript', ({ role, text, isFinal }) => {
  // role is 'user' or 'agent'
  // 1. Start a new line when the role changes or the previous line was final
  if (!currentLine || currentLine.dataset.role !== role || currentLine.dataset.final === 'true') {
    currentLine = document.createElement('div');
    currentLine.dataset.role = role;
    currentLine.className = role === 'user' ? 'message-user' : 'message-agent';
    transcriptContainer!.appendChild(currentLine);
  }

  // 2. Update the text content with the latest transcript
  currentLine.textContent = text;

  // 3. Mark final utterances
  currentLine.dataset.final = String(isFinal);

  // 4. Auto-scroll
  transcriptContainer!.scrollTop = transcriptContainer!.scrollHeight;
});

Style .message-user and .message-agent in CSS to match your design system.
Example: hybrid text + voice
You can send text input into the same live voice session. The agent will respond via audio, and you’ll still get transcript events.
// e.g. on form submit
async function handleTextSubmit(message: string) {
  try {
    await client.sendUserText(message);
  } catch (err) {
    console.error('Failed to send text message:', err);
  }
}

Note: The voice session must be active (after connect()) for sendUserText to take effect.
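If you track agent state as in the Quick Start, you can also guard against sending before the session is live. A minimal sketch, reusing the currentState variable from that example:

// Minimal sketch: only send while the session is live, reusing the
// `currentState` tracking from the Quick Start above.
async function handleTextSubmitGuarded(message: string) {
  if (currentState === 'active' || currentState === 'speaking' || currentState === 'rag-retrieval') {
    await client.sendUserText(message);
  } else {
    console.warn('Voice session is not active; call connect() first.');
  }
}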
Example: handling human handoff
If your agent is configured for human handoff, the SDK will emit events so you can present your own email/WhatsApp UI.
client.on('handoff:show', ({ summaryText, waLink }) => {
  // summaryText: short description of the conversation / request
  // waLink: WhatsApp deeplink if configured, otherwise null
  openHandoffModal({ summaryText, waLink });
});

client.on('handoff:hide', ({ outcome }) => {
  // outcome: 'email_submitted' | 'whatsapp_submitted' | 'cancelled'
  closeHandoffModal(outcome);
});

// When the user submits your handoff form
async function submitHandoff(email: string, consent: boolean) {
  await client.handleHandoffSubmit({ email, consent });
}

// When the user cancels or chooses WhatsApp instead
async function cancelHandoff(viaWhatsapp: boolean) {
  await client.handleHandoffCancel({ viaWhatsapp });
}

The agent’s behavior, wording, and handoff triggers are all configured in the Babelbeez Dashboard.
API Reference
new BabelbeezClient(config)
Create a new client instance.
import { BabelbeezClient } from '@babelbeez/sdk';
const client = new BabelbeezClient({
  publicChatbotId: 'YOUR_PUBLIC_CHATBOT_ID',
});

Config

interface BabelbeezClientConfig {
  publicChatbotId: string;
}

- publicChatbotId (string, required) – The public ID of the Voice Agent from the Babelbeez Dashboard.
Methods
connect(): Promise<void>
Initializes the session via Babelbeez, requests microphone access from the user, and opens a realtime connection to the OpenAI Realtime API.
- Emits buttonState: 'loading' while connecting.
- On success, emits buttonState: 'active' and session:start.
- On failure (e.g. mic denied), emits an error event and buttonState: 'error'.

await client.connect();

disconnect(reason?: string): Promise<void>
Gracefully ends the current session and sends a final usage + transcript summary to Babelbeez.
await client.disconnect();
// or
await client.disconnect('user_button_click');

- reason (optional) – String reason used for analytics and backend handling.
- Passing 'user_button_click' triggers the configured goodbye message before disconnecting.
initializeAudio(): void
Optional helper to unlock the browser AudioContext in response to a user gesture (click/tap), which can help avoid autoplay restrictions in some environments.
// e.g. on a user click before connecting
client.initializeAudio();
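In practice you would typically call it inside the same click handler that starts the call, right before connect(). A minimal sketch (the start-call element is illustrative, not part of the SDK):

// Minimal sketch: unlock audio and connect within one user gesture.
const startButton = document.getElementById('start-call')!; // your own element
startButton.addEventListener('click', async () => {
  client.initializeAudio(); // runs synchronously inside the gesture
  await client.connect();
});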
sendUserText(text: string): Promise<void>

Sends a user text message into the active voice session – useful for hybrid chat + voice interfaces.
await client.sendUserText('Hello, do you have pricing for teams?');

- If the agent is currently speaking, the SDK will attempt to interrupt the response before sending the new message.
handleHandoffSubmit(payload): Promise<void>
Notify Babelbeez when the user submits your human handoff form.
await client.handleHandoffSubmit({
  email: '[email protected]',
  consent: true,
});

- email (string) – The user’s email address.
- consent (boolean) – Whether the user consented to be contacted.
handleHandoffCancel(options?): Promise<void>
Notify Babelbeez when the user cancels the handoff form or switches to WhatsApp.
await client.handleHandoffCancel({ viaWhatsapp: true });

- viaWhatsapp (boolean, optional) – Pass true if the user opted to continue via WhatsApp (using the provided waLink). In that case, the SDK will end the voice session after a goodbye.
Events
The client extends a simple EventEmitter interface. Subscribe with client.on(event, listener) and unsubscribe with client.off(event, listener).
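For example, keep a reference to the listener so you can unsubscribe on teardown. A minimal sketch (the handler is illustrative):

// Minimal sketch: keep a reference to the listener so it can be
// removed later, e.g. when your component unmounts.
const onButtonState = (state: string) => console.log('state:', state);

client.on('buttonState', onButtonState);

// Later, during cleanup:
client.off('buttonState', onButtonState);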
Core events
client.on('buttonState', (state) => { /* ... */ });
client.on('transcript', (event) => { /* ... */ });
client.on('error', (event) => { /* ... */ });
client.on('session:start', (event) => { /* ... */ });
client.on('session:end', (event) => { /* ... */ });
client.on('handoff:show', (event) => { /* ... */ });
client.on('handoff:hide', (event) => { /* ... */ });

buttonState

export type BabelbeezButtonState =
  | 'idle'
  | 'loading'
  | 'active'
  | 'speaking'
  | 'error'
  | 'rag-retrieval';

Use this to drive your call control UI (start/stop button, spinners, etc.).
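For example, a minimal mapping from state to button copy. The labels are illustrative, and the import assumes BabelbeezButtonState is exported as the declaration above suggests:

import type { BabelbeezButtonState } from '@babelbeez/sdk';

// Minimal sketch: derive button copy from the agent state.
function buttonLabel(state: BabelbeezButtonState): string {
  switch (state) {
    case 'idle': return 'Start call';
    case 'loading': return 'Connecting…';
    case 'active':
    case 'speaking':
    case 'rag-retrieval': return 'End call';
    case 'error': return 'Retry';
  }
}

const voiceButton = document.getElementById('voice-button')!; // your own element
client.on('buttonState', (state) => {
  voiceButton.textContent = buttonLabel(state);
});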
transcript
export interface BabelbeezTranscriptEvent {
  role: 'user' | 'agent';
  text: string;
  isFinal: boolean;
}

- Multiple events are emitted per utterance.
- isFinal: true marks the end of a user or agent turn.
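If you only need completed turns (say, for logging), filter on isFinal. A minimal sketch:

// Minimal sketch: collect only completed turns, e.g. for logging.
const turns: Array<{ role: 'user' | 'agent'; text: string }> = [];

client.on('transcript', ({ role, text, isFinal }) => {
  if (isFinal) turns.push({ role, text });
});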
session:start
export interface BabelbeezSessionStartEvent {
  chatbotId: string;
  config: unknown; // snapshot of chatbot configuration
}

Fired when the WebRTC session is fully established and active.
session:end
export interface BabelbeezSessionEndEvent {
  reason: string;
}

Fired when the session terminates (user disconnect, timeout, error, agent‑initiated close, etc.).
error
export type BabelbeezErrorSeverity = 'info' | 'warning' | 'error';

export interface BabelbeezErrorEvent {
  code: string;
  message: string;
  severity: BabelbeezErrorSeverity;
  fatal?: boolean;
}

- When fatal === true, the session has been terminated.
- Use severity to decide how aggressively to update your UI or prompt the user.
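A minimal handler might branch on fatal and severity. The showToast and showBlockingBanner helpers are hypothetical:

// Minimal sketch; `showToast` and `showBlockingBanner` are
// hypothetical helpers in your own UI layer.
declare function showToast(message: string): void;
declare function showBlockingBanner(message: string): void;

client.on('error', ({ code, message, severity, fatal }) => {
  console.error(`[babelbeez:${code}]`, message);
  if (fatal) {
    // The session is already terminated; offer a way to reconnect.
    showBlockingBanner('The call ended unexpectedly. Click to retry.');
  } else if (severity !== 'info') {
    showToast(message);
  }
});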
handoff:show
export interface BabelbeezHandoffShowEvent {
  summaryText: string;
  waLink: string | null; // WhatsApp deeplink if configured
}

Fired when the AI decides a human is needed. Use this to show your own form/modal.
handoff:hide
export type BabelbeezHandoffHideOutcome =
  | 'email_submitted'
  | 'whatsapp_submitted'
  | 'cancelled';

export interface BabelbeezHandoffHideEvent {
  outcome: BabelbeezHandoffHideOutcome;
}

Fired when the handoff flow is completed or cancelled.
Usage notes
- The SDK is browser‑first and assumes access to navigator.mediaDevices for microphone input.
- Always provide clear UX affordances (e.g. "Start call" / "End call") that map to connect() and disconnect().
- For best results, prompt the user before accessing the microphone and explain what the agent will do (see the sketch below).
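One way to do that is to check the Permissions API before calling connect(). A minimal sketch, assuming a browser where 'microphone' is a queryable permission (Chromium-based browsers support this; others may throw, hence the fallback):

// Minimal sketch: check mic permission state before connecting.
// 'microphone' is not in every browser's PermissionName type, hence the cast.
async function startCallWithPrompt() {
  try {
    const status = await navigator.permissions.query({
      name: 'microphone' as PermissionName,
    });
    if (status.state === 'denied') {
      alert('Microphone access is blocked; please enable it in your browser settings.');
      return;
    }
  } catch {
    // Permissions API unavailable; connect() will prompt instead.
  }
  await client.connect();
}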
Further reading
For a full walkthrough of building a custom button and UI, see the guide:
Headless embed: use your own chat button
https://www.babelbeez.com/resources/help/for-developers/headless-embed-custom-button.html
License
MIT
