vocal-call-sdk

v2.0.2

Published

a month ago

A JavaScript SDK that provides a complete voice calling interface with WebSocket communication, audio recording/playback, and automatic UI management.

anshu-924

sdk voice-call websocket webrtc javascript audio vocal real-time voice-ai

VocalCallSDK

A JavaScript SDK for real-time voice calls with intelligent audio processing and WebSocket communication.

Installation

import { VocalCallSDK } from './dist/vocalcallsdk.js';

Basic Usage

const sdk = new VocalCallSDK({
    agentId: 'your-agent-uuid',           // Required: Get from vocallabs.ai
    callId: 'unique-call-id',             // Required: Get from arc.vocallabs.ai
    inactiveText: "Start Call",           // Optional: Button text when idle
    activeText: "End Call",               // Optional: Button text when active
    size: 'large',                        // Optional: 'small', 'medium', 'large'
    className: 'custom-button-class',     // Optional: Additional CSS classes
    container: '#call-button-container',  // Required for renderButton()
    config: {
        endpoints: {
            websocket: 'wss://call.vocallabs.ai/ws/'  // Optional: Custom WebSocket URL
        },
        audio: {
            userInputSampleRate: 32000,     // Optional: User microphone sample rate
            agentOutputSampleRate: 24000,   // Optional: Agent audio sample rate (24k recommended)
            echoCancellation: true,         // Optional: Enable echo cancellation
            noiseSuppression: true          // Optional: Enable noise suppression
        }
    }
});

// Render the call button in the specified container
sdk.renderButton();

Configuration Parameters

Required Parameters

agentId: Agent identifier from vocallabs.ai
callId: Unique identifier for each call session

Optional Parameters

inactiveText: Button text when idle (default: "Talk to Assistant")
activeText: Button text when recording (default: "Listening...")
size: Button size - "small", "medium", "large" (default: "medium")
className: Additional CSS classes for the button (default: "")
container: DOM container selector for button rendering (required for renderButton())

Configuration Object

The config object supports the following options:

`config.endpoints`

websocket: Custom WebSocket URL (default: "wss://call.vocallabs.ai/ws/")

`config.audio`

userInputSampleRate: Microphone sample rate (default: 32000)
agentOutputSampleRate: Agent audio sample rate - supports 48k, 24k, 16k (default: 24000)
echoCancellation: Enable echo cancellation (default: true)
noiseSuppression: Enable noise suppression (default: true)
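
To make the defaults above concrete, here is a minimal sketch of how a partial config could be merged over them. The `DEFAULT_CONFIG` constant and the `resolveConfig` helper are hypothetical names used only for illustration. The SDK's internal merge logic is not documented here and may differ.

```javascript
// Defaults copied from the option lists above.
const DEFAULT_CONFIG = {
    endpoints: { websocket: 'wss://call.vocallabs.ai/ws/' },
    audio: {
        userInputSampleRate: 32000,
        agentOutputSampleRate: 24000,
        echoCancellation: true,
        noiseSuppression: true
    }
};

// Hypothetical helper: merge a partial user config over the defaults, one group at a time.
function resolveConfig(userConfig = {}) {
    return {
        endpoints: { ...DEFAULT_CONFIG.endpoints, ...userConfig.endpoints },
        audio: { ...DEFAULT_CONFIG.audio, ...userConfig.audio }
    };
}

const resolved = resolveConfig({ audio: { agentOutputSampleRate: 16000 } });
console.log(resolved.audio.agentOutputSampleRate); // 16000
console.log(resolved.audio.userInputSampleRate);   // 32000
```

Merging per group rather than replacing the whole object means overriding one audio option leaves the other defaults in place.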

Event Handling

The SDK provides several event hooks for monitoring call status and handling errors:

// Use a named handler so it can be removed later with off()
const callStartHandler = () => {
    console.log('Call started');
};

sdk.on('onCallStart', callStartHandler)
.on('onCallEnd', (reason) => {
    console.log('Call ended:', reason);
    // Possible reasons: 'user', 'agent', 'server_initiated', 'connection_timeout', etc.
})
.on('onStatusChange', (status) => {
    console.log('Status changed:', status);
    // Status object includes: status, isRecording, isConnected, lastDisconnectReason
})
.on('onError', (error) => {
    console.error('SDK Error:', error);
});

// Remove an event listener (pass the same function reference that was given to on())
sdk.off('onCallStart', callStartHandler);

Available Events

onCallStart: Fired when a call begins and WebSocket connection is established
onCallEnd: Fired when a call ends, includes reason parameter
onStatusChange: Fired when SDK status changes (connecting, connected, error, idle)
onError: Fired when an error occurs

API Methods

Core Methods

renderButton(container?): Render the call button in the specified container
startCall(): Programmatically start a call
endCall(): Programmatically end a call (only works if currently recording)
getStatus(): Get current SDK status object
destroy(): Clean up resources and remove event listeners

Status Object

The getStatus() method returns an object with the following properties:

{
    status: 'idle' | 'connecting' | 'connected' | 'error',
    isRecording: boolean,
    isConnected: boolean,
    lastDisconnectReason: string | null
}
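
As one way to use this object, the sketch below maps it to a short label for display. The `describeStatus` helper and its label strings are hypothetical and not part of the SDK. Only the property names and status values come from the shape listed above.

```javascript
// Hypothetical helper: turn a status object (shape listed above) into a short UI label.
function describeStatus({ status, isRecording, lastDisconnectReason }) {
    switch (status) {
        case 'connecting':
            return 'Connecting...';
        case 'connected':
            return isRecording ? 'In call' : 'Connected';
        case 'error':
            return 'Something went wrong';
        case 'idle':
        default:
            // After a call, show why it ended if a reason was recorded.
            return lastDisconnectReason ? `Call ended (${lastDisconnectReason})` : 'Ready';
    }
}

console.log(describeStatus({
    status: 'connected',
    isRecording: true,
    isConnected: true,
    lastDisconnectReason: null
})); // In call
```

In an application the argument would typically come from `sdk.getStatus()` or from the `onStatusChange` callback.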

Event Management

on(event, callback): Add event listener (returns SDK instance for chaining)
off(event, callback): Remove specific event listener
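
The chaining and removal behaviour described above can be illustrated with a small stand-in emitter. `TinyEmitter` is a hypothetical class written for this example and is not the SDK's implementation. It shows why `on()` can be chained and why `off()` needs the same function reference that was registered.

```javascript
// Hypothetical stand-in for a listener registry; not the SDK's actual code.
class TinyEmitter {
    constructor() {
        this.listeners = new Map(); // event name -> Set of callbacks
    }

    on(event, callback) {
        if (!this.listeners.has(event)) this.listeners.set(event, new Set());
        this.listeners.get(event).add(callback);
        return this; // returning the instance is what enables chaining
    }

    off(event, callback) {
        const callbacks = this.listeners.get(event);
        if (callbacks) callbacks.delete(callback); // removes only this exact function
    }

    emit(event, ...args) {
        for (const callback of this.listeners.get(event) ?? []) callback(...args);
    }
}

const emitter = new TinyEmitter();
const seen = [];
const onStart = () => seen.push('start');

emitter.on('onCallStart', onStart).on('onCallEnd', (reason) => seen.push(reason));
emitter.emit('onCallStart');
emitter.off('onCallStart', onStart);
emitter.emit('onCallStart');       // no listener left, nothing is recorded
emitter.emit('onCallEnd', 'user');
console.log(seen); // [ 'start', 'user' ]
```

An anonymous arrow function passed to `on()` cannot be removed later, because there is no reference left to hand to `off()`.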

How It Works

The SDK provides a complete real-time voice communication system with intelligent audio processing and WebSocket-based communication.

Architecture Overview

WebSocket Connection: Establishes real-time bidirectional communication with the Vocallabs voice service
Audio Capture: Captures user microphone input with configurable sample rates and audio processing
Real-Time Processing: Processes and transmits audio data in real-time chunks
Agent Response: Receives and plays back agent audio responses with automatic buffering
Call Management: Handles call state, disconnection reasons, and cleanup

Key Features

Modern Audio Processing:

Uses AudioWorkletNode in modern browsers, with automatic fallback to ScriptProcessorNode
Configurable sample rates (32kHz user input, 24kHz agent output by default)
Built-in echo cancellation and noise suppression
Automatic audio normalization and gain control

Intelligent Connection Management:

Automatic reconnection handling
Connection timeout detection (8 seconds)
Graceful disconnect with reason tracking
Page unload protection to properly close connections

Real-Time Audio Streaming:

Low-latency audio transmission using WebSocket
Buffered playback for smooth agent responses
Automatic audio queue management
Cross-browser compatibility
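
Buffered playback of this kind is commonly done by scheduling each incoming chunk to start exactly when the previous one ends. The sketch below shows only that timing logic. `PlaybackScheduler` is a hypothetical class, and the SDK's actual queue is not documented here.

```javascript
// Hypothetical scheduler: tracks when the queue is free so chunks play back to back.
class PlaybackScheduler {
    constructor(leadTimeSeconds = 0) {
        this.leadTime = leadTimeSeconds; // safety margin before a chunk that starts "now"
        this.nextStart = 0;              // time at which the queue is free again
    }

    // Returns the start time (in seconds) for a chunk of `sampleCount` samples.
    schedule(sampleCount, sampleRate, now) {
        const start = Math.max(this.nextStart, now + this.leadTime);
        this.nextStart = start + sampleCount / sampleRate;
        return start;
    }
}

const scheduler = new PlaybackScheduler();
// Two half-second chunks at 24 kHz arrive almost together.
console.log(scheduler.schedule(12000, 24000, 0));   // 0
console.log(scheduler.schedule(12000, 24000, 0.1)); // 0.5
```

In a browser, `now` would be `audioContext.currentTime` and the returned value would be passed to `AudioBufferSourceNode.start(when)`. If the queue runs dry, the next chunk simply starts at the current time plus the lead time.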

User Experience:

Responsive button UI with status indicators
Visual feedback for connection states
Configurable button sizes and text
Accessibility support with ARIA labels

WebSocket Protocol

The SDK communicates using a structured WebSocket protocol:

Connection: wss://call.vocallabs.ai/ws/?agent={agentId}_{callId}_web_{sampleRate}
Events: JSON-based event system for call control and media streaming
Audio Format: Base64-encoded 16-bit PCM audio data
Status Tracking: Real-time call status and hangup source reporting
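
The connection string and audio format above can be sketched as follows. Both helper names are hypothetical. The little-endian byte order and the clamping of samples to the range [-1, 1] are assumptions, since the documentation only states "Base64-encoded 16-bit PCM". It also does not say which of the two sample rates goes into the URL, so that is left as a parameter.

```javascript
// Hypothetical helper: fill in the documented URL pattern.
function buildSocketUrl(baseUrl, agentId, callId, sampleRate) {
    return `${baseUrl}?agent=${agentId}_${callId}_web_${sampleRate}`;
}

// Hypothetical helper: Float32 samples -> 16-bit PCM (little-endian, assumed) -> Base64.
function encodePcm16Base64(float32Samples) {
    const bytes = new Uint8Array(float32Samples.length * 2);
    const view = new DataView(bytes.buffer);
    float32Samples.forEach((sample, i) => {
        const clamped = Math.max(-1, Math.min(1, sample));
        // Negative samples scale to -32768, positive samples to 32767.
        const int16 = Math.round(clamped < 0 ? clamped * 0x8000 : clamped * 0x7fff);
        view.setInt16(i * 2, int16, true);
    });
    let binary = '';
    for (const byte of bytes) binary += String.fromCharCode(byte);
    return btoa(binary); // btoa is available in browsers and in Node.js 16+
}

console.log(buildSocketUrl('wss://call.vocallabs.ai/ws/', 'agent-1', 'call-9', 24000));
// wss://call.vocallabs.ai/ws/?agent=agent-1_call-9_web_24000
console.log(encodePcm16Base64(new Float32Array([0, 0.5, -0.5])));
```

The JSON envelope that carries the Base64 payload is not described in this document, so it is left out of the sketch.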

Advanced Configuration

Audio Settings

const sdk = new VocalCallSDK({
    agentId: 'your-agent-id',
    callId: 'your-call-id',
    container: '#call-button',
    config: {
        audio: {
            userInputSampleRate: 32000,    // User microphone sample rate
            agentOutputSampleRate: 24000,  // Agent audio sample rate (24k/16k/48k)
            echoCancellation: true,        // Microphone echo cancellation
            noiseSuppression: true         // Microphone noise suppression
        }
    }
});

Custom Styling

The SDK automatically applies Tailwind CSS classes for styling. You can customize the appearance by:

Using the className parameter:

const sdk = new VocalCallSDK({
    // ... other options
    className: 'custom-call-button'
});

Overriding default styles:

.vocal-call-wrapper button {
    /* Your custom styles */
}

Button Sizes

Available button sizes with their default styling:

small: px-3 py-1 text-sm rounded-md
medium: px-4 py-2 text-base rounded-lg (default)
large: px-6 py-3 text-lg rounded-xl
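
To show how the `size` and `className` options could combine, here is a minimal sketch. `SIZE_CLASSES` and `buttonClasses` are hypothetical names. The class strings are the ones listed above, and the fallback to medium for unknown sizes is an assumption.

```javascript
// Class strings copied from the size list above.
const SIZE_CLASSES = {
    small: 'px-3 py-1 text-sm rounded-md',
    medium: 'px-4 py-2 text-base rounded-lg',
    large: 'px-6 py-3 text-lg rounded-xl'
};

// Hypothetical helper: build the final class string for the button.
function buttonClasses(size = 'medium', className = '') {
    const known = Object.prototype.hasOwnProperty.call(SIZE_CLASSES, size);
    const base = known ? SIZE_CLASSES[size] : SIZE_CLASSES.medium; // assumed fallback
    return className ? `${base} ${className}` : base;
}

console.log(buttonClasses('large', 'custom-call-button'));
// px-6 py-3 text-lg rounded-xl custom-call-button
```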

Error Handling

Errors are reported through the onError event, and the reason a call ended is passed to onCallEnd:

sdk.on('onError', (error) => {
    console.error('VocalCallSDK Error:', error);
    
    // Handle specific error types
    if (error.type === 'microphone_access_denied') {
        // Show user-friendly message about microphone permissions
    } else if (error.type === 'connection_failed') {
        // Handle connection issues
    }
});

sdk.on('onCallEnd', (reason) => {
    // Handle different disconnect reasons
    switch (reason) {
        case 'user':
            console.log('User ended the call');
            break;
        case 'agent':
            console.log('Agent ended the call');
            break;
        case 'connection_timeout':
            console.log('Connection timed out');
            break;
        case 'page_unload':
            console.log('Page was refreshed/closed during call');
            break;
        default:
            // Any other reason, e.g. 'server_initiated'
            console.log('Call ended:', reason);
    }
});

Browser Compatibility

The SDK supports all modern browsers with WebRTC capabilities:

Chrome 66+ (recommended)
Firefox 60+
Safari 12+
Edge 79+

Features:

Automatic fallback from AudioWorkletNode to ScriptProcessorNode for older browsers
WebSocket support with automatic reconnection
MediaDevices API for microphone access

Best Practices

Always handle errors: Implement onError event handlers for graceful error handling
Check microphone permissions: Ensure users grant microphone access before starting calls
Provide visual feedback: Use the status events to show connection state to users
Clean up resources: Call destroy() when the component is unmounted
Test across browsers: Verify functionality across different browser versions

Troubleshooting

Common Issues

"Microphone access denied"

Ensure HTTPS is used (required for microphone access)
Check browser microphone permissions
Verify the site isn't blocked from accessing media
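
The first two checks can be automated before a call is started. The sketch below is a hypothetical helper, not part of the SDK. It relies on the standard `isSecureContext` and `navigator.mediaDevices.getUserMedia` browser properties and takes the global object as a parameter so it can be exercised with a stub.

```javascript
// Hypothetical helper: list reasons microphone capture cannot work in this environment.
function microphonePrerequisites(scope) {
    const problems = [];
    if (!scope.isSecureContext) {
        problems.push('Page is not served from a secure context (HTTPS or localhost).');
    }
    if (typeof scope.navigator?.mediaDevices?.getUserMedia !== 'function') {
        problems.push('navigator.mediaDevices.getUserMedia is unavailable.');
    }
    return problems;
}

// Stub standing in for a page loaded over plain HTTP.
const insecurePage = { isSecureContext: false, navigator: {} };
console.log(microphonePrerequisites(insecurePage).length); // 2
```

In a page, calling `microphonePrerequisites(window)` before `sdk.startCall()` gives a chance to show a targeted message instead of a generic permission error. An empty array means neither problem was detected. It does not mean the user has granted permission.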

"Connection timeout"

Check network connectivity
Verify the WebSocket URL is accessible
Ensure your firewall doesn't block WebSocket connections

"No audio from agent"

Check audio output devices
Verify browser audio permissions
Test with different audio sample rates