@voxdiscover/voiceserver
v0.1.0
Framework-agnostic TypeScript SDK for Voxdiscover voice agents
Framework-agnostic TypeScript SDK for Voice_server voice agents. Provides session token authentication, Daily.js WebRTC integration, and typed events for building voice-enabled applications.
Installation
npm install @voxdiscover/voiceserver @daily-co/daily-js
# or
pnpm add @voxdiscover/voiceserver @daily-co/daily-js
# or
yarn add @voxdiscover/voiceserver @daily-co/daily-js
Note: @daily-co/daily-js is a peer dependency and must be installed separately.
Quick Start
1. Obtain Session Token
First, obtain a session token from your backend (which calls Voice_server's session API):
// Your backend endpoint
const response = await fetch('https://voiceserver.voxdiscover.com/api/voice-session', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ userId: 'user_123' }),
});
const { token } = await response.json();
2. Initialize SDK
import { VoiceAgent } from '@voxdiscover/voiceserver';
const agent = new VoiceAgent({ token });
3. Subscribe to Events
// Connection state changes
agent.on('connection:state', (state) => {
  console.log('Connection state:', state);
  // states: 'connecting' | 'connected' | 'reconnecting' | 'disconnected' | 'failed'
});
// Transcripts (streaming)
agent.on('transcript:interim', ({ text, speaker }) => {
  console.log(`[interim] ${speaker}: ${text}`);
});
agent.on('transcript:final', ({ text, speaker }) => {
  console.log(`[final] ${speaker}: ${text}`);
});
// Errors
agent.on('connection:error', (error) => {
  console.error('Connection error:', error.message);
  if (error.context?.suggestion) {
    console.log('Suggestion:', error.context.suggestion);
  }
});
4. Connect to Voice Agent
try {
  await agent.connect();
  console.log('Connected! State:', agent.state);
} catch (error) {
  console.error('Failed to connect:', error);
}
5. Control Audio
// Mute microphone
agent.mute();
// Unmute microphone
agent.unmute();
6. Disconnect
await agent.disconnect();
API Reference
VoiceAgent
Main SDK class for managing voice conversations.
Constructor
new VoiceAgent(config: VoiceAgentConfig)
Config options:
- token (required): Session token from backend
- baseUrl (optional): Backend base URL for validation (default: https://voiceserver.voxdiscover.com)
- reconnection (optional):
  - enabled (default: true): Enable automatic reconnection
  - maxAttempts (default: 5): Max reconnection attempts
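As a rough illustration of how the documented defaults combine, the sketch below merges a partial config with the default values listed above. The `AgentConfigLike` and `ResolvedConfig` types and the `resolveConfig` helper are hypothetical names introduced here; they are not part of the SDK, which applies its defaults internally.

```typescript
// Hypothetical local mirror of the documented config shape (not imported from the SDK).
interface ReconnectionOptions {
  enabled?: boolean;
  maxAttempts?: number;
}

interface AgentConfigLike {
  token: string;
  baseUrl?: string;
  reconnection?: ReconnectionOptions;
}

interface ResolvedConfig {
  token: string;
  baseUrl: string;
  reconnection: { enabled: boolean; maxAttempts: number };
}

// Fill in the defaults listed in the config options above.
function resolveConfig(config: AgentConfigLike): ResolvedConfig {
  return {
    token: config.token,
    baseUrl: config.baseUrl ?? 'https://voiceserver.voxdiscover.com',
    reconnection: {
      enabled: config.reconnection?.enabled ?? true,
      maxAttempts: config.reconnection?.maxAttempts ?? 5,
    },
  };
}

// Only the token is required; everything else falls back to a default.
const resolved = resolveConfig({ token: 'example-token', reconnection: { maxAttempts: 3 } });
console.log(resolved);
```

Here only `maxAttempts` is overridden, so `baseUrl` and `reconnection.enabled` keep their documented defaults.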
Properties
state: Current connection state (read-only)
- 'connecting' - Establishing connection
- 'connected' - Successfully connected
- 'reconnecting' - Attempting to reconnect
- 'disconnected' - Not connected
- 'failed' - Connection failed
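Because the state is a closed set of five strings, it lends itself to an exhaustive switch. The sketch below maps each state to a status label; `describeState` and the label wording are hypothetical, and the `ConnectionState` type is redeclared locally only so the snippet stands alone (in an application it would be imported from the SDK).

```typescript
// Local copy of the five documented states; the SDK exports its own ConnectionState type.
type ConnectionState =
  | 'connecting'
  | 'connected'
  | 'reconnecting'
  | 'disconnected'
  | 'failed';

// Hypothetical helper: map each state to a status label for the UI.
function describeState(state: ConnectionState): string {
  switch (state) {
    case 'connecting':
      return 'Connecting...';
    case 'connected':
      return 'Connected';
    case 'reconnecting':
      return 'Connection lost, retrying...';
    case 'disconnected':
      return 'Call ended';
    case 'failed':
      return 'Could not connect';
    default: {
      // Compile-time exhaustiveness check: stops type-checking if a state is added.
      const unreachable: never = state;
      return unreachable;
    }
  }
}

console.log(describeState('reconnecting'));
```

In an application this would typically be called from a connection:state listener, passing the result to whatever renders the status line.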
Methods
- connect(): Promise<void> - Connect to voice session (validates token, joins Daily room, starts remote audio)
- disconnect(): Promise<void> - Disconnect and cleanup resources (leaves room, releases audio elements)
- mute(): void - Mute microphone
- unmute(): void - Unmute microphone
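Since mute() and unmute() are separate calls, a single mute button needs to remember which one to call next. The sketch below is one possible way to do that; `createMuteToggle` and `AudioControls` are hypothetical names, and the assumption that the microphone starts unmuted is not stated in this README.

```typescript
// Minimal structural type covering only the two documented audio methods.
interface AudioControls {
  mute(): void;
  unmute(): void;
}

// Hypothetical helper: returns a toggle function that tracks mute state locally.
// Assumes the microphone starts unmuted, which the README does not state.
function createMuteToggle(agent: AudioControls): () => boolean {
  let muted = false;
  return () => {
    if (muted) {
      agent.unmute();
    } else {
      agent.mute();
    }
    muted = !muted;
    return muted; // new state, e.g. for updating a button label
  };
}

// Stand-in object so the sketch runs without the SDK; pass a real VoiceAgent instead.
const calls: string[] = [];
const fakeAgent: AudioControls = {
  mute: () => {
    calls.push('mute');
  },
  unmute: () => {
    calls.push('unmute');
  },
};

const toggleMute = createMuteToggle(fakeAgent);
toggleMute(); // first press mutes
toggleMute(); // second press unmutes
console.log(calls);
```

An alternative that avoids local state is to update the button from the audio:muted and audio:unmuted events listed under Events.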
Events
Subscribe to events using agent.on(event, callback):
Connection events:
- connection:state - (state: ConnectionState) => void
- connection:error - (error: VoiceAgentError) => void
Transcript events:
- transcript:interim - (data: TranscriptData) => void - Partial transcripts (not emitted by all agent types)
- transcript:final - (data: TranscriptData) => void - One event per completed turn; emitted in real-time as each user or agent turn finishes
Audio events:
- audio:muted - () => void
- audio:unmuted - () => void
Session events:
- session:expiring - (expiresIn: number) => void - 5 minutes before expiration
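To show how the two transcript events might be combined in a chat-style view, here is a minimal sketch that keeps completed turns and replaces each speaker's partial text as it streams in. `TranscriptLog` and `TranscriptLine` are hypothetical; the sketch assumes `speaker` is a string and uses only the `text` and `speaker` fields seen in the Quick Start.

```typescript
// Local copy of the fields used in the Quick Start; the SDK exports TranscriptData.
interface TranscriptLine {
  text: string;
  speaker: string; // assumed to be a string identifier
}

// Hypothetical helper: keeps completed turns plus the latest partial text per speaker.
class TranscriptLog {
  private readonly finals: TranscriptLine[] = [];
  private readonly interim = new Map<string, string>();

  // Call from a 'transcript:interim' listener: overwrite the speaker's partial text.
  onInterim(line: TranscriptLine): void {
    this.interim.set(line.speaker, line.text);
  }

  // Call from a 'transcript:final' listener: commit the turn and drop the partial.
  onFinal(line: TranscriptLine): void {
    this.interim.delete(line.speaker);
    this.finals.push({ text: line.text, speaker: line.speaker });
  }

  // Completed turns first, then any in-progress text marked with an ellipsis.
  render(): string[] {
    const done = this.finals.map((l) => `${l.speaker}: ${l.text}`);
    const pending = Array.from(this.interim, ([speaker, text]) => `${speaker}: ${text}...`);
    return [...done, ...pending];
  }
}

const log = new TranscriptLog();
log.onInterim({ speaker: 'user', text: 'What are your' });
log.onFinal({ speaker: 'user', text: 'What are your opening hours?' });
log.onInterim({ speaker: 'agent', text: 'We are open' });
console.log(log.render());
```

Because interim transcripts are not emitted by all agent types, the view still works when only onFinal is ever called.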
Error Handling
The SDK provides typed error classes for different scenarios:
import {
  VoiceAgentError,
  TokenExpiredError,
  TokenInvalidError,
  ConnectionFailedError,
  PermissionDeniedError,
} from '@voxdiscover/voiceserver';
try {
  await agent.connect();
} catch (error) {
  // Pattern 1: instanceof checks
  if (error instanceof TokenExpiredError) {
    console.log('Token expired, requesting new session...');
    // Request new token from backend
  } else if (error instanceof PermissionDeniedError) {
    console.log('Microphone permission denied');
    // Show permission request UI
  }
  // Pattern 2: code property checks (narrow the caught value to VoiceAgentError first)
  if (error instanceof VoiceAgentError) {
    if (error.code === 'CONNECTION_FAILED' && error.retryable) {
      console.log('Retryable error, will auto-reconnect');
    }
    // Access error details
    console.log('Message:', error.message);
    console.log('Suggestion:', error.context?.suggestion);
    console.log('Retryable:', error.retryable);
  }
}
Error types:
- TokenExpiredError - Session token expired (non-retryable)
- TokenInvalidError - Token malformed or invalid (non-retryable)
- ConnectionFailedError - WebRTC connection failed (retryable)
- PermissionDeniedError - Microphone permission denied (non-retryable)
- NetworkError - Network error during API call (retryable)
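The retryable flag can also drive an application-level retry of connect(), for example when automatic reconnection is disabled. The sketch below is one possible policy, not the SDK's own: `planRetry`, `ErrorLike`, and the backoff numbers (500 ms base, 8 s cap) are all assumptions made for illustration.

```typescript
// Structural subset of the documented error fields (code, message, retryable).
interface ErrorLike {
  code: string;
  message: string;
  retryable: boolean;
}

// Hypothetical policy: exponential backoff starting at 500 ms, capped at 8 s.
// These numbers are illustrative; they are not the SDK's reconnection settings.
// Returns the delay in milliseconds before the next attempt, or null to give up.
function planRetry(error: ErrorLike, attempt: number, maxAttempts = 5): number | null {
  if (!error.retryable || attempt >= maxAttempts) {
    return null; // give up: surface the error to the user instead
  }
  return Math.min(500 * 2 ** attempt, 8000);
}

const failure: ErrorLike = { code: 'CONNECTION_FAILED', message: 'example failure', retryable: true };
console.log(planRetry(failure, 0)); // delay before the first manual retry
console.log(planRetry({ ...failure, retryable: false }, 0)); // null: do not retry
```

Non-retryable errors such as an expired token or a denied microphone permission return null immediately, which matches the error type list above: those cases need a new token or user action rather than another attempt.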
Complete Example
import { VoiceAgent, TokenExpiredError } from '@voxdiscover/voiceserver';
// updateUI, addMessageToChat, showError, and showUI are app-defined helpers.
async function startVoiceCall() {
  // 1. Get session token from your backend
  const { token } = await fetch('/api/voice-session', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ agentId: 'support-agent' }),
  }).then(r => r.json());
  // 2. Initialize agent
  const agent = new VoiceAgent({
    token,
    baseUrl: 'https://voiceserver.voxdiscover.com',
    reconnection: { enabled: true, maxAttempts: 5 },
  });
  // 3. Set up event listeners
  agent.on('connection:state', (state) => {
    updateUI({ connectionState: state });
  });
  agent.on('transcript:final', ({ text, speaker }) => {
    addMessageToChat({ speaker, text });
  });
  agent.on('connection:error', async (error) => {
    if (error instanceof TokenExpiredError) {
      // Start over: startVoiceCall() fetches a fresh token and creates a new agent
      await startVoiceCall();
    } else {
      showError(error.message, error.context?.suggestion);
    }
  });
  // 4. Connect
  try {
    await agent.connect();
    showUI('connected');
  } catch (error) {
    showUI('error', error instanceof Error ? error.message : String(error));
  }
  return agent;
}
// Usage in UI event handlers
document.getElementById('startCall').addEventListener('click', async () => {
  const agent = await startVoiceCall();
  document.getElementById('muteBtn').addEventListener('click', () => {
    agent.mute();
  });
  document.getElementById('endCall').addEventListener('click', async () => {
    await agent.disconnect();
  });
});
Analytics Hooks
The SDK provides standardized analytics hooks for integrating with observability platforms like Segment, DataDog, or PostHog. Analytics hooks emit lifecycle and error events only (not transcripts or audio events) to keep analytics data clean.
Registering an Analytics Callback
import { VoiceAgent } from '@voxdiscover/voiceserver';
const agent = new VoiceAgent({ token });
agent.onAnalyticsEvent((event) => {
  console.log('Analytics event:', event.eventType, {
    sessionId: event.sessionId,
    agentId: event.agentId,
    userId: event.userId,
    timestamp: event.timestamp,
  });
});
Event Types
| Event Type | When Emitted |
|------------|-------------|
| session_started | WebRTC connection established (joined Daily room) |
| session_ended | Session disconnected (explicit disconnect) |
| connection_failed | Connection error (token invalid, network failure, etc.) |
| agent_swap_completed | Agent hot-swap completed successfully |
| agent_swap_failed | Agent hot-swap failed |
| error | Categorized SDK error requiring developer attention |
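Since the payload carries the event type as a plain string, it can help to pin the six values from the table down as a string-literal union with a runtime guard. The names below (`ANALYTICS_EVENT_TYPES`, `AnalyticsEventType`, `isKnownEventType`, `isFailureEvent`) are hypothetical and not exported by the SDK; the failure grouping is likewise an assumption about how one might bucket events on a dashboard.

```typescript
// The six event types from the table, as a string-literal union.
const ANALYTICS_EVENT_TYPES = [
  'session_started',
  'session_ended',
  'connection_failed',
  'agent_swap_completed',
  'agent_swap_failed',
  'error',
] as const;

type AnalyticsEventType = (typeof ANALYTICS_EVENT_TYPES)[number];

// Type guard: true when the string is one of the documented event types.
function isKnownEventType(value: string): value is AnalyticsEventType {
  return (ANALYTICS_EVENT_TYPES as readonly string[]).indexOf(value) !== -1;
}

// Hypothetical grouping for dashboards: which events indicate a failure.
function isFailureEvent(type: AnalyticsEventType): boolean {
  return type === 'connection_failed' || type === 'agent_swap_failed' || type === 'error';
}

const incoming: string = 'agent_swap_failed';
if (isKnownEventType(incoming)) {
  // Inside this branch `incoming` is narrowed to AnalyticsEventType.
  console.log(incoming, 'failure?', isFailureEvent(incoming));
}
```

Unknown strings fail the guard, so a callback can skip or log them if a future SDK version adds event types.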
Event Payload Structure
interface AnalyticsEvent {
  timestamp: number; // Unix timestamp in milliseconds
  eventType: string; // One of the event types above
  sessionId: string; // Session identifier from token
  agentId?: string; // Agent identifier from token
  userId?: string; // User identifier from session context
  customContext?: Record<string, any>; // Context from session creation
  error?: {
    code: string; // Programmatic error code
    message: string; // Human-readable error description
    retryable: boolean; // Whether operation can be retried
  };
}
Integration with Segment
import Analytics from 'analytics';
import segmentPlugin from '@analytics/segment';
import { VoiceAgent } from '@voxdiscover/voiceserver';
// Initialize Segment
const analytics = Analytics({
  app: 'my-voice-app',
  plugins: [
    segmentPlugin({
      writeKey: 'YOUR_SEGMENT_WRITE_KEY',
    }),
  ],
});
// Register analytics callback
const agent = new VoiceAgent({ token });
agent.onAnalyticsEvent((event) => {
  analytics.track(event.eventType, {
    session_id: event.sessionId,
    agent_id: event.agentId,
    user_id: event.userId,
    timestamp: event.timestamp,
    // Error details (only present on failure events)
    ...(event.error && {
      error_code: event.error.code,
      error_message: event.error.message,
      error_retryable: event.error.retryable,
    }),
    // Custom context (from session creation)
    ...event.customContext,
  });
});
await agent.connect();
Integration with DataDog / PostHog
Any analytics platform that accepts key-value event properties works the same way:
agent.onAnalyticsEvent((event) => {
  // PostHog example
  posthog.capture(event.eventType, {
    distinct_id: event.userId,
    session_id: event.sessionId,
    agent_id: event.agentId,
    $timestamp: new Date(event.timestamp).toISOString(),
  });
  // DataDog example
  datadogRum.addAction(event.eventType, {
    session_id: event.sessionId,
    user_id: event.userId,
  });
});
Multiple Callbacks
Multiple callbacks can be registered - all receive every event:
// Log to console
agent.onAnalyticsEvent((event) => {
  console.log('[Voice Analytics]', event.eventType, event.sessionId);
});
// Send to Segment
agent.onAnalyticsEvent((event) => {
  analytics.track(event.eventType, { session_id: event.sessionId });
});
// Send to custom backend
agent.onAnalyticsEvent((event) => {
  fetch('/api/analytics', {
    method: 'POST',
    body: JSON.stringify(event),
  });
});
IMPORTANT: Read-Only Callbacks
Analytics callbacks MUST be read-only. Do NOT call SDK methods (connect, disconnect, mute, etc.) inside a callback. Calling SDK methods from within an analytics callback creates a circular event chain that triggers the circuit breaker and disables analytics for the remainder of the session.
Do NOT do this:
// WRONG: Calling SDK methods inside analytics callback
agent.onAnalyticsEvent((event) => {
  if (event.eventType === 'session_ended') {
    agent.connect(); // This will trigger another analytics event -> infinite loop
  }
});
Do this instead:
// CORRECT: React to events outside the callback
agent.on('connection:state', (state) => {
  if (state === 'disconnected') {
    handleReconnect(); // Handle reconnection in event listener, not analytics callback
  }
});
agent.onAnalyticsEvent((event) => {
  // Read-only: only forward events to external services
  myAnalytics.track(event.eventType, { session_id: event.sessionId });
});
TypeScript Support
The SDK is written in TypeScript and includes full type definitions. All types are exported:
import type {
  VoiceAgentConfig,
  ConnectionState,
  TranscriptData,
  VoiceAgentEvents,
  VoiceAgentErrorCode,
} from '@voxdiscover/voiceserver';
Internal Architecture Notes
Headless Mode Audio
VoiceAgent uses DailyIframe.createCallObject() (headless mode), which does not auto-play
remote audio. The SDK manages this internally: a track-started handler creates an <audio>
element per remote participant and pipes the incoming track through it. No additional setup is
needed in your application.
Transcript Delivery
Transcripts are streamed in real-time over Daily's app-message channel. Each completed turn
(user or agent) triggers one transcript:final event. The server uses Pipecat's
OutputTransportMessageFrame API to broadcast each turn to all room participants.
Browser Support
- Chrome 90+
- Firefox 88+
- Safari 14+
- Edge 90+
Requirements:
- WebRTC support
- getUserMedia API support
- ES2022 features
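A pre-flight check before creating the agent can turn an unsupported browser into a clear message rather than a failed connection. The sketch below probes the first two requirements; `missingRequirements` and `BrowserLike` are hypothetical names, and taking the environment as a parameter is a choice made here so the function can also run outside a browser.

```typescript
// Shape of the globals the check reads; lets the function run outside a browser too.
interface BrowserLike {
  RTCPeerConnection?: unknown;
  navigator?: {
    mediaDevices?: { getUserMedia?: unknown };
  };
}

// Hypothetical pre-flight check for the first two requirements listed above.
// ES2022 support is a build/runtime concern and is not probed here.
function missingRequirements(env: BrowserLike): string[] {
  const missing: string[] = [];
  if (typeof env.RTCPeerConnection !== 'function') {
    missing.push('WebRTC (RTCPeerConnection)');
  }
  if (typeof env.navigator?.mediaDevices?.getUserMedia !== 'function') {
    missing.push('getUserMedia');
  }
  return missing;
}

// In a browser, pass `window`; an empty object simulates an unsupported environment.
console.log(missingRequirements({}));
```

Note that browsers expose navigator.mediaDevices only in secure contexts, so this check can report getUserMedia as missing on a page served over plain HTTP even in a supported browser.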
License
MIT
