@voxera/sdk-core

v1.0.13


Core SDK for Voxera Voice Platform


Voxera SDK — Core

Core TypeScript SDK for the Voxera Voice Platform: a platform-agnostic client library that powers all framework-specific SDKs (React, React Native, iOS, Android, Flutter).

Platform Endpoints

| Service | URL |
|---------|-----|
| Media Server (WebSocket) | wss://media.voxera-voice.com |
| Demo | https://demo.voxera-voice.com |

Pass the Media Server URL as `serverUrl` when creating a client. Get your API key (the `appKey` option) at app.voxera-voice.com.
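
As a rough sketch of how these two values fit together, the helper below builds a config object and rejects a non-WebSocket URL before it reaches the client. `ClientConfigSketch` and `buildConfig` are illustrative names assumed for this example, not part of the SDK; only the `appKey` and `serverUrl` field names and the endpoint URL come from this README.

```typescript
// Hypothetical local type: mirrors only the two required fields documented in this README.
interface ClientConfigSketch {
  appKey: string;
  serverUrl: string;
}

const MEDIA_SERVER_URL = 'wss://media.voxera-voice.com';

// Hypothetical helper (not an SDK function): validates inputs before a client is created.
function buildConfig(appKey: string, serverUrl: string = MEDIA_SERVER_URL): ClientConfigSketch {
  if (appKey.trim() === '') {
    throw new Error('appKey is required');
  }
  // The media server is reached over WebSocket, so expect a ws:// or wss:// URL.
  if (!/^wss?:\/\//.test(serverUrl)) {
    throw new Error(`serverUrl must be a WebSocket URL, got: ${serverUrl}`);
  }
  return { appKey, serverUrl };
}
```

The returned object could then be spread into the client options alongside any optional fields. Passing the Demo URL here would throw, since it is an `https://` address rather than a WebSocket endpoint.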

Meeting Modes

AI Meeting (`ai-meeting`)

A real-time voice conversation with an AI assistant. The AI listens via speech-to-text (STT), processes the transcribed text through a configurable LLM (OpenAI, Anthropic, Ollama), and responds with natural speech via text-to-speech (TTS). Multiple participants can talk to the same AI in a shared room.

Capabilities:

- Real-time voice AI conversation (bidirectional audio over WebRTC/mediasoup)
- Configurable AI persona via dashboard Agent settings (system prompt, model, temperature, max tokens)
- Multiple TTS providers (OpenAI, ElevenLabs, Azure) with voice selection
- Multiple STT providers (Google, OpenAI Whisper)
- Live transcription of all speakers
- Video and screen sharing with optional AI vision (`enableVideoAI`)
- Ask AI: send the current transcript to AI and get a spoken response
- Text message input alongside voice
- AI-generated meeting summaries with action items and key topics
- AI-generated meeting minutes (title, attendees, duration, sections)
- Bookmarks: mark key moments and flag as action items

Normal Meeting (`normal-meeting`)

A multi-participant audio/video meeting without AI voice. Participants talk to each other via WebRTC. AI features are available in text-only form.

Capabilities:

- Multi-participant real-time audio and video (WebRTC/mediasoup SFU)
- Host controls: mute/unmute participant, mute all, remove participant, transfer host, lock room, end meeting
- Waiting room: enable/disable, admit or deny individual participants, admit all
- Live transcription (toggle on/off by host)
- Ask AI (text): send a prompt to AI, receive a streamed text response (no voice)
- AI-generated meeting summaries and minutes
- Bookmarks
- Screen sharing

Common Features (Both Modes)

- WebRTC via mediasoup: SFU-based low-latency audio/video transport with TURN fallback
- Room management: create rooms with codes, join by code, display names, participant list
- Real-time events: participant join/leave, mute status, host changes, room lock, transcription entries
- Webhook integration: receive server-side events on session start/end, messages, errors
- Usage tracking: minutes consumed, session history, breakdowns by period
- Configurable: override AI provider, voice, language, video resolution per session

Installation

npm install @voxera/sdk-core

Quick Start

import { MayaVoiceClient } from '@voxera/sdk-core';

const client = new MayaVoiceClient({
  appKey: 'your-app-key',                    // from https://app.voxera-voice.com
  serverUrl: 'wss://media.voxera-voice.com',
  agentId: 'your-agent-id',                  // created in the dashboard
  onConnectionStatusChange: (status) => console.log('Connection:', status),
  onMessage: (msg) => console.log('Message:', msg),
  onTranscript: (text, isFinal) => console.log('Transcript:', text),
});

// Connect and start AI conversation
await client.connect();
await client.startConversation();

// Later
await client.endConversation();
client.disconnect();
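
If anything throws between `startConversation()` and `endConversation()`, the Quick Start flow above would skip cleanup. One possible way to guard against that is sketched below. `VoiceSession` and `runConversation` are hypothetical names introduced for this example; the interface lists only the four lifecycle methods shown above, and their return types are assumptions.

```typescript
// Hypothetical minimal interface: only the four lifecycle methods used in Quick Start.
// Return types are assumed; the README only shows that the first three are awaited.
interface VoiceSession {
  connect(): Promise<unknown>;
  startConversation(): Promise<unknown>;
  endConversation(): Promise<unknown>;
  disconnect(): unknown;
}

// Runs `work` inside an active conversation and always tears down afterwards.
async function runConversation(
  session: VoiceSession,
  work: () => Promise<void>,
): Promise<void> {
  await session.connect();
  try {
    await session.startConversation();
    try {
      await work();
    } finally {
      // Only end a conversation that actually started.
      await session.endConversation();
    }
  } finally {
    // Always release the connection once connect() has succeeded.
    session.disconnect();
  }
}
```

A client created as in Quick Start could be passed as `session`, provided its method signatures are compatible with the assumed interface.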

Multi-Participant Room

// Host creates a room (client and otherClient are separate MayaVoiceClient instances)
await client.connect();
const room = await client.createRoom('Alice', 'Team Standup', 'normal-meeting');
console.log('Room code:', room.roomCode); // share this code

// Another user joins
await otherClient.connect();
await otherClient.joinRoom(room.roomCode, 'Bob');

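// Host controls (sessionId and targetClientId are placeholders for your room's session ID and the target participant's ID)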
await client.muteParticipant(sessionId, targetClientId);
await client.lockRoom(sessionId, true);
await client.toggleTranscription(sessionId, true);

// AI features (text-only in normal-meeting mode)
await client.askAiText(sessionId, 'Summarize what was discussed');
await client.generateSummary(sessionId);
await client.generateMinutes(sessionId);
await client.addBookmark(sessionId, 'Key decision made', true);

Configuration

`MayaVoiceConfig`

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `appKey` | `string` | Yes | API key from the dashboard |
| `serverUrl` | `string` | Yes | wss://media.voxera-voice.com |
| `agentId` | `string` | No | Agent ID (created in the dashboard). Defines AI persona, model, voice, tools, and endpoints |
| `userId` | `string` | No | Identify the end-user for analytics |
| `metadata` | `Record<string, string>` | No | Custom key-value pairs passed to agent endpoints |
| `videoConfig` | `VideoConfig` | No | Camera/video settings |
| `screenShareConfig` | `ScreenShareConfig` | No | Screen share settings |
| `connectionOptions` | `ConnectionOptions` | No | Reconnect/timeout settings |

`VideoConfig`

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `enabled` | `boolean` | false | Enable video on connect |
| `width` | `number` | 640 | Video width |
| `height` | `number` | 480 | Video height |
| `frameRate` | `number` | 30 | Frames per second |
| `facingMode` | `string` | user | user (front) · environment (back) |
| `enableVideoAI` | `boolean` | false | Send video frames to AI pipeline |
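
To illustrate how the defaults in the table above combine with caller overrides, here is a minimal sketch. `VideoConfigSketch` and `resolveVideoConfig` are hypothetical names for this example, and narrowing `facingMode` to the two listed values is an assumption (the table documents it as a plain string). The SDK may apply its defaults differently.

```typescript
// Local copy of the documented VideoConfig fields, all optional.
interface VideoConfigSketch {
  enabled?: boolean;
  width?: number;
  height?: number;
  frameRate?: number;
  facingMode?: 'user' | 'environment'; // assumption: narrowed from `string`
  enableVideoAI?: boolean;
}

// Default values as listed in the VideoConfig table.
const VIDEO_DEFAULTS: Required<VideoConfigSketch> = {
  enabled: false,
  width: 640,
  height: 480,
  frameRate: 30,
  facingMode: 'user',
  enableVideoAI: false,
};

// Hypothetical helper: fills in defaults for any fields the caller omitted.
function resolveVideoConfig(partial: VideoConfigSketch = {}): Required<VideoConfigSketch> {
  return { ...VIDEO_DEFAULTS, ...partial };
}

// Example: rear camera at 1280x720, everything else left at its default.
const rearCamera = resolveVideoConfig({
  enabled: true,
  width: 1280,
  height: 720,
  facingMode: 'environment',
});
```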

`ConnectionOptions`

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `autoReconnect` | `boolean` | true | Auto-reconnect on disconnect |
| `reconnectAttempts` | `number` | 5 | Max reconnect attempts |
| `reconnectDelay` | `number` | 1000 | Delay between attempts (ms) |
| `timeout` | `number` | 10000 | Connection timeout (ms) |
| `iceServers` | `RTCIceServer[]` | - | Custom ICE/TURN servers |
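
The interaction between `reconnectAttempts` and `reconnectDelay` can be pictured with the retry loop below. This is only a sketch of a fixed-delay retry under the documented defaults; the SDK's internal reconnect logic is not shown in this README and may differ (for example, it could use backoff). `connectWithRetry`, `ReconnectOptionsSketch`, and the injected `sleep` parameter are all hypothetical.

```typescript
interface ReconnectOptionsSketch {
  reconnectAttempts: number; // documented default: 5
  reconnectDelay: number;    // documented default: 1000 (ms)
}

// Hypothetical helper: retries `connect` with a fixed delay between attempts.
// `sleep` is injected so the loop can be exercised without real timers.
async function connectWithRetry(
  connect: () => Promise<unknown>,
  sleep: (ms: number) => Promise<void>,
  options: ReconnectOptionsSketch = { reconnectAttempts: 5, reconnectDelay: 1000 },
): Promise<number> {
  let lastError: unknown = new Error('reconnectAttempts must be at least 1');
  for (let attempt = 1; attempt <= options.reconnectAttempts; attempt++) {
    try {
      await connect();
      return attempt; // how many attempts it took
    } catch (error) {
      lastError = error;
      // Wait before the next attempt, but not after the final one.
      if (attempt < options.reconnectAttempts) {
        await sleep(options.reconnectDelay);
      }
    }
  }
  throw lastError;
}
```

In a browser or Node environment, `sleep` could be a small wrapper that resolves a promise from a timer callback. With the defaults, a connection that never succeeds would be tried five times with four one-second pauses in between.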

Events

| Event | Payload | Description |
|-------|---------|-------------|
| `connection:status` | `ConnectionStatus` | idle → connecting → connected → disconnected |
| `conversation:status` | `ConversationStatus` | idle → starting → active → ending → ended |
| `speaking:status` | `SpeakingStatus` | none · user · ai |
| `message` | `ConversationMessage` | New chat message |
| `transcript` | `(text, isFinal)` | Real-time speech transcription |
| `participant:joined` | `RoomParticipant` | Someone joined the room |
| `participant:left` | `RoomParticipant` | Someone left the room |
| `participants:updated` | `RoomParticipant[]` | Full participant list update |
| `you:muted` | - | You were muted by the host |
| `you:removed` | - | You were removed from the room |
| `host:changed` | - | Host was transferred |
| `meeting:ended` | - | Meeting ended by host |
| `room:locked` | `boolean` | Room lock status changed |
| `transcription:toggled` | `boolean` | Transcription toggled by host |
| `transcription:live` | `TranscriptionEntry` | Live transcription entry |
| `ask-ai:started` | - | AI voice response started |
| `ask-ai:processing` | - | AI is processing |
| `ask-ai:cancelled` | - | AI voice response cancelled |
| `ask-ai-text:started` | - | AI text response started |
| `ask-ai-text:chunk` | `string` | Streamed AI text chunk |
| `ask-ai-text:response` | `string` | Complete AI text response |
| `summary:generated` | `MeetingSummary` | AI summary ready |
| `minutes:generated` | `MeetingMinutes` | AI minutes ready |
| `bookmark:added` | `MeetingBookmark` | Bookmark created |
| `bookmark:removed` | `string` | Bookmark removed (ID) |
| `waiting-room:updated` | `WaitingRoomEntry[]` | Waiting room list changed |
| `waiting-room:toggled` | `boolean` | Waiting room enabled/disabled |
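
The three `ask-ai-text:*` events in the table above suggest a start, stream, finish pattern. The sketch below shows one way to assemble streamed chunks into a full response. This README does not show how handlers are registered on the client, so `MiniEmitter` is a stand-in written for this example, and `collectAiText` is a hypothetical helper; only the event names and payload types come from the table.

```typescript
// Hypothetical listener signature for the string-payload events used here.
type Listener = (payload?: string) => void;

// Tiny stand-in emitter so the sketch runs without the SDK.
class MiniEmitter {
  private listeners = new Map<string, Listener[]>();

  on(event: string, listener: Listener): void {
    const list = this.listeners.get(event) ?? [];
    list.push(listener);
    this.listeners.set(event, list);
  }

  emit(event: string, payload?: string): void {
    for (const listener of this.listeners.get(event) ?? []) {
      listener(payload);
    }
  }
}

// Collects streamed chunks and reports the text once the response completes.
function collectAiText(emitter: MiniEmitter, onDone: (text: string) => void): void {
  let buffer = '';
  emitter.on('ask-ai-text:started', () => { buffer = ''; });
  emitter.on('ask-ai-text:chunk', (chunk) => { buffer += chunk ?? ''; });
  // Prefer the complete response payload; fall back to the accumulated chunks.
  emitter.on('ask-ai-text:response', (full) => onDone(full ?? buffer));
}
```

Rendering each chunk as it arrives gives a live typing effect, while the final `ask-ai-text:response` payload is the safer value to store.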

Methods

Connection

| Method | Description |
|--------|-------------|
| `connect()` | Connect to the media server |
| `disconnect()` | Disconnect and clean up |
| `connectSocketOnly()` | Connect socket without WebRTC (for room setup) |
| `setupRoomWebRTC()` | Initialize WebRTC after joining a room |

Conversation

| Method | Description |
|--------|-------------|
| `startConversation()` | Start voice conversation |
| `endConversation()` | End voice conversation |
| `sendMessage(content)` | Send a text message |

Media

| Method | Description |
|--------|-------------|
| `setMuted(muted)` | Mute/unmute microphone |
| `enableVideo()` | Start camera |
| `disableVideo()` | Stop camera |
| `toggleVideo()` | Toggle camera on/off |
| `startScreenShare()` | Start screen sharing |
| `stopScreenShare()` | Stop screen sharing |
| `toggleScreenShare()` | Toggle screen sharing |
| `getStats()` | Get WebRTC connection stats |

Host Controls (requires host role)

| Method | Description |
|--------|-------------|
| `muteParticipant(sessionId, targetId)` | Mute a participant |
| `muteAll(sessionId)` | Mute all participants |
| `unmuteAll(sessionId)` | Unmute all participants |
| `removeParticipant(sessionId, targetId)` | Remove a participant |
| `lockRoom(sessionId, locked)` | Lock/unlock the room |
| `endMeeting(sessionId)` | End the meeting for everyone |
| `transferHost(sessionId, targetId)` | Transfer host role |
| `toggleTranscription(sessionId, enabled)` | Enable/disable transcription |
| `enableWaitingRoom(sessionId, enabled)` | Enable/disable waiting room |
| `admitParticipant(sessionId, targetId)` | Admit from waiting room |
| `denyParticipant(sessionId, targetId)` | Deny from waiting room |
| `admitAll(sessionId)` | Admit all waiting participants |
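
As an illustration of combining the waiting-room methods above, the sketch below admits participants whose display name is on an allow list and denies the rest. The field names on `WaitingEntrySketch` are assumptions, since this README names `WaitingRoomEntry` but does not list its fields. `WaitingRoomHost` and `screenWaitingRoom` are hypothetical; only the two method names and their `(sessionId, targetId)` parameter order come from the table.

```typescript
// Hypothetical entry shape; the real WaitingRoomEntry fields are not documented here.
interface WaitingEntrySketch {
  clientId: string;
  displayName: string;
}

// Only the two host methods used in this sketch; return types are assumed.
interface WaitingRoomHost {
  admitParticipant(sessionId: string, targetId: string): Promise<unknown>;
  denyParticipant(sessionId: string, targetId: string): Promise<unknown>;
}

// Admits entries whose display name is allowed and denies everyone else.
async function screenWaitingRoom(
  host: WaitingRoomHost,
  sessionId: string,
  entries: WaitingEntrySketch[],
  allowedNames: string[],
): Promise<{ admitted: string[]; denied: string[] }> {
  const admitted: string[] = [];
  const denied: string[] = [];
  // Process entries one at a time so a failure stops at a known point.
  for (const entry of entries) {
    if (allowedNames.indexOf(entry.displayName) !== -1) {
      await host.admitParticipant(sessionId, entry.clientId);
      admitted.push(entry.clientId);
    } else {
      await host.denyParticipant(sessionId, entry.clientId);
      denied.push(entry.clientId);
    }
  }
  return { admitted, denied };
}
```

A handler for `waiting-room:updated` could call a helper like this with the latest list. These methods require the host role, so the calls would presumably fail for other participants.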

AI Features

| Method | Description |
|--------|-------------|
| `askAi(sessionId)` | Ask AI to respond with voice (ai-meeting) |
| `cancelAskAi(sessionId)` | Cancel ongoing AI voice response |
| `askAiText(sessionId, prompt?)` | Ask AI for a streamed text response |
| `generateSummary(sessionId)` | Generate AI meeting summary |
| `generateMinutes(sessionId)` | Generate AI meeting minutes |
| `addBookmark(sessionId, label, isActionItem?)` | Add a bookmark |
| `removeBookmark(sessionId, bookmarkId)` | Remove a bookmark |
| `getBookmarks(sessionId)` | Get all bookmarks |
| `getTranscript(sessionId)` | Get full transcript |
| `getSummaries(sessionId)` | Get all summaries |
| `getMinutes(sessionId)` | Get all minutes |

License

MIT
