`@unisphere/models-sdk-js` v1.9.1
# Kaltura Avatar SDK
A TypeScript SDK for integrating Kaltura's real-time avatar streaming into your web application.
## Features
- 🎭 Simple API - Create sessions and control avatars with just a few lines of code
- 🎥 WebRTC Streaming - Real-time avatar video
- 🗣️ Text-to-Speech - Send text for the avatar to speak, with LLM streaming support
- 🎵 Audio Support - Send MP3 files for the avatar to speak
## Installation

```bash
npm install @unisphere/models-sdk-js
```

## Authentication
You'll need a Kaltura Session (KS) to authenticate with the Avatar API.
To generate a KS, follow the Kaltura Session Creation Guide.
> ⚠️ **Security Warning**: For production applications, never expose your Kaltura Session in client-side code. Use the backend-created session pattern so the KS stays on your server.
## Quick Start
The SDK supports two initialization patterns depending on where the session is created.
### Option 1: Frontend-Only (demos, prototypes)
The SDK creates the session using your Kaltura Session directly from the browser.
```ts
import { KalturaAvatarSession } from '@unisphere/models-sdk-js';

const session = new KalturaAvatarSession('your-kaltura-session', {
  baseUrl: 'https://api.avatar.us.kaltura.ai/v1/avatar-session',
});

await session.createSession({
  avatarId: 'avatar-123',
  voiceId: 'voice-456', // optional
  videoContainerId: 'avatar-container', // auto-attaches video element
});

await session.sayText('Hello from Kaltura Avatar!');
await session.endSession();
```

### Option 2: Backend-Created Session (recommended for production)
Your backend creates the session and passes the credentials to the frontend. The Kaltura Session never leaves your server.
**Backend (Node.js):**

```ts
const response = await fetch('https://api.avatar.us.kaltura.ai/v1/avatar-session/create', {
  method: 'POST',
  headers: {
    Authorization: `ks ${process.env.AVATAR_KS}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    clientId: 'kaltura-avatar-sdk',
    visualConfig: { id: 'avatar-123' },
    voiceConfig: { id: 'voice-456', modelId: 'eleven_flash_v2_5' },
  }),
});

const { sessionId, token } = await response.json();
// Send sessionId and token to the frontend
```

**Frontend:**
```ts
import { KalturaAvatarSession } from '@unisphere/models-sdk-js';

const { sessionId, token } = await fetch('/api/create-avatar-session').then(r => r.json());

// No apiKey needed — backend already created the session
const session = new KalturaAvatarSession(undefined, {
  baseUrl: 'https://api.avatar.us.kaltura.ai/v1/avatar-session',
});

await session.initSession(
  { sessionId, token },
  { videoContainerId: 'avatar-container' },
);

await session.sayText('Hello from Kaltura Avatar!');
await session.endSession();
```

**HTML Setup (both options):**

```html
<div id="avatar-container" style="width: 512px; height: 512px;"></div>
```

## API Reference
### Constructor
```ts
new KalturaAvatarSession(apiKey?: string, config?: AvatarConfig)
```

- `apiKey` (optional) — your Kaltura Session (KS). Pass it for the frontend-only flow; omit it when the backend creates sessions.
- `config` — optional configuration:
  - `baseUrl` — Avatar API base URL
  - `logLevel` — `'debug' | 'info' | 'warn' | 'error'` (default: `'info'`)
  - `retryConfig` — retry/backoff settings for API calls
  - `unloadBeacon` — auto end session on page unload (default: `true`)
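As a sketch, the documented options can be passed together like this. `AvatarConfigSketch` is a local stand-in for the SDK's `AvatarConfig`, listing only the options named above; `retryConfig` is omitted because its field shape is not documented here.

```typescript
// Local stand-in type mirroring the documented options (not the SDK's export).
interface AvatarConfigSketch {
  baseUrl?: string;
  logLevel?: 'debug' | 'info' | 'warn' | 'error';
  unloadBeacon?: boolean;
}

const config: AvatarConfigSketch = {
  baseUrl: 'https://api.avatar.us.kaltura.ai/v1/avatar-session',
  logLevel: 'warn',   // quieter than the 'info' default
  unloadBeacon: true, // best-effort end-session beacon on page unload
};

// With the real SDK:
// const session = new KalturaAvatarSession('your-kaltura-session', config);
```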
### createSession(options)
Creates a session using the SDK (requires apiKey). Establishes WebRTC connection and optionally auto-attaches video.
```ts
await session.createSession({
  avatarId: 'avatar-123',
  voiceId: 'voice-456', // optional
  videoContainerId: 'avatar-container', // optional, auto-attaches video
});
```

### initSession(session, options?)
Initializes a pre-created session from the backend. No apiKey required.
```ts
await session.initSession(
  { sessionId: 'session-123', token: 'jwt-token' },
  { videoContainerId: 'avatar-container' }, // optional
);
```

### attachAvatar(containerId)
Manually attaches the avatar video to a container div (creates a `<video>` element inside it).
```ts
session.attachAvatar('avatar-container');
```

### sayText(text, turnId?, isFinal?)
Makes the avatar speak the provided text.
```ts
// Simple usage
await session.sayText('Hello, how can I help you?');

// LLM streaming — use the same turnId for all chunks, isFinal: true on the last one
const turnId = `turn-${Date.now()}`;
await session.sayText('Hello, ', turnId, false);
await session.sayText('how can I help you?', turnId, true);
```

| Parameter | Type    | Default | Description |
| --------- | ------- | ------- | ----------- |
| `text`    | string  | —       | Text to speak |
| `turnId`  | string  | auto    | Identifies a speech turn. Use the same ID for all chunks of one turn. |
| `isFinal` | boolean | `true`  | `false` signals more chunks are coming; `true` triggers speech. |
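For token-by-token LLM output, the turn bookkeeping can be wrapped in a small helper. This is an illustrative sketch, not part of the SDK; `say` stands for a bound `session.sayText`:

```typescript
// Illustrative helper (not part of the SDK): forwards streamed chunks to
// sayText with one shared turnId, marking only the last chunk as final.
type SayText = (text: string, turnId: string, isFinal: boolean) => Promise<void>;

async function speakStream(say: SayText, chunks: string[], turnId: string): Promise<void> {
  for (let i = 0; i < chunks.length; i++) {
    const isFinal = i === chunks.length - 1; // isFinal: true triggers speech
    await say(chunks[i], turnId, isFinal);
  }
}
```

With the real SDK you would call `speakStream(session.sayText.bind(session), chunks, turnId)` once all chunks for a turn have arrived, or adapt the loop to consume an async iterator.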
### sayAudio(audioFile, turnId, duration)
Makes the avatar speak from an MP3 file or Blob. Audio must be MP3 at 44.1 kHz.
```ts
const arrayBuffer = await audioBlob.arrayBuffer();
const audioCtx = new AudioContext();
const decoded = await audioCtx.decodeAudioData(arrayBuffer);
await audioCtx.close();
const duration = decoded.duration;

const audioFile = new File([audioBlob], 'speech.mp3', { type: 'audio/mpeg' });
const turnId = `turn-${Date.now()}`;
await session.sayAudio(audioFile, turnId, duration);
```

### interrupt()
Interrupts the avatar's current speech immediately.
```ts
await session.interrupt();
```

### endSession()
Ends the session and cleans up all resources (stops keep-alive, disconnects WebRTC, ends backend session).
```ts
await session.endSession();
```

> ⚠️ Always call `endSession()` when you're done. Sessions left open consume backend resources and count against your usage. The SDK automatically sends a best-effort beacon on page unload, but calling `endSession()` explicitly is the only reliable guarantee.
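A common pattern is to put the cleanup in a `finally` block so the session is ended even when a speech call throws. A minimal sketch, using a local stand-in interface for the two session methods involved (not the SDK's types):

```typescript
// Local stand-in for the session methods used here (not the SDK's types).
interface SpeakingSession {
  sayText(text: string): Promise<void>;
  endSession(): Promise<void>;
}

// Ends the session whether or not sayText succeeds; any error propagates
// to the caller after cleanup has run.
async function speakThenEnd(session: SpeakingSession, text: string): Promise<void> {
  try {
    await session.sayText(text);
  } finally {
    await session.endSession();
  }
}
```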
### State Methods
```ts
session.getSessionId();       // string | null
session.getSessionState();    // IDLE | CREATING | READY | ENDED | ERROR
session.getConnectionState(); // DISCONNECTED | CONNECTING | CONNECTED | FAILED | CLOSED
```

### Events
```ts
session.on('stateChange', (state: SessionState) => { ... });
session.on('connectionChange', (state: ConnectionState) => { ... });
session.on('speakingStart', () => { /* avatar started speaking */ });
session.on('speakingEnd', () => { /* avatar stopped speaking */ });
session.on('error', (error: AvatarError) => {
  console.error(error.code, error.message);
});
```

## Complete Example
```ts
import { KalturaAvatarSession, SessionState, ConnectionState, AvatarError } from '@unisphere/models-sdk-js';

const session = new KalturaAvatarSession('your-kaltura-session', {
  baseUrl: 'https://api.avatar.us.kaltura.ai/v1/avatar-session',
  logLevel: 'info',
});

session.on('stateChange', (state: SessionState) => {
  console.log('State:', state);
});
session.on('connectionChange', (state: ConnectionState) => {
  console.log('Connection:', state);
});
session.on('speakingStart', () => console.log('Avatar speaking...'));
session.on('speakingEnd', () => console.log('Avatar done speaking'));
session.on('error', (error: AvatarError) => {
  console.error('Error:', error.code, error.message);
});

try {
  await session.createSession({
    avatarId: 'avatar-123',
    voiceId: 'voice-456',
    videoContainerId: 'avatar-container',
  });
  console.log('Session ready:', session.getSessionId());

  await session.sayText('Welcome to Kaltura Avatar!');

  // Interrupt if needed
  await session.interrupt();

  // Send an audio file (MP3, 44.1 kHz)
  const turnId = `turn-${Date.now()}`;
  const audioFile = new File([audioBlob], 'speech.mp3', { type: 'audio/mpeg' });
  const duration = 3.0; // seconds — use AudioContext.decodeAudioData() in real usage
  await session.sayAudio(audioFile, turnId, duration);

  await session.endSession();
} catch (error) {
  if (error instanceof AvatarError) {
    console.error('Avatar error:', error.code, error.message);
  }
}
```

## Browser Support
Chrome/Edge 80+, Firefox 75+, Safari 14+. Requires WebRTC.
## License
AGPL-3.0
